Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Webmaster Central YouTube update for June 22nd - 26th

Tuesday, June 30, 2009 at 10:22 AM

Want to know what's new on the Webmaster Central YouTube channel? Here's what we've uploaded in the past week:

As part of Google's goal to make the web faster, we uploaded several video tips about optimizing the speed of your website. Check out the tutorials page to view the tutorials and associated videos.

Matt Cutts answered a new question each day from the Grab Bag.

And during his visit to India, Adam Lasnik was interviewed by Webmaster Help Forum guide Jayan Tharayil about issues that matter to webmasters in India. We have the full three-part interview right here.

We'll get you started on this batch of videos with Matt's tips for targeting your site to a specific region:


Feel free to leave comments letting us know how you liked the videos, and if you have any specific questions, ask the experts in the Webmaster Help Forum.

Traffic drops and site architecture issues

Sunday, June 28, 2009 at 3:39 AM

Webmaster Level: Intermediate.

We hear lots of questions about site architecture issues and traffic drops, so it was a pleasure to talk about them in greater detail at SMX London. I'd like to highlight some key concepts from my presentation here. First off, let's gain a better understanding of drops in traffic, and then we'll take a look at site design and architecture issues.

Understanding drops in traffic

As you know, fluctuations in search results happen all the time; the web is constantly evolving and so is our index. Improvements in our ability to understand our users' interests and queries also often lead to differences in how our algorithms select and rank pages. We realize, however, that such changes might be confusing and sometimes foster misconceptions, so we'd like to address a couple of these myths head-on.

Myth number 1: Duplicate content causes drops in traffic!
Webmasters often wonder whether duplicate content on their site can have a negative effect on its traffic. As mentioned in our Webmaster Guidelines, duplication is not a violation unless it is intended to manipulate Google and/or users. The second part of my presentation illustrates in greater detail how to deal with duplicate content using canonicalization.

Myth number 2: Affiliate programs cause drops in traffic!
Original and compelling content is crucial for a good user experience. If your website participates in affiliate programs, it's essential to consider whether the same content is available in many other places on the web. Affiliate sites with little or no original and compelling content are not likely to rank well in Google search results, but including affiliate links within the context of original and compelling content isn't in itself the sort of thing that leads to traffic drops.

Having reviewed a few of the most common concerns, I'd like to highlight two important sections of the presentation. The first illustrates how malicious attacks -- such as an injection of hidden text and links -- might cause your site to be removed from Google's search results. On a happier note, it also covers how you can use the Google cache and Webmaster Tools to identify this issue. Relatedly, if we've found a violation of the Webmaster Guidelines, such as the use of hidden text or the presence of malware on your site, you will typically find a note about it in your Webmaster Tools Message center.

You may also find that your site's traffic has decreased if your users are being redirected to another site -- for example, because of a hacker-applied server- or page-level redirection triggered by referrals from search engines. A similar scenario -- but with different results -- is the case in which a hacker has set up a redirection for crawlers only. While this causes no immediate drop in traffic, since users and their visits are not affected, it might lead to a decrease in pages indexed over time.





Site design and architecture issues

Now that we've seen how malicious changes might affect your site and its traffic, let's examine some design and architecture issues. Specifically, you want to ensure that your site can be crawled and indexed effectively, which is the prerequisite to being shown in our search results. What should you consider?

  • First off, check that your robots.txt file has the correct status code and is not returning an error.
  • Keep in mind some best practices when moving to a new site and the new "Change of address" feature recently added to Webmaster Tools.
  • Review the settings of the robots.txt file to make sure no pages -- particularly those rewritten and/or dynamic -- are blocked inappropriately.
  • Finally, make good use of the rel="canonical" attribute to reduce the indexing of duplicate content on your domain. The example in the presentation shows how using this attribute helps Google understand that a duplicate can be clustered with the canonical and that the original, or canonical, page should be indexed.
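
To make the last point a little more concrete, here is a minimal Python sketch of one way a site might compute the canonical URL for a page and emit the matching link element. The parameter names in IGNORED_PARAMS, the helper names, and the example.com URL are all assumptions made for illustration; which parameters are safe to drop depends entirely on your own site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: on this hypothetical site these query parameters only change
# presentation or tracking, so URLs that differ only in them are duplicates.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}


def canonical_url(url):
    """Return url without the ignored query parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the link element to place in the <head> of the duplicate page."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)


duplicate = "http://www.example.com/product.php?item=swedish-fish&sort=price&sessionid=1234"
print(canonical_link_tag(duplicate))
```

Every duplicate variant of the page would carry the same tag, pointing at the one URL you would like to see indexed.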



In conclusion, remember that fluctuations in search results are normal, but there are steps you can take to avoid malicious attacks or design and architecture issues that might cause your site to disappear or fluctuate unpredictably in search results. Start by learning more about attacks by hackers and spammers. Then make sure everything is running properly at the crawling and indexing level by double-checking the HTML suggestions in Webmaster Tools. Finally, test your robots.txt file in case you are accidentally blocking Googlebot. And don't forget about those "robots.txt unreachable" errors!
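
If you'd like to test your robots.txt rules offline as well, the sketch below shows one possible approach using Python's standard urllib.robotparser module. The rules and the example.com URLs are invented for illustration, and the parser's matching is simpler than Googlebot's, so treat the result as a first sanity check rather than the final word. It also does not fetch anything, so it says nothing about the status code your server returns for robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paste in the text of your own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog/
Disallow: /private/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


urls_to_check = [
    "http://www.example.com/",
    "http://www.example.com/about.htm",
    "http://www.example.com/catalog/item?id=42",
    "http://www.example.com/private/report.html",
]

for url in blocked_urls(ROBOTS_TXT, urls_to_check):
    print("Blocked for Googlebot:", url)
```

In this made-up example the dynamic catalog pages turn out to be blocked, which is exactly the kind of accidental rule worth catching.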

Spam2.0: Fake user accounts and spam profiles

Friday, June 26, 2009 at 9:06 AM

You're a good webmaster or web developer, and you've done everything you can to keep your site from being hacked and keep your forums and comment sections free of spam. You're now the proud owner of a buzzing web2.0 social community, filling the web with user-generated content, and probably getting lots of visitors from Google and other search engines.

Many of your site's visitors will create user profiles, and some will spend hours posting in forums, joining groups, and getting the sparkles exactly right on the rainbow-and-unicorn image for their BFF's birthday. This is all great.

Others, however, will create accounts and fill their profiles with gibberish, blatherskite and palaver. Even worse, they'll add a sneaky link, a bit of redirecting JavaScript code, or a big fake embedded video that takes your users off to the seediest corners of the web.

Welcome to the world of spam profiles. The social web is growing incredibly quickly and spammers look at every kind of user content on the web as an opportunity for traffic. I've spoken with a number of experienced webmasters who were surprised to find out this was even a problem, so I thought I would talk a little bit about spam profiles and what you might do to find and clean them out of your site.

Why is this important?

Imagine the following scenario:

"Hello there, welcome to our new web2.0 social networking site. Boy, have I got a new friend for you. His name is Mr. BuyMaleEnhancementRingtonesNow, and he'd love for you to check out his profile. He's a NaN-year-old from Pharmadelphia, PA and you can check out his exciting home page at http://example.com/obviousflimflam.


Not interested? Then let me introduce you to my dear friend PrettyGirlsWebCam1234, she says she's an old college friend of yours and has exciting photos and videos you might want to see."


You probably don't want your visitors' first impression of your site to include inappropriate images or bogus business offers. You definitely don't want your users hounded by fake invites to the point where they stop visiting altogether. If your site becomes filled with spammy content and links to bad parts of the web, search engines may lose trust in your otherwise fine site.

Why would anyone create spam profiles?

Spammers create fake profiles for a number of nefarious purposes. Sometimes they're just a way to reach users internally on a social networking site. This is somewhat similar to the way email spam works - the point is to send your users messages or friend invites containing a fake or low-quality proposition and trick them into following a link, making a purchase, or downloading malware.

Spammers are also using spam profiles as yet another avenue to generate webspam on otherwise good domains. They scour the web for opportunities to get their links, redirects, and malware to users. They use your site because it's no cost to them and they hope to piggyback off your good reputation.

The latter case is becoming more and more common. Some fake profiles are obvious, using popular pharmaceuticals as the profile name, for example; but we've noticed an increase in savvier spammers who try to use real names and realistic data to sneak in their bad links. To make sure their newly-minted gibberish profile shows up in searches, they will also generate links to it from hacked sites, comment spam, and, yes, other spam profiles. This results in a lot of bad content on your domain, unwanted incoming links from spam sites, and annoyed users.

Which sites are being abused?

You may be thinking to yourself, "But my site isn't a huge social networking juggernaut; surely I don't need to worry." Unfortunately, we see spam profiles on everything from the largest social networking sites to the smallest forums and bulletin boards. Many popular bulletin boards and content management systems (CMSs) such as vBulletin, phpBB, Moodle, and Joomla generate member pages for every user who creates an account. In general, CMSs are great because they make it easy for you to deploy content and interactive features to your site, but auto-generated pages can be abused if you're not watching for it.

For all of you out there who do work for huge social networking juggernauts, your site is a target as well. Spammers want access to your large userbase, hoping that users on social sites will be more trusting of incoming friend requests, leading to higher success rates.

What can you do?

This isn't an easy problem to solve - the bad guys are attacking a wide range of sites and seem to be able to adapt their scripts to get around countermeasures. Google is constantly under attack by spammers trying to create fake accounts and generate spam profiles on our sites, and despite all of our efforts some have managed to slip through. Here are some things you can do to make their lives more difficult and keep your site clean and useful:

  • Make sure you have standard security features in place, including CAPTCHAs, to make it harder for spammers to create accounts en masse. Watch out for unlikely behavior - thousands of new user accounts created from the same IP address, new users sending out thousands of friend requests, etc. There is no simple solution to this problem, but often some simple checks will catch most of the worst spam.

  • Use a blacklist to prevent repetitive spamming attempts. We often see large numbers of fake profiles on one innocent site all linking to the same domain, so once you find one, you should make it simple to remove all of them.

  • Watch out for cross-site scripting (XSS) vulnerabilities and other security holes that allow spammers to inject questionable code onto their profile pages. We've seen techniques such as JavaScript used to redirect users to other sites, iframes that attempt to give users malware, and custom CSS code used to cover over your page with spammy content.

  • Consider nofollowing the links on untrusted user profile pages. This makes your site less attractive to anyone trying to pass PageRank from your site to their spammy site. Spammers seem to go after the low-hanging fruit, so even just nofollowing new profiles with few signals of trustworthiness will go a long way toward mitigating the problem. On the flip side, you could also consider manually or automatically lifting the nofollow attribute on links created by community members that are likely more trustworthy, such as those who have contributed substantive content over time.

  • Consider noindexing profile pages for new, not yet trustworthy users. You may even want to make initial profile pages completely private, especially if the bulk of the content on your site is in blogs, forums, or other types of pages.

  • Add a "report spam" feature to user profiles and friend invitations. Let your users help you solve the problem - they care about your community and are annoyed by spam too.

  • Monitor your site for spammy pages. One of the best tools for this is Google Alerts - set up a site: query along with commercial or adult keywords that you wouldn't expect to see on your site. This is also a great tool to help detect hacked pages. You can also check 'Keywords' data in Webmaster Tools for strange, volatile vocabulary.

  • Watch for spikes in traffic from suspicious queries. It's always great to see the line on your pageviews chart head upward, but pay attention to commercial or adult queries that don't fit your site's content. In cases like this where a spammer has abused your site, that traffic will provide little if any benefit while introducing users to your site as "the place that redirected me to that virus."
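
A few of the checks above can be sketched in code. The following Python example is a rough illustration only: the function names, the thresholds, the "spam.example" domain, and the idea of using post count as the trust signal are all assumptions, not a description of how any particular CMS or Google itself handles this.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical values for illustration; tune them for your own site.
BLACKLISTED_DOMAINS = {"spam.example"}
MAX_SIGNUPS_PER_IP = 20
TRUSTED_MIN_POSTS = 10


def suspicious_ips(signup_ips, limit=MAX_SIGNUPS_PER_IP):
    """Return the IP addresses that created more than `limit` accounts."""
    counts = Counter(signup_ips)
    return {ip for ip, count in counts.items() if count > limit}


def links_to_blacklist(profile_links, blacklist=BLACKLISTED_DOMAINS):
    """Return True if any link points at a blacklisted domain or a subdomain of one."""
    for link in profile_links:
        host = (urlsplit(link).hostname or "").lower()
        if any(host == domain or host.endswith("." + domain) for domain in blacklist):
            return True
    return False


def profile_markup(post_count, min_posts=TRUSTED_MIN_POSTS):
    """Pick the robots meta tag and link rel value for a profile page.

    Profiles below the post threshold are treated as untrusted: the page is
    noindexed and its links are nofollowed. Trusted profiles get neither.
    """
    if post_count >= min_posts:
        return {"robots_meta": "", "link_rel": ""}
    return {
        "robots_meta": '<meta name="robots" content="noindex">',
        "link_rel": "nofollow",
    }


print(suspicious_ips(["10.0.0.1"] * 25 + ["10.0.0.2"]))
print(links_to_blacklist(["http://www.spam.example/pills"]))
print(profile_markup(post_count=0))
```

Simple signals like these won't stop a determined spammer, but they make it cheap to find a whole batch of fake profiles once you've spotted the first one.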


Have any other tips to share? Please feel free to comment below. If you have any questions, you can always ask in our Webmaster Help Forum.

Written by Jason Morrison, Search Quality Team

Tell us what you think!

Wednesday, June 24, 2009 at 2:17 PM

(Cross-posted on the Google Product Ideas Blog)

The Webmaster Central team does its best to support the webmaster community via Webmaster Tools, the Webmaster Central Blog, the Webmaster YouTube Channel, Help Center, our forum, and a fellow named Matt Cutts.

If you've got ideas and suggestions for Webmaster Central - features you want, things we can do better - tell us. From now until Friday, July 24, 2009, Product Ideas for Webmaster Central will be open for feedback. Every suggestion you add will be seen not only by the Webmaster Central team, but by other users and webmasters. We'll review every submission, and we'll update you regularly with our progress and feedback.

The more feedback the better, so get started now.

Posted by Sagar Kamdar, Product Manager, Webmaster Tools

Watch out for your .yu domain!

Are you the owner of a .yu domain? Then you might have heard the news: as of September 30, all .yu domains will stop working, regardless of their renewal date. This means that any content you're hosting on a .yu domain will no longer be online. For those of you who would still like to have your site online, we've prepared some recommendations to make sure that Google keeps crawling, indexing, and serving your content appropriately.
  • Check your backlinks. Since it won't be possible to set up a redirection from the old .yu domain to your new one, all links pointing to .yu domains will lead to dead ends. This means that it will be increasingly difficult for search engines to retrieve your new content. To find out who is linking to you, sign up with Google Webmaster Tools and check the links to your site (you can also download this list as a comma-separated values, or .csv, file for ease of use). Then read through the list for sites that you recognize as important and contact their webmasters to make sure that they update their links to point to your new website.
  • Check your internal links. If you are planning to simply move your content in bulk from the old to the new site, make sure that the new internal navigation is up to date. For example, if you are renaming pages on your site from "www.example.yu/home.htm" to "www.example.com/home.htm" make sure that your internal navigation reflects such changes to prevent broken links.
  • Start moving the site to your new domain. It's a good idea to start moving while you can still maintain control of your old domain, so don't wait! As mentioned in our best practices when moving your site, we recommend starting by moving a single directory or subdomain, and testing the results before completing the move. Remember that you will not be able to keep a 301 redirection on your old domain after September 30, so start your test early.
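
For the internal links step, a small script can do most of the rewriting. Here is a minimal Python sketch, assuming your internal links are absolute URLs and using the hypothetical hosts www.example.yu and www.example.com; relative links and links to other sites are left untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical old and new hostnames; replace them with your own.
OLD_HOST = "www.example.yu"
NEW_HOST = "www.example.com"


def move_link(url, old_host=OLD_HOST, new_host=NEW_HOST):
    """Point url at new_host if it targets old_host; otherwise return it unchanged.

    Only absolute URLs (with a scheme such as http://) carry a hostname, so
    relative links pass through as-is. Any explicit port is dropped.
    """
    parts = urlsplit(url)
    if (parts.hostname or "").lower() != old_host:
        return url
    return urlunsplit((parts.scheme, new_host, parts.path, parts.query, parts.fragment))


for link in [
    "http://www.example.yu/home.htm",
    "/contact.htm",
    "http://www.example.org/partners.htm",
]:
    print(link, "->", move_link(link))
```

Running something like this over your templates and content before the switch helps you avoid shipping the new site with links that still point to the old domain.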

While you're moving your site, you can test how Google crawls and indexes your site at its new location by submitting a Sitemap via Google Webmaster Tools. Although we may not crawl or index all the pages listed in each Sitemap, we recommend that you submit one because doing so helps Google understand your site better. You can read more on this topic in our answers to the most frequently asked questions on Sitemaps. And remember that if you have any questions or concerns, we're waiting for you in the Google Webmaster Help Forum!
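
If you don't have a Sitemap yet, a very basic one is easy to generate. The Python sketch below builds a minimal Sitemap in the sitemaps.org XML format from a list of URLs; the build_sitemap name and the example.com URLs are placeholders for illustration, and a real file would normally start with an XML declaration and might include optional fields such as lastmod.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal Sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


sitemap = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/home.htm",
])
print(sitemap)
```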

Let's make the web faster

Tuesday, June 23, 2009 at 3:55 PM

(Cross-posted on the Official Google Blog and the Google Code Blog)

From building data centers in different parts of the world to designing highly efficient user interfaces, we at Google always strive to make our services faster. We focus on speed as a key requirement in product and infrastructure development, because our research indicates that people prefer faster, more responsive apps. Over the years, through continuous experimentation, we've identified some performance best practices that we'd like to share with the web community on code.google.com/speed, a new site for web developers, with tutorials, tips and performance tools.

We are excited to discuss what we've learned about web performance with the Internet community. However, to optimize the speed of web applications and make browsing the web as fast as turning the pages of a magazine, we need to work together as a community, to tackle some larger challenges that keep the web slow and prevent it from delivering its full potential:
  • Many protocols that power the Internet and the web were developed when broadband and rich interactive web apps were in their infancy. Networks have become much faster in the past 20 years, and by collaborating to update standards and protocols such as HTML and TCP/IP we can create a better web experience for everyone. A great example of the community working together is HTML5. With HTML5 features such as AppCache, developers are now able to write JavaScript-heavy web apps that run instantly and work and feel like desktop applications.

  • In the last decade, we have seen close to a 100x improvement in JavaScript speed. Browser developers and the communities around them need to maintain this recent focus on performance improvement in order for the browser to become the platform of choice for more feature-rich and computationally-complex applications.

  • Many websites can become faster with little effort, and collective attention to performance can speed up the entire web. Tools such as Yahoo!'s YSlow and our own recently launched Page Speed help web developers create faster, more responsive web apps. As a community, we need to invest further in developing a new generation of tools for performance measurement, diagnostics, and optimization that work at the click of a button.

  • While there are now more than 400 million broadband subscribers worldwide, broadband penetration is still relatively low in many areas of the world. Steps have been taken to bring the benefits of broadband to more people, such as the FCC's decision to open up the white spaces spectrum, for which the Internet community, including Google, was a strong champion. Bringing the benefits of cheap reliable broadband access around the world should be one of the primary goals of our industry.

To find out what Googlers think about making the web faster, see the video below. If you have ideas on how to speed up the web, please share them with the rest of the community. Let's all work together to make the web faster!



Webmaster Central YouTube update for June 15th - 19th

Monday, June 22, 2009 at 3:36 PM

Want to know what's new on the Webmaster Central YouTube channel? Here's what we've uploaded in the past week:

Maile Ohye gave a webmaster-focused presentation about Product Search.

Matt Cutts answered a new question each day from the Grab Bag.

To get you started on watching this latest batch of videos, here's Matt's answer about directories and paid links:



Feel free to leave comments letting us know how you liked the videos, and if you have any specific questions, ask the experts in the Webmaster Help Forum.