GOOGLE SEARCH


Saturday, June 02, 2007

Google's Supplemental Index

If you're not familiar with Google's Supplemental Index (GSI), let's start with Google's own definition of its new supplemental index.

"A supplemental result is just like a regular web result, except that it's pulled from our supplemental index. We're able to place fewer restraints on sites we crawl for this supplemental index than we do on sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index.

If you're a webmaster, please note that the index in which a site is included is completely automated; there's no way to select or change the index in which a site appears. Please also be assured that the index in which a site is included doesn't affect its Page Rank."
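Google's definition singles out the number of parameters in a URL as one thing that can keep a page out of the main index. As a rough illustration, here is a small Python sketch that counts the query-string parameters in each URL and flags the heavy ones. The cut-off `MAX_PARAMS` and the function names are assumptions made up for this example; Google does not publish an actual limit.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical cut-off for illustration only; Google does not publish one.
MAX_PARAMS = 2


def count_query_params(url):
    """Return the number of query-string parameters in a URL."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?a=&b=1" counts as two parameters.
    return len(parse_qsl(query, keep_blank_values=True))


def flag_parameter_heavy(urls, max_params=MAX_PARAMS):
    """Return the URLs whose parameter count exceeds max_params."""
    return [u for u in urls if count_query_params(u) > max_params]
```

Running `flag_parameter_heavy` over a list of your own URLs gives you a shortlist of pages that may be worth rewriting into simpler addresses.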

Although there is no direct way to choose whether a page of your site lands in the regular or the supplemental index, you can take steps to keep your site out of Google's Supplemental Index (GSI) in the first place. In effect, Google is telling webmasters that if a site is difficult to crawl, or is deemed low quality with little unique content to differentiate it from other sites in its niche, it will be placed in a second-rate index.

So what can you do to avoid getting listed in the GSI? Quality is king! If the Google crawler finds your site difficult to index, or locates duplicate content, you'll probably end up in "Supplemental Hell". So why do so many pages end up in Google's Supplemental Index?

Well, here are some of the reasons:

1. The page uses duplicate content (copied from elsewhere on the same site or from external sites).
2. The page has the same Title and META tags as other pages on your site.
3. The page carries loads of unrelated external links, or has too few internal or external inbound links.
4. The page no longer exists, is orphaned with no internal links pointing to it, or is buried too deep to be crawled properly.
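Reasons 2 and 4 in the list above are easy to check on your own site before Google does. The Python sketch below is only a starting point under some assumptions: `pages` is a dictionary of URL to HTML that you have already downloaded, `links` is a dictionary of URL to the internal links found on that page, and the `max_depth` of 3 clicks is an arbitrary threshold chosen for the example. All class and function names are made up for this illustration.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser


class HeadParser(HTMLParser):
    """Collect the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def duplicate_heads(pages):
    """pages: {url: html}. Return groups of URLs sharing title and description."""
    groups = defaultdict(list)
    for url, html in pages.items():
        parser = HeadParser()
        parser.feed(html)
        groups[(parser.title.strip(), parser.description)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


def click_depths(links, home):
    """links: {url: [linked urls]}. Breadth-first search from the home page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphans_and_deep_pages(links, home, max_depth=3):
    """Return (pages unreachable from home, pages deeper than max_depth)."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    orphans = sorted(all_pages - set(depths))
    too_deep = sorted(p for p, d in depths.items() if d > max_depth)
    return orphans, too_deep
```

Note that "orphan" here means a page that cannot be reached by following links from the home page, which is slightly broader than a page with no inbound links at all.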

... and a few steps to avoid it!

• Remove duplicate content from your website and keep your remaining content as fresh and unique as possible.
• If you're using PLR (private label rights) content, rewrite at least 30-50% of it.
• Shorten any long URLs to something simple.
• Increase relevant inbound links and use contextual linking where possible.
• Use deep linking (linking to pages other than just your index page), but try to keep every page within 2-3 levels of the index page.
• Create and submit a Sitemap, which gives Google easy access to all your web pages and helps ensure that they are indexed regularly and correctly.
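For the last step, a Sitemap is simply an XML file listing your URLs. Here is a minimal Python sketch that builds one from a list of page addresses, using the namespace from the public sitemaps.org protocol. The function name `build_sitemap` and the example URLs are made up for this illustration, and optional fields such as last-modified dates are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a Sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical URLs; replace with the pages of your own site.
    print(build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/articles/supplemental-index",
    ]))
```

Save the output as a file such as sitemap.xml in your site's root directory, then submit it through Google's webmaster tools.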

The above is by no means an exhaustive list of ways to avoid Google's Supplemental Index, but it is certainly enough to get started with.