Web Design - The Complete Reference: Chapter 9: Search

Robot Control with <meta>

An alternative to the robots.txt file, particularly useful for those who have no access to the root directory of their domain, is to use a <meta> tag to control indexing. To disallow indexing of a particular page, place this <meta> tag in the <head> section of the HTML document:

<meta name="robots" content="noindex" />
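For context, here is a minimal sketch of where the tag sits in a complete page. The title and body text are hypothetical placeholders; only the <meta> line comes from the discussion above.

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Hypothetical page title, for illustration only -->
  <title>Internal Draft</title>
  <!-- Asks robots that honor this tag not to index the page -->
  <meta name="robots" content="noindex" />
</head>
<body>
  <p>Placeholder content that should stay out of search results.</p>
</body>
</html>
```

Because the tag lives in the page itself, it has to be repeated in every document you want excluded.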

You can also tell a spider not to follow any of the links on the page by adding nofollow:

<meta name="robots" content="noindex, nofollow" />

When using this type of exclusion, be careful not to confuse the robot with contradictory directives such as

<meta name="robots" content="index, noindex" />

<meta name="robots" content="index, nofollow, follow" />

because the spider may either ignore the directives entirely or index the page anyway. The other drawback of the <meta> tag approach is that fewer search engines support it than support robots.txt.
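One way to avoid the contradictions shown above is to make each decision exactly once: pick either index or noindex, and either follow or nofollow. The sketch below lists the four consistent combinations; it assumes the usual convention that a page with no robots <meta> tag is treated as index, follow. Use only one of these tags per page.

```html
<!-- Index the page and follow its links (typically the same as omitting the tag) -->
<meta name="robots" content="index, follow" />

<!-- Index the page, but do not follow its links -->
<meta name="robots" content="index, nofollow" />

<!-- Do not index the page, but still follow its links -->
<meta name="robots" content="noindex, follow" />

<!-- Do not index the page and do not follow its links -->
<meta name="robots" content="noindex, nofollow" />
```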

