Chapter 9: Search
Robot Control with <meta>
An alternative to the robots.txt file, particularly useful for those who have no access to the root directory of their domain, is to use a <meta> tag to control indexing. To disallow indexing of a particular page, use this <meta> tag in the <head> section of the HTML document:
<meta name="robots" content="noindex" />
You can also tell a spider not to follow any links leading out of the page:
<meta name="robots" content="noindex, nofollow" />
When using this type of exclusion, make sure not to confuse the robot with contradictory information such as
<meta name="robots" content="index, noindex" />
or
<meta name="robots" content="index, nofollow, follow" />
as the spider may either ignore the information entirely or index the page anyway. The other downside of the <meta> tag approach is that fewer search engines support it than support robots.txt.
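Contradictory directives of this kind can be caught before a page is published. The following is a minimal sketch in Python, offered as an illustration rather than as part of the original text. The names RobotsMetaParser, find_conflicts, and CONTRADICTORY_PAIRS are hypothetical, and the sketch assumes that directives are comma-separated and that only the index/noindex and follow/nofollow pairs need checking.

```python
from html.parser import HTMLParser

# Directive pairs that cancel each other out (assumed list, for illustration).
CONTRADICTORY_PAIRS = [("index", "noindex"), ("follow", "nofollow")]


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser also routes self-closing tags (<meta ... />) here.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def find_conflicts(html_text):
    """Return the contradictory directive pairs found in html_text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    parser.close()
    return [
        pair
        for pair in CONTRADICTORY_PAIRS
        if pair[0] in parser.directives and pair[1] in parser.directives
    ]


if __name__ == "__main__":
    page = '<head><meta name="robots" content="index, noindex" /></head>'
    print(find_conflicts(page))  # prints [('index', 'noindex')]
```

An empty list means no conflicting pair was found; it says nothing about whether a given search engine honors the tag.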