This file is a set of instructions for robots (i.e., spiders) that crawl and index the content of your website. For spiders that obey the file, it specifies what they should or should not index. Crawlers look for the file only at the root of a host, so it’s important to have a robots.txt file present in the root directory of every domain you manage. You can view an example at http://www.seoseek.com/robots.txt
The format of the robots.txt file is also important. A malformed robots exclusion file may be ignored or misread by crawlers, so areas of the site you meant to exclude may end up indexed and exposed by the major search engines. Files that are not intended to be indexed should be placed in a folder, and a Disallow directive should be applied to that directory. Keep in mind that robots.txt is publicly readable and only advisory, so for genuinely sensitive files the better option is to put them in a password-protected directory.
It is important to validate the robots.txt file. An example validator can be found at http://tool.motoricerca.info/robots-checker.phtml
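For a quick local check before turning to an online validator, a rough sketch like the following can catch the most common formatting slips. It is only an illustration, not a full validator: the function name lint_robots and the list of accepted field names are assumptions made for this example, and real crawlers may accept or reject more than it does.

```python
# Hypothetical helper: a very small robots.txt format check.
# KNOWN_FIELDS is an assumed list of commonly used field names.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}

def lint_robots(text):
    """Return a list of human-readable problems found in robots.txt text."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" not in line:
            problems.append(f"line {number}: missing ':' separator")
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field not in KNOWN_FIELDS:
            problems.append(f"line {number}: unknown field '{field}'")
        elif field == "user-agent":
            if not value:
                problems.append(f"line {number}: User-agent has no value")
            seen_user_agent = True
        elif field in {"allow", "disallow"}:
            if not seen_user_agent:
                problems.append(f"line {number}: rule appears before any User-agent line")
            elif value and not value.startswith(("/", "*")):
                problems.append(f"line {number}: path should start with '/'")
    return problems

if __name__ == "__main__":
    sample = "User-Agent: Googlebot\nDisalow: /private/\n"  # note the typo
    for problem in lint_robots(sample):
        print(problem)
```

A check like this only looks at the shape of each line. It does not tell you how a particular search engine will interpret the rules, so a dedicated validator is still worth running.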
Creating a robots.txt file is easy. You can use Notepad or your favorite text editor. Let's say you want to tell Googlebot not to index the files in a directory called private. You would type the following into your editor:
User-Agent: Googlebot
Disallow: /private/
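To see how a compliant crawler would read these two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the example.com URLs are placeholders chosen for illustration; they are not part of the original example.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule set from the example above.
RULES = """\
User-Agent: Googlebot
Disallow: /private/
"""

def is_allowed(rules, user_agent, url):
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    # example.com is a placeholder domain used only for this sketch.
    print(is_allowed(RULES, "Googlebot", "http://example.com/private/notes.html"))
    print(is_allowed(RULES, "Googlebot", "http://example.com/index.html"))
    print(is_allowed(RULES, "OtherBot", "http://example.com/private/notes.html"))
```

Under these rules only Googlebot is kept out of /private/. A crawler with a different name is not matched by the record and is therefore not restricted.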
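If you want all search engine spiders to index your entire site, you would use the following directives. Note that the Disallow value is left empty; writing Disallow: / instead would block the entire site.
User-Agent: *
Disallow: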
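An empty Disallow value and Disallow: / are easy to confuse, yet they mean opposite things: the first blocks nothing, the second blocks everything. The sketch below compares the two with Python's standard urllib.robotparser module. The names ALLOW_ALL, BLOCK_ALL, can_crawl, and the AnyBot user agent are assumptions made for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Empty Disallow: nothing is blocked, so the whole site may be crawled.
ALLOW_ALL = ["User-Agent: *", "Disallow:"]
# "Disallow: /" matches every path, so the whole site is blocked.
BLOCK_ALL = ["User-Agent: *", "Disallow: /"]

def can_crawl(lines, url, agent="AnyBot"):
    """Return True if agent may fetch url under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser.can_fetch(agent, url)

if __name__ == "__main__":
    # example.com is a placeholder domain used only for this sketch.
    url = "http://example.com/index.html"
    print("empty Disallow:", can_crawl(ALLOW_ALL, url))
    print("Disallow: /   :", can_crawl(BLOCK_ALL, url))
```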