Question: What is a robots.txt file?
Answer: The Robots Exclusion Standard lets a website owner ask search engine spiders and other automated crawlers not to access certain parts of a site. This is useful when you have internal or test-only content that you do not want publicly visible. In the past, getting into a search engine required you to actively submit your site and pages. Now, search engines like Google are so thorough that a page created yesterday may show up in search results tomorrow, with no action on the part of its creator.
To use the standard, create a robots.txt file in the root directory of your web server, then list the files and/or directories you want to exclude. For example, to exclude everything, you would enter:
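    User-agent: *
    Disallow: /

The asterisk after User-agent means the rule applies to all robots, and a Disallow of / covers the entire site.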
To exclude a test directory and a script directory:
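    User-agent: *
    Disallow: /test/
    Disallow: /scripts/

(The names /test/ and /scripts/ are just examples here; substitute the actual paths of the directories you want to keep crawlers out of.)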
You should not use this as a security measure. The standard depends on robots voluntarily honoring the robots.txt file; nothing forces them to do so, and the file itself is publicly readable, so anyone who wants to access a page listed in it still can. If you need actual security, use password protection and encryption.