Sep 10 2008
Robots.txt is not a security measure
I am increasingly coming across people who think a robots.txt file can be used to stop search engine crawlers from crawling sensitive data on their websites. Seriously.
This is just plain wrong. A robots.txt file is for excluding unwanted, redundant or useless data. It is only a request that well-behaved crawlers choose to honour, so an entry in it cannot protect your sensitive data from getting out. Sensitive data should not be left open on your website in the first place.
There are many malicious crawlers that crawl only the pages blocked by the robots.txt file on every website. I bet plenty of interesting stuff turns up in their search results.
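To see why, here is a rough Python sketch of how little effort it takes to turn a robots.txt file into a list of places worth looking at. The function name `disallowed_paths` and the sample file contents are made up for illustration; this is not taken from any real crawler, and it ignores which User-agent group a rule belongs to.

```python
def disallowed_paths(robots_txt: str) -> list:
    """Return every path named in a Disallow rule, in file order."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow value blocks nothing
                paths.append(value)
    return paths


# Hypothetical robots.txt contents, invented for this example.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/db.sql  # hypothetical path
Allow: /public/
"""

print(disallowed_paths(SAMPLE))
```

Nothing in the file stops a client from requesting those paths. If the server hands them out without authentication, the Disallow lines have only told the crawler where to go first.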




on 11 Sep 2008 at 7:14 am
Nice Post
on 11 Sep 2008 at 11:53 am
Thanks Anish