Sep 10 2008

Robots.txt is not a security measure

By Niyaz PK under Security

I am increasingly coming across people who think a robots.txt file can be used to keep search engine crawlers away from sensitive data on their websites. Seriously.

This is just plain wrong. A robots.txt file is meant for excluding unwanted, redundant or useless data. It is only a request that well-behaved crawlers choose to honor, so an entry in it cannot protect your sensitive data from getting out. Sensitive data should not be left open on your website in the first place.
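To make the point concrete, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the example.com URLs and both function names are made up for illustration. It shows that the rules only matter to a crawler that bothers to ask.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def polite_crawler_may_fetch(robots_txt: str, url: str, agent: str = "*") -> bool:
    """A well-behaved crawler consults robots.txt before requesting a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def rude_crawler_may_fetch(robots_txt: str, url: str) -> bool:
    """A misbehaving crawler never reads the rules, so nothing stops it."""
    return True


private_url = "http://example.com/private/report.html"
print(polite_crawler_may_fetch(SAMPLE_RULES, private_url))  # polite crawler backs off
print(rude_crawler_may_fetch(SAMPLE_RULES, private_url))    # rude crawler goes ahead
```

Nothing in the file is enforced by the server. If /private/ must stay private, it needs real access control such as authentication.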

There are many malicious crawlers that crawl only the pages blocked by the robots.txt file of each website. I bet a lot of interesting stuff turns up in their search results.
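A rough sketch of why this is so easy: robots.txt is a public file, and pulling the blocked paths out of it takes a few lines of Python. The sample file and the function name `disallowed_paths` are hypothetical, and the parsing is deliberately simplified (it ignores which user agent each rule applies to).

```python
from typing import List

# Hypothetical robots.txt content, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backups/   # nightly dumps
Disallow:
Allow: /public/
"""


def disallowed_paths(robots_txt: str) -> List[str]:
    """Return every non-empty Disallow path, in the order it appears."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow value blocks nothing, so skip it.
        if field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


print(disallowed_paths(SAMPLE_ROBOTS_TXT))  # ['/admin/', '/backups/']
```

In other words, listing a sensitive directory in robots.txt amounts to publishing a map to it.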

2 Responses to “Robots.txt is not a security measure”

  1. Anish K.S
    on 11 Sep 2008 at 7:14 am

    Nice Post :)

  2. Niyaz PK
    on 11 Sep 2008 at 11:53 am

    Thanks Anish
