[issue17403] Robotparser fails to parse some robots.txt

Senthil Kumaran report at bugs.python.org
Mon Apr 22 15:29:16 CEST 2013


Senthil Kumaran added the comment:

My suggestion for this issue is to go ahead with Mher's patch2. It does a simple normalization and does the right thing.

The case in question is an empty query string and the behavior of Allow and Disallow for it, and the patch addresses that. (I don't know why this *bug* was not detected earlier.)
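
To make the empty-query-string case concrete, here is a minimal sketch. It is not the patch itself: the rules and paths are hypothetical, the normalize() helper is only an illustration of round-tripping a rule path through urlparse/urlunparse, and the results assume a Python version that already includes a fix for this issue.

```python
import urllib.robotparser
from urllib.parse import urlparse, urlunparse

# Hypothetical rules: both paths end in an empty query string ("?").
ROBOTS_TXT = """\
User-agent: *
Allow: /some/path?
Disallow: /another/path?
"""


def normalize(path):
    # Illustrative helper (an assumption, not taken from the patch):
    # a parse/unparse round trip drops the trailing empty "?".
    return urlunparse(urlparse(path))


parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# With normalized rule paths, URLs carrying an empty query string
# are matched against the Allow and Disallow lines.
allowed = parser.can_fetch("*", "/some/path?")
blocked = parser.can_fetch("*", "/another/path?")

print(normalize("/another/path?"), allowed, blocked)
```

Without some normalization of this kind, the rule keeps its trailing "?" while the URL being checked loses it, so the Disallow line never matches.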

Robotparser implements the updated specification (www.robotstxt.org/norobots-rfc.txt). You can check the Allow string verification in both the code and the tests.

That said, if the proposal is to update robotparser further to be more compliant with the many cases that the 3rd party modules handle, +1 to that. I suggest that be taken up as a separate issue and not be confused with this bug.

----------
nosy: +orsenthil

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17403>
_______________________________________

