[issue17403] Robotparser fails to parse some robots.txt
Senthil Kumaran
report at bugs.python.org
Mon Apr 22 15:29:16 CEST 2013
Senthil Kumaran added the comment:
My suggestion for this issue is to go ahead with Mher's patch2. It does a simple normalization and does the right thing.
The case in question is an empty query string and the behavior of Allow and Disallow for it, and the patch addresses that. (I don't know why this *bug* was not detected earlier.)
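To make the empty-query-string case concrete, here is a minimal sketch. It assumes a Python 3 release that already contains the fix (where the module lives at urllib.robotparser); the robots.txt content, the paths, and the normalize() helper are illustrative and are not taken from the patch itself. The idea is that round-tripping a rule path through urlparse/urlunparse drops a trailing "?" that carries no query, so the rule and the URL being checked end up in the same form.

```python
import urllib.parse
import urllib.robotparser

# Illustrative robots.txt: both rules end in an empty query string ("?").
ROBOTS_TXT = """\
User-agent: *
Allow: /some/path?
Disallow: /another/path?
"""


def normalize(path):
    # Hypothetical helper: parse and re-assemble the path. An empty
    # query component is dropped, so the trailing "?" disappears.
    return urllib.parse.urlunparse(urllib.parse.urlparse(path))


parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

normalized = normalize("/another/path?")
allowed = parser.can_fetch("*", "/some/path?")
blocked = not parser.can_fetch("*", "/another/path?")

print(normalized, allowed, blocked)
```

On an interpreter without the fix, the Disallow rule would presumably keep its trailing "?" and fail to match the normalized URL, which is the behavior this issue reports.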
Robotparser implements the updated specification (www.robotstxt.org/norobots-rfc.txt). You can verify the handling of Allow in both the code and the tests.
That said, if the proposal is to update robotparser further so that it is more compliant with the many cases that 3rd party modules handle, +1 to that. I suggest that be taken up as a separate issue and not be confused with this bug.
----------
nosy: +orsenthil
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17403>
_______________________________________