[Patches] [Patch #102229] a better robotparser.py module
noreply@sourceforge.net
Thu, 2 Nov 2000 09:40:27 -0800
Patch #102229 has been updated.
Project: python
Category: Modules
Status: Open
Summary: a better robotparser.py module
Follow-Ups:
Date: 2000-Nov-02 09:40
By: calvin
Comment:
I have written a new RobotParser module, 'robotparser2.py'.
This module
o is backward compatible with the old one
o matches user agents correctly (robotparser.py is buggy here)
o strips comments correctly (robotparser.py is buggy here)
o uses httplib instead of urllib.urlopen() to catch HTTP
connect errors correctly (robotparser.py is buggy here)
o implements not only the draft at
http://info.webcrawler.com/mak/projects/robots/norobots.html
but also the newer one at
http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
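
To illustrate the user-agent matching and comment stripping items above, here is a minimal sketch. It is not taken from robotparser2.py: the function names strip_comment and agent_matches are hypothetical, and the matching rule assumed here is the one the norobots draft recommends (a case-insensitive substring match on the robot name without its version information).

```python
def strip_comment(line):
    """Drop everything from the first '#' onward and trim whitespace."""
    return line.split("#", 1)[0].strip()


def agent_matches(record_agent, useragent):
    """Return True if a robots.txt User-agent value applies to useragent.

    Assumption: '*' applies to every robot; any other value is compared
    as a case-insensitive substring of the robot's name, with the
    '/version' suffix removed first.
    """
    record_agent = record_agent.strip().lower()
    if record_agent == "*":
        return True
    # Keep only the name token, e.g. "Googlebot/2.1" -> "googlebot".
    name = useragent.split("/", 1)[0].strip().lower()
    return record_agent != "" and record_agent in name
```

With these helpers, a line such as "Disallow: /tmp/ # temp files" reduces to "Disallow: /tmp/", and a record for "googlebot" applies to a robot identifying itself as "Googlebot/2.1" but not to one named "WebCrawler/3.0".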
Bastian Kleineidam
-------------------------------------------------------
For more info, visit:
http://sourceforge.net/patch/?func=detailpatch&patch_id=102229&group_id=5470