[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

Eduardo A. Bustamante López report at bugs.python.org
Sun Sep 9 23:37:59 CEST 2012


Eduardo A. Bustamante López added the comment:

I'm not sure what the best approach is here.

1. Avoid changes in the Lib, and document a workaround that installs an
   opener with the specific User-agent. The drawback is that it modifies the
   behaviour of urlopen() globally, so the change affects every other call
   to urllib.request.urlopen.

2. Revert to the old way, using an instance of a FancyURLopener (or URLopener)
   in the RobotFileParser class. This requires a modification of the Lib, but
   allows us to modify only the behaviour of that specific instance of
   RobotFileParser. The user could subclass FancyURLopener and set the
   appropriate version string.
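
For comparison, here is a rough sketch of how a per-instance User-agent
could be obtained without touching the Lib and without installing a global
opener. This is not either of the two options above, only an illustration
of the "affect one instance only" goal of option 2. The class name
UARobotFileParser and its user_agent parameter are hypothetical; the error
handling is assumed to mirror what RobotFileParser.read() does.

```python
import urllib.error
import urllib.request
import urllib.robotparser


class UARobotFileParser(urllib.robotparser.RobotFileParser):
    """Hypothetical subclass: fetch robots.txt with a per-instance User-agent."""

    def __init__(self, url='', user_agent='MyUa/0.1'):
        super().__init__(url)
        self.user_agent = user_agent  # assumption: not part of the stdlib API

    def read(self):
        # A header set on the Request takes precedence over the opener's
        # default User-agent, so no global install_opener() is needed.
        request = urllib.request.Request(
            self.url, headers={'User-agent': self.user_agent})
        try:
            with urllib.request.urlopen(request) as response:
                raw = response.read()
        except urllib.error.HTTPError as err:
            # Assumed to match the stock read(): 401/403 block everything,
            # other 4xx responses allow everything.
            if err.code in (401, 403):
                self.disallow_all = True
            elif 400 <= err.code < 500:
                self.allow_all = True
        else:
            self.parse(raw.decode('utf-8').splitlines())


# Usage (needs a reachable server, so it is left commented out):
# rp = UARobotFileParser('http://localhost:9999', user_agent='MyUa/0.1')
# rp.read()
```

Other callers of urllib.request.urlopen keep the default User-agent, since
only this subclass builds its own Request.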

I attach a script, tested against the ``default`` branch of the Mercurial
repository. It shows the workaround for Python 3.3.

----------
Added file: http://bugs.python.org/file27158/test.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15851>
_______________________________________
-------------- next part --------------
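import urllib.request
import urllib.robotparser

# Build an opener that sends a custom User-agent header.
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'MyUa/0.1')]
# Install it globally: every later urllib.request.urlopen() call uses it.
urllib.request.install_opener(opener)

# RobotFileParser.read() fetches through urlopen(), so it now sends MyUa/0.1.
rp = urllib.robotparser.RobotFileParser('http://localhost:9999')
rp.read()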

