Why doesn't Python's "robotparser" like Wikipedia's "robots.txt" file?

John Nagle nagle at animats.com
Tue Oct 2 11:00:19 EDT 2007


Lawrence D'Oliveiro wrote:
> In message <HYiMi.9932$JD.6615 at newssvr21.news.prodigy.net>, John Nagle
> wrote:
> 
>>     For some reason, Python's parser for "robots.txt" files
>> doesn't like Wikipedia's "robots.txt" file:
>>
>>  >>> import robotparser
>>  >>> url = 'http://wikipedia.org/robots.txt'
>>  >>> chk = robotparser.RobotFileParser()
>>  >>> chk.set_url(url)
>>  >>> chk.read()
>>  >>> testurl = 'http://wikipedia.org'
>>  >>> chk.can_fetch('Mozilla', testurl)
>> False
>>  >>>
> 
>     >>> chk.errcode
>     403
> 
> Significant?
> 
    Helpful.  That 403 on the "robots.txt" fetch would explain the
False result.  "errcode" is also an undocumented feature.  See

	http://docs.python.org/lib/module-robotparser.html
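
A minimal sketch of why a 403 leads to "False", assuming the parser treats a 401 or 403 on the "robots.txt" fetch as "disallow everything" and other 4xx codes as "allow everything". That is how I understand the standard-library module to behave, not something the documentation above states. The helper name parser_for_response is hypothetical, and no network access is involved; the status code and body are passed in by hand.

```python
try:
    from urllib import robotparser  # Python 3 location of the module
except ImportError:
    import robotparser  # Python 2 name, as used in the session above


def parser_for_response(status, body=""):
    """Hypothetical helper: configure a parser from an already-fetched
    robots.txt response (HTTP status code plus body text).

    Assumption: this mirrors the status handling inside
    RobotFileParser.read(); it is a sketch, not the library's code.
    """
    rp = robotparser.RobotFileParser()
    if status in (401, 403):
        # Access to robots.txt itself was refused: treat all as off limits.
        rp.disallow_all = True
    elif 400 <= status < 500:
        # No usable robots.txt (for example 404): nothing is restricted.
        rp.allow_all = True
    elif status == 200:
        # Normal case: feed the rules to the parser line by line.
        rp.parse(body.splitlines())
    return rp


# The situation from the thread: the server answered 403.
blocked = parser_for_response(403)
print(blocked.can_fetch("Mozilla", "http://wikipedia.org"))  # prints False
```

If that assumption holds, the thing to check after chk.read() is the HTTP status of the "robots.txt" fetch rather than the rules in the file, since the rules are never parsed in the 403 case.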

					John Nagle


