Why doesn't Python's "robotparser" like Wikipedia's "robots.txt" file?
John Nagle
nagle at animats.com
Tue Oct 2 11:00:19 EDT 2007
Lawrence D'Oliveiro wrote:
> In message <HYiMi.9932$JD.6615 at newssvr21.news.prodigy.net>, John Nagle
> wrote:
>
>> For some reason, Python's parser for "robots.txt" files
>> doesn't like Wikipedia's "robots.txt" file:
>>
>> >>> import robotparser
>> >>> url = 'http://wikipedia.org/robots.txt'
>> >>> chk = robotparser.RobotFileParser()
>> >>> chk.set_url(url)
>> >>> chk.read()
>> >>> testurl = 'http://wikipedia.org'
>> >>> chk.can_fetch('Mozilla', testurl)
>> False
>> >>>
>
> >>> chk.errcode
> 403
>
> Significant?
>
That's helpful. It's also an undocumented attribute; "errcode" doesn't appear at
http://docs.python.org/lib/module-robotparser.html
John Nagle
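The 403 explains the False result: when the fetch of robots.txt itself fails with
401/403, robotparser treats the site as disallowed for everyone, so can_fetch()
returns False regardless of what the file actually says. One workaround is to fetch
robots.txt yourself with a browser-like User-Agent header and hand the text to
parse() instead of read(). A minimal sketch of that second step (Python 3, where
the module lives at urllib.robotparser; the robots.txt content below is made up
for illustration and is not Wikipedia's actual file):

```python
import urllib.robotparser

# Hypothetical robots.txt content standing in for what the server would
# serve to a browser. In the thread, read() never got this far because
# the server answered the fetch with a 403.
robots_txt = """\
User-agent: *
Disallow: /w/
Allow: /
"""

rp = urllib.robotparser.RobotFileParser()
# Feed the already-fetched text directly, bypassing read() and its
# default (often blocked) urllib User-Agent.
rp.parse(robots_txt.splitlines())

# With the rules actually parsed, can_fetch() answers from the file's
# contents rather than from a failed HTTP fetch.
print(rp.can_fetch('Mozilla', 'http://wikipedia.org/wiki/Python'))   # True
print(rp.can_fetch('Mozilla', 'http://wikipedia.org/w/index.php'))   # False
```

In modern Python 3, RobotFileParser.read() makes this behavior explicit: an HTTP
401 or 403 sets disallow_all, while a 404 sets allow_all.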