BUG URLLIB2

John Hunter jdhunter at nitace.bsd.uchicago.edu
Thu Nov 14 21:22:57 EST 2002


>>>>> "JohnJacob" == JohnJacob  <JJ at JJ.com> writes:

    JohnJacob> Except that this web site doesn't appear to be
    JohnJacob> forbidden. I can view it just fine with IE6, but when I
    JohnJacob> try it with urllib2, I get the same error.
    JohnJacob> Nonetheless, I have no idea why this is happening.

Good point!  Perhaps the following is informative:

http://www.google.it/robots.txt

User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images

 ... snip ...

So google.it is asking robots to stay out of /search, among other
paths.  I suspect urllib2 will let you override its default behavior
here, for example by sending a different User-Agent header.  Viewer
discretion is advised.
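
For what it's worth, here is a rough sketch of both halves of that:
checking a URL against the rules quoted above, and building a request
that carries its own User-Agent.  It assumes the Python 3 module names
(urllib.request and urllib.robotparser) rather than urllib2, and the
helper names is_allowed and build_request and the agent string are
made up for illustration.  Nothing is sent over the network, and
whether a different User-Agent actually clears the error is not
something this checks.

```python
from urllib import request, robotparser

# Rules copied from the robots.txt excerpt quoted above (the snipped
# remainder of the file is not reproduced here).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
"""


def is_allowed(url, agent="*"):
    """Return True if the quoted rules permit `agent` to fetch `url`."""
    parser = robotparser.RobotFileParser()
    # parse() takes an iterable of lines, so no download is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


def build_request(url, agent):
    """Build (but do not send) a request with a custom User-Agent."""
    return request.Request(url, headers={"User-Agent": agent})


if __name__ == "__main__":
    for url in ("http://www.google.it/",
                "http://www.google.it/search?q=python"):
        print(url, is_allowed(url))

    req = build_request("http://www.google.it/", "my-script/0.1")
    # Request stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen would actually send it;
doing so against a path the site disallows is at your own discretion.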

John Hunter




