Simple Python web proxy stalls for some web sites
Richie Hindle
richie at entrian.com
Fri Oct 8 05:20:29 EDT 2004
[Richie]
> By default, urllib2 specifies "User-Agent: Python-urllib/x.y". Some
> sites, Google included, reject this because they don't like to be
> web-scraped.
[Bryan]
> Google dis' Python? No way!
Way, I'm afraid.
> I checked, and Google is answering in good faith.
Try doing an actual query:
>>> import urllib2
>>> f = urllib2.urlopen("http://www.google.com/") # Works OK
>>> f = urllib2.urlopen("http://www.google.com/search?q=python")
Traceback (most recent call last):
[...]
urllib2.HTTPError: HTTP Error 403: Forbidden
>>>
This is probably not the problem you're facing right now, but it will be
a problem when you solve your current one. 8-)
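
The usual way around the 403 is to send a User-Agent other than the
urllib2 default. Here is a minimal sketch, not a definitive fix: the
helper name make_request and the BROWSER_UA string are made up for
illustration, and whether a given site accepts the header is up to the
site. The import fallback is only there so the same lines also run on
Python 3, where urllib2 became urllib.request.

```python
try:
    import urllib2                    # Python 2, as used in this thread
except ImportError:
    import urllib.request as urllib2  # Python 3 renamed the module

# Hypothetical value: any browser-like string will do for the sketch.
BROWSER_UA = "Mozilla/5.0 (compatible; MyProxy/0.1)"

def make_request(url, user_agent=BROWSER_UA):
    """Build a Request that overrides urllib's default User-Agent header."""
    return urllib2.Request(url, headers={"User-Agent": user_agent})

req = make_request("http://www.google.com/search?q=python")
print(req.get_header("User-agent"))   # Request stores the key as "User-agent"
# f = urllib2.urlopen(req)            # the actual fetch; left out of the sketch
```

In a proxy, it is probably simplest to pass the client's own User-Agent
through rather than invent one.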
--
Richie Hindle
richie at entrian.com