Simple Python web proxy stalls for some web sites

Bryan Olson fakeaddress at nowhere.org
Thu Oct 7 23:34:22 EDT 2004


Richie Hindle wrote:
 > By default, urllib2 specifies "User-Agent: Python-urllib/x.y".  Some
 > sites, Google included, reject this because they don't like to be
 > web-scraped.

Google dis' Python?  No way!

I checked, and Google is answering in good faith.  Some web
sites block unknown user-agents, but only the most evil would
hang the connection.  Google doesn't even block wget.
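
For anyone who wants to repeat the check, here is a minimal sketch.
It assumes Python 3, where urllib2 became urllib.request. The helper
names build_request and default_user_agent and the example.com URL
are hypothetical, not taken from the proxy under discussion. It only
builds the request and inspects headers, so nothing goes over the
network until you pass the result to urllib.request.urlopen.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent a default opener adds to every request."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs; the default opener
    # carries a single "User-agent" entry of the form Python-urllib/x.y.
    return dict(opener.addheaders).get("User-agent")


def build_request(url, user_agent=None):
    """Build a Request, optionally overriding the default User-Agent."""
    req = urllib.request.Request(url)
    if user_agent is not None:
        # Request stores header names capitalized, i.e. "User-agent".
        req.add_header("User-Agent", user_agent)
    return req


if __name__ == "__main__":
    print("default:", default_user_agent())
    # "my-proxy-test/0.1" is a placeholder agent string (assumption).
    req = build_request("http://example.com/", "my-proxy-test/0.1")
    print("override:", req.get_header("User-agent"))
```

To compare how a site answers each agent, fetch once with the
override and once without, passing a timeout to urlopen so a hung
connection shows up as an error rather than a stall.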


--
--Bryan

Full disclosure: I used to work for Google.  I don't now, and
never did have any authority to speak for them.


