mysteries of urllib/urllib2

John Nagle nagle at animats.com
Tue Jul 3 12:36:52 EDT 2007


Adrian Smith wrote:
> I'm trying to use urllib2 to download a page (I'd rather use urllib,
> but I need to change the User-Agent header to look like a browser or
> G**gle won't send it to me, the big meanies). The following (pinched
> from Dive Into Python) seems to work perfectly in Idle, but falls at
> the final hurdle when run as a cgi script - can anyone suggest
> anything I may have overlooked?
> 
> request = urllib2.Request(some_URL)
> request.add_header('User-Agent', 'some_plausible_string')
> opener = urllib2.build_opener()
> data = opener.open(request).read()

    I doubt that's the problem here, but don't use a User-Agent string
that ends in "m" without a preceding "m" when the User-Agent line is
the last line of the request headers.  Coyote Point load balancers
will drop the packet.

    (Coyote Point uses regular expressions to parse HTTP headers, and
I think somebody wrote "\m" where they meant "\n".)
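
    If you want to check what you're actually sending, here is a rough
sketch of building the same request and inspecting the header without
touching the network.  It assumes Python 3, where urllib2's Request lives
in urllib.request; build_request and ends_in_m are hypothetical helper
names, the URL is a placeholder, and the check is deliberately simplified
(it only looks at the final character, not the "preceding m" or
"last header" conditions).

```python
import urllib.request


def build_request(url, user_agent):
    """Build a request carrying a custom User-Agent; nothing is sent."""
    request = urllib.request.Request(url)
    request.add_header("User-Agent", user_agent)
    return request


def ends_in_m(request):
    """Simplified check: does the User-Agent value end in 'm'?"""
    # Request stores header names capitalized, hence "User-agent" here.
    agent = request.get_header("User-agent", "")
    return agent.endswith("m")


# Placeholder URL and agent string, for illustration only.
request = build_request("http://example.com/", "some_plausible_string")
print(request.header_items())
print(ends_in_m(request))
```

    Passing the request to urllib.request.urlopen() would then send it
with that header.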

				John Nagle


