Python Web Servers and Page Retrievers
Max Erickson
maxerickson at gmail.com
Wed Apr 11 20:31:07 EDT 2007
"Collin Stocks" <collinstocks at gmail.com> wrote:
> I tried it, and when checking it using a proxy, saw that it
> didn't really work, at least in the version that I have (urllib
> v1.17 and urllib2 v2.5). It just added that header onto the end,
> so the request ended up with two User-Agent headers, each with a
> different value. I might add that my script IS able to retrieve
> search pages from Google, whereas both urllibs are FORBIDDEN with
> the headers that they use.
>
I don't know enough about either library to argue about it, but here
is what I get following the Dive Into Python example (but hitting
Google for a search):
>>> import urllib2
>>> opener=urllib2.build_opener()
>>> request=urllib2.Request('http://www.google.com/search?q=tesla+battery')
>>> request.add_header('User-Agent','OpenAnything/1.0 +http://diveintopython.org/')
>>> data=opener.open(request).read()
>>> data
'<html><head><meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1"><title>tesla battery - Google Search</title><
[snip rest of results page]
This is with Python 2.5 on Windows.
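
A quick way to look at the duplicate-header question without a proxy is to
inspect the Request object itself. The sketch below is not the 2.5 setup
above: it assumes Python 3, where urllib2 became urllib.request, and it
makes no network calls. The names URL, AGENT and user_agent_values are just
for illustration. It only shows what the library stores on the request, so
it is not proof of what goes over the wire.

```python
# Python 3 sketch (urllib2 became urllib.request); no network access needed.
import urllib.request

URL = 'http://www.google.com/search?q=tesla+battery'
AGENT = 'OpenAnything/1.0 +http://diveintopython.org/'

request = urllib.request.Request(URL)
request.add_header('User-Agent', AGENT)

# Header names are stored capitalized ('User-agent'), so a second
# add_header call with the same name overwrites the first value
# instead of appending another header.
request.add_header('user-agent', AGENT + ' (second call)')

# Collect every User-Agent value currently held by the request.
user_agent_values = [value for name, value in request.header_items()
                     if name.lower() == 'user-agent']
print(user_agent_values)

# The opener's default User-agent lives in opener.addheaders; as far as I
# can tell it is only applied when the request lacks that header.
opener = urllib.request.build_opener()
print(opener.addheaders)
print(request.has_header('User-agent'))
```

If user_agent_values holds a single entry, the request object itself is not
carrying two User-Agent headers; checking through a proxy is still the way
to confirm what is actually sent.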
max