Building browser-like GET request

Gilles Ganault nospam at nospam.com
Sat Apr 21 18:13:29 EDT 2007


On 21 Apr 2007 14:47:55 -0700, Björn Keil <abgrund at silberdrache.net>
wrote:
>Well, I am brand new to Python, so it takes me a lot of guessing, but
>since it seems you're using urllib2:

Thanks. Indeed, it looks like urllib2 is the way to go when going
through a proxy.

For those interested, here's how to download a page through a proxy:

----------------------------
import urllib2

# Set up the proxy handler
proxy_info = {'host': 'localhost', 'port': 8080}
proxy_support = urllib2.ProxyHandler(
    {'http': 'http://%(host)s:%(port)d' % proxy_info})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)

# Request the page with browser-like headers
url = 'http://www.acme.com/cgi-bin/read?code=123'
headers = {
    'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)',
    'Accept': ('text/xml,application/xml,application/xhtml+xml,'
               'text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5'),
    'Accept-Language': 'fr-fr,en-us;q=0.7,en;q=0.3',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
}
# data=None sends a GET; pass an encoded string as data to send a POST
req = urllib2.Request(url, None, headers)

response = urllib2.urlopen(req).read()
log = open('output.html', 'w')
log.write(response)
log.close()
----------------------------
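As the comment notes, passing a non-None data argument turns the request
into a POST. For what it's worth, here is a rough sketch of the same
pattern on Python 3, where urllib2 was split into urllib.request and
urllib.parse; the proxy address, URL, and form field are placeholders
carried over from the example above, and urlopen() is not actually
called since those hosts don't exist:

```python
import urllib.parse
import urllib.request

# Placeholder proxy, mirroring the localhost:8080 setup above
proxy_support = urllib.request.ProxyHandler(
    {'http': 'http://localhost:8080'})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}

# Encode the form fields; a non-None, bytes data argument makes
# this request a POST instead of a GET
form = urllib.parse.urlencode({'code': '123'}).encode('ascii')
req = urllib.request.Request('http://www.acme.com/cgi-bin/read',
                             data=form, headers=headers)

# urllib.request.urlopen(req) would send the POST through the proxy;
# the method is inferred from the presence of data
print(req.get_method())
```

The same Request with data=None would report GET, which is the
behaviour the original urllib2 example relies on.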

Thanks.



More information about the Python-list mailing list