urllib2 multithreading error

Wed Jan 17 11:48:45 EST 2007

viscanti at gmail.com wrote:

> I'm using urllib2 to retrieve some data using HTTP in a multithreaded
> application.
> Here's a piece of code:
> 		req = urllib2.Request(url, txdata, txheaders)
> 		opener = urllib2.build_opener()
> 		opener.addheaders = [('User-agent', user_agent)]
> 		request = opener.open(req)
> 		data = request.read(1024)
>
> I'm trying to read only the first 1024 bytes to retrieve the HTTP headers
> (if it is HTML, then I will retrieve the entire page).

Why so much bother? You can just create the Request, open it, and ask
for the headers:

>>> req = urllib2.Request("http://www.google.com.ar")
>>> u = urllib2.urlopen(req)
>>> u.headers["content-type"]
'text/html'
>>> 

Note that you can pass your headers to the Request, where you put
"txheaders"; it's not necessary to use "addheaders".
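
For example, a minimal sketch of passing the headers straight to the
Request. The URL and the User-agent value are made-up placeholders, and
the try/except import is only an assumption so that the snippet also runs
on Python 3, where urllib2 became urllib.request:

```python
try:
    import urllib2                      # Python 2, as used in this thread
except ImportError:
    import urllib.request as urllib2    # Python 3 equivalent (assumption)

# Placeholder values, only for illustration.
url = "http://www.example.com/"
txheaders = {"User-agent": "my-crawler/0.1"}

# The third argument carries the headers, so opener.addheaders is not needed.
# Passing None as the data makes this a plain GET request.
req = urllib2.Request(url, None, txheaders)

# Nothing has been sent yet; this only inspects the Request object.
print(req.get_header("User-agent"))
```

Building the Request does not touch the network; the request is only sent
when you open it.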

Note also that I'm not reading the page at all: urllib2.urlopen just
sends the request and parses the response headers, and the body is not
read until you call read().
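
Putting both points together, here is a rough sketch of "look at the
headers first, read the page only if it is HTML". The helper name
fetch_if_html is hypothetical (it is not part of urllib2), and the
try/except import is again only there so it also runs on Python 3:

```python
try:
    import urllib2                      # Python 2, as used in this thread
except ImportError:
    import urllib.request as urllib2    # Python 3 equivalent (assumption)


def fetch_if_html(url, txdata=None, txheaders=None):
    """Return the page body if the server reports HTML, else None.

    Hypothetical helper, written only to illustrate the idea.
    """
    req = urllib2.Request(url, txdata, txheaders or {})
    u = urllib2.urlopen(req)
    try:
        # The headers are already parsed here; read() has not been called.
        content_type = u.headers.get("content-type", "")
        # Drop parameters such as "; charset=utf-8" before comparing.
        if content_type.split(";")[0].strip().lower() != "text/html":
            return None
        return u.read()
    finally:
        u.close()
```

You would call it as fetch_if_html(url, txdata, txheaders) with the same
values as in the original code, and get None back for anything that is
not served as text/html.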

Regards,

-- 
.   Facundo
.
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/