httplib slow read
Kjetil Jacobsen
setattr at yahoo.no
Thu Dec 6 05:26:14 EST 2001
John Hunter <jdhunter at nitace.bsd.uchicago.edu> wrote in message news:<m2itbliumd.fsf at mother.paradise.lost>...
> >>>>> "Toby" == Toby Dickenson <tdickenson at devmail.geminidataloggers.co.uk> writes:
>
> Toby> It could well be a problem with your hand-crafted http
> Toby> request. I suggest you go with urllib.
>
> I needed to set some headers, like Referer and Cookie, which is why I
> went with httplib. I sniffed port 80 to find out what was being
> sent by my browser, and then constructed the headers from that info,
> so I think my headers were ok. Can't say for sure.
>
> I suppose the headers can also be set with the urlencode format of
> urllib, so this is probably the way to go; thanks for the suggestion.
> Still curious why the read is so slow with httplib, though.
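For what it's worth, in current Python 3 the custom headers mentioned above (Referer, Cookie) can be attached through urllib.request.Request. The following is only a minimal sketch under that assumption; the helper names build_request and fetch are hypothetical and not part of any library, and the header values are whatever the caller supplies.

```python
import urllib.request


def build_request(url, referer, cookie):
    """Return a Request object carrying custom Referer and Cookie headers."""
    # build_request is a hypothetical helper; nothing is sent over the network here.
    headers = {"Referer": referer, "Cookie": cookie}
    return urllib.request.Request(url, headers=headers)


def fetch(url, referer, cookie, timeout=30):
    """Send the request and return the response body as bytes."""
    req = build_request(url, referer, cookie)
    # urlopen accepts a Request object in place of a plain URL string.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

Calling fetch('http://www.python.org', referer, cookie) with your own header strings would perform the download; build_request on its own only prepares the request, so the headers can be inspected before anything is sent.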
another option may be to use the pycurl module which wraps the
curl library:
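>>> import pycurl
>>> f = open('output', 'wb')  # file to store the document in (binary mode)
>>> c = pycurl.Curl()  # older pycurl releases spelled this pycurl.init()
>>> c.setopt(pycurl.URL, 'http://www.python.org')
>>> c.setopt(pycurl.WRITEDATA, f)  # older releases used pycurl.FILE
>>> c.perform()
>>> c.close()  # release the curl handle
>>> f.close()  # flush the downloaded document to disk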
pycurl is pretty efficient and, in my experience, faster than
httplib and urllib, particularly when multiple Python threads are
downloading documents concurrently.
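To illustrate the multi-threaded case, here is a rough sketch, assuming a recent pycurl where handles are created with pycurl.Curl(). The names fetch_with_pycurl and fetch_all are hypothetical helpers made up for this example, and the thread pool comes from the standard library rather than from pycurl. Each download creates its own curl handle, so no handle is shared between threads.

```python
from concurrent.futures import ThreadPoolExecutor
from io import BytesIO


def fetch_with_pycurl(url):
    """Download one URL with its own curl handle and return the body as bytes."""
    import pycurl  # third-party; imported lazily so fetch_all works without it

    buf = BytesIO()
    c = pycurl.Curl()
    try:
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.WRITEDATA, buf)  # write the response body into buf
        c.perform()
    finally:
        c.close()
    return buf.getvalue()


def fetch_all(urls, fetch=fetch_with_pycurl, max_workers=4):
    """Fetch several URLs concurrently; returns a dict mapping URL to body."""
    # fetch is pluggable (an assumption of this sketch) so another
    # downloader can be swapped in for comparison.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        bodies = list(pool.map(fetch, urls))  # results keep the order of urls
    return dict(zip(urls, bodies))
```

Something like fetch_all(['http://www.python.org', 'http://curl.haxx.se']) would then download both documents in parallel. An exception raised by any single download propagates out of fetch_all, so a real program would probably want per-URL error handling.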
curl: http://curl.haxx.se
pycurl: http://pycurl.sf.net/
regards,
- kjetil