Lost between urllib and httplib

Doug Fort dougfort at downright.com
Sun Jun 10 20:52:20 EDT 2001


Gustaf Liljegren wrote:

> Python 2.1:
> 
> I'm choosing between urllib and httplib, but have trouble with both.
> urllib.urlopen() is hiding HTTP errors (hope this will be fixed in next
> version!), and httplib.HTTP() doesn't seem to be able to access some pages
> if you only supply a domain name.
> 
> Here's the script I use for testing httplib:
> 
> import urlparse
> import httplib
> import sys
> 
> url = sys.argv[1]
> 
> try:
>   h = httplib.HTTP(urlparse.urlparse(url)[1])
>   h.putrequest('GET', urlparse.urlparse(url)[2])
>   h.putheader('Accept', 'text/html')
>   h.endheaders()
> except:
>   print "Host not found."
>   sys.exit()
> 
> print h.getreply()[0]
> 
> On some websites, I get HTTP error 301, 302 or even 404. Try for example:
> 
> http://www.webstandards.org (301)
> http://www.native-instruments.com (302)
> http://www.chaos.com (404)
> 
> If using urllib however, these places are all accessed successfully. What
> is it in urllib.urlopen() that has to be added if using httplib?
> 
> Regards,
> 
> Gustaf Liljegren
> 

HTTP status codes 301 and 302 are redirects.  With httplib you have to
handle them yourself: read the Location header from the reply and send a
new request to that URL.  The good news is that httplib gives you the
flexibility to handle them however you want.  I have attached one of our
HTTP clients that uses httplib.  It handles all three sites you mentioned.
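
For what it's worth, here is a minimal sketch of the manual redirect handling described above.  It is not the attached rawhtmlpage.py.  It assumes Python 3, where httplib is called http.client and urlparse is urllib.parse, and the names request_target, next_url, fetch_status and the five-hop limit are made up for illustration.

```python
# Python 3 sketch: http.client is the renamed httplib,
# urllib.parse is the renamed urlparse.
import http.client
from urllib.parse import urlsplit, urljoin

REDIRECT_CODES = (301, 302)
MAX_REDIRECTS = 5  # arbitrary limit chosen for this sketch


def request_target(url):
    """Split a URL into (scheme, host, path), using '/' when there is no path."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.netloc, path


def next_url(current_url, status, location):
    """Return the URL to retry for a 301/302 reply, or None for anything else."""
    if status in REDIRECT_CODES and location:
        # Location may be relative, so resolve it against the current URL.
        return urljoin(current_url, location)
    return None


def fetch_status(url):
    """Follow 301/302 replies by hand and return (final_url, status)."""
    for _ in range(MAX_REDIRECTS + 1):
        scheme, host, path = request_target(url)
        if scheme == "https":
            conn = http.client.HTTPSConnection(host, timeout=10)
        else:
            conn = http.client.HTTPConnection(host, timeout=10)
        try:
            conn.request("GET", path, headers={"Accept": "text/html"})
            reply = conn.getresponse()
            status = reply.status
            location = reply.getheader("Location")
        finally:
            conn.close()
        redirect = next_url(url, status, location)
        if redirect is None:
            return url, status
        url = redirect
    raise RuntimeError("too many redirects")
```

Calling fetch_status("http://www.webstandards.org") would return the final URL and its status code once the redirects run out.  Two details may matter for the 404 case, though I have not confirmed either as the cause: request_target falls back to "/" when the URL is only a domain name, and http.client sends a Host header on its own, which servers hosting several sites on one address generally need.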



<shameless plug>
This code is used in our website load testing system http://www.stressmy.com
</shameless plug>
-- 
Doug Fort <dougfort at downright.com>
Senior Meat Manager
Downright Software LLC
http://www.downright.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rawhtmlpage.py
Type: text/x-java
Size: 14233 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20010611/ed149af9/attachment.java>

