[ python-Bugs-874842 ] httplib fails on Akamai URLs
SourceForge.net
noreply at sourceforge.net
Mon Apr 12 16:32:47 EDT 2004
Bugs item #874842, was opened at 2004-01-11 06:16
Message generated for change (Comment added) made by gvanrossum
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=874842&group_id=5470
Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Leif Hedstrom (zwoop)
Assigned to: Nobody/Anonymous (nobody)
Summary: httplib fails on Akamai URLs
Initial Comment:
Using Python 2.3.2 and httplib, reading from Akamai
URLs will always hang at the end of the transaction.
As common as this must be, I couldn't find anything
related to it on any search engines, nor on the bug
list here.
The problem is that Akamai returns an HTTP/1.0
response, with a header like:
Connection: keep-alive
httplib does not recognize this response properly (the
Connection: header parsing is only done for HTTP/1.1
responses). I'm not sure exactly what the right
solution is, but I'm supplying one alternative solution
that does solve the problem. I'm attaching a diff
against httplib.py.
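To illustrate the decision in question, here is a minimal sketch. It is not the attached diff and not httplib's actual code; the function name body_ends_at_close and the plain-dict header representation are assumptions made for this example. It treats an explicit "Connection: keep-alive" on an HTTP/1.0 response as a persistent connection instead of ignoring the header:

```python
def body_ends_at_close(version, headers):
    """Decide whether the body runs until the server closes the socket.

    version -- 10 for HTTP/1.0, 11 for HTTP/1.1 (hypothetical encoding)
    headers -- dict of response headers with lower-case names (assumption)
    """
    conn = headers.get("connection", "").lower()

    if version == 11:
        # HTTP/1.1 connections are persistent unless explicitly closed.
        return "close" in conn

    # HTTP/1.0: honour an explicit keep-alive token in Connection,
    # which is what the Akamai responses described above carry.
    if "keep-alive" in conn:
        return False

    # A Keep-Alive header also signals a persistent connection.
    if headers.get("keep-alive"):
        return False

    # Otherwise assume the server closes the socket after the body.
    return True
```

With logic along these lines, an HTTP/1.0 response that announces keep-alive is no longer read until end-of-file, so the client does not wait for a close that the server is not going to send promptly.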
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum)
Date: 2004-04-12 16:32
Message:
Logged In: YES
user_id=6380
Hmm... Indeed. read() checks will_close and apparently
setting that to False will do the right thing.
I don't know HTTP and this code well enough to approve this
fix though. Also, the comment right above your patch should
probably be fixed; it claims that connection headers on
HTTP/1.0 are due to confused proxies. (Maybe that's what
Akamai servers are? :-)
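A rough sketch of why that flag matters, assuming a simplified reader rather than the real HTTPResponse.read(): the name read_body, its parameters, and the use of io.BytesIO as a stand-in for the socket file are all hypothetical.

```python
import io


def read_body(fp, length, will_close):
    """Read a response body from the file-like object fp.

    length     -- value of Content-Length, or None if absent
    will_close -- True if the server is expected to close the socket
    """
    if will_close or length is None:
        # Read until end-of-file. On a real socket this blocks until
        # the server closes the connection, which a keep-alive server
        # will not do right away.
        return fp.read()
    # Read only the announced number of bytes and stop there.
    # (A real socket may need several reads; this sketch ignores that.)
    return fp.read(length)


# The stream holds a 5-byte body followed by unrelated trailing bytes.
stream = b"hello" + b"<more data on the same connection>"
bounded = read_body(io.BytesIO(stream), 5, False)
unbounded = read_body(io.BytesIO(stream), 5, True)
```

In the bounded case the reader returns as soon as the announced bytes have arrived; in the unbounded case it keeps going until the stream ends, which on a live keep-alive connection means waiting for the server's idle timeout.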
----------------------------------------------------------------------
Comment By: Leif Hedstrom (zwoop)
Date: 2004-04-12 16:13
Message:
Logged In: YES
user_id=480913
Yeah, that works for me too. But the problem is in the
HTTPResponse class from the httplib.py module. For example,
this code (butchered from my application) will hang on
Akamai URLs:
#!/usr/bin/python
import httplib
import logging
import socket

# Stand-in for the application's self._log, which does not exist here.
log = logging.getLogger("testHTTPlib")

def testHTTPlib(host, url):
    http = httplib.HTTPConnection(host)
    try:
        http.request('GET', url)
        response = http.getresponse()
    # socket.timeout is a subclass of socket.error, so catch it first.
    except socket.timeout:
        log.warning("Timeout connecting to %s", url)
        return False
    except socket.error:
        log.error("Socket error retrieving %s", url)
        return False
    except IOError:
        log.warning("Can't connect to %s", url)
        return False
    try:
        # read() is where this stalls on Akamai URLs.
        response.read()
        return True
    except socket.timeout:
        log.warning("Timeout reading from %s", url)
        return False

print testHTTPlib("www.ogre.com", "/")
print testHTTPlib("www.akamai.com", "/")
Granted, I think Akamai aren't strictly following the
protocols, but it's inconvenient that this piece of code
stalls here (and only for akamai.com domains, I've tried a
lot of them).
Thanks!
-- Leif
----------------------------------------------------------------------
Comment By: Guido van Rossum (gvanrossum)
Date: 2004-04-12 15:36
Message:
Logged In: YES
user_id=6380
Can you give a complete program that reproduces this? I've
tried this:
>>> import urllib
>>> urllib.urlopen("http://www.akamai.com").read()
and it doesn't hang for me. I tried a number of Python
versions from 2.2 through 2.4a0.
----------------------------------------------------------------------
Comment By: Leif Hedstrom (zwoop)
Date: 2004-01-11 14:37
Message:
Logged In: YES
user_id=480913
Oh, I forgot, this is most easily reproduced by simply
requesting the URL
http://www.akamai.com/
Fortunately they Akamai their home page as well. :-)
----------------------------------------------------------------------