[New-bugs-announce] [issue14044] IncompleteRead error with urllib2 or urllib.request -- fine with urllib, wget, or curl

Alex Quinn report at bugs.python.org
Fri Feb 17 18:36:17 CET 2012


New submission from Alex Quinn <aq2009 at alexquinn.org>:

When accessing this URL, both urllib2 (Py2) and urlib.client (Py3) raise an IncompleteRead error.
http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199

Previous discussions about similar errors suggest that this may be due to a problem with the server and chunked data transfer.  (See links below.)  I can't understand what that means.  However, this works fine with urllib (Py2), curl, wget, and all regular web browsers I've tried it with.  Thus, I would have expected urllib2 (Py2) and urllib.request (Py3) to cope with it similarly.

Versions I've tested with:
- Fails with urllib2 + Python 2.5.4, 2.6.1, 2.7.2  (Error messages vary.)
- Fails with urllib.request + Python 3.1.2, 3.2.2
- Succeeds with urllib + Python 2.5.4, 2.6.1, 2.7.2
- Succeeds with wget 1.11.1
- Succeeds with curl 7.15.5

___________________________________________________________
TEST CASES

# FAILS - Python 2.7, 2.6, 2.5
import urllib2
url = "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199"
xml_str = urllib2.urlopen(url).read() # Raises httplib.IncompleteRead

# FAILS - Python 3.2, 3.1
import urllib.request
url = "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199"
xml_str = urllib.request.urlopen(url).read() # Raises http.client.IncompleteRead

# SUCCEEDS - Python 2.7, 2.6, 2.5
import urllib
url = "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199"
xml_str = urllib.urlopen(url).read()
dom = xml.dom.minidom.parseString(xml_str) # Verify XML is complete
print("urllib:  %d bytes received and parsed successfully"%len(xml_str))

# SUCCEEDS - wget
wget -O- "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199" | wc

# SUCCEEDS - curl - prints an error, but returns the full data anyway
curl "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199" | wc

___________________________________________________________
RELATED DISCUSSIONS

http://www.gossamer-threads.com/lists/python/python/847985
http://bugs.python.org/issue11463  (closed)
http://bugs.python.org/issue6785   (closed)
http://bugs.python.org/issue6312   (closed)

----------
components: Library (Lib)
messages: 153581
nosy: Alex Quinn
priority: normal
severity: normal
status: open
title: IncompleteRead error with urllib2 or urllib.request -- fine with urllib, wget, or curl
versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14044>
_______________________________________


More information about the New-bugs-announce mailing list