[ python-Bugs-767111 ] AttributeError thrown by urllib.open_http

SourceForge.net noreply at sourceforge.net
Thu Mar 18 18:43:37 EST 2004


Bugs item #767111, was opened at 2003-07-07 13:52
Message generated for change (Comment added) made by robzed
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=767111&group_id=5470

Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 6
Submitted By: Stuart Bishop (zenzen)
Assigned to: A.M. Kuchling (akuchling)
Summary: AttributeError thrown by urllib.open_http

Initial Comment:
In 2.3b2, it looks like an error condition isn't being picked up 
on line 300 or 301 of urllib.py.

The code that triggered this traceback was simply:
        url = urllib.urlopen(action, data)


Traceback (most recent call last):
  File "autospamrep.py", line 170, in ?
    current_page = handle_spamcop_page(current_page)
  File "autospamrep.py", line 140, in handle_spamcop_page
    url = urllib.urlopen(action, data)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 78, in urlopen
    return opener.open(url, data)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 183, in open
    return getattr(self, name)(url, data)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 308, in open_http
    return self.http_error(url, fp, errcode, errmsg, headers, data)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 323, in http_error
    return self.http_error_default(url, fp, errcode, errmsg, headers)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 551, in http_error_default
    return addinfourl(fp, headers, "http:" + url)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 837, in __init__
    addbase.__init__(self, fp)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 787, in __init__
    self.read = self.fp.read
AttributeError: 'NoneType' object has no attribute 'read'
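
The last frame of this traceback can be reproduced in isolation. What follows is only a minimal sketch, not the urllib source: AddBaseSketch and try_wrap are hypothetical names, and the class copies nothing but the single assignment shown at line 787. It is written in current Python syntax rather than 2.3 syntax. It suggests that handing the wrapper fp=None is enough to raise this exact error, whatever happened earlier in the request.

```python
class AddBaseSketch(object):
    """Hypothetical stand-in for the wrapper class seen in the traceback."""

    def __init__(self, fp):
        self.fp = fp
        # Same assignment as the failing line in the traceback: it
        # dereferences fp immediately, so fp=None fails right here.
        self.read = self.fp.read


def try_wrap(fp):
    """Return the AttributeError message raised while wrapping fp, or None."""
    try:
        AddBaseSketch(fp)
    except AttributeError as exc:
        return str(exc)
    return None
```

Calling try_wrap(None) returns the error text, while any object with a read attribute wraps without complaint.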

----------------------------------------------------------------------

Comment By: Rob Probin (robzed)
Date: 2004-03-18 23:43

Message:
Logged In: YES 
user_id=1000470

The file pointer (fp) inside urllib is None, and it comes from httplib. This 
appears to be caused by a BadStatusLine exception in getreply() (line 1016 
of httplib). 

getreply() sets self.file to self._conn.sock.makefile('rb', 0), then calls 
self.close(), which sets self.file to None. 

Being new to this piece of code, I'm not sure whether the fault is urllib 
assuming the file isn't going to be closed, or the BadStatusLine handling 
clearing the file. It certainly looks as if open_http() in urllib does not 
trap the error code -1 returned by h.getreply(), and assumes that the file 
still exists even in an error condition.
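
To make that sequence concrete, here is a small simulation of it. This is a sketch under assumptions, not the httplib code: LegacyHTTPSketch, its status_line_ok flag, and fetch_state are all hypothetical, and a plain object() stands in for the socket file. It only mirrors the order of events described above (assign the file, hit the bad status line, close, report -1).

```python
class LegacyHTTPSketch(object):
    """Hypothetical stand-in for the backwards-compatible HTTP class."""

    def __init__(self, status_line_ok):
        # Assumption: a flag simulates whether the server sent a valid
        # status line; the real class reads it from the socket.
        self.status_line_ok = status_line_ok
        self.file = None

    def close(self):
        # Closing drops the file reference.
        self.file = None

    def getreply(self):
        self.file = object()  # placeholder for sock.makefile('rb', 0)
        if not self.status_line_ok:
            # Simulated BadStatusLine path: close, then report -1.
            self.close()
            return -1, "", None
        return 200, "OK", {}

    def getfile(self):
        return self.file


def fetch_state(status_line_ok):
    """Mimic the caller: ask for the reply, then for the file."""
    h = LegacyHTTPSketch(status_line_ok)
    errcode, errmsg, headers = h.getreply()
    fp = h.getfile()
    return errcode, errmsg, headers, fp
```

With status_line_ok=False the caller ends up with errcode -1, errmsg '', headers None and fp None, which is the same combination seen in the post mortem locals.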

It may be a coincidence, but it appears to occur more often when a web 
browser on the same machine is refreshing. 

Regards
Rob


----------------------------------------------------------------------

Comment By: Rob Probin (robzed)
Date: 2004-03-17 22:24

Message:
Logged In: YES 
user_id=1000470

""" 
This comment is a program to reproduce the problem. Sorry it's not an 
attachment - as a relative Sourceforge newbie I have no idea how to 
attach to an existing bug. More notes are available via email if required, 
including all local variables for each function from the post mortem.

As said before, the cause seems to be fp = None. Although the exception is 
raised by 'self.read = self.fp.read', it looks like the None comes from 
'fp = h.getfile()' inside open_http().

This is repeatable, but you may have to run this more than once. 
(Apologies to noaa.gov).

*** PLEASE: Run only where absolutely necessary for reproduction of 
bug!!! ***

"""

""" Attribute Error test case  - Python 2.3 """

import urllib

url = "http://adds.aviationweather.noaa.gov/metars/index.php"

params = urllib.urlencode({ "station_ids" : "KJFK", 
				"hoursStr" : "most recent only", 
				"std_trans" : "standard", 
				"chk_metars" : "on",
				"submit":"Submit"})

print "test"

for i in range(1, 1000):
	x = urllib.urlopen(url, params)
	string = x.read()
	print i

"""
Local variables for middle level routine...

	class URLopener
	open_http(self, url, data=None)
		args	('User-agent', 'Python-urllib/1.15')
		auth	None
		data	'hoursStr=most+recent+only&station_ids=KJFK&std_trans=standard&submit=Submit&chk_metars=on'
		errcode	-1
		errmsg	''
		fp	None
		h	<httplib.HTTP instance at 0x507df30>
		headers	None
		host	'adds.aviationweather.noaa.gov'
		httplib	<module 'httplib' from '/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/httplib.pyc'>
		realhost	'adds.aviationweather.noaa.gov'
		selector	'/metars/index.php'
		self	<urllib.FancyURLopener instance at 0x465f3c8>
		url	'//adds.aviationweather.noaa.gov/metars/index.php'
		user_passwd	None
"""


----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-07-29 04:23

Message:
Logged In: YES 
user_id=12800

Please provide a self-contained, complete example that we
can use to reproduce this problem.  Without enough
information, I can't see us fixing this for Python 2.3, and
time for that is rapidly running out.

Lowering to priority 6.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-07-17 05:34

Message:
Logged In: YES 
user_id=46639

I've finally managed to get another traceback with some more 
information, using an assert I inserted into urllib.py a while ago to 
catch .fp == None:

Traceback (most recent call last):
  File "/Users/zen/bin/autospamrep.py", line 168, in ?
    current_page = urllib.urlopen(start_url).read()
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 76, in urlopen
    return opener.open(url)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 181, in open
    return getattr(self, name)(url)
  File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/urllib.py", line 305, in open_http
    assert fp is not None, 'errcode %r, errmsg %r, headers %r' % (errcode, errmsg, headers)
AssertionError: errcode -1, errmsg '', headers None
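
An assert like this is fine for diagnosis, but the same check could be turned into an ordinary error that callers can catch. The following is a sketch of one possible guard, not a patch to urllib: check_reply is a hypothetical helper, the IOError text is made up for illustration, and the syntax is current Python rather than 2.3.

```python
def check_reply(errcode, errmsg, headers, fp):
    """Return fp if the reply looks usable, otherwise raise IOError.

    Hypothetical helper: errcode -1 or a missing file object is treated
    as a protocol failure instead of being passed further down.
    """
    if errcode == -1 or fp is None:
        if fp is not None:
            # Release the file if one was handed over despite the error.
            fp.close()
        raise IOError(
            "http protocol error: errcode %r, errmsg %r, headers %r"
            % (errcode, errmsg, headers)
        )
    return fp
```

A caller would then see an IOError carrying the same three values as the assertion message, rather than an AttributeError several frames later.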


----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2003-07-12 14:14

Message:
Logged In: YES 
user_id=261020

HTTPResponse.read returns '' if its .fp is None, but the 
backwards-compat HTTP class' .getfile() just returns self.file, 
which it previously grabbed from HTTPResponse in .getreply(). 
 
Wild guess: maybe HTTP.getreply should just do 
 
self.file = response 
 
rather than 
 
self.file = response.fp 
 
The object returned by HTTP.getfile() was documented as 
supporting .readline() and .readlines(), while HTTPResponse 
only supports .read(), so that's obviously not the whole 
solution. 
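
To illustrate the gap, here is a rough sketch of a response-like object that keeps the "empty result when .fp is None" behaviour and also offers .readline() and .readlines(). Everything in it is an assumption for illustration: ResponseSketch is a hypothetical class, not HTTPResponse, and it is written in current Python, so it returns bytes rather than str.

```python
import io


class ResponseSketch(object):
    """Hypothetical response wrapper that tolerates a missing fp."""

    def __init__(self, fp):
        self.fp = fp

    def read(self):
        # Mirror the behaviour described above: empty result if fp is None.
        if self.fp is None:
            return b""
        return self.fp.read()

    def readline(self):
        if self.fp is None:
            return b""
        return self.fp.readline()

    def readlines(self):
        if self.fp is None:
            return []
        return self.fp.readlines()


# Usage: a closed-out response yields empty results instead of raising.
empty = ResponseSketch(None)
live = ResponseSketch(io.BytesIO(b"first\nsecond\n"))
```

Handing out an object of this shape, instead of the raw .fp, would mean a caller never sees None, at the cost of duplicating the file interface.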
 

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-07-09 06:50

Message:
Logged In: YES 
user_id=80475

What were the values of 'action' and 'data' when the call was 
made?

----------------------------------------------------------------------
