[Python-bugs-list] [ python-Bugs-511073 ] urllib problems
noreply@sourceforge.net
Tue, 05 Feb 2002 16:34:18 -0800
Bugs item #511073, was opened at 2002-01-30 23:25
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=511073&group_id=5470
Category: Macintosh
Group: Python 2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Yair Benita (ybenita)
Assigned to: Jack Jansen (jackjansen)
Summary: urllib problems
Initial Comment:
When I use urllib.urlopen("url") and then read
the file with handle.read(), I get only parts of pages.
It works for short web pages, but when I use it to
download large pages the result always comes back too short. To
me it looks as if it tries to read the file before it has
finished downloading. Jack Jansen said: MacPython may
do short reads on sockets. I've always maintained
that this was correct (which reasoning was quietly
accepted by everyone here), but last year I finally
admitted that it may actually be incorrect (which
was again quietly accepted:-)
example:
x=urllib.urlopen("http://www.ebi.ac.uk/cgi-bin/emblfetch?db=embl&format=fasta&style=raw&id=AB002378")
print x.read()
Compare the file downloaded by any web browser
with the file from MacPython.
----------------------------------------------------------------------
>Comment By: Jack Jansen (jackjansen)
Date: 2002-02-05 16:34
Message:
Logged In: YES
user_id=45365
I probably found the cause for this, now the only task remaining is finding out who to blame:-)
httplib explicitly sets non-buffering I/O on the file corresponding to the socket, by calling
self.fp = socket.makefile("rb", 0).
MSL, the CodeWarrior I/O library, has an optimization (or bug:-) that if you fread() from a binary
file with buffering turned off it will call the underlying read() straight away.
Python's fileobject.c file_read() reacts to a short fread() return value by returning the partial data instead of calling fread() again.
One of these three is wrong, apparently.
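A minimal sketch of a caller-side workaround for the short reads described above, assuming only that read() returns an empty string at EOF: keep calling read() until it does. The names read_all and ShortReader are hypothetical; ShortReader merely simulates a file object that hands back at most 3 bytes per call and does not reproduce what MSL or MacPython actually do.

```python
import io


def read_all(fp, chunk_size=8192):
    """Read from fp until EOF, tolerating short reads (hypothetical helper)."""
    chunks = []
    while True:
        data = fp.read(chunk_size)
        if not data:  # an empty result is taken to mean EOF
            break
        chunks.append(data)
    return b"".join(chunks)


class ShortReader(object):
    """Hypothetical stand-in for a file object that does short reads."""

    def __init__(self, payload):
        self._buf = io.BytesIO(payload)

    def read(self, size=-1):
        # Never return more than 3 bytes per call, whatever was requested.
        if size is None or size < 0 or size > 3:
            size = 3
        return self._buf.read(size)


payload = b"x" * 10

# A single read() on the simulated file comes back short.
first = ShortReader(payload).read()

# Looping until EOF recovers the whole payload.
result = read_all(ShortReader(payload))
```

With a handle from urllib.urlopen(), the same loop would be called as read_all(x) in place of x.read(). This only hides the symptom in user code; it does not settle which of the three layers should change.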
----------------------------------------------------------------------