[Patches] [ python-Patches-1062060 ] fix for 1016880 urllib.urlretrieve silently truncates dwnld

Fri Jul 15 10:51:47 CEST 2005

Patches item #1062060, was opened at 2004-11-07 21:15
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1062060&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Library (Lib)
Group: Python 2.4
Status: Open
Resolution: None
Priority: 6
Submitted By: Irmen de Jong (irmen)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: fix for 1016880 urllib.urlretrieve silently truncates dwnld

Initial Comment:
The patch makes urllib.urlretrieve raise an IOError if
the actual download size is different from the expected
size (taken from the content-length header). 
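
The check described above can be sketched roughly as follows. This is not the attached patch, only a minimal illustration in current Python; `copy_and_check` and its parameters are hypothetical names, and any file-like object can stand in for the HTTP response.

```python
import io


def copy_and_check(source, dest, expected_size, block_size=8192):
    """Copy source to dest and compare the byte count with expected_size.

    Hypothetical helper: raises IOError when the number of bytes actually
    read differs from the size announced up front.
    """
    read = 0
    while True:
        block = source.read(block_size)
        if not block:
            break  # end of stream
        dest.write(block)
        read += len(block)
    if read != expected_size:
        raise IOError(
            "retrieval incomplete: got %d out of %d bytes" % (read, expected_size)
        )
    return read
```

With an in-memory stream such as `io.BytesIO(b"x" * 50)` and an expected size of 100, the helper raises IOError instead of silently returning a short copy.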

----------------------------------------------------------------------

>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-07-15 10:51

Message:
Logged In: YES 
user_id=1188172

Attaching new patch which implements Martin's suggestion
(urllib-truncate.diff). Please review.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2005-02-24 22:07

Message:
Logged In: YES 
user_id=21627

I think the patch is essentially right. However, I'm
concerned with losing the data that just got downloaded - I
propose to stick it into the IOError (or, better, subclass
IOError to keep the data, and document where to find it).
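
One way to read this suggestion is sketched below, assuming a hypothetical class name `TruncatedDownloadError` and a hypothetical `content` attribute; the names used in the attached patch may differ.

```python
class TruncatedDownloadError(IOError):
    """Hypothetical IOError subclass that keeps the partial download."""

    def __init__(self, message, content):
        IOError.__init__(self, message)
        # The data received before the shortfall was detected.
        self.content = content


def salvage(exc):
    """Return whatever data the exception carries, or None."""
    return getattr(exc, "content", None)
```

A caller that catches plain IOError keeps working unchanged, while a caller that knows about the subclass can recover the partial data from `exc.content`.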

----------------------------------------------------------------------

Comment By: Irmen de Jong (irmen)
Date: 2004-12-24 15:30

Message:
Logged In: YES 
user_id=129426

Suggested addition to the doc of urllib (liburllib.tex, if
I'm not mistaken):

"""

urlretrieve will raise IOError when it detects that the amount of data
available was less than the expected amount (which is the size reported
by a Content-Length header). This can occur, for example, when the
download is interrupted.

The Content-Length is treated as a lower bound (just like tools such as
wget and Firefox appear to do): if there's more data to read, urlretrieve
reads more data, but if less data is available, it raises IOError.

If no Content-Length header was supplied, urlretrieve cannot check the
size of the data it has downloaded, and just returns it. In this case you
just have to assume that the download was successful.
"""
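
The three cases in the proposed text can be sketched as a small decision function. This is an illustration only; `check_download_size` is a hypothetical name and is not taken from the patch.

```python
def check_download_size(read, content_length):
    """Apply the lower-bound rule to a finished download.

    read           -- number of bytes actually received
    content_length -- value of the Content-Length header, or None if absent

    Returns True if the size was verified, False if it could not be
    checked, and raises IOError if fewer bytes arrived than announced.
    """
    if content_length is None:
        # No header: nothing to compare against, so assume success.
        return False
    expected = int(content_length)
    if read < expected:
        raise IOError(
            "retrieval incomplete: got only %d out of %d bytes" % (read, expected)
        )
    # Equal to or more than announced: accepted under the lower-bound rule.
    return True
```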

----------------------------------------------------------------------

Comment By: Irmen de Jong (irmen)
Date: 2004-12-24 15:10

Message:
Logged In: YES 
user_id=129426

Yes, I'm having trouble building the docs from source, so I
will just add the text that I would like to change in the docs.
When I have some time left (it's Christmas after all :) )
I'll also create a regression test for the new behavior.
In the meantime, the "urllib.patch" may be deleted because
"urllib.patch2" is the correct patch (I seem to be unable to
delete the attachment myself)

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-21 01:00

Message:
Logged In: YES 
user_id=80475

Irmen, please attach the new patch.

If you're having trouble with the docs, that's fine, just
include the text you want changed.

Do include tests with your patch.

----------------------------------------------------------------------

Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2004-12-06 22:48

Message:
Logged In: YES 
user_id=469548

Sorry Irmen, I'm a bit late with this, but now is the time
to get new "features" checked into the trunk. Could you add
a doc patch that explains the behavior as you did in your
previous message and a tests patch? I can check it in then.

----------------------------------------------------------------------

Comment By: Irmen de Jong (irmen)
Date: 2004-11-07 21:54

Message:
Logged In: YES 
user_id=129426

NOTE:
urllib.patch2 may be a bit better. It fixes a misspelling,
and also is more relaxed about a 'wrong' download size.
To be more precise: it treats content-length as a lower
bound (just like wget and Firefox appear to do). So if
there's more data to read, it reads more data, but if less
data is available, it raises an IOError.

----------------------------------------------------------------------
