Unpythonic Python

Rob Williscroft rtw at freenet.co.uk
Wed Aug 25 13:11:51 EDT 2004


David Abrahams wrote in news:uy8k31as1.fsf at boost-consulting.com in
comp.lang.python: 

> Rob Williscroft <rtw at freenet.co.uk> writes:
> 
>> David Abrahams wrote in news:uzn4j2s38.fsf at boost-consulting.com in
>> comp.lang.python: 
>>
>>>> That's not the problem.  I can download the file reliably from
>>>> other machines. 
>>
>> At the same time, using http ?
> 
> I can download the file reliably using IE from my WinXP box.
> 
> I can download the file reliably using urllib from Cygwin Python 2.3.2
> 
> The 2nd element returned by urlretrieve is 

Which version, the one that works or the one that doesn't ?

> 
>   'Date: Wed, 25 Aug 2004 14:50:17 GMT\r\nServer: Apache/2.0.40 (Red
>   Hat Linux)\r\nLast-Modified: Wed, 25 Aug 20 2 GMT\r\nETag:

Something is missing here:

  Last-Modified: Wed, 25 Aug 20 2 GMT

Contrast:

  Wed, 25 Aug 2004 14:50:17 GMT

>   "b63d5b-20ec84b-18057e80"\r\nAccept-Ranges: bytes\r\nContent-Length:
>   34523211\r\nContent-Type: n/x-bzip2\r\nConnection: close\r\n'

34 MB (I got 6 MB)

>>> Trying again with Python 2.3 on Cygwin.
> 
> As you can see from the above, it works.  Is there a known urllib bug
> in earlier Pythons?

Sorry, I don't know, but I've seen the same truncation with no Python
and no Unix involved.
 
>> Is it possible the file is being (re) uploaded (via cvs) during your 
>> cron job's download, thus truncating your download ?
> 
> I don't think so.

Can you test whether or not this is happening? I.e. if you don't
get the full 34523211 bytes, re-download and compare the
Content-Length, ETag and Last-Modified above.
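
A rough sketch of that check, written for present-day Python 3 (the
script further down is Python 2). The helper names is_truncated and
changed_headers are made up for illustration; the header values are the
ones quoted in this thread, and the second ETag is a placeholder.

```python
import os

# Headers that identify one particular version of the file on the server.
VERSION_HEADERS = ("Content-Length", "ETag", "Last-Modified")


def is_truncated(path, headers):
    """True if the file on disk is shorter than the advertised Content-Length."""
    expected = int(headers["Content-Length"])
    return os.path.getsize(path) < expected


def changed_headers(first, second):
    """Return the version headers whose values differ between two responses."""
    return [name for name in VERSION_HEADERS
            if first.get(name) != second.get(name)]


# Header values as quoted in this thread.
first = {
    "Content-Length": "34523211",
    "ETag": '"b63d5b-20ec84b-18057e80"',
    "Last-Modified": "Wed, 25 Aug 2004 14:14:02 GMT",
}
# Hypothetical second response in which only the ETag has changed.
second = dict(first, ETag='"placeholder-etag"')

print(changed_headers(first, second))  # ['ETag']
```

If any of the three differ between the short download and the retry,
the file was most likely replaced on the server in between.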

> 
>> Perhaps you should change to cvs:
>>
>>   os.system( 'cvs ... ' )
> 
> The problem with that is that I want to capture the whole CVS
> history, not just today's state.

I was suggesting you get the tarball via cvs, though presumably
SourceForge doesn't give you the option. HTTP has the problem that
the server will just truncate the download if the source file
gets replaced.

> 
>> FWIW, I tried downloading with IE using the link above I got a
>> truncated 6 and bit MB's (16:15 BST (UTC +0100)).
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Sorry, what does that mean?  Did it show that message in a dialog,
> or...?
> 

No, I got a "download complete", but the file was only 6 MB and
bzip2 -t told me it was truncated. The (16:15 ...) is the time I tried
downloading; BST = British Summer Time, though you wouldn't know it
from the weather :).
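
The same integrity test can be approximated from Python, which may be
handy inside a cron script. This is only a sketch for Python 3 using the
standard bz2 module; bz2_is_complete is a made-up name, and unlike
bzip2 -t it only looks at the first stream in the file.

```python
import bz2


def bz2_is_complete(path, chunk_size=1 << 16):
    """Return True if the first bzip2 stream in the file decompresses fully."""
    decompressor = bz2.BZ2Decompressor()
    try:
        with open(path, "rb") as f:
            while not decompressor.eof:
                chunk = f.read(chunk_size)
                if not chunk:
                    # Ran out of data before the end-of-stream marker:
                    # this is what a truncated download looks like.
                    return False
                decompressor.decompress(chunk)
    except OSError:
        # The decompressor rejects data that is not a valid bzip2 stream.
        return False
    return True
```

Usage would be something like bz2_is_complete('boost-cvsroot.tar.bz2')
straight after the download, retrying when it returns False.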

Further, I just ran:

import urllib

filename, headers = \
    urllib.urlretrieve(
        'http://cvs.sourceforge.net/cvstarballs/boost-cvsroot.tar.bz2', 
        'boost-cvsroot.tar.bz2')

print filename

print headers

boost-cvsroot.tar.bz2
Date: Wed, 25 Aug 2004 16:53:20 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Wed, 25 Aug 2004 14:14:02 GMT
ETag: "b63d5b-20ec84b-18057e80"
Accept-Ranges: bytes
Content-Length: 34523211
Content-Type: application/x-bzip2
Connection: close

The script ended at 17:59 BST. Note the difference between the two
times in the headers, suggesting the file was modified about 1:45
(h:min) ago, about the same time my attempted download with IE failed.
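
For reference, the gap between the Date and Last-Modified headers can be
worked out from the header values themselves rather than from the local
clock. A small sketch in Python 3, using email.utils from the standard
library and the two values printed above:

```python
from email.utils import parsedate_to_datetime

# The two timestamps from the response headers printed above.
date = parsedate_to_datetime("Wed, 25 Aug 2004 16:53:20 GMT")
last_modified = parsedate_to_datetime("Wed, 25 Aug 2004 14:14:02 GMT")

# Age of the file on the server at the moment of the request.
age = date - last_modified
print(age)  # 2:39:18
```

Both values are in GMT, so add an hour to compare them with the BST
times mentioned elsewhere in this thread.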

Rob.
-- 
http://www.victim-prime.dsl.pipex.com/


