corrupt download with urllib2

Tue Nov 10 08:59:45 EST 2015

Peter Otten <__peter__ at web.de> wrote:
> Ulli Horlacher wrote:
> 
> >     if u.getcode() == 200:
> >       print(u.read(),file=szo,end='')
> >       szo.close()
> >     else:
> >       die('cannot get %s - server reply: %d' % (szurl,u.getcode()))
> 
> More random remarks:

Always welcome - I am here to learn :-)

> - print() gives the impression that you are dealing with text, and using it
>   with binary strings will produce surprising results when you migrate to
>   Python 3:
> 
> Python 2:
> 
> >>> from __future__ import print_function

I already have this in my code, to make a later transition to Python 3
easier.

> >>> print(b"foo")
> foo
> 
> Python 3:
> 
> >>> print(b"foo")
> b'foo'

Bad.
Is there a better alternative to write arbitrary binary data?
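
One possible approach, sketched here on the assumption that the data is
already a bytes object: open the file in binary mode and call write()
directly, which behaves the same on Python 2 and Python 3. The helper name
save_bytes and the temporary path are made up for illustration.

```python
import os
import tempfile

def save_bytes(path, data):
    # "wb" opens the file in binary mode; write() stores the bytes unchanged,
    # with no repr() formatting and no trailing newline.
    with open(path, "wb") as f:
        f.write(data)

# Hypothetical demo path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
save_bytes(path, b"foo\x00\xff")

# Read the file back to check that the bytes were stored as-is.
with open(path, "rb") as f:
    content = f.read()
```

Unlike print(), write() adds no line ending, so the end='' workaround is not
needed.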

> - with open(...) ensures that the file is closed when an exception occurs.
>   It doesn't matter here as your script is going to die() anyway, but using
>   with is a good habit to get into.

When an error occurs I do not want to write any more data, anyway.
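
For reference, a minimal sketch of what the with statement does when an
exception is raised inside the block. The RuntimeError is a simulated
failure chosen for illustration and the path is a throwaway temporary file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "partial.bin")

try:
    with open(path, "wb") as f:
        f.write(b"partial")
        # Simulated failure in the middle of the download.
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# The with block closed the file on the way out, even though it raised.
closed_after_error = f.closed
```

The data written before the failure is flushed to disk when the file is
closed, so a partial file may remain and has to be removed separately if it
is not wanted.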

> - consider shutil.copyfileobj to limit memory usage when dealing with data
>   of arbitrary size.
> 
> Putting it together:
> 
>     with open(sz, "wb") as szo:
>         shutil.copyfileobj(u, szo)

This writes the HTTP stream as binary data to the file, without my having
to handle it manually chunk by chunk?
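
As a rough sketch of what that means: shutil.copyfileobj reads from the
source in bounded chunks and writes each one to the destination, so only one
chunk is held in memory at a time. The manual loop below is an approximation
of that behaviour, not the library's actual source. The function name
copy_in_chunks, the 64 KiB chunk size, and the io.BytesIO object standing in
for the HTTP response are all assumptions made for illustration.

```python
import io
import shutil

def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    # Read a bounded chunk, write it, and repeat until read() returns
    # an empty bytes object, which signals end of stream.
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)

# Stand-in for the HTTP response body; any object with read() works.
payload = b"x" * 200000

manual = io.BytesIO()
library = io.BytesIO()

copy_in_chunks(io.BytesIO(payload), manual)

# The optional third argument of copyfileobj is the chunk length.
shutil.copyfileobj(io.BytesIO(payload), library, 64 * 1024)
```

Both destinations end up with identical content, so the library call can
replace the hand-written loop.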

Great. This would have been my next task! You are answering my questions
before I ask them! :-)

Background: I am rewriting my Perl program fexsend in Python.
fexsend transfers files up to TB range, see:
http://fex.rus.uni-stuttgart.de/

-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum IZUS/TIK         E-Mail: horlacher at tik.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/