Automatically resume a download w/ urllib?

Mark Rowe mark21rowe at yahoo.com
Wed Oct 24 03:43:53 EDT 2001


Hi,

Your code certainly fits better in an object-oriented system.  The only
small problem I saw is the capitalization of the headers that you send.
The HTTP RFC (RFC 2616) writes header names in a particular case.  Field
names are formally case-insensitive, but in some cases an unexpected case
may affect a server's parsing of the message.  Therefore the only changes
to your code would be to capitalize the 'Range' header that you send and
the 'Content-Length' header that you retrieve.
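
As a rough illustration of the capitalization point, here is a minimal
sketch in modern Python 3 (urllib.request rather than the 2001-era
urllib.FancyURLopener used in the quoted code).  The helper names
build_resume_request and remaining_length are hypothetical, and the URL
in the usage note is only a placeholder.

```python
import urllib.request


def build_resume_request(url, exist_size):
    """Build a request that asks only for the bytes we do not have yet.

    exist_size is the number of bytes already on disk (an assumption of
    this sketch; the caller would normally get it from os.path.getsize).
    """
    request = urllib.request.Request(url)
    if exist_size > 0:
        # Header name written in the same case the RFC uses.
        request.add_header("Range", "bytes=%d-" % exist_size)
    return request


def remaining_length(headers):
    """Return the Content-Length as an int, or None if it is absent."""
    value = headers.get("Content-Length")
    return int(value) if value is not None else None
```

Something like build_resume_request("http://example.com/testfile", 1024)
would then carry "Range: bytes=1024-", and the object returned by
urllib.request.urlopen exposes a headers attribute that can be passed to
remaining_length.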

Mark


"Chris Moffitt" <chris_moffitt at yahoo.com> wrote in message
news:OHpB7.15055$CN5.1164149 at typhoon.mn.mediaone.net...
> After Oleg pointed me in the right direction, I managed to put something
> together too.  Too bad someone else beat me to the punch!  Still, the
> solution I came up with is a little different.  It just needs to override
> an error handler in urllib.FancyURLopener.  Let me know what you think.
>
> import urllib, os
>
> class myURLOpener(urllib.FancyURLopener):
>     """Create a subclass in order to override error 206.  This error
>        means a partial file is being sent, which is OK in this case.
>        Do nothing with this error.
>     """
>     def http_error_206(self, url, fp, errcode, errmsg, headers, data=None):
>         pass
> loop = 1
> dlFile = "testfile"
> existSize = 0
> myUrlclass = myURLOpener()
> if os.path.exists(dlFile):
>     outputFile = open(dlFile,"ab")
>     existSize = os.path.getsize(dlFile)
>     #If the file exists, then only download the remainder
>     myUrlclass.addheader("range","bytes=%s-" % (existSize))
> else:
>     outputFile = open(dlFile,"wb")
>
> webPage = myUrlclass.open("http://192.168.1.4/%s" % dlFile)
>
> #If the file exists, but we already have the whole thing, don't download again
> if int(webPage.headers['content-length']) == existSize:
>     loop = 0
>     print "File already downloaded"
>
> numBytes = 0
> while loop:
>     data = webPage.read(8192)
>     if not data:
>         break
>     outputFile.write(data)
>     numBytes = numBytes + len(data)
>
> webPage.close()
> outputFile.close()
>
> for k,v in webPage.headers.items():
>     print k, "=",v
> print "copied", numBytes, "bytes from", webPage.url
>
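
The quoted script treats a 206 response as success and compares
Content-Length with the local file size.  One possible way to spell out
the outcomes of a ranged request is sketched below in modern Python 3.
The helper names parse_content_range and choose_action are hypothetical,
and reading 416 as "already complete" is an assumption: the local file
could also be larger than, or simply different from, the remote one.

```python
import re


def parse_content_range(value):
    """Parse 'bytes 1024-2047/2048' into (first, last, total).

    total is None when the server reports an unknown length as '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError("unrecognised Content-Range: %r" % value)
    first, last, total = match.groups()
    return int(first), int(last), (None if total == "*" else int(total))


def choose_action(status, exist_size):
    """Decide what to do with the local file for a given HTTP status."""
    if status == 206:
        # Partial Content: the server honoured Range, so append.
        return "append"
    if status == 200:
        # The server ignored Range and is sending the whole file.
        return "restart"
    if status == 416 and exist_size > 0:
        # Requested Range Not Satisfiable: assumed here to mean that
        # we already hold every byte the server has.
        return "complete"
    raise ValueError("unexpected HTTP status: %d" % status)
```

With this split, "append" corresponds to opening the file in "ab" mode
and "restart" to "wb", as in the script above.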
>
> "Oleg Broytmann" <phd at phd.pp.ru> wrote in message
> news:mailman.1003831763.10689.python-list at python.org...
> > On Tue, Oct 23, 2001 at 05:31:54PM +1300, Mark Rowe wrote:
> > > After reading some RFCs and tweaking some Apache settings, I managed
> > > to get this working.  I was only able to test it on my local server
> > > and it appears to work fine.  Any comments or improvements, feel free :)
> >
> >    Thank you.
> >
> > [skip]
> > >     if existSize > 0:
> > >         h.putheader('Range', 'bytes=%d-' % (existSize, ))
> >
> >    Looks good.
> >
> > [skip]
> > >         ## HTTP error 416 = Request Range not Satisiable
> >                                                     ^
> >    Typo (-:                                        _|
> >
> > Oleg.
> > --
> >      Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
> >            Programmers don't die, they just GOSUB without RETURN.
> >
>
>




