urllib, urlretrieve method, how to get headers?

Kushal Kumaran kushal.kumaran+python at gmail.com
Fri Jul 1 13:01:42 EDT 2011


On Fri, Jul 1, 2011 at 2:23 PM, Даниил Рыжков <daniil.re at gmail.com> wrote:
> Hello again!
> Another question: urlopen() reads full file's content, but how can I
> get page by small parts?
>

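Set the Range header on the HTTP request.  The format is specified in
RFC 2616, section 14.35:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35
Note that web servers are not *required* to support this header; one
that ignores it replies with status 200 and the full body instead of
206 Partial Content.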

In [10]: req = urllib2.Request('http://cdimage.debian.org/debian-cd/6.0.2.1/amd64/iso-cd/debian-6.0.2.1-amd64-CD-1.iso',
                               headers={'Range': 'bytes=0-499'})

In [11]: f = urllib2.urlopen(req)

In [12]: data = f.read()

In [13]: len(data)
Out[13]: 500

In [14]: print f.headers
Date: Fri, 01 Jul 2011 16:59:39 GMT
Server: Apache/2.2.14 (Unix)
Last-Modified: Sun, 26 Jun 2011 16:54:45 GMT
ETag: "ebff2f-28700000-4a6a04ab27f10"
Accept-Ranges: bytes
Content-Length: 500
Age: 225
Content-Range: bytes 0-499/678428672
Connection: close
Content-Type: application/octet-stream
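
To fetch a whole resource in small parts, the same Range request can be
repeated with a moving window.  The following is only a rough sketch of
that idea, not a tested downloader.  It is written for Python 3, where
urllib2 became urllib.request, and the names build_range_request,
parse_content_range and iter_chunks are made up for this example.  It
assumes the server answers with 206 and a Content-Range header like the
one shown above.

```python
import re
import urllib.request


def build_range_request(url, first, last):
    """Return a Request asking for bytes first..last (inclusive) of url."""
    headers = {'Range': 'bytes=%d-%d' % (first, last)}
    return urllib.request.Request(url, headers=headers)


def parse_content_range(value):
    """Parse 'bytes 0-499/678428672' into (first, last, total).

    total is None when the server sends '*' for an unknown length.
    """
    m = re.match(r'bytes (\d+)-(\d+)/(\d+|\*)$', value.strip())
    if m is None:
        raise ValueError('unrecognised Content-Range: %r' % value)
    first, last, total = m.groups()
    return int(first), int(last), None if total == '*' else int(total)


def iter_chunks(url, chunk_size=500, opener=urllib.request.urlopen):
    """Yield the body of url piece by piece, one Range request per piece.

    opener is a hypothetical hook so the function can be tried without
    network access; by default it is urllib.request.urlopen.
    """
    first = 0
    while True:
        resp = opener(build_range_request(url, first, first + chunk_size - 1))
        try:
            if resp.status != 206:
                # The server ignored Range and is sending the whole body.
                raise IOError('no partial content (status %s)' % resp.status)
            content_range = resp.headers.get('Content-Range')
            if content_range is None:
                raise IOError('206 response without Content-Range')
            data = resp.read()
        finally:
            resp.close()
        _, last, total = parse_content_range(content_range)
        yield data
        first = last + 1
        if total is None:
            # Unknown length: assume a short chunk means the end.
            if len(data) < chunk_size:
                return
        elif first >= total:
            return
```

Something like "for chunk in iter_chunks(url, 64 * 1024): out.write(chunk)"
would then write the file piece by piece.  If the total length is unknown
and happens to be an exact multiple of chunk_size, the last request will
ask for a range past the end and urlopen will raise an HTTPError (416),
which this sketch does not handle.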


-- 
regards,
kushal


