Retrieving the last bit Re: File objects? - under the hood question

Jeremy Bowers jerf at jerf.org
Fri Jan 21 01:52:58 EST 2005


On Thu, 20 Jan 2005 21:06:31 -0800, Eric Pederson wrote:
> Here's the sort of thing (seek, then read) I think I want:
> 
>>>> IDV2=open(("http://musicsite.com/song453.mp3","rb")[:-128])
> 
>>>> song453.tags=IDV2.read()
> 
>>>> len(song453.tags)
> 
> 128
> 
> 
> But it's not a Python problem.  :-(

OK, HTTP. It's true that it isn't a Python problem, but the fact that this
is possible and even easy isn't generally known, since it involves
actually understanding HTTP as more than a protocol that says "give me
this file". :-{

You need to use the Range header in HTTP to request just the end. urllib
doesn't seem to like the Range header: at least in 2.3.4, which I'm using
here, it treats the resulting 206 (Partial Content) response as an error.
I would consider that a bug, since 2xx responses are "success". You can
still do it with httplib:

Python 2.3.4 (#1, Oct 26 2004, 20:13:42) 
[GCC 3.4.2  (Gentoo Linux 3.4.2-r2, ssp-3.4.1-1, pie-8.7.6.5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import httplib
>>> connection = httplib.HTTPConnection("www.jerf.org")
>>> connection.request("GET", "/", headers = {"Range": "bytes=-100"})
>>> response = connection.getresponse()
>>> response.read()
'rch Google -->\r\n\t\t\t\t\t</DIV>\r\n\t\t\t\t\t<P CLASS="Seperator" /> </P>\r\n\t\t\t<div>\r\n\t\t</body>\r\n\t</html>\r\n'
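
For the 128-byte case from the original question, the session above could
be wrapped up roughly like this. This is only a sketch: it is written for
Python 3, where httplib was renamed http.client, and the names
suffix_range, fetch_tail and TAG_SIZE are mine, not part of any library.
It assumes the server honours Range and does nothing about redirects.

```python
import http.client

TAG_SIZE = 128  # number of trailing bytes wanted in the original question


def suffix_range(nbytes):
    """Build a Range header asking for only the last nbytes bytes."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    return {"Range": "bytes=-%d" % nbytes}


def fetch_tail(host, path, nbytes=TAG_SIZE):
    """Return the last nbytes bytes of http://host/path.

    Hypothetical helper: no redirect handling, and any status other
    than 206 (Partial Content) is treated as a failure.
    """
    connection = http.client.HTTPConnection(host)
    try:
        connection.request("GET", path, headers=suffix_range(nbytes))
        response = connection.getresponse()
        if response.status != 206:
            # A 200 here would mean the server ignored Range and is
            # about to send the whole file, so bail out before reading.
            raise IOError("expected 206, got %d" % response.status)
        return response.read()
    finally:
        connection.close()
```

Something like fetch_tail("musicsite.com", "/song453.mp3") would then hand
back the trailing 128 bytes, provided the server plays along.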

The bad news is I think you would have to chase redirects and such on your
own. Hopefully a urllib expert will pop in and show how to quickly tell
urllib to chill out when it gets a 206; I'm pretty sure it's easy, but I
can't quite rattle off how. Or maybe 2.4 has a better urllib.
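
For what it's worth, here is how the urllib route might look on Python 3,
where the module is urllib.request. As far as I know it accepts any 2xx
status there, so the 206 is not raised as an error, and urlopen follows
redirects itself. Treat this as a sketch under those assumptions; it says
nothing about 2.3.4. The names tail_request and fetch_tail_urllib are made
up for the example.

```python
import urllib.request


def tail_request(url, nbytes):
    """Build a GET request for only the last nbytes bytes of url."""
    request = urllib.request.Request(url)
    request.add_header("Range", "bytes=-%d" % nbytes)
    return request


def fetch_tail_urllib(url, nbytes):
    """Fetch the tail of url, letting urlopen chase any redirects."""
    with urllib.request.urlopen(tail_request(url, nbytes)) as response:
        if response.status == 206:
            # The server honoured Range: the body is just the tail.
            return response.read()
        # Otherwise Range was ignored and the whole file is coming,
        # so read it all and keep only the last nbytes bytes.
        return response.read()[-nbytes:]
```

The fallback branch downloads the entire file, which defeats the purpose,
so in practice you may prefer to raise an error there instead.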


