Downloading TV listings with urllib

Gerhard Häring gh at ghaering.de
Thu Apr 17 13:58:43 EDT 2003


Josh wrote:
> Can someone give me general pointers on how it could be done? I am trying 
> to go to tvguide.com and download the TV listings for my area. The problem 
> is how to specify my location and cable provider. I used Proxomitron to 
> look at the traffic generated while downloading the listings with my 
> browser. When I used httplib with:
> 
> host = 'www.tvguide.com'
> pathn = '/Listings/index.asp'
> 
> def tvg():
>     try:
>         h = httplib.HTTPConnection(host)
>         h.putrequest('GET', pathn)
>         h.putheader('Accept', 'text/html')
>         h.putheader('Accept', 'text/plain')
>         h.putheader('User-Agent:', 'Mozilla/5.0 (Windows; U; Windows NT '
>                     '5.1; en-US; rv:1.3) Gecko/20030312')
>         h.putheader('Accept:', 'application/x-shockwave-flash,text/xml,'
>                     'application/xml,application/xhtml+xml,text/html;q=0.9,'
>                     'text/plain;q=0.8,video/x-mng,image/png,image/jpeg,'
>                     'image/gif;q=0.2,*/*;q=0.1')
>         h.putheader('Accept-Language:','en-us,en;q=0.5')
>         h.putheader('Accept-Encoding:', 'gzip,deflate,compress;q=0.9')
>         h.putheader('Accept-Charset:', 'ISO-8859-1,utf-8;q=0.7,*;q=0.7')
>         h.putheader('Keep-Alive:', '300')
>         h.putheader('Cookie',
>                     'SITESERVER=ID=fd7d0ca25acdd2439ab3f4b48b0827b4; GBA=18; '
>                     'TVGID=9CE64992B85F4F9A82B57EBC092D11B2; nat=0; '
>                     'ServiceID=75347; zip=63130; ptfc=yfki')
>         h.endheaders()
>         err = h.getresponse()
>         return err
>     except socket.error, er:
>         print 'socket error ', er
> 
> when I use err.read() to read the object returned I get everything in Hex.

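My guess is that you get what you're asking for: gzip-compressed data.
Your Accept-Encoding header lists gzip, so the server is free to compress
the response body, and err.read() then returns the raw compressed bytes.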
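
To illustrate that guess, here is a minimal sketch of how a compressed
body could be unpacked. It is written for present-day Python 3, not the
Python 2 of the quoted code, and it has not been tried against tvguide.com.
The helper name decode_body and the sample page are made up for the example.

```python
import gzip
import zlib


def decode_body(body, content_encoding):
    """Return the decoded bytes of an HTTP response body.

    body is the raw bytes from response.read(); content_encoding is the
    value of the Content-Encoding response header (None if it is absent).
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            # The usual case: a zlib-wrapped deflate stream.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # "identity" or anything unrecognised: hand the bytes back untouched.
    return body


# Demonstration that does not touch the network (hypothetical sample page).
page = b"<html><body>TV listings</body></html>"
compressed = gzip.compress(page)
print(compressed[:2])                           # b'\x1f\x8b', the gzip magic bytes
print(decode_body(compressed, "gzip") == page)  # True
```

With a real response you would pass in the result of err.read() together
with err.getheader('Content-Encoding'). Alternatively, leaving the
Accept-Encoding header out of the request should make most servers reply
with uncompressed data.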

-- Gerhard




