Downloading TV listings with urllib

Josh mlsj at earthlink.net
Thu Apr 17 13:23:34 EDT 2003


Can someone give me general pointers on how it could be done? I am trying 
to go to tvguide.com and download the TV listings for my area. The problem 
is how to specify my location and cable provider. I used Proxomitron to 
look at the traffic generated while downloading the listings with my 
browser. I then tried httplib with:

import httplib
import socket

host = 'www.tvguide.com'
pathn = '/Listings/index.asp'

def tvg():
    try:
        h = httplib.HTTPConnection(host)
        h.putrequest('GET', pathn)
        # putheader() writes the colon itself, so header names carry none.
        h.putheader('Accept', 'text/html')
        h.putheader('Accept', 'text/plain')
        h.putheader('User-Agent',
                    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3) '
                    'Gecko/20030312')
        h.putheader('Accept',
                    'application/x-shockwave-flash,text/xml,application/xml,'
                    'application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,'
                    'video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1')
        h.putheader('Accept-Language', 'en-us,en;q=0.5')
        h.putheader('Accept-Encoding', 'gzip,deflate,compress;q=0.9')
        h.putheader('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.7')
        h.putheader('Keep-Alive', '300')
        # Cookie values copied from the browser traffic seen in Proxomitron.
        h.putheader('Cookie',
                    'SITESERVER=ID=fd7d0ca25acdd2439ab3f4b48b0827b4; GBA=18; '
                    'TVGID=9CE64992B85F4F9A82B57EBC092D11B2; nat=0; '
                    'ServiceID=75347; zip=63130; ptfc=yfki')
        h.endheaders()
        err = h.getresponse()
        return err
    except socket.error, er:
        print 'socket error ', er

When I call err.read() on the returned response object, everything comes 
back in hex.
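
One possible explanation, assuming the server honoured the Accept-Encoding 
header sent above, is that the body arrived gzip-compressed, which looks 
like raw hex bytes when printed. The sketch below uses current Python 3 
standard-library calls rather than the httplib of this post, and 
decode_body is a hypothetical helper name, not part of any library. With 
the code above, the header value would come from 
err.getheader('Content-Encoding').

```python
import gzip
import zlib


def decode_body(raw, content_encoding):
    """Undo the Content-Encoding applied to a response body (hypothetical helper)."""
    encoding = (content_encoding or '').strip().lower()
    if encoding in ('gzip', 'x-gzip'):
        return gzip.decompress(raw)
    if encoding == 'deflate':
        try:
            # Standard form: deflate data wrapped in a zlib header.
            return zlib.decompress(raw)
        except zlib.error:
            # Some servers send a raw deflate stream with no zlib header.
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    # No encoding, or one this sketch does not handle: return it unchanged.
    return raw


# Stand-in for a compressed response body; no network access is needed.
sample = gzip.compress(b'<html>listings</html>')
print(sample[:2])                    # b'\x1f\x8b', the gzip magic number
print(decode_body(sample, 'gzip'))   # b'<html>listings</html>'
```

A simpler alternative is to leave out the Accept-Encoding header 
altogether, so that the server has no reason to compress the response.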

I tried using urllib, but there I can't figure out a way to pass the 
location information etc.
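
As a minimal sketch of how the location details might be passed without 
httplib: a request object can carry arbitrary headers, including the 
Cookie header. This uses Python 3's urllib.request (urllib2.Request took 
the same headers argument in Python 2). It is only an assumption that the 
ServiceID and zip cookies are enough to select the provider and area; the 
site may need the other cookies as well. build_request is a hypothetical 
helper name.

```python
import urllib.request

URL = 'http://www.tvguide.com/Listings/index.asp'


def build_request(service_id, zip_code):
    """Build a GET request carrying the location cookies (hypothetical helper)."""
    # Assumption: these two cookies identify the cable provider and the area.
    cookie = 'ServiceID=%s; zip=%s' % (service_id, zip_code)
    headers = {
        'User-Agent': 'Mozilla/5.0',
        'Cookie': cookie,
    }
    return urllib.request.Request(URL, headers=headers)


req = build_request('75347', '63130')
print(req.get_header('Cookie'))   # ServiceID=75347; zip=63130

# Sending it would look like this (not executed here):
# body = urllib.request.urlopen(req).read()
```

Because no Accept-Encoding header is set, the server has no reason to 
send a compressed body, so read() should return the page as is.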

As I said, I just need general pointers on how to deal with this. Any 
help would be greatly appreciated.

Thanks

Josh