Downloading TV listings with urllib
Josh
mlsj at earthlink.net
Thu Apr 17 13:23:34 EDT 2003
Can someone give me general pointers on how to do this? I am trying
to go to tvguide.com and download the TV listings for my area. The problem
is how to specify my location and cable provider. I used Proxomitron to
look at the traffic generated while downloading the listings with my
browser, and then tried to reproduce the request with httplib:
import httplib
import socket

host = 'www.tvguide.com'
pathn = '/Listings/index.asp'

def tvg():
    try:
        h = httplib.HTTPConnection(host)
        h.putrequest('GET', pathn)
        # Header names take no trailing colon; putheader() adds it.
        h.putheader('Accept', 'text/html')
        h.putheader('Accept', 'text/plain')
        h.putheader('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT '
                    '5.1; en-US; rv:1.3) Gecko/20030312')
        h.putheader('Accept', 'application/x-shockwave-flash,text/xml,'
                    'application/xml,application/xhtml+xml,text/html;q=0.9,'
                    'text/plain;q=0.8,video/x-mng,image/png,image/jpeg,'
                    'image/gif;q=0.2,*/*;q=0.1')
        h.putheader('Accept-Language', 'en-us,en;q=0.5')
        h.putheader('Accept-Encoding', 'gzip,deflate,compress;q=0.9')
        h.putheader('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.7')
        h.putheader('Keep-Alive', '300')
        # Cookie values captured from the browser session
        h.putheader('Cookie',
                    'SITESERVER=ID=fd7d0ca25acdd2439ab3f4b48b0827b4; GBA=18; '
                    'TVGID=9CE64992B85F4F9A82B57EBC092D11B2; nat=0; '
                    'ServiceID=75347; zip=63130; ptfc=yfki')
        h.endheaders()
        err = h.getresponse()
        return err
    except socket.error, er:
        print 'socket error ', er

When I use err.read() to read the object returned, I get everything in hex.
I tried using urllib, but there I can't figure out a way to pass the
location information etc.
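
One possible explanation for the unreadable output is that the request advertises Accept-Encoding: gzip, so the server may be returning a compressed body that has to be decompressed before it looks like HTML. The sketch below shows that idea together with one way of sending the location cookies through urllib. It is written for Python 3, where httplib became http.client and urllib2 became urllib.request. The helper names build_request, decode_body and fetch are hypothetical, the cookie names are the ones from the captured traffic, and whether the site accepts them this way is an assumption that has not been verified.

```python
import gzip
import urllib.request

# Host and path taken from the post; nothing else about the site is assumed.
URL = "http://www.tvguide.com/Listings/index.asp"


def build_request(url, cookies):
    """Return a Request that carries all cookies in a single Cookie header."""
    cookie_header = "; ".join(
        "{}={}".format(name, value) for name, value in cookies.items()
    )
    headers = {
        "User-Agent": "Mozilla/5.0",
        "Accept": "text/html",
        "Accept-Encoding": "gzip",  # the server may then compress the body
        "Cookie": cookie_header,
    }
    return urllib.request.Request(url, headers=headers)


def decode_body(raw, content_encoding):
    """Decompress the body if the server labelled it as gzip."""
    if (content_encoding or "").lower() == "gzip":
        return gzip.decompress(raw)
    return raw  # anything else is passed through unchanged


def fetch(url, cookies):
    """Send the request and return the body as (possibly decompressed) bytes."""
    request = build_request(url, cookies)
    with urllib.request.urlopen(request) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

Calling fetch(URL, {"ServiceID": "75347", "zip": "63130"}) would send both values in one Cookie header. Leaving Accept-Encoding out of the request altogether is a simpler alternative, since a server should then send the body uncompressed.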
As I said, I just need general pointers on how to deal with this. Any
help would be greatly appreciated.
Thanks
Josh