save gb-2312 web page in a .html file

Peter Pei yantao at telus.com
Wed Dec 26 18:22:06 EST 2007


You must be right, since I tried one page and it worked. However, something 
goes wrong with this particular page: 
http://overseas.btchina.net/?categoryid=-1. When I open the saved file (with 
IE7), it is all messed up. This is the code I used:

    import urllib2

    url = 'http://overseas.btchina.net/?categoryid=-1'
    headers = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
    req = urllib2.Request(url, None, headers)
    page = urllib2.urlopen(req).read()  # raw bytes, exactly as the server sent them

    # Write in binary mode so the bytes are saved without newline translation.
    htmlfile = open('btchina.html', 'wb')
    htmlfile.write(page)
    htmlfile.close()
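
One way to narrow this down might be to inspect the bytes that came back before saving them. The sketch below checks for two common causes of a garbled saved page: a gzip-compressed response body, and a charset declared in the page that differs from what the browser assumes. Neither is confirmed to be the problem with this particular page. `inspect_page` is a hypothetical helper, not part of urllib2, and it only uses the standard library, so it should run on Python 2.7 as well as Python 3.

```python
import gzip
import io
import re


def inspect_page(raw):
    """Return (body_bytes, declared_charset) for the raw bytes of a response.

    Hypothetical helper for diagnosis only. declared_charset is None when
    no charset declaration is found near the top of the page.
    """
    # A gzip stream always starts with the two magic bytes 0x1f 0x8b.
    if raw[:2] == b'\x1f\x8b':
        raw = gzip.GzipFile(fileobj=io.BytesIO(raw)).read()

    # Look for "charset=..." (for example inside a <meta> tag) in the
    # first 2048 bytes, where such declarations normally appear.
    match = re.search(br'charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', raw[:2048], re.I)
    charset = match.group(1).decode('ascii').lower() if match else None
    return raw, charset
```

With the code above, this would be called as `body, charset = inspect_page(page)`, and `body` rather than `page` would be written to the file. If `charset` comes back as `gb2312`, trying `body.decode('gbk')` is a cheap extra check, since GBK is a superset of GB2312 and pages labelled GB2312 sometimes contain GBK characters.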



