Unable to retrieve compressed HTML URL

rushik rushik.upadhyay at gmail.com
Sun Feb 15 16:17:53 EST 2009


On Feb 15, 11:56 am, rushik <rushik.upadh... at gmail.com> wrote:
> Hi,
> I am trying to build a Python script which retrieves and analyzes
> various URLs and generates reports.
>
> Some of the URLs are like "http://xyz.com/test.html.gz". I am trying
> to retrieve them using the urllib2 library and then decompress them
> using the gzip library.
>
> For example, say server_url is http://xyz.com/test.html.gz
>
>                 logpage = urllib2.urlopen(server_url)
>                 html_content = logpage.read()
>                 logpage.close()
>
>                 gz_tmp = open("gzip.txt.gz", "w")
>                 gz_tmp.write(html_content)
>                 gz_tmp.close()
>                 f = gzip.open("gzip.txt.gz", "rb")
>                 file_content = f.read()
>                 f.close()
>
>                 #return the resulting html content.
>                 return html_content
>
> On executing the code, it gives:
>
> zlib.error - Error -3 while decompressing: invalid distance too far
> back
>
> I am able to retrieve the same URL as a proper HTML page from a
> browser.
>
> Please let me know if I am doing something wrong here, or if there is
> a better way to do this.
>
> Thanks,
> R

I found the solution! I am now using urllib.urlretrieve.
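
For anyone hitting the same error: one likely cause (an assumption, since I did not confirm it) is that the temporary file in my original code was opened with "w" instead of "wb", which can corrupt binary gzip data on platforms that translate line endings. Below is a minimal sketch of an alternative that skips the temporary file and decompresses in memory. It is written for Python 3 (urllib.request instead of urllib2), and the function names decompress_gzip_bytes and fetch_gzipped_html are just illustrative names, not part of any library.

```python
import gzip
import urllib.request


def decompress_gzip_bytes(raw):
    """Decompress a gzip-compressed byte string and return the original bytes."""
    return gzip.decompress(raw)


def fetch_gzipped_html(url):
    """Download a .gz resource and return its decompressed content as bytes.

    Assumes the server sends the raw gzip file as the response body,
    as it would for a URL like http://xyz.com/test.html.gz.
    """
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # raw gzip bytes; keep them as bytes throughout
    return decompress_gzip_bytes(raw)


if __name__ == "__main__":
    # Offline check using locally compressed data, so no network is needed.
    sample = b"<html><body>hello</body></html>"
    assert decompress_gzip_bytes(gzip.compress(sample)) == sample
```

If the data must go through a file on disk, opening it with "wb" for writing and gzip.open(..., "rb") for reading should keep the bytes intact.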

thx,
R


