httplib problems.

Wed Apr 11 13:36:13 EDT 2001

"Sammy Mannaert" <nstalkie at tvd.be> wrote in message
news:3AD5E147.6133A8DD at tvd.be...
> hi,
>
> i'm trying to use httplib to fetch a file
> automatically. it's basically just the
> example from the 2.0 httplib GET example.
>
> it works for all urls except for
> http://www.deathinjune.net/html/news/news.htm
>
> does anyone know why it won't work for this
> url ? is there an easy way to fix it ?
> i tried surfing to the url in netscape and
> lynx. both worked fine.
>
> sammy
>
>
> --- program follows ---
> #! /usr/bin/env python
>
> import httplib
>
> def fetch(domain, path):
>     h = httplib.HTTP(domain)
>     h.putrequest('GET', path)
>     h.putheader('Accept', 'text/html')
>     h.putheader('Accept', 'text/plain')
>     h.endheaders()
>     errcode, errmsg, headers = h.getreply()
>     print errcode
>     f = h.getfile()
>     data = f.read()
>     f.close()
>     print len(data)
>
> def main():
>     fetch('www.brainwashed.com', '/c93/news1.html')
>     fetch('www.deathinjune.net', '/html/news/news.htm')
>
> main()
>
> --- program end ---
>
sammy:

Here, I got the following output:

200
40605
404
212

Not sure why we are getting the 404 since, like you, I can read the page
fine in a browser. I still got a 404 when I reversed the order of the page
accesses, too.
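
One possible explanation, which I have not verified against this server:
the site may live on a name-based virtual host, and the request built by
the program above carries no Host header, whereas browsers and urllib send
one. The sketch below only builds the request text, so the difference is
visible without touching the network. build_get_request is a made-up
helper name, not part of httplib.

```python
def build_get_request(domain, path):
    """Return the text of a minimal HTTP/1.0 GET request for path on domain."""
    lines = [
        "GET %s HTTP/1.0" % path,
        # Host names the site being asked for; a server that hosts several
        # sites on one IP address uses it to pick the right one.
        "Host: %s" % domain,
        "Accept: text/html",
        "Accept: text/plain",
    ]
    # A blank line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


print(build_get_request("www.deathinjune.net", "/html/news/news.htm"))
```

If that is the cause, adding h.putheader('Host', domain) right after
h.putrequest('GET', path) in fetch() should be enough. Treat that as an
untested suggestion.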

Just as a matter of interest, is there a particular reason to use httplib?
urllib is almost always more convenient for this type of operation, and I
had no trouble with:

>>> import urllib
>>> u = urllib.urlopen("http://www.deathinjune.net/html/news/news.htm")
>>> c = u.read()
>>> len(c)
4133

regards
 Steve