Getting HTTP responses - a Python link-checking script.

p-d-p=pas-de-spam pas at despam
Mon May 8 16:54:05 EDT 2006


blair.bethwaite at gmail.com wrote:
> Hi Folks,
> 
> I'm thinking about writing a script that can be run over a whole site
> and produce a report about broken links etc...
> 
> I've been playing with the urllib2 and httplib modules as a starting
> point and have found that with urllib2 it doesn't seem possible to get
> HTTP status codes.
> 
> I've had more success with httplib...
> Firstly I create a new HTTPConnection object with a given hostname and
> port then I try connecting to the host and catch any socket errors
> which I can assume mean the server is either down or doesn't exist at
> this place anymore.
> If the connection was successful, I request the resource in
> question, then get the response and check the status code.
> 
> So, I've got the tools I need to do the job sufficiently.  Just
> wondering whether anybody can recommend any alternatives.
> 
> Cheers,
>    -Blair
> 
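The httplib approach described above could be sketched roughly as follows.
This is only an illustration, not Blair's actual script: the function name
check_link and the choice of a HEAD request are my own assumptions, and it
uses the Python 3 module name http.client (httplib was renamed to
http.client in Python 3).

```python
import http.client
import socket
from urllib.parse import urlsplit


def check_link(url, timeout=10):
    """Return (status, reason) for url, or (None, error text) if unreachable.

    check_link is a hypothetical helper name used for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn_class = http.client.HTTPSConnection
    else:
        conn_class = http.client.HTTPConnection
    # A port of None makes the connection class fall back to its default port.
    conn = conn_class(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        # HEAD asks for the status line and headers without the body.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.reason
    except (socket.error, http.client.HTTPException) as exc:
        # Socket errors cover DNS failures, refused connections and timeouts.
        return None, str(exc)
    finally:
        conn.close()
```

Calling check_link on each URL found in a page gives either a numeric
status to report or None when the server could not be reached. Some servers
answer HEAD differently from GET, so a real checker may want to retry with
GET before reporting a link as broken.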
Have a look at "urllib2 - The Missing Manual":

http://www.voidspace.org.uk/python/articles/urllib2.shtml
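For what it's worth, urllib2 does expose status codes, just not where one
might first look: 4xx and 5xx responses are raised as HTTPError, and the
exception carries the numeric status in its code attribute. Below is a
minimal sketch of that pattern. The helper name get_status is made up for
this example, and the code uses the Python 3 names urllib.request and
urllib.error; on Python 2 the equivalents are urllib2.urlopen,
urllib2.HTTPError and urllib2.URLError.

```python
import urllib.error
import urllib.request


def get_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived.

    get_status is a hypothetical helper name used for illustration only.
    """
    try:
        response = urllib.request.urlopen(url, timeout=timeout)
    except urllib.error.HTTPError as exc:
        # Error responses (404, 500, ...) are raised; the code is on the exception.
        return exc.code
    except OSError:
        # URLError is a subclass of OSError: DNS failure, refused connection,
        # timeout, and so on. No HTTP response was received at all.
        return None
    with response:
        return response.status
```

Note that urlopen follows redirects by default, so the value returned is
the status of the final response, not of the first hop.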