Fast URL validation?

Thomas Weholt 2002 at weholt.org
Tue Mar 11 07:13:24 EST 2003


I guess you could use httplib and just check the status and headers in the
response from the server, without reading the entire document. Something like:

>>> import httplib
>>> conn = httplib.HTTPConnection("www.python.org")
>>> conn.request("GET", "/index.html")
>>> r1 = conn.getresponse()
>>> print r1.status, r1.reason
200 OK
>>> conn.close()
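
If all you need is validation, a HEAD request might be even cheaper,
since you never ask the server to send the body at all (a sketch along
the same lines, not tested here; some servers treat HEAD a bit
differently than GET):

import httplib

conn = httplib.HTTPConnection("www.python.org")
conn.request("HEAD", "/index.html")   # status line and headers only
r1 = conn.getresponse()
print r1.status, r1.reason
conn.close()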

If r1.status is something other than 200, the document is missing, the
server failed, the request is being redirected, etc. If you want to read
the entire document, you have to do it before closing the connection:
>>> document = r1.read()
>>> conn.close()

Or something similar. I don't know if this is the fastest way of doing it.
Please post a better solution if you find one.
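
For scanning a whole list of URLs, something along these lines might do
(a rough, untested sketch for Python 2.2; check_url and the sample URLs
are just placeholders of mine, and it only handles plain http):

import httplib
import socket
import urlparse

def check_url(url):
    """Return (status, reason) for url, or (None, error message)
    if the server could not be reached at all."""
    parts = urlparse.urlsplit(url)   # (scheme, netloc, path, query, fragment)
    host = parts[1]
    path = parts[2] or "/"           # ignores any query string, for simplicity
    try:
        conn = httplib.HTTPConnection(host)
        conn.request("HEAD", path)
        response = conn.getresponse()
        result = (response.status, response.reason)
        conn.close()
        return result
    except (httplib.HTTPException, socket.error), e:
        return (None, str(e))

for url in ["http://www.python.org/", "http://www.python.org/no-such-page"]:
    print url, check_url(url)

A 200 means the page is there, 3xx means a redirect you could follow,
and 4xx/5xx or (None, ...) means the URL is probably bad. Opening one
connection per URL is still the slow part, so if speed really matters
you could run several of these checks in parallel with threads.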

Thomas



"Robert Oschler" <no_replies at fake_email_address.invalid> wrote in message
news:t%jba.11$_O5.47359 at news2.news.adelphia.net...
> Is there a way, with Python 2.2, to validate a URL without having to
> download the entire document?  I want to rapidly (very rapidly!) scan a
> list of URL's and make sure I don't get a "server not found" or "page not
> found" error.  What external library would I need to import and what
> functions?
>
> thx
>
>
>
> --
>
> Robert Oschler
> Android Technologies, Inc.
> http://www.androidtechnologies.com
> The home of PowerSell! (tm)
> - "Power Tools for Amazon Associates" (sm)
>
>





