Check URL --> Simply?

Markus Schaber markus at schabi.de
Thu Aug 16 15:02:17 EDT 2001


Hi,

Garth Grimm <garth_grimm at hp.com> schrub:

>> >     % python check_url.py
>> >     http://msnbc.com/nonsense (200, 'OK')
>> >     http://msnbc.com/ (302, 'Object moved')
>> >     http://w3c.org/ (301, 'Moved Permanently')
>> >     http://w3c.org/nonsense (301, 'Moved Permanently')
>> >     http://w3c.org/Consortium/ (301, 'Moved Permanently')
>> >     http://ibm.com/ (200, 'OK')
>> >     http://ibm.com/nonsense (404, 'Not Found')
>> >
>> > I tried a few sites to get these examples... but not all *that*
>> > many. All the sites that end in 'nonsense' LOOK, to my human eyes,
>> > like broken links... and all the others look like content (well,
>> > except msnbc.com, which refuses to load--I think because I won't
>> > give it a cookie--and wouldn't actually be other than nonsense if
>> > it would load :-)).
>>
>> I don't know about the msnbc examples, but the 301 from w3c is
>> telling you something useful -- it prefers to be called www.w3.org.
> 
> Yes, but to a typical user, that is called a dead link.  From the
> server side, the most user friendly way to handle this is to return a
> 404 (or perhaps a 301) code and message, but then use a web page with
> a refresh of 0 to send the user to the home page.

"Moved permanently" means you should follow the redirection. When you
try the new URL, you should get either 200 or a 40X, depending on
whether the page works or not.

In case of a 40X, tell the user "dead link"; when the redirected page
works, tell the user "page moved permanently". You may even make this
switchable by a command line parameter.
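
A rough sketch of what I mean, in current Python 3 rather than the
httplib of the original script. The names check_url, head, max_hops and
report_moved are made up for illustration and are not taken from
check_url.py. Treating every non-2xx final answer (not only 40X) as a
dead link, and giving up after a few hops, are my own assumptions.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects worth following.
REDIRECTS = {301, 302, 303, 307, 308}


def head(url, timeout=10):
    """Send one HEAD request without following redirects.

    Returns (status, value of the Location header or None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def check_url(url, fetch=head, max_hops=5, report_moved=True):
    """Follow redirects by hand and return (verdict, final_url).

    fetch is any callable mapping a URL to (status, location), so the
    logic can be exercised without touching the network.
    """
    moved_permanently = False
    for _ in range(max_hops + 1):
        status, location = fetch(url)
        if status in REDIRECTS and location:
            # Remember whether any hop was a permanent move.
            moved_permanently = moved_permanently or status in (301, 308)
            url = urljoin(url, location)
            continue
        if 200 <= status < 300:
            if moved_permanently and report_moved:
                return "page moved permanently", url
            return "ok", url
        # Assumption: anything else (40X, 50X, redirect with no
        # Location header) is reported as a dead link.
        return "dead link", url
    # Assumption: a redirect chain longer than max_hops counts as dead.
    return "dead link", url
```

The report_moved flag is the piece that a command line parameter could
toggle: with it switched off, a redirected page that works is simply
reported as "ok".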

markus
-- 
1) Customers cause problems.
2) Marketing is trying to create more customers.
Therefore:
3) Marketing is evil.  (Grant Edwards in comp.lang.python)


