Why does this fail?

John J. Lee jjl at pobox.com
Tue Jan 6 11:28:34 EST 2004


"Dave Murray" <dlmurray at micro-net.com> writes:

> New to Python question, why does this fail?
[...]
> def Checkit(URL):
[...]

(already answered six times, so I won't bother...)

You might want to have a look at the unittest module.
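
As a rough illustration of what the unittest module offers, here is a minimal sketch. The body of Checkit(URL) is elided above, so looks_like_http_url is a purely hypothetical stand-in (it only inspects the URL string and touches no network), and the test names are likewise invented for the example. It is written for Python 3.

```python
import unittest
from urllib.parse import urlparse


def looks_like_http_url(url):
    """Hypothetical stand-in for the elided Checkit(URL): string check only."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


class LooksLikeHttpUrlTest(unittest.TestCase):
    def test_accepts_http_url(self):
        self.assertTrue(
            looks_like_http_url("http://wwwsearch.sf.net/bits/clientx.html"))

    def test_rejects_missing_scheme(self):
        # Without a scheme, urlparse puts everything in the path component.
        self.assertFalse(
            looks_like_http_url("wwwsearch.sf.net/bits/clientx.html"))

    def test_rejects_other_scheme(self):
        self.assertFalse(looks_like_http_url("ftp://example.com/file"))


def run_tests():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        LooksLikeHttpUrlTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Each test method checks one behaviour, so a failure report names the case that broke instead of leaving you to guess from a traceback.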

Also (advert ;-), if you're doing any kind of web scraping in Python
(including functional testing), you might want to look at this little
FAQ (though it is far from covering everything relevant):

http://wwwsearch.sf.net/bits/clientx.html

BTW, in response to another question in this thread (IIRC), and
entirely contrary to my previous assertion here <wink>, it appears
that HTMLParser.HTMLParser is a bit more finicky about HTML than
sgmllib/htmllib are (htmllib is a thin wrapper over sgmllib).  I hope
to investigate and fix that.  HTMLParser.HTMLParser knows about XHTML,
so in that respect it is a better choice than sgmllib/htmllib.  If you
want to process junk HTML, though (or perhaps even valid HTML that the
library you're using doesn't like), look at mxTidy or uTidylib.  I
should link to those on my FAQ page...
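
To make the XHTML point concrete, here is a small sketch of subclassing HTMLParser. It assumes Python 3, where the class lives in html.parser rather than in the HTMLParser module named above (sgmllib and htmllib no longer exist in Python 3). LinkCollector and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical example class: gathers href values and XHTML empty tags."""

    def __init__(self):
        super().__init__()
        self.links = []       # href values seen on <a> tags
        self.empty_tags = []  # names of XHTML-style empty tags, e.g. <br />

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

    def handle_startendtag(self, tag, attrs):
        # Called for XHTML-style empty elements such as <br />.
        self.empty_tags.append(tag)
        self.handle_starttag(tag, attrs)


def collect_links(markup):
    """Parse markup and return the populated collector."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector


sample = ('<p>See <a href="http://wwwsearch.sf.net/bits/clientx.html">'
          'the FAQ</a><br /></p>')
collector = collect_links(sample)
```

After this runs, collector.links holds the one href from the sample. For markup the parser chokes on, cleaning it up first with a tidy tool, as suggested above, remains the more robust route.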


John


