Looking for a decent HTML parser for Python...
Fredrik Lundh
fredrik at pythonware.com
Wed Dec 6 00:47:49 EST 2006
> Except it appears to be buggy or, at least, not very robust. There are
> websites for which it falsely terminates early in the parsing.
which probably means that the sites are broken. the amount of broken
HTML on the net is staggering, as is the amount of code in a typical web
browser for dealing with all that crap. for a more tolerant parser, see:
http://www.crummy.com/software/BeautifulSoup/
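
to illustrate what "tolerant" means here, a minimal sketch follows. note the assumptions: it uses the html.parser module from the current Python 3 standard library as a stand-in, not BeautifulSoup itself, and the LinkCollector class and the sample markup are made up for the example. the point is only that a tolerant parser keeps reporting tags after it hits unclosed or mismatched elements, instead of stopping early.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect href values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# deliberately broken markup: unclosed <p>, <b> and <a> elements,
# plus a stray </div> that was never opened
broken = (
    '<html><body><p>first <b>bold<p>second</div>'
    '<a href="/one">one<a href="/two">two'
)

collector = LinkCollector()
collector.feed(broken)
collector.close()  # flush any text still buffered at end of input

links = collector.links
print(links)
```

both links are collected even though none of the elements are closed properly. BeautifulSoup goes further and builds a navigable tree out of such input, so it is likely the better fit when you need more than a stream of tag events.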
</F>