trying to parse non valid html documents with HTMLParser
Benji York
benji at benjiyork.com
Tue Aug 2 16:29:56 EDT 2005
florent wrote:
> I'm trying to parse HTML documents from the web, using the HTMLParser
> class of the HTMLParser module (Python 2.3), but some web documents are
> not fully valid.
From http://www.crummy.com/software/BeautifulSoup/:
You didn't write that awful page. You're just trying to get
some data out of it. Right now, you don't really care what
HTML is supposed to look like.
Neither does this parser.
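
To make the idea concrete, here is a minimal sketch of forgiving parsing. It is not BeautifulSoup itself: it assumes a current Python 3, where the standard library's html.parser.HTMLParser no longer raises on malformed markup (the Python 2.3 module was much stricter). The LinkCollector class and the messy_html sample are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, however untidy the page is."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        # for attributes written without a value.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical messy input: unclosed <p>, <b> and <a> tags,
# an unquoted attribute value, and no closing </html>.
messy_html = """
<html><body>
<p>First paragraph <b>bold text
<p>Second <a href=http://example.com/one>one
<a href="http://example.com/two">two</a>
</body>
"""

parser = LinkCollector()
parser.feed(messy_html)
parser.close()
print(parser.links)
```

The parser only reports tags as it meets them and never checks that they are balanced, which is why the unclosed elements do no harm here. BeautifulSoup goes further and builds a navigable tree out of the same kind of input.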
--
Benji York