[XML-SIG] How to get SAX to parse not well formed HTML doc?

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Wed, 18 Jul 2001 01:17:14 +0200


>   Another possibility would be to use the HTMLParser module, which is
> new in Python 2.2.  It was originally developed for another project
> and is stable and well-tested.  Feel free to extract the module from
> the Python CVS repository.

Of course, a "true" HTML parser should also honour the DTD,
i.e. generate the end tags that the document omits, expand entity
references (to Unicode strings), etc.
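
To make that concrete, here is a rough sketch of the idea, not a
DTD-driven parser. It assumes a current Python, where the HTMLParser
module lives at html.parser. The names IMPLIED_END, VOID,
BalancingParser and parse_events are made up for this illustration, and
the two tables are a tiny hand-written stand-in for what the real HTML
DTD says about omitted end tags.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stand-in for the DTD: an open element
# (key) is implicitly ended when one of the listed start tags appears.
IMPLIED_END = {
    "p": {"p", "ul", "ol", "div", "table"},
    "li": {"li"},
}
# Elements that never have content, so they get an end event right away.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class BalancingParser(HTMLParser):
    """Collects SAX-like (kind, value) events with balanced start/end."""

    def __init__(self):
        # convert_charrefs=True makes the parser hand entity and
        # character references to handle_data() as Unicode text.
        super().__init__(convert_charrefs=True)
        self.stack = []   # currently open elements
        self.events = []  # output: ("start"|"end"|"text", value)

    def handle_starttag(self, tag, attrs):
        # Supply the missing end tag if this start tag implies one.
        if self.stack and tag in IMPLIED_END.get(self.stack[-1], ()):
            self._close(self.stack[-1])
        self.events.append(("start", tag))
        if tag in VOID:
            self.events.append(("end", tag))
        else:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Stray end tags with no matching open element are ignored.
        if tag in self.stack:
            self._close(tag)

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data))

    def _close(self, tag):
        # Pop up to and including `tag`, ending everything left open.
        while self.stack:
            top = self.stack.pop()
            self.events.append(("end", top))
            if top == tag:
                break

    def close(self):
        # Flush buffered text first, then end whatever is still open.
        super().close()
        while self.stack:
            self.events.append(("end", self.stack.pop()))


def parse_events(html):
    parser = BalancingParser()
    parser.feed(html)
    parser.close()
    return parser.events
```

Feeding it "<ul><li>caf&eacute;<li>b</ul>" gives an end event for each
li even though the source has none, and the text arrives with the
entity already expanded. Forwarding the events to a SAX ContentHandler
instead of a list would be the obvious next step.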

Regards,
Martin