[XML-SIG] xml.dom.ext.reader.HtmlLib

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Wed, 18 Jul 2001 01:13:45 +0200


> Part of the problem here is that we have a separate Reader for HTML
> documents. IMHO it would be much preferable to have a SAX driver for
> the HTML parser instead. That could then use the SAX Reader, and
> behaviour would be consistent. 
> 
> In addition, we would get increased flexibility by having a SAX driver
> for this parser.

Sounds like an interesting project for a volunteer. I'd personally
recommend building this SAX driver on top of sgmlop; the real
challenge is to get right the events that are implied only by the SGML
DTD for HTML (e.g. the end events for omitted closing tags).
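
To make the "implied events" problem concrete, here is a rough sketch of
what such a driver might look like. It is not based on sgmlop: it
assumes the standard library's html.parser as a stand-in tokenizer, and
the names HtmlSaxDriver, Recorder, VOID and CLOSES are made up for
illustration. The two tables are a tiny, hand-picked subset of what the
HTML DTD actually implies.

```python
from html.parser import HTMLParser
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl

# Assumption: abridged stand-ins for rules that really come from the HTML DTD.
VOID = {"br", "hr", "img", "input", "meta", "link"}   # elements with no end tag
CLOSES = {"p": {"p"}, "li": {"li"}}  # opening <key> closes an open <value> on top


class HtmlSaxDriver(HTMLParser):
    """Hypothetical driver: turns HTML tokens into SAX ContentHandler events."""

    def __init__(self, handler):
        super().__init__(convert_charrefs=True)
        self.handler = handler
        self.stack = []  # names of elements that are currently open

    def parse_string(self, text):
        self.handler.startDocument()
        self.feed(text)
        self.close()  # flush any buffered character data
        while self.stack:  # close whatever the document left open
            self.handler.endElement(self.stack.pop())
        self.handler.endDocument()

    def handle_starttag(self, tag, attrs):
        # Synthesize the end event for an omitted closing tag, e.g. <li>..<li>.
        if self.stack and self.stack[-1] in CLOSES.get(tag, ()):
            self.handler.endElement(self.stack.pop())
        # Valueless attributes arrive as None; map them to an empty string.
        self.handler.startElement(
            tag, AttributesImpl({k: v or "" for k, v in attrs}))
        if tag in VOID:
            self.handler.endElement(tag)  # void elements close immediately
        else:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: ignore it
        # Close every element still open inside the one being closed.
        while self.stack:
            open_tag = self.stack.pop()
            self.handler.endElement(open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.handler.characters(data)


class Recorder(ContentHandler):
    """Hypothetical handler that records events so the output can be inspected."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        if content.strip():
            self.events.append(("text", content.strip()))


recorder = Recorder()
HtmlSaxDriver(recorder).parse_string("<ul><li>one<li>two</ul><p>a<br>b")
print(recorder.events)
```

Any ContentHandler could be plugged in instead of Recorder, which is
where the consistency with the existing SAX-based Reader would come
from. A real driver would need the full set of DTD rules rather than
these two small tables.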

Regards,
Martin