[XML-SIG] xml.dom.ext.reader.HtmlLib

Lars Marius Garshol larsga@garshol.priv.no
17 Jul 2001 12:22:02 +0200


* Alexandre Fayolle
| 
| With HtmlLib's reader, this is not the case: the owner document I'm
| passing is getting emptied. Cf. lines 42-46:
|         if doc:
|             while doc.firstChild:
|                 # Empty out the document
|                 node = doc.removeChild(doc.firstChild)
|                 ReleaseNode(node)
| 
| First (minor) thing is, this supposes I'm using a 4DOM document, since it
| uses ReleaseNode, second (important) thing is, I'm much annoyed that the
| document should be emptied, since in the case at hand, it already had some
| contents, and I was merely passing it in order to be sure that the right
| DOM implementation would be used, and to avoid an expensive call to
| importNode.

Part of the problem here is that we have a separate Reader for HTML
documents. IMHO it would be much preferable to have a SAX driver for
the HTML parser instead. That could then use the SAX Reader, and
behaviour would be consistent.

In addition, we would get increased flexibility by having a SAX driver
for this parser, since any SAX application, not just the DOM builder,
could then consume HTML.
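
To make the idea concrete, here is a rough sketch of what such a driver
could look like. It is only an illustration under stated assumptions: it
wraps the standard library's html.parser.HTMLParser rather than sgmlop,
and the names HtmlSaxDriver, EventLog and parse_string are made up for
this example, not part of PyXML or 4DOM. It also makes no attempt to
balance unclosed tags such as <br>, which a real driver would have to do.

```python
from html.parser import HTMLParser
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class HtmlSaxDriver(HTMLParser):
    """Hypothetical adapter: forwards HTML parser callbacks as SAX events."""

    def __init__(self, content_handler):
        super().__init__(convert_charrefs=True)
        self._handler = content_handler

    def handle_starttag(self, tag, attrs):
        # HTML allows valueless attributes (<input disabled>); SAX expects
        # strings, so map a missing value to the empty string.
        values = {name: ("" if value is None else value) for name, value in attrs}
        self._handler.startElement(tag, AttributesImpl(values))

    def handle_endtag(self, tag):
        self._handler.endElement(tag)

    def handle_data(self, data):
        self._handler.characters(data)

    def parse_string(self, text):
        # Bracket the parser callbacks with the document-level SAX events.
        self._handler.startDocument()
        self.feed(text)
        self.close()
        self._handler.endDocument()


class EventLog(ContentHandler):
    """Stand-in for a SAX consumer such as a DOM builder; records events."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


log = EventLog()
HtmlSaxDriver(log).parse_string('<p class="x">Hi <b>there</b></p>')
for event in log.events:
    print(event)
```

Any ContentHandler can be plugged in where EventLog is used here, which
is the point: the DOM-building code would be shared with the XML case
instead of being duplicated in an HTML-specific Reader.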
 
| As a side note, Sgmlop.HtmlParser uses non-NS methods to build its
| DOM. Is this what is intended?

Should be, shouldn't it? HTML doesn't have namespaces, only XHTML does.
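
For what it's worth, the difference between the two families of methods
can be seen with a small sketch. It assumes the standard library's
xml.dom.minidom rather than 4DOM, so details may differ there; the
variable names are only for illustration.

```python
from xml.dom.minidom import getDOMImplementation

XHTML_NS = "http://www.w3.org/1999/xhtml"

impl = getDOMImplementation()

# Plain HTML: no namespace, so the non-NS (DOM Level 1) method is enough.
html_doc = impl.createDocument(None, "html", None)
html_p = html_doc.createElement("p")

# XHTML: elements live in the XHTML namespace, so the NS method is needed.
xhtml_doc = impl.createDocument(XHTML_NS, "html", None)
xhtml_p = xhtml_doc.createElementNS(XHTML_NS, "p")

print(html_p.namespaceURI)
print(xhtml_p.namespaceURI)
```

The first element reports no namespace and the second reports the XHTML
one, which is the distinction that matters to namespace-aware code
walking the tree afterwards.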
 
--Lars M.