[XML-SIG] xml.dom.ext.reader.HtmlLib

Uche Ogbuji uche.ogbuji@fourthought.com
Wed, 18 Jul 2001 09:29:55 -0600


> Hello,
> 
> I was hunting for a bug in Narval, and ended up in
> xml.dom.ext.reader.HtmlLib. I would like some feedback on this to know
> if this is indeed a bug, a documentation issue, or just me daydreaming
> that all APIs should do what I'd like them to, instead of what the coder
> meant.
> 
> When I use xml.dom.ext.reader.Sax2, if I pass an ownerDocument to the
> reader when reading the data, I'll get back a DocumentFragment, belonging
> to the same document. 
> 
> With HtmlLib's reader, this is not the case: the owner document I'm
> passing is getting emptied. Cf. lines 42-46:
>         if doc:
>             while doc.firstChild:
>                 # Empty out the document
>                 node = doc.removeChild(doc.firstChild)
>                 ReleaseNode(node)
> 
> The first (minor) point is that this assumes I'm using a 4DOM document,
> since it calls ReleaseNode. The second (important) point is that I'm quite
> annoyed that the document gets emptied: in the case at hand it already had
> some contents, and I was merely passing it to be sure that the right DOM
> implementation would be used, and to avoid an expensive call to
> importNode.

I wasn't aware of this, and I agree it's a nasty bug.  Please do prep a patch 
if you can.  Just be sure to check it in to the o6maint branch, or put it on 
SF for me to do so (yes, we do intend to work down the SF docket before final 
release).
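
For reference, here is a rough sketch of the behaviour I would expect once 
this is fixed: the owner document is used only as a node factory, and its 
existing children are left alone.  This is only an illustration.  It uses 
xml.dom.minidom from the standard library as a stand-in for 4DOM, and 
build_fragment is a made-up helper, not anything in HtmlLib.

```python
from xml.dom import minidom


def build_fragment(owner_doc, tag_names):
    """Hypothetical helper: create nodes through owner_doc and return them
    in a DocumentFragment, without touching owner_doc's existing children."""
    frag = owner_doc.createDocumentFragment()
    for name in tag_names:
        # The owner document acts purely as a factory here.
        frag.appendChild(owner_doc.createElement(name))
    return frag


# A document that already has content before the reader is called.
doc = minidom.parseString("<root><existing/></root>")
frag = build_fragment(doc, ["p", "div"])

# The fragment belongs to doc, and doc still has its original content.
print(frag.ownerDocument is doc)
print(doc.documentElement.toxml())
```

Since the fragment's nodes already belong to the caller's document, they can 
be appended directly with no importNode call.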

> As a side note, Sgmlop.HtmlParser uses non-NS methods to build its
> DOM. Is this what is intended?

I think so.  The main danger is using import to merge HTML and XML+NS DOMs, 
but I think this is a pretty sticky case anyway.  Since namespaces don't even 
really have a meaning in HTML, I think the current approach is the right one.
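
To make the sticky case concrete, here is a small sketch of how nodes built 
with the non-NS and NS methods differ.  Again this uses xml.dom.minidom 
rather than 4DOM or Sgmlop, so treat it as an illustration of the DOM 
distinction, not of what HtmlParser does internally.  The XHTML namespace 
URI is just a convenient example value.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"  # example namespace, chosen for illustration

doc = minidom.Document()
root = doc.createElement("html")
doc.appendChild(root)

# Non-NS method: the element carries no namespace at all.
plain = doc.createElement("p")
# NS method: same tag name, but bound to a namespace.
qualified = doc.createElementNS(XHTML_NS, "p")

root.appendChild(plain)
root.appendChild(qualified)

# A namespace-aware lookup only sees the element created with the NS method.
ns_matches = root.getElementsByTagNameNS(XHTML_NS, "p")
# A plain tag-name lookup sees both.
name_matches = root.getElementsByTagName("p")

print(plain.namespaceURI, qualified.namespaceURI)
print(len(ns_matches), len(name_matches))
```

So code that merges an HTML tree into a namespace-aware document and then 
queries by namespace would not find the HTML nodes.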

> I'll be glad to work on some patches, hopefully in time for PyXML 0.6.6,
> once the correct behaviour has been agreed on.

I'd love to see a patch on the ownerDoc misbehavior.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management