[XML-SIG] DOM: .getElementsByTagName problem!

A.M. Kuchling akuchlin@cnri.reston.va.us
Tue, 20 Oct 1998 22:16:13 -0400


I was working on the NodeList and NamedNodeList classes tonight, and
realized that the DOM spec says that the NodeList returned from
getElementsByTagName() is live; its contents change to reflect changes
in the underlying document!

	.getElementsByTagName(tagName) returns a NodeList, which has
essentially the semantics of a Python list, contains all Element nodes
named tagName, in the order they'd be encountered in a pre-order
traversal of the tree.  Making this live means that, when you
rearrange the tree, any affected NodeLists are magically updated.

	Implementing this seems nightmarishly difficult; any
structural changes to the document require updating those NodeLists,
and I can't see a much better implementation than just retraversing
the whole tree.  Worse, I think this will end up re-introducing
cycles and memory leaks.

	Here's my current, quickly formed, idea of how to handle this.

	* _nodeData instances need to have a dictionary on them; call
this dictionary _D.  _D[ elementName ] is a list containing the
_nodeData instances that are returned by .getElementsByTagName(
elementName ).

	* The NodeList instance would then hold a reference to the
proper list, and if the underlying list gets mutated, the NodeList
picks up the change.  The NodeList also probably needs to hold a
reference to the _nodeData instance, because you need to
garbage-collect the entry in _D when the NodeList gets deleted.  (How
to handle multiple calls for the same element and tag name, though?)  

	* When the tree is modified by inserting a new Element node or
subtree, you need to walk from the modified node back up to the
Document Node.  (I don't know how to do that; remember, the whole
point of using _nodeData instances was to avoid keeping pointers to
the parent node.) Along the way, when a node is found to have a
non-empty _D, you need to regenerate all the lists, in case they've
changed.  This would require one complete traversal of the subtree
below the node (you'd try to do all the NodeLists in one traversal.)

	I have a bad feeling about this plan; it's complicated, it
adds nasty garbage-collection complexities, and I suspect it'll
re-introduce complicated cycles.  Does anyone have a better idea?
Help!

	(Aarrgh! why the hell did they put such database-style
complexity in the Level 1 spec?  Who expects the results of
.getElementsByTagName to be updated, anyway?  I certainly didn't...)

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Something Dian used to say to me. She'd say, "Wes, don't say anything unless
you've got something to say." Advice I took to heart. She would also say,
"It's a long, long trail that has no turning." And how right she was.
    -- Wesley Dodds, in SANDMAN #72, part three of "The Wake"