[XML-SIG] Building a DOM tree

Andrew M. Kuchling akuchlin@cnri.reston.va.us
Tue, 23 Mar 1999 16:22:46 -0500 (EST)


Simon Pepping writes:
>The data structure of a DOM tree consists of a tree of _nodeData
>instances, which I call the backbone. These objects refer to each
>other by their children attribute.

	Correct.

>These _nodeData instances are never presented to the user. Whenever
>the user approaches a node by one of the DOM methods, an instance of a
>subclass of Node (Document, Element, Text, Attribute) is created,

	Also correct.

	This convoluted structured is required in order to avoid
creating cycles of references, because CPython's garbage collection
can't handle cycles.  (JPython is dependent on the Java VM for garbage
collection, and I'd imagine that most Java VMs can correctly collect
garbage cycles.)  The problem is parent pointers: if you have a tree
of Node instances, and each node holds a reference to its parent, you
have to deliberately break the cycle before it can be garbage
collection, so you'd have to call some .destroy() or .close() method
on nodes before they go out of scope.  Forget this, and your program
would leak memory.  

	Fewer cycles can be created if every node has a reference to
the Document node, but that doesn't avoid the problem completely;
you'd still have to explicitly destroy Document nodes.  If anyone can
suggest a solution that avoids cycles and avoids having to remember to 
destroy things, please post!

	We can change the Builder class to deal with _nodeData
instances directly, instead of creating proxy nodes and then adding
them; that will help software that relies on Builder, but won't speed
up software that builds DOM trees on its own.


-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
A pig can learn more tricks than a dog, but has too much sense to want to do
it.
    -- Robertson Davies, _The Table Talk of Samuel Marchbanks_