Help with fast tree-like structure

David Hirschfield davidh at ilm.com
Wed Jan 25 14:32:11 EST 2006


I've written a tree-like data structure that stores arbitrary python 
objects. The objective was for the tree structure to allow any number of 
children per node, and any number of root nodes...and for it to be 
speedy for trees with thousands of nodes.

At its core, the structure is just a list of lists arranged so that if 
node A has children B and C and node B has child D the data looks like:

A = [<A node data>, B, C]
    B = [<B node data>, D]
    C = [<C node data>]

where B, C and D are all lists with similar structures to A. I am 
holding references to the individual nodes so that access to individual 
nodes by reference is quick. Access by "tree path" is done by giving a 
tuple of integers indicating where in the tree the node you want lies. 
The path (1,2,5) indicates the 6th child of the 3rd child of the 2nd 
root node. Everything involving basic tree functions works fine. Finding 
any particular node this way is just a function of the depth of the node 
in the tree, so it's very quick unless you have some degenerate tree 
structure where nodes end up miles deep.

Here's my problem: I need to allow the tree to "hide" the roots, so that 
the top-level appears to the outside world to be the children under the 
root nodes, not the root nodes themselves. That means a path of (5,2) 
indicates the 3rd child of the 6th "pseudo-root" node. Unfortunately, in 
a tree with many root nodes, each containing many children (a common use 
case for me) finding the 6th pseudo-root is going to be slow, and the 
ways I've thought of to make it fast all require slow bookkeeping when 
new nodes are inserted or removed at the pseudo-root level.

I'm running out of ideas, so if anyone has any thoughts on how to deal 
with fudging which nodes are the roots efficiently...I'm all ears.
Thanks in advance,
-David

-- 
Presenting:
mediocre nebula.




More information about the Python-list mailing list