Linear Time Tree Traversal Generator

Ned Batchelder ned at nedbatchelder.com
Wed Sep 21 07:11:50 EDT 2016


On Tuesday, September 20, 2016 at 12:48:55 PM UTC-4, ROGER GRAYDON CHRISTMAN wrote:
> I am trying to find a better (i.e. more efficient) way to implement a generator
> that traverses a tree.
> 
> The current model of the code (which is also used by a textbook I am teaching
> from does this)
> 
>    def __iter__(node):
>          for x in iter(node._left):
>               yield x
>          yield node._value
>          for x in iter(node._right)
>               yield x
> 
> This is nice, simple, and straightforward, but has an O(n log n) running time,
> since
> values from the leaves of the tree have to be yielded multiple times to the top
> of the tree.
> 
> Now, I could certainly implement a linear-time traversal without the gnerator:
> 
>     def to_list(node,result):
>           """append node information to result"""
>           result = to_list(node._left, result)
>           result.append(node._value)
>           return to_list(node,_right, result)
> 
> but then that requires the extra memory space to build the list into, which
> is basically what the generator is supposed to avoid.
> 
> Now, I did see that itertools has a chain method for concatenating
> iterators, so it would be nice to concatenate the results from the 
> recursive calls without the for loops, but I have no idea how to
> squeeze the 'yield node._value' in between them.
> 
> Is there hope for a linear-time tree-traversal generator, or will
> I have just have to settle for an n-log-n generator or a linear time
> behavior with linear extra space?

Another option is linear time, log-n space, by using an explicit stack
in your iterator:

    def __iter__(self):
        cur = self
        stack = []
        while True:
            if cur:
                stack.append(cur)
                cur = cur._left
            elif not stack:
                break
            else:
                cur = stack.pop()
                yield cur._value
                cur = cur._right

This replaces the Python call stack with a list variable which tracks
parent nodes we haven't yet iterated, so it will grow as log-n. There's
no Python recursion, so each yield directly produces a value to the
caller.

--Ned.



More information about the Python-list mailing list