bad generator performance

Robert Brewer fumanchu at amor.org
Sat Feb 5 23:00:23 EST 2005


Johannes Ahl mann wrote:
> i am in the process of profiling an application and noticed how much
> time my depth first generator for walking a tree data structure took.
> 
> i wrote the same thing as a recursive function producing an array, and
> this non-generator version is nearly 10 times faster!
> 
> any ideas why this is, what i did wrong...
> for some reason the generator version appears as being called 
> much more
> often than the recursive one. no idea why that is, but the data is
> definitely identical!
> 
> here the output from the python profiler and the respective code:
> 
> ### snip ###
> 
> ncalls  tottime  percall  cumtime  percall filename:lineno(function)
> 94209/8191    0.560    0.000    0.590    0.000  
> :71(depthFirstIterator2)
> 4095/1    0.060    0.000    0.080    0.080  :62(depthFirstIterator1)
> 
> ### snip ###
> 	
> def depthFirstIterator1(self, depth = 0):
>     ret = [[self, True, depth]]
> 
>     if self.isFolder():
>       for c in self.children:
>         ret = ret + c.depthFirstIterator(depth = depth + 1)
> 
>     return ret + [[self, False, depth]]
> 
> def depthFirstIterator2(self, depth = 0):
>     yield [self, True, depth]
> 
>     if self.isFolder():
>       for c in self.children:
>         for y in c.depthFirstIterator(depth = depth + 1):
>           yield y
> 
>     yield [self, False, depth]
> 
> ### snip ###
> 
> i'd appreciate any comments or explanations why the generator might be
> so much slower, as i had just decided to use generators more 
> frequently
> and in the respective PEP they are praised as generally fast.

Um. Under my definition of "recursion", you haven't really removed
recursion in the generator version. That is, you're still calling the
depthFirstIterator method upon each iteration--you've just changed from
list-concatenation to yielding instead, which trades off list-management
overhead for closure-maintenance overhead. A truly non-recursive
implementation would probably exist outside of whatever class you've got
this method in, and would keep its own pointer to the current node as it
traversed the tree.

Hope that helps,


Robert Brewer
MIS
Amor Ministries
fumanchu at amor.org



More information about the Python-list mailing list