bad generator performance

Jimmy Retzlaff jimmy at retzlaff.com
Sun Feb 6 09:57:03 EST 2005


Johannes Ahl-mann wrote:
> i just wondered if there was a concise, clean way of doing this with
> generators which i had overlooked, or whether there was some blatant
> mistake in my code.

Aside from the recursion question...

You don't really give the complete story so it's hard to tell what
exactly is going on. For example, I would assume the recursion is
calling the same method (i.e., depthFirstIterator1 or
depthFirstIterator2), but then you posted separate timing information
for a "recursive helper function" so I'm not so sure. Also there is not
much of a hint as to the nature of your data.

Below is a test where each of your 2 methods calls itself recursively.
To make it work I had to build a data tree - I used the files and
folders in my C:\Python23 directory. In my testing (on Python 2.3.4 on
Windows XP), the generator version takes about 1/4 of the time that the
list version takes. If both versions call the list version of the method
recursively (i.e., you're only getting the benefit of the generator at
the top level of the recursion), the generator version is still about
20% faster.

Timing differences could potentially depend on your data also - things
like how deep vs. wide your tree is.

Jimmy


import os
import time

class Node:
    def __init__(self, pathname):
        self.thisIsAFolder = os.path.isdir(pathname)
        self.children = []
        if self.isFolder():
            for filename in os.listdir(pathname):
                childpathname = os.path.join(pathname, filename)
                self.children.append(Node(childpathname))

    def isFolder(self):
        return self.thisIsAFolder

    def depthFirstIterator1(self, depth = 0):
        ret = [[self, True, depth]]

        if self.isFolder():
          for c in self.children:
            ret = ret + c.depthFirstIterator1(depth = depth + 1)

        return ret + [[self, False, depth]]

    def depthFirstIterator2(self, depth = 0):
        yield [self, True, depth]

        if self.isFolder():
          for c in self.children:
            for y in c.depthFirstIterator2(depth = depth + 1):
              yield y

        yield [self, False, depth]


x = Node(r'C:\Python23')
for iterator in (x.depthFirstIterator1, x.depthFirstIterator2):
    print iterator.__name__,
    start = time.time()
    for item in iterator():
        pass
    print round(time.time()-start, 2)



More information about the Python-list mailing list