Geneator/Iterator Nesting Problem - Any Ideas? 2.4

ChaosKCW da.martian at gmail.com
Mon Mar 7 11:05:08 EST 2005


For those that are interested I ran a performance comparison of various
functions for wrapping sql results in a interables and generators. The
results are below and the code is at the very bottom.

Surprisinly (in a happy way) the functional version of the batch
retrieve comes in a very close second beating out two other common
iterative approaches. However the winner by a small margin is still an
iterative routine. The elegance of the more functional approach (thanks
to Steve) and its excellent performance makes it a clear winner in my
mind.

The other intersting conclusion is that batch fetching results gives a
fairly significant and real performance boost. Its not just academic.

The Winner:

def resultset_functional_batchgenerator(cursor, size=100):
      for results in iter(lambda: cursor.fetchmany(size), []):
         for rec in results:
             yield rec


Test Results (P2.4 IBM T41 Thinkpad):
   Ordered by: cumulative time
   List reduced from 57 to 7 due to restriction <'test_'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    2.140    2.140   54.002   54.002
PerformanceTestGenerators.py:102(test_resultset_functional_generator)
        1    1.957    1.957   45.484   45.484
PerformanceTestGenerators.py:98(test_resultset_iterative_generator)
        1    2.433    2.433   41.844   41.844
PerformanceTestGenerators.py:94(test_resultset_iterator)
        1    1.930    1.930   39.793   39.793
PerformanceTestGenerators.py:110(test_resultset_iterative_batchgenerator_2)
        1    1.734    1.734   35.561   35.561
PerformanceTestGenerators.py:114(test_resultset_iterative_batchgenerator_3)
        1    1.980    1.980   34.579   34.579
PerformanceTestGenerators.py:118(test_resultset_functional_batchgenerator)
        1    1.780    1.780   31.696   31.696
PerformanceTestGenerators.py:106(test_resultset_iterative_batchgenerator_1)

Code:

import unittest
import odbc
import profile
import pstats

class resultset_iterator:
    "Iterate over the recordset and frees the cursor afterwards."
    def __init__(self, cursor):
        self.cursor = cursor
    def __iter__(self): return self
    def next(self):
        rec = self.cursor.fetchone()
        if not rec:
           raise StopIteration
        return rec

def resultset_iterative_generator(cursor):
    rec = cursor.fetchone();
    while rec:
        yield rec
        rec = cursor.fetchone();

def resultset_functional_generator(cursor):
    for rec in iter(lambda: cursor.fetchone(), None):
        yield rec

def resultset_iterative_batchgenerator_1(cursor, size=100):
    results = cursor.fetchmany(size)
    while results:
        for rec in results:
            yield rec
        results = cursor.fetchmany(size)

def resultset_iterative_batchgenerator_2(cursor, arraysize=100):
    'An iterator that uses fetchmany to keep memory usage down'
    done = False
    while not done:
        results = cursor.fetchmany(arraysize)
        if results == []:
            done = True
        for result in results:
            yield result

def resultset_iterative_batchgenerator_3(cursor, size=100):
     while True:
         results = cursor.fetchmany(size)
         if not results:
             break
         for rec in results:
             yield rec

def resultset_functional_batchgenerator(cursor, size=100):
      for results in iter(lambda: cursor.fetchmany(size), []):
         for rec in results:
             yield rec

class testResultSetGenerators(unittest.TestCase):

    connectstring = "*REMOVED*"
    sql = "*REMOVED*"

    def setUp(self):
        self.con = odbc.odbc(self.connectstring)
        self.cur = self.con.cursor()
        self.cur.execute(self.sql)

    def tearDown(self):
        self.cur.close()
        self.con.close()

    def test_resultset_iterator(self):
        for row in resultset_iterator(self.cur):
            pass

    def test_resultset_iterative_generator(self):
        for row in resultset_iterative_generator(self.cur):
            pass

    def test_resultset_functional_generator(self):
        for row in resultset_functional_generator(self.cur):
            pass

    def test_resultset_iterative_batchgenerator_1(self):
        for row in resultset_iterative_batchgenerator_1(self.cur):
            pass

    def test_resultset_iterative_batchgenerator_2(self):
        for row in resultset_iterative_batchgenerator_2(self.cur):
            pass

    def test_resultset_iterative_batchgenerator_3(self):
        for row in resultset_iterative_batchgenerator_3(self.cur):
            pass

    def test_resultset_functional_batchgenerator(self):
        for row in resultset_functional_batchgenerator(self.cur):
            pass

if __name__ == '__main__':
    suite = unittest.makeSuite(testResultSetGenerators)
    profile.run('unittest.TextTestRunner(verbosity=2).run(suite)',
'c:\\temp\\profile')

    p = pstats.Stats('c:\\temp\\profile')
    p.strip_dirs().sort_stats('cumulative').print_stats('test_')




More information about the Python-list mailing list