Geneator/Iterator Nesting Problem - Any Ideas? 2.4
ChaosKCW
da.martian at gmail.com
Mon Mar 7 11:05:08 EST 2005
For those that are interested I ran a performance comparison of various
functions for wrapping sql results in a interables and generators. The
results are below and the code is at the very bottom.
Surprisinly (in a happy way) the functional version of the batch
retrieve comes in a very close second beating out two other common
iterative approaches. However the winner by a small margin is still an
iterative routine. The elegance of the more functional approach (thanks
to Steve) and its excellent performance makes it a clear winner in my
mind.
The other intersting conclusion is that batch fetching results gives a
fairly significant and real performance boost. Its not just academic.
The Winner:
def resultset_functional_batchgenerator(cursor, size=100):
for results in iter(lambda: cursor.fetchmany(size), []):
for rec in results:
yield rec
Test Results (P2.4 IBM T41 Thinkpad):
Ordered by: cumulative time
List reduced from 57 to 7 due to restriction <'test_'>
ncalls tottime percall cumtime percall filename:lineno(function)
1 2.140 2.140 54.002 54.002
PerformanceTestGenerators.py:102(test_resultset_functional_generator)
1 1.957 1.957 45.484 45.484
PerformanceTestGenerators.py:98(test_resultset_iterative_generator)
1 2.433 2.433 41.844 41.844
PerformanceTestGenerators.py:94(test_resultset_iterator)
1 1.930 1.930 39.793 39.793
PerformanceTestGenerators.py:110(test_resultset_iterative_batchgenerator_2)
1 1.734 1.734 35.561 35.561
PerformanceTestGenerators.py:114(test_resultset_iterative_batchgenerator_3)
1 1.980 1.980 34.579 34.579
PerformanceTestGenerators.py:118(test_resultset_functional_batchgenerator)
1 1.780 1.780 31.696 31.696
PerformanceTestGenerators.py:106(test_resultset_iterative_batchgenerator_1)
Code:
import unittest
import odbc
import profile
import pstats
class resultset_iterator:
"Iterate over the recordset and frees the cursor afterwards."
def __init__(self, cursor):
self.cursor = cursor
def __iter__(self): return self
def next(self):
rec = self.cursor.fetchone()
if not rec:
raise StopIteration
return rec
def resultset_iterative_generator(cursor):
rec = cursor.fetchone();
while rec:
yield rec
rec = cursor.fetchone();
def resultset_functional_generator(cursor):
for rec in iter(lambda: cursor.fetchone(), None):
yield rec
def resultset_iterative_batchgenerator_1(cursor, size=100):
results = cursor.fetchmany(size)
while results:
for rec in results:
yield rec
results = cursor.fetchmany(size)
def resultset_iterative_batchgenerator_2(cursor, arraysize=100):
'An iterator that uses fetchmany to keep memory usage down'
done = False
while not done:
results = cursor.fetchmany(arraysize)
if results == []:
done = True
for result in results:
yield result
def resultset_iterative_batchgenerator_3(cursor, size=100):
while True:
results = cursor.fetchmany(size)
if not results:
break
for rec in results:
yield rec
def resultset_functional_batchgenerator(cursor, size=100):
for results in iter(lambda: cursor.fetchmany(size), []):
for rec in results:
yield rec
class testResultSetGenerators(unittest.TestCase):
connectstring = "*REMOVED*"
sql = "*REMOVED*"
def setUp(self):
self.con = odbc.odbc(self.connectstring)
self.cur = self.con.cursor()
self.cur.execute(self.sql)
def tearDown(self):
self.cur.close()
self.con.close()
def test_resultset_iterator(self):
for row in resultset_iterator(self.cur):
pass
def test_resultset_iterative_generator(self):
for row in resultset_iterative_generator(self.cur):
pass
def test_resultset_functional_generator(self):
for row in resultset_functional_generator(self.cur):
pass
def test_resultset_iterative_batchgenerator_1(self):
for row in resultset_iterative_batchgenerator_1(self.cur):
pass
def test_resultset_iterative_batchgenerator_2(self):
for row in resultset_iterative_batchgenerator_2(self.cur):
pass
def test_resultset_iterative_batchgenerator_3(self):
for row in resultset_iterative_batchgenerator_3(self.cur):
pass
def test_resultset_functional_batchgenerator(self):
for row in resultset_functional_batchgenerator(self.cur):
pass
if __name__ == '__main__':
suite = unittest.makeSuite(testResultSetGenerators)
profile.run('unittest.TextTestRunner(verbosity=2).run(suite)',
'c:\\temp\\profile')
p = pstats.Stats('c:\\temp\\profile')
p.strip_dirs().sort_stats('cumulative').print_stats('test_')
More information about the Python-list
mailing list