Fw: Python Database Objects (PDO) 1.2.0 Released

Jon Franz jfranz at neurokode.com
Wed Nov 19 15:17:10 EST 2003


----- Original Message ----- 
From: "Jon Franz" <jfranz at neurokode.com>
To: "Serge Orlov" <sombDELETE at pobox.ru>
Sent: Wednesday, November 19, 2003 2:39 PM
Subject: Re: Python Database Objects (PDO) 1.2.0 Released


> > Yes, if the .open() is a generator then it must return a sequence of
> > items but only one at a time. If the loop body doesn't keep the result
> > object it will be garbage collected pretty soon. You don't need to return
> > a dictionary; you can return a special "coupler" object that will bind
> > the column description data (created only one time) with the column
> > values. Of course, it means one more allocation per row and extra
> > references, but I don't really think it's very expensive. After all, it
> > is idiomatic iteration over a sequence. Without hard data to prove
> > that it's really expensive I don't think it's right to say it's
> > expensive.
>
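
For readers unfamiliar with the "coupler" idea described above, here is a minimal sketch. It assumes nothing about PDO's real internals: the Coupler class, the iter_rows helper, and the sample rows are all hypothetical names and data made up for illustration. The point is only that the column index is built once and shared, while each row still costs one extra object.

```python
class Coupler:
    """Hypothetical row wrapper: shares one column-name index across all rows."""
    __slots__ = ("_index", "_values")

    def __init__(self, index, values):
        self._index = index      # dict: column name -> position, built once
        self._values = values    # tuple of values for this row only

    def __getitem__(self, name):
        # Look up the position by name, then fetch the value for this row.
        return self._values[self._index[name]]

    def keys(self):
        return list(self._index)


def iter_rows(columns, rows):
    """Yield one Coupler per row; the column index is created a single time."""
    index = {name: pos for pos, name in enumerate(columns)}
    for values in rows:
        yield Coupler(index, values)


# Illustrative data only, not taken from the benchmark.
columns = ("ID", "NAME", "DEPT")
rows = [(10, "SANDERS", 20), (20, "PERNAL", 20)]
names = [row["NAME"] for row in iter_rows(columns, rows)]
```

Each Coupler is small, but one is still allocated and later collected per row, which is the cost under discussion.
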
> Well, it is; hard data follows. But first, an aside: generators are
> something we want to avoid right now, since we support older
> versions of Python that do not support generators.  Thus, older
> interpreters would suffer from the memory bloat I've already
> described.
>
> Sorry for the caps below, DB2 uses caps everywhere and I'm just
> trying to be correct.
>
> However, the creation of the mapping object for each result adds up
> over time.  Coupling objects perform badly as well - I think it has to do
> with the sheer number of objects that are created as you loop over the
> results - garbage collecting them takes time.
>
> * Hard data.
> I populated the SAMPLE.STAFF table of my DB2 installation with
> 5235 records.
> I created two methods that grab all records from this table and loop
> over them, outputting each field for every record.  The regtimer
> method used the .next() call, while the itertimer method
> used a generator function to get a mapping object per record.
>
> Note that this test is not very optimized - for every record I loop
> over the .fields member of the Resultset, or in the iterator case,
> I loop over the members of the returned mapping object.
> The iterator mapping object is a Python dictionary - I do no
> tricks to provide associated column data - it's just name:value.
> 'Coupler' objects were even more expensive, and I didn't want to
> lose the use of my test server for too long, so I dropped them
> before progressing to the 100 run tests.
>
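
The two loop styles being compared can be sketched roughly as follows. This is an assumption-laden illustration, not the actual test code or PDO's API: the Resultset class, its columns attribute, iter_dicts, and the two *_style functions are hypothetical stand-ins, and the rows live in memory instead of coming from DB2.

```python
class Resultset:
    """Hypothetical cursor-style result set: one object, internal row index."""

    def __init__(self, columns, rows):
        self.columns = columns
        self._positions = {name: pos for pos, name in enumerate(columns)}
        self._rows = rows
        self._current = -1  # positioned before the first row

    def next(self):
        """Advance to the next row; return False when the rows run out."""
        self._current += 1
        return self._current < len(self._rows)

    def __getitem__(self, name):
        # Mapping-style access to the current row, no per-row object created.
        return self._rows[self._current][self._positions[name]]


def iter_dicts(columns, rows):
    """Generator alternative: build a fresh name:value dict for every row."""
    for values in rows:
        yield dict(zip(columns, values))


def regtimer_style(columns, rows):
    out = []
    rs = Resultset(columns, rows)
    while rs.next():
        for name in rs.columns:
            out.append((name, rs[name]))
    return out


def itertimer_style(columns, rows):
    out = []
    for record in iter_dicts(columns, rows):
        for name in record:
            out.append((name, record[name]))
    return out


# Illustrative data only.
sample_columns = ("ID", "NAME", "DEPT")
sample_rows = [(10, "SANDERS", 20), (20, "PERNAL", 20)]
same_output = (regtimer_style(sample_columns, sample_rows)
               == itertimer_style(sample_columns, sample_rows))
```

Both functions produce the same output; the difference is that the second allocates one dictionary per record.
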
> I used the timeit module to call each method 100 times,
> and reran the tests 3 times.
> Python was restarted in between runs.
> Output was identical between the tests.
>
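
A timing harness along these lines can be put together with the standard timeit module. This is a sketch under stated assumptions: by_index, by_dict, and average_seconds are hypothetical names, the rows are synthetic in-memory tuples, and timeit.repeat does not restart the interpreter between runs the way the original test did.

```python
import timeit

# Synthetic stand-in data; the real test read rows from a DB2 table.
COLUMNS = ("ID", "NAME", "DEPT")
ROWS = [(i, "NAME%d" % i, i % 10) for i in range(1000)]


def by_index():
    """Touch every field by position, creating no per-row mapping object."""
    total = 0
    for values in ROWS:
        for pos in range(len(COLUMNS)):
            total += len(str(values[pos]))
    return total


def by_dict():
    """Touch every field through a fresh dict built for each row."""
    total = 0
    for values in ROWS:
        record = dict(zip(COLUMNS, values))
        for name in record:
            total += len(str(record[name]))
    return total


def average_seconds(func, calls=100, runs=3):
    """Time `calls` invocations of func, `runs` times over, and average."""
    timings = timeit.repeat(func, number=calls, repeat=runs)
    return sum(timings) / len(timings)
```

Calling average_seconds(by_index) and average_seconds(by_dict) gives two figures comparable in spirit to the averages reported here, though the absolute numbers will differ since no database is involved.
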
> regtimer method: 281.4 seconds (avg)
> itertimer method:  627.1 seconds (avg)
>
> Building 5235 * 3 * 100 mapping objects seems very expensive
> (even when using an iterator/generator system) versus providing
> a mapping-object interface and keeping track of indexes
> internally.
>
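
To put the figures above in perspective, the arithmetic works out as below. The per-row estimate assumes each reported average covers one run of 100 calls over 5235 records; that reading is an assumption, not something stated in the message.

```python
records, runs, calls = 5235, 3, 100

# Total mapping objects built across all runs of the generator version.
mapping_objects = records * runs * calls          # 1570500

regtimer_avg = 281.4    # seconds, as reported
itertimer_avg = 627.1   # seconds, as reported

# How many times slower the mapping-object version was.
slowdown = itertimer_avg / regtimer_avg           # about 2.23

# Assumed reading: extra cost per record per call, in milliseconds.
extra_ms_per_row = (itertimer_avg - regtimer_avg) / (records * calls) * 1000
```
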
> With smaller numbers of records, the difference is less pronounced,
> but it's always there.
>
> > > It may help to quit thinking of a Resultset as a sequence of
> > > dictionaries - PDO explicitly avoids that.
> > Isn't it premature optimization?
>
> Nope, we'd already thought about this and investigated it - until you
> asked, no one knew that we had.  Our older tests showed that memory
> bloat was a problem, since we created the mapping objects all
> at once, and that performance was horrible.
>
> The iterator/generator version is faster than our old tests, but is still
> slower than the .next() approach.
>
> cheers.
>
> ~Jon Franz
> NeuroKode Labs, LLC
>





