How to find bad row with db api executemany()?

Roy Smith roy at panix.com
Fri Mar 29 21:19:22 EDT 2013


In article <mailman.3977.1364605026.2939.python-list at python.org>,
 Chris Angelico <rosuav at gmail.com> wrote:

> On Sat, Mar 30, 2013 at 11:41 AM, Roy Smith <roy at panix.com> wrote:
> > In article <mailman.3971.1364595940.2939.python-list at python.org>,
> >  Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
> >
> >> If using MySQLdb, there isn't all that much difference... MySQLdb is
> >> still compatible with MySQL v4 (and maybe even v3), and since those
> >> versions don't have "prepared statements", .executemany() essentially
> >> turns into something that creates a newline delimited "list" of
> >> "identical" (but for argument substitution) statements and submits that
> >> to MySQL.
> >
> > Shockingly, that does appear to be the case.  I had thought during my
> > initial testing that I was seeing far greater throughput, but as I got
> > more into the project and started doing some side-by-side comparisons,
> > the differences went away.
> 
> How much are you doing per transaction? The two extremes (everything
> in one transaction, or each line in its own transaction) are probably
> the worst for performance. See what happens if you pepper the code
> with 'begin' and 'commit' statements (maybe every thousand or ten
> thousand rows) to see if performance improves.
> 
> ChrisA

We're doing it all in one transaction, on purpose.  We start with an 
initial dump, then get updates about once a day.  We want to make sure 
that the updates either complete without errors, or back out cleanly.  
If we ever had a partial daily update, the result would be a mess.
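Roughly, the current update path looks like this (just a sketch -- the 
connection details, table, and column names are made up, and I'm assuming 
MySQLdb with the usual DB-API calls):

    import MySQLdb

    conn = MySQLdb.connect(host="localhost", user="app",
                           passwd="secret", db="appdb")
    cursor = conn.cursor()

    # The day's update, as a list of parameter tuples (made-up data).
    rows = [(1, "foo", 3.14), (2, "bar", 2.72)]

    try:
        # One big transaction: with MySQLdb this still amounts to sending
        # a batch of individual INSERTs, but either they all go in or
        # none of them do.
        cursor.executemany(
            "INSERT INTO daily_update (id, name, value)"
            " VALUES (%s, %s, %s)",
            rows)
        conn.commit()
    except MySQLdb.Error:
        # Any failure backs out the whole day's update.
        conn.rollback()
        raise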

Hmmm, on the other hand, I could probably try doing the initial dump the 
way you describe.  If it fails, we can just delete the whole thing and 
start again.
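Something like this is what I have in mind for the bulk load, committing 
every N rows as you suggest (again only a sketch; the chunk size is 
arbitrary and it reuses the made-up table above):

    CHUNK = 10000  # arbitrary; worth experimenting with different sizes

    def load_in_chunks(conn, rows, chunk=CHUNK):
        # Bulk load with a commit every `chunk` rows.  If anything blows
        # up partway through, we just drop the table and start the dump
        # over, so a partial load is acceptable here.
        cursor = conn.cursor()
        for start in range(0, len(rows), chunk):
            cursor.executemany(
                "INSERT INTO daily_update (id, name, value)"
                " VALUES (%s, %s, %s)",
                rows[start:start + chunk])
            conn.commit()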


