How to find bad row with db api executemany()?

Dave Angel davea at davea.name
Fri Mar 29 14:53:30 EDT 2013


On 03/29/2013 10:48 AM, Roy Smith wrote:
> I'm inserting a gazillion rows into a MySQL database using MySQLdb and cursor.executemany() for efficiency.  Every once in a while, I get a row which violates some kind of database constraint and raises Error.
>
> I can catch the exception, but don't see any way to tell which row caused the problem.  Is this information obtainable, short of retrying each row one by one?
>

I don't know the direct answer, or even if there is one (way to get 
MySQL to tell you which one failed), but ...

Assuming that executeMany is much cheaper than a million calls to 
executeOne (or whatever).

  -- single bad rows --
If you have a million items, and you know exactly one is different, you 
can narrow it down more quickly than just sequencing through them.  You 
can do half of them at a time, carefully choosing which subset of the 
total you use each time.  After 20 such calls, you can then calculate 
exactly which one is different.  Standard CS algorithm.

  -- sparse set of rows --
If you know that it's at least one, but still less than a dozen or so, 
it's a little trickier, but you should still converge on a final list 
pretty quickly.  Each time you do half, you also do the complementary 
half.  If either of them has no 'differences" you can then eliminate 
half the cases.

If you don't get a specific answer where MySQL can tell you the bad row, 
and if you don't know what I'm talking about, ask and I'll try to 
elaborate on one of the two above cases.

-- 
DaveA



More information about the Python-list mailing list