sqlite INSERT performance

duncan smith buzzard at urubu.freeserve.co.uk
Thu May 31 11:25:10 EDT 2012


On 31/05/12 06:15, John Nagle wrote:
> On 5/30/2012 6:57 PM, duncan smith wrote:
>> Hello,
>> I have been attempting to speed up some code by using an sqlite
>> database, but I'm not getting the performance gains I expected.
>
> SQLite is a "lite" database. It's good for data that's read a
> lot and not changed much. It's good for small data files. It's
> so-so for large database loads. It's terrible for a heavy load of
> simultaneous updates from multiple processes.
>

Once the table is created the data will not be changed at all.
Corresponding integer codes will have to be generated for the columns.
(I want to do this lazily because some columns might never be needed
for output files, and processing all columns was relatively expensive
for my initial solution.) After that it's a series of "SELECT a, b, ...
FROM table WHERE f = 'g' ORDER BY a, b, ..." style queries dumped to
space-separated text files.
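
For concreteness, the dump step looks something like this (table,
column and file names here are invented). Iterating over the cursor
fetches rows incrementally, so the result set never has to sit in
memory all at once:

import sqlite3

def dump_query(conn, value, outfile):
    # Stream the sorted subset straight to a space-separated file.
    cur = conn.execute(
        "SELECT a, b, c FROM tbl WHERE f = ? ORDER BY a, b, c",
        (value,))
    with open(outfile, "w") as fh:
        for row in cur:
            fh.write(" ".join(str(v) for v in row) + "\n")

conn = sqlite3.connect("data.db")
dump_query(conn, "g", "subset.txt")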

> However, wrapping the inserts into a transaction with BEGIN
> and COMMIT may help.
>

Unfortunately there's no discernible difference.
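
For reference, this is the shape of the transaction-wrapped bulk insert
under discussion (table and column names are placeholders; the PRAGMAs
are the usual durability-for-speed trades, which seem acceptable for a
table that can always be rebuilt from the text files):

import sqlite3

conn = sqlite3.connect("data.db")
conn.isolation_level = None         # autocommit; BEGIN/COMMIT issued by hand
conn.execute("PRAGMA synchronous = OFF")      # skip per-write fsyncs
conn.execute("PRAGMA journal_mode = MEMORY")  # keep the journal in RAM
conn.execute("CREATE TABLE IF NOT EXISTS tbl (a INTEGER, b INTEGER)")

rows = ((i, 2 * i) for i in range(1000000))   # stand-in for the real data
conn.execute("BEGIN")
conn.executemany("INSERT INTO tbl (a, b) VALUES (?, ?)", rows)
conn.execute("COMMIT")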

> If you have 67 columns in a table, you may be approaching the
> problem incorrectly.
>

Quite possibly. I have defined start and end points. The data are 
contained in text files. I need to do the mapping to integer codes and 
generate output files for subsets of variables conditional on the levels 
of other variables. (I was doing the subsequent sorting separately, but 
if I'm using SQL I guess I might as well include that in the query.) The 
output files are inputs for other (C++) code that I have no control over.
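
To pin down what I mean by lazy coding, a minimal sketch (column and
value names are invented): a code is assigned per column the first time
a value is seen, so columns that are never requested for output never
get processed at all.

class LazyCoder:
    def __init__(self):
        self.codes = {}                     # column name -> {value: code}

    def code(self, column, value):
        mapping = self.codes.setdefault(column, {})
        if value not in mapping:
            mapping[value] = len(mapping)   # next unused integer code
        return mapping[value]

coder = LazyCoder()
coder.code("f", "g")   # -> 0
coder.code("f", "h")   # -> 1
coder.code("f", "g")   # -> 0 again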

Any approach that doesn't consume large amounts of memory will do. Cheers.

Duncan
