sqlite3 import performance
Gerhard Häring
gh at ghaering.de
Fri Sep 5 19:30:18 EDT 2008
Ben Lee wrote:
> hi folks --
>
> a quick python and sqlite3 performance question. i find that
> inserting a million rows of in-memory data into an in-memory database
> via a single executemany() is about 30% slower than using the sqlite3
> CLI and the .import command (reading the same data from a disk file,
> even.) i find this surprising, executemany() i assume is using a
> prepared statement and this is exactly what the .import command does
> (based on my quick perusal of the source.)
>
> is this discrepancy to be expected? where is the overhead coming
> from? [...]
Ok, I'll bite.
Well, first, the a 30 % slowdown with a Python DB-API wrapper compared
to the native commandline tool of the database is to be considered still
quite reasonable, in my opinion.
About a year ago I compared the performance of pysqlite vs. the other
SQLite wrapper, APSW. At the time, APSW was a bit faster, not
excessively, but measurable. In meaningless benchmarks like yours ;-)
So I changed pysqlite here and there to get the same performance as
APSW. Only minor tweaks, nothing spectacular. And a few hardcore tricks
as well, like special-casing *not-subclassed* classes. The result was
pysqlite 2.3.5:
http://oss.itsystementwicklung.de/trac/pysqlite/wiki/2.3.5_Changelog
These optmizations are not yet in Python 2.5.x, but they can be found in
the sqlite3 module of the Python 2.6/3.0 betas.
Well, here are my results of your benchmark :-)
-- Gerhard
# with Python 2.5.2's sqlite3 module
gerhard at lara:~/tmp$ python t.py
generating data...
done!
testing 1000000 inserts...
[42.795290946960449, 44.337385892868042, 46.35642409324646]
# with pysqlite 2.5.0, which I released earlier today
gerhard at lara:~/tmp$ python t.py
generating data...
done!
testing 1000000 inserts...
[33.027599096298218, 32.73675012588501, 32.823790073394775]
# SQLite commandline
gerhard at lara:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real 0m32.514s
gerhard at lara:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real 0m32.576s
gerhard at lara:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real 0m32.604s
More information about the Python-list
mailing list