What is the recommended python module for SQL database access?

Chris Angelico rosuav at gmail.com
Sun Feb 9 06:00:58 EST 2014


On Sun, Feb 9, 2014 at 9:20 PM, Marcel Rodrigues <marcelgmr at gmail.com> wrote:
> As Chris said, if your needs are simple, use SQLite back-end. It's probably
> already installed on your computer and Python has a nice interface to it in
> its standard library.

Already installed? I thought the point of SQLite3 being in the Python
stdlib was that Python actually included the entire engine (that's why
there's no, for instance, PostgreSQL client in the stdlib - because
there's no server; I disagree with the reasoning, but it is consistent
and valid), so you don't need _anything_ externally installed.

In any case, SQLite is ideal for really simple databasing. Back in the
1990s, I had DB2, DB2, and DB2, for all my database work. I wanted a
way to query a dictionary of English words using SQL, so I created a
DB2 database and threw ~60K rows into a table. Massive overkill for a
one-column table. These days, I could use SQLite (or more likely, just
use grep on /usr/share/dict/words - grep does everything that I wanted
SQL for, if you include piping from one grep into another), cutting
the overhead down enormously.

The biggest downside of SQLite3 is concurrency. I haven't dug into the
exact details of the pager system and such, but it seems to be fairly
coarse in its locking. Also, stuff gets a bit complicated when you do
a single transaction involving multiple files. So if you have lots of
processes writing to the same set of SQLite tables, you'll see pretty
poor performance. PostgreSQL handles that situation far better, but
has a lot more overhead, so it's a poor choice for a single simple
application. MySQL's locking/concurrency system is specifically
optimized for a model that's common for web applications: a huge
number of readers and a tiny number of writers (sometimes referred to
as Data Warehousing, because you basically stuff a warehouse full of
data and then everyone comes looking for it). For the write-heavy
model (sometimes called OLTP or On-Line Transaction Processing),
PostgreSQL will hugely outperform MySQL, thanks to its MVCC model.

Broad recommendation: Single application, tiny workload, concurrency
not an issue, simplicity desired? Go SQLite. Big complex job, need
performance, lots of things reading and writing at once, want
networked access? Go PGSQL. And don't go MySQL if PG is an option.

And definitely don't go for a non-free option (MS-SQL, DB2, etc)
unless you've looked into it really closely and you are absolutely
thoroughly *sure* that you need that system (which probably means you
need your app to integrate with someone else's, and that other app
demands one particular database).

ChrisA



More information about the Python-list mailing list