Fastest database solution

Curt Hash curt.hash at gmail.com
Fri Feb 6 03:10:43 EST 2009


I'm writing a small application for detecting source code plagiarism that
currently relies on a database to store lines of code.

The application has two primary functions: adding a new file to the database
and comparing a file to those that are already stored in the database.
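
To make the two operations concrete, here is a minimal sketch using sqlite3.
The schema, the index, and the names create_schema, add_file and match_file
are illustrative assumptions, not taken from the application itself. It also
assumes that "matching" means exact equality of whitespace-stripped, non-blank
lines, which may be simpler than what a real plagiarism detector does.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical single-table layout: one row per stored line of code.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lines ("
        " file_name TEXT NOT NULL,"
        " line_no INTEGER NOT NULL,"
        " content TEXT NOT NULL)"
    )
    # Index on the line text so lookups by content avoid a full table scan.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_lines_content ON lines (content)"
    )


def normalized_lines(text):
    """Yield (line_no, stripped_line) for every non-blank line of text."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped:
            yield line_no, stripped


def add_file(conn, file_name, text):
    """Store the non-blank lines of one file; return how many were stored."""
    rows = [(file_name, n, s) for n, s in normalized_lines(text)]
    # "with conn" wraps all inserts for this file in a single transaction.
    with conn:
        conn.executemany(
            "INSERT INTO lines (file_name, line_no, content) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def match_file(conn, text):
    """Return {stored file name: count of lines of text found in that file}."""
    counts = {}
    for _, stripped in normalized_lines(text):
        cursor = conn.execute(
            "SELECT DISTINCT file_name FROM lines WHERE content = ?",
            (stripped,),
        )
        for (name,) in cursor:
            counts[name] = counts.get(name, 0) + 1
    return counts
```

With an in-memory database (sqlite3.connect(":memory:")), calling
create_schema once, then add_file for each known file, then match_file on a
candidate file gives a per-file count of shared lines.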

I started out using sqlite3, but was not satisfied with its performance. I
then tried psycopg2 with a local PostgreSQL server, and performance got even
worse. My simple benchmarks show that sqlite3 is on average 3.5 times faster
than psycopg2 at inserting a file, and on average less than a tenth of a
second slower than psycopg2 at matching a file.
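
For reference, a rough sketch of how averages like these can be measured.
The helper names average_seconds and speed_ratio are hypothetical, and the
default of five repeats is an arbitrary assumption; this is not the benchmark
code that produced the numbers above.

```python
import time


def average_seconds(func, repeats=5):
    """Call func() `repeats` times; return the mean wall-clock time in seconds."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        total += time.perf_counter() - start
    return total / repeats


def speed_ratio(slow_seconds, fast_seconds):
    """How many times faster the fast backend is than the slow one."""
    return slow_seconds / fast_seconds
```

Passing a zero-argument callable that inserts one file into each backend, and
then dividing the two averages with speed_ratio, yields a "times faster"
figure; subtracting them instead yields an absolute difference in seconds.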

I expected PostgreSQL to be a lot faster. Is there some peculiarity in
psycopg2 that could be causing the slowdown? Are these performance results
typical? Any suggestions on what to try from here? I don't think my code or
queries are inherently slow, but I'm not a DBA or a very accomplished Python
developer, so I could be wrong.

Any advice is appreciated.