Large Dictionaries
Chris Foote
chris at foote.com.au
Tue May 16 05:05:36 EDT 2006
Claudio Grondi wrote:
> Chris Foote wrote:
>> p.s. Disk-based DBs are out of the question because most
>> key lookups will result in a miss, and lookup time is
>> critical for this application.
>>
> Python Bindings (\Python24\Lib\bsddb vers. 4.3.0) and the DLL for
> BerkeleyDB (\Python24\DLLs\_bsddb.pyd vers. 4.2.52) are included in the
> standard Python 2.4 distribution.
However, please note that the Python bsddb module doesn't support
purely in-memory databases. Note the library documentation's[1] wording:
"Files never intended to be preserved on disk may be created by
passing None as the filename."
which closely mirrors the Sleepycat documentation[2]:
"In-memory databases never intended to be preserved on disk may be
created by setting the file parameter to NULL."
It actually uses a temporary file (in /var/tmp), and its performance
is unsatisfactory for my purposes:
# keys   dictionary   metakit   bsddb        (all using psyco)
------   ----------   -------   -----
1M       8.8s         22.2s     20m25s[3]
2M       24.0s        43.7s     N/A
5M       115.3s       105.4s    N/A
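
For anyone wanting to try a comparable measurement of the plain dictionary
case, here is a minimal sketch. The post does not say exactly what was
timed, so everything here is an assumption: the function names
(build_table, time_misses), the key ranges, and the choice to time only
lookups that miss. It is written for a current Python 3 rather than
Python 2.4 with psyco, so absolute numbers will not be comparable with
the table.

```python
import random
import time


def build_table(n_keys, seed=0):
    """Build a dict keyed by (long_integer, integer) tuples (hypothetical ranges)."""
    rng = random.Random(seed)
    table = {}
    while len(table) < n_keys:
        # Second element is always in 0..999 for keys that are stored.
        key = (rng.getrandbits(48), rng.randrange(1000))
        table[key] = len(table)
    return table


def time_misses(table, n_lookups, seed=1):
    """Time lookups for keys that cannot be present; return (hits, seconds)."""
    rng = random.Random(seed)
    # Second element is in 1000..1999, so every probe is a guaranteed miss.
    probes = [(rng.getrandbits(48), 1000 + rng.randrange(1000))
              for _ in range(n_lookups)]
    start = time.perf_counter()
    hits = sum(1 for key in probes if key in table)
    elapsed = time.perf_counter() - start
    return hits, elapsed


table = build_table(100000)
hits, elapsed = time_misses(table, 100000)
print("%d keys, %d hits, %.3fs for 100000 miss lookups"
      % (len(table), hits, elapsed))
```

Scaling n_keys up towards the 1M to 5M range in the table mainly shows
memory pressure; the lookup loop itself stays cheap as long as the dict
fits in RAM.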
Cheers,
Chris
[1] bsddb docs:
http://www.python.org/doc/current/lib/module-bsddb.html
[2] Sleepycat BerkeleyDB C API:
http://www.sleepycat.com/docs/api_c/db_open.html
[3] Wall clock time. The (long_integer, integer) key was stored in string
form "long_integer:integer", since bsddb doesn't support keys that aren't
integers or strings.
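
The string form described in [3] could be produced along these lines.
This is only a sketch: encode_key and decode_key are hypothetical helper
names, not anything from bsddb, and the exact formatting used for the
benchmark is not given beyond the "long_integer:integer" pattern.

```python
def encode_key(long_id, small_id):
    """Flatten a (long_integer, integer) pair into "long_integer:integer"."""
    return "%d:%d" % (long_id, small_id)


def decode_key(text):
    """Recover the original pair from its "long_integer:integer" form."""
    left, right = text.split(":")
    return int(left), int(right)


key = encode_key(12345678901234, 42)
print(key)
print(decode_key(key))
```

The extra formatting and parsing on every access is part of the cost of
a store that accepts only string keys, whereas a dictionary can use the
tuple directly.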