Large Dictionaries
Claudio Grondi
claudio.grondi at freenet.de
Tue May 16 08:09:10 EDT 2006
Chris Foote wrote:
> Claudio Grondi wrote:
>
>> Chris Foote wrote:
>>
>>> p.s. Disk-based DBs are out of the question because most
>>> key lookups will result in a miss, and lookup time is
>>> critical for this application.
>>>
>> Python Bindings (\Python24\Lib\bsddb vers. 4.3.0) and the DLL for
>> BerkeleyDB (\Python24\DLLs\_bsddb.pyd vers. 4.2.52) are included in
>> the standard Python 2.4 distribution.
>
>
> However, please note that the Python bsddb module doesn't support
> in-memory based databases - note the library documentation's[1] wording:
>
> "Files never intended to be preserved on disk may be created by
> passing None as the filename."
>
> which closely mirrors the Sleepycat documentation[2]:
>
> "In-memory databases never intended to be preserved on disk
> may be created by setting the file parameter to NULL."
>
> It does actually use a temporary file (in /var/tmp), for which
> performance for my purposes is unsatisfactory:
>
> # keys   dictionary   metakit   bsddb       (all using psyco)
> ------   ----------   -------   -----
> 1M       8.8s         22.2s     20m25s[3]
> 2M       24.0s        43.7s     N/A
> 5M       115.3s       105.4s    N/A
>
> Cheers,
> Chris
>
> [1] bsddb docs:
> http://www.python.org/doc/current/lib/module-bsddb.html
>
> [2] Sleepycat BerkeleyDB C API:
> http://www.sleepycat.com/docs/api_c/db_open.html
>
> [3] Wall clock time. Storing the (long_integer, integer) key in string
> form "long_integer:integer" since bsddb doesn't support keys that aren't
> integers or strings.
I have to admit that I haven't written any code of my own to actually test
this, but if 20m25s for storing 1M string keys in a database table index
column is really what you are getting, I can't shake the feeling that there
is something fundamentally wrong with the way you are doing it.
Posting the code for your test cases seems to me the only way to find out
what is behind the mystery you are seeing here (it would also clear up the
other mysterious things discussed by the posters in this thread so far).
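
For reference, a test case of this kind could be as small as the following
sketch. It is not the code behind the timings quoted above, which has not
been posted. The names make_key, time_store and time_misses are made up for
illustration, the key values are arbitrary, and a plain dict stands in for
whichever store is being measured; the only detail taken from the thread is
the "long_integer:integer" string form of the key from footnote [3].

```python
import time

def make_key(long_part, int_part):
    # Encode a (long_integer, integer) pair as the string
    # "long_integer:integer", as described in footnote [3].
    return "%d:%d" % (long_part, int_part)

def time_store(store, n_keys):
    # Insert n_keys distinct string keys into a mapping-like store and
    # return the elapsed wall clock time in seconds.
    # The key values are arbitrary (hypothetical test data).
    start = time.time()
    for i in range(n_keys):
        store[make_key(i * 1000003, i % 7)] = "1"
    return time.time() - start

def time_misses(store, n_lookups):
    # Look up keys that were never stored (negative long part), since
    # most lookups in the application are expected to miss.
    # Returns (number of misses, elapsed wall clock seconds).
    start = time.time()
    misses = 0
    for i in range(n_lookups):
        if make_key(-i - 1, 0) not in store:
            misses += 1
    return misses, time.time() - start

if __name__ == "__main__":
    table = {}  # assumption: swap in the store under test here
    n = 100000
    print("store:  %.2fs" % time_store(table, n))
    missed, elapsed = time_misses(table, n)
    print("lookup: %.2fs (%d misses)" % (elapsed, missed))
```

Running the same two functions against each candidate store, with the same
number of keys, would at least show whether the time goes into building the
key strings or into the store itself.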
Claudio