Large Dictionaries

Claudio Grondi claudio.grondi at freenet.de
Mon May 15 09:22:09 EDT 2006


Chris Foote wrote:
> Hi all.
> 
> I have the need to store a large (10M) number of keys in a hash table,
> based on a tuple of (long_integer, integer).  The standard python
> dictionary works well for small numbers of keys, but starts to
> perform badly for me inserting roughly 5M keys:
> 
> # keys   dictionary  metakit   (both using psyco)
> ------   ----------  -------
> 1M            8.8s     22.2s
> 2M           24.0s     43.7s
> 5M          115.3s    105.4s
> 
> Has anyone written a fast hash module which is more optimal for
> large datasets ?
> 
> p.s. Disk-based DBs are out of the question because most
> key lookups will result in a miss, and lookup time is
> critical for this application.
> 
> Cheers,
> Chris
Berkeley DB might be worth a try. The Python bindings
(\Python24\Lib\bsddb, version 4.3.0) and the DLL for BerkeleyDB
(\Python24\DLLs\_bsddb.pyd, version 4.2.52) are included in the
standard Python 2.4 distribution.
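
A rough sketch of how the (long_integer, integer) keys could be fed to
such a store, since the bsddb objects only accept strings as keys and
values. The helper names (pack_key, unpack_key, store, lookup) are made
up for illustration, and the field widths (64-bit and 32-bit signed) are
an assumption about the data, not something taken from the original
post. A plain dict stands in for the database so the sketch runs without
bsddb; under Python 2.4 the object returned by bsddb.hashopen() could be
passed in its place.

```python
import struct

# Assumption: the long_integer fits in a signed 64-bit field and the
# integer in a signed 32-bit field. Big-endian, no padding: 12 bytes.
KEY_FORMAT = ">qi"


def pack_key(long_part, int_part):
    """Pack a (long_integer, integer) pair into a fixed-width byte string."""
    return struct.pack(KEY_FORMAT, long_part, int_part)


def unpack_key(raw):
    """Inverse of pack_key: recover the (long_integer, integer) tuple."""
    return struct.unpack(KEY_FORMAT, raw)


def store(db, long_part, int_part, value):
    """Store value under the packed key in any dict-like object."""
    db[pack_key(long_part, int_part)] = value


def lookup(db, long_part, int_part, default=None):
    """Return the stored value, or default on a miss."""
    try:
        return db[pack_key(long_part, int_part)]
    except KeyError:
        return default


# Stand-in for the database; any mapping with string keys works here.
db = {}
store(db, 2 ** 40, 7, struct.pack(">i", 42))
hit = lookup(db, 2 ** 40, 7)
miss = lookup(db, 2 ** 40, 8)
```

Whether this beats the built-in dictionary at 5M to 10M keys would have
to be measured; the packing step only makes the keys acceptable to a
string-keyed store.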

"Berkeley DB was 20 times faster than other databases. It has the
operational speed of a main memory database, the startup and shut down
speed of a disk-resident database, and does not have the overhead of
a client-server inter-process communication."
Ray Van Tassle, Senior Staff Engineer, Motorola

Please let me/us know if it is what you are looking for.

Claudio


