Large Dictionaries

Chris Foote chris at foote.com.au
Mon May 15 08:43:00 EDT 2006


Hi all.

I need to store a large number of keys (10M) in a hash table,
keyed on a tuple of (long_integer, integer).  The standard Python
dictionary works well for small numbers of keys, but insertion starts
to perform badly for me at roughly 5M keys:

# keys   dictionary   metakit
------   ----------   -------
1M             8.8s     22.2s
2M            24.0s     43.7s
5M           115.3s    105.4s

(both using psyco)
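
For anyone wanting to reproduce numbers like these, here is a minimal sketch of an insertion timing loop. It assumes a current Python 3 without psyco, and the key layout (a large integer paired with a small one) and the name time_inserts are made up for illustration; the real keys and test harness are not shown in this post.

```python
import time


def time_inserts(n):
    """Insert n (long_integer, integer) tuple keys into a dict.

    Returns (elapsed_seconds, table).
    """
    table = {}
    start = time.perf_counter()
    for i in range(n):
        # Hypothetical key layout: the first element is unique per i,
        # so every key is distinct and the dict ends up with n entries.
        table[(i * 2654435761, i % 1000)] = None
    elapsed = time.perf_counter() - start
    return elapsed, table


if __name__ == "__main__":
    # Small sizes so the sketch runs quickly; scale up to 1M+ as needed.
    for n in (100_000, 200_000, 500_000):
        elapsed, table = time_inserts(n)
        print(f"{len(table):>8} keys inserted in {elapsed:.3f}s")
```

Timings from a loop like this depend heavily on the Python version, available memory and key distribution, so they are only comparable against other runs on the same machine.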

Has anyone written a fast hash module that is better suited to
large datasets?

P.S. Disk-based DBs are out of the question because most
key lookups will result in a miss, and lookup time is
critical for this application.

Cheers,
Chris


