Large Dictionaries

Chris Foote chris at foote.com.au
Mon May 15 23:19:50 EDT 2006


Aahz wrote:
> In article <roy-765731.09112115052006 at reader1.panix.com>,
> Roy Smith  <roy at panix.com> wrote:
>> In article <1147699064.107490 at teuthos>, Chris Foote <chris at foote.com.au> 
>> wrote:
>>> I have the need to store a large (10M) number of keys in a hash table,
>>> based on a tuple of (long_integer, integer).  The standard python
>>> dictionary works well for small numbers of keys, but starts to
>>> perform badly for me inserting roughly 5M keys:
>>>
>>> # keys   dictionary  metakit   (both using psyco)
>>> ------   ----------  -------
>>> 1M            8.8s     22.2s
>>> 2M           24.0s     43.7s
>>> 5M          115.3s    105.4s
>> Are those clock times or CPU times?
> 
> And what are these times measuring?

The times measure loading a file into a dictionary, i.e. insertions
only, with no lookup operations.
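
For anyone wanting to reproduce this kind of measurement, here is a rough sketch that reports both wall-clock and CPU time for an insert-only load, since the question of which one the table shows is still open. It assumes a modern Python 3 interpreter without Psyco; load_keys, time_load and the synthetic key pattern are made-up stand-ins for the real file loader, not the original benchmark.

```python
import time


def load_keys(n):
    """Insert n (long_integer, integer) tuple keys into a dict; no lookups."""
    d = {}
    for i in range(n):
        # Synthetic key: the first element is unique per i, so no collisions
        # between keys. A real loader would parse these from a file instead.
        d[(i * 2654435761, i & 0xFFFF)] = None
    return d


def time_load(n):
    """Return (number_of_keys, wall_seconds, cpu_seconds) for one load."""
    wall_start = time.perf_counter()   # elapsed (clock) time
    cpu_start = time.process_time()    # CPU time used by this process
    d = load_keys(n)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return len(d), wall, cpu


if __name__ == "__main__":
    for n in (100_000, 200_000, 500_000):
        size, wall, cpu = time_load(n)
        print(f"{size:>9} keys  wall {wall:6.2f}s  cpu {cpu:6.2f}s")
```

If wall time is much larger than CPU time, the process is probably waiting on something else (disk or swap, for instance) rather than on the dictionary itself.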

> Don't forget that adding keys
> requires resizing the dict, which is a moderately expensive operation.

Yep, that's why I probably need a dictionary where I can pre-specify
an approximate size at the time of its creation.
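
As far as I know the built-in dict takes no size hint at construction, so the most one can do from pure Python is watch where the resizes happen. The sketch below assumes CPython (sys.getsizeof may not be supported elsewhere) and uses plain integer keys for brevity; resize_points is a hypothetical helper name, and the exact sizes it reports vary between Python versions.

```python
import sys


def resize_points(n):
    """Return (key_count, size_in_bytes) each time the dict's footprint changes."""
    d = {}
    last = sys.getsizeof(d)
    points = [(0, last)]
    for i in range(n):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last:
            # The reported size changed, so the table was reallocated here.
            points.append((len(d), size))
            last = size
    return points


if __name__ == "__main__":
    for count, size in resize_points(1_000_000):
        print(f"{count:>9} keys -> {size:>10} bytes")
```

The resizes get rarer as the table grows, but each one has to reinsert every existing key, which is the cost Aahz mentions above.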

> Once the dict is constructed, lookup times should be quite good.

Very good indeed with Psyco!

Cheers,
Chris


