Populating a dictionary, fast [SOLVED]

Francesc Altet faltet at carabos.com
Tue Nov 13 11:27:11 EST 2007


On Monday 12 November 2007, Michael Bacarella wrote:
> As for the solution, after trying a half-dozen different integer
> hashing functions and hash table sizes (the brute force approach),
> on a total whim I switched to a model with two dictionary tiers and
> got whole orders of magnitude better performance.
>
> The tiering is, for a given key of type long:
>
>     id2name[key >> 40][key & 0x10000000000] = name
>
> Much, much better.  A few minutes versus hours this way.
>
> I suspect it could be brought down to seconds with a third level of
> tiers but this is no longer posing the biggest bottleneck... ;)
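
For anyone following along, the two-tier layout quoted above could be
sketched roughly as follows.  Note that the mask as quoted,
0x10000000000, has only bit 40 set, so this sketch assumes the intent
was to keep the low 40 bits (0xFFFFFFFFFF) in the second tier.  The
helper names put() and get() are made up for illustration; they are
not from the original post.

```python
LOW_BITS = 40
LOW_MASK = (1 << LOW_BITS) - 1   # 0xFFFFFFFFFF: assumed intent, see note above

id2name = {}                     # outer tier: high bits -> inner dict


def put(key, name):
    """Store name under key, split into a (high bits, low bits) pair."""
    id2name.setdefault(key >> LOW_BITS, {})[key & LOW_MASK] = name


def get(key):
    """Return the name stored under key; raises KeyError if absent."""
    return id2name[key >> LOW_BITS][key & LOW_MASK]


put((5 << 40) | 123, "alice")
put((5 << 40) | 124, "bob")
put((7 << 40) | 123, "carol")
```

With the full 40-bit mask, keys that share the same high bits but
differ in the low bits land in distinct slots of the same inner
dictionary.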

I don't know exactly why you need a dictionary for keeping the data, 
but in case you want ultra-fast access to values, there is no 
replacement for keeping three lists: a sorted list of keys, a list with 
the original indices of the values, and the list of values itself.  
Then, to access a value, you only have to do a binary search on the 
sorted list, a lookup in the original-indices list, and then go 
straight to the value in the value list.  This should be the fastest 
approach I can think of.
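
A minimal sketch of that three-list scheme, using the standard bisect 
module for the binary search.  The sample data and the names 
sorted_keys, orig_index and lookup() are assumptions made up for 
illustration:

```python
from bisect import bisect_left

# Values in their original order, and the key that belongs to each one.
values = ["carol", "alice", "bob"]
keys = [30, 10, 20]              # keys[i] is the key of values[i]

# Original positions ordered by key, and the keys in sorted order.
orig_index = sorted(range(len(keys)), key=keys.__getitem__)
sorted_keys = [keys[i] for i in orig_index]


def lookup(key):
    """Binary-search the sorted keys, then follow the index to the value."""
    pos = bisect_left(sorted_keys, key)
    if pos == len(sorted_keys) or sorted_keys[pos] != key:
        raise KeyError(key)
    return values[orig_index[pos]]
```

Each lookup costs O(log n) comparisons plus two list indexings.  For 
large integer keys, the same layout would presumably also work with 
typed arrays in place of plain lists to save memory.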

Another possibility is using an indexed column in a table in a DB.  
Lookups there should be much faster than using a dictionary as well.
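
As a rough illustration of the indexed-column idea, here is a sketch 
using the sqlite3 module from the standard library.  The choice of 
SQLite, the table and index names, and the helper db_lookup() are all 
assumptions for the example; any DB with indexed columns would follow 
the same pattern.

```python
import sqlite3

# In-memory database for the example; a file path would persist the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id2name (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO id2name VALUES (?, ?)",
    [(10, "alice"), (20, "bob"), (30, "carol")],
)
# Build the index after the bulk insert so the rows are indexed in one pass.
conn.execute("CREATE INDEX idx_id2name_id ON id2name (id)")
conn.commit()


def db_lookup(key):
    """Return the name stored for key, or None if the key is absent."""
    row = conn.execute(
        "SELECT name FROM id2name WHERE id = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]
```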

HTH,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"


