don't need dictionary's keys - hash table?

Piet van Oostrum piet at cs.uu.nl
Thu Jul 13 09:52:49 EDT 2006


>>>>> kdotsky at gmail.com (k) wrote:

>k> Hello,
>k> I am using some very large dictionaries with keys that are long strings
>k> (urls).  For a large dictionary these keys start to take up a
>k> significant amount of memory.  I do not need access to these keys -- I
>k> only need to be able to retrieve the value associated with a certain
>k> key, so I do not want to have the keys stored in memory.  Could I just
>k> hash() the url strings first and use the resulting integer as the key?
>k> I think what I'm after here is more like a traditional hash table.  If I
>k> do it this way am I going to get the memory savings I am after?  Will
>k> the hash function always generate unique keys?  Also, would the same
>k> technique work for a set?

Maybe a Berkeley DB hash file would be a good alternative. It can hold
all your key/value pairs on disk while keeping only a small part of them
in memory.
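
A rough sketch of the idea, under some assumptions: it uses Python's
standard dbm module as a stand-in for an on-disk hash file (it is not
Berkeley DB itself, and which backend dbm picks depends on your
installation). The helper names build_store and lookup and the sample
URLs are made up for illustration.

```python
import dbm
import os
import tempfile


def build_store(path, pairs):
    """Write (url, value) string pairs to an on-disk key/value file.

    Assumption: `dbm` stands in for a Berkeley DB hash file here.
    Keys and values are stored as bytes, so we encode them explicitly.
    """
    with dbm.open(path, "c") as db:  # "c": create the file if needed
        for url, value in pairs:
            db[url.encode("utf-8")] = value.encode("utf-8")


def lookup(path, url):
    """Return the value stored for `url`, or None if it is absent."""
    with dbm.open(path, "r") as db:  # "r": open read-only
        raw = db.get(url.encode("utf-8"))
        return None if raw is None else raw.decode("utf-8")


if __name__ == "__main__":
    # Hypothetical sample data, written to a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        store = os.path.join(tmp, "urls")
        build_store(store, [
            ("http://example.com/a", "first"),
            ("http://example.com/b", "second"),
        ])
        print(lookup(store, "http://example.com/a"))
        print(lookup(store, "http://example.com/missing"))
```

The full URLs still end up in the file, but on disk rather than in the
process's memory, and lookups compare real keys, so there is no
collision problem as there would be when using hash(url) as the key.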
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org


