don't need dictionary's keys - hash table?

Diez B. Roggisch deets at nospam.web.de
Wed Jul 12 12:38:11 EDT 2006


kdotsky at gmail.com wrote:

> Hello,
> I am using some very large dictionaries with keys that are long strings
> (urls).  For a large dictionary these keys start to take up a
> significant amount of memory.  I do not need access to these keys -- I
> only need to be able to retrieve the value associated with a certain
> key, so I do not want to have the keys stored in memory.  Could I just
> hash() the url strings first and use the resulting integer as the key?
> I think what I'm after here is more like a traditional hash table.

Python dictionaries are "traditional" hash tables: they keep each key
alongside its value so that keys with colliding hashes can be told apart.
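
A minimal sketch of why a hash table holds on to its keys, written for
current Python 3. The class name Colliding is hypothetical and exists only
to force every key onto the same hash value; it is not something from the
original question.

```python
class Colliding:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 1  # deliberately force every instance into a collision

    def __eq__(self, other):
        return isinstance(other, Colliding) and self.name == other.name


d = {Colliding("a"): "first", Colliding("b"): "second"}

# Both entries survive because the dict compares the stored keys with ==
# after the hashes turn out to be equal.
print(len(d))             # 2
print(d[Colliding("b")])  # second
```

If only the hash were stored, the second entry would have overwritten the
first, which is what can happen when hash(url) is used as the key.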

> If I 
> do it this way am I going to get the memory savings I am after?  Will
> the hash function always generate unique keys?  Also, would the same
> technique work for a set?
> 
> Any other thoughts or considerations are appreciated.

You could try creating an MD5 sum of each string and using that as the key.
It _should_ be good enough, but I'm no crypto expert, so take that with a
grain of salt.
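
Roughly what I have in mind, as a sketch for current Python 3 using the
standard hashlib module. The helper name digest_key and the example URLs
are made up for illustration. It assumes the URLs can be encoded as UTF-8
and that you accept the (very small, but not zero) chance of two URLs
sharing a digest.

```python
import hashlib


def digest_key(url):
    """Return the 16-byte MD5 digest of url, to use in place of the string."""
    return hashlib.md5(url.encode("utf-8")).digest()


# Hypothetical example data, not taken from the original question.
url = "http://www.example.com/some/very/long/path?with=a&long=query"

table = {}
table[digest_key(url)] = 42

# Lookup hashes the URL again; the URL string itself is never stored.
value = table[digest_key(url)]
print(value)  # 42

# The same digests can serve as members of a set.
seen = set()
seen.add(digest_key(url))
print(digest_key(url) in seen)  # True
```

Every key is 16 bytes no matter how long the URL is, so this only pays off
when the URLs are noticeably longer than that; each digest object still
carries its own per-object overhead. Also note that if two different URLs
ever did produce the same digest, the second would silently replace the
first, because the dict can no longer compare the original strings.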

Diez


