don't need dictionary's keys - hash table?

Nick Vatamaniuc vatamane at gmail.com
Wed Jul 12 12:51:12 EDT 2006


Dictionaries are hash tables in Python.

If you don't really want to store the keys, just use
dic[hash(key)]=value, this way the dictionary will have the same shape
and distribution of keys as dic[key]=value because
hash('abc')=hash(hash('abc'))  but the long string of actual keys are
not referenced by the dictionary.

Now you couldn't do dic.keys() and see your urls, and every time you
want to do dic['abc'] you would get a KeyError exception.

Hope this helps,
Nick Vatamaniuc

kdotsky at gmail.com wrote:
> kdot... at gmail.com wrote:
> > Hello,
> > I am using some very large dictionaries with keys that are long strings
> > (urls).  For a large dictionary these keys start to take up a
> > significant amount of memory.  I do not need access to these keys -- I
> > only need to be able to retrieve the value associated with a certain
> > key, so I do not want to have the keys stored in memory.  Could I just
> > hash() the url strings first and use the resulting integer as the key?
> > I think what I'm after here is more like a tradition hash table.  If I
> > do it this way am I going to get the memory savings I am after?  Will
> > the hash function always generate unique keys?  Also, would the same
> > technique work for a set?
> >
>
> I just realized that of course the hash is not always going to be
> unique, so this wouldn't really work.  And it seems a hash table would
> still need to store the keys (as strings) so that string comparisons
> can be done when a collision occurs.  I guess there's no avoiding
> storing they keys?




More information about the Python-list mailing list