mutable objects as dictionary keys

Andreas Kraemer akraemer at sbcglobal.net
Tue Oct 9 13:44:01 EDT 2007


Hi everyone,

I know that the subject of mutable objects as dictionary keys has been discussed a number of times in this forum (see for instance "freezing" of classes), but I would love to hear the thoughts of the experts on the approach below. 

The use case that I encounter frequently is the classification of objects according to certain rules or properties: Say, I have objects A, B, C, ... (e.g. instances of a class, dict, etc), I can write

    d = {}
    d.setdefault(A,[]).append(A)
    d.setdefault(B,[]).append(B)
    ...

so objects that map to the same hash key will end up in the same bucket. (A "real world" example I had to deal with recently was for instance the construction of a directed acyclic graph from many directed trees where identical subtrees needed to be identified.)

The easiest way is of course to define custom __hash__() and __eq__() methods, but this breaks if objects are mutated (accidentally) after having been inserted into the dictionary. The "best" working approach I came up with so far is to generate an "immutable view" V of a mutable object O according to my classification rule, delegate O.__hash__ and O.__eq__ to V, and make sure that the V is memoized and cannot (easily) be altered later, even when O is mutated:

def hashable_mutable_factory(mutable,rule):
  class _mutable(mutable):
    def __init__(self,*args,**kw):
      self._view_cache = {}
      super(_mutable,self).__init__(*args,**kw)
    def _view(self):
      id_ = id(self)
      if not self._view_cache.has_key(id_):
        self._view_cache[id_] = rule(self)
      return self._view_cache[id_]
    def __hash__(self):
      return hash(self._view())
    def __eq__(self,other):
      return self._view() == other._view()
  return _mutable

E.g.:

>>> hashable_dict = hashable_mutable_factory(dict,lambda obj: frozenset(obj.iteritems()))
>>> h = hashable_dict(a=1,b=2)
>>> d = {}
>>> d[h] = 'foo'
>>> d
{{'a': 1, 'b': 2}: 'foo'}
>>> h['c'] = 'bar'
>>> d
{{'a': 1, 'c': 'bar', 'b': 2}: 'foo'}
>>> g = hashable_dict(a=1,b=2)
>>> h

{'a': 1, 'c': 'bar', 'b': 2}

>>> g

{'a': 1, 'b': 2}
>>> id(g) == id(h)
False
>>> g == h
True

I slightly favor the factory function idiom above over defining the rule in a super class (this would have to be done for each mutable type and rule function separately), especially since I read that future versions of python (2.6 ?, 3.0 ?) will contain class decorators and allow syntax like class A(*bases): pass

Is there a better approach? Any comments are appreciated. 

I have been seriously using Python for one year know, mostly in the context of graph algorithms etc., and it has always been a delightful coding experience!

Best regards,

Andreas




-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20071009/07030119/attachment.html>


More information about the Python-list mailing list