dictionary mutability, hashability, __eq__, __hash__

Jussi Piitulainen jussi.piitulainen at helsinki.fi
Sun Nov 27 11:41:14 EST 2016


Veek M writes:

> Jussi Piitulainen wrote:
>
>> Veek M writes:
>> 
>> [snip]
>> 
>>> Also if one can do x.a = 10 or 20 or whatever, and the class instance
>>> is mutable, then why do books keep stating that keys need to be
>>> immutable?  After all, __hash__ is the guy doing all the work and
>>> maintaining consistency for us. One could do:
>>>
>>> class Fruit:
>>>   editable_value = ''
>>> def __hash__(self):
>>>  if 'apple' in self.value:
>>>    return 10
>>>  elif 'banana' in self.value:
>>>    return 20
>>>
>>>
>>>  and use 'apple' 'banana' as keys for whatever mutable data..
>>> Are the books wrong?
>> 
>> The hash does not do all the work, and the underlying implementation
>> of a dictionary does not react appropriately to a key changing its
>> hash value. You could experiment further to see for yourself.
>> 
>> Here's a demonstration that Python's dictionary retains both keys
>> after they are mutated so that they become equal, yet finds neither
>> key (because they are not physically where their new hash value
>> indicates).
>> 
>> I edited your class so that its methods manipulate an attribute that
>> it actually has, all hash values are integers, constructor takes an
>> initial value, objects are equal if their values are equal, and the
>> written representation of an object shows the value (I forgot quotes).
>> 
>> test = { Fruit('apple') : 'one', Fruit('orange') : 'two' }
>> 
>> print(test)
>> print(test[Fruit('orange')])
>> # prints:
>> # {Fruit(apple): 'one', Fruit(orange): 'two'}
>> # two
>> 
>> for key in test: key.value = 'banana'
>> 
>> print(test)
>> print(test[Fruit('banana')])
>> 
>> # prints:
>> # {Fruit(banana): 'one', Fruit(banana): 'two'}
>> # Traceback (most recent call last):
>> #   File "hash.py", line 25, in <module>
>> #     print(test[Fruit('banana')])
>> # KeyError: Fruit(banana)
>
> ah! not so: that's because you are messing/changing the integer value 
> for the key. If apple-object was returning 10, you can't then return 20 
> (the text mangling seems to be completely irrelevant except you need it 
> to figure out which integer to return but barring that..).

It was my best guess at what you intended __hash__ to be. You took that
risk when you posted obviously broken code.

Your new __hash__ function below behaves the same way.

> Here's an example of what you're doing (note 'fly' is returning 20 BUT 
> the object-instance is 'apple' - that obviously won't work and has 
> nothing to do with my Q, err.. (don't mean to be rude):
> class Fruit(object):
>     def __init__(self, text):
>         self.text = text
>         
>     def mangle(self,text):
>         self.text = text
>         
>     def __hash__(self):
>         if 'apple' in self.text:
>             return 10
>         elif 'orange' in self.text:
>             return 20
>         elif 'fly' in self.text:
>             return 20
>         else:
>             pass
>         
> apple = Fruit('apple')
> orange = Fruit('orange')
>
> d = { apple : 'APPLE_VALUE', orange : 'ORANGE_VALUE' }
> print d
>
> apple.mangle('fly')
> print d[apple]

Did you bother to try that? I get a KeyError (because the hash value of
the key object has changed).

> The Question is specific.. what I'm saying is that you can change
> attributes and the contents and totally mash the object up, so long as
> __hash__ returns the same integer for the same object. Correct?

Your __hash__ doesn't.

In your own example just above, you get 10 before mangling, and 20
after.
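
A minimal sketch of that check, assuming Python 3 and a trimmed copy of
the class quoted above. Two things here are my assumptions, not part of
the quoted code: the attribute is assigned directly instead of through
mangle(), and the final branch returns 0 rather than falling through to
None, so that __hash__ always returns an integer.

```python
class Fruit:
    def __init__(self, text):
        self.text = text

    def __hash__(self):
        # The hash depends on a mutable attribute, as in the quoted example.
        if 'apple' in self.text:
            return 10
        if 'orange' in self.text or 'fly' in self.text:
            return 20
        return 0  # assumption: some integer for any other text


apple = Fruit('apple')
orange = Fruit('orange')
d = {apple: 'APPLE_VALUE', orange: 'ORANGE_VALUE'}

hash_before = hash(apple)
apple.text = 'fly'            # same effect as apple.mangle('fly')
hash_after = hash(apple)

# The key object is still stored in the dictionary, but it was filed
# under its old hash value, so a lookup by the new hash misses it.
try:
    d[apple]
    found = True
except KeyError:
    found = False

print(hash_before, hash_after, found)
```

On CPython this reports 10 before, 20 after, and a failed lookup, even
though the very same object is still a key in d.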

> Where does __eq__ fit in all this?

Make two different objects hash the same. Make them be __eq__ by
mangling them when they are already keys. See if you can still use both
as an index to get at the associated value. (You can't.)
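
That experiment can be sketched as follows, assuming Python 3. The class
Key and its attribute name are hypothetical, chosen only for this
illustration: every instance hashes to the same constant, and equality
compares the mutable value.

```python
class Key:
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # constant: every Key hashes the same

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value


a, b = Key('a'), Key('b')
d = {a: 'one', b: 'two'}
assert d[a] == 'one' and d[b] == 'two'   # distinct while unequal

b.value = 'a'        # now a == b, and both are already keys

print(len(d))        # 2: both entries are still stored
print(d[a], d[b])    # one one: both lookups stop at the entry for a
```

The value 'two' is still in the dictionary, but indexing with b now
finds the entry for a first, because the hashes match and __eq__ says
the keys are equal.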

But yes, you should be free to mutate fields that do not affect hashing
and equality. Object identity should work, if you are otherwise happy to
use object identity.
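
A short sketch of both safe cases, assuming Python 3. The classes Basket
and Tagged are hypothetical names for this illustration: Basket defines
neither __eq__ nor __hash__ and so falls back to object identity, while
Tagged hashes and compares on name only and leaves note free to change.

```python
class Basket:
    """No __eq__ or __hash__: keys compare and hash by identity."""
    def __init__(self, contents):
        self.contents = contents


class Tagged:
    """Hash and equality use only `name`; `note` is free to change."""
    def __init__(self, name, note=''):
        self.name = name
        self.note = note

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, Tagged) and self.name == other.name


basket = Basket(['apple'])
tagged = Tagged('apple')
d = {basket: 'B', tagged: 'T'}

basket.contents.append('fly')   # mutation invisible to identity hashing
tagged.note = 'bruised'         # mutation invisible to __hash__ and __eq__

print(d[basket], d[tagged], d[Tagged('apple')])   # B T T
```

With identity-based keys, only the original object finds its entry; a
second Basket with the same contents is a different key.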


