Hashability questions

Chris Kaynor ckaynor at zindagigames.com
Mon May 14 19:38:26 EDT 2012


On Sun, May 13, 2012 at 12:11 PM, Bob Grommes <bob.grommes at gmail.com> wrote:
> Noob alert: writing my first Python class library.
>
> I have a straightforward class called Utility that lives in Utility.py.
>
> I'm trying to get a handle on best practices for fleshing out a library.  As such, I've done the following for starters:
>
>  def __str__(self):
>    return str(type(self))
>
> #  def __eq__(self,other):
> #    return hash(self) == hash(other)
>
> The commented-out method is what I'm questioning.  As-is, I can do the following from my test harness:
>
> u = Utility()
> print(str(u))
> print(hash(u))
> u2 = Utility()
> print(hash(u2))
> print(hash(u) == hash(u2))
>
> However if I uncomment the above _eq_() implementation, I get the following output:
>
> <class 'Utility.Utility'>
> Traceback (most recent call last):
>  File "/Users/bob/PycharmProjects/BGC/Tests.py", line 7, in <module>
>    print(hash(u))
> TypeError: unhashable type: 'Utility'
>
> Process finished with exit code 1
>
> Obviously there is some sort of default implementation of __hash__() at work and my implementation of _eq_() has somehow broken it.  Can anyone explain what's going on?

In Python, the default __eq__ compares objects by identity, and the
default __hash__ is derived from the id of the object. Thus, an object
by default compares equal only to itself, and it hashes the same every
time.
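
As a quick illustration of the default behaviour, here is a minimal
sketch. The class name Plain is made up for the example; it defines
neither __eq__ nor __hash__, so it inherits both from object.

```python
class Plain:
    """Hypothetical class that defines neither __eq__ nor __hash__."""
    pass


a = Plain()
b = Plain()

# Default equality is identity: an instance equals only itself.
print(a == a)   # True
print(a == b)   # False

# The default hash is stable for the lifetime of the object.
print(hash(a) == hash(a))   # True

# Default-hashable instances work as set members and dict keys.
print(len({a, b}))   # 2
```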

In Python 3, if you override __eq__ without also defining __hash__, the
default __hash__ is removed (it is set to None), which is why hash(u)
raises TypeError. You can define __hash__ yourself to make the class
hashable again. In Python 2, this automatic removal of __hash__ did not
exist, which could lead to subtle bugs where a class would override
__eq__ but leave __hash__ as the default implementation.
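
To see the Python 3 behaviour directly, something like the following
should reproduce the error. This is only a sketch: the name attribute
and the __eq__ body are assumptions made for the example, not the
contents of the real Utility class.

```python
class Utility:
    """Stand-in for the class in the question; 'name' is hypothetical."""

    def __init__(self, name="default"):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Utility):
            return NotImplemented
        return self.name == other.name


# Python 3 sets __hash__ to None when a class defines __eq__ alone.
print(Utility.__hash__ is None)

try:
    hash(Utility())
except TypeError as exc:
    message = str(exc)
    print(message)
```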

Generally, the default __eq__ and __hash__ functions provide the
correct results, and are nice and convenient to have. The next most
common case is overriding __eq__; when you do, the default __hash__ is
almost never correct, so the object should either not be hashable (the
default in Python 3) or __hash__ should also be overridden to produce
the correct results.

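The rule is that, if two objects compare equal, they must return the
same value from __hash__. Put the other way around: if two objects
return different results from __hash__, they should never compare
equal. The converse is not required: two unequal objects may share a
hash value, so comparing hashes, as in your commented-out __eq__, is
not a reliable equality test.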
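
One common way to keep the two methods consistent is to build both
from the same tuple of fields. A minimal sketch, assuming a
hypothetical Point class with x and y attributes (the _key helper is
also just an illustration):

```python
class Point:
    """Hypothetical value class; the x and y fields are assumptions."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def _key(self):
        # Single source of truth for both equality and hashing.
        return (self.x, self.y)

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


p = Point(1, 2)
q = Point(1, 2)

print(p == q)               # True
print(hash(p) == hash(q))   # True
print(len({p, q}))          # 1
```

Note that this only behaves well if x and y are not changed while the
object is in a set or used as a dict key, since the hash would change
along with them.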

See http://docs.python.org/reference/datamodel.html#object.__hash__
and http://docs.python.org/reference/datamodel.html#object.__lt__ for
more information.


