hashability

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Wed Aug 12 03:34:33 EDT 2009


On Tue, 11 Aug 2009 17:54:36 -0700, James Stroud wrote:

> Hello All,
> 
> I wrote the function to test hashability of arbitrary objects. My reason
> is that the built-in python (2.5) hashing is too permissive for some
> uses. A symptom of this permissiveness comes from the ability to
> successfully hash() arbitrary objects:

No it doesn't.


>>> hash([])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list objects are unhashable

Python successfully hashes *non*-arbitrary objects, including those that 
inherit from object.



>    py> class C(object): pass
>    ...
>    py> {C():4}[C()]
>    ------------------------------------------------------------
>    Traceback (most recent call last):
>      File "<ipython console>", line 1, in <module>
>    <type 'exceptions.KeyError'>: <__main__.C object at 0xe21610>

Why would you expect that to succeed? The object you are using as the key 
is a different instance than the object you are looking up. Since your 
instances don't even compare equal, why would you expect them to hash 
equal?

>>> C() == C()
False



> The basis for the exception is that the two instances do not have the
> same hash() although conceptually they might seem equal to the
> unitiated. 

Yes, well, the ignorant and uninitiated frequently make mistakes. I feel 
your pain.



> Were I to re-design python, I'd throw an exception in this
> case because of the ill-defined behavior one might expect if a C()
> serves as a key for a dict.

It's not ill-defined, it's a perfectly reasonable design you just don't 
happen to like. It's quite useful for some.


> To prevent users of one of my libraries from falling into this and
> similar traps (which have potentially problematic consequences), I came
> up with this test for hashability:

That's not a test for hashability. That's a test for hashability and 
equality testing together.


> def hashable(k):
>    try:
>      hash(k)
>    except TypeError:
>      good = False
>    else:
>      good = (hasattr(k, '__hash__') and
>              (hasattr(k, '__eq__') or hasattr(k, '__cmp__')))
>    return good

The test for __hash__ is redundant, given that hash() has already 
succeeded.

It will wrongly reject classes that deliberately don't define equality, 
for whatever reason, and that *expect* the behaviour you are treating as 
an error.



> It works as I would like for most of the cases I can invent:
> 
>    py> all(map(hashable, [1,1.0,"",(1,2,3)])) True
>    py> any(map(hashable, [None, [1,2], {}, C(), __import__('sys')]))
>    False

Well there you go -- why on earth would you prohibit None as a dictionary 
key??? That's a serious failure.



-- 
Steven



More information about the Python-list mailing list