hashability

Carl Banks pavlovevidence at gmail.com
Tue Aug 11 21:40:15 EDT 2009


On Aug 11, 5:54 pm, James Stroud <jstr... at mbi.ucla.edu> wrote:
> Hello All,
>
> I wrote the function to test hashability of arbitrary objects. My reason
> is that the built-in python (2.5) hashing is too permissive for some
> uses. A symptom of this permissiveness comes from the ability to
> successfully hash() arbitrary objects:
>
>    py> class C(object): pass
>    ...
>    py> {C():4}[C()]
>    ------------------------------------------------------------
>    Traceback (most recent call last):
>      File "<ipython console>", line 1, in <module>
>    <type 'exceptions.KeyError'>: <__main__.C object at 0xe21610>
>
> The basis for the exception is that the two instances do not have the
> same hash() although conceptually they might seem equal to the
> uninitiated. Were I to re-design python, I'd throw an exception in this
> case because of the ill-defined behavior one might expect if a C()
> serves as a key for a dict.

That's arguably the right thing to do.

Personally I've found being able to use class instances as hashable
objects terribly useful (these objects are hashed and compared by
identity, of course), so I don't mind it.  But I can
definitely see how this straddles the line between "practicality" and
"face of ambiguity".  And if Python didn't do it by default, it would
be little trouble to add the appropriate __eq__ and __hash__ methods.
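
As a rough illustration of that last point, here is a minimal sketch
(written for Python 3 rather than 2.5; the class name Node is made up
for the example) of spelling out identity-based equality and hashing
by hand:

```python
class Node(object):
    """Hypothetical class that opts in to identity-based hashing explicitly."""

    def __eq__(self, other):
        # Two references compare equal only if they are the same object.
        return self is other

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        # id() is stable for the lifetime of the object, so it is a valid hash.
        return hash(id(self))


a, b = Node(), Node()
table = {a: 4}
found_same = table.get(a)    # lookup with the same instance finds the entry
found_other = table.get(b)   # a different instance is a different key
```

With this in place a lookup only succeeds with the very same instance,
which matches what the default behavior already does.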


> To prevent users of one of my libraries from falling into this and
> similar traps (which have potentially problematic consequences),

Even so, I would consider whether some of your users might, like me,
also find this terribly useful, and if so (probably a few will unless
they are all novices), allow them to disable or disregard this check.
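
One way this could look, as a sketch only: the names KeyedRegistry,
strict_keys and deliberately_hashable are hypothetical, and the check
is a Python 3 heuristic (does the type override object.__hash__?)
rather than the hasattr test from your post, since on Python 3 every
object has __eq__ and __hash__ attributes.

```python
def deliberately_hashable(key):
    """True if the key's type overrides object.__hash__ (Python 3 heuristic)."""
    try:
        hash(key)
    except TypeError:
        return False
    return type(key).__hash__ is not object.__hash__


class KeyedRegistry(object):
    """Hypothetical container; strict_keys=False turns the check off."""

    def __init__(self, strict_keys=True):
        self.strict_keys = strict_keys
        self._items = {}

    def add(self, key, value):
        if self.strict_keys and not deliberately_hashable(key):
            raise TypeError("key %r is not deliberately hashable" % (key,))
        self._items[key] = value  # genuinely unhashable keys still fail here

    def get(self, key):
        return self._items[key]
```

Users who want identity-hashed keys would construct the container with
strict_keys=False; everyone else keeps the safety net.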

> I came
> up with this test for hashability:
>
> def hashable(k):
>    try:
>      hash(k)
>    except TypeError:
>      good = False
>    else:
>      good = (hasattr(k, '__hash__') and
>              (hasattr(k, '__eq__') or hasattr(k, '__cmp__')))
>    return good

I wouldn't call the function "hashable".  Class instances like C() are
hashable whether you approve or not.  Something like
"deliberately_hashable" would be a better name.
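
A small check of that claim, sketched for Python 3 (the
collections.abc.Hashable ABC did not exist in 2.5, so treat this as an
illustration rather than something you could have run back then):

```python
from collections.abc import Hashable


class C(object):
    pass


c = C()
h = hash(c)                                      # succeeds for a plain instance
is_hashable = isinstance(c, Hashable)            # the ABC agrees C() is hashable
list_is_hashable = isinstance([1, 2], Hashable)  # lists are not
```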


> It works as I would like for most of the cases I can invent:
>
>    py> all(map(hashable, [1,1.0,"",(1,2,3)]))
>    True
>    py> any(map(hashable, [None, [1,2], {}, C(), __import__('sys')]))
>    False
>
> Can anyone think of boundary cases I might be missing with this approach?

A class can also customize its comparisons by defining __ne__ without
__eq__, at least on Python 2.5, and your hasattr checks would not see
that, so you should keep that in mind.
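
To illustrate the __ne__ case, here is a sketch for Python 3 with a
made-up class name. The class defines only __ne__, so == and hash()
keep their default identity-based behavior while != follows the custom
method, and the two operators end up disagreeing:

```python
class OnlyNe(object):
    """Hypothetical class that customizes != but leaves == alone."""

    def __init__(self, value):
        self.value = value

    def __ne__(self, other):
        return self.value != other.value


a, b = OnlyNe(1), OnlyNe(1)
eq_result = (a == b)      # default ==: compares by identity
ne_result = (a != b)      # custom !=: compares by value
lookup = {a: "x"}.get(b)  # dict lookup uses hash and ==, not !=
```

Here both eq_result and ne_result come out False, and the dict lookup
with b misses, so a test that only looks for __eq__ or __cmp__ tells
you nothing about what != does.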


Carl Banks


