[ python-Bugs-660098 ] New style classes and __hash__

SourceForge.net noreply at sourceforge.net
Fri Dec 5 13:30:19 EST 2003


Bugs item #660098, was opened at 2002-12-30 13:39
Message generated for change (Comment added) made by gvanrossum
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=660098&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Thomas Heller (theller)
Assigned to: Guido van Rossum (gvanrossum)
Summary: New style classes and __hash__

Initial Comment:
New style classes obviously inherit a __hash__ 
implementation from object which returns the id. Per 
default this allows using instances as dictionary keys, 
but usually with the wrong behaviour, because most 
often user classes are mutable, and their contained data 
should be used to calculate the hash value.

IMO one possible solution would be to change 
typeobject.c:object_hash() to raise TypeError, and 
change all the immutable (core) Python objects to use 
_Py_HashPointer in their tp_hash slot.

----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-12-05 13:30

Message:
Logged In: YES 
user_id=6380

Here's the patch I am thinking of.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-12-05 13:06

Message:
Logged In: YES 
user_id=6380

I wonder if the solution could be as simple as removing the
tp_hash slot from the object class? 
 
I just tried that and it passes the entire test suite, as
well as the tests that Tim added to the patch. 
 
The trick is that PyObject_Hash() has a fallback which does
the right thing. 
 
And when the base object class doesn't set tp_compare or
tp_richcompare, I think it should be allowed not to set
tp_hash either. 

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-05-11 09:41

Message:
Logged In: YES 
user_id=6380

Oops, not so fast. This also makes object.__hash__() calls
fail when it is explicitly invoked, e.g. when a class
overrides __eq__ to print a message and then call the base
class __eq__, it must do the same for __hash__, but
object.__hash__ will still fail in this case. I'll think of
a fix for that.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-05-11 06:16

Message:
Logged In: YES 
user_id=6380

OK, feel free to check it in.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-05-11 00:39

Message:
Logged In: YES 
user_id=31435

The patch seems fines to me.  I've attached a new patch, 
combining yours with new tests in test_class.py.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-05-09 14:04

Message:
Logged In: YES 
user_id=6380

A trick similar to what we do in object_new might work.
There, we raise an error if the tp_init slot is the default
function (object_init) and any arguments are passed.

I propose that object_hash checks that tp_compare and
tp_richcompare are both NULL. I'm attaching a patch -- let
me know if that works.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-05-09 13:54

Message:
Logged In: YES 
user_id=31435

>>> class C:  # classic class complains
...   __cmp__ = lambda a, b: 0
...
>>> {C(): 1}
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unhashable instance
>>> class C(object):   # new-style class does not complain
...   __cmp__ = lambda a, b: 0
...
>>> {C(): 1}
{<__main__.C object at 0x007F6970>: 1}
>>> 

That was under current CVS.  I see the same behavior in 
2.2.3, so this isn't new.

About Thomas's original report, I don't agree -- the default 
behavior is very useful.  The rule I've always lived by is that, to 
be usable as a dict key, an instance's class must either:

1. Implement none of {__cmp__, __eq__, __hash__}.

or

2. Implement __hash__ and (at least) one of {__cmp, __eq__}.

Classic classes still work this way.  New-style classes don't 
appear to outlaw any combination here.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-05-09 13:44

Message:
Logged In: YES 
user_id=6380

Yes, please paste an example here.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2003-05-09 13:32

Message:
Logged In: YES 
user_id=31392

Currently, a new-style class that defines __cmp__ but not
__hash__ is usable as a dictionary key.  That seems related
to this bug.  Should I paste the example here and bump the
priority?  Or should I open a separate bug report?


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-11 18:01

Message:
Logged In: YES 
user_id=6380

I spent an afternoon looking into this, and I can't see an
easy solution. The idea of only inheriting __hash__ together
with certain other slots is really flawed; it may be better
if object DIDN'T define a default implementation for
__hash__, comparisons (both flavors), and other things, or
maybe the default __hash__ should raise an exception when
the comparisons are not those inherited from object, or
maybe PyType_Ready should insert a dummy __hash__ when it
sees that you redefine __eq__, or...  I really don't know.
I'm going to sleep on this some more, and lower the
priority. You can always get the right behavior by
explicitly defining __hash__.

----------------------------------------------------------------------

Comment By: Thomas Heller (theller)
Date: 2002-12-30 13:50

Message:
Logged In: YES 
user_id=11105

You mean at the end of the inherit_slots() function?
For my extension which I'm currently debugging, tp_compare, 
tp_richcompare, and tp_hash are inherited from base, but 
only tp_hash is != NULL there.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-12-30 13:44

Message:
Logged In: YES 
user_id=6380

There seems to be code that tries to inherit tp_hash only
when tp_compare and tp_richcompare are also inherited, but
it seems to be failing.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=660098&group_id=5470



More information about the Python-bugs-list mailing list