[Python-Dev] Documentation Error for __hash__

Matt Giuca matt.giuca at gmail.com
Fri Aug 29 15:07:25 CEST 2008


> Note that only instances have the default hash value id(obj). This
> is not true in general. Most types don't implement the tp_hash
> slot and thus are not hashable. Indeed, mutable types should not
> implement that slot unless they know what they're doing :-)


By "instances" you mean "instances of user-defined classes"?
(I carefully avoid the term "instance" on its own, since I believe that was
phased out when they merged types and classes; it probably still exists in
the C API but shouldn't be mentioned in Python-facing documentation).

But anyway, yes, we should make that distinction.
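
A minimal sketch of that distinction, assuming Python 3 semantics. The class name `Plain` and the helper `is_hashable` are made up for illustration and appear nowhere in the docs under discussion:

```python
class Plain:
    """Hypothetical user-defined class that defines no special methods."""


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Instances of user-defined classes get a default hash; the built-in
# mutable containers do not.
results = {
    "Plain instance": is_hashable(Plain()),
    "list": is_hashable([]),
    "dict": is_hashable({}),
    "tuple": is_hashable(()),
}
```

Here only the list and the dict report as unhashable, which is the case the quoted text describes for mutable types.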

> Sorry, I wasn't clear enough: with "not defining an equal comparison"
> I meant that an equal comparison does not succeed, ie. raises an
> exception or returns Py_NotImplemented (at the C level).


Oh OK. I didn't even realise it was "valid" or "usual" to explicitly block
__eq__ like that.


> Again, the situation is better at the C level, since types
> don't have a default tp_hash implementation, so have to explicitly
> code such a slot in order for hash(obj) to work.


Yes but I gather that this "data model" document we are talking about is not
designed for C authors, but for Python authors, so it should be written from the
point of view of people coding only in Python. (Only the "Extending and
Embedding" and the "C API" documents are for C authors).

> The documentation should probably say:
>
> "If you implement __cmp__ or
> __eq__ on a class, also implement a __hash__ method (and either
> have it raise an exception or return a valid non-changing hash
> value for the object)."
>

I agree, except maybe not for the Python 3 docs. As long as the behaviour I
am observing is well-defined and not just a side-effect which could go away
-- that is, if you define __eq__/__cmp__ but not __hash__, in a user-defined
class, it raises a TypeError -- then I think it isn't necessary to recommend
implementing a __hash__ method and raising a TypeError. Better just to leave
as-is ("if it defines __cmp__()
<http://docs.python.org/dev/3.0/reference/datamodel.html#object.__cmp__> or
__eq__()
<http://docs.python.org/dev/3.0/reference/datamodel.html#object.__eq__> but
not __hash__()
<http://docs.python.org/dev/3.0/reference/datamodel.html#object.__hash__>,
its instances will not be usable as dictionary keys") and clarify the later
statement.
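
The behaviour described above can be sketched as follows, assuming Python 3. The class `Point` and the helper `raises_type_error` are hypothetical names chosen for this example:

```python
class Point:
    """Hypothetical class that defines __eq__ but not __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


p = Point(1, 2)

# In Python 3, defining __eq__ without __hash__ sets __hash__ to None,
# so hash(p) and using p as a dictionary key both raise TypeError.
hash_fails = raises_type_error(hash, p)
dict_key_fails = raises_type_error(lambda key: {key: "value"}, p)
```

No explicit `__hash__` that raises is needed to get this result, which is the point being argued for the Python 3 docs.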


>
> "If you implement __hash__ on classes, you should consider implementing
> __eq__ and/or __cmp__ as well, in order to control how dictionaries use
> your objects."


I don't think I agree with that. I'm not sure why you'd implement __hash__
without __eq__ and/or __cmp__, but it doesn't cause issues so we may as well
not address it.
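
As a rough illustration of why this is harmless, assuming Python 3 and a hypothetical class `Tagged`: with `__hash__` defined but `__eq__` left alone, equality stays identity-based, so two objects with the same hash are still separate dictionary keys.

```python
class Tagged:
    """Hypothetical class: custom __hash__, inherited identity-based __eq__."""

    def __init__(self, tag):
        self.tag = tag

    def __hash__(self):
        return hash(self.tag)


a = Tagged("x")
b = Tagged("x")

# Same hash value, but a != b because the inherited __eq__ compares identity,
# so the dictionary keeps both entries.
table = {a: "first", b: "second"}
```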


> In general, it's probably best to always implement both methods
> on classes, even if the application will just use one of them.
>

Well it certainly is for new-style classes in the 2.x branch. I don't think
you should implement __hash__ in Python 3 if you just want a non-hashable
object (since this is the default behaviour anyway).
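
For the opposite case, where a class with value-based equality should be hashable, a sketch under the assumption of Python 3 might look like this. The class `Version` is hypothetical, and its fields are treated as unchanging by convention only:

```python
class Version:
    """Hypothetical value class with consistent __eq__ and __hash__."""

    def __init__(self, major, minor):
        # Treated as read-only after construction so the hash never changes.
        self._major = major
        self._minor = minor

    def _key(self):
        return (self._major, self._minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Equal objects must have equal hashes, so hash the same tuple
        # that __eq__ compares.
        return hash(self._key())


seen = {Version(3, 0): "py3k"}
found = seen[Version(3, 0)]
```

Hashing the same tuple that `__eq__` compares keeps the two methods in agreement, which is what lets an equal but distinct object find the stored key.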

A lot of this is just my opinion, though, which doesn't count for very much -- so
I'm just making suggestions.

Matt