[Python-Dev] returning longs from __hash__()

Tim Peters tim.peters at gmail.com
Tue Aug 8 22:20:01 CEST 2006


[Armin]
>> Maybe the user should just be able to return any integer value from a
>> custom __hash__() without having to worry about not exceeding
>> sys.maxint.
>>
>> After all, the returned value has no real meaning.  If a long is returned
>> we could take its hash again, and use that number internally.

[Martin]
> Nick Coghlan already suggested that, when __hash__ returns a long int,
> the tp_hash of long int should be used to compute the true hash value.
>
> Could you see any problems with that approach? If not, and if I don't
> hear other objections, I'd like to go ahead and fix it that way.

It sounds fine to me, except I'm not immediately clear on which code
needs to be changed.  The internal _Py_HashPointer() already does
exactly this (return the hash of a Python long) when PyObject_Hash()
decides to hash an address on a SIZEOF_LONG < SIZEOF_VOID_P box ...
but on a SIZEOF_LONG == SIZEOF_VOID_P box, _Py_HashPointer() may still
return a negative C long.  I /hope/ that a class that decides to add

    def __hash__(self):
        return id(self)

will end up using the same hash code internally as when that
supposedly do-nothing-different definition doesn't exist.
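
For illustration, here is a minimal sketch of the proposed behaviour,
assuming an interpreter where the fix is in place (current CPython 3
behaves this way).  The class name BigHash and the value 2 ** 100 are
made up for the example; the only point is that an out-of-range result
from __hash__() is replaced by the hash of that integer.

```python
class BigHash(object):
    """Hypothetical class whose __hash__ returns a value too big for a C long."""

    def __hash__(self):
        return 2 ** 100  # far beyond what a C long can hold


obj = BigHash()

# The oversized result is hashed again, so both sides agree.
folded_ok = hash(obj) == hash(2 ** 100)
print(folded_ok)

# The object still works as a dict key despite the huge __hash__ result.
table = {obj: "found"}
print(table[obj])
```

Whether the same equality holds for values that do fit in a C long
depends on the interpreter version, so the sketch deliberately picks a
value well outside that range.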

Note that a while back I changed several custom __hash__ methods in
Python's test suite to stop returning id(self) (as a result of tests
failing after the "make id() non-negative" change).  That's why we
haven't seen such complaints from the buildbots recently.  I expect
that few Python programmers realize it was never legit for __hash__()
to return id(self), and that it's not worth forcing them to learn that
now ;-)
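
If a class really does want an identity-based hash, one way to stay
within the rules is to pass the id through hash() first, so the result
is always a valid hash code no matter how large the address is.  This
is only a sketch; the class name ByIdentity is hypothetical and not
taken from the test suite changes mentioned above.

```python
class ByIdentity(object):
    """Hypothetical class that hashes by identity without returning id(self) raw."""

    def __hash__(self):
        # hash() of an integer always yields a value in the valid hash range,
        # even when id(self) itself would not fit.
        return hash(id(self))


a = ByIdentity()
b = ByIdentity()

seen = {a}
print(a in seen)  # a is found again by identity
print(b in seen)  # b is a different object, so it is not in the set
```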

