Planning a Python Course for Beginners

Marko Rauhamaa marko at pacujo.net
Wed Aug 9 13:07:48 EDT 2017


Chris Angelico <rosuav at gmail.com>:

> On Wed, Aug 9, 2017 at 11:46 PM, Marko Rauhamaa <marko at pacujo.net> wrote:
>> Really, the most obvious use case for hashed objects is their membership
>> in a set. For example:
>>
>>     invitees = set(self.bff)
>>     invitees |= self.classmates()
>>     invitees |= self.relatives()
>
> Okay. So you should define value by object identity - NOT any sort of
> external primary key.

Good point! A very good __hash__() implementation is:

    def __hash__(self):
        return id(self)

In fact, I didn't know Python (kinda) did this by default already. I
can't find that information in the definition of object.__hash__():

   <URL: https://docs.python.org/3/reference/datamodel.html?#object.__hash__>

I only found it out by trying it.
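
A minimal sketch of that experiment, assuming a hypothetical Guest class that
defines neither __eq__() nor __hash__(). The exact hash value is an
implementation detail (CPython derives it from id() rather than returning
id() unchanged), so the sketch only checks behaviour:

```python
class Guest:
    """Hypothetical class that defines neither __eq__() nor __hash__()."""

    def __init__(self, name):
        self.name = name


a = Guest("Alice")
b = Guest("Alice")   # same attributes, but a different object

# Default equality is identity, so a and b are distinct set members.
invitees = {a, b}
invitees.add(a)      # adding the same object again changes nothing

same_object_equal = (a == a)
distinct_objects_equal = (a == b)
member_count = len(invitees)
```

Here the set ends up with two members even though both guests carry the same
name, because nothing told Python to compare them by name.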

> That goes completely against your original statement, which I shall
> quote again:
>
>>>> In relational-database terms, your "value" is the primary key and
>>>> your "metadata" is the rest of the columns.
>
> If there is any possibility that you could have two objects in memory
> with the same primary key but other attributes different, you'd have
> major MAJOR problems with this kind of set operation.

In light of the above realization, don't override __hash__() in any way
in your class, and your object works perfectly as a key or a set member.

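A __hash__() definition is only needed when your __eq__() definition is
different from "is". In Python 3, a class that overrides __eq__() without
also defining __hash__() gets __hash__ set to None, so its instances are
unhashable until you supply a matching __hash__().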
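
To illustrate the rule, here is a rough sketch with two hypothetical classes,
EqOnly and EqAndHash. It assumes Python 3, where overriding __eq__() alone
removes the inherited hash:

```python
class EqOnly:
    """Hypothetical: overrides __eq__() but leaves __hash__() undefined."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        if not isinstance(other, EqOnly):
            return NotImplemented
        return self.key == other.key


class EqAndHash:
    """Hypothetical: __hash__() covers exactly what __eq__() compares."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        if not isinstance(other, EqAndHash):
            return NotImplemented
        return self.key == other.key

    def __hash__(self):
        return hash(self.key)


# Instances of EqOnly cannot be hashed, so they cannot go into a set.
try:
    hash(EqOnly(1))
    eq_only_hashable = True
except TypeError:
    eq_only_hashable = False

# Equal objects hash alike, so the set collapses the two key-1 objects.
unique = {EqAndHash(1), EqAndHash(1), EqAndHash(2)}
```

The point of the second class is only that the two methods agree: whatever
__eq__() treats as equal must produce the same hash.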

As for running into "major MAJOR" problems, yes, you need to know what
you're doing and face the consequences. It's a bit analogous to sort(),
whose result depends on how you define the "rich" comparison methods.
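
The kind of problem in question can be sketched as follows. Person, pk and
email are hypothetical names; the class deliberately bases equality on a
primary key alone, which is the design being warned against:

```python
class Person:
    """Hypothetical record whose equality is its primary key only."""

    def __init__(self, pk, email):
        self.pk = pk
        self.email = email

    def __eq__(self, other):
        if not isinstance(other, Person):
            return NotImplemented
        return self.pk == other.pk

    def __hash__(self):
        return hash(self.pk)


old = Person(7, "old@example.com")
new = Person(7, "new@example.com")   # same key, different attribute

invitees = {old}
invitees |= {new}                    # old == new, so only one is kept

survivor = next(iter(invitees))
```

The set holds a single Person afterwards, and the other object's email is
silently absent from it. Nothing raises an error, which is what makes this
easy to miss.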


Marko


