Why does __hash__() not return a UUID4?

Cameron Simpson cs at cskk.id.au
Wed Aug 26 22:57:52 EDT 2020


On 26Aug2020 22:10, Marco Sulla <Marco.Sulla.Python at gmail.com> wrote:
>As title. The reasons that came in my mind are:
>1. speed
>2. security

Various reasons come to mind:
- it is overkill for what __hash__ has to do (computational and feature 
  overkill)
- requires importing the uuid module (overkill again)
- speed
- the hash must be the same each time it is called for a given object, 
  so you'd need to waste space saving a UUID

The requirements for a hash are that:
- it be stable for a given object
- for use in dicts, objects of equal value (via ==) have the same hash

That second requirement is actually a problem for UUID4. Suppose we
have 2 independent objects with the same value - how are you to ensure
they have the same UUID?
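
To make the two requirements concrete, here is a minimal sketch. The
class names ValueHashed and UUIDHashed are made up for illustration;
neither comes from the standard library. The first computes its hash
from its value, the second stores a per-object UUID4 and hashes that.

```python
import uuid


class ValueHashed:
    """Hypothetical class whose hash is computed from its value."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return (
            isinstance(other, ValueHashed)
            and (self.x, self.y) == (other.x, other.y)
        )

    def __hash__(self):
        # Equal values always produce equal hashes.
        return hash((self.x, self.y))


class UUIDHashed(ValueHashed):
    """Same equality as above, but hashed via a per-object UUID4."""

    def __init__(self, x, y):
        super().__init__(x, y)
        # The UUID must be stored, otherwise the hash would change
        # on every call and would not be stable for the object.
        self._uuid = uuid.uuid4()

    def __hash__(self):
        return hash(self._uuid)


a, b = ValueHashed(1, 2), ValueHashed(1, 2)
table = {a: "found"}
value_lookup_ok = b in table          # equal value, equal hash

u1, u2 = UUIDHashed(1, 2), UUIDHashed(1, 2)
uuid_table = {u1: "found"}
# u1 == u2, but their hashes almost certainly differ, so the
# lookup by an equal object is expected to miss.
uuid_lookup_ok = u2 in uuid_table
```

The UUID-based version satisfies the first requirement (stability) only
by spending extra storage per object, and it gives up the second one
entirely.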

Think through how a hash is used in a hash table - the hash is used to
distribute values into buckets in a statistically even fashion, while
all the values which are "equal" must land in the _same_ bucket. That
requires coordination of the hash values. Where that matters, you
compute the hash from the value. UUID4s are not like that.
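
The bucket idea can be sketched roughly as follows. This is a
deliberately simplified model, not how CPython's dict is actually
implemented; NBUCKETS and bucket_for are names invented for this
example.

```python
import uuid

NBUCKETS = 8  # hypothetical table size, chosen only for illustration


def bucket_for(hash_value, nbuckets=NBUCKETS):
    """Map a hash value to a bucket index in a simplified hash table."""
    return hash_value % nbuckets


# Two independent but equal keys: the hash is computed from the value,
# so both are sent to the same bucket.
key_a = ("spam", 42)
key_b = ("spam", 42)
same_bucket = bucket_for(hash(key_a)) == bucket_for(hash(key_b))

# Fresh UUID4s carry no information about any value, so two of them
# share a bucket only by chance (roughly 1 time in NBUCKETS).
trials = 1000
matches = sum(
    bucket_for(hash(uuid.uuid4())) == bucket_for(hash(uuid.uuid4()))
    for _ in range(trials)
)
```

With value-derived hashes, same_bucket is always true for equal keys.
With random UUID4s, a lookup for an "equal" object would usually search
the wrong bucket.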

Cheers,
Cameron Simpson <cs at cskk.id.au>

