[Python-ideas] incremental hashing in __hash__

Paul Moore p.f.moore at gmail.com
Thu Jan 5 10:58:42 EST 2017


On 5 January 2017 at 13:28, Neil Girdhar <mistersheik at gmail.com> wrote:
> The point is that the OP doesn't want to write his own hash function, but
> wants Python to provide a standard way of hashing an iterable.  Today, the
> standard way is to convert to tuple and call hash on that.  That may not be
> efficient. FWIW from a style perspective, I agree with OP.

The debate here regarding tuple/frozenset indicates that there may not
be a "standard way" of hashing an iterable (should order matter?).
That said, I agree that treating order as significant is a reasonable
default in the absence of any better information.
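
To make the order question concrete, here is a small illustration using
only builtins. The variable names are made up for the example; the point
is just that the tuple route treats order as significant while the
frozenset route does not (and frozenset also discards duplicates).

```python
items_a = [1, 2, 3]
items_b = [3, 2, 1]

# Tuple route: order is significant. The two tuples are not equal, so
# nothing requires their hashes to agree.
tuples_equal = tuple(items_a) == tuple(items_b)

# Frozenset route: order is ignored. The two frozensets are equal, so
# their hashes are guaranteed to be equal as well.
frozensets_equal = frozenset(items_a) == frozenset(items_b)
frozenset_hashes_equal = hash(frozenset(items_a)) == hash(frozenset(items_b))

print(tuples_equal, frozensets_equal, frozenset_hashes_equal)
```

Which of the two is "right" depends on how the class defines equality,
which is why a single standard answer is hard to pin down.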

Hashing is low enough level that providing helpers in the stdlib is
not unreasonable. It's not obvious (to me, at least) that it's a
common enough need to warrant it, though. Do we have any information
on how often people implement their own __hash__, or how often
hash(tuple(my_iterable)) would be an acceptable hash, except for the
cost of creating the tuple? The OP's request is the only time this has
come up as a requirement, to my knowledge. Hence my suggestion to copy
the tuple implementation, modify it to work with general iterables,
and publish it as a 3rd party module - its usage might give us an idea
of how often this need arises. (The other option would be for someone
to do some analysis of published code.)
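
As a rough idea of what such a third-party helper might look like, here
is a minimal pure-Python sketch. The name hash_iterable, the seed and the
multiplier are all assumptions chosen for illustration; this is a generic
multiply-and-xor combiner, not a copy of CPython's tuple hash, and it
will not return the same value as hash(tuple(it)). The Polyline class is
likewise hypothetical.

```python
import sys

# Keep the running value within the platform's hash width.
_MASK = (1 << sys.hash_info.width) - 1


def hash_iterable(iterable):
    """Hypothetical helper: order-sensitive hash of an iterable,
    computed one item at a time without building a tuple."""
    acc = 0x345678          # arbitrary seed (assumption)
    mult = 1000003          # arbitrary odd multiplier (assumption)
    for item in iterable:
        acc = ((acc ^ hash(item)) * mult) & _MASK
    # Normalize the unsigned accumulator to an ordinary hash value.
    return hash(acc)


class Polyline:
    """Hypothetical class whose __hash__ uses the helper."""

    def __init__(self, points):
        self._points = list(points)

    def __eq__(self, other):
        return isinstance(other, Polyline) and self._points == other._points

    def __hash__(self):
        # Feed an iterator directly; no intermediate tuple is created.
        return hash_iterable(iter(self._points))
```

Equal inputs give equal results whether they arrive as a list, a tuple or
a generator, so Polyline([(0, 0), (1, 1)]) can be used as a dict key or
set member consistently with its __eq__.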

Assuming it is a sufficiently useful primitive to add, then we can
debate naming. But I'd prefer it to be named in such a way that it
makes it clear that it's a low-level helper for people writing their
own __hash__ function, and not some sort of variant of hashing (which
hash.from_iterable implies to me).

Paul

