hash(unicode(string)) == hash(string) sometimes (was Re: Why KeyError ???)

Thu Mar 7 00:09:57 EST 2002

[John Machin]
> ...
> BTW, I'm not so sure of the utility of hash(1) == hash(1.0)

Which also equals hash(1L) and hash(1+0j).  In Guido's mind <wink>, these
are all "numbers", and equal numbers should have equal hashes.
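
A quick way to see this for yourself is the sketch below. It assumes a modern Python 3 interpreter, where the separate long type (and the 1L spelling) no longer exists, so only int, float and complex are compared. The names numbers, distinct_hashes and table are made up for the illustration.

```python
# Equal numbers of different types share one hash, so they act as one dict key.
numbers = [1, 1.0, 1 + 0j]

# Every pair compares equal ...
assert all(a == b for a in numbers for b in numbers)

# ... and so all of them collapse to a single hash value.
distinct_hashes = {hash(n) for n in numbers}

# Consequence for dicts: the float 1.0 finds (and overwrites) the int key 1.
table = {1: "one"}
table[1.0] = "uno"
```

After this runs, table still holds a single entry, and looking it up with 1, 1.0 or 1+0j returns the same value.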

> --- why on earth would anyone want to use floats as keys in a dictionary,
> anyway?

For example, I've often seen rounded Unix timestamps used as dict keys, to
map "representative time" to a list of events seen at that time.  A
generalization uses float-keyed dicts to build histograms, after reducing
the domain to bins at the desired granularity.
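
Both uses can be sketched in a few lines. This is only an illustration under stated assumptions: the function names bucket_events and histogram are hypothetical, the sample data is invented, and keys are formed by flooring to a multiple of the granularity (rounding to nearest would work just as well). The point is that every key is produced by the same arithmetic, so two values landing in the same bin yield bit-for-bit identical floats and the usual worries about float equality don't arise.

```python
import math
from collections import defaultdict

def bucket_events(events, granularity=60.0):
    """Map a representative (floored) float timestamp to the events seen then."""
    buckets = defaultdict(list)
    for timestamp, label in events:
        # Reduce the timestamp to the start of its granularity-sized window.
        key = math.floor(timestamp / granularity) * granularity
        buckets[key].append(label)
    return dict(buckets)

def histogram(samples, bin_width):
    """Count samples per bin, keyed by the float lower edge of each bin."""
    counts = defaultdict(int)
    for x in samples:
        counts[math.floor(x / bin_width) * bin_width] += 1
    return dict(counts)

events = [(120.5, "a"), (150.0, "b"), (185.2, "c")]
by_minute = bucket_events(events, 60.0)

bins = histogram([0.1, 0.4, 0.6, 1.2, 1.3], 0.5)
```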

> Everything one reads on floating-point fulminates against equality
> testing. Seems like extra code and extra run-time for little benefit.

The extra code is confined to the internal routine _Py_HashDouble(), and
most of that routine is scratching its head over how to get *any* reliable
hash code for a C double (C doesn't expose the bits, and fp formats can
and do vary across platforms).  The code to ensure that it matches the hash
code for a "compares equal" int or long is about a dozen lines.
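
To make the shape of that idea concrete, here is a rough Python sketch. It is not the C code in _Py_HashDouble(), and it is not claimed to produce the same values as the builtin hash() for non-integral floats. The function name sketch_hash_double and the placeholder result for infinities and NaN are assumptions made for the illustration. What it shows is the two-part structure: a short special case that defers to the integer hash when the float has an integral value, and a portable fallback that takes the number apart arithmetically instead of peeking at its bits.

```python
import math

def sketch_hash_double(x):
    """Illustrative float hash that agrees with hash(int) on integral values."""
    # Special case: an integral float must hash like the equal int.
    if x.is_integer():
        return hash(int(x))
    # Infinities and NaN have no int equivalent; 0 is an arbitrary placeholder.
    if math.isinf(x) or math.isnan(x):
        return 0
    # Portable fallback: split into mantissa and exponent without touching
    # the underlying bit pattern, then hash the exact pieces.
    mantissa, exponent = math.frexp(x)
    return hash((mantissa.as_integer_ratio(), exponent))
```

With this sketch, sketch_hash_double(1.0) equals hash(1) and sketch_hash_double(2.0**70) equals hash(2**70), while non-integral values such as 0.5 simply get some repeatable hash of their own.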

If you're indeed correct that floatish numbers have no use as dict keys,
this routine is never executed, so a claim of "extra runtime" would swallow
itself in embarrassment <wink>.