[Python-Dev] String hash function multiplier

Raymond Hettinger python at rcn.com
Wed Apr 14 12:05:50 EDT 2004


> [Raymond]
> > Does anyone have any issues with changing the hash multiplier for
the
> > string and Unicode hash functions?

[Tim]
> Don't touch it unless you can prove major benefits -- it's a
remarkable
> fact
> of life that the current multiplier hasn't resulted in any real-life
(but
> non-contrived) pathological cases.

Will leave it alone. 


>  Perhaps you think shifts and adds
> are
> faster?  I wouldn't -- the imul instruction on modern Pentiums is very
> fast.

On the P4, the documented latency went up from 4 cycles to 14 cycles
while shifts and adds went down to 0.5 cycles and 1 cycle respectively.
Timings confirm the result.

It looks like the best bet is to try to speedup the code without
changing the multiplier.  Intel's software optimization cookbook
recommends a partial unrolling and elimination of data dependencies so
that a second multiply can start 4 cycles after the previous one
started.  If practice bears out the theory, the timings could show a
three or fourfold speedup without changing the multiplier.



> (read Knuth).

Of course, I already have :-)



>  The right thing to compare Python's string hash
> to is "the standard" Fowler-Noll-Vo string hash

Ditto.



Raymond


#################################################################
#################################################################
#################################################################
#####
#####
#####
#################################################################
#################################################################
#################################################################



More information about the Python-Dev mailing list