[issue18771] Reduce the cost of hash collisions for set objects

Raymond Hettinger report at bugs.python.org
Sun Aug 18 03:49:58 CEST 2013


Raymond Hettinger added the comment:

Here's a simple benchmark to start with.

On my machine (2.4Ghz Core2 Duo with 2MB L3 cache) and compiled with GCC-4.8, the benchmark shows a 6% speedup for sets that spill over L2 cache and 11% for sets that spill over the L3 cache.

The theoretically expected result is 9.75%:

   26% collision rate (for tables that are 64% full)
 x 75% chance of adjacent entry in same cache line
 x 50% savings (two lookups for the price of one)
 -----
 9.75%
 =====

Most programs will have less benefit (they may smaller tables that fit in L1 cache or have tables that are less densely filled resulting in fewer collisions).  My quick timings show either no change or inconsequential improvement in the performance of those programs.

----------
Added file: http://bugs.python.org/file31349/set_bench.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18771>
_______________________________________


More information about the Python-bugs-list mailing list