[Cython] Hash-based vtables

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Thu Jun 7 00:03:57 CEST 2012



Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:

>On 06/06/2012 11:16 PM, Robert Bradshaw wrote:
>> On Wed, Jun 6, 2012 at 1:57 PM, Dag Sverre Seljebotn
>> <d.s.seljebotn at astro.uio.no>  wrote:
>>> On 06/06/2012 10:41 PM, Dag Sverre Seljebotn wrote:
>>>>
>>>> On 06/05/2012 12:30 AM, Robert Bradshaw wrote:
>>>>>
>>>>> I just found http://cmph.sourceforge.net/ which looks quite
>>>>> interesting. Though the resulting hash functions are supposedly
>cheap,
>>>>> I have the feeling that branching is considered cheap in this
>context.
>>>>
>>>>
>>>> Actually, this lead was *very* promising. I believe the very first
>>>> reference I actually read through and didn't eliminate after the
>>>> abstract totally swept away our home-grown solutions!
>>>>
>>>> "Hash&  Displace" by Pagh (1999) is actually very simple, easy to
>>>> understand, and fast both for generation and (the branch-free)
>lookup:
>>>>
>>>>
>>>>
>http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.69.3753&rep=rep1&type=pdf
>>>>
>>>>
>>>> The idea is:
>>>>
>>>> - Find a hash `g(x)` to partition the keys into `b` groups (the
>paper
>>>> requires b>  2n, though I think in practice you can often get away
>with
>>>> less)
>>>>
>>>> - Find a hash `f(x)` such that f is 1:1 within each group (which is
>>>> easily achieved since groups only has a few elements)
>>>>
>>>> - For each group, from largest to smallest: Find a displacement
>>>> `d[group]` so that `f(x) ^ d` doesn't cause collisions.
>>>>
>>>> It requires extra storage for the displacement table. However, I
>think 8
>>>> bits per element might suffice even for vtables of 512 or 1024 in
>size.
>>>> Even with 16 bits it's rather negligible compared to the
>minimum-128-bit
>>>> entries of the table.
>>>>
>>>> I benchmarked these hash functions:
>>>>
>>>> displace1: ((h>>  r1) ^ d[h&  63])&  m1
>>>> displace2: ((h>>  r1) ^ d[h&  m2])&  m1
>>>> displace3: ((h>>  r1) ^ d[(h>>  r2)&  m2])&  m1
>>>>
>>>> Only the third one is truly in the spirit of the algorithm, but I
>think
>>>> the first two should work well too (and when h is known
>compile-time,
>>>> looking up d[h&  63] isn't harder than looking up r1 or m1).
>>>>
>>>> My computer is acting up and all my numbers today are slower than
>the
>>>> earlier ones (yes, I've disabled turbo-mode in the BIOS for a year
>ago,
>>>> and yes, I've pinned the CPU speed). But here's today's numbers,
>>>> compiled with -DIMHASH:
>>>>
>>>> direct: min=5.37e-09 mean=5.39e-09 std=1.96e-11
>val=2400000000.000000
>>>> index: min=6.45e-09 mean=6.46e-09 std=1.15e-11
>val=1800000000.000000
>>>> twoshift: min=6.99e-09 mean=7.00e-09 std=1.35e-11
>val=1800000000.000000
>>>> threeshift: min=7.53e-09 mean=7.54e-09 std=1.63e-11
>val=1800000000.000000
>>>> displace1: min=6.99e-09 mean=7.00e-09 std=1.66e-11
>val=1800000000.000000
>>>> displace2: min=6.99e-09 mean=7.02e-09 std=2.77e-11
>val=1800000000.000000
>>>> displace3: min=7.52e-09 mean=7.54e-09 std=1.19e-11
>val=1800000000.000000
>>>>
>>>>
>>>> I did a dirty prototype of the table-finder as well and it works:
>>>>
>>>> https://github.com/dagss/hashvtable/blob/master/pagh99.py
>>>
>>>
>>> The paper obviously puts more effort on minimizing table size and
>not a fast
>>> lookup. My hunch is that our choice should be
>>>
>>> ((h>>  table.r) ^ table.d[h&  m2])&  m1
>>>
>>> and use 8-bits d (because even if you have 1024 methods, you'd
>rather double
>>> the number of bins than those 2 extra bits available for
>displacement
>>> options).
>>>
>>> Then keep incrementing the size of d and the number of table slots
>(in such
>>> an order that the total vtable size is minimized) until success. In
>practice
>>> this should almost always just increase the size of d, and keep the
>table
>>> size at the lowest 2**k that fits the slots (even for 64 methods or
>128
>>> methods :-))
>>>
>>> Essentially we avoid the shift in the argument to d[] by making d
>larger.
>>
>> Nice. I'm surprised that the indirection on d doesn't cost us much;
>
>Well, table->d[const & const] compiles down to the same kind of code as
>
>table->m1. I guess I'm surprised too that displace2 doesn't penalize.
>
>> hopefully its size wouldn't be a big issue either. What kinds of
>> densities were you achieving?
>
>The algorithm is designed for 100% density in the table itself. (We can
>
>lift that to compensate for a small space of possible hash functions I 
>guess.)
>
>I haven't done proper simulations yet, but I just tried |vtable|=128, 
>|d|=128 from the command line and I had 15 successes or so before the 
>first failure. That's with a 100% density in the vtable itself! (And 
>when it fails, you increase |d| to get your success).
>
>The caveat is the space spent on d (it's small in comparison, but
>that's 
>why this isn't too good to be true).
>
>A disadvantage might be that we may no longer have the opportunity to 
>not make the table size a power of two (i.e. replace the mask with "if 
>(likely(slot < n))"). I think for that to work one would need to
>replace 
>the xor group with addition on Z_d.

Strike this paragraph; don't know what I was thinking...

Dag
>
>> Going back to the idea of linear probing on a cache miss, this has
>the
>> advantage that one can write a brain-dead provider that sets m=0 and
>> simply lists the methods instead of requiring a table optimizer.
>(Most
>> tools, of course, would do the table optimization.) It also lets you
>> get away with a "kind-of good" hash rather than requiring you search
>> until you find a (larger?) perfect one.
>
>Well, given that we can have 100% density, and generating the table is 
>lightning fast, and the C code to generate the table is likely a 300 
>line utility... I'm not convinced.
>
>We should however make sure that *callers* can do a linear scan and use
>
>strcmp if they don't care about performance.
>
>Dag
>_______________________________________________
>cython-devel mailing list
>cython-devel at python.org
>http://mail.python.org/mailman/listinfo/cython-devel

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


More information about the cython-devel mailing list