Perfect hash generator

Andrew M. Kuchling akuchlin at mems-exchange.org
Tue Mar 28 16:48:02 EST 2000


I've written an amusing little hack, a script that generates perfect
minimal hash functions.  Perfect hash functions are commonly used in
cases where you have some fixed number of symbols, and want to
distinguish between them without having to do string comparisons.  For
example, compilers often use perfect hashes for the set of keywords in
the language they support, so they can do 'token = perfecthash(string)
; switch (token) {...}'.

The perfect_hash.py script outputs Python code for the perfect hash
that looks like this:

f1 =  lambda key: hash('Xk6mtQxP9w' + str(key)) % 15
f2 =  lambda key: hash('DrCEzy4RWs' + str(key)) % 15
G = [ 0, 1, 0, 9, 2, 10, 3, 7, 11, 6, 14, 13, 2, 8, 0, ]
def perfecthash(key):
    return (G[ f1(key) ] + G[ f2(key) ] ) % 15

In this case, perfecthash('January') == 1, perfecthash('October') ==
10, and so forth.  Where do the numbers come from?  They're magic! [1]

A Web page for the code is at:
http://starship.python.net/crew/amk/python/code/perfect-hash.html 

Comments & suggestions welcomed!

[1] The magic incantations are explained in "Optimal algorithms for
minimal perfect hashing", G. Havas, B.S. Majewski. 
ftp://ftp.csee.uq.oz.au/pub/Publications/Techreports/department/TR0234.dvi.gz
Pretty elegant algorithm...

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
She was no longer Delight, and the blossoms had already begun to fall in her
domain, becoming smudged and formless colours, and she had no one to talk to...
  -- Delight becomes Delirium, in SANDMAN #42: "Brief Lives:2"




More information about the Python-list mailing list