Newbie question about dictionaries

Fredrik Lundh fredrik at pythonware.com
Mon Jul 29 12:24:24 EDT 2002


Michael Hudson wrote:

> > Is it right then that I have to implement a custom class for key objects if
> > I want to use a custom hash function?
>
> Yes.  Why do you ask?  Is there some feature of the standard type's
> hash functions that you dislike?

tinkering with the hash function might be pretty meaningless if
you don't know the dictionary implementation...

quoting Tim from Objects/dictobject.c:

    Most hash schemes depend on having a "good" hash
    function, in the sense of simulating randomness.  Python
    doesn't: its most important hash functions (for strings
    and ints) are very regular in common cases:

    >>> map(hash, (0, 1, 2, 3))
    [0, 1, 2, 3]
    >>> map(hash, ("namea", "nameb", "namec", "named"))
    [-1658398457, -1658398460, -1658398459, -1658398462]
    >>>

    This isn't necessarily bad!  To the contrary, in a table of
    size 2**i, taking the low-order i bits as the initial table index
    is extremely fast, and there are no collisions at all for dicts
    indexed by a contiguous range of ints.  The same is app-
    roximately true when keys are "consecutive" strings.  So
    this gives better-than-random behavior in common cases,
    and that's very desirable.

    OTOH, when collisions occur, the tendency to fill contiguous
    slices of the hash table makes a good collision resolution
    strategy crucial.

    /.../

see the rest of that comment for an extensive discussion of how
exactly that strategy works in Python...

</F>





More information about the Python-list mailing list