Dictionary Keys question

Berteun Damman berteun at NO_SPAMdds.nl
Wed Jan 30 18:09:16 EST 2008


On Wed, 30 Jan 2008 14:47:36 -0800 (PST), FireNWater <khoard at gmail.com> wrote:
> I'm curious why the different outputs of this code.  If I make the
> dictionary with letters as the keys, they are not listed in the
> dictionary in alphabetical order, but if I use the integers then the
> keys are in numerical order.
>
> I know that the order of the keys is not important in a dictionary,
> but I was just curious about what causes the differences.  Thanks!!

I don't know the exact way Python's hash function works, but I can take
a guess. I'm sorry if I explain something you already know.

A hash is for quickly looking up data. Yet, you don't want to waste too
much memory. So there is a limit number of spaces allocated, in which to
store objects. This number of spaces can be thought of as a list. Then,
if you put something into the dict, Python computes the 'hash' of this
object, which basically forms the index in the list where to store it.
So every object should be mapped onto some index within the list. (If
you retrieve it, the hash is computed again, and the value on that index
is looked up, like list indexing, these are fast operations.)

Say, if you have 100 spaces, and someone puts in integer in the list,
the hashfunction used might be % 100. So the first 100 integers would
always be placed at consecutive places. For strings however, a more
complicated hash-function would be used, which takes into account more
characters, so strings don't end up in order.

For integers, if you put in integers that are spread very widely apart,
they won't end up in order either (see the mod 100 example, 104 will
come before 10).

If you replace the list2 in your example by:
list2 = [10000 * x for x in range(1,9)]

You will see that this one doesn't end up in order either. So, there's
no exception for integers when it comes to the order, yet the particular
properties of the hash function will cause sequential integers to end up
in order under some circumstances.

Berteun

PS:
What happens if two values map onto the same space is of course an
obvious question, and the trick is choosing your hashfunction so this
occurs not very often on average. If it happens there are several
strategies. Wikipedia probably has an explanation of how hash-functions
can work in such a case.



More information about the Python-list mailing list