frozendict (v0.1)

Arnaud Delobelle arnodel at gmail.com
Fri Oct 8 10:49:31 EDT 2010



kj wrote:
> In <878w29kxjp.fsf at gmail.com> Arnaud Delobelle <arnodel at gmail.com> writes:
>
> >E.g., try with {1:'a', 1j:'b'}
>
> I see.  Thanks for this clarification.  I learned a lot from it.
>
> I guess that frozenset must have some way of canonicalizing the
> order of its elements that is dependent on their Python values but
> not on their comparability.  My first thought was that they are
> ordered according to their hash values, but this theory doesn't
> hold:
>
> >>> abc = ('a', 'b', 'c')
> >>> sorted(map(hash, abc))
> [-468864544, -340864157, -212863774]
> >>> map(hash, frozenset(abc))
> [-468864544, -212863774, -340864157]
>
> I.e. the ordering of the elements in the frozenset does not correspond
> to the ordering of their hashes in either direction.  Hmmm.
>
> I tried to understand this by looking at the C source but I gave
> up after 10 fruitless minutes.  (This has been invariably the
> outcome of all my attempts at finding my way through the Python C
> source.)
>
> I guess the take-home message is that frozenset is a more general
> way to canonicalize an iterable object than sorting, even though
> the reasons for this still remain a mystery to me...  Then again,
> just looking at the voodoo that goes into algorithms for computing
> hashes fills me with despair.  As much as I dislike it, sooner or
> later I'll have to go on faith.

Computing the hash value of a container object usually involves
accumulating the hash values of its components, i.e. something like

def hash(container):
     hashval = initial_value
     for x in container:
          hashval = f(hashval, hash(x))
     return hashval

Now if one chooses f such that:
    * f(x, y) == f(y, x)
    * f(x, f(y, z)) = f(f(x, y), z)

(IOW, f defines a commutative and associative operation)

Then it doesn't matter in what order the elements of container are
enumerated, the resulting hash value will always be the same.  This
avoids having to find a canonical order to iterate over then elements
in.

--
Arnaud



More information about the Python-list mailing list