hashkey/digest for a complex object

Arnaud Delobelle arnodel at gmail.com
Sat Oct 9 16:39:51 EDT 2010


kj <no.email at please.post> writes:

> In <87y6a9lqnj.fsf at gmail.com> Arnaud Delobelle <arnodel at gmail.com> writes:
>
>>You could do something like this:
>
>>deep_methods = {       
>>    list: lambda f, l: tuple(map(f, l)),
>>    dict: lambda f, d: frozenset((k, f(v)) for k, v in d.items()),
>>    set: lambda f, s: frozenset(map(f, s)),
>>    # Add more if needed
>>    }
>
>>def apply_method(f, obj):
>>    try:
>>        method = deep_methods[type(obj)]
>>    except KeyError:
>>        return obj
>>    return method(f, obj)
>
>>def deepfreeze(obj):
>>    """Return a 'hashable version' of an object
>>    return apply_method(deepfreeze, obj)
>
>>def deephash(obj):
>>    """Return hash(deepfreeze(obj)) without deepfreezing"""
>>    return hash(apply_method(deephash, obj))
>
>># Example of deepfreezable object:
>>obj = [1, "foo", {(2, 4): {7, 5, 4}, "bar": "baz"}]
>                            ^       ^
>                            |       |
>                            `-------`------- what's this?

This is set literal notation, introduced in Python 3!
In 2.X, you would write set([7, 5, 4])

>
>
>>>>> deepfreeze(obj)
>>(1, 'foo', frozenset({('bar', 'baz'), ((2, 4), frozenset({4, 5, 7}))}))
>>>>> deephash(obj)
>>1341422540
>>>>> hash(deepfreeze(obj))
>>1341422540
>
>
> After fixing the missing """ in deepfreeze this code works as
> advertised, 

Oops, I must admit I added the docstrings after pasting the code :)

> but I'm mystified by the identity between hash(deepfreeze(...))  and
> deephash(...).  Without some knowledge of the Python internals, I
> don't see how this follows.

OK.  I haven't actually proved it, but it follows from the following
facts:

1. hash() is an idempotent function, i.e. hash(hash(x)) == hash(x) hold
for any hashable x (this is a simple consequence of the fact that
hash(x) == x for any int x (by 'int' I mean 2.X int)).

2. Container hashable objects compute their hash from the hash of their
elements.

I don't think either of these two facts is documented, but it would be quite
easy to verify them in the Python source (I must admit I have not done
it).  And it is difficult to conceive how it could be otherwise.

-- 
Arnaud



More information about the Python-list mailing list