[Python-ideas] An identity dict

Raymond Hettinger raymond.hettinger at gmail.com
Wed Jun 2 18:37:07 CEST 2010


>> * In the examples you posted (such as
>> http://codespeak.net/svn/pypy/trunk/pypy/tool/algo/graphlib.py ),
>> it appears that PyPy already has an identity dict, so how are they
>> helped by adding one to the collections module?
> 
> My purpose with those examples was to prove it as a generally useful utility.
> 
>> 
>> * Most of the posted examples already work with regular dicts (which
>> check identity before they check equality) -- don't the other
>> implementations already implement regular dicts which need to have
>> identity-implied-equality in order to pass the test suite?  I would
>> expect the following snippet to work under all versions and
>> implementations of Python:
>> 
>> 
>>     >>> class A:
>>     ...     pass
>>     >>> a = A()
>>     >>> d = {a: 10}
>>     >>> assert d[a] == 10   # uses a's identity for lookup
> 
> Yes, but that would be different if you have two "a"s with __eq__ defined to be
> equal and you want to hash them separately.

None of the presented examples take advantage of that property.
All of them work with regular dictionaries.   This proposal is still
use case challenged.
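
For reference, the property under discussion can be sketched as follows. This is a minimal illustration, not one of the posted examples; the Point class is hypothetical. Two distinct objects that compare equal (and hash equal) share a single slot in a regular dict:

```python
class Point:
    """Hypothetical value type: instances compare equal by coordinates."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))


p = Point(1, 2)
q = Point(1, 2)   # equal to p, but a distinct object

d = {}
d[p] = "first"
d[q] = "second"   # q == p, so this replaces the value stored under p

print(len(d))     # 1
print(d[p])       # second
```

Only code that needs p and q to remain separate keys despite being equal would behave differently with an identity dict.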

AFAICT from code searches, the idea of needing to override
an existing __eq__ with an identity-only comparison seems
to never come up.  It would not even be popular as an ASPN recipe.

Moreover, I think that including it in the standard library would be harmful.
The language makes very few guarantees about object identity.
In most cases a user would be far better off using a regular dictionary.
If a rare case arose where __eq__ needed to be overridden with an
identity-only check, it is not hard to write d[id(obj)]=value.  
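
A minimal sketch of that idiom, assuming a hypothetical Label class whose instances compare equal by their text. Keying on id(obj) keeps equal objects apart, provided both objects stay alive for as long as the entries are used:

```python
class Label:
    """Hypothetical class whose instances compare equal by text."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        return isinstance(other, Label) and self.text == other.text

    def __hash__(self):
        return hash(self.text)


a = Label("x")
b = Label("x")    # a == b, but they are different objects

d = {}
d[id(a)] = 10     # keyed on identity, not on equality
d[id(b)] = 20

print(d[id(a)], d[id(b)])   # 10 20
```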

Strong -1 on including this in the standard library.


Raymond


P.S.  ISTM that including subtly different variations of a data type
does more harm than good.   Understanding how to use an
identity dictionary correctly requires understanding the nuances
of object identity, how to keep the object alive outside the dictionary
(even if the dictionary keeps it alive, a user still needs an external reference
to be able to do a lookup), and knowing that the version proposed for
CPython has dramatically worse speed/space performance than
a regular dictionary.  The very existence of an identity dictionary in
collections is likely to distract a user away from a better solution using:
d[id(obj)]=value.
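
The lifetime nuance can be sketched like this. The Node class and the remember/recall helpers are hypothetical, and storing the object next to its value is an assumption made for the illustration, not part of the proposal. Since id() is only guaranteed to be unique among objects that exist at the same time, the sketch keeps a reference to each key object inside the dict:

```python
class Node:
    """Hypothetical object used only for this illustration."""

    def __init__(self, name):
        self.name = name


registry = {}


def remember(obj, value):
    # Store the object alongside its value so the dict keeps it alive
    # and its id cannot be reused by another object in the meantime.
    registry[id(obj)] = (obj, value)


def recall(obj):
    stored_obj, value = registry[id(obj)]
    assert stored_obj is obj   # sanity check: the id still refers to this object
    return value


n = Node("root")
remember(n, 42)
print(recall(n))               # 42

# A lookup still requires a reference to the original object:
# a distinct Node with the same name is not found.
other = Node("root")
print(id(other) in registry)   # False
```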

