[Python-Dev] Creating dicts from dict subclasses

Thu Dec 14 14:35:10 CET 2006

Armin Rigo wrote:

> Hi Walter,
> 
> On Wed, Dec 13, 2006 at 05:57:16PM +0100, Walter Dörwald wrote:
>> I tried to reimplement weakref.WeakValueDictionary as a subclass of
>> dict. The test passes except for one problem: To compare results
>> test_weakref.py converts a weakdict to a real dict via dict(weakdict).
>> This no longer works because PyDict_Merge() does a PyDict_Check() on the
>> argument and then ignores all overridden methods. (The old version
>> worked because UserDict.UserDict was used).
> 
> This is an instance of a general problem in Python: if you subclass a
> built-in type, then your overridden methods may or may not be used in
> various situations.  In this case you might have subtle problems with
> built-in functions and statements that expect a dict and manipulate it
> directly, because they will see the underlying dict structure.  It is
> also quite fragile: e.g. if a future version of CPython adds a new
> method to dicts, then your existing code will also grow the new method
> automatically - but as inherited from 'dict', which produces quite
> surprising results for the user.
> 
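A minimal sketch of the effect, assuming CPython (the class name
Doubling is made up for illustration). Only __getitem__ is overridden,
so dict() still sees a real dict and copies the underlying storage
directly:

```python
class Doubling(dict):
    """Hypothetical dict subclass whose lookups return doubled values."""

    def __getitem__(self, key):
        return dict.__getitem__(self, key) * 2


d = Doubling(a=1, b=2)

# dict() copies the stored values and never calls the override.
plain = dict(d)

# An explicit loop goes through the override.
via_getitem = {key: d[key] for key in d.keys()}
```

Here plain holds the stored values, while via_getitem holds what the
subclass would actually hand out through d[key].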
>>    for key in iter(arg.keys()):
>>       self[key] = arg.__getitem__(key)
>>
>> Why can't we use:
>>
>>    for key in iter(arg):
>>       self[key] = arg.__getitem__(key)
> 
> The latter would allow 'arg' to be a sequence instead of a mapping.  It
> may not even crash but produce nonsense instead, e.g. if 'arg' is a list
> of small integers.

Of course I meant: use the alternate code inside PyDict_Merge() where
dict_update_common() already has decided that the argument is a mapping
(which is done via PyObject_HasAttrString(arg, "keys")).
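To make both points concrete, here is a rough Python sketch (the
function names are hypothetical, and this is not the C code itself).
The unchecked loop shows the nonsense Armin describes; the second
function does the "keys" check first and otherwise treats the argument
as a sequence of key/value pairs, the way dict.update() does:

```python
def update_unchecked(target, arg):
    # The proposed loop with no check that arg is a mapping.
    for key in iter(arg):
        target[key] = arg[key]
    return target


def update_common(target, arg):
    # Decide first whether arg is a mapping by looking for "keys".
    if hasattr(arg, "keys"):
        for key in iter(arg):
            target[key] = arg[key]
    else:
        # Not a mapping: expect an iterable of (key, value) pairs.
        for key, value in arg:
            target[key] = value
    return target


# List elements are used both as keys and as indices into the list.
nonsense = update_unchecked({}, [1, 2, 0])   # {1: 2, 2: 0, 0: 1}

from_mapping = update_common({}, {"a": 1})
from_pairs = update_common({}, [("b", 2)])
```

With the check in place, update_common({}, [1, 2, 0]) raises a
TypeError instead of silently building a wrong dict.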

> Moreover there are multiple places in the code base
> that assume that mappings are "something with a 'keys' and a
> '__getitem__'", so I suppose any change in that should be done
> carefully.

Doing a
   grep PyMapping_Keys `find -name '*.[ch]'`
reveals the following:

./Python/ceval.c:               all = PyMapping_Keys(dict);

This is used for "import *" and simply iterates over the keys, so it
could use iterkeys()/iter() instead.
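
Roughly, and only as a Python approximation of what the C code does
(star_import_names is a made-up helper, written for current Python),
the name selection looks like this: use __all__ if the module has one,
otherwise walk the keys of the module dict and skip names with a
leading underscore. Plain iteration is all that is needed:

```python
import types


def star_import_names(module):
    """Approximate which names "from module import *" would bind."""
    names = getattr(module, "__all__", None)
    if names is None:
        # No __all__: iterate over the module dict, skipping _private names.
        names = [name for name in module.__dict__ if not name.startswith("_")]
    return list(names)


# Throwaway module object used only for this illustration.
demo = types.ModuleType("demo")
demo.public = 1
demo._private = 2

names_without_all = star_import_names(demo)

demo.__all__ = ["_private"]
names_with_all = star_import_names(demo)
```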

./Objects/object.c:             result = PyMapping_Keys(locals);

This is in PyObject_Dir(). It has to return the list of keys itself, so
iterkeys()/iter() is not an option here.

./Objects/descrobject.c:        return PyMapping_Keys(pp->dict);

This too must return a list of keys.

./Objects/dictobject.c:                 PyObject *keys = PyMapping_Keys(b);

This is the dict constructor.

./PC/_subprocess.c:     keys = PyMapping_Keys(environment);

This iterates over keys() and values() to format the complete
environment, so it probably could be switched to iterkeys()/iter().

./Modules/_sre.c:    keys = PyMapping_Keys(self->pattern->groupindex);

This again does iteration, so could be switched.

./Modules/posixmodule.c:        keys = PyMapping_Keys(env);
./Modules/posixmodule.c:        keys = PyMapping_Keys(env);
./Modules/posixmodule.c:        keys = PyMapping_Keys(env);

Those three are for execve/spawnve/spawnvpe and do basically the same as
PC/_subprocess.c, so could be switched too.
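
As an illustration of what "switched" would mean for these callers,
here is a small Python sketch (format_environment is a hypothetical
name, and the real code is C): the "KEY=VALUE" entries are built by
iterating over the mapping directly, without first asking for a list
of keys.

```python
def format_environment(env):
    """Build "KEY=VALUE" strings by iterating over the mapping itself."""
    entries = []
    for key in iter(env):
        # Each value is fetched through the mapping's own __getitem__.
        entries.append("%s=%s" % (key, env[key]))
    return entries


example_env = {"HOME": "/home/walter", "LANG": "C"}
formatted = sorted(format_environment(example_env))
```

Since only iter() and __getitem__ are used, any object that supports
those two would work here, whether or not it has a keys() method.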

Servus,
   Walter