Inconsistency between dict() and collections.OrderedDict() methods.

Erik python at lucidity.plus.com
Sat Apr 29 16:19:43 EDT 2017


I have a subclass of dict that enforces which keys are allowed to be set 
and only allows each key to be set at most once:

class StrictDict(dict):
    def __init__(self, validkeys, *args, **kwargs):
        self.validkeys = validkeys
        super(StrictDict, self).__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if key not in self.validkeys:
            raise KeyError("'%s' is not a valid key" % key)

        if key in self:
            raise KeyError("'%s' is already set" % key)

        super(StrictDict, self).__setitem__(key, value)

This works fine in the general case:

/tmp$ ~/sw/Python-3.6.1/python
Python 3.6.1 (heads/master:7bcbcb9, Apr 21 2017, 01:52:28)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from strictdict import *
>>> d = StrictDict(('foo', 'bar'))
>>> d['foo'] = 1
>>> d['foo'] = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/strictdict.py", line 11, in __setitem__
    raise KeyError("'%s' is already set" % key)
KeyError: "'foo' is already set"
>>> d['baz'] = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/strictdict.py", line 8, in __setitem__
    raise KeyError("'%s' is not a valid key" % key)
KeyError: "'baz' is not a valid key"
>>> d
{'foo': 1}

However, when I call other methods which set items in the dictionary, I 
do not get errors for invalid or duplicate keys (__setitem__ is not called):

/tmp$ ~/sw/Python-3.6.1/python
Python 3.6.1 (heads/master:7bcbcb9, Apr 21 2017, 01:52:28)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from strictdict import *
>>> d = StrictDict(('foo', 'bar'), baz=1)
>>> d
{'baz': 1}
>>> d.update({'spam': 1})
>>> d
{'baz': 1, 'spam': 1}

It seems a little onerous that I have to put the key checks in several 
places and reimplement each of those methods by hand (and keep on top 
of that if dict grows new methods that set items). Is there a 
compelling reason why the dict type doesn't call a custom __setitem__ 
each time an item needs to be set? Performance is undoubtedly a concern 
with something as fundamental as dicts, but what I'm suggesting could be 
done fairly efficiently (with a single "is the self object a subclass 
which overrides __setitem__" test followed by running the existing code 
or a slower version which calls __setitem__).
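
To make the suggestion concrete, here is a rough pure-Python sketch of 
that test. SetItemAwareDict and _overrides_setitem are hypothetical 
names made up for illustration; this is not how CPython implements dict, 
and it only covers the constructor and update():

```python
class SetItemAwareDict(dict):
    """Hypothetical base class: bulk setters honour an overridden __setitem__."""

    def _overrides_setitem(self):
        # The single test described above: has a subclass replaced __setitem__?
        return type(self).__setitem__ is not dict.__setitem__

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def update(self, *args, **kwargs):
        if not self._overrides_setitem():
            # Fast path: the existing built-in behaviour.
            super().update(*args, **kwargs)
            return
        # Slow path: normalise the arguments with a plain dict, then set
        # each item through the subclass's __setitem__.
        # Note: repeated keys within one call collapse here before the
        # subclass checks run.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


class StrictDict(SetItemAwareDict):
    def __init__(self, validkeys, *args, **kwargs):
        self.validkeys = validkeys
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if key not in self.validkeys:
            raise KeyError("'%s' is not a valid key" % key)
        if key in self:
            raise KeyError("'%s' is already set" % key)
        super().__setitem__(key, value)
```

With this, StrictDict(('foo', 'bar'), baz=1) and d.update({'baz': 1}) 
both raise KeyError. Other setters such as setdefault() would need the 
same treatment. collections.UserDict is another route, since its 
update() is built on top of __setitem__, at the cost of no longer being 
a dict subclass.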

Anyway, related to but separate from that: as the subject of this 
message says, it turns out that collections.OrderedDict() *does* do 
what I would expect:

from collections import OrderedDict
class OrderedStrictDict(OrderedDict):
    def __init__(self, validkeys, *args, **kwargs):
        self.validkeys = validkeys
        super(OrderedStrictDict, self).__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        if key not in self.validkeys:
            raise KeyError("'%s' is not a valid key" % key)

        if key in self:
            raise KeyError("'%s' is already set" % key)

        super(OrderedStrictDict, self).__setitem__(key, value)

/tmp$ ~/sw/Python-3.6.1/python
Python 3.6.1 (heads/master:7bcbcb9, Apr 21 2017, 01:52:28)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from strictdict import *
>>> d = OrderedStrictDict(('foo', 'bar'), baz=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/strictdict.py", line 19, in __init__
    super(OrderedStrictDict, self).__init__(*args, **kwargs)
  File "/tmp/strictdict.py", line 23, in __setitem__
    raise KeyError("'%s' is not a valid key" % key)
KeyError: "'baz' is not a valid key"
>>> d = OrderedStrictDict(('foo', 'bar'))
>>> d.update({'baz': 1})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/strictdict.py", line 23, in __setitem__
    raise KeyError("'%s' is not a valid key" % key)
KeyError: "'baz' is not a valid key"
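
The difference can be seen side by side with a small recording 
subclass. This is only a sketch: make_recorder and exercise are helper 
names made up for this example, and the behaviour described is what I 
would expect on CPython, in line with the sessions above.

```python
from collections import OrderedDict


def make_recorder(base):
    """Build a subclass of `base` that logs every __setitem__ call."""

    class Recorder(base):
        def __init__(self, *args, **kwargs):
            self.calls = []  # keys seen by __setitem__, in call order
            super().__init__(*args, **kwargs)

        def __setitem__(self, key, value):
            self.calls.append(key)
            super().__setitem__(key, value)

    return Recorder


def exercise(base):
    """Set four items in four different ways; report keys and logged calls."""
    d = make_recorder(base)(a=1)  # constructor with keyword arguments
    d.update(b=2)                 # update()
    d.setdefault("c", 3)          # setdefault()
    d["d"] = 4                    # plain item assignment
    return sorted(d), d.calls


print("dict:       ", exercise(dict))
print("OrderedDict:", exercise(OrderedDict))
```

Both objects end up holding all four keys, but for the dict subclass 
only the plain assignment should show up in the log, while the 
OrderedDict subclass should log every one of them.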


The documentation for OrderedDict says: "Return an instance of a dict 
subclass, supporting the usual dict methods. An OrderedDict is a dict 
that remembers the order that keys were first inserted. If a new entry 
overwrites an existing entry, the original insertion position is left 
unchanged. Deleting an entry and reinserting it will move it to the end."

The strong implication there is that the class behaves exactly like a 
dict apart from the ordering, but that's not true.

E.


