trouble building data structure

Ian Kelly ian.g.kelly at gmail.com
Mon Sep 29 11:36:25 EDT 2014


On Mon, Sep 29, 2014 at 7:52 AM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Whether you prefer to use setdefault, or a defaultdict, is a matter of
> taste.

There is potentially a significant difference in performance -- with
setdefault, the subordinate data structure is created on every call to
be passed into setdefault, only to be discarded if the key already
exists. With defaultdict, the subordinate data structure is only
created when needed. Dicts are pretty cheap to construct though, and
this is probably not worth fretting over until profiling shows it to
be a problem.
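A minimal sketch of that difference, counting how often the default gets built. The helper make_list and the calls counter are hypothetical names used only for illustration:

```python
from collections import defaultdict

calls = {"count": 0}

def make_list():
    # Hypothetical factory that records how many times it is invoked.
    calls["count"] += 1
    return []

words = ["a", "b", "a", "a"]

# setdefault: the argument is evaluated on every call, even when the
# key is already present and the new list is thrown away.
plain = {}
for word in words:
    plain.setdefault(word, make_list()).append(1)
setdefault_calls = calls["count"]

# defaultdict: the factory only runs when a key is actually missing.
calls["count"] = 0
lazy = defaultdict(make_list)
for word in words:
    lazy[word].append(1)
defaultdict_calls = calls["count"]

print(setdefault_calls, defaultdict_calls)
```

Both loops build the same mapping; only the number of throwaway lists differs.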

On the other hand, it's easier to nest setdefaults arbitrarily deep
than it is for defaultdicts. This is because defaultdict suffers from
a design flaw -- defaultdict should be a function that returns a class
(like namedtuple), not a class itself. Fortunately that's easily fixable:

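_DEFAULT_DICT_CACHE = {}

def defaultdict(factory):
  # Return a dict subclass whose missing keys are filled in by factory().
  # Classes are cached so the same factory always yields the same class.
  try:
    return _DEFAULT_DICT_CACHE[factory]
  except KeyError:
    class _defaultdict(dict):
      # staticmethod keeps a plain function from being bound as a method.
      _callable = staticmethod(factory)

      def __missing__(self, key):
        self[key] = value = self._callable()
        return value

    _DEFAULT_DICT_CACHE[factory] = _defaultdict
    return _defaultdict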

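A downside is that it would take some extra work to make this picklable,
since pickle looks classes up by name in their module and these classes
are created inside a function.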
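To illustrate the nesting point, here is a rough comparison of the three approaches. The name class_defaultdict is hypothetical, chosen to avoid clashing with collections.defaultdict, and storing the factory on the class via staticmethod is one assumed way to wire it up:

```python
from collections import defaultdict as std_defaultdict

_CACHE = {}

def class_defaultdict(factory):
    """Hypothetical helper: return a dict subclass that uses factory()."""
    try:
        return _CACHE[factory]
    except KeyError:
        class _defaultdict(dict):
            # staticmethod stops a plain function being bound as a method.
            _callable = staticmethod(factory)

            def __missing__(self, key):
                self[key] = value = self._callable()
                return value

        _CACHE[factory] = _defaultdict
        return _defaultdict

# 1. setdefault chains to any depth without extra setup.
by_setdefault = {}
by_setdefault.setdefault("x", {}).setdefault("y", []).append(1)

# 2. The stdlib defaultdict needs a wrapper callable for each level.
by_stdlib = std_defaultdict(lambda: std_defaultdict(list))
by_stdlib["x"]["y"].append(1)

# 3. A function that returns a class composes directly, because the
#    returned class is itself a zero-argument callable.
Nested = class_defaultdict(class_defaultdict(list))
by_class = Nested()
by_class["x"]["y"].append(1)
```

All three end up holding the same nested data; the difference is only in how much ceremony each level of nesting costs.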


