[Python-Dev] Proposal: defaultdict

Mon Feb 20 04:52:43 CET 2006

"Michael Urman" <murman at gmail.com> wrote:
> 
> On 2/19/06, Josiah Carlson <jcarlson at uci.edu> wrote:
> > My post probably hasn't convinced you, but much of the confusion, I
> > believe, is based on Martin's original belief that 'k in dd' should
> > always return true if there is a default.  One can argue that way, but
> > then you end up on the circular train of thought that gets you to "you
> > can't do anything useful if that is the case, .popitem() doesn't work,
> > len() is undefined, ...".  Keep it simple, keep it sane.
> 
> A default factory implementation fundamentally modifies the behavior
> of the mapping. There is no single answer to the question "what is the
> right behavior for contains, len, popitem" as that depends on what the
> code that consumes the mapping is written like, what it is attempting
> to do, and what you are attempting to override it to do. Or, simply,
> on why you are providing a default value. Resisting the temptation to
> guess the why and just leaving the methods as is seems the best
> choice; overriding __contains__ to return true is much easier than
> reversing that behavior would be.

I agree, there is nothing perfect.  But at least in all of my use-cases,
and the majority of the ones I've seen 'in the wild', my previous post
provided an implementation that worked precisely as desired, and
precisely like a regular dictionary, except when accessing a
nonexistent key via value = dd[key].  __contains__, etc., all work
exactly like they do with a non-defaulting dictionary. Iteration via
popitem(), pop(key), items(), iteritems(), __iter__, etc., all work the
way you would expect them to. The only nit is code which iterates like
the following: it no longer skips missing keys, because dd[key] supplies
the default instead of raising KeyError:

    for key in keys:
        try:
            value = dd[key]
        except KeyError:
            continue

(where 'keys' has nothing to do with dd.keys(), it is merely a listing
of keys which are desired at this particular point)  However, the
following works like it always did:

    for key in keys:
        if key not in dd:
            continue
        value = dd[key]
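
The difference between the two loops can be sketched as follows. This is
only an illustration: it uses today's collections.defaultdict as a
stand-in for the pure-Python implementation from the earlier post (which
is not reproduced here), and the helper names make_dd, visit_with_try and
visit_with_in are hypothetical.

```python
from collections import defaultdict


def make_dd():
    # Stand-in for the defaulting dictionary discussed above (assumption:
    # collections.defaultdict behaves closely enough for this illustration).
    dd = defaultdict(list)
    dd["a"].append(1)
    return dd


def visit_with_try(dd, keys):
    """try/except style: KeyError is never raised, so nothing is skipped."""
    visited = []
    for key in keys:
        try:
            value = dd[key]  # a missing key yields the default instead
        except KeyError:
            continue
        visited.append((key, value))
    return visited


def visit_with_in(dd, keys):
    """Membership-test style: behaves exactly as with a plain dict."""
    visited = []
    for key in keys:
        if key not in dd:
            continue
        visited.append((key, dd[key]))
    return visited


keys = ["a", "b"]
print(visit_with_try(make_dd(), keys))  # "b" is visited with a default value
print(visit_with_in(make_dd(), keys))   # "b" is skipped
```

Note that with collections.defaultdict the try/except loop also stores
the default under the missing key as a side effect, so len(dd) grows;
whether a given implementation does that is a separate design choice.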

> An example when it could theoretically be used, if not particularly
> useful. The gettext.install() function was just updated to take a
> names parameter which controls which gettext accessor functions it
> adds to the builtin namespace. Its implementation looks for "method in
> names" to decide. Passing a default-true dict would allow the future
> behavior to be to bind all checked names, but only if __contains__
> returns True.
> 
> Even though it would make a poor base implementation, and these
> effects aren't a good candidate for it, the code style that could
> best leverage such a __contains__ exists.

Indeed, there are cases where an always-true __contains__ is useful, and
the pure-Python implementation I previously posted can be easily
modified to offer such a feature.  However, because there are also use
cases for the not-always-true __contains__, picking either as the "one
true way" seems a bit unnecessary.

Presumably, if one goes into the collections module, the other will too. 
Actually, they could share all of their code except for a simple flag
which determines the always-true __contains__.  With minor work, that
'flag', or really the single bit it would require, may even be
embeddable into the type object.  Arguably, there should be a handful of
these defaulting dictionary-like objects, and for each variant the
documentation should cover its use-cases and any gotchas that will
inevitably come up.
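
As a rough sketch of the shared-code idea, the two variants could differ
only in a flag consulted by __contains__. The class name
FlaggedDefaultDict and the always_contains parameter are hypothetical,
the flag is stored per instance rather than in the type object, and the
sketch builds on collections.defaultdict rather than on the
implementation posted earlier.

```python
from collections import defaultdict


class FlaggedDefaultDict(defaultdict):
    """Defaulting dict whose __contains__ behaviour is chosen by a flag.

    Hypothetical illustration only; not an API from the collections module.
    """

    def __init__(self, default_factory=None, *args,
                 always_contains=False, **kwargs):
        super().__init__(default_factory, *args, **kwargs)
        # The single bit of state that separates the two variants.
        self.always_contains = always_contains

    def __contains__(self, key):
        if self.always_contains:
            return True  # every key is reported as present
        return super().__contains__(key)  # ordinary dict membership


plain = FlaggedDefaultDict(int)
always = FlaggedDefaultDict(int, always_contains=True)

print("x" in plain)   # ordinary membership test on an empty mapping
print("x" in always)  # always-true variant
print(len(always))    # len() still counts only stored keys
```

Only __contains__ changes here; len(), popitem() and iteration keep
their ordinary dictionary meaning in both variants.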

 - Josiah