Why defaultdict?

Chris Rebert clp2 at rebertia.com
Fri Jul 2 00:37:31 EDT 2010


On Thu, Jul 1, 2010 at 9:11 PM, Steven D'Aprano
<steve at remove-this-cybersource.com.au> wrote:
> I would like to better understand some of the design choices made in
> collections.defaultdict.

Perhaps python-dev should've been CC-ed...

> Firstly, to initialise a defaultdict, you do this:
>
> from collections import defaultdict
> d = defaultdict(callable, *args)
>
> which sets an attribute of d "default_factory" which is called on key
> lookups when the key is missing. If callable is None, defaultdicts are
> *exactly* equivalent to built-in dicts, so I wonder why the API wasn't
> added on to dict rather than a separate class that needed to be imported.
> That is:
>
> d = dict(*args)
> d.default_factory = callable
>
> If you failed to explicitly set the dict's default_factory, it would
> behave precisely as dicts do now. So why create a new class that needs to
> be imported, rather than just add the functionality to dict?

I don't know the actual rationale, but here's one thought: if it were
done that way, any code you pass a dict to could set a default_factory
on it where there wasn't one before. From then on, missing-key lookups
would silently create entries instead of raising KeyError, which could
lead to strange results if you weren't anticipating that. Keeping
defaultdict as a separate class means plain dicts can't be affected.
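
A minimal sketch of what I mean, using a defaultdict with a factory of
None to stand in for the proposed "dict with an optional
default_factory". The helper() function is made up for illustration:

```python
from collections import defaultdict

def helper(mapping):
    # Hypothetical callee that quietly installs a default factory.
    mapping.default_factory = list

# With a factory of None, a defaultdict raises KeyError like a plain dict.
d = defaultdict(None, {"a": 1})
try:
    d["missing"]
    raised_before = False
except KeyError:
    raised_before = True

helper(d)

# The same lookup no longer fails; it silently inserts a new key.
value_after = d["missing"]

# A real dict is immune today: it has no such attribute to set.
plain = {"a": 1}
try:
    helper(plain)
    plain_affected = True
except AttributeError:
    plain_affected = False
```

The caller's code did not change between the two lookups; only the
callee's side effect did, which is the kind of surprise I had in mind.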

<snip>
> Second, why is the factory function not called with key?

Agreed; I've never understood this either. Ruby's Hash::new does it
better (http://ruby-doc.org/core/classes/Hash.html), and even supports
your case 0: when generating a default value, it calls the equivalent
of default_factory(d, key).
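
For what it's worth, you can get roughly that behaviour in Python today
by subclassing dict and defining __missing__, which dict lookups call
with the missing key. This is only a sketch; the class name
KeyedDefaultDict and its factory(d, key) calling convention are my own
invention, not anything in the stdlib:

```python
class KeyedDefaultDict(dict):
    """Hypothetical dict whose factory receives the mapping and the key."""

    def __init__(self, factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.factory = factory

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        value = self.factory(self, key)
        self[key] = value  # store it, as defaultdict does
        return value

# Case 1: the default depends on the key.
squares = KeyedDefaultDict(lambda d, k: k * k)

# Case 0: the default depends on the key and on what else is in the dict.
sized = KeyedDefaultDict(lambda d, k: len(d))
```

Here squares[4] computes and stores 16, and each new key in sized gets
the number of entries that were present just before it was added.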

> There are three
> obvious kinds of "default values" a dict might want, in order of more-to-
> less general:
>
> (1) The default value depends on the key in some way: return factory(key)
> (2) The default value doesn't depend on the key: return factory()
> (3) The default value is a constant: return C
>
> defaultdict supports (2) and (3):
>
> defaultdict(factory, *args)
> defaultdict(lambda: C, *args)
>
> but it doesn't support (1). If key were passed to the factory function,
> it would be easy to support all three use-cases, at the cost of a
> slightly more complex factory function.
<snip>
> (There is a zeroth case as well, where the default value depends on the
> key and what else is in the dict: factory(d, key). But I suspect that's
> well and truly YAGNI territory.)
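
For the archives, here is a small sketch of cases (2) and (3) as
defaultdict handles them now, plus what happens if you try to sneak in
a case (1) factory. The sample words and the constant C are made up:

```python
from collections import defaultdict

# Case (2): the factory takes no arguments and ignores the key.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

# Case (3): a constant default, wrapped in a zero-argument lambda.
C = 0
counts = defaultdict(lambda: C)
counts["x"] += 1

# Case (1) is not supported: the factory is called with no arguments,
# so a factory that expects the key fails on the first missing lookup.
keyed = defaultdict(lambda key: key.upper())
try:
    keyed["a"]
    case_one_failed = False
except TypeError:
    case_one_failed = True
```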

Cheers,
Chris
--
http://blog.rebertia.com


