Defaultdict and speed

bearophileHUGS at lycos.com
Fri Nov 3 03:29:23 EST 2006


This post sums up some things I have written in another Python newsgroup.
More than 40% of the time, I use defaultdict like this, to count
things:

>>> from collections import defaultdict as DD
>>> s = "abracadabra"
>>> d = DD(int)
>>> for c in s: d[c] += 1
...
>>> d
defaultdict(<type 'int'>, {'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1})

But I have seen that if the keys are quite sparse, so that int() gets
called too often, then code like this is faster:

>>> d = {}
>>> for c in s:
...   if c in d: d[c] += 1
...   else: d[c] = 1
...
>>> d
{'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1}
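The speed difference claimed above can be checked with a rough timing
sketch like the one below. It is only an illustration under some
assumptions: the function names (count_defaultdict, count_plain,
make_sparse_keys) and the key distribution are made up for this example,
it is written for a current Python 3 rather than the 2.5 of this post,
and the result will vary with the interpreter version and the data, so no
timings are quoted here.

```python
import random
import timeit
from collections import defaultdict


def count_defaultdict(items):
    """Count items with defaultdict(int); int() runs once per new key."""
    d = defaultdict(int)
    for x in items:
        d[x] += 1
    return d


def count_plain(items):
    """Count items with a plain dict and an explicit membership test."""
    d = {}
    for x in items:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d


def make_sparse_keys(n, seed=0):
    """Hypothetical test data: n keys drawn from a range 10 times larger,
    so most keys are seen only once (the 'sparse' case)."""
    rng = random.Random(seed)
    return [rng.randrange(n * 10) for _ in range(n)]


if __name__ == "__main__":
    keys = make_sparse_keys(100000)
    for fn in (count_defaultdict, count_plain):
        # Take the best of a few repeats to reduce timing noise.
        t = min(timeit.repeat(lambda: fn(keys), number=5, repeat=3))
        print("%-18s %.4f s" % (fn.__name__, t))
```

Replacing make_sparse_keys with data that has few distinct keys (such as
the "abracadabra" string repeated many times) shows the opposite extreme,
where the default factory is almost never called.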

So, to improve the speed in this special but common situation,
defaultdict could handle the default_factory=int case in a different
and faster way.

Bye,
bearophile



