Defaultdict and speed
bearophileHUGS at lycos.com
Fri Nov 3 03:29:23 EST 2006
This post sums up some things I have written in another Python newsgroup.
More than 40% of the time I use defaultdict, I use it like this, to count
things:
>>> from collections import defaultdict as DD
>>> s = "abracadabra"
>>> d = DD(int)
>>> for c in s: d[c] += 1
...
>>> d
defaultdict(<type 'int'>, {'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1})
But I have seen that if the keys are quite sparse (many distinct keys, each
seen only a few times), so that int() gets called very often, then code like
this is faster:
>>> d = {}
>>> for c in s:
... if c in d: d[c] += 1
... else: d[c] = 1
...
>>> d
{'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1}
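A rough way to check this on a given machine is a small timeit script. The sketch below is only an illustration and not the measurement behind the claim above: the function names, list sizes and key ranges are made up, and the timings will depend on the Python version and hardware.

```python
from collections import defaultdict
from timeit import timeit
import random


def count_defaultdict(items):
    # Counting with defaultdict(int): int() is called once per new key.
    d = defaultdict(int)
    for x in items:
        d[x] += 1
    return d


def count_plain(items):
    # Counting with a plain dict and an explicit membership test.
    d = {}
    for x in items:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d


# Hypothetical test data: "sparse" has mostly distinct keys,
# "dense" has only ten distinct keys.
random.seed(0)
sparse = [random.randrange(10**6) for _ in range(100000)]
dense = [random.randrange(10) for _ in range(100000)]

for name, data in (("sparse", sparse), ("dense", dense)):
    t_dd = timeit(lambda: count_defaultdict(data), number=3)
    t_plain = timeit(lambda: count_plain(data), number=3)
    print("%s keys: defaultdict %.3fs, plain dict %.3fs" % (name, t_dd, t_plain))
```

Both functions return the same counts, so only the time differs. Comparing the "sparse" and "dense" lines shows how much of the cost comes from creating default values for new keys.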
So, to improve the speed in this special but common situation,
defaultdict could handle the default_factory=int case in a different
and faster way.
Bye,
bearophile