[New-bugs-announce] [issue23509] Counter.__iadd__ falls back to __add__ with bad performance impact

Jörn Hees report at bugs.python.org
Tue Feb 24 10:43:42 CET 2015


New submission from Jörn Hees:

This caught me when i worked on a stats script and updated c = Counter() in a loop with c += Counter(temp_dict). After profiling, i found out that my script spent 2 minutes in Counter.__add__ which seemed terribly slow (each invocation was in the ms not µs range).

Looking into this i found out that there is no implementation for Counter.__iadd__ (or __isub__) which i just assumed to be based on Counter.update (Counter.subtract).

Here's some timing output:

In [1]: from collections import Counter

In [2]: a = range(1000)

In [3]: b = range(1000)

In [4]: ab = a+b

In [5]: len(ab)
Out[5]: 2000

In [6]: %timeit c=Counter(a)
1000 loops, best of 3: 361 µs per loop

In [7]: %timeit c=Counter(ab)
1000 loops, best of 3: 734 µs per loop

In [8]: %timeit c=Counter(a) ; c += Counter(b)
1000 loops, best of 3: 1.51 ms per loop

In [9]: %timeit c=Counter(a) ; c.update(b)
1000 loops, best of 3: 741 µs per loop

Counter.update is way faster than +=, even if you re-add the overhead to create a new Counter:

In [10]: %timeit c=Counter(a) ; c.update(Counter(b))
1000 loops, best of 3: 1.16 ms per loop

In [11]: %timeit c=Counter(a) ; c.update({x:1 for x in b})
1000 loops, best of 3: 844 µs per loop


Would it be welcome if i add __iadd__ and __isub__ which just invoke update and subtract?

One reason not to have this is the current __add__ and __sub__ behavior of min values > 0, which from a set-theoretic standpoint makes sense, but has a certain potential of confusing developers who just use it as a simplistic Stats module.

----------
messages: 236484
nosy: joern
priority: normal
severity: normal
status: open
title: Counter.__iadd__ falls back to __add__ with bad performance impact
type: performance
versions: Python 2.7, Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue23509>
_______________________________________


More information about the New-bugs-announce mailing list