[New-bugs-announce] [issue36781] Optimize sum() for bools

Serhiy Storchaka report at bugs.python.org
Fri May 3 05:49:33 EDT 2019


New submission from Serhiy Storchaka <storchaka+cpython at gmail.com>:

To count the number of items that satisfy certain condition you can use either

    sum(1 for x in data if pred(x))

or

    sum(pred(x) for x in data)

where pred(x) is a boolean expression.

The latter case is shorter but slower. There are two causes for this:

1. The generator expression needs to generate more items, not only when pred(x) is true, but also when pred(x) is false.

2. sum() is optimized for integers and floats, but not for bools.

The first cause is out of the scope of this issue, but sum() can optimized for bools.

$ ./python -m timeit -s "a = [True] * 10**6" -- "sum(a)"
Unpatched:  10 loops, best of 5: 22.3 msec per loop
Patched:    50 loops, best of 5: 6.26 msec per loop

$ ./python -m timeit -s "a = list(range(10**6))" -- "sum(x % 2 == 0 for x in a)"
Unpatched:  5 loops, best of 5: 89.8 msec per loop
Patched:    5 loops, best of 5: 67.5 msec per loop

$ ./python -m timeit -s "a = list(range(10**6))" -- "sum(1 for x in a if x % 2 == 0)"
5 loops, best of 5: 53.9 msec per loop

----------
components: Interpreter Core
messages: 341330
nosy: rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Optimize sum() for bools
type: performance
versions: Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36781>
_______________________________________


More information about the New-bugs-announce mailing list