[issue18032] Optimization for set/frozenset.issubset()

Serhiy Storchaka report at bugs.python.org
Wed May 27 12:44:58 CEST 2015


Serhiy Storchaka added the comment:

In C implementation no need to create set object seen. More efficient way is to use bit array.

Here is a patch that uses this approach.

./python -m timeit -s "s1 = set(range(1000))" "s1.issubset(range(1000))"
Unpatched  : 10000 loops, best of 3: 115 usec per loop
bru's patch: 10000 loops, best of 3: 114 usec per loop
My patch   : 10000 loops, best of 3: 92.6 usec per loop

./python -m timeit -s "s1 = set(range(1000))" "s1.issubset(range(1, 1000))"
Unpatched  : 10000 loops, best of 3: 73.4 usec per loop
bru's patch: 10000 loops, best of 3: 114 usec per loop
My patch   : 10000 loops, best of 3: 93 usec per loop

./python -m timeit -s "s1 = set(range(100))" "s1.issubset(range(1, 1000))"
Unpatched  : 10000 loops, best of 3: 73.6 usec per loop
bru's patch: 10000 loops, best of 3: 89 usec per loop
My patch   : 10000 loops, best of 3: 62.4 usec per loop

./python -m timeit -s "s1 = set(range(100))" "s1.issubset(range(1000))"
Unpatched  : 10000 loops, best of 3: 75.5 usec per loop
bru's patch: 100000 loops, best of 3: 8.77 usec per loop
My patch   : 100000 loops, best of 3: 5.34 usec per loop

./python -m timeit -s "s1 = set('a'); s2 = ['a'] + ['b'] * 10000" "s1.issubset(s2)"
Unpatched  : 1000 loops, best of 3: 326 usec per loop
bru's patch: 1000000 loops, best of 3: 0.394 usec per loop
My patch   : 1000000 loops, best of 3: 0.409 usec per loop

./python -m timeit -s "s1 = set('a'); from itertools import repeat" "s1.issubset(repeat('a'))"
Unpatched  : NEVER FINISHES
bru's patch: 1000000 loops, best of 3: 0.794 usec per loop
My patch   : 1000000 loops, best of 3: 0.725 usec per loop

----------
nosy: +serhiy.storchaka

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18032>
_______________________________________


More information about the Python-bugs-list mailing list