Which is faster? (if not b in m) or (if m.count(b) > 0)
Kent Johnson
kent at kentsjohnson.com
Wed Feb 15 12:25:01 EST 2006
Steven D'Aprano wrote:
> On Wed, 15 Feb 2006 08:44:10 +0100, Marc 'BlackJack' Rintsch wrote:
>>``if not b in m`` looks at each element of `m` until it finds `b` in it
>>and stops then. Assuming `b` is in `m`, otherwise all elements of `m` are
>>"touched".
>>
>>``if m.count(b) > 0`` will always go through all elements of `m` in the
>>`count()` method.
>
>
> But the first technique executes in (relatively slow) pure Python, while
> the count method executes (relatively fast) C code. So even though count
> may do more work, it may do it faster.
'a in b' actually takes fewer bytecodes because 'in' has its own
bytecode, whereas b.count requires an attribute lookup and a function call:
>>> import dis
>>> def foo():
...     a in b
...
>>> def bar():
...     b.count(a)
...
>>> dis.dis(foo)
  2           0 LOAD_GLOBAL              0 (a)
              3 LOAD_GLOBAL              1 (b)
              6 COMPARE_OP               6 (in)
              9 POP_TOP
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
>>> dis.dis(bar)
  2           0 LOAD_GLOBAL              0 (b)
              3 LOAD_ATTR                1 (count)
              6 LOAD_GLOBAL              2 (a)
              9 CALL_FUNCTION            1
             12 POP_TOP
             13 LOAD_CONST               0 (None)
             16 RETURN_VALUE
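
If you want to repeat the comparison on another interpreter, something like the following should work. It is only a sketch: opnames() is a helper name I made up, and the opcode names you get back depend on the Python version (newer releases use a dedicated CONTAINS_OP instead of COMPARE_OP, for instance), so don't expect the listing above to match exactly.

```python
import dis


def foo():
    a in b          # a and b are never defined; the function is only compiled, never called


def bar():
    b.count(a)


def opnames(func):
    """Return the opcode names of func in order (hypothetical helper)."""
    return [ins.opname for ins in dis.get_instructions(func)]


foo_ops = opnames(foo)
bar_ops = opnames(bar)

# The exact names vary by Python version, so just print them for inspection.
print("a in b     :", foo_ops)
print("b.count(a) :", bar_ops)
```

On any version, the thing to look for is that bar has an extra attribute (or method) load for count plus a call instruction, while foo has a single membership opcode.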
There is not much difference in speed when the item is not in the list,
with a slight edge to 'a in b' for a large list:
D:\>python -m timeit -s "lst = range(1000)" "1001 in lst"
10000 loops, best of 3: 55.5 usec per loop
D:\>python -m timeit -s "lst = range(1000)" "lst.count(1001) > 1"
10000 loops, best of 3: 55.5 usec per loop
D:\>python -m timeit -s "lst = range(1000000)" "lst.count(-1) > 1"
10 loops, best of 3: 62.2 msec per loop
D:\>python -m timeit -s "lst = range(1000000)" "(-1) in lst"
10 loops, best of 3: 60.8 msec per loop
Of course, if the item appears in the list, 'a in b' can stop early and has a huge advantage (here the item is near the front):
D:\>python -m timeit -s "lst = range(1000000)" "(1) in lst"
10000000 loops, best of 3: 0.171 usec per loop
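
The reason for the gap can be seen without a stopwatch by counting equality tests. The sketch below assumes CPython's list implementation; Counted and comparisons() are names invented for the illustration, not anything from the standard library. It uses an explicit list of objects, which also sidesteps the fact that on Python 3 range() is no longer a list (you would need list(range(n)) to reproduce the timings above).

```python
class Counted:
    """Wraps a value and counts every == evaluation (hypothetical helper)."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.calls += 1
        return self.value == getattr(other, "value", other)


def comparisons(func):
    """Run func and return how many == tests it triggered."""
    Counted.calls = 0
    func()
    return Counted.calls


items = [Counted(i) for i in range(1000)]
hit = Counted(1)     # equal to the element at index 1
miss = Counted(-1)   # equal to nothing in the list

in_hit = comparisons(lambda: hit in items)
in_miss = comparisons(lambda: miss in items)
count_hit = comparisons(lambda: items.count(hit) > 0)
count_miss = comparisons(lambda: items.count(miss) > 0)

print(in_hit)      # 2: stops as soon as the match is found
print(in_miss)     # 1000: every element is examined
print(count_hit)   # 1000: count() never stops early
print(count_miss)  # 1000
```

So on a miss both forms do the same amount of work, and on a hit 'in' does only as much as it takes to reach the first match.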
Kent