dict.has_key(x) versus 'x in dict'

Wed Dec 6 09:02:38 EST 2006

Bjoern Schliessmann wrote:

> Paul Melis wrote:
> 
>> I've always been using the has_key() method to test if a
>> dictionary contains a certain key. Recently I tried the same using
>> 'in', e.g.
>> 
>> d = { ... }
>> if k in d:
>>     ...
> 
> Wouldn't be "if k in d.keys()" be the exact replacement?

No, 'k in d' is equivalent to 'd.has_key(k)', only with less (constant)
overhead for the function call. 'k in d.keys()' on the other hand creates a
list of keys which is then searched linearly -- about the worst thing you
can do about both speed and memory footprint. Some numbers:

$ python2.5 -m timeit -s"N = 1; d = dict.fromkeys(range(N))" "N in d.keys()"
1000000 loops, best of 3: 0.693 usec per loop
$ python2.5 -m timeit -s"N = 1000; d = dict.fromkeys(range(N))" "N in
d.keys()"
10000 loops, best of 3: 77.2 usec per loop # ouch!

$ python2.5 -m timeit -s"N = 1; d = dict.fromkeys(range(N))" "N in d"
10000000 loops, best of 3: 0.192 usec per loop
~ $ python2.5 -m timeit -s"N = 1000; d = dict.fromkeys(range(N))" "N in d"
10000000 loops, best of 3: 0.192 usec per loop

$ python2.5 -m timeit -s"N = 1; d = dict.fromkeys(range(N))" "d.has_key(N)"
1000000 loops, best of 3: 0.376 usec per loop
$ python2.5 -m timeit -s"N = 1000; d = dict.fromkeys(range(N))"
"d.has_key(N)"
1000000 loops, best of 3: 0.385 usec per loop

Peter