Pythonic way for missing dict keys

Alex Martelli aleax at mac.com
Sat Jul 21 10:37:12 EDT 2007


Carsten Haese <carsten at uniqsys.com> wrote:

> On Sat, 21 Jul 2007 09:22:32 +0530, Rustom Mody wrote
> > Can someone who knows about python internals throw some light on why
> > >>> x in dic
> > is cheaper than
> > >>> dic.has_key(x)
> > 
> > ??
> 
> I won't claim to know Python internals, but compiling and disassembling the
> expressions in question reveals the reason:
> 
> >>> from compiler import compile
> >>> from dis import dis
> >>> dis(compile("dic.has_key(x)","","eval"))
>   1           0 LOAD_NAME                0 (dic)
>               3 LOAD_ATTR                1 (has_key)
>               6 LOAD_NAME                2 (x)
>               9 CALL_FUNCTION            1
>              12 RETURN_VALUE
> >>> dis(compile("x in dic","","eval"))
>   1           0 LOAD_NAME                0 (x)
>               3 LOAD_NAME                1 (dic)
>               6 COMPARE_OP               6 (in)
>               9 RETURN_VALUE
> 
> "dic.has_key(x)" goes through an attribute lookup to find the function that
> looks for the key. "x in dic" finds the function more directly.

Yup, it's mostly that, as microbenchmarking can confirm:

brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' 'f(23)'
10000000 loops, best of 3: 0.146 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' '23 in d'
10000000 loops, best of 3: 0.142 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' 'f(23)'
10000000 loops, best of 3: 0.146 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' '23 in d'
10000000 loops, best of 3: 0.142 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' 'd.has_key(23)'
1000000 loops, best of 3: 0.278 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' 'd.has_key(23)'
1000000 loops, best of 3: 0.275 usec per loop

The in operator still appears to have a tiny repeatable advantage (about
4 nanoseconds on my laptop) even over the hoisted method, but the
non-hoisted call, which repeats the attribute lookup on every iteration,
is almost twice as slow (a penalty of over 100 nanoseconds on my laptop).
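
For anyone who wants to repeat the comparison on a current interpreter,
here is a minimal sketch, assuming Python 3, where has_key no longer
exists. It uses d.__contains__ as a stand-in for the method-call form,
which is my substitution and not what was measured above. The helper
names opnames and time_variants are made up for illustration. Exact
opcode names and timings vary by Python version and machine, so the
sketch prints whatever it finds instead of promising particular numbers.

```python
import dis
import timeit


def opnames(expr):
    """Return the bytecode operation names for an expression, in order."""
    code = compile(expr, "", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


def time_variants(number=200_000, repeat=3):
    """Time three membership tests; return best-of-`repeat` seconds per call."""
    # Assumption: d.__contains__ stands in for the old d.has_key method.
    setup = "d = {}; f = d.__contains__"
    stmts = {
        "operator": "23 in d",                 # no attribute lookup at all
        "hoisted": "f(23)",                    # lookup done once, in setup
        "non_hoisted": "d.__contains__(23)",   # lookup repeated every call
    }
    return {
        name: min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)) / number
        for name, stmt in stmts.items()
    }


if __name__ == "__main__":
    print(opnames("d.__contains__(x)"))
    print(opnames("x in d"))
    for name, secs in time_variants().items():
        print(f"{name:12s} {secs * 1e9:8.1f} ns per call")
```

Taking the minimum over the repeats mirrors the "best of 3" figure that
-mtimeit reports, since the fastest run is the one least disturbed by
other activity on the machine.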


Alex


