dict.has_key(x) versus 'x in dict'

Thu Dec 7 17:06:50 EST 2006

skip at pobox.com wrote:

>     Hendrik> why? - the main point is actually that the code worked,
>     Hendrik> and was stable - that should make you proud, not
>     Hendrik> embarrassed.  that there is far too much emphasis on
>     Hendrik> doing things the quickest way - as long as it works, and
>     Hendrik> is fast enough, its not broken, so don't fix it...
> 
> That's the rub.  It wasn't fast enough.  I only realized that had
> been a problem once I fixed it though.

Yep - it's so often the case that scaling problems don't turn up in
developer testing, but only in the field ... and sometimes in much less
obvious ways than Skip's case. I had one where the particulars of the
data I was using meant that what was meant to be an O(1) HashMap lookup
(yes, Java - boo) was essentially an O(n) lookup much of the time. For a
lot of the data, the key chosen ended up hashing to the same value
(resulting in the hashmap having very long chains). By using additional data in the
key as a tie-breaker, I got back my normal-case O(1) behaviour.
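
A rough sketch of that effect, in Python rather than Java, for anyone who wants to see it happen. The class names, the "group" and "serial" fields and the eq_calls counter are all made up for illustration; this is not the code from the case above. It assumes CPython's dict, which only calls __eq__ on entries whose stored hash matches.

```python
class CoarseKey:
    """Hypothetical key whose hash ignores the field that tells keys apart."""

    eq_calls = 0  # shared counter: how many times __eq__ has run

    def __init__(self, group, serial):
        self.group = group
        self.serial = serial

    def __eq__(self, other):
        CoarseKey.eq_calls += 1
        return (self.group, self.serial) == (other.group, other.serial)

    def __hash__(self):
        # Every key in the same group hashes alike, so they all collide.
        return hash(self.group)


class FineKey(CoarseKey):
    """Same key, but the hash also uses the tie-breaker field."""

    def __hash__(self):
        return hash((self.group, self.serial))


def comparisons_for_lookup(key_type, n=500):
    """Build a dict of n keys in one group, then count __eq__ calls for one lookup."""
    table = {key_type("same", i): i for i in range(n)}
    CoarseKey.eq_calls = 0  # ignore comparisons made while building the dict
    value = table[key_type("same", n - 1)]  # look up the last key inserted
    assert value == n - 1
    return CoarseKey.eq_calls


coarse_cost = comparisons_for_lookup(CoarseKey)  # grows with n
fine_cost = comparisons_for_lookup(FineKey)      # stays small
print(coarse_cost, fine_cost)
```

The exact counts depend on the interpreter, but the coarse hash should need hundreds of comparisons for a single lookup, while the tie-broken hash should need about one.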

In this particular case though ...

    dict.has_key(x)
    x in dict.keys()

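one is obviously clearer than the other, as well as being faster:
has_key() is a single hash lookup, while "x in dict.keys()" first builds
a list of every key and then scans it.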

Of course (getting back to the original question), in current Python:

    x in dict

is obviously the clearest and fastest method, and should always be
preferred over either of the above.
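
For completeness, a small sketch of how the preferred spelling behaves. The "stock" dictionary is hypothetical sample data, and the last line is only there because Python 3 later removed has_key() altogether.

```python
stock = {"spam": 3, "eggs": 12}  # hypothetical sample data

has_spam = "spam" in stock   # True: "spam" is a key
has_ham = "ham" in stock     # False: no such key
has_three = 3 in stock       # False: "in" tests keys, not values

# The older spelling exists on Python 2 dicts only; Python 3 removed it.
legacy_available = hasattr(stock, "has_key")

print(has_spam, has_ham, has_three, legacy_available)
```

Since "in" works unchanged on both Python 2 and Python 3, it is also the spelling that survives a port.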

Tim Delaney