[Tutor] Re: Testing if a number occurs more than once [dict version]

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Wed Dec 4 21:38:06 2002


On Tue, 3 Dec 2002, Scot Stevenson wrote:

> Uh, I may have screwed up here:
>
> ===============================
> def morethanone(L):
>     numbersfound = {}
>     for number in L:
>         if number in numbersfound:
>             return 1
>         else:
>             numbersfound[number]=1
>     return 0
> ===============================
>
> After rereading some stuff about dictionaries, I think the line
>
> 	if number in numbersfound:
>
> probably should be
>
> 	if numbersfound.has_key(number):


Hi Scot,

No, no, you should have trusted your instincts: you had the right idea in
the first place.  *grin*


Checking to see if an key is in a dictionary, is exactly the right thing
to do: dictionaries are designed so that the act of looking a key up is
very fast.

That is, it's not the 'in' operation itself that's inherently expensive:
the data structure that we apply 'in' on makes more of a difference.

For lists, 'in' requires a scan, element by element, through the
container, till we either hit the thing we're looking for, or fall off our
list.  For dictionaries, 'in' requires a single lookup: it's either in
there, or it's not.


So the operation of membership testing --- 'in' --- itself doesn't have a
explicit cost associated to it: it's the combination of the operation and
the data structure that allows us to measure performance.



> It seems that "in" does a linear search (just what we were trying to
> avoid) why "has_key" uses the hash table.

> def morethanone(L):
>     numbersfound = {}
>     for number in L:
>         if number in numbersfound:
>             return 1
>         else:
>             numbersfound[number]=1
>     return 0

Since 'numbersfound' is a dictionary, the 'in' operation will be fast, so
no worries there.


Good luck to you!