[Tutor] Help with duplicates using for...

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Sun, 30 Jun 2002 01:23:12 -0700 (PDT)


On Sun, 30 Jun 2002, Guess Who? Me wrote:

> list = [4, 5, 7, 8, 9, 1,0,7,10]
> list.sort()
> prev = list[0]
> del list[0]
> for item in list:
>         if prev == item:
>                 print "Duplicate of ",prev," Found"
>         prev = item

> I don't understand this piece of code. I don't get why you have to
> remove the first list value. Could somebody please explain it some ? I
> almost have it.


Hi Travis,

Actually, we don't have to delete the first element from our list ---
that's just how this particular way is doing it.



Sorting the list brings similar values together, so if there are any
duplicates, there'll be adjacent to each other.  The code keeps a "prev"
variable to keep track of the very last "previous" value that it has seen,
and to start the whole thing off, it says that last thing it's seen so far
is list[0].

    prev = list[0]

Now, what we'd like to do is march over the rest of the list, and scan our
eye across the list, one by one.  Since potential duplicates are right
next to each other, this "scanning" should do the trick.

###
for item in list:
    if prev == item:
        print "Duplicate of ",prev," Found"
    prev = item
###


But there's one subtle point: we've got to make sure that we're not
looking at the first element of the list when we start scanning forward!


If it helps, think of two fingers on our list, like this:


###
[0 1 4 5 7 7 8 9 10]
 ^
 |
 |
 |
finger 1
"prev"
###


What we want to avoid is this situation:

###
[ 0    1 4 5 7 7 8 9 10]
 ^ ^
 | |
 | +-------+
 |         |
finger 1   |
"prev"     |
         finger 2
         "next"
##

... where both fingers are pointing at the same first element.  If that
happens, the code will always say there's a duplicate, even when there
isn't one.


The writer of the code thought that deleting the first element would be a
good way to avoid this problem, but it's not the only way.  The delete
itself isn't the only way to solve this problem, but we do have to deal
with the situation somehow.



Hope that makes some sort of sense... *grin* Please feel free to ask more
questions about this.