[Tutor] Looking for duplicates within a list
Peter Otten
__peter__ at web.de
Sat Jun 12 13:19:37 CEST 2010
Ken G. wrote:
> I have been working on this problem for several days and I am not making
> any progress. I have a group of 18 number, in ascending order, within a
> list. They ranged from 1 to 39. Some numbers are duplicated as much as
> three times or as few as none.
>
> I started with one list containing the numbers. For example, they are
> listed as like below:
>
> a = [1, 2, 3, 3, 4]
>
> I started off with using a loop:
>
> for j in range (0, 5):
> x = a[0] # for example, 1
>
> How would I compare '1' with 2, 3, 3, 4?
>
> Do I need another duplicated list such as b = a and compare a[0] with
> either b[0], b[1], b[2], b[3], b[4]?
>
> Or do I compare a[0] with a[1], a[2], a[3], a[4]?
>
> In any event, if a number is listed more than once, I would like to know
> how many times, such as 2 or 3 times. For example, '3' is listed twice
> within a list.
I haven't seen a solution that takes advantage of the fact that the numbers
are sorted, so here's one:
>>> items = [1, 2, 3, 3, 4]
>>> value = items[0]
>>> n = 0
>>> result = []
>>> for item in items:
... if value != item:
... print value, "-->", n
... value = item
... n = 1
... else:
... n += 1
...
1 --> 1
2 --> 1
3 --> 2
>>> print value, "-->", n
4 --> 1
Python has a cool feature called generator that allows you to encapsulate
the above nicely:
>>> def count_runs(items):
... it = iter(items)
... value = next(it)
... n = 1 # next consumes one item
... for item in it: # the for loop starts with the second item
... if value != item:
... yield value, n
... value = item
... n = 1
... else:
... n += 1
... yield value, n
...
>>> for v, n in count_runs([1, 1, 1, 2, 2, 3, 4, 4, 4, 4]):
... print v, "-->", n
...
1 --> 3
2 --> 2
3 --> 1
4 --> 4
Finally, there is itertools.groupby() which allows you to simplify the
implementation count_runs():
>>> from itertools import groupby
>>> def count_runs2(items):
... for key, group in groupby(items):
... yield key, sum(1 for _ in group)
...
>>> for v, n in count_runs2([1, 1, 1, 2, 2, 3, 4, 4, 4, 4]):
... print v, "-->", n
...
1 --> 3
2 --> 2
3 --> 1
4 --> 4
Peter
More information about the Tutor
mailing list