listerator clonage

Michael Spencer mahs at telcopartners.com
Fri Feb 11 22:58:09 EST 2005


Cyril BAZIN wrote:
> Hello, 
> 
> I want to build a function which returns the values which appear two or
> more times in a list:

This is very similar to removing duplicate items from a list, which was the 
subject of a long recent thread full of suggested approaches.
Here's one way to do what you want:

 >>> l = [1, 7, 3, 4, 3, 2, 1]
 >>> seen = set()
 >>> set(x for x in l if x in seen or seen.add(x))
set([1, 3])
 >>>

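This is a 'generator expression' passed as the argument to the set constructor. 
It relies on the fact that seen.add returns None, which is always false: an item 
seen for the first time fails the 'x in seen' test, is added to seen as a side 
effect, and is filtered out, while any later occurrence passes the test.

This is equivalent to: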

 >>> def _generate_duplicates(iterable):
...     seen = set()
...     for x in iterable:
...         if x in seen: # it's a duplicate
...             yield x
...         else:
...             seen.add(x)
...
 >>> generator = _generate_duplicates(l)
 >>> generator
<generator object at 0x16C114B8>
 >>> set(generator)
set([1, 3])

 >>> # In case you want to preserve the order and number of the duplicates, you
 >>> # would use a list
 >>> generator = _generate_duplicates(l)
 >>> list(generator)
[3, 1]
 >>>
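
For anyone who wants this as a reusable function, here is a minimal sketch of the 
same short-circuit trick. The function name 'duplicates' is my own, not from any 
library, and the syntax assumes a current Python (set comprehension, print as a 
function), so the output repr differs from the session above:

```python
def duplicates(iterable):
    """Return the set of values that occur two or more times in iterable."""
    seen = set()
    # seen.add(x) returns None (false), so the `or` clause records x the
    # first time it appears without ever letting it through the filter.
    return {x for x in iterable if x in seen or seen.add(x)}

# set.add always returns None, which is what the filter relies on.
assert set().add(1) is None

l = [1, 7, 3, 4, 3, 2, 1]
print(sorted(duplicates(l)))  # [1, 3]
```

The items must be hashable, since they are stored in a set.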

> 
> So, I decided to write a little example which doesn't work:
> #l = [1, 7, 3, 4, 3, 2, 1]
> #i = iter(l)
> #for x in i:
> #    j = iter(i)
> #    for y in j:
> #        if x == y:
> #            print x
> 
> I thought that the instruction 'j = iter(i)' creates a new iterator 'j'
> based on 'i' (some kind of clone). I wrote this little test which shows
> that 'j = iter(i)' is the same as 'j = i' (that makes me sad):

I don't think your algorithm would work even if iter(iterator) did return a copy 
or separate iterator.  If, however, you do have an algorithm that needs that 
capability, you can use itertools.tee.
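
A small sketch of the difference, in case it helps. The variable names are only 
illustrative:

```python
from itertools import tee

l = [1, 7, 3, 4, 3, 2, 1]

i = iter(l)
# Calling iter() on an iterator hands back the same object, not a copy.
assert iter(i) is i

# tee() returns independent iterators over the same underlying data.
# After calling tee(), use only a and b, not the original iterator i.
a, b = tee(i)
first_two = [next(a), next(a)]  # advancing a ...
first_of_b = next(b)            # ... leaves b at the start
print(first_two, first_of_b)    # [1, 7] 1
```

Note that tee has to buffer items that one iterator has consumed and the other 
has not, so it can use a lot of memory if the two drift far apart.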

Cheers
Michael



