listerator clonage
Michael Spencer
mahs at telcopartners.com
Fri Feb 11 22:58:09 EST 2005
Cyril BAZIN wrote:
> Hello,
>
> I want to build a function which returns the values that appear two or
> more times in a list:
This is very similar to removing duplicate items from a list, which was the
subject of a long recent thread full of suggested approaches.
Here's one way to do what you want:
>>> l = [1, 7, 3, 4, 3, 2, 1]
>>> seen = set()
>>> set(x for x in l if x in seen or seen.add(x))
set([1, 3])
>>>
This is a 'generator expression' applied as an argument to the set constructor.
It relies on the fact that seen.add returns None, which is always false, so an
item passes the filter only if it is already in seen.
This is equivalent to:
>>> def _generate_duplicates(iterable):
...     seen = set()
...     for x in iterable:
...         if x in seen: # it's a duplicate
...             yield x
...         else:
...             seen.add(x)
...
>>> generator = _generate_duplicates(l)
>>> generator
<generator object at 0x16C114B8>
>>> set(generator)
set([1, 3])
>>> # In case you want to preserve the order and number of the duplicates, you
>>> # would use a list
>>> generator = _generate_duplicates(l)
>>> list(generator)
[3, 1]
>>>
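To see why the one-line filter works, here is a small sketch of the
short-circuit behaviour it depends on. The helper name keep is hypothetical and
only there for illustration; the code assumes a Python version with a built-in
set type.

```python
seen = set()

# set.add mutates the set in place and returns None, which is falsy
result_of_add = seen.add(1)

def keep(x, seen):
    # Hypothetical helper mirroring the filter in the generator expression:
    # True for a repeat; otherwise it records x and evaluates to None (falsy)
    return x in seen or seen.add(x)

first_time = keep(7, seen)    # 7 not seen yet: it is added, filter is falsy
second_time = keep(7, seen)   # 7 already seen: filter is True
```

So the first occurrence of each value is swallowed by the filter (while being
recorded as a side effect), and only later occurrences get through.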
>
> So, I decided to write a little example which doesn't work:
> #l = [1, 7, 3, 4, 3, 2, 1]
> #i = iter(l)
> #for x in i:
> #    j = iter(i)
> #    for y in j:
> #        if x == y:
> #            print x
>
> I thought that the instruction 'j = iter(i)' created a new iterator 'j'
> based on 'i' (some kind of clone). I wrote this little test which shows
> that 'j = iter(i)' is the same as 'j = i' (that makes me sad):
I don't think your algorithm would work even if iter(iterator) did return a copy
or separate iterator. If, however, you do have an algorithm that needs that
capability, you can use itertools.tee.
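A minimal sketch of the difference, assuming a Python version where the built-in
next() is available (the variable names are only illustrative): iter() applied
to an iterator hands back the same object, while itertools.tee returns
iterators that advance independently.

```python
from itertools import tee

l = [1, 7, 3, 4, 3, 2, 1]
i = iter(l)

# iter() on an iterator returns the iterator itself, not a copy
same = iter(i) is i

# tee() returns independent iterators over the same underlying stream
a, b = tee(i)
first_of_a = next(a)      # advances only a
remaining_b = list(b)     # b still sees every item
remaining_a = list(a)     # a continues from where it left off
```

Once tee has been called, the original iterator i should not be advanced
directly, since the tee objects would not be told about the items it consumed.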
Cheers
Michael