how to remove multiple occurrences of a string within a list?

attn.steven.kuo at gmail.com
Thu Apr 5 13:52:44 EDT 2007


On Apr 4, 7:43 pm, a... at mac.com (Alex Martelli) wrote:

(snipped)

> A "we-don't-need-no-stinkin'-one-liners" more relaxed approach:
>
> import collections
> d = collections.defaultdict(int)
> for x in myList: d[x] += 1
> list(x for x in myList if d[x]==1)
>
> yields O(N) performance (given that dict-indexing is about O(1)...).
>
> Collapsing this kind of approach back into a "one-liner" while keeping
> the O(N) performance is not easy -- whether this is a fortunate or
> unfortunate occurrence is of course debatable.  If we had a "turn
> sequence into bag" function somewhere (and it might be worth having it
> for other reasons):
>
> def bagit(seq):
>   import collections
>   d = collections.defaultdict(int)
>   for x in seq: d[x] += 1
>   return d
>
> then one-linerness might be achieved in the "gimme nonduplicated items"
> task via a dirty, rotten trick...:
>
> list(x for bag in [bagit(myList)] for x in myList if bag[x] == 1)
>
> ...which I would of course never mention in polite company...
>
> Alex

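For reference, the quoted two-pass approach can be put together as a small runnable sketch. The function name unique_items and the sample data are made up for illustration; bagit is the helper from the quoted message. It keeps only the items that occur exactly once, in their original order:

```python
import collections

def bagit(seq):
    # Turn a sequence into a "bag": a mapping from item to occurrence count.
    d = collections.defaultdict(int)
    for x in seq:
        d[x] += 1
    return d

def unique_items(seq):
    # Hypothetical wrapper name. First pass counts, second pass filters,
    # so the whole thing is O(N) given roughly O(1) dict lookups.
    bag = bagit(seq)
    return [x for x in seq if bag[x] == 1]

myList = ["0024", "haha", "0024", "spam", "haha", "eggs"]
print(unique_items(myList))  # ['spam', 'eggs']
```

On Python versions that have it, collections.Counter(seq) could probably stand in for bagit(seq).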


With a "context managed" set one could have, as a one-liner:

with cset() as s: retval = [ (x, s.add(x))[0] for x in myList if x not in s ]

I've not looked at the underlying implementation of set(), but I
presume that checking set membership is also about O(1).
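There is no cset in the standard library, and the built-in set does not support the with statement, so the one-liner needs a helper. The following is only a minimal sketch of what such a class might look like, assuming it is nothing more than a set subclass with the two context-manager methods; the sample data is made up. CPython sets are hash-based, so membership tests are O(1) on average.

```python
class cset(set):
    # Hypothetical "context managed" set: an ordinary set that can be
    # used in a with statement. This is an assumption, not an existing class.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Nothing to clean up; returning False lets exceptions propagate.
        return False

myList = ["a", "b", "a", "c", "b"]

with cset() as s:
    # s.add(x) returns None; the tuple trick keeps x while recording it in s.
    retval = [(x, s.add(x))[0] for x in myList if x not in s]

print(retval)  # ['a', 'b', 'c']
```

Note that this keeps the first occurrence of every item, whereas the bag-based version in the quoted message keeps only the items that occur exactly once.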

--
Cheers,
Steven




