remove duplicates from list *preserving order*

Michael Spencer mahs at telcopartners.com
Thu Feb 3 17:34:14 EST 2005


Steven Bethard wrote:
> I'm sorry, I assume this has been discussed somewhere already, but I 
> found only a few hits in Google Groups...  If you know where there's a 
> good summary, please feel free to direct me there.
> 
> 
> I have a list[1] of objects from which I need to remove duplicates.  I 
> have to maintain the list order though, so solutions like set(lst), etc. 
> will not work for me.  What are my options?  So far, I can see:
> 
> def filterdups(iterable):
>     result = []
>     for item in iterable:
>         if item not in result:
>             result.append(item)
>     return result
> 
> def filterdups(iterable):
>     result = []
>     seen = set()
>     for item in iterable:
>         if item not in seen:
>             result.append(item)
>             seen.add(item)
>     return result
> 
> def filterdups(iterable):
>     seen = set()
>     for item in iterable:
>         if item not in seen:
>             seen.add(item)
>             yield item
> 
> Does anyone have a better[2] solution?
> 
> STeve
> 
> [1] Well, actually it's an iterable of objects, but I can convert it to 
> a list if that's helpful.
> 
> [2] Yes I know, "better" is ambiguous.  If it helps any, for my 
> particular situation, speed is probably more important than memory, so 
> I'm leaning towards the second or third implementation.
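
One difference between the quoted versions is worth spelling out before picking one: the first tests membership against a list, so it only needs == but rescans the result for every item, while the second and third test against a set, which is much faster on average but requires hashable items. A minimal sketch of that trade-off follows; the names filterdups_list and filterdups_set are mine, used only to tell the quoted versions apart.

```python
def filterdups_list(iterable):
    # First quoted version: list membership uses ==, scanning result each time.
    result = []
    for item in iterable:
        if item not in result:
            result.append(item)
    return result


def filterdups_set(iterable):
    # Third quoted version: set membership uses hashing, so items must be hashable.
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


data = [[1], [2], [1], [3]]        # lists are not hashable
print(filterdups_list(data))       # [[1], [2], [3]]
try:
    list(filterdups_set(data))
except TypeError as exc:
    print("set-based version fails:", exc)
```

So if the objects are hashable, the set-based versions are the ones to compare; if not, the list-based one may be the only option of the three.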

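How about letting itertools do the looping? set.add() always returns
None, so the helper below is true only for items already seen: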

 >>> import itertools
 >>> def filterdups3(iterable):
...     seen = set()
...     def _seen(item):
...         # falsy only the first time an item is seen
...         return item in seen or seen.add(item)
...     return itertools.ifilterfalse(_seen, iterable)
...
 >>> list(filterdups3([1,2,2,3,3,3,4,4,4,2,2,5]))
[1, 2, 3, 4, 5]
 >>>
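
For anyone trying this on Python 3, where ifilterfalse was renamed to itertools.filterfalse, here is a minimal sketch of the same idea. The name filterdups_py3 is mine, not something from this thread, and the logic is otherwise unchanged.

```python
import itertools


def filterdups_py3(iterable):
    seen = set()

    def _seen(item):
        # True if item was seen before; otherwise record it and return None,
        # because set.add() always returns None.
        return item in seen or seen.add(item)

    # filterfalse lazily yields the items for which _seen(item) is falsy.
    return itertools.filterfalse(_seen, iterable)


print(list(filterdups_py3([1, 2, 2, 3, 3, 3, 4, 4, 4, 2, 2, 5])))  # [1, 2, 3, 4, 5]
```

Like the generator version quoted above, the result is lazy, so wrap it in list() if a list is needed.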

Michael



