how to remove duplicated elements in a list?

Tom Anderson twic at urchin.earth.li
Tue Dec 20 19:00:57 EST 2005


On Mon, 19 Dec 2005, Brian van den Broek wrote:

> bonono at gmail.com said unto the world upon 2005-12-19 02:27:
>> Steve Holden wrote:
>> 
>>> Kevin Yuan wrote:
>>> 
>>>> How to remove duplicated elements in a list? eg.
>>>> [1,2,3,1,2,3,1,2,1,2,1,3] -> [1,2,3]?
>>>> Thanks!!
>>> 
>>>  >>> list(set([1,2,3,1,2,3,1,2,1,2,1,3]))
>>> [1, 2, 3]
>> 
>> Would this have the chance of changing the order ? Don't know if he
>> wants to maintain the order or don't care though.
>
> For that worry:
>
>>>> orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3]
>>>> new_list = list(set(orig_list))
>>>> new_list.sort(cmp= lambda x,y: cmp(orig_list.index(x), 
> orig_list.index(y)))
>>>> new_list
> [3, 1, 2]
>>>>

Ah, that gives me an idea:

>>> import operator
>>> orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3]
>>> new_list = map(operator.itemgetter(1),
...     filter(lambda (i, x): i == orig_list.index(x),
...             enumerate(orig_list)))
>>> new_list
[3, 1, 2]

This is a sort of decorate-fondle-undecorate, where the fondling is 
filtering on whether this is the first occurrance of the the value. This 
is, IMHO, a clearer expression of the original intent - "how can i remove 
such-and-such elements from a list" is begging for filter(), i'd say.

My code is O(N**2), a bit better than your O(N**2 log N), but we can get 
down to O(N f(N)), where f(N) is the complexity of set.__in__ and set.add, 
using a lookaside set sort of gizmo:

>>> orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3]
>>> seen = set()
>>> def unseen(x):
...     if (x in seen):
...             return False
...     else:
...             seen.add(x)
...             return True
...
>>> new_list = filter(unseen, orig_list)
>>> new_list
[3, 1, 2]

Slightly tidier like this, i'd say:

>>> orig_list = [3,1,2,3,1,2,3,1,2,1,2,1,3]
>>> class seeingset(set):
...     def see(self, x):
...             if (x in self):
...                     return False
...             else:
...                     self.add(x)
...                     return True
...
>>> new_list = filter(seeingset().see, orig_list)
>>> new_list
[3, 1, 2]


tom

-- 
Hit to death in the future head



More information about the Python-list mailing list