[Numpy-discussion] Efficient removal of duplicates: Numpy discussion board

Daran L. Rife drife at ucar.edu
Sun Mar 29 16:28:37 EDT 2009


Marjolaine,

Solution:  unique_index = [i for i, x in enumerate(l) if not i or x != l[i-1]]

Remember that enumerate gives the index,value pairs of the
items in any iterable object.

Try it for yourself. Here's the output from my IDLE session.

In [1]: l =  [(1,1), (2,3), (1, 1), (4,5), (2,3), (10,21)]

In [2]: l.sort()

In [3]: l
Out[3]: [(1, 1), (1, 1), (2, 3), (2, 3), (4, 5), (10, 21)]

In [4]: unique_index = [i for i, x in enumerate(l) if not i or x != l[i-1]]

In [5]: unique_index
Out[5]: [0, 2, 4, 5]
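
Note that the indices in Out[5] refer to positions in the *sorted* list,
not in the original array. If you need positions in the original input
(for example to pick out associated z values), one possible approach is to
sort the positions instead of the values. The following is only a sketch:
the function name and the z values are made up for illustration and are
not part of the original code.

```python
def unique_with_original_indices(points):
    """Return (unique_points, first_indices); indices refer to `points`."""
    # Sort the positions by value. Ties are broken by position, so the
    # first occurrence of each duplicate leads its run.
    order = sorted(range(len(points)), key=lambda k: (points[k], k))
    # Same test as before: keep an entry if it is the first one, or if it
    # differs from its predecessor in sorted order.
    keep = [k for j, k in enumerate(order)
            if not j or points[k] != points[order[j - 1]]]
    return [points[k] for k in keep], keep


a = [(1, 1), (2, 3), (1, 1), (4, 5), (2, 3), (10, 21)]
z = [0.5, 1.5, 0.5, 2.5, 1.5, 3.5]  # hypothetical z values, one per point

unique, index = unique_with_original_indices(a)
z_unique = [z[k] for k in index]

print(unique)    # [(1, 1), (2, 3), (4, 5), (10, 21)]
print(index)     # [0, 1, 3, 5]
print(z_unique)  # [0.5, 1.5, 2.5, 3.5]
```

If you are on a recent NumPy (1.13 or later, as far as I know), then
np.unique(a, axis=0, return_index=True) should return the unique rows
together with the same first-occurrence indices.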


BTW, I'm posting my response to the numpy-discussion group
so others may benefit. It's best to address your questions
to the group, as individuals are not always available to
answer your question in a timely manner. And by posting your
message to the group, you draw from a large body of very
knowledgeable people who will gladly help you.


Daran

--

> I saw your message on the numpy discussion board regarding the solution for the
> efficient removal of duplicates. I have the same problem but need to return
> the indices of the values as an input with associated z values.
> I was wondering if there was any way to have the method you proposed return the
> indices of the duplicate (or alternatively unique) values in a.
>
>
> Here is the piece of code that you suggested at the time (Re: [Numpy-discussion]
> Efficient removal of duplicates, posted on Tue, 16 Dec 2008 01:10:00 -0800)
>
>
> ---------------------------------------------
> import numpy as np
>
> a = [(x0,y0), (x1,y1), ...] # A numpy array, but could be a list
> l = a.tolist()
> l.sort()
> unique = [x for i, x in enumerate(l) if not i or x != l[i-1]] # <----
> a_unique = np.asarray(unique)
>
> ---------------------------------------------
>
> Best regards, marjolaine