[Numpy-discussion] Picking rows with the first (or last) occurrence of each key

Skip Montanaro skip.montanaro at gmail.com
Mon Jul 4 19:27:14 EDT 2016


> Any way that you can make your keys numeric? Then you can run np.diff on
> that first column, and use the indices of nonzero entries (np.flatnonzero)
> to know where values change. With a +1/-1 offset (that I am too lazy to
> figure out right now ;) you can then index into the original rows to get
> either the first or last occurrence of each run.
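
For reference, the offsets in the suggestion above could work out roughly as
follows. This is only a sketch: it assumes the rows are already ordered so
that equal keys sit next to each other, and the helper name run_boundaries
and the sample keys are made up for illustration.

```python
import numpy as np

def run_boundaries(keys):
    """Return (first_idx, last_idx) for each run of equal values in keys.

    Assumes equal keys are adjacent (e.g. the rows are sorted by key).
    """
    keys = np.asarray(keys)
    if keys.size == 0:
        empty = np.array([], dtype=np.intp)
        return empty, empty
    # np.diff is nonzero at position i when keys[i] != keys[i + 1], so these
    # are the indices of the last row of every run except the final one.
    change = np.flatnonzero(np.diff(keys))
    # A new run starts one row after each change, plus the very first row.
    first_idx = np.concatenate(([0], change + 1))
    # Each change closes a run; the final run closes at the last row.
    last_idx = np.concatenate((change, [keys.size - 1]))
    return first_idx, last_idx

# Made-up numeric keys, grouped into runs.
keys = np.array([3, 3, 3, 7, 7, 9, 12, 12])
first_idx, last_idx = run_boundaries(keys)
```

Indexing the original rows with first_idx or last_idx then picks the first or
last occurrence of each run, so the offset is +1 for the first rows and 0 for
the last rows.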

I'll give it some thought, but one of the elements of the key is definitely
a short string (fewer than six characters).  Hashing it probably wouldn't
work: the chance of collisions is too great.
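
One collision-free alternative to hashing might be to let np.unique assign an
integer code to each distinct string via return_inverse=True. The sketch
below is untested against the real data; the column names (names, ids) and
their values are invented, and it assumes a two-part key whose rows are
already grouped into runs.

```python
import numpy as np

# Hypothetical two-part key: a short string plus a number, grouped into runs.
names = np.array(["abc", "abc", "abc", "xyz", "xyz", "q"])
ids = np.array([1, 1, 2, 2, 2, 2])

# uniq holds the sorted distinct strings; codes holds one integer per row such
# that uniq[codes] reproduces names. Distinct strings get distinct codes.
uniq, codes = np.unique(names, return_inverse=True)

# The key changes between rows i and i + 1 if either part changes.
changed = (np.diff(codes) != 0) | (np.diff(ids) != 0)
change = np.flatnonzero(changed)

# First and last row index of each run of identical keys.
first_idx = np.concatenate(([0], change + 1))
last_idx = np.concatenate((change, [len(names) - 1]))
```

Because the codes come from the set of distinct values rather than from a
hash function, two different strings can never share a code.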

S




