[Numpy-discussion] Find indices of largest elements

Wed Apr 14 16:56:43 EDT 2010

On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath <Nikolaus at rath.org> wrote:
> Keith Goodman <kwgoodman at gmail.com> writes:
>> On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman <kwgoodman at gmail.com> wrote:
>>> On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath <Nikolaus at rath.org> wrote:
>>>> Hello,
>>>>
>>>> How do I best find out the indices of the largest x elements in an
>>>> array?
>>>>
>>>> Example:
>>>>
>>>> a = [ [1,8,2], [2,1,3] ]
>>>> magic_function(a, 2) == [ (0,1), (1,2) ]
>>>>
>>>> Since the largest 2 elements are at positions (0,1) and (1,2).
>>>
>>> Here's a quick way to rank the data if there are no ties and no NaNs:
>>
>> ...or if you need the indices in order:
>>
>>>> shape = (3,2)
>>>> x = np.random.rand(*shape)
>>>> x
>> array([[ 0.52420123,  0.43231286],
>>        [ 0.97995333,  0.87416228],
>>        [ 0.71604075,  0.66018382]])
>>>> r = x.reshape(-1).argsort().argsort()
>
> I don't understand why this works. Why do you call argsort() twice?
> Doesn't that give you the indices of the sorted indices?

It is confusing. Let's look at an example:

>> x = np.random.rand(4)
>> x
   array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])

If we call argsort once we get the index that will sort x:

>> idx = x.argsort()
>> idx
   array([2, 0, 3, 1])
>> x[idx]
   array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])

Notice that the first element of idx is 2. That's because element x[2]
is the min of x. But that's not what we want. We want the first
element to be the rank of the first element of x. So we need to
shuffle idx around so that the order aligns with x. How do we do that?
We sort it!

>> idx.argsort()
   array([1, 3, 0, 2])

The min value of x is x[2], that's why 2 is the first element of idx
which means that we want ranked(x) to contain a 0 at position 2 which
it does.

Bah, it's all magic.