[Numpy-discussion] Find indices of largest elements

Thu Apr 15 16:48:32 EDT 2010

On Thu, Apr 15, 2010 at 12:41 PM, Nikolaus Rath <Nikolaus at rath.org> wrote:
> Keith Goodman <kwgoodman at gmail.com> writes:
>> On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath <Nikolaus at rath.org> wrote:
>>> Keith Goodman <kwgoodman at gmail.com> writes:
>>>> On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman <kwgoodman at gmail.com> wrote:
>>>>> On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath <Nikolaus at rath.org> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> How do I best find out the indices of the largest x elements in an
>>>>>> array?
>>>>>>
>>>>>> Example:
>>>>>>
>>>>>> a = [ [1,8,2], [2,1,3] ]
>>>>>> magic_function(a, 2) == [ (0,1), (1,2) ]
>>>>>>
>>>>>> Since the largest 2 elements are at positions (0,1) and (1,2).
>>>>>
>>>>> Here's a quick way to rank the data if there are no ties and no NaNs:
>>>>
>>>> ...or if you need the indices in order:
>>>>
>>>> >> shape = (3,2)
>>>> >> x = np.random.rand(*shape)
>>>> >> x
>>>> array([[ 0.52420123,  0.43231286],
>>>>        [ 0.97995333,  0.87416228],
>>>>        [ 0.71604075,  0.66018382]])
>>>> >> r = x.reshape(-1).argsort().argsort()
>>>
>>> I don't understand why this works. Why do you call argsort() twice?
>>> Doesn't that give you the indices of the sorted indices?
>>
>> It is confusing. Let's look at an example:
>>
>> >> x = np.random.rand(4)
>> >> x
>>    array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])
>>
>> If we call argsort once we get the index that will sort x:
>>
>> >> idx = x.argsort()
>> >> idx
>>    array([2, 0, 3, 1])
>> >> x[idx]
>>    array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])
>>
>> Notice that the first element of idx is 2. That's because element x[2]
>> is the min of x. But that's not what we want.
>
> I think that's exactly what I want, the index of the smallest element.
> It also seems to work:
>
> In [3]: x = np.random.rand(3,3)
> In [4]: x
> Out[4]:
> array([[ 0.49064281,  0.54989584,  0.05319183],
>       [ 0.50510206,  0.39683101,  0.22801874],
>       [ 0.04595144,  0.3329171 ,  0.61156205]])
> In [5]: idx = x.reshape(-1).argsort()
> In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ]
> Out[6]: [(1, 0), (0, 1), (2, 2)]

Yes, you are right. My first thought was to approach the problem by
ranking the data, but that is not needed here: the position of an
element's index in the argsort output is already its rank. I guess my
approach was to rank first and ask questions later. Well, at least we
got to see Anne's fast ranking method.
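
For reference, your single-argsort approach can be wrapped up as a
function. This is only a sketch: the name largest_indices is made up
for illustration, and it assumes no NaNs and 1 <= n <= a.size. It
reverses the tail of the argsort output so the largest element comes
first.

```python
import numpy as np

def largest_indices(a, n):
    """Return index tuples of the n largest elements of a, largest first.

    Hypothetical helper; assumes no NaNs and 1 <= n <= a.size.
    """
    a = np.asarray(a)
    # argsort on the flattened array is ascending, so the last n entries
    # point at the n largest values; reverse them to get largest first.
    flat = a.reshape(-1).argsort()[-n:][::-1]
    # Convert each flat index back into a multi-dimensional index.
    return [tuple(int(j) for j in np.unravel_index(i, a.shape)) for i in flat]

a = [[1, 8, 2], [2, 1, 3]]
result = largest_indices(a, 2)
print(result)  # [(0, 1), (1, 2)]
```

That matches the magic_function(a, 2) example from your first mail.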

>
> So why the additional complication with the second argsort? I just don't
> get it...
>
>> We want the first
>> element to be the rank of the first element of x.
>
> I'm not quite sure why we want that...?
>
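For what it's worth, here is the difference between one and two
argsorts, as a small sketch on the four-element array from above (the
values are typed in by hand). The first argsort answers "which element
goes in position k of the sorted array?"; the second answers "what is
the rank of element k?".

```python
import numpy as np

x = np.array([0.37412289, 0.68248559, 0.12935131, 0.42510212])

idx = x.argsort()     # idx[k] is the position in x of the k-th smallest value
rank = idx.argsort()  # rank[k] is the rank (0 = smallest) of x[k]

print(idx)   # [2 0 3 1]
print(rank)  # [1 3 0 2]
```

For picking out the largest elements, idx alone is enough; rank is
only useful if you want to know where each element stands.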
>
> Best,
>
>   -Nikolaus
>
> --
>  »Time flies like an arrow, fruit flies like a Banana.«
>
>  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
>