[Numpy-discussion] Strange behavior for argsort() and take()

Thu Jun 19 10:26:43 EDT 2008

2008/6/18 Charles R Harris <charlesr.harris at gmail.com>:
>
>
> On Wed, Jun 18, 2008 at 11:48 AM, Anne Archibald <peridot.faceted at gmail.com>
> wrote:
>>
>> 2008/6/18 Stéfan van der Walt <stefan at sun.ac.za>:
>> > 2008/6/18 Anne Archibald <peridot.faceted at gmail.com>:
>> >> In [7]: x.take(x.argsort())
>> >> Out[7]: array([ 0. ,  0.1,  0.2,  0.3])
>> >>
>> >> If you would like to think of it more mathematically, when you feed
>> >> np.argsort() an array that represents a permutation of the numbers
>> >> 0,1,...,n-1, you get back the inverse permutation. When you pass a
>> >> permutation as the argument to x.take(), you apply the permutation to
>> >> x. (You can also invert a permutation by feeding it as indices to
>> >> arange(n)).
>> >>
>> >> I have been tempted to write some support functions for manipulating
>> >> permutations, but I'm not sure how generally useful they would be.
>> >
>> > Do we have an easy way of grabbing the elements out of an array that
>> > correspond to
>> >
>> > argmax(x, axis) ?
>> >
>> > `take` won't do the job; there `axis` has a different meaning.
>>
>> Well, here's a quick hack:
>>
>> def take_axis(X, ix, axis):
>>    XX = np.rollaxis(X,axis)
>>    s = XX.shape
>>    return XX[(ix,)+tuple(np.indices(s[1:]))]
>>
>> And a version that works for argsort()ing:
>>
>> def take_axis_argsort(X, ix, axis):
>>    XX = np.rollaxis(X,axis)
>>    ixix = np.rollaxis(ix,axis)
>>    s = XX.shape
>>    return np.rollaxis(XX[(ixix,)+tuple(np.indices(s)[1:])],0,axis+1)
>>
>> I'm always a little bit leery of using indices(), since it can produce
>> very large arrays:
>
> And fancy indexing too, as it tends to be slow. The axis=None case also
> needs to be handled. I think this is best done in C where the code stacks
> for argsort and arg{max,min} are good starting points. The possibility of
> different integer types for the index array adds a bit of complication.

Actually, apart from the sizes of the index arrays, I don't really
think fancy indexing can be beat. I suspect that one could write
"tuple_indices" that used broadcasting to get the index arrays, if you
were willing to return a tuple. That would bring these up to nearly C
speed, I think.

Of course they're not general routines, but I think with a bit of work
they might serve.

Anne