[SciPy-User] reversing or avoiding the need to reverse argsort()

Sat Mar 20 12:09:53 EDT 2010

On 20 March 2010 11:37, Vincent Davis <vincent at vincentdavis.net> wrote:

> What I want to do is simple but not sure the best way to go about it.
>
> I have an array a, shape(a) = rXc I need to sort each column
> aargsort = a.argsort(axis=0)  # May use this later
> aSort = a.sort(axis=0)
>
> now average each row
> aSortRM = asort.mean(axis=1)
>
> now replace each col in a row with the row mean.
> is there a better way than this
>
> aWithMeans = ones_like(a)
> for ind in range(r)  # r = number of rows
>     aWithMeans[ind]* aSortRM[ind]
>
> Now I need to undo the sort I did in the first step.
> I think this is where I need the aargsort (argsort of a before any
> modification) I am kinda stumped on this step. Everything I think of is
> rather convoluted.
> Is it possible to do this whole process without actually rearranging the
> original array and just using the index info from the argsort() and would I
> want to do this?
>

First of all, there's no need to use both argsort and sort:

In [2]: A = np.random.randn(1000)

In [3]: I = A.argsort()

In [4]: B = A[I]

In [5]: np.all(np.diff(B)>=0)
Out[5]: True

Now, to invert the permutation I, the easiest way is just to argsort() it;
but this is N log N instead of order N, so there's another, slightly more
cumbersome, way to do it:

In [12]: Ij = np.zeros_like(I)

In [13]: Ij[I] = np.arange(len(I))

In [23]: AA = B[Ij]

In [24]: np.all(AA==A)
Out[24]: True

Alternatively, you can avoid ever creating the inverse permutation:

In [25]: AA[I] = B

In [26]: np.all(AA==A)
Out[26]: True

All of this is somewhat more painful with two-dimensional arrays, since
fancy and normal indexing do not mix well, and argsort returns something
peuliar, but it can be made to work:

In [51]: A = np.random.randn(100,100)

In [52]: I = np.argsort(A, axis=0)

In [54]: B=A[I,np.arange(A.shape[1])]

In [55]: AA = np.zeros_like(A)

In [56]: AA[I,np.arange(A.shape[1])] = B

In [57]: np.all(AA==A)
Out[57]: True

Now for processing B in place, that's a simpler problem:

In [58]: BB = np.zeros_like(B)

In [59]: BB[...] = np.mean(B,axis=1)

Here numpy's broadcasting rules take care of duplicating the entries of B as
needed to fill out BB. In fact we can dispense with BB entirely:

In [60]: AA[I,np.arange(A.shape[1])] = np.mean(B,axis=1)

So your code can be rewritten as the impenetrable three lines:

In [61]: AA = np.zeros_like(A)

In [62]: I = np.argsort(A,axis=0)

In [63]: AA[I,np.arange(A.shape[1])] =
np.mean(A[I,np.arange(A.shape[1])],axis=1)

Or, even less clearly, making I into what I think argsort should probably
return:

In [64]: I = np.argsort(A,axis=0), np.arange(A.shape[1])

In [65]: AA[I] = np.mean(A[I],axis=1)

Anne
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20100320/8c06cf70/attachment.html>