[SciPy-User] pandas: independent row sorting of data frame
Fabrizio Pollastri
f.pollastri at inrim.it
Mon Jan 3 16:21:19 EST 2011
Wes McKinney <wesmckinn <at> gmail.com> writes:
>
> Hi Fabrizio,
>
> I'm not that familiar with xts but I think you need only do:
>
> sort_xs = df.apply(np.sort, axis=1)
> sort_index = df.apply(np.argsort, axis=1)
>
> Using the apply function with the axis argument is preferable to using
> tapply-- that function is still around to support old client code (I
> may add a deprecation warning in the future).
>
> This will only be about as fast as the R counterpart-- it would be
> easy to write a more optimized version, though.
>
> NB many NumPy functions work using the array interface, e.g.:
>
> np.argsort(df, axis=1)
>
> But np.sort isn't one of them.
>
> HTH,
> Wes
>
Hi Wes,
thanks for your hints, but I have a problem with sort.
See the following code.
import numpy as np
from pandas import DataFrame
df = DataFrame({'a': [1,3,1], 'b':[2,2,3], 'c':[3,1,2]})
>>> df
   a  b  c
0  1  2  3
1  3  2  1
2  1  3  2
# The sort index is as expected.
sort_index = df.apply(np.argsort, axis=1)
>>> sort_index
   a  b  c
0  0  1  2
1  2  1  0
2  0  2  1
# The sorted df is not as expected: it is identical to df.
sorted_df = df.apply(np.sort, axis=1)
>>> sorted_df
   a  b  c
0  1  2  3
1  3  2  1
2  1  3  2
What am I missing?
TIA,
Fabrizio
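
For reference, a minimal sketch of one possible workaround, not something confirmed in this thread: sort the underlying NumPy array row by row and wrap the result in a new DataFrame. It assumes that reusing the original column labels purely as positional labels (smallest value first) is acceptable; the name sorted_values is only illustrative.

```python
import numpy as np
from pandas import DataFrame

df = DataFrame({'a': [1, 3, 1], 'b': [2, 2, 3], 'c': [3, 1, 2]})

# Work on the raw ndarray so that no label alignment is involved.
# axis=1 sorts the values within each row independently.
sorted_values = np.sort(df.values, axis=1)

# Rebuild a frame with the same row index. The column labels no longer
# identify the source column; they only mark the rank within the row.
sorted_df = DataFrame(sorted_values, index=df.index, columns=df.columns)

print(sorted_df)
```

With the example data every row is a permutation of 1, 2, 3, so each row of sorted_df comes out as 1, 2, 3. The sort_index frame from above still tells you which original column each sorted value came from.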