[Numpy-discussion] Converting ranks to a Gaussian

Mon Jun 9 22:30:09 EDT 2008

On Mon, Jun 9, 2008 at 7:02 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
> On Monday 09 June 2008 22:06:24 Keith Goodman wrote:
>> On Mon, Jun 9, 2008 at 4:45 PM, Robert Kern <robert.kern at gmail.com> wrote:
>
>> > There are subtleties in computing ranks when ties are involved. Take a
>> > look at the implementation of scipy.stats.rankdata().
>>
>> Good point. I had to deal with ties and missing data. I bet
>> scipy.stats.rankdata() is faster than my implementation.
>
> There's a scipy.stats.mstats.rankdata() that takes care of both ties and
> missing data. Missing data are allocated a rank of either 0 or the average
> rank, depending on some parameter.

That sounds interesting. But I can't find it:

>> import scipy
>> from scipy import stats
>> scipy.stats.m
scipy.stats.mannwhitneyu  scipy.stats.mean          scipy.stats.mielke        scipy.stats.moment        scipy.stats.morestats
scipy.stats.maxwell       scipy.stats.median        scipy.stats.mode          scipy.stats.mood          scipy.stats.mvn
>> scipy.stats.morestats.r
scipy.stats.morestats.r_     scipy.stats.morestats.ravel

In my implementation I leave the missing values as missing. I think
that would be a nice option for rankdata.
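
A minimal sketch of that option, in plain Python so it runs without
scipy. It assumes missing values are encoded as NaN and that tied values
share the average of the ranks they span. The function name
rank_keep_missing is hypothetical; this is not the scipy implementation,
nor necessarily how my own code does it.

```python
import math


def rank_keep_missing(values):
    """Return 1-based average ranks; NaN inputs stay NaN in the output."""
    # Indices of the non-missing entries, ordered by their values.
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    valid.sort(key=lambda i: values[i])

    # Start with everything missing; only valid entries get a rank.
    ranks = [math.nan] * len(values)

    start = 0
    while start < len(valid):
        # Extend the run [start, stop) over all entries tied with the first.
        stop = start + 1
        while stop < len(valid) and values[valid[stop]] == values[valid[start]]:
            stop += 1
        # Sorted positions start+1 .. stop share their average rank.
        average = (start + 1 + stop) / 2.0
        for i in valid[start:stop]:
            ranks[i] = average
        start = stop
    return ranks


data = [10.0, float("nan"), 20.0, 10.0, 30.0]
ranks = rank_keep_missing(data)
print(ranks)
```

Here the two 10.0 entries tie for ranks 1 and 2, so each gets 1.5, and
the NaN entry is left as NaN instead of being assigned a rank of 0 or
the average rank.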