[SciPy-User] [SciPy-user] Two Sample Kolmogorov-Smirnov Test scipy vs R

amundell andrewhdmundell at gmail.com
Thu Dec 20 08:34:40 EST 2012


The results are numerically correct and the definitions of 'greater' and
'less' are correct and are consistent (not reversed) with R. Not sure if I
made some handling errors yesterday (perhaps samples were reversed) or
changes had been made to the code in scipy.stats.mstats in the meantime. If
it was the former (i.e. from my end), I do apologise for taking up your time
with this.
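
For anyone who wants to check the direction by hand, below is a minimal
pure-Python sketch of the one-sided statistics under the reading described
in my earlier message (quoted below): 'greater' as the largest amount by
which sample1's empirical cdf lies above sample2's, and 'less' as the
largest amount by which it lies below. The helper name one_sided_ks is
hypothetical; it is not a SciPy or R function, and it computes only the
statistics, with no p-values and no tie correction.

```python
from bisect import bisect_right


def one_sided_ks(sample1, sample2):
    """Return (d_plus, d_minus) for two samples.

    d_plus  = max over x of F1(x) - F2(x)  (sample1's ECDF above sample2's)
    d_minus = max over x of F2(x) - F1(x)  (sample1's ECDF below sample2's)

    Hypothetical helper for illustration only; not part of SciPy or R.
    """
    s1 = sorted(sample1)
    s2 = sorted(sample2)
    d_plus = 0.0
    d_minus = 0.0
    # The ECDFs only change at observed values, so it is enough to
    # evaluate both right-continuous ECDFs at every pooled observation.
    for x in s1 + s2:
        f1 = bisect_right(s1, x) / len(s1)
        f2 = bisect_right(s2, x) / len(s2)
        d_plus = max(d_plus, f1 - f2)
        d_minus = max(d_minus, f2 - f1)
    return d_plus, d_minus
```

With sample1 = [1, 2, 3] and sample2 = [4, 5, 6], sample1's empirical cdf
reaches 1 before sample2's leaves 0, so this returns (1.0, 0.0); swapping
the two samples swaps the two values, which is the kind of reversal I may
have introduced by accident.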

Andrew


On Thu, Dec 20, 2012 at 6:50 AM, amundell <andrewhdmundell at gmail.com> wrote:
>
> Hi again Josef, after investigating a few different algorithm approaches
> (most notably Press, Teukolsky et al., Numerical Recipes) and looking into
> R's source, it does seem that 'greater' and 'less' refer to the maximum
> deviation (D) of the first sample's (sample1's) cdf 'above' and 'below'
> sample2's cdf. The maximum values I observed and calculated from the plot
> of my data are consistent with the results that SciPy
> (scipy.stats.mstats.ks_twosamp) generates. I have also just confirmed that
> a third software package gives results similar to SciPy's. I therefore
> believe SciPy's results are accurate.

The results are numerically correct, but the definitions of 'greater'
and 'less' are still reversed in scipy.stats.mstats?

BTW there is a tie correction in the mstats version, and I don't
know whether the test statistic still has the same asymptotic
distribution with it.

Josef





