[SciPy-dev] stats - kstest

Manuel Metz mmetz at astro.uni-bonn.de
Tue Jul 20 05:32:05 EDT 2004


Travis Oliphant wrote:
> I reviewed what was done again and now believe we were correct.  The 
> distribution that is being used in kstest is the Kolmogorov one-sided 
> distribution, KS+. Because this is the distribution used, the test is 
> done with a one-sided statistic.
> 
> SciPy only has an approximate two-sided statistic which is valid for 
> large N.  We do not have it wrapped in a kstest-like command, but the 
> distribution is available as kstwobign.
> We could modify kstest or make a new command for the two-sided test.
> 
> Questions and/or comments welcome.

Hm, my first suggestion is to make the notation clearer: many people 
know and use "Numerical Recipes" (NR). The notation there is: ksone 
= two-sided statistic for one data set against a model distribution; 
kstwo = two-sided statistic for comparing two data sets. So the SciPy 
names may lead to some confusion...

The algorithm of the SciPy distribution 'kstwobign' is the same as the 
one given in NR (there called 'probks'). They say that the approximation 
is good for N>4. Maybe it would be a good idea to implement the 
two-sided test under a new name, like 'kstest2side' or 'kstest2s', and, 
for clarity, to change the doc-string of kstest to state explicitly that 
it is the D+ test.
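
To make the D+ versus two-sided distinction concrete, here is a minimal 
sketch (not the SciPy implementation). It computes D+, D- and 
D = max(D+, D-) for a sample against a model CDF and evaluates the 
large-N two-sided p-value with kstwobign. The function name 
ks_two_sided_asymptotic is hypothetical, and the p-value is the plain 
asymptotic one, without any small-N correction.

```python
import numpy as np
from scipy import stats


def ks_two_sided_asymptotic(sample, cdf):
    """Return (D+, D-, D, p) for a one-sample K-S test.

    Hypothetical helper for illustration: p is the large-N
    approximation taken from scipy.stats.kstwobign.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    model = cdf(x)  # model CDF evaluated at the sorted data
    # D+: largest amount by which the empirical CDF exceeds the model.
    d_plus = np.max(np.arange(1, n + 1) / n - model)
    # D-: largest amount by which the model exceeds the empirical CDF.
    d_minus = np.max(model - np.arange(0, n) / n)
    d = max(d_plus, d_minus)
    # Asymptotic two-sided p-value: sqrt(N) * D follows kstwobign.
    p = stats.kstwobign.sf(np.sqrt(n) * d)
    return d_plus, d_minus, d, p


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.uniform(size=200)
    print(ks_two_sided_asymptotic(data, stats.uniform.cdf))
```

The one-sided test uses only D+ (or only D-), while the two-sided test 
uses the larger of the two, which is why the two need different 
distributions.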

However, I found two papers that provide more accurate solutions to the 
two-sided test (as I understood it):

"Computing the Cumulative Distribution Function of the 
Kolmogorov-Smirnov Statistic" by Drew, Glen & Leemis; 
http://www.math.wm.edu/~leemis/

and

"Evaluating Kolmogorov's Distribution" by Marsaglia, Tsang & Wang;
http://www.jstatsoft.org/v08/i18/k.ps

In the Introduction of the second paper they say:
"We provide here a relatively small C procedure, K(n,d), that will 
provide Pr(D_n<d) with far greater precision than is needed in practice."
This may be a good candidate to be used in SciPy...
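
As a rough idea of what such a routine could look like on the Python 
side, below is a sketch of the matrix method described in the Marsaglia, 
Tsang & Wang paper, written from my reading of it. The function names 
are hypothetical. It also leaves out the rescaling that the C procedure 
uses to avoid overflow and underflow, so it should only be trusted for 
small n (say, up to a few dozen points).

```python
import math


def _matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def _matpow(a, n):
    """Raise a square matrix to the integer power n by repeated squaring."""
    m = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    base = a
    while n > 0:
        if n & 1:
            result = _matmul(result, base)
        n >>= 1
        if n:
            base = _matmul(base, base)
    return result


def ks_two_sided_cdf(n, d):
    """Pr(D_n < d) for the two-sided statistic (sketch, small n only)."""
    if d <= 0.0:
        return 0.0
    if d >= 1.0:
        return 1.0
    k = int(n * d) + 1
    m = 2 * k - 1
    h = k - n * d
    # Start with ones on and below the first superdiagonal.
    H = [[1.0 if i - j + 1 >= 0 else 0.0 for j in range(m)]
         for i in range(m)]
    # Correct the first column and the last row with powers of h.
    for i in range(m):
        H[i][0] -= h ** (i + 1)
        H[m - 1][i] -= h ** (m - i)
    if 2.0 * h - 1.0 > 0.0:
        H[m - 1][0] += (2.0 * h - 1.0) ** m
    # Divide each entry by (i - j + 1)! where that is positive.
    for i in range(m):
        for j in range(m):
            if i - j + 1 > 0:
                H[i][j] /= math.factorial(i - j + 1)
    s = _matpow(H, n)[k - 1][k - 1]
    # Multiply by n! / n**n one factor at a time.
    for i in range(1, n + 1):
        s = s * i / n
    return s


if __name__ == "__main__":
    print(ks_two_sided_cdf(10, 0.274))
```

For n = 1 and n = 2 the result can be checked by hand: Pr(D_1 < d) is 
2d - 1 and Pr(D_2 < d) is 1 - 2(1 - d)^2 for d between 0.5 and 1.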

Manuel



