[SciPy-User] ks_2samp and searchsorted on concatenated array

josef.pktd at gmail.com
Sun May 22 16:25:50 EDT 2011


On Sun, May 22, 2011 at 3:52 PM,  <josef.pktd at gmail.com> wrote:
> I was looking again at Kolmogorov-Smirnov and other gof tests
>
> from ks_2samp: (data1, data2 are 1d)
>
>    data1 = np.sort(data1)
>    data2 = np.sort(data2)
>    data_all = np.concatenate([data1,data2])
>    cdf1 = np.searchsorted(data1,data_all,side='right')/(1.0*n1)
>    cdf2 = (np.searchsorted(data2,data_all,side='right'))/(1.0*n2)
>    d = np.max(np.absolute(cdf1-cdf2))
>
> What does searchsorted do with an array that is the concatenation of
> two sorted arrays?
>
> I don't understand why data_all doesn't need to be sorted (after the
> concatenation).
>
> (I wrote this in 2008 just after learning about searchsorted, but the
> Monte Carlo runs that I did looked good, and I didn't find a reference
> for why I did it this way.)
>
> Bug or not? (maybe I'm just slow in thinking today)

Ok, I'm slow in thinking today.

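searchsorted(a, v) inserts the *second* array into the *first*, not the
other way around. Only the first array (data1 or data2) has to be
sorted; the second (data_all) is just a set of points to look up, in
any order, so the concatenation doesn't need to be sorted.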
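
A minimal sketch of that check, assuming only NumPy. The helper name
ks_stat, its sort_points flag and the toy data are made up for
illustration and are not part of scipy:

```python
import numpy as np

# searchsorted(a, v): a must be sorted, v may be in any order.
a = np.array([1, 3, 5])
v = np.array([4, 0, 5, 2])            # deliberately unsorted
idx = np.searchsorted(a, v, side='right')
# idx[i] is the number of elements of a that are <= v[i]: [2, 0, 3, 1]


def ks_stat(data1, data2, sort_points=False):
    """Two-sample KS distance, following the quoted ks_2samp lines.

    sort_points is a hypothetical switch used only to show that sorting
    the evaluation points does not change the result.
    """
    data1 = np.sort(data1)
    data2 = np.sort(data2)
    n1, n2 = len(data1), len(data2)
    data_all = np.concatenate([data1, data2])
    if sort_points:
        data_all = np.sort(data_all)
    # Empirical CDFs of both samples, evaluated at every observed point.
    cdf1 = np.searchsorted(data1, data_all, side='right') / (1.0 * n1)
    cdf2 = np.searchsorted(data2, data_all, side='right') / (1.0 * n2)
    return np.max(np.absolute(cdf1 - cdf2))


x = np.array([3.0, 1.0, 2.0])
y = np.array([4.0, 2.5])
d_concat = ks_stat(x, y)                     # points left as concatenated
d_sorted = ks_stat(x, y, sort_points=True)   # points sorted first
# Both equal 2/3 here: sorting only permutes cdf1 - cdf2, and the max
# over all points does not depend on their order.
```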

Sorry for the noise. No bug.

Josef

>
> Josef
>


