[SciPy-user] Are two distributions different

Anne Archibald peridot.faceted at gmail.com
Tue Aug 14 23:59:25 EDT 2007


On 14/08/07, Tommy Grav <tgrav at mac.com> wrote:
> I have two binned distributions R and S (both generated using
> numpy.histogram() with the same bins keyword). I would like to
> check if these two distributions are different using the chi squared
> and k-s tests. I know that scipy implements these two tests,
> but I have been unable to figure out how to use them.
> Any help is appreciated.

For the K-S test, don't bin the samples: it works directly on unbinned
data. (Generally, don't bin things unless you have to; binning tends to
introduce artificial features that are hard to understand.) IIRC, the
function you want is scipy.stats.ks_2samp: give it two samples (that
is, two arrays of numbers, each representing a sample) and it returns
two values. One is the p-value, the probability that samples at least
this different could be drawn from the same distribution. The other is
the K-S statistic D, the largest gap between the two empirical
cumulative distributions, which you usually don't need to look at
directly.
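
A minimal sketch of the two-sample K-S test on unbinned data, assuming
a reasonably recent SciPy where the function is scipy.stats.ks_2samp.
The arrays r and s are made-up stand-ins for your raw samples (the
values you fed to numpy.histogram(), not the bin counts), and the seed
and sample sizes are arbitrary:

```python
import numpy as np
from scipy import stats

# Hypothetical stand-ins for the raw, unbinned samples R and S.
rng = np.random.default_rng(0)
r = rng.normal(loc=0.0, scale=1.0, size=500)
s = rng.normal(loc=0.5, scale=1.0, size=500)  # shifted on purpose

# ks_2samp returns the K-S statistic D and the p-value, in that order.
d_stat, p_value = stats.ks_2samp(r, s)
print("D = %.3f, p = %.3g" % (d_stat, p_value))

# Sanity check: a sample compared with itself gives D = 0.
d_same, p_same = stats.ks_2samp(r, r)
print("D = %.3f, p = %.3g" % (d_same, p_same))
```

A small p-value says the two samples are unlikely to come from the same
distribution; a large one only says the test found no evidence of a
difference, not that the distributions are identical.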

I haven't used the chi-squared test in scipy.

Anne


