[SciPy-user] Some help with chisquare

Mon Oct 27 16:14:59 EDT 2008

On Wed, Oct 22, 2008 at 12:55, Erik Wickstrom <erik at erikwickstrom.com> wrote:
> Hi,
>
> I'm trying to port an application to python, and want to use scipy to handle
> the statistics.
>
> The app takes several tests and uses chi-square to determine which has the
> highest success rate with a confidence of 95% or better (critical
> values/degrees of freedom).
>
> For example:
> Test a:
> Total trials = 100
> Total successes = 60
>
> Test b:
> Total trials = 105
> Total successes = 46
>
> Test c:
> Total trials = 98
> Total successes = 52
>
> It then puts the data through some sort of chi-square formula (or so the
> comments say) and produces a chi-square value that can be compared against
> the critical values for 95% confidence.
>
> Trouble is, I'm not sure which of the many scipy chi-square functions to
> use, and what data I need to feed into them....

scipy.stats.chisquare() is probably what you want. Pass it arrays of
the observed (f_obs) and expected (f_exp) frequencies for each test.
It returns the Chi^2 statistic and the associated p-value. If the
p-value is < 0.05, then the Chi^2 statistic is greater than the
critical value at the 95% confidence level.
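
A minimal sketch of that call with the numbers from the question. The
original application's formula is not shown, so this rests on assumptions:
the null hypothesis is that all three tests share one pooled success rate,
and both successes and failures are counted as cells. With 6 cells,
ddof=3 brings the degrees of freedom down to (2-1)*(3-1) = 2. The variable
names are only illustrative.

```python
import numpy as np
from scipy.stats import chi2, chisquare

# Data from the question: trials and successes for tests a, b, c.
trials = np.array([100, 105, 98])
successes = np.array([60, 46, 52])
failures = trials - successes

# Assumption: under the null hypothesis every test has the same
# (pooled) success rate.
pooled_rate = successes.sum() / trials.sum()

# Observed and expected counts for all six cells
# (successes first, then failures).
f_obs = np.concatenate([successes, failures])
f_exp = np.concatenate([trials * pooled_rate, trials * (1 - pooled_rate)])

# chisquare() uses k - 1 - ddof degrees of freedom; with k = 6 cells,
# ddof=3 gives 2 degrees of freedom.
stat, p = chisquare(f_obs, f_exp=f_exp, ddof=3)

# Critical value for 95% confidence with 2 degrees of freedom.
critical = chi2.ppf(0.95, df=2)

print("chi2 = %.3f, p = %.4f, critical = %.3f" % (stat, p, critical))
print("significant at 95%:", p < 0.05)
```

Checking p < 0.05 and checking stat > critical are the same test, so only
one of them is needed. Note that this only tells you whether the success
rates differ at all, not which test is the best one.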

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco