[SciPy-Dev] chi-square test for a contingency (R x C) table

Warren Weckesser warren.weckesser at enthought.com
Wed Jun 2 00:28:35 EDT 2010


I've been digging into some basic statistics recently, and developed the 
following function for applying the chi-square test to a contingency 
table.  Does something like this already exist in scipy.stats? If not, 
any objections to adding it?  (Tests are already written :)

Warren

-----

import numpy as np
from scipy.stats import chisquare


def chisquare_contingency(table):
    """Chi-square calculation for a contingency (R x C) table.

    This function computes the chi-square statistic and p-value of the
    data in the table.  The expected frequencies are computed based on
    the relative frequencies in the table.

    Parameters
    ----------
    table : array_like, 2D
        The contingency table, also known as the R x C table.

    Returns
    -------
    chi2 : float
        The chi-square test statistic.
    p : float
        The p-value of the test.
    """
    table = np.asarray(table)
    if table.ndim != 2:
        raise ValueError("table must be a 2D array.")

    # Create the table of expected frequencies.
    total = table.sum()
    row_sum = table.sum(axis=1).reshape(-1,1)
    col_sum = table.sum(axis=0)
    expected = row_sum * col_sum / float(total)

    # Since we are passing in 1D arrays of length table.size, the default
    # number of degrees of freedom is table.size - 1.
    # For a contingency table, the actual number of degrees of freedom is
    # (nr - 1)*(nc - 1).  We use the ddof argument of the chisquare
    # function to adjust the default.
    nr, nc = table.shape
    dof = (nr - 1) * (nc - 1)
    dof_adjust = (table.size - 1) - dof

    chi2, p = chisquare(np.ravel(table), np.ravel(expected),
                        ddof=dof_adjust)
    return chi2, p

-----
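
As a quick sanity check of the ddof adjustment, here is a minimal sketch
that repeats the same steps on a small table and compares the result with
the Pearson statistic computed directly.  The 2 x 3 counts are made up
purely for illustration, and the variable names (observed, stat_direct,
and so on) are not part of the proposed function.

```python
import numpy as np
from scipy.stats import chi2, chisquare

# Hypothetical 2 x 3 table of observed counts (illustrative numbers only).
observed = np.array([[10, 20, 30],
                     [20, 20, 20]])

# Expected frequencies under independence: row_sum * col_sum / total.
row_sum = observed.sum(axis=1).reshape(-1, 1)
col_sum = observed.sum(axis=0)
expected = row_sum * col_sum / float(observed.sum())
# Here both rows of `expected` are [15, 20, 25].

# Degrees of freedom for an R x C table.
nr, nc = observed.shape
dof = (nr - 1) * (nc - 1)

# Pearson statistic and p-value computed directly from the definition.
stat_direct = ((observed - expected) ** 2 / expected).sum()
p_direct = chi2.sf(stat_direct, dof)

# The same values via chisquare(), using ddof to correct the default
# of (number of cells - 1) degrees of freedom.
dof_adjust = (observed.size - 1) - dof
stat_ddof, p_ddof = chisquare(observed.ravel(), expected.ravel(),
                              ddof=dof_adjust)

print(stat_direct, p_direct)
print(stat_ddof, p_ddof)
```

For this table the statistic works out to 16/3 with 2 degrees of freedom,
and the two routes should agree to within rounding.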



