[SciPy-Dev] (no subject)

josef.pktd at gmail.com josef.pktd at gmail.com
Mon Dec 22 21:26:03 EST 2014


On Mon, Dec 22, 2014 at 7:47 PM, Clint Valentine <valentine.c at husky.neu.edu>
wrote:

> I ported an algorithm originally written in MatLab to Python for
> generating a random contingency table (when one exists) for a statistical
> test I am writing. I know something like this doesn't exist in
> scipy.stats.contingency and was wondering if it would be a good addition? I
> haven't developed any tests for the function yet but can if needed.
>
> I have just started developing in Python as an undergraduate student
> studying computational biology. This would be my first time contributing to
> open source and I am looking forward to the the experience and all the
> advice you are willing to offer. Thanks for your time.
>

Hi Clint,

More tools for the analysis of contigency tables, or other missing pieces
in stats, will be useful and will find a home in either scipy.stats or
statsmodels.
Specifically to this: I never looked at random generation of contingency
tables, but I think it would be useful for bootstrap or Monte Carlo based
hypothesis tests, which are currently not yet available in scipy.stats but
some are in other pacakges.

However, scipy, and statsmodels and most other packages in this
neighborhood, are BSD licensed and GPL is not compatible with it.  So, we
need to either translate code that is license compatible (like most of the
code on the matlab fileexchange), or write the code based on the
description of the algorithm without copying/translating code that has an
incompatible license..

Josef


>
> -Clint
>
> def rcont(ncol, nrow, ncolt, nrowt):
>     """
>     Return a random contingency table based on the two-dimensional shape
>     and marginal totals of a given contingency table
>
>     There must be 1 or many tables with nonnegative, integral entries
>     that have the given shape and marginal sums as input. The function
>     will generate, at random, one of the tables and return it. Repeated
>     calls to the function will return new random selections.
>
>     Parameters:
>     -----------
>     ncol : integer
>         The vector length of the columns in the contingency table
>
>     nrow : integer
>         The vector length of the rows in the contingency table
>
>     ncolt : array_like
>         The vector of the marginal column totals in the contingency table
>
>     nrowt : array_like
>         The vector of the marginal row totals in the contingency table
>
>     Returns
>     -------
>     imatrix : ndarray of integers
>         The randomly generated two-dimensional ndarray preserving the
>         same shape and marginal totals provided if exists
>
>     Reference
>     ---------
>     Algorithm AS144:  Appl. Statist. (1979) v.28, pp.329-332
>
>     Licensing:
>         GNU LGPL license
>
>     Author:
>         Original FORTRAN77 version by James Boyett.
>         MATLAB version by John Burkardt.
>         Python version by Clint Valentine
>
>     Reference:
>         James Boyett,
>         Algorithm AS 144:
>         Random R x C Tables with Given Row and Column Totals,
>         Applied Statistics,
>         Volume 28, Number 3, pages 329-332, 1979.
>     """
>     # Initialize and permute vector
>     nvect = np.array(xrange(sum(ncolt)), dtype=np.int_)
>     np.random.shuffle(nvect)
>
>     # Construct random matrix and partial column sums
>     imatrix = np.zeros([nrow, ncol], dtype=np.int_)
>     nsubt = np.cumsum(ncolt)
>
>     # Generate random RxC contingency tables preserving shape and
>     # marginal column and row sums and total sum.
>     count = 0
>     for i in xrange(nrow):
>         limit = nrowt[i]
>         for _ in xrange(limit):
>             for j in xrange(ncol):
>                 if nvect[count] <= nsubt[j]:
>                     count += 1
>                     imatrix[i][j] += 1
>                     break
>     return imatrix
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20141222/504c5855/attachment.html>


More information about the SciPy-Dev mailing list