[SciPy-User] Fisher exact test, anyone?

Tue Nov 16 10:01:53 EST 2010

On Tue, Nov 16, 2010 at 8:04 AM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
>
>
> On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey <bsouthey at gmail.com> wrote:
>>
>> On Sat, Nov 13, 2010 at 8:50 PM,  <josef.pktd at gmail.com> wrote:
>> > http://projects.scipy.org/scipy/ticket/956 and
>> > http://pypi.python.org/pypi/fisher/ have Fisher's exact
>> > test implementations.
>> >
>> > It would be nice to get a version in for 0.9. I spent a few
>> > unsuccessful days on it earlier this year. But since there are two new
>> > or corrected versions available, it looks like it just needs testing
>> > and a performance comparison.
>> >
>> > I won't have time for this, so if anyone volunteers for this, scipy
>> > 0.9 should be able to get Fisher's exact.
>>
> https://github.com/rgommers/scipy/tree/fisher-exact
> All tests pass. There's only one usable version (see below), so I didn't do
> a performance comparison. I'll leave a note on #956 as well, saying we're
> discussing this on-list.
>
>> I briefly looked at the code at the PyPI link, but I do not think it is
>> good enough for scipy. Also, I do not like it when people license code as
>> 'BSD' and there is a comment in cfisher.pyx: '# some of this code is
>> originally from the internet. (thanks)'. Consequently, we cannot use
>> that code.
>
> I agree, that's not usable. The plain Python algorithm is also fast enough
> that there's no need to bother with Cython.
>>
>> The code with ticket 956 still needs work, especially in terms of the
>> input types and probably the API (like having a function that allows
>> the user to select either a 1- or a 2-tailed test).
>
> Can you explain what you mean by work on input types? I used np.asarray and
> forced dtype to be int64. For the 1-tailed test, is it necessary? I note
> that pearsonr and spearmanr also only do 2-tailed.

Adding 1-tailed tests would be a nice bonus.

I think we should add them wherever possible. Currently, one-sided
versus two-sided is still somewhat inconsistent across functions. I
have added one-sided tests to some functions.
Tests based on symmetric distributions (t or normal), like the t-tests,
don't necessarily need both, because the one-sided test can essentially
be recovered from the two-sided test by halving the p-value (and the
two-sided from the one-sided by doubling it).
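
A minimal sketch of that recovery, assuming an independent two-sample
t-test via scipy.stats.ttest_ind. The helper name
one_sided_from_two_sided and the sample data are made up for
illustration and are not part of scipy:

```python
import numpy as np
from scipy import stats


def one_sided_from_two_sided(statistic, p_two_sided, alternative="greater"):
    """Recover a one-sided p-value from a two-sided one.

    Hypothetical helper; only valid when the null distribution of the
    test statistic is symmetric around zero (e.g. t or normal).
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = p_two_sided / 2.0
    # Halve the p-value if the statistic points in the direction of the
    # alternative; otherwise the one-sided p-value is the complement.
    if alternative == "greater":
        in_direction = statistic > 0
    else:
        in_direction = statistic < 0
    return half if in_direction else 1.0 - half


# Illustrative data only.
rng = np.random.RandomState(0)
x = rng.normal(0.5, 1.0, size=30)
y = rng.normal(0.0, 1.0, size=30)

t_stat, p_two = stats.ttest_ind(x, y)
p_greater = one_sided_from_two_sided(t_stat, p_two, "greater")
p_less = one_sided_from_two_sided(t_stat, p_two, "less")
```

Note that this shortcut relies on symmetry. The null distribution behind
Fisher's exact test is hypergeometric and in general not symmetric, so
the two-sided p-value there is not simply twice the one-sided one.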

I added a comment to the ticket; fisher3 looks good except for the
Python 2.4 incompatibility.

Thanks Ralf for taking care of this,

Josef

>
> Cheers,
> Ralf
>