[SciPy-User] Questions/comments about scipy.stats.mannwhitneyu

josef.pktd at gmail.com
Sat Feb 16 21:36:08 EST 2013


On Sat, Feb 16, 2013 at 9:17 PM,  <josef.pktd at gmail.com> wrote:
> On Sat, Feb 16, 2013 at 7:51 PM,  <josef.pktd at gmail.com> wrote:
>> On Fri, Feb 15, 2013 at 1:44 PM, Chris Rodgers <xrodgers at gmail.com> wrote:
>>> Thanks Josef. Your points make sense to me.
>>>
>>> While we're on the subject, maybe I should ask whether this function
>>> is even appropriate for my data. My data are Poisson-like integer
>>> counts, and I want to know if the rate is significantly higher in
>>> dataset1 or dataset2. I'm reluctant to use poissfit because there is a
>>> scientific reason to believe that my data might deviate significantly
>>> from Poisson, although I haven't checked this statistically.
>>>
>>> Mann-Whitney U seemed like a safe alternative because it doesn't make
>>> distributional assumptions and it deals with ties, which is especially
>>> important for me because half the counts or more can be zero. Does
>>> that seem like a good choice, as long as I have >20 samples and the
>>> large-sample approximation is appropriate? Comments welcome.
>>
>> Please bottom or inline post.
>>
>> I don't have any direct experience with this.
>>
>> The >20 samples is just a guideline (as usual). If you have many ties,
>> then I would expect that you need more samples (no reference).
>>
>> What I would do in cases like this is to run a small Monte Carlo with
>> Poisson data, or data that looks somewhat similar to your data, to see
>> whether the test has the correct size (for example, rejects roughly 5%
>> at a 5% alpha), and to see whether the test has much power in small
>> samples.
>> I would expect that the size is ok, but the power might not be large
>> unless the difference in the rate parameter is large.
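>>
>> A minimal sketch of such a Monte Carlo, assuming a scipy where
>> mannwhitneyu takes an "alternative" keyword and returns a two-sided
>> p-value; the function name and the lambdas are just placeholders:
>>
>> import numpy as np
>> from scipy import stats
>>
>> def mc_rejection_rate(lam1, lam2, nobs1=20, nobs2=20, n_mc=1000,
>>                       alpha=0.05, seed=0):
>>     rng = np.random.default_rng(seed)
>>     n_reject = 0
>>     for _ in range(n_mc):
>>         x = rng.poisson(lam1, size=nobs1)
>>         y = rng.poisson(lam2, size=nobs2)
>>         p = stats.mannwhitneyu(x, y, alternative='two-sided').pvalue
>>         n_reject += p < alpha
>>     return n_reject / n_mc
>>
>> # size: with equal lambdas, should reject close to 5% at a 5% alpha
>> print(mc_rejection_rate(5, 5))
>> # power: with a largish difference in the rate parameter
>> print(mc_rejection_rate(5, 8))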
>
> (Since I was just working on a different 2 sample test, I had this almost ready)
> https://gist.github.com/josef-pkt/4969715
>
> Even with a sample size of 10 in each sample, the results still look
> pretty ok, slightly under-rejecting.
> With 20 observations each, the size is pretty good, and the power is
> good for most of the (largish) lambda differences I looked at.
> (I only used 1000 replications.)

With asymmetric small sample sizes,

n_mc = 50000
nobs1, nobs2 = 15, 5  # 20, 20

we also get a bit of under-rejection, especially at small alpha (0.005 or 0.01).

(And as a fun part: plotting the histogram of the p-values shows gaps,
because with ranks not all values are possible, if I remember the
interpretation correctly.)
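
A quick sketch of that histogram under the same null setup, again
assuming a scipy with the "alternative" keyword and matplotlib installed:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
pvals = []
for _ in range(50000):
    x = rng.poisson(5, size=15)
    y = rng.poisson(5, size=5)
    pvals.append(stats.mannwhitneyu(x, y, alternative='two-sided').pvalue)

# under the null, the rank statistic takes only finitely many values,
# so the p-values pile up on those values and the histogram shows gaps
plt.hist(pvals, bins=100)
plt.xlabel('p-value')
plt.show()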

Josef


>
> Sometimes I'm surprised how fast we get to the asymptotics.
>
> Josef
>
>
>
>>
>> Another possibility is to compare permutation p-values with asymptotic
>> p-values, to see whether they are close.
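>>
>> A rough sketch of that comparison, assuming the same "alternative"
>> keyword; measuring extremeness by abs(u - n1*n2/2) makes it irrelevant
>> whether the returned statistic is u1 or u2:
>>
>> import numpy as np
>> from scipy import stats
>>
>> rng = np.random.default_rng(0)
>> x = rng.poisson(5, size=15)
>> y = rng.poisson(5, size=5)
>>
>> u_obs, p_asym = stats.mannwhitneyu(x, y, alternative='two-sided')
>>
>> pooled = np.concatenate([x, y])
>> n1, n2 = len(x), len(y)
>> mid = n1 * n2 / 2.0
>> n_perm = 10000
>> n_extreme = 0
>> for _ in range(n_perm):
>>     rng.shuffle(pooled)  # permute the group labels
>>     u, _ = stats.mannwhitneyu(pooled[:n1], pooled[n1:],
>>                               alternative='two-sided')
>>     # count permutations at least as extreme as the observed statistic
>>     n_extreme += abs(u - mid) >= abs(u_obs - mid)
>> p_perm = n_extreme / n_perm
>> print(p_asym, p_perm)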
>>
>> There should be alternative tests, but I don't think they are
>> available in python: specific tests for comparing count data (I have
>> no idea), or a general 2-sample goodness-of-fit test (like ks_2samp),
>> but we don't have anything for discrete data.
>>
>> If you want to go parametric, then you could also use Poisson (or
>> negative binomial) regression in statsmodels and directly test the
>> equality of the distribution parameter. (There is also zero-inflated
>> Poisson, but with less verification.)
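>>
>> A hedged sketch of that parametric route, using sm.Poisson from
>> statsmodels with a group dummy (the simulated data just stands in for
>> yours):
>>
>> import numpy as np
>> import statsmodels.api as sm
>>
>> rng = np.random.default_rng(0)
>> data1 = rng.poisson(5, size=20)
>> data2 = rng.poisson(8, size=20)
>>
>> counts = np.concatenate([data1, data2])
>> group = np.concatenate([np.zeros(len(data1)), np.ones(len(data2))])
>> exog = sm.add_constant(group)
>>
>> res = sm.Poisson(counts, exog).fit(disp=0)
>> # the coefficient on the group dummy is the log rate ratio;
>> # its Wald test directly tests equality of the rates
>> print(res.params[1], res.pvalues[1])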
>>
>> Josef
>>
>>
>>>
>>> Thanks
>>> Chris
>>>
>>> On Fri, Feb 15, 2013 at 8:58 AM,  <josef.pktd at gmail.com> wrote:
>>>> On Fri, Feb 15, 2013 at 11:35 AM,  <josef.pktd at gmail.com> wrote:
>>>>> On Fri, Feb 15, 2013 at 11:16 AM,  <josef.pktd at gmail.com> wrote:
>>>>>> On Thu, Feb 14, 2013 at 7:06 PM, Chris Rodgers <xrodgers at gmail.com> wrote:
>>>>>>> Hi all
>>>>>>>
>>>>>>> I use scipy.stats.mannwhitneyu extensively because my data is not at
>>>>>>> all normal. I have run into a few "gotchas" with this function and I
>>>>>>> wanted to discuss possible workarounds with the list.
>>>>>>
>>>>>> Can you open a ticket? http://projects.scipy.org/scipy/report
>>>>>>
>>>>>> I partially agree, but any changes won't be backwards compatible, and
>>>>>> I don't have time to think about this enough.
>>>>>>
>>>>>>>
>>>>>>> 1) When this function returns a significant result, it is non-trivial
>>>>>>> to determine the direction of the effect! The Mann-Whitney test is NOT
>>>>>>> a test on the difference of medians or means, so you cannot determine
>>>>>>> the direction from these statistics. Wikipedia has a good example of
>>>>>>> why it is not a test for a difference of medians.
>>>>>>> http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U#Illustration_of_object_of_test
>>>>>>>
>>>>>>> I've reprinted it here. The data are the finishing order of hares and
>>>>>>> tortoises. Obviously this is contrived but it indicates the problem.
>>>>>>> First the setup:
>>>>>>> results_l = ('H H H H H H H H H T T T T T T T T T T '
>>>>>>>              'H H H H H H H H H H T T T T T T T T T').split(' ')
>>>>>>> h = [i for i, r in enumerate(results_l) if r == 'H']
>>>>>>> t = [i for i, r in enumerate(results_l) if r == 'T']
>>>>>>>
>>>>>>> And the results:
>>>>>>> In [12]: scipy.stats.mannwhitneyu(h, t)
>>>>>>> Out[12]: (100.0, 0.0097565768849708391)
>>>>>>>
>>>>>>> In [13]: np.median(h), np.median(t)
>>>>>>> Out[13]: (19.0, 18.0)
>>>>>>>
>>>>>>> Hares are significantly faster than tortoises, but we cannot determine
>>>>>>> this from the output of mannwhitneyu. This could be fixed either by
>>>>>>> returning u1 and u2 from the guts of the function, or by testing them
>>>>>>> in the function and returning the comparison. My current workaround is
>>>>>>> testing the means, which is absolutely wrong in theory but usually
>>>>>>> correct in practice.
>>>>>>
>>>>>> In some cases I'm reluctant to return the direction when we use a
>>>>>> two-sided test. In this case we don't have a one-sided test.
>>>>>> In analogy to the ttests, I think we could return the individual u1, u2.
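>>>>>>
>>>>>> As an illustration only (not scipy API): both U statistics follow
>>>>>> from the rank sum of the first sample, with u1 + u2 == n1 * n2.
>>>>>>
>>>>>> import numpy as np
>>>>>> from scipy.stats import rankdata
>>>>>>
>>>>>> def both_u(x, y):
>>>>>>     n1, n2 = len(x), len(y)
>>>>>>     # rank the pooled data; ties get average ranks
>>>>>>     ranks = rankdata(np.concatenate([x, y]))
>>>>>>     r1 = ranks[:n1].sum()
>>>>>>     u1 = r1 - n1 * (n1 + 1) / 2.0   # U of the first sample
>>>>>>     u2 = n1 * n2 - u1               # U of the second sample
>>>>>>     return u1, u2
>>>>>>
>>>>>> For the hare/tortoise data this gives (100.0, 261.0); the smaller
>>>>>> u1 says the hares occupy the lower ranks, i.e. finish earlier.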
>>>>>
>>>>> to expand a bit:
>>>>> For the Kolmogorov-Smirnov test, we refused to return an indication of
>>>>> the direction. The alternative is two-sided, and both the test
>>>>> statistic and its distribution are different in the one-sided test.
>>>>> So we shouldn't draw any one-sided conclusions from the two-sided test.
>>>>>
>>>>> In the t_test and mannwhitneyu the test statistic is normally
>>>>> distributed (in large samples), so we can infer the one-sided test
>>>>> from the two-sided statistic and p-value.
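>>>>>
>>>>> A sketch of that inference, valid for a symmetric null distribution
>>>>> (the helper is hypothetical):
>>>>>
>>>>> def one_sided_p(p_two_sided, effect_in_alternative_direction):
>>>>>     # halve the two-sided p if the observed effect points in the
>>>>>     # direction of the one-sided alternative, else flip it
>>>>>     if effect_in_alternative_direction:
>>>>>         return p_two_sided / 2.0
>>>>>     return 1.0 - p_two_sided / 2.0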
>>>>>
>>>>> If there are tables for the small sample case, we would need to check
>>>>> if we get consistent interpretation between one- and two-sided tests.
>>>>>
>>>>> Josef
>>>>>
>>>>>>
>>>>>>>
>>>>>>> 2) The documentation states that the sample sizes must be at least 20.
>>>>>>> I think this is because the normal approximation for U is not valid
>>>>>>> for smaller sample sizes. Is there a table of critical values for U in
>>>>>>> scipy.stats that is appropriate for small sample sizes, or should the
>>>>>>> user implement his or her own?
>>>>>>
>>>>>> Not available in scipy; I never looked at this.
>>>>>> Pull requests for this are welcome if it works. It would be backwards
>>>>>> compatible.
>>>>
>>>> Since I just looked at a table collection for some other test: they
>>>> also have the Mann-Whitney U statistic,
>>>> http://faculty.washington.edu/heagerty/Books/Biostatistics/TABLES/Wilcoxon/
>>>> but I didn't check whether it matches the test statistic in scipy.stats.
>>>>
>>>> Josef
>>>>
>>>>>>
>>>>>>>
>>>>>>> 3) This is picky, but is there a reason that it returns a one-tailed
>>>>>>> p-value, while other tests (e.g. ttest_*) default to two-tailed?
>>>>>>
>>>>>> A legacy wart that I don't like, but it didn't offend me enough to change it.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks for any thoughts, tips, or corrections and please don't take
>>>>>>> these comments as criticisms ... if I didn't enjoy using scipy.stats
>>>>>>> so much I wouldn't bother bringing this up!
>>>>>>
>>>>>> Thanks for the feedback.
>>>>>> In large part, review of the functions relies on comments by users
>>>>>> (and future contributors).
>>>>>>
>>>>>> The main problem is how to make changes without breaking current
>>>>>> usage, since many of those functions are widely used.
>>>>>>
>>>>>> Josef
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Chris
>>>>>>> _______________________________________________
>>>>>>> SciPy-User mailing list
>>>>>>> SciPy-User at scipy.org
>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user


