[SciPy-user] help with scipy.stats.mannwhitneyu

josef.pktd at gmail.com josef.pktd at gmail.com
Thu Feb 5 13:31:00 EST 2009


On Thu, Feb 5, 2009 at 1:11 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>
> On Feb 5, 2009, at 1:03 PM, Sturla Molden wrote:
>
>> On 2/5/2009 6:46 PM, josef.pktd at gmail.com wrote:
>>
>>> so whether bigU or smallU is used in the calculation of z doesn't
>>> matter, I have no idea why in this specific implementation both are
>>> calculated if smallU would be enough.
>>
>> By the way, there is a function scipy.stats.ranksums that does a
>> Wilcoxon rank-sum test. It seems to be using a large-sample
>> approximation, and has no correction for tied ranks.
>
>
> Please keep in mind that some of the tests have been reimplemented in
> scipy.stats.mstats to support masked/missing values and to take ties
> into account ...
> I trust y'all to let me know of any inconsistencies between the masked/
> unmasked versions, whether in terms of signatures or assumptions.
> Thx a lot in advance...

A quick check looks pretty good (still using the example without ties):

>>> stats.mstats.kruskal(rvs1,rvs2)[1] - stats.ranksums(rvs1,rvs2)[1]
-4.8572257327350599e-016
>>> stats.mstats.kruskalwallis(rvs1,rvs2)[1] - stats.ranksums(rvs1,rvs2)[1]
-4.8572257327350599e-016

>>> stats.mstats.mannwhitneyu(rvs1,rvs2)[1] - stats.ranksums(rvs1,rvs2)[1]
0.00029058688269312238
>>> stats.mstats.mannwhitneyu(rvs1,rvs2)
(4363.0, 0.11989439052971618)
>>> stats.mstats.mannwhitneyu(rvs1,rvs2)[1] - rwilcox(rvs1,rvs2,correct = False)['p.value']
0.00029058688269296973
>>> stats.mstats.mannwhitneyu(rvs1,rvs2)[1] - rwilcox(rvs1,rvs2)['p.value']
0.0

stats.mstats.mannwhitneyu applies a continuity correction by default, as in R,
which accounts for the 0.00029 difference from stats.ranksums.
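
To make the effect of the correction concrete, here is a rough sketch of the normal approximation with and without it. This is the textbook formula without tie handling, not the scipy or R source; the function name mann_whitney_normal, its use_continuity argument and the simulated data are made up for illustration.

```python
import math
import random

def mann_whitney_normal(x, y, use_continuity=True):
    """Two-sided Mann-Whitney U test via the normal approximation (no ties)."""
    n1, n2 = len(x), len(y)
    # Rank the pooled sample; group 0 marks x, group 1 marks y.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = sum(rank for rank, (_, group) in enumerate(pooled, start=1)
                     if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    small_u = min(u1, u2)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    # u1 and u2 lie symmetrically around mean_u, so |U - mean_u| is the
    # same whichever of the two is used.
    diff = abs(small_u - mean_u)
    if use_continuity:
        # The continuity correction moves |U - mean_u| by 0.5 toward zero.
        diff = max(diff - 0.5, 0.0)
    z = diff / sd_u
    p = math.erfc(z / math.sqrt(2.0))  # two-sided p-value
    return small_u, p

random.seed(1)  # simulated data, for illustration only
x = [random.gauss(0.0, 1.0) for _ in range(100)]
y = [random.gauss(0.2, 1.0) for _ in range(100)]

u, p_corrected = mann_whitney_normal(x, y, use_continuity=True)
_, p_uncorrected = mann_whitney_normal(x, y, use_continuity=False)
print(u, p_corrected, p_uncorrected)
```

The corrected p-value is never smaller than the uncorrected one, and for samples of this size the gap is small, in line with the difference seen above.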


Just calling this one fails. According to the docstring it requires
sequences, but the correct usage is not clear:

>>> stats.mstats.compare_medians_ms(rvs1,rvs2)
Traceback (most recent call last):
  File "<pyshell#127>", line 1, in <module>
    stats.mstats.compare_medians_ms(rvs1,rvs2)
  File "\Programs\Python25\Lib\site-packages\scipy\stats\mstats_extras.py", line 332, in compare_medians_ms
    (std_1, std_2) = (mstats.stde_median(group_1, axis=axis),
  File "C:\Programs\Python25\lib\site-packages\scipy\stats\mstats_basic.py", line 1511, in stde_median
    return _stdemed_1D(data)
  File "C:\Programs\Python25\lib\site-packages\scipy\stats\mstats_basic.py", line 1504, in _stdemed_1D
    n = len(sorted)
TypeError: object of type 'builtin_function_or_method' has no len()
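
The error message itself says that, at that line, the name sorted refers to the Python builtin rather than to an array, so the failure looks independent of how the function is called. A small sketch reproducing the same TypeError; the function names are hypothetical, and the "fixed" variant is only my guess at the intent (taking the length of a sorted copy of the data), not the actual scipy code:

```python
def stdemed_buggy(data):
    # No local variable named `sorted` is assigned in this function,
    # so the name resolves to the builtin function and len() fails.
    n = len(sorted)
    return n

def stdemed_guess(data):
    # Guess at the intent: sort the data into a local variable first,
    # then take the length of that.
    ordered = sorted(data)
    n = len(ordered)
    return n

try:
    stdemed_buggy([3.0, 1.0, 2.0])
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
print(stdemed_guess([3.0, 1.0, 2.0]))
```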

Josef


