[SciPy-dev] Possible Error in Kendall's Tau (scipy.stats.stats.kendalltau)

josef.pktd at gmail.com
Wed Mar 18 12:13:09 EDT 2009


Kendall's tau in R's stats package is exactly the same as scipy.stats.kendalltau
for all the test cases, regardless of whether there are ties or whether the
ties match; see also

>>> rcortest([1,1,2], [1,1,2], method = "kendall", exact=0)['estimate']['tau']
1.0
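
scipy gives the same value for this tied example (a quick check on my side;
the exact return type of kendalltau depends on the scipy version):

>>> from scipy import stats
>>> stats.kendalltau([1, 1, 2], [1, 1, 2])[0]
1.0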

The R help doesn't specify which version of Kendall's tau cor.test implements.
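
For reference, here is a minimal brute-force sketch of tau-b (the tie-adjusted
variant); kendall_tau_b is just an illustrative helper, not part of scipy, and
for the tied example above it also gives 1.0:

import numpy as np

def kendall_tau_b(x, y):
    # brute-force tau-b over all pairs:
    # tau_b = (C - D) / sqrt((n0 - n1) * (n0 - n2))
    # where n1 = pairs tied in x, n2 = pairs tied in y
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    con = dis = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx != 0 and dy != 0:
                if dx * dy > 0:
                    con += 1
                else:
                    dis += 1
    n0 = n * (n - 1) / 2.0
    return (con - dis) / np.sqrt((n0 - ties_x) * (n0 - ties_y))

print(kendall_tau_b([1, 1, 2], [1, 1, 2]))   # 1.0, same as the cor.test result above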

import rpy
from scipy import stats
from numpy.testing import assert_almost_equal

rcortest = rpy.r('cor.test')
# x, y are the sample arrays for one test case
rkend = rcortest(x, y, method="kendall", exact=0)
tr = rkend['estimate']['tau']           # R's tau estimate
ts, ps = stats.kendalltau(x, y)         # scipy's tau and p-value
assert_almost_equal(tr, ts, decimal=10)

This doesn't raise an exception for any of the test cases.

I also checked the p-values; there are a few discrepancies between
R and scipy.stats, but all the test cases have very small sample sizes.

difference in p-values for test cases:
np.diff(rcomparr, axis=1).T
array([[ -1.17641404e-04,  -2.40189199e-04,  -3.84139134e-04,
         -1.38669151e-03,  -3.84139134e-04,  -4.01141101e-02,
         -2.52429121e-02,   5.42191308e-09,  -4.17244442e-02,
         -2.52429121e-02,   0.00000000e+00,   1.09393424e-08,
          3.23933650e-09,   9.63088664e-09,   1.09393424e-08,
          1.18686162e-08,   1.04538346e-08,   1.37460343e-09,
          1.27648634e-08,   1.22071195e-08]])
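
For context, rcomparr above has one row per test case with the two p-values
side by side; a rough sketch of how such an array could be collected (cases
is a placeholder for the actual test data, which I'm not reproducing here,
and the column order is an assumption):

import numpy as np
import rpy
from scipy import stats

rcortest = rpy.r('cor.test')

def collect_pvalues(cases):
    # one row per test case: [R p-value, scipy p-value]
    rows = []
    for x, y in cases:
        rp = rcortest(x, y, method="kendall", exact=0)['p.value']
        sp = stats.kendalltau(x, y)[1]
        rows.append([rp, sp])
    return np.array(rows)

# rcomparr = collect_pvalues(cases)
# np.diff(rcomparr, axis=1).T   # per-case difference between the two p-values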


Since we also agree with R's stats cor.test on the definition of
Kendall's tau, there is really no reason to change stats.kendalltau;
checking for which cases the p-values differ could still be useful,
for example as sketched below.
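
A quick way to flag the diverging cases, using the rcomparr array above
(the tolerance is arbitrary):

diffs = np.diff(rcomparr, axis=1).ravel()
suspect = np.nonzero(np.abs(diffs) > 1e-6)[0]   # indices of test cases to look at
print(suspect)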

Josef


