[SciPy-dev] Possible Error in Kendall's Tau (scipy.stats.stats.kendalltau)
josef.pktd at gmail.com
Wed Mar 18 12:13:09 EDT 2009
Kendall's tau in R's stats package is exactly the same as
scipy.stats.kendalltau for all the test cases, independent of ties and
matching ties; see for example:
>>> rcortest([1,1,2], [1,1,2], method = "kendall", exact=0)['estimate']['tau']
1.0
The R help doesn't specify which version of Kendall's tau cor.test uses.
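For what it's worth, the agreement on tied data is consistent with the tie-corrected tau-b variant. Here is a minimal pure-Python sketch (the `tau_b` helper is my own, not scipy's implementation) that reproduces the tied example above:

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def tau_b(x, y):
    # Kendall's tau-b: (P - Q) / sqrt((n0 - n1) * (n0 - n2)),
    # where n1, n2 correct the denominator for ties in x and y.
    n = len(x)
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0: tied pair, enters only the denominator corrections
    n0 = n * (n - 1) / 2.0
    n1 = sum(t * (t - 1) / 2.0 for t in Counter(x).values())
    n2 = sum(t * (t - 1) / 2.0 for t in Counter(y).values())
    return (concordant - discordant) / sqrt((n0 - n1) * (n0 - n2))

print(tau_b([1, 1, 2], [1, 1, 2]))  # 1.0, matching cor.test above
```

With the ties in x and y, a plain (tau-a) denominator of n0 = 3 would give 2/3 instead of 1.0, so matching 1.0 here points at tau-b.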
import rpy
from numpy.testing import assert_almost_equal
from scipy import stats

# compare R's cor.test against scipy for each test pair (x, y)
rcortest = rpy.r('cor.test')
rkend = rcortest(x, y, method="kendall", exact=0)
tr = rkend['estimate']['tau']
ts, ps = stats.kendalltau(x, y)
assert_almost_equal(tr, ts, decimal=10)
This doesn't raise an exception for any of the test cases.
I also checked the p-values; there are a few discrepancies between R
and scipy.stats, but all of the test cases have very small sample
sizes. Differences in p-values across the test cases:
np.diff(rcomparr, axis=1).T
array([[ -1.17641404e-04, -2.40189199e-04, -3.84139134e-04,
-1.38669151e-03, -3.84139134e-04, -4.01141101e-02,
-2.52429121e-02, 5.42191308e-09, -4.17244442e-02,
-2.52429121e-02, 0.00000000e+00, 1.09393424e-08,
3.23933650e-09, 9.63088664e-09, 1.09393424e-08,
1.18686162e-08, 1.04538346e-08, 1.37460343e-09,
1.27648634e-08, 1.22071195e-08]])
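One plausible source of the small-n discrepancies is that the p-value comes from an asymptotic normal approximation, which is only accurate for larger samples. A sketch of the standard large-sample formula (no tie correction; this is my guess at what one of the implementations does, I haven't checked either source):

```python
from math import sqrt, erfc

def kendall_pvalue(tau, n):
    # z-statistic for Kendall's tau under H0: tau = 0,
    # using the large-sample variance 2*(2n + 5) / (9*n*(n - 1))
    z = 3.0 * tau * sqrt(n * (n - 1)) / sqrt(2.0 * (2 * n + 5))
    # two-sided p-value from the standard normal tail
    return erfc(abs(z) / sqrt(2.0))
```

If R's exact small-sample distribution (exact=TRUE) or a continuity/tie correction is substituted anywhere in this formula, p-values will differ in roughly the magnitudes seen above.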
Since we agree with R's stats cor.test on the definition of Kendall's
tau, there is really no reason to change stats.kendalltau; checking
for which cases the p-values differ could still be useful.
Josef