[issue21184] statistics.pvariance with known mean does not work as expected

Wolfgang Maier report at bugs.python.org
Wed Apr 9 10:20:42 CEST 2014


Wolfgang Maier added the comment:

I do not think this is a bug in the module, but rather incorrect usage.

From your own docs:
    data should be an iterable of Real-valued numbers, with at least one
    value. The optional argument mu, if given, should be the mean of
    the data. If it is missing or None, the mean is automatically calculated.

Nowhere does it say that mu should be the known population mean, and rightly so!
The definition of p_variance is that it is the variance of the data, assuming that data *is* the whole population (so the correct mean can be calculated from it).
s_variance, on the other hand, should give an estimate of the population variance under the assumption that data is a random sample of the population, but its formula _ss/(n-1) is derived under the assumption that mu is the sample mean, not the population mean.
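
A minimal sketch of the distinction above, using made-up sample values. The helper variance_known_mean is hypothetical and not part of the statistics module; it only illustrates that when the population mean really is known independently of the data, the squared deviations about it are divided by n rather than n-1. The calls to pvariance and variance pass the mean of the data itself, which is the usage the docs describe.

```python
from statistics import mean, pvariance, variance

data = [1, 2, 3, 4, 5]      # illustrative values, not taken from the report
n = len(data)
m = mean(data)              # mean calculated from the data itself

# Sum of squared deviations about the mean of the data.
ss = sum((x - m) ** 2 for x in data)

population_var = ss / n         # data *is* the whole population
sample_var = ss / (n - 1)       # data is a random sample; m is the sample mean

# Passing the mean of the data explicitly only saves recomputing it.
same_population_var = pvariance(data, mu=m)
same_sample_var = variance(data, xbar=m)


def variance_known_mean(values, mu):
    """Hypothetical helper (not in the statistics module).

    Estimates the population variance from a sample when the population
    mean mu is known independently of the sample: divide by n, not n-1.
    """
    return sum((x - mu) ** 2 for x in values) / len(values)


print(population_var, pvariance(data))
print(sample_var, variance(data))
print(variance_known_mean(data, 2.5))
```

Neither pvariance nor variance is meant to compute the last quantity, which is why a separate function is used for it here.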

So everything's fine and there is nothing to fix really!

----------
nosy: +wolma

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21184>
_______________________________________

