[SciPy-user] stats review: std/var and samplestd/samplevar
Travis Oliphant
oliphant at ee.byu.edu
Mon Apr 3 19:20:50 EDT 2006
Zachary Pincus wrote:
>Hi again folks,
>
>>I think the original poster meant (N-1) some of the time when they
>>said (1-N).
>
>Yeah, sorry.
>
>The take-home message is that scipy.stats uses "sample variance" to
>mean "a variance denominated by N", when the rest of the world uses
>"sample variance" to mean "an estimator of the population variance
>denominated by N-1 or N", and scipy.stats uses "variance" to mean
>"the unbiased estimator of population variance (denominated by N-1)",
>which is not in general what "variance" means.
>
Let's change the documentation to be more consistent and minimize
confusion.

Let's add an option. Frankly, I'm not enamored with unbiased
estimators and would probably divide by N when computing variance by
default and allow the option to change it.

The only reason to do differently in library code is overwhelming
expectation. But, if we are wrong about that, then let's do it right.
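
To make the proposal concrete, here is a minimal sketch of a variance
function that divides by N by default and takes an option to divide by
N-1 instead. The names `variance` and `unbiased` are hypothetical,
chosen only for illustration; this is not the scipy.stats API and not
necessarily the form the change would take.

```python
def variance(data, unbiased=False):
    """Variance of a sequence of numbers.

    Hypothetical helper for illustration only (not the scipy.stats API).
    By default the sum of squared deviations is divided by N; with
    unbiased=True it is divided by N-1.
    """
    n = len(data)
    if n == 0:
        raise ValueError("variance requires at least one value")
    if unbiased and n < 2:
        raise ValueError("the N-1 estimator requires at least two values")

    mean = sum(data) / n
    sum_sq = sum((x - mean) ** 2 for x in data)

    # Divide by N by default; divide by N-1 only when asked.
    return sum_sq / (n - 1 if unbiased else n)


values = [1.0, 2.0, 3.0, 4.0]
print(variance(values))                 # divides by N
print(variance(values, unbiased=True))  # divides by N-1
```

For these four values the sum of squared deviations is 5.0, so the
default gives 1.25 and the N-1 version gives about 1.667.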
-Travis