[SciPy-user] stats review: std/var and samplestd/samplevar

Travis Oliphant oliphant at ee.byu.edu
Mon Apr 3 19:20:50 EDT 2006


Zachary Pincus wrote:

>Hi again folks,
>
>>I think the original poster meant (N-1) some of the time when they
>>said (1-N).
>
>Yeah, sorry.
>
>The take-home message is that scipy.stats uses "sample variance" to  
>mean "a variance denominated by N", when the rest of the world uses  
>"sample variance" to mean "an estimator of the population variance  
>denominated by N-1 or N", and scipy.stats uses "variance" to mean  
>"the unbiased estimator of population variance (denominated by N-1)",  
>which is not in general what "variance" means.

Let's change the documentation to be more consistent and minimize 
confusion.

Let's also add an option to choose the denominator. Frankly, I'm not 
enamored with unbiased estimators; I would probably divide by N by 
default when computing the variance, and let the option switch to N-1.

The only reason to do otherwise in library code is an overwhelming 
expectation among users.  But if we are wrong about that expectation, 
then let's do it right.
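
A minimal sketch of what such an option might look like, in plain Python. 
The function name `variance` and the keyword `ddof` are assumptions made 
for illustration only; this is not the scipy.stats interface, just one 
way to divide by N by default and by N-1 on request.

```python
def variance(values, ddof=0):
    """Return the variance of `values`, dividing by (N - ddof).

    ddof=0 divides by N (the default suggested above).
    ddof=1 divides by N - 1 (the unbiased estimator).
    Both the function name and `ddof` are hypothetical.
    """
    data = [float(v) for v in values]
    n = len(data)
    if n - ddof <= 0:
        raise ValueError("need more than `ddof` data points")
    mean = sum(data) / n
    # Sum of squared deviations from the mean, shared by both estimators.
    sum_sq = sum((x - mean) ** 2 for x in data)
    return sum_sq / (n - ddof)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(variance(sample))          # denominator N
print(variance(sample, ddof=1))  # denominator N - 1
```

With a single keyword the default can be changed later without touching 
the rest of the signature, and the two results differ only by the factor 
N/(N-1).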


-Travis



