[SciPy-user] Inconsistent standard deviation and variance implementation in scipy vs. scipy.stats

Johann Rohwer jr at sun.ac.za
Wed Sep 24 10:05:01 EDT 2008


Hi 

It seems that the default behaviour of std and var differs 
between numpy/scipy and scipy.stats: numpy/scipy uses the 
"biased" formulation (i.e. dividing by N), whereas scipy.stats 
uses the "unbiased" formulation (dividing by N-1) by default. Is 
this intentional? It could be confusing. I realise that the 
"biased" version can be obtained in sp.stats with a kwarg, but 
what is the reason for two different implementations of the same 
function(s)?
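
For concreteness, here is a minimal hand computation with plain 
numpy (not part of scipy.stats) that reproduces the two numbers 
in the session below; for this array the sum of squared 
deviations is 4, so the two conventions give sqrt(4/6) and 
sqrt(4/5):

import numpy as np

a = np.array([1., 2., 3., 2., 3., 1.])
n = a.size
ss = np.sum((a - a.mean()) ** 2)   # sum of squared deviations = 4.0

print(np.sqrt(ss / n))        # "biased": divide by N    -> 0.81649658...
print(np.sqrt(ss / (n - 1)))  # "unbiased": divide by N-1 -> 0.89442719...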

In [30]: a
Out[30]: array([ 1.,  2.,  3.,  2.,  3.,  1.])

In [31]: np.std(a)
Out[31]: 0.81649658092772603

In [32]: sp.std(a)
Out[32]: 0.81649658092772603

In [33]: sp.stats.std(a)
Out[33]: 0.89442719099991586

In [34]: sp.stats.std(a, bias=True)
Out[34]: 0.81649658092772603

Same for np.var vs scipy.stats.var
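
If your numpy is recent enough to support the ddof keyword on 
std/var (an assumption about the installed version), both 
conventions can also be obtained from numpy directly; a small 
sketch:

import numpy as np

a = np.array([1., 2., 3., 2., 3., 1.])

print(np.std(a, ddof=0))  # divide by N,   matches np.std(a) / sp.std(a)
print(np.std(a, ddof=1))  # divide by N-1, matches sp.stats.std(a)
print(np.var(a, ddof=0))  # same pattern for the variance
print(np.var(a, ddof=1))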

Johann


