PEP 450 Adding a statistics module to Python

Sun Aug 11 10:02:21 EDT 2013

In article <mailman.479.1376221844.1251.python-list at python.org>,
 Skip Montanaro <skip at pobox.com> wrote:

> > See the Rationale of PEP 450 for more reasons why “install NumPy” is not
> > a feasible solution for many use cases, and why having ‘statistics’ as a
> > pure-Python, standard-library package is desirable.
> 
> I read that before posting but am not sure I agree. I don't see the
> screaming need for this package.  Why can't it continue to live on
> PyPI, where, once again, it is available as "pip install ..."?

My previous comments on this topic were along the lines of "installing 
numpy is a non-starter if all you need are simple mean/std-dev".  You 
do, however, make a good point here.  Running "pip install statistics" 
is a much lower barrier to entry than getting numpy going, especially if 
statistics is pure python and thus has no dependencies on compiler tool 
chains which may be missing.

Still, I see two classes of function in PEP-450.  Class 1 is the really 
basic stuff:

* mean
* std-dev

Class 2 is the more complicated stuff, like:

* linear regression
* median
* mode
* functions for calculating the probability of random variables
  from the normal, t, chi-squared, and F distributions
* inference on the mean
* anything that differentiates between population and sample

I could see leaving the class 2 stuff in an optional pure-Python module to 
be installed by pip, but for (as the PEP phrases it) the simplest and 
most obvious statistical functions (into which I lump mean and std-dev), 
having them in the standard library would be a big win.
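
To give a sense of how small the class 1 functions are, here is a minimal 
pure-Python sketch of mean and std-dev.  This is not the PEP's reference 
implementation; the names mean and std_dev and the sample flag are 
assumptions made up for this illustration.  The flag is there only because 
std-dev has to pick a population or a sample divisor one way or the other.

```python
import math


def mean(data):
    """Arithmetic mean of a non-empty iterable of numbers."""
    data = list(data)
    if not data:
        raise ValueError("mean requires at least one data point")
    # math.fsum limits the rounding error of a naive float sum.
    return math.fsum(data) / len(data)


def std_dev(data, sample=True):
    """Standard deviation (hypothetical helper, not the PEP's API).

    sample=True divides by n - 1 (sample std-dev);
    sample=False divides by n (population std-dev).
    """
    data = list(data)
    n = len(data)
    if n < (2 if sample else 1):
        raise ValueError("not enough data points")
    m = mean(data)
    # Two-pass sum of squared deviations from the mean.
    squares = math.fsum((x - m) ** 2 for x in data)
    return math.sqrt(squares / (n - 1 if sample else n))


values = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(values))
print(std_dev(values, sample=False))
print(std_dev(values))
```

Even this short version has to deal with empty input, float rounding, and 
the choice of divisor, which is part of the argument for having one 
well-tested copy rather than everybody rolling their own.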