[Python-Dev] PEP 450 adding statistics module

Skip Montanaro skip at pobox.com
Mon Sep 9 12:44:43 CEST 2013


> However, it's common in economic statistics to have a rectangular
> array, and extract both certain rows (tuples of observations on
> variables) and certain columns (variables).  For example you might
> have data on populations of American states from 1900 to 2012, and
> extract the data on New England states from 1946 to 2012 for analysis.

When Steven first brought up this PEP on comp.lang.python, my main concern
was basically, "we have SciPy, why do we need this?" Steven's response, which
I have come to accept, is that there are uses for basic statistics for which
SciPy's stats module would be overkill.

However, once you start slicing your data structure along more than one axis, I
think you will very quickly find that you need numpy arrays for performance
reasons, at which point you might as well go "all the way" and install SciPy. I
don't think slicing along multiple dimensions should be a significant concern
for this package.
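
To make the two-axis slicing from the quoted text concrete, here is a minimal
pure-Python sketch. The table layout, the helper name slice_table, and all the
figures are invented for illustration, and statistics.mean assumes the PEP 450
module roughly as proposed. It works fine at this size; the point above is that
it stops being pleasant (or fast) once the table gets large.

```python
import statistics

# Hypothetical data: one row per state, one column per year (values invented).
years = [1940, 1946, 1950, 1960]
table = {
    "Maine":   [0.85, 0.88, 0.91, 0.97],
    "Vermont": [0.36, 0.37, 0.38, 0.39],
    "Texas":   [6.41, 7.00, 7.71, 9.58],
}

def slice_table(table, years, rows, first_year):
    """Return the chosen rows, keeping only columns with year >= first_year."""
    cols = [i for i, y in enumerate(years) if y >= first_year]
    return {name: [table[name][i] for i in cols] for name in rows}

# Row selection (states) and column selection (years) in one step.
new_england = slice_table(table, years, ["Maine", "Vermont"], 1946)

# Per-state mean over the selected years.
means = {name: statistics.mean(values) for name, values in new_england.items()}
```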

Alternatively, I thought there was discussion a long time ago about getting
numpy's (or even further back, Numeric's?) array type into the core. Python has
an array type which I don't think gets a lot of use (or love). Might it be
worthwhile to make sure the PEP 450 package works with that? Then extend it to
multiple dimensions? Or just bite the bullet and get numpy's array type into
the Python core once and for all?

Sort of Tulip for arrays...
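
On the question of whether the PEP 450 package works with the existing array
type: a quick check along these lines would answer it. This is only a sketch,
assuming the module is importable as statistics and that its functions accept
any iterable of numbers; the sample values are made up.

```python
import statistics
from array import array

# A one-dimensional typed array of C doubles from the standard library.
data = array("d", [2.5, 3.25, 5.5, 11.25, 11.75])

# If the functions only iterate over their input, array.array needs no
# special support.
mean = statistics.mean(data)
median = statistics.median(data)
```

The array type is one-dimensional only, so this says nothing about the
multi-dimensional case.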

Skip

