[Python-ideas] Running average and stdev in the statistics module?

Luca Baldini luca.baldini at pi.infn.it
Sun May 5 05:58:51 EDT 2019


Hi there,
I wonder if the idea of adding to the statistics module a class to 
calculate the running statistics (average and standard deviation) of a 
generic input data stream has ever come up in the past.

The basic idea is to do the necessary book-keeping as the data are fed 
into the accumulator class, and to be able to query the average and 
variance of the sequence at any point in time without having to loop 
over the data again. The obvious way to do that is well known, and 
described, e.g., in Knuth TAOCP vol 2, 3rd edition, page 232. FWIW, it is 
something that through the years I have coded myself a myriad of times 
(e.g., for real-time data processing), and it may be worth considering 
for addition to the standard library.
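
To make the book-keeping concrete, here is a minimal sketch of the kind of 
accumulator I have in mind, assuming the usual one-pass recurrence for the 
mean and the sum of squared deviations (commonly attributed to Welford). 
The class name RunningStats and the push() method are purely illustrative, 
not a proposed API.

```python
import math
import statistics


class RunningStats:
    """Illustrative one-pass accumulator (hypothetical name, not a proposed API)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def push(self, x):
        """Fold one new value into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from both the old and the updated mean.
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance (n - 1 in the denominator)."""
        if self.n < 2:
            raise ValueError("variance requires at least two data points")
        return self._m2 / (self.n - 1)

    @property
    def stdev(self):
        return math.sqrt(self.variance)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
acc = RunningStats()
for value in data:
    acc.push(value)

# The running results can be cross-checked against the existing two-pass functions.
print(acc.mean, statistics.mean(data))
print(acc.variance, statistics.variance(data))
```

The state is just three numbers, so querying the mean or the standard 
deviation at any point costs nothing beyond what push() has already done.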

For completeness, a cursory search on Google brings up this fairly nice 
package
https://pypi.org/project/runstats/
but really, the core algorithm would be trivial to code in a fashion 
that works with Decimal and Fraction objects, so that it could be 
integrated into the statistics module. Should this spur enough interest 
(and assuming that the maintainer(s) of the module are not hostile to 
the idea), I'd like to volunteer to put together a tentative 
implementation.
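
As a rough illustration of the Decimal and Fraction point, the sketch below 
writes the same kind of one-pass update without any float literals, so the 
results stay in whatever numeric type is fed in. The function name 
running_mean_variance is hypothetical, and this is only one possible way to 
arrange it.

```python
from decimal import Decimal
from fractions import Fraction


def running_mean_variance(data):
    """Yield (count, mean, sample variance) after each item.

    The variance is None until at least two values have been seen.
    No float constants are used, so the input type is preserved.
    """
    n = 0
    mean = None
    m2 = None  # running sum of squared deviations from the mean
    for x in data:
        n += 1
        if n == 1:
            mean = x
            m2 = x - x  # a zero of the same type as the input
        else:
            delta = x - mean
            mean = mean + delta / n
            m2 = m2 + delta * (x - mean)
        yield n, mean, (m2 / (n - 1) if n > 1 else None)


# Exact rational arithmetic with Fraction inputs.
fraction_result = list(
    running_mean_variance([Fraction(1, 3), Fraction(1, 2), Fraction(1, 6)])
)[-1]

# Decimal inputs give Decimal outputs.
decimal_result = list(
    running_mean_variance([Decimal("1.5"), Decimal("2.5"), Decimal("3.5")])
)[-1]

print(fraction_result)
print(decimal_result)
```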

[It's my first post on this list, so please be gentle :-)]

Luca

-- 
===============================================================================
Luca Baldini

Universita' di Pisa
and
Istituto Nazionale di Fisica Nucleare - Sezione di Pisa
Largo Bruno Pontecorvo 3, I-56127, Pisa, ITALY.

phone  : +39 050 2214438
fax    : +39 050 2214317
e-mail : luca.baldini at pi.infn.it
icq    : 396247302 (Garrone)
web    : http://www.df.unipi.it/~baldini
mirror : http://www.pi.infn.it/~lbaldini
===============================================================================


