[Python-ideas] Running average and stdev in the statistics module?
Luca Baldini
luca.baldini at pi.infn.it
Sun May 5 05:58:51 EDT 2019
Hi there,
I wonder if the idea of adding to the statistics module a class to
calculate the running statistics (average and standard deviation) of a
generic input data stream has ever come up in the past.
The basic idea is to do the necessary book-keeping as the data are fed
into the accumulator class, so that the average and variance of the
sequence can be queried at any point in time without having to loop over
the data again. The obvious way to do that is well known and described,
e.g., in Knuth, TAOCP vol. 2, 3rd edition, page 232. FWIW, it is
something that through the years I have coded myself a myriad of times
(e.g., for real-time data processing), and it may be worth considering
for addition to the standard library.
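To make the book-keeping concrete, here is a minimal sketch of the
one-pass update usually attributed to Welford, which is the recurrence
Knuth presents. The class name RunningStats and the method name push are
placeholders made up for illustration, not a proposed API, and the
sketch assumes float input and the sample (n - 1) variance.

```python
import math


class RunningStats:
    """Accumulate mean and variance one value at a time (Welford update).

    Hypothetical illustration only; names and interface are assumptions.
    """

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def push(self, x):
        """Fold one new value into the running statistics."""
        self.n += 1
        delta = x - self.mean            # deviation from the old mean
        self.mean += delta / self.n      # updated mean
        self._m2 += delta * (x - self.mean)  # uses old and new mean

    @property
    def variance(self):
        """Sample variance (n - 1 in the denominator)."""
        if self.n < 2:
            raise ValueError("variance requires at least two data points")
        return self._m2 / (self.n - 1)

    @property
    def stdev(self):
        """Sample standard deviation."""
        return math.sqrt(self.variance)


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.push(value)
    # mean and (from the second value on) variance are available here
    # without revisiting earlier values

print(stats.n, stats.mean, stats.variance, stats.stdev)
```

Only three numbers are stored, whatever the length of the stream, and
each update costs a constant number of arithmetic operations.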
For completeness, a cursory search on Google brings up this fairly nice
package
https://pypi.org/project/runstats/
but really, the core algorithm would be trivial to code in a fashion
that works with Decimal and Fraction objects, so that it can be
integrated into the statistics module. Should this spur enough interest
(and assuming that the maintainer(s) of the module are not hostile to
the idea), I'd like to volunteer to put together a tentative
implementation.
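As a rough illustration of the Decimal and Fraction point, the sketch
below seeds the accumulators from the first value instead of from float
zeros, so the arithmetic stays in whatever numeric type the input uses.
The function name running_mean_variance is hypothetical, and this is one
possible arrangement under those assumptions rather than a definitive
implementation.

```python
from decimal import Decimal
from fractions import Fraction


def running_mean_variance(values):
    """Yield (count, mean, sample variance or None) after each value.

    Hypothetical helper: the accumulators take the type of the input,
    so Fraction input gives exact Fraction results.
    """
    count = 0
    mean = m2 = None
    for x in values:
        count += 1
        if count == 1:
            # x - x is a zero of the same numeric type as x
            mean, m2 = x, x - x
            yield count, mean, None   # variance undefined for one point
            continue
        delta = x - mean
        mean = mean + delta / count
        m2 = m2 + delta * (x - mean)
        yield count, mean, m2 / (count - 1)


frac_result = list(running_mean_variance(
    [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3)]))[-1]
dec_result = list(running_mean_variance(
    [Decimal("1.5"), Decimal("2.5"), Decimal("3.5")]))[-1]

print(frac_result)
print(dec_result)
```

With Fraction input every intermediate value is exact, so the final mean
and variance carry no rounding error; with Decimal input the precision
follows the active decimal context.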
[It's my first post on this list, so please be gentle :-)]
Luca
--
===============================================================================
Luca Baldini
Universita' di Pisa
and
Istituto Nazionale di Fisica Nucleare - Sezione di Pisa
Largo Bruno Pontecorvo 3, I-56127, Pisa, ITALY.
phone : +39 050 2214438
fax : +39 050 2214317
e-mail : luca.baldini at pi.infn.it
icq : 396247302 (Garrone)
web : http://www.df.unipi.it/~baldini
mirror : http://www.pi.infn.it/~lbaldini
===============================================================================