[New-bugs-announce] [issue46257] Convert statistics sum of squares to a single pass algorithm

Raymond Hettinger report at bugs.python.org
Tue Jan 4 12:04:05 EST 2022


New submission from Raymond Hettinger <raymond.hettinger at gmail.com>:

The existing code makes two passes, one to compute the mean and another to compute the sum of squared differences from the mean.  A consequence of making two passes is that iterator inputs must be converted to a list before processing.  This throws away the memory saving advantages of iterators. 

The ostensible reason for the two pass code is that the single pass variant is numerically unstable when implemented with floating point accumulators.  However, this code uses fractions throughout, so the accumulation is exact.

Changing to a single pass saves memory, doubles the speed, and simplifies the upstream code in variance(), pvariance(), stdev(), and pstdev().

----------
components: Library (Lib)
messages: 409692
nosy: rhettinger, steven.daprano
priority: normal
severity: normal
status: open
title: Convert statistics sum of squares to a single pass algorithm
type: performance
versions: Python 3.11

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue46257>
_______________________________________


More information about the New-bugs-announce mailing list