[issue35904] Add statistics.fmean(seq)

Mark Dickinson report at bugs.python.org
Wed Feb 6 15:58:33 EST 2019


Mark Dickinson <dickinsm at gmail.com> added the comment:

Double-checking my own assertions: here's an example of a list xs of floats for which `fsum(xs) / len(xs)` is out by more than 1 ulp. (Obtained simply by checking a few lists of random.random() outputs; it's probably possible to construct something more obvious.)

Python 3.7.2 (default, Dec 30 2018, 08:55:50)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import fractions, math>>> xs = [float.fromhex(h) for h in ['0x1.88104e64ffa5cp-3', '0x1.9b793215ddca8p-3', '0x1.754cbf6b09730p-1', '0x1.e2b4ca1df3680p-2', '0x1.91689b66782e1p-1']]
>>> approx_mean = math.fsum(xs) / len(xs)
>>> approx_mean  # in [0.25, 0.5], so 1 ulp is 2**-54
0.47536945341150305
>>> exact_mean = sum(fractions.Fraction(x) for x in xs) / len(xs)
>>> exact_mean
Fraction(10704368466236809, 22517998136852480)
>>> error_in_ulps = abs(exact_mean - fractions.Fraction(approx_mean)) * 2**54
>>> float(error_in_ulps)
1.2

I ran checks on 1000000 such randomly generated lists, and the error never exceeded 1.5 ulps.

Sketch of proof of the 1.5 ulps bound: the fsum result is out by at most 0.5 ulps; the length n of the list is presumably represented exactly (lists with more than 2**53 elements would be infeasible). Division of the fsum result by n keeps the relative error the same, but potentially magnifies the ulps error by two, due to the usual "wobble" between relative error and ulps error, so that gives us up to 1 ulp error. Then the result of the division may need to be rounded again, giving another potential error of up to 0.5 ulps. The bound is strict: we can't actually attain 1.5 ulps, so the result we get can't be more than 1 ulp away from the correctly rounded result.

> the proposed fmean name seems a bit unfortunate in that it's reversing the sense of sum and fsum

I see Josh already made this observation: apologies for the duplicate bikeshedding.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35904>
_______________________________________


More information about the Python-bugs-list mailing list