[issue45876] Improve accuracy of stdev functions in statistics

Fri Nov 26 16:46:19 EST 2021

Mark Dickinson <dickinsm at gmail.com> added the comment:

> All the rounding has already happened at the point where ldexp is called, and the result of the ldexp call is exact.

Sketch of proof:

[Here](https://github.com/python/cpython/blob/4ebde73b8e416eeb1fd5d2ca3283f7ddb534c5b1/Objects/longobject.c#L3929) we have:

    shift = Py_MAX(diff, DBL_MIN_EXP) - DBL_MANT_DIG - 2;

from which (assuming IEEE 754 as usual) shift >= -1076. (DBL_MIN_EXP = -1021, DBL_MANT_DIG = 53)

[Here](https://github.com/python/cpython/blob/4ebde73b8e416eeb1fd5d2ca3283f7ddb534c5b1/Objects/longobject.c#L4008) we round away the last two or three bits of x, after which x is guaranteed to be a multiple of 4:

    x->ob_digit[0] = low & ~(2U*mask-1U);

Then after converting the PyLong x to a double dx with exactly the same value [here](https://github.com/python/cpython/blob/4ebde73b8e416eeb1fd5d2ca3283f7ddb534c5b1/Objects/longobject.c#L4020) we make the ldexp call:

    result = ldexp(dx, (int)shift);

At this point dx is a multiple of 4 and shift >= -1076, so the result of the ldexp scaling is a multiple of 2**-1074, and in the case of a subnormal result, it's already exactly representable.

For the int/int division possibly not being correctly rounded on x87, see [here](https://github.com/python/cpython/blob/4ebde73b8e416eeb1fd5d2ca3283f7ddb534c5b1/Objects/longobject.c#L3889-L3892).

It won't affect _this_ application, but possibly we should fix this anyway. Though the progression of time is already effectively fixing it for us, as x87 becomes less and less relevant.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue45876>
_______________________________________