[Python-checkins] Improve hypot() accuracy with three separate accumulators (GH-22032)
Raymond Hettinger
webhook-mailer at python.org
Wed Sep 2 01:00:58 EDT 2020
https://github.com/python/cpython/commit/5b24d1592a990ad7cf81cd1498d255bad41a0b14
commit: 5b24d1592a990ad7cf81cd1498d255bad41a0b14
branch: master
author: Raymond Hettinger <rhettinger at users.noreply.github.com>
committer: GitHub <noreply at github.com>
date: 2020-09-01T22:00:50-07:00
summary:
Improve hypot() accuracy with three separate accumulators (GH-22032)
files:
M Modules/mathmodule.c
diff --git a/Modules/mathmodule.c b/Modules/mathmodule.c
index 6621951ee97d2..d227a5d15dca2 100644
--- a/Modules/mathmodule.c
+++ b/Modules/mathmodule.c
@@ -2456,7 +2456,7 @@ Given that csum >= 1.0, we have:
Since lo**2 is less than 1/2 ulp(csum), we have csum+lo*lo == csum.
To minimize loss of information during the accumulation of fractional
-values, the lo**2 term has a separate accumulator.
+values, each term has a separate accumulator.
The square root differential correction is needed because a
correctly rounded square root of a correctly rounded sum of
@@ -2487,7 +2487,7 @@ static inline double
vector_norm(Py_ssize_t n, double *vec, double max, int found_nan)
{
const double T27 = 134217729.0; /* ldexp(1.0, 27)+1.0) */
- double x, csum = 1.0, oldcsum, frac = 0.0, frac_lo = 0.0, scale;
+ double x, csum = 1.0, oldcsum, scale, frac=0.0, frac_mid=0.0, frac_lo=0.0;
double t, hi, lo, h;
int max_e;
Py_ssize_t i;
@@ -2529,12 +2529,12 @@ vector_norm(Py_ssize_t n, double *vec, double max, int found_nan)
assert(fabs(csum) >= fabs(x));
oldcsum = csum;
csum += x;
- frac += (oldcsum - csum) + x;
+ frac_mid += (oldcsum - csum) + x;
assert(csum + lo * lo == csum);
frac_lo += lo * lo;
}
- frac += frac_lo;
+ frac += frac_lo + frac_mid;
h = sqrt(csum - 1.0 + frac);
x = h;