Lists: Converting Double to Single
Duncan Booth
duncan.booth at invalid.invalid
Wed Feb 28 04:14:47 EST 2007
Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:
> For floating point, smallest magnitude to largest IS the most
> precise.
>
> Pretend you only have 2 significant digits of precision...
>
> 10 + 0.4 + 0.4 + 0.4 => 10
>
> 0.4 + 0.4 + 0.4 + 10 => 11
and if you try the way I suggested then initial input order doesn't
matter:
(10 + 0.4) = 10, (0.4 + 0.4) = 0.8, (10 + 0.8) = 11
(0.4 + 0.4) = 0.8, (10 + 0.4) = 10, (0.8 + 10) = 11
Pretend you ran the example code I posted. Specifically the bit where
the output is:
all the same
builtin sum 1000000.0000016732
pairwise sum 1000000.00001
It isn't so much the order of the initial values that matters
(especially if the values are all similar). What *really* matters is
keeping the magnitude of the intermediate results as small as possible.
Summing in pairs ensures that you keep the precision as long as
possible. The order of the initial values is less important than this
(although it does still have a minor effect). For a solution which works
with a sequence of unknown length and doesn't require lookahead I don't
think you can do much better than that.
BTW, in case someone thinks my example numbers are a bit too weird you
can show the same problem averaging a list of 10 digit floating point
numbers with exact representations:
v = [9999999999.] * 10000000
print "builtin sum ", (sum(v)/len(v))
print "pairwise sum", (sumpairs(v)/len(v))
gives:
builtin sum 9999999999.91
pairwise sum 9999999999.0
In both cases the total is large enough to go beyond the range of
integers that can be expressed exactly in floating point and as soon as
that happens the builtin sum loses precision on every subsequent
addition. pairwise summation only loses precision on a very few of the
additions.
P.S. Apologies for the sloppy coding in the sumpairs function. It should
of course do the final addition manually otherwise it is going to give
odd results summing lists or strings or anything else where the order
matters. Revised version:
def sumpairs(seq):
tmp = []
for i,v in enumerate(seq):
if i&1:
tmp[-1] += v
i = i + 1
n = i & -i
while n > 2:
t = tmp.pop(-1)
tmp[-1] = tmp[-1] + t
n >>= 1
else:
tmp.append(v)
while len(tmp) > 1:
t = tmp.pop(-1)
tmp[-1] = tmp[-1] + t
return tmp[0]
More information about the Python-list
mailing list