on floating-point numbers

Schachner, Joseph Joseph.Schachner at Teledyne.com
Fri Sep 3 11:55:05 EDT 2021


What's really going on is that you are printing out more digits than you are entitled to.  39.60000000000001 has 16 significant decimal digits; written as an integer that is about 4e16, which takes roughly 55 binary bits to represent, at least as I calculate it.  That is more than a double's mantissa actually holds.

Double precision floating point has 52 bits stored in the mantissa, plus one implicit bit assumed due to normalization.  So 53 bits.
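You can confirm the mantissa width from Python itself (standard library only):

>>> import sys
>>> sys.float_info.mant_dig
53

And 53 * log10(2) is about 15.95, which is where the usual rule of thumb of 15 to 16 significant decimal digits for a double comes from.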

The actual minor difference in the sums that you see is due to rounding: the running total is rounded after every addition, so the order in which the values are absorbed changes the last few bits of the mantissa.  (Strictly, any single addition of two floats is still commutative; it is associativity that fails.)
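A quick way to see this, using the two lists from the quoted session below (float.hex prints the exact mantissa bits):

a = sum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
b = sum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
print(a == b)     # False: the two orders round differently
print(a.hex())    # the exact bits of each sum;
print(b.hex())    # the two strings differ only in the last digit or so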

I recommend that you print out double precision values to at most 14 significant digits.  Then you will never see this kind of issue.  If you don't like that suggestion, you can create your own floating point representation using a Python integer as the mantissa, so it can grow as large as you have memory to represent the value, plus a sign and an exponent.  It would be slow, but it could have much more accuracy (if implemented to preserve accuracy).  Both ideas are sketched below.
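With 14 significant digits, both orderings from your session print identically:

>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> format(sum(ls), '.14g')
'39.6'
>>> format(sum(ls[1:] + ls[:1]), '.14g')
'39.6'

And here is a minimal sketch of the integer-mantissa idea.  This is my own toy illustration, not a library; I use a base-10 exponent so that your values are exact, and the sign rides along in the Python int:

def int_mantissa_add(a, b):
    """Add two (mantissa, exponent) pairs exactly: value = mantissa * 10**exponent."""
    (ma, ea), (mb, eb) = a, b
    e = min(ea, eb)                            # align both to the smaller exponent
    return (ma * 10 ** (ea - e) + mb * 10 ** (eb - e), e)

total = (0, 0)
for m in (723, 841, 615, 231, 773, 777):       # your values, in hundredths
    total = int_mantissa_add(total, (m, -2))
print(total)                                   # (3960, -2), i.e. exactly 39.60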

By the way, this is why banks and other financial institutions use BCD (binary coded decimal) or some other form of decimal arithmetic.  They cannot tolerate sums that are off by a fraction of a cent.
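Python's closest built-in analogue is the decimal module (decimal floating point rather than literal BCD, but with the same motivation).  With it, the sum from the quoted session comes out exact, in any order:

>>> from decimal import Decimal
>>> sum([Decimal("7.23"), Decimal("8.41"), Decimal("6.15"),
...      Decimal("2.31"), Decimal("7.73"), Decimal("7.77")])
Decimal('39.60')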

I should also point out another float issue: subtractive cancellation.  Try 1e14 + 0.1 - 1e14.  The result clearly should be 0.1, but it won't be.  That's because 0.1 cannot be accurately represented in binary, and next to 1e14 it survives only in the bottom few bits of the mantissa.  I just tried it:

>>> 1e14 + 0.1 - 1e14
0.09375

This is not a Python issue; it is a well known issue when using binary floating point.  So, when you sum a large array of data, to avoid these issues, you could either
1) sort the data smallest to largest first, so the small values get a chance to accumulate before they meet the large ones; this may be helpful, but maybe not.
2) create multiple sums, each over a few of the values; in the next layer, sum a few of those sums; at the top layer, sum the sums of sums to get the final total (a sketch follows below).  This is much more likely to be accurate than adding up all the values in one running total, where the last value added (which could be relatively small) meets an already large sum.
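Here is a minimal sketch of idea 2), often called pairwise summation (the helper name is my own, not a library function):

def pairwise_sum(xs):
    """Sum by halves: partial sums of a few values, then sums of the sums."""
    if len(xs) <= 2:
        return sum(xs)
    mid = len(xs) // 2
    return pairwise_sum(xs[:mid]) + pairwise_sum(xs[mid:])

For the record, the standard library's math.fsum already does this job even better: it tracks every rounding error along the way and returns the correctly rounded sum.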

--- Joseph S.

-----Original Message-----
From: Hope Rouselle <hrouselle at jevedi.com> 
Sent: Thursday, September 2, 2021 9:51 AM
To: python-list at python.org
Subject: on floating-point numbers

Just sharing a case of floating-point numbers.  Nothing needs to be solved or figured out; I'm just bringing up conversation.

(*) An introduction to me

I don't understand floating-point numbers from the inside out, but I do know how to work with base 2 and scientific notation.  So the idea of expressing a number as 

  mantissa * base^{power}

is not foreign to me.  (Perhaps that helps you instruct me on what's going on here.)
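For instance, with base 2 and the mantissa normalized to [0.5, 1), the first number below decomposes as

  7.23 = 0.90375 * 2^{3}

(before rounding to 53 binary digits, anyway).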

(*) A presentation of the behavior

>>> import sys
>>> sys.version
'3.8.10 (tags/v3.8.10:3d8993a, May  3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'

>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
39.599999999999994

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
39.60000000000001

All I did was take the first number, 7.23, and move it to the last position in the list.  (So we seem to have a violation of the commutativity of addition.)

Let me try to reduce the example.  It's not so easy.  Although I can produce the discrepancy by moving just a single number in the list, I also see that 7.23 commutes with every other individual number in the list.

(*) My request

I would just like to get some clarity.  I guess I need to translate all these numbers into base 2 and perform the addition myself to see what's going on?
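(Perhaps something like math.frexp is a way to inspect the base-2 decomposition without converting by hand?  For instance:

>>> import math
>>> math.frexp(7.23)
(0.90375, 3)
>>> 0.90375 * 2 ** 3
7.23

though I gather the stored mantissa is really a 53-bit binary fraction, so 0.90375 is only its printed form.)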

