on floating-point numbers

Sat Sep 4 09:40:35 EDT 2021

Chris Angelico <rosuav at gmail.com> writes:

> On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle at jevedi.com> wrote:
>>
>> Hope Rouselle <hrouselle at jevedi.com> writes:
>>
>> > Just sharing a case of floating-point numbers.  Nothing needed to be
>> > solved or to be figured out.  Just bringing up conversation.
>> >
>> > (*) An introduction to me
>> >
>> > I don't understand floating-point numbers from the inside out, but I do
>> > know how to work with base 2 and scientific notation.  So the idea of
>> > expressing a number as
>> >
>> >   mantissa * base^{power}
>> >
>> > is not foreign to me. (If that helps you to perhaps instruct me on
>> > what's going on here.)
>> >
>> > (*) A presentation of the behavior
>> >
>> > >>> import sys
>> > >>> sys.version
>> > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
>> > bit (AMD64)]'
>> >
>> > >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>> > >>> sum(ls)
>> > 39.599999999999994
>> >
>> > >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>> > >>> sum(ls)
>> > 39.60000000000001
>> >
>> > All I did was to take the first number, 7.23, and move it to the last
>> > position in the list.  (So the result depends on the order in which the
>> > numbers are added: floating-point addition is not associative.)
>>
>> Suppose these numbers are prices in dollars, never going beyond cents.
>> Would it be safe to multiply each one of them by 100 and therefore work
>> with cents only?  For instance
>
> Yes and no. It absolutely *is* safe to always work with cents, but to
> do that, you have to be consistent: ALWAYS work with cents, never with
> floating point dollars.
>
> (Or whatever other unit you choose to use. Most currencies have a
> smallest-normally-used-unit, with other currency units (where present)
> being whole number multiples of that minimal unit. Only in forex do
> you need to concern yourself with fractional cents or fractional yen.)
>
> But multiplying a set of floats by 100 won't necessarily solve your
> problem; you may have already fallen victim to the flaw of assuming
> that the numbers are represented accurately.

Hang on a second.  I see it's always safe to work with cents, but I'm
only confident saying that when one gives me cents to start with.  In
other words, if one gives me integers from the start.  (Because then, of
course, I don't even have floats to worry about.)  If I'm given 1.17,
say, I am not confident that I could turn this number into 117 by
multiplying it by 100.  And that was the question.  Can I always
multiply such IEEE 754 dollar amounts by 100?

Considering your last paragraph above, I should say: if one gives me an
accurate floating-point representation, can I assume a multiplication of
it by 100 remains accurately representable in IEEE 754?
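
A minimal sketch of that question, using only the standard library.  The
helper names cents_truncated and cents_rounded are made up for
illustration.  It checks whether the float 1.17 really is 117/100, and
which cent amounts below ten dollars survive multiply-then-truncate
versus multiply-then-round.

```python
from fractions import Fraction


def cents_truncated(dollars: float) -> int:
    """Multiply by 100 and truncate toward zero (the int(x*100) approach)."""
    return int(dollars * 100)


def cents_rounded(dollars: float) -> int:
    """Multiply by 100 and round to the nearest integer."""
    return round(dollars * 100)


# Fraction(float) gives the exact value stored in the float.  117/100 is
# not a binary fraction, so the stored value cannot be exactly 117/100.
is_exact = Fraction(1.17) == Fraction(117, 100)

# Cent amounts from $0.00 to $9.99 that truncation gets wrong.
bad = [c for c in range(1000) if cents_truncated(c / 100) != c]

# Rounding recovers every amount in this small range.
all_rounded_ok = all(cents_rounded(c / 100) == c for c in range(1000))

print("1.17 stored exactly?", is_exact)
print("amounts broken by truncation:", len(bad))
print("rounding recovers all of them?", all_rounded_ok)
```

Rounding only hides the representation error for amounts this small; it
does not make the float input exact.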

>> --8<---------------cut here---------------start------------->8---
>> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>> >>> sum(map(lambda x: int(x*100), ls)) / 100
>> 39.6
>>
>> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>> >>> sum(map(lambda x: int(x*100), ls)) / 100
>> 39.6
>> --8<---------------cut here---------------end--------------->8---
>>
>> Or is multiplication by 100 not quite ``safe'' to do with floating-point
>> numbers either?  (It worked in this case.)
>
> You're multiplying and then truncating, which risks a round-down
> error. Try adding a half onto them first:
>
> int(x * 100 + 0.5)
>
> But that's still not a perfect guarantee. Far safer would be to
> consider monetary values to be a different type of value, not just a
> raw number. For instance, the value $7.23 could be stored internally
> as the integer 723, but you also know that it's a value in USD, not a
> simple scalar. It makes perfect sense to add USD+USD, it makes perfect
> sense to multiply USD*scalar, but it doesn't make sense to multiply
> USD*USD.

Because of the units?  That would be USD squared?  (Nice analysis.)
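
One way to picture the units argument is a small value type that stores
integer cents.  This is only a sketch: the class name USD and its
behavior are assumptions made for illustration, not something from the
thread or from any library.  It allows USD+USD and USD*integer, and
rejects USD*USD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class USD:
    """Hypothetical money type: an amount in US dollars, stored as cents."""
    cents: int

    def __add__(self, other):
        # USD + USD is meaningful; USD + plain number is not.
        if not isinstance(other, USD):
            return NotImplemented
        return USD(self.cents + other.cents)

    def __mul__(self, factor):
        # Only integer scalars are accepted in this sketch; a fractional
        # factor would need an explicit rounding rule.
        if isinstance(factor, int) and not isinstance(factor, bool):
            return USD(self.cents * factor)
        return NotImplemented

    __rmul__ = __mul__

    def __str__(self):
        sign = "-" if self.cents < 0 else ""
        dollars, cents = divmod(abs(self.cents), 100)
        return f"{sign}${dollars}.{cents:02d}"


total = USD(723) + USD(841)
tripled = 3 * USD(723)
print(total, tripled)

try:
    USD(723) * USD(841)  # "USD squared" has no meaning here
except TypeError as exc:
    print("rejected:", exc)
```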

>> I suppose that if I multiply it by a power of two, that would be an
>> operation that I can be sure will not bring about any precision loss
>> with floating-point numbers.  Do you agree?
>
> Assuming you're nowhere near 2**53, yes, that would be safe. But so
> would multiplying by a power of five. The problem isn't precision loss
> from the multiplication - the problem is that your input numbers
> aren't what you think they are. That number 7.23, for instance, is
> really....

Hm, I think I see what you're saying.  You're saying multiplication and
division in IEEE 754 are perfectly safe --- so long as the numbers you
start with are accurately representable in IEEE 754 and assuming no
overflow or underflow would occur.  (Addition and subtraction are not
safe.)
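
A hedged check of that summary, as far as the standard library lets one
test it.  Fraction(x) exposes the exact value stored in a float, so one
can see which operations round.  In this sketch scaling by a power of
two is exact, but a general product such as 0.1 * 3 is rounded, so
"perfectly safe" seems to hold only for powers of two.  The helper
names exact and naive_sum are made up for illustration.

```python
import math
from fractions import Fraction
from functools import reduce
from operator import add


def exact(x: float) -> Fraction:
    """Return the exact rational value stored in the float x."""
    return Fraction(x)


def naive_sum(values):
    """Plain left-to-right float addition."""
    return reduce(add, values)


# Scaling by a power of two only changes the exponent, so nothing is lost.
power_of_two_exact = exact(7.23 * 8) == exact(7.23) * 8

# A general product can need more than 53 significant bits and is rounded.
general_product_exact = exact(0.1 * 3) == exact(0.1) * 3

# Addition commutes, but the grouping changes where rounding happens.
commutes = (0.1 + 0.2) == (0.2 + 0.1)
associates = ((0.1 + 0.2) + 0.3) == (0.1 + (0.2 + 0.3))

first = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
rotated = first[1:] + first[:1]

# math.fsum returns the correctly rounded sum of the stored values, so it
# does not depend on the order of the list.
order_independent = math.fsum(first) == math.fsum(rotated)

print(power_of_two_exact, general_product_exact)
print(commutes, associates)
print(naive_sum(first), naive_sum(rotated), order_independent)
```

Note that math.fsum fixes the order dependence only; the inputs are still
the nearest floats to 7.23 and friends, not the decimal values themselves.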

> >>> 7.23.as_integer_ratio()
> (2035064081618043, 281474976710656)
>
> ... the rational number 2035064081618043 / 281474976710656, which is
> very close to 7.23, but not exactly so. (The numerator would have to
> be ...8042.88 to be exactly correct.) There is nothing you can do at
> this point to regain the precision, although a bit of multiplication
> and rounding can cheat it and make it appear as if you did.
>
> Floating point is a very useful approximation to real numbers, but
> real numbers aren't the best way to represent financial data. Integers
> are.

I'm totally persuaded.  Thanks.
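
For the record, a minimal sketch of the integers-only approach, assuming
the prices arrive as text (from a file or a form, say) so that no float
is ever created.  The function names parse_cents and format_cents are
hypothetical.

```python
from decimal import Decimal


def parse_cents(text: str) -> int:
    """Convert a decimal string such as '7.23' to an integer number of cents."""
    amount = Decimal(text) * 100
    if amount != amount.to_integral_value():
        raise ValueError(f"{text!r} has a fraction of a cent")
    return int(amount)


def format_cents(cents: int) -> str:
    """Format a non-negative number of cents as dollars with two decimals."""
    return f"{cents // 100}.{cents % 100:02d}"


prices = ["7.23", "8.41", "6.15", "2.31", "7.73", "7.77"]
rotated = prices[1:] + prices[:1]

total = sum(parse_cents(p) for p in prices)
total_rotated = sum(parse_cents(p) for p in rotated)

# Integer addition is exact, so the order of the list no longer matters.
print(format_cents(total), format_cents(total_rotated))
```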