python 3.44 float addition bug?

Steven D'Aprano steve at pearwood.info
Thu Jun 26 05:15:25 EDT 2014


On Thu, 26 Jun 2014 13:39:23 +1000, Ben Finney wrote:

> Steven D'Aprano <steve at pearwood.info> writes:
> 
>> On Wed, 25 Jun 2014 14:12:31 -0700, Maciej Dziardziel wrote:
>>
>> > Floating points values use finite amount of memory, and cannot
>> > accurately represent infinite amount of numbers, they are only
>> > approximations. This is limitation of float type and applies to any
>> > languages that uses types supported directly by cpu. To deal with it
>> > you can either use decimal.Decimal type that operates using decimal
>> > system and saves you from such surprises
>>
>> That's a myth. decimal.Decimal *is* a floating point value
> 
> That's misleading: Decimal uses *a* floating-point representation, but
> not the one commonly referred to. That is, Decimal does not use IEEE-754
> floating point.

You're technically correct, but only by accident.

IEEE-754 covers both binary and decimal floating point numbers:

http://en.wikipedia.org/wiki/IEEE_floating_point


but Python's decimal module is based on IEEE-854, not 754.

http://en.wikipedia.org/wiki/IEEE_854-1987

So you're right on a technicality, but wrong in the sense of knowing what 
you're talking about *wink*


>> and is subject to *exactly* the same surprises as binary floats,
> 
> Since those “surprises” are the ones inherent to *decimal*, not binary,
> floating point, I'd say it's also misleading to refer to them as
> “exactly the same surprises”. They're barely surprises at all, to
> someone raised on decimal notation.

Not at all. They are surprises to people who are used to *mathematics*, 
fractions, rational numbers, the real numbers, etc. It is surprising that 
the rational number "one third" added together three times should fail to 
equal one. Ironically, binary float gets this one right:

py> 1/3 + 1/3 + 1/3 == 1
True
py> from decimal import Decimal
py> Decimal(1)/3 + Decimal(1)/3 + Decimal(1)/3 == 1
False


but for other rationals, that is not necessarily the case.
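
For instance, one tenth added together three times (assuming the usual 
IEEE-754 C doubles that CPython uses):

py> 1/10 + 1/10 + 1/10 == 3/10
False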

It is surprising when x*(y+z) fails to equal x*y + x*z, but that can 
occur with both binary floats and Decimals.
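
For instance, with binary floats (the exact digits shown assume the 
usual IEEE-754 doubles with round-to-nearest):

py> 3.0*(0.1 + 0.3)
1.2000000000000002
py> 3.0*0.1 + 3.0*0.3
1.2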

It is surprising when (x + y) + z fails to equal x + (y + z), but that 
can occur with both binary floats and Decimals.
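
The classic binary float demonstration:

py> (0.1 + 0.2) + 0.3
0.6000000000000001
py> 0.1 + (0.2 + 0.3)
0.6

and much the same with Decimal, at the default 28 digits, as soon as 
the magnitudes differ enough:

py> from decimal import Decimal
py> a, b, c = Decimal(1), Decimal('1e28'), Decimal('-1e28')
py> (a + b) + c == a + (b + c)
False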

It is surprising when x != 0 and y != 0 but x*y == 0, but that too can 
occur with both binary floats and Decimals. 
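
With binary floats, underflow does it:

py> x = 1e-200
py> x != 0 and x*x == 0
True

Decimal's default context has an enormously larger exponent range, so 
you have to push much harder, but the same thing happens in the end:

py> from decimal import Decimal
py> y = Decimal('1e-600000')
py> y != 0 and y*y == 0
True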

And likewise for most other properties of the rationals and reals, which 
people learn in school, or come to intuitively expect. People are 
surprised when floating-point arithmetic fails to obey the rules of 
mathematical arithmetic.

If anyone is aware of a category of surprise which binary floats are 
prone to, but Decimal floats are not, apart from the decimal-
representation issue I've already mentioned, I'd love to hear of it. But 
I doubt such a thing exists.

Decimal in the Python standard library has another advantage: it supports 
user-configurable precision. But that doesn't avoid any category of 
surprise, it just makes the surprises less frequent.
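
For example (with the default rounding mode):

py> from decimal import Decimal, getcontext
py> getcontext().prec = 6
py> Decimal(1) / 7
Decimal('0.142857')
py> getcontext().prec = 12
py> Decimal(1) / 7
Decimal('0.142857142857')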


> This makes the Decimal functionality starkly different from the built-in
> ‘float’ type, and it *does* save you from the rather-more-surprising
> behaviour of the ‘float’ type. This is not mythical.

It simply is not true that Decimal avoids the floating point issues that 
"What Every Computer Scientist Needs To Know About Floating Point" warns 
about:

http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

It *cannot* avoid them, because Decimal is itself a floating point 
format, it is not an infinite precision number type like 
fractions.Fraction.

Since Decimal cannot avoid these issues, all we can do is push the 
surprises around, and hope to have fewer of them, or shift them to parts 
of the calculation we don't care about. (Good luck with that.) Decimal, 
by default, uses 28 decimal digits of precision, about 11 or 12 more 
digits than Python floats are able to provide. So right away, by shifting 
to Decimal you gain precision and hence might expect fewer surprises, all 
else being equal.
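
Compare the digits you get from each (float output assumes the usual 
C doubles, and a fresh default Decimal context):

py> 1/3
0.3333333333333333
py> from decimal import Decimal
py> Decimal(1) / 3
Decimal('0.3333333333333333333333333333')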

But all else isn't equal. The larger the base, the larger the "wobble". 
See Goldberg above for the definition of wobble, but it's a bad thing. 
Binary floats have the smallest wobble, which is to their advantage.

If you stick to trivial calculations using nothing but trivially "neat" 
decimal numbers, like 0.1, you may never notice that Decimal is subject 
to the same problems as float (only worse, in some ways -- Decimal 
calculations can fail in some spectacularly horrid ways that binary 
floats cannot). But as soon as you start doing arbitrary calculations, 
particularly if they involve divisions and square roots, things are no 
longer as neat and tidy.
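
For instance, a single square root already introduces an error that 
squaring cannot undo, for either type (Decimal shown at its default 
28 digit precision):

py> import math
py> math.sqrt(2)**2 == 2
False
py> from decimal import Decimal
py> Decimal(2).sqrt()**2 == 2
False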

Here's an error that *cannot* occur with binary floats: the average of 
two numbers x and y is not guaranteed to lie between x and y!


py> from decimal import *
py> getcontext().prec = 3
py> x = Decimal('0.516')
py> y = Decimal('0.518')
py> (x + y) / 2
Decimal('0.515')


Ouch! (At three significant digits, x + y = 1.034 rounds to 1.03, and 
1.03/2 = 0.515, which is less than both x and y.)



-- 
Steven


