Floating point equality [was Re: What exactly is "exact" (was Clean Singleton Docstrings)]

Marko Rauhamaa marko at pacujo.net
Wed Jul 20 09:54:55 EDT 2016


Steven D'Aprano <steve at pearwood.info>:

> I am not a good computer scientist. But Bruce Dawson *is* a good
> computer scientist:
>
> https://randomascii.wordpress.com/2014/01/27/theres-only-four-billion-floats-so-test-them-all/
>
> Quote:
>
>     Conventional wisdom says that you should never compare two floats
>     for equality – you should always use an epsilon. Conventional
>     wisdom is wrong.
>
>     I’ve written in great detail about how to compare floating-point
>     values using an epsilon, but there are times when it is just not
>     appropriate. Sometimes there really is an answer that is correct,
>     and in those cases anything less than perfection is just sloppy.
>
>     So yes, I’m proudly comparing floats to see if they are equal.

The point of view in the linked article is very different from that of
most application programming that makes use of floating-point numbers.

Yes, if what you are testing or developing is a numeric or mathematical
package, you should test its numeric/mathematical soundness down to the
last bit. However, in virtually any other context you have barely any
use for floating-point equality comparisons, because:

 1. Floating-point numbers are an approximation of *real numbers*. Two
    independently measured real numbers are never equal because under
    any continuous probability distribution, the probability of any
    given real number is zero. Only continuous ranges can have nonzero
    probabilities.

 2. Floating-point numbers are *imperfect approximations* of real
    numbers. Even when real numbers are derived exactly, floating-point
    operations may introduce "lossy compression artifacts" that have to
    be compensated for in application programs.

Exactly what you have to do to compensate for these challenges depends
on the application and is very easy to get wrong. However, if an
application programmer is using == to compare two floating-point
values, it is almost certainly a mistake.
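
For illustration, a quick interpreter session shows the "artifact" of
point 2 above and the usual remedy; math.isclose() has been in the
standard library since Python 3.5, and the tolerance below is only a
placeholder, the right one depends on the application:

   >>> 0.1 + 0.2 == 0.3            # the "lossy compression artifact"
   False
   >>> 0.1 + 0.2
   0.30000000000000004
   >>> import math
   >>> math.isclose(0.1 + 0.2, 0.3, rel_tol=1e-9)
   True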

> Or you might be using a language like Javascript, which intentionally
> has only floats for numbers. That's okay, you can still perform exact
> integer arithmetic, so long as you stay within the bounds of ±2**53.
>
> Not even in Javascript do you need to write something like this:
>
> x = 0.0
> for i in range(20):
>     x += 1.0
>
> assert abs(x - 20.0) <= 1e-16

Correct, because Javascript makes an exactness guarantee for its
integers (I imagine).
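
The same holds for Python on IEEE-754 hardware, where every integer of
magnitude up to 2**53 is exactly representable. A quick sketch
(assuming the usual binary64 floats, which the language does not
promise) passes the first assert and, tellingly, also the second:

   x = 0.0
   for i in range(20):
       x += 1.0
   assert x == 20.0   # exact: small integers are representable in binary64

   y = 0.0
   for i in range(10):
       y += 0.1
   assert y != 1.0    # inexact: 0.1 has no finite base-2 expansion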

In Python, I think it would usually be bad style to rely even on:

   1.0 + 1.0 == 2.0

It is very difficult to find a justification for that assumption in
Python's specifications. What we have is:

   Floating-point numbers are represented in computer hardware as base 2
   (binary) fractions.
   <URL: https://docs.python.org/3/tutorial/floatingpoint.html>

   almost all platforms map Python floats to IEEE-754 “double precision”
   <URL: https://docs.python.org/3/tutorial/floatingpoint.html#representation-error>

   numbers.Real (float)
     These represent machine-level double precision floating point
     numbers. You are at the mercy of the underlying machine
     architecture (and C or Java implementation) for the accepted range
     and handling of overflow.
   <URL: https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy>


I believe a Python implementation that would have:

   1.0 + 1.0 != 2.0

would not be in violation of Python's data model. In fact, even:

   1.0 != 1.0

might be totally conformant. For example, we could have a new
underlying real-number technology that stored the value in an *analog*
format (say, an ultra-precise voltage level) and performed the
calculations using some fast analog circuitry.


Marko
