[Python-ideas] Way to check for floating point "closeness"?

Ron Adam ron3200 at gmail.com
Thu Jan 15 18:42:18 CET 2015



On 01/15/2015 01:29 AM, Steven D'Aprano wrote:
> On Wed, Jan 14, 2015 at 08:13:42PM -0600, Ron Adam wrote:


> The question of which to use as the denominator is more subtle. Like
> you, I used to think that you should choose ahead of time which value
> was expected and which was actual, and divide by the actual. Or should
> that be the expected? I could never decide which I wanted: error
> relative to the expected, or error relative to the actual? And then I
> could never remember which order the two arguments went.
>
> Finally I read Bruce Dawson (I've already linked to his blog three or
> four times) and realised that he is correct and I was wrong. Error
> calculations should be symmetrical, so that
>
>      error(a, b) == error(b, a)
>
> regardless of whether you have absolute or relative error. Furthermore,
> for safety you normally want the larger estimate of error, not the
> smaller: given the choice between
>
>      (abs(a - b))/abs(a)
>
> versus
>
>      (abs(a - b))/abs(b)
>
>
> you want the *larger* error estimate, which means the *smaller*
> denominator. That's the conservative way of doing it.
>
> A concrete example: given a=5 and b=7, we have:
>
> absolute error = 2
> relative error (calculated relative to a) = 0.4
> relative error (calculated relative to b) = 0.286
>
> That is, b is off by 40% relative to a; or a is off by 28.6% relative to
> b. Or another way to put it, given that a is the "true" value, b is 40%
> too big; or if you prefer, 28.6% of b is in error.
>
> Whew! Percentages are hard! *wink*

Ewww the P word.  :)
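
To make that concrete, here is a minimal sketch of the symmetric,
conservative version being described (the name rel_err is mine, just for
illustration):

     def rel_err(a, b):
         """Symmetric relative error: divide by the smaller magnitude,
         giving the larger (more conservative) error estimate.  The
         zero-denominator case Steven mentions is not handled here."""
         return abs(a - b) / min(abs(a), abs(b))

With the example above, rel_err(5, 7) and rel_err(7, 5) both return 0.4.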


> The conservative, "safe" way to handle this is to just treat the error
> function as symmetrical and always report the larger of the two relative
> errors (excluding the case where the denominator is 0, in which case
> the relative error is either 100% or it doesn't exist). Worst case, you
> may reject some values which you should accept, but you will never
> accept any values that you should reject.

What if we are not concerned with where the two points are relative to
zero?  Or what if the numbers straddle zero?

Consider two points that are a constant distance apart, but moving
relative to zero.  Their closeness doesn't change, but the relative error
with respect to each other (and zero) does change.
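
A quick illustration with the rel_err sketch above; each pair below is
exactly 1.0 apart:

     >>> rel_err(1000.0, 1001.0)
     0.001
     >>> rel_err(1.0, 2.0)
     1.0
     >>> rel_err(-0.5, 0.5)      # straddling zero
     2.0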

There is an implicit assumption that the number system used and the
origin the numbers are measured from are chosen so that they relate to
each other in some expected way.

Whenever you supply all the numbers, as in a test, it's not a problem:
you just give good numbers.


>> Note that you would never compare to an expected value of zero.
>
> You *cannot* compare to an expected value of zero, but you certainly can
> be in a situation where you would like to: math.sin(math.pi) should
> return 0.0, but doesn't, it returns 1.2246063538223773e-16 instead. What
> is the relative error of the sin function at x = math.pi?
>
>
>>      relerr(a - b, expected_feet) < tolerance   # relative feet from b
>>      relerr(a - 0, expected_feet) < tolerance   # relative feet from zero
>>      relerr(a - b, ulp)    # percentage of ulp's
>
> I don't understand what you think these three examples are showing.

They show error as a percentage of an expected distance.

   Error of two points compared to a specific distance:

     >>> relerr(5 - -5, 10)
     0.0
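
(relerr above is assumed to be something like the following, dividing by
a supplied expected value rather than by either input:

     def relerr(actual, expected):
         """Error of actual relative to a supplied expected value."""
         return abs(actual - expected) / abs(expected)

so relerr(5 - -5, 10) is abs(10 - 10) / 10 == 0.0.  Note it raises
ZeroDivisionError when the expected value is zero, which is exactly the
math.sin(math.pi) problem above.)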

I think unless you use the decimal module, the ulp example will either be
zero or some large multiple of one ulp.


> Take a look at the statistics test suite.

I definitely will. :-)

> I'll be the first to admit
> that the error tolerances are plucked from thin air, based on what I
> think are "close enough", but they show how such a function might work:
>
> * you provide two values, and at least one of an absolute error
>    tolerance and a relative error;
> * if the error is less than the error(s) you provided, the test
>    passes, otherwise it fails;
> * NANs and INFs are handled appropriately.
>
>
>>       is_close(218.345, 220, 1, .05)   # OHMs
>>       is_close(a, b, ULP, 2)     # ULPs
>>       is_close(a, b, AU, .001)   # astronomical units
>>
>>
>> I don't see anyway to generalise those with just a function.
>
> Generalise in what way?

I meant a function that would work in many places without giving some
sort of size and tolerance hints.

Given two floating point numbers and nothing else, I don't think you can
tell whether they represent something that is close without assuming some
sort of context.  At best, you have to assume that the distance from zero
and the numbers used were chosen to give a meaningful return value.
While that can sometimes work, I don't think you can depend on it.
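
For what it's worth, here is a rough sketch of the kind of combined test
described above (the signature and defaults are mine, not from the
statistics test suite):

     import math

     def is_close(a, b, rel_tol=1e-9, abs_tol=0.0):
         """True if a and b agree to within rel_tol (relative) or
         abs_tol (absolute).  Uses the conservative smaller-denominator
         scaling discussed earlier in the thread."""
         if math.isnan(a) or math.isnan(b):
             return False                 # NANs are never close
         if a == b:
             return True                  # equal INFs and exact matches
         if math.isinf(a) or math.isinf(b):
             return False                 # one INF and anything else
         diff = abs(a - b)
         return diff <= abs_tol or diff <= rel_tol * min(abs(a), abs(b))

With an absolute tolerance it also copes with an expected value of zero:
is_close(math.sin(math.pi), 0.0, abs_tol=1e-12) returns True.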


>> By using objects we can do a bit more.  I seem to recall coming across
>> measurement objects some place.  They keep a bit more context with them.
>
> A full system of <value + unit> arithmetic is a *much* bigger problem
> than just calculating error estimates correctly, and should be a
> third-party library before even considering it for the std lib.

Yes, I agree.  There are a few of them out there already.

Cheers,
    Ron



