[Python-ideas] Floating point "closeness" Proposal Outline

Chris Barker chris.barker at noaa.gov
Mon Jan 19 07:32:14 CET 2015


OK folks,

There has been a lot of chatter about this, which I think has served to
provide some clarity, at least to me. However, I'm concerned that the
upshot, at least for folks not deep into the discussion, will be: clearly
there are too many use-case specific details to put any one thing in the
std lib. But I still think we can provide something that is useful for most
use-cases, and would like to propose what that is, and what the decision
points are:

A function for the math module, called something like "is_close",
"approx_equal", etc. It will compute a relative tolerance, with a default
maybe around 1e-12, with the user able to specify the tolerance they want.

Optionally, the user can specify a "minimum absolute tolerance". It will
default to zero, but can be set so that comparisons to zero can be handled
gracefully.

The relative tolerance will be computed from the smaller of the two input
values, so as to get symmetry: is_close(a,b) == is_close(b,a). (This is
the Boost "strong" definition, and what is used by Steven D'Aprano's code
in the statistics test module.)
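
A rough sketch of the symmetric test described above. This is not the code
from the gist: the names is_close, rel_tol and abs_tol are placeholders, and
the way inf and NaN are treated is only one plausible reading of "properly".

```python
import math


def is_close(a, b, rel_tol=1e-12, abs_tol=0.0):
    """Symmetric relative comparison (hypothetical sketch, not the gist)."""
    if rel_tol < 0 or abs_tol < 0:
        raise ValueError("tolerances must be non-negative")
    if a == b:
        # Exact equality, which also covers inf == inf and -inf == -inf.
        return True
    if math.isinf(a) or math.isinf(b):
        # Assumption: an infinity is close only to the same infinity.
        return False
    diff = abs(a - b)
    # Scale by the smaller magnitude so that argument order does not matter.
    # If either input is NaN, diff is NaN and both comparisons are False.
    return diff <= rel_tol * min(abs(a), abs(b)) or diff <= abs_tol


is_close(1.0, 1.0 + 1e-13)          # True: well inside the default tolerance
is_close(1e-300, 0.0)               # False: nothing is relatively close to zero
is_close(1e-9, 0.0, abs_tol=1e-8)   # True: the absolute tolerance takes over
```

With abs_tol left at zero, any comparison against exactly 0.0 fails unless
the other value is also zero, which is why the optional absolute tolerance
is there.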

Alternatively, the relative error could be computed against a particular
one of the input values (the second one?). This would be asymmetric, but it
would make it clearer exactly how "relative" is defined, and be closer to
what people may expect when using it as an "actual vs expected" test, where
"expected" would be the scaling value. If the tolerance is small, it makes
very little difference anyway, so I'm happy with whatever consensus moves us
to. Note that if we go this way, then the parameter names should make it at
least a little more clear -- maybe "actual" and "expected", rather than x
and y or a and b or... and the function name should be something like
is_close_to, rather than just is_close.
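
For comparison, a sketch of the asymmetric variant, where the tolerance is
scaled by the "expected" argument only. The signature is hypothetical, and
inf and NaN are not treated specially here beyond the equality shortcut.

```python
def is_close_to(actual, expected, rel_tol=1e-12, abs_tol=0.0):
    """Asymmetric test: the tolerance is relative to `expected` only."""
    if actual == expected:
        return True
    diff = abs(actual - expected)
    return diff <= rel_tol * abs(expected) or diff <= abs_tol


# With a large tolerance the argument order matters:
is_close_to(9, 10, rel_tol=0.1)    # True:  1 <= 0.1 * 10
is_close_to(10, 9, rel_tol=0.1)    # False: 1 >  0.1 * 9

# With a small tolerance both orders agree in practice:
is_close_to(1.0, 1.0 + 1e-13)      # True
is_close_to(1.0 + 1e-13, 1.0)      # True
```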

It will be designed for floating point numbers, and handle inf, -inf, and
NaN "properly". But it will also work with other numeric types, to the
extent that duck typing "just works" (i.e. division and comparisons all
work).
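
A quick illustration of what "just works" might mean in practice, using a
compact stand-in for the function (hypothetical, with no inf/NaN handling).
Fraction and Decimal both go through, as long as the tolerance is given in a
type the inputs can be multiplied by.

```python
from decimal import Decimal
from fractions import Fraction


def is_close(a, b, rel_tol, abs_tol=0):
    # Only subtraction, abs(), multiplication and comparison are needed.
    diff = abs(a - b)
    return diff <= rel_tol * min(abs(a), abs(b)) or diff <= abs_tol


fraction_result = is_close(Fraction(1, 3), Fraction(333333, 1000000),
                           rel_tol=Fraction(1, 100000))
decimal_result = is_close(Decimal("1.000"), Decimal("1.0000000000001"),
                          rel_tol=Decimal("1e-12"))

# Decimal refuses to multiply by a float, so a float tolerance fails here.
try:
    is_close(Decimal("1.000"), Decimal("1.0000000000001"), rel_tol=1e-12)
    decimal_with_float_tol = "worked"
except TypeError:
    decimal_with_float_tol = "TypeError"
```

So a float default for the tolerance would not carry over to Decimal inputs
without some extra care.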

Complex numbers will be handled by:
is_close(x.real, y.real) and is_close(x.imag, y.imag)
(but I haven't written any code for that yet)
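
A sketch of that component-wise rule. The helper name is_close_complex and
the compact is_close stand-in are assumptions for illustration only.

```python
def is_close(a, b, rel_tol=1e-12, abs_tol=0.0):
    # Compact symmetric test for real numbers (no inf/NaN handling).
    diff = abs(a - b)
    return diff <= rel_tol * min(abs(a), abs(b)) or diff <= abs_tol


def is_close_complex(x, y, rel_tol=1e-12, abs_tol=0.0):
    # Real and imaginary parts must each pass the test on their own.
    return (is_close(x.real, y.real, rel_tol, abs_tol)
            and is_close(x.imag, y.imag, rel_tol, abs_tol))


is_close_complex(complex(1.0, 2.0), complex(1.0 + 1e-14, 2.0))       # True
is_close_complex(complex(1.0, 0.0), complex(1.0, 1e-20))             # False
is_close_complex(complex(1.0, 0.0), complex(1.0, 1e-20), abs_tol=1e-15)  # True
```

The second call shows a consequence of testing each part separately: a zero
imaginary part is never relatively close to a tiny non-zero one, so purely
real values need an absolute tolerance to match nearly-real ones.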

It will not do a simple absolute comparison -- that is the job of a
different function, or, better yet, folks just write it themselves:

abs(x - y) <= delta

really isn't much harder to write than a function call:

absolute_diff(x,y,delta)

Here is a gist with a sample implementation:

https://gist.github.com/PythonCHB/6e9ef7732a9074d9337a

I need to add more tests, and make the tests proper unit tests, but it's a
start.

I also need to see how it does with other data types than float --
hopefully, it will "just work" with the core set.

I hope we can come to some consensus that something like this is the way to
go.

-Chris

On Sun, Jan 18, 2015 at 11:27 AM, Ron Adam <ron3200 at gmail.com> wrote:

>
>
> On 01/17/2015 11:37 PM, Chris Barker wrote:
>
>>        (Someone claimed that 'nothing is close to zero'.  This is
>>     nonsensical both in applied math and everyday life.)
>>
>>
>> I'm pretty sure someone (more than one of us) asserted that "nothing is
>> *relatively* close to zero" -- very different.
>>
>
> Yes, that is the case.
>
>
>> And I really wanted a way to have a default behavior that would do a
>> reasonable transition to an absolute tolerance near zero, but I no longer
>> think that's possible. (numpy's implementation kind of does that, but it is
>> really wrong for small numbers, and if you made the default min_tolerance
>> the smallest possible representable number, it really wouldn't be useful.)
>>
>
> I'm going to try to summarise what I got out of this discussion.  Maybe it
> will help bring some focus to the topic.
>
> I think there are two cases to consider.
>
>      # The most common case.
>      rel_is_good(actual, expected, delta)   # value +- %delta.
>
>      # Testing for possible equivalence?
>      rel_is_close(value1, value2, delta)    # %delta close to each other.
>
> I don't think they are quite the same thing.
>
>      rel_is_good(9, 10, .1) --> True
>      rel_is_good(10, 9, .1) --> False
>
>      rel_is_close(9, 10, .1) --> True
>      rel_is_close(10, 9, .1) --> True
>
>
> In the "is close" case, it shouldn't matter what order the arguments are
> given. The delta is how far the smaller number is from the larger one
> (for numbers of the same sign).
>
> So when calculating the relative error from two values, you want it to be
> consistent with the rel_is_close function.
>
>      rel_is_close(a, b, delta) <---> rel_err(a, b) <= delta
>
> And you should not use the rel_err function in the rel_is_good function.
>
>
>
> The next issue is where the numeric accuracy of the data, significant
> digits, and the language's accuracy (ULPs) come into the picture.
>
> My intuition.. I need to test the idea to make a firmer claim.. is that in
> the case of is_good, you want to exclude the uncertain parts, but with
> is_close, you want to include the uncertain parts.
>
> Two values "are close" if you can't tell one from the other with
> certainty.  The is_close range includes any uncertainty.
>
> A value is good if it's within a range with certainty.  And this excludes
> any uncertainty.
>
> This is where taking an absolute delta into consideration comes in. The
> minimum range for both is the uncertainty of the data. But is_close and
> is_good do different things with it.
>
> Of course all of this only applies if you agree with these definitions of
> is_close, and is_good. ;)
>
> Cheers,
>    Ron
>



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov