[Python-ideas] Fwd: PEP 485: A Function for testing approximate equality

Chris Barker chris.barker at noaa.gov
Tue Jan 27 01:16:28 CET 2015


OOPS, sent to only Paul by mistake the first time.

-Chris



On Sun, Jan 25, 2015 at 11:07 PM, Paul Moore <p.f.moore at gmail.com> wrote:

> And to many users (including me, apparently - I expected the first one
> to give False), the following is "floating point arcana":
>
> >>> 0.1*10 == 1.0
> True
> >>> sum([0.1]*10)
> 0.9999999999999999
> >>> sum([0.1]*10) == 1
> False


Any of the approaches on the table will do something reasonable in this
case:

In [4]: is_close_to.is_close_to(sum([0.1]*10), 1)
testing: 0.9999999999999999 1
Out[4]: True

Note that the 1e-8 default I chose (which I am not committed to) is not
ENTIRELY arbitrary -- it's about half the digits carried by a python float
(a C double) -- essentially saying the values are close to about half of
the available precision. And we are constrained here: the options range
between 0.1 (which would be crazy, if you ask me!) and 1e-14 -- any larger
and it would be meaningless, and any smaller and it would surpass the
precision of a python float. Picking a default near the middle of that
range seems quite sane to me.
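For what it's worth, the arithmetic behind "half the digits": a C double's
machine epsilon is about 2.2e-16, and its square root lands right around
the proposed default:

```python
import math
import sys

# A C double carries 15-16 significant decimal digits; "half the digits"
# corresponds roughly to sqrt(machine epsilon), which is close to 1e-8.
eps = sys.float_info.epsilon
print(eps)             # ~2.2e-16
print(math.sqrt(eps))  # ~1.5e-08
```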

This is quite different from setting a value for an absolute tolerance --
saying something is close to another number if the difference is less than
1e-8 would be wildly inappropriate when the smallest numbers a float can
hold are on order of 1e-300!
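A quick demonstration: 1e-300 and 2e-300 differ by a factor of two, but an
absolute test at 1e-8 cannot tell them apart, while a relative test can:

```python
# Two values in the tiny end of the double range, one twice the other.
a, b = 1e-300, 2e-300
abs_tol = 1e-8

print(abs(a - b) <= abs_tol)          # True: the absolute test calls them close
print(abs(a - b) <= 1e-8 * abs(b))    # False: the relative test does not
```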


> This does seem relatively straightforward, though. Although in the
> second case you glossed over the question of X% of *what* which is the
> root of the "comparison to zero" question, and is precisely where the
> discussion explodes into complexity that I can't follow, so maybe
> that's precisely the bit of "floating point arcana" that the naive
> user doesn't catch on to.


Arcana, maybe, but it's not a floating point issue -- X% of zero is zero,
absolutely precisely.

But back to a point made earlier -- the idea here is to provide something
better than naive use of

x == y

for floating point. A default for the relative tolerance provides that.
There is no sane default for absolute tolerance, so I don't think we should
set one.
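To make that concrete, here is a minimal sketch of the asymmetric
relative-tolerance test being discussed (is_close_to and rel_tolerance are
this thread's working names, not a settled API):

```python
def is_close_to(actual, expected, rel_tolerance=1e-8):
    """Asymmetric relative test: the tolerance is scaled by ``expected``
    only, so this reads "actual is within rel_tolerance of expected".
    A sketch of the idea under discussion, not a final implementation.
    """
    return abs(actual - expected) <= rel_tolerance * abs(expected)

print(is_close_to(sum([0.1] * 10), 1))  # True
print(is_close_to(1.0, 1.1))            # False
```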

Note that the zero_tolerance idea provides a way to let
is_close_to(something, 0.0) work out of the box, but I'm not sure we can
come up with a sane default for that either -- in which case, what's the
point?


> I'm not sure what you're saying here - by "not setting defaults" do
> you mean making it mandatory for the user to supply a tolerance, as I
> suggested above?


I, for one, think making it mandatory to set one would be better than just
letting the zeros get used.

I've used the numpy version of this a lot for tests, and my workflow is
usually:

- write a test with the defaults

- if it passes, I'm done

- if it fails, I look and see whether my code is broken, or whether I can
accept a larger tolerance

So I'm quite happy to have a default.

> > I really think that having three tolerances, one of which is nearly
> > always ignored, is poor API design. The user usually knows when they are
> > comparing against an expected value of zero and can set an absolute
> > error tolerance.
>
> Agreed.
>

Also agreed -- Nathaniel -- can you live with this?

Note that Nathaniel found a LOT of examples of people using
assertAlmostEqual to compare to zero -- I think that's why he thinks it's
important to have defaults that do something sane for that case. However,
that is an absolute comparison function -- inherently different -- and it's
tied to "number of digits after the decimal place", so it's only
appropriate for values near 1. So you can define a sensible default there;
not so here.

> - Absolute tolerance defaults to zero (which is equivalent to
> >   exact equality).
>

yup


> > - Relative tolerance defaults to something (possibly zero) to be
> >   determined after sufficient bike-shedding.
>
> Starting the bike-shedding now, -1 on zero. Having is_close default to
> something that most users won't think of as behaving like their naive
> expectation of "is close" (as opposed to "equals") would be confusing.


1e-8 -- but you already know that ;-) -- anything between 1e-8 and 1e-12
would be fine with me.


> Just make it illegal to set both. What happens when you have both set
> is another one of the things that triggers discussions that make my
> head explode.


Sorry about the head -- but setting both is a pretty useful use-case: you
have a bunch of values you want to do a relative test on. Some of them may
be exactly zero (note -- it could be either expected or actual) -- and you
know what "close to zero" means to you. So you set that for abs_tolerance.
Done.
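A sketch of that combined behaviour, assuming the two tolerances are merged
by taking whichever bound is looser (that merging rule is my illustration
of the idea, not a settled part of the proposal):

```python
def is_close_to(actual, expected, rel_tolerance=1e-8, abs_tolerance=0.0):
    # Pass if either bound is satisfied: the relative test covers
    # ordinary values, the absolute test covers values at or near zero.
    return abs(actual - expected) <= max(
        rel_tolerance * abs(expected), abs_tolerance
    )

# One abs_tolerance serves a mixed batch of values, zeros included:
for actual, expected in [(1.0000000001e6, 1e6), (1e-12, 0.0)]:
    print(is_close_to(actual, expected, abs_tolerance=1e-10))
```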

And I actually didn't think anyone objected to that approach -- though
maybe the exploded heads weren't able to write ;-)

In fact, an absolute tolerance is so easy that I wouldn't write a function
for it:

abs(expected - actual) <= abs_tolerance

I ended up adding it to my version to deal with the zero case.


> Having said that, I don't think the name "is_close" makes the
> asymmetry clear enough. Maybe "is_close_to" would work better (there's
> still room for bikeshedding over which of the 2 arguments is implied
> as the "expected" value in that case)?


I now have "expected" and "actual", but I think those names are too unclear
-- I like "expected" -- anyone have a better idea for the other one?



> > abs(actual - expected) <= relative_tolerance*expected
> >
> > Now if expected is zero, the condition is true if and only if
> > actual==expected.
>
> I would call out this edge case explicitly in the documentation.


It is called out, but I guess I need to make that clearer.
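A quick demonstration of that edge case, using the quoted relative-only
condition directly:

```python
# With expected == 0, the relative bound rel_tol * abs(expected) is zero,
# so only an exactly-zero actual passes -- however tiny actual may be.
rel_tol = 1e-8
expected = 0.0
for actual in (0.0, 1e-300):
    print(abs(actual - expected) <= rel_tol * abs(expected))
```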

> Overall, I think it would be better to simplify the proposed function
> in order to have it better suit the expectations of its intended
> audience, rather than trying to dump too much functionality in it on
> the grounds of making it "general".


Right -- that is why I've been resisting adding flags for all the various
options.


-Chris


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


