[Python-ideas] PEP 485: A Function for testing approximate equality

Chris Barker chris.barker at noaa.gov
Fri Jan 23 18:21:27 CET 2015


On Fri, Jan 23, 2015 at 8:05 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> > The primary expected use case is various forms of testing -- "are the
> > results computed near what I expect as a result?" This sort of test
> > may or may not be part of a formal unit testing suite.
> >
> > The function might be used also to determine if a measured value is
> > within an expected value.
>
> This section is very weak.


I'll see what I can do to strengthen it.


> As someone who doesn't do numerically
> intensive computing I would start with the assumption that people who
> do would have the appropriate tools in packages like numpy, and they
> would have the knowledge and understanding to use them properly. So my
> expectation is that this function is intended specifically for
> non-specialists like me.
>

Indeed that is the idea (though there are plenty of specialists using numpy
as well ;-) )

> Based on that, I can't imagine when I'd use this function. You mention
> testing, but unittest has a function to do this already. Sure, it's
> tied tightly to unittest, so it's not useful for something like
> py.test, but that's because unittest is the stdlib testing framework.
> If you wanted to make that check more widely available, why not simply
> make it into a full-fledged function rather than an assertion?


That would be an option, but I don't think the one in unittest is the right
test anyway -- its focus on the number of digits after the decimal place is
not generally useful. (That approach would make some sense for the Decimal
type...)
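
(For reference, assertAlmostEqual's default check boils down to something
like the sketch below -- not the exact stdlib code, just the decimal-places
idea it's built on.)

    def almost_equal(first, second, places=7):
        # Roughly the default check behind unittest's assertAlmostEqual:
        # the difference, rounded to `places` decimal places, must be zero.
        # It is an *absolute* test tied to decimal digits, so it says little
        # about values that are very large or very small.
        return round(first - second, places) == 0

    almost_equal(1.00000001, 1.0)    # True:  difference rounds to 0 at 7 places
    almost_equal(1e10 + 1.0, 1e10)   # False: though they agree to ~10 significant digits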

> And if
> it's not suitable for that purpose, why does this PEP not propose
> updating the unittest assertion to use the new function?


Well, for backward compatibility reasons, I had just assumed it was off the
table -- or a long, painful road, anyway. And unittest is very invested in
its OO structure -- would we want to add free-form functions to it?


> It can't be
> right to have 2 *different* "nearly equal" functions in the stdlib.
>

Well, they do have different functionality -- maybe some people really do
want the decimal-digits thing. I'm not sure we'd want one function with a
whole bunch of different ways to call it -- maybe we would, but having
different functions seems fine to me.


> Outside of testing, there seems to be no obvious use for the new
> function. You mention measured values, but what does that mean?
> "Measure in the length of the line and type in the result, and I'll
> confirm if it matches the value calculated"? That seems a bit silly.
>

This came up in examples in the discussion thread -- I don't think I would
use it that way myself, so I'm going to leave it to others to suggest better
examples or wording. Otherwise, I'll probably take it out.

> I'd like to see a couple of substantial, properly explained examples
> that aren't testing and aren't specialist.


In practice, I think testing is the biggest use case, but not necessarily
formal unit testing. That's certainly how I would use it (and the use case
that prompted me to start this whole thread to begin with...). I'll look in
my code to see if I use it in other ways, and I'm open to any other examples
anyone might have.

But maybe it should go with the testing code in that case -- though I don't
see any free-form testing utility functions in there now. Maybe it should go
in unittest.util? I'd rather not, but it's just a different import line.


> My worry is that what this
> function will *actually* be used for is to allow naive users to gloss
> over their lack of understanding of floating point:
>
>     n = 0.0
>     while not is_close_to(n, 1.0):  # Because I don't understand floating point
>         do_something_with(n)
>         n += 0.1
>

Is that necessarily worse? It would at least terminate ;-) Floating point
is a bit of an attractive nuisance anyway.
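
A quick illustration (just for concreteness, nothing from the PEP itself) of
why the exact-equality version of that loop never gets there: ten additions
of 0.1 don't sum to exactly 1.0 in binary floating point.

    n = 0.0
    for _ in range(10):
        n += 0.1
    print(n)         # 0.9999999999999999
    print(n == 1.0)  # False -- an exact test sails right past 1.0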


> BTW, when writing that I had to keep scrolling up to see which order
> actual and expected went in. I'd imagine plenty of naive users will
> assume "it's symmetrical so it shouldn't matter" and get the order
> wrong.
>

Well, I think the biggest real issue here (other than whether it should be
in the stdlib at all) is the question of a symmetric vs. asymmetric test. I
decided to go (for this draft, anyway) with the asymmetric test, as it is
better defined, easier to reason about, and more appropriate for some cases.
The biggest argument for a symmetric test is that it is what people would
expect.

So I tried to choose parameter names that would make the asymmetry clear
(rather than a, b or x, y) -- I think I failed on that, however -- does
anyone have a better suggestion for names? It turns out "actual" is far too
similar in meaning to "expected".
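
For anyone following along, the asymmetric test in the current draft boils
down to something like this (a simplified sketch -- it leaves out the
absolute-tolerance handling, and the parameter names are exactly the ones
I'm unhappy with):

    def is_close_to(actual, expected, tol=1e-8):
        # Asymmetric: the tolerance is scaled by `expected` only, so it
        # reads as "is `actual` within tol (relative) of `expected`?"
        return abs(actual - expected) <= tol * abs(expected)

    # Swapping the arguments can change the answer -- the trap for anyone
    # who assumes the test is symmetric:
    is_close_to(9.0, 10.0, tol=0.1)   # True:  |9 - 10| <= 0.1 * 10
    is_close_to(10.0, 9.0, tol=0.1)   # False: |10 - 9| >  0.1 * 9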

> In summary - it looks too much like an attractive nuisance to me,


If it's not there, folks will cobble something up themselves (and I'm sure
they do, all the time). If they know what they are doing and take care,
then great, but if not, they may get something with worse behavior than
this. Maybe they will at least understand it better, but I suspect the
pitfalls will all still be there in a typical case. And in any case, they
have to take the time to write it. That's my logic, anyway.
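
For instance, the kind of thing people typically cobble up looks like this
(hypothetical, but I've seen plenty along these lines):

    def close_enough(a, b):
        # A hand-rolled check with a fixed absolute epsilon.
        return abs(a - b) < 1e-6

    close_enough(1e-12, 2e-12)    # True  -- yet the values differ by a factor of two
    close_enough(1e12, 1e12 + 1)  # False -- yet they agree to 12 significant digits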

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov