NaN comparisons - Call For Anecdotes

Tue Jul 8 10:53:47 EDT 2014

Most people don't need to deal with NaN's in Python at all,
fortunately. They just don't appear in normal computation, because the
interpreter raises an exception instead.

As it happens, I come across them quite a lot in my work. I'm writing
software that talks to embedded applications that can contain NaN
values for a variety of reasons - never-initialised storage,
initialise-to-NaN, hardware failures etc.

So my software reads these values in binary, unpacks them using
the struct module, and goes to work. And NaN's are no different from
any other value: they're something to store, compare, display etc.
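
To make the setting concrete, here is a minimal sketch of a NaN arriving
through the struct module. The four-byte payload and the "<f" format are
assumptions chosen for illustration (a little-endian single-precision
quiet NaN), not the actual wire format of the embedded applications.

```python
import math
import struct

# Hypothetical payload: little-endian IEEE-754 single-precision quiet NaN
# (bit pattern 0x7fc00000).
raw = b"\x00\x00\xc0\x7f"

# struct.unpack always returns a tuple, even for a single field.
(value,) = struct.unpack("<f", raw)

print(math.isnan(value))  # True: the unpacked float is a NaN
print(value == value)     # False: a NaN does not compare equal to itself
```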

And that worked fine in my Python 2.4 apps.  Then I upgraded to 2.7
and it broke.  Because 2.7 goes out of its way to ensure that NaN's
don't compare equal to themselves.

I discovered it when a sanity check told me that two functions,
to_binary and from_binary, weren't each other's inverse, as they were
intended to be.  Apparently,
bitPattern==to_binary(from_binary(bitPattern)) wasn't true for this
particular value of bitPattern.  Of course, the bit pattern in
question was the binary representation for a floating-point NaN.

Panic time! If I can't trust == to return True for (mathematically)
equal objects, that means that every single algorithm I had ever written
that explicitly or implicitly does .__eq__ or .__ne__ comparison was
suspect!

That meant I had 30000 lines of code to review.  Every time there was a
comparison, if there was any chance that either value could be a
float NaN, I would have to change e.g.
    if x==y:
to
    if x==y or (isinstance(x, float) and isinstance(y, float) and
                      math.isnan(x) and math.isnan(y)):
To make it bearable, I could wrap the pattern up in a function and
write
    if my_equal(x,y):
but I would still lose, because the standard library does == and !=
all over the place without calling my_equal.
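
A minimal sketch of such a wrapper, and of why it does not help once a
comparison happens inside library or built-in code. Only the name
my_equal comes from the text above; the body is one possible way to
write it.

```python
import math


def my_equal(x, y):
    """Like ==, but two float NaNs are treated as equal to each other."""
    if x == y:
        return True
    return (isinstance(x, float) and isinstance(y, float)
            and math.isnan(x) and math.isnan(y))


# Two distinct NaN objects.
nan_a = float("nan")
nan_b = float("nan")

print(my_equal(nan_a, nan_b))  # True
print(my_equal(1.0, nan_a))    # False
print(my_equal("a", "a"))      # True: non-floats fall through to plain ==

# Built-in container comparison never calls my_equal, so it still
# reports the two lists as different.
print([nan_a] == [nan_b])      # False
```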

In the end I came up with this hack: Every time I struct.unpack a
float, I check if it's a NaN, and if it is, I replace it with a
reference to a single, shared, "canonical" NaN. That means that
container objects that skip __eq__ when comparing an object to
itself will work -- e.g. dict keys.
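
The hack can be sketched roughly as follows. The names CANONICAL_NAN and
unpack_float, and the "<f" default format, are made up for this
illustration; the real code may look quite different.

```python
import math
import struct

# One shared NaN object that every unpacked NaN is replaced with.
CANONICAL_NAN = float("nan")


def unpack_float(raw, fmt="<f"):
    """Unpack one float, mapping any NaN to the shared canonical object."""
    (value,) = struct.unpack(fmt, raw)
    if math.isnan(value):
        return CANONICAL_NAN
    return value


nan_bytes = b"\x00\x00\xc0\x7f"  # hypothetical single-precision NaN payload
a = unpack_float(nan_bytes)
b = unpack_float(nan_bytes)

print(a is b)         # True: both names refer to the same object
print(a == b)         # False: float equality itself is unchanged
print([a] == [b])     # True: list comparison checks identity before ==
print(a in {b: "x"})  # True: dict lookup checks identity before == as well
```

Note that a == b is still False; the trick only helps in places where
the container tests identity first.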

It's half a solution, of course: Any further computation with a NaN
value will change it to a different NaN object, so I still needed to
do explicit NaN-checks in various places.  I'm sure there are still
NaN-related bugs in my code, but right now it's "good enough" - I
haven't seen NaN-related bugs in a good while.
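
A short illustration of that limitation, as observed in CPython; the
variable names are made up for the example.

```python
import math

shared_nan = float("nan")    # stands in for the shared "canonical" NaN

# Arithmetic on a NaN yields a NaN, but as a new float object.
result = shared_nan + 1.0

print(math.isnan(result))        # True
print(result is shared_nan)      # False: identity is lost
print([shared_nan] == [result])  # False: the identity shortcut no longer applies
```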

Now, all this bothers me.  Not that I had to do some work to get stuff
to work in an imperfect world.  No, what bothers me is that this
behaviour was explicitly and deliberately put in for no good reason.
The only reason is "standard says so". Not that there's any legal
requirement for Python to follow the IEEE-754 standard. Or for that
matter, for Python's spelling of IEEE-754 comparisons to be "==".

So I make this claim: float.__eq__ implementing IEEE-754 NaN
comparison rules creates real problems for developers. And it has
never, ever, helped anyone do anything.

"Never" is a strong claim, and easily disproven if false: Simply
provide a counterexample.  So that is my challenge: If you have a
program (a pre-existing and useful one, not something artificial
created for this challenge) that benefits from NaN!=NaN and that would
fail if x==x for all float objects x, then please come forward and
show it, and I'll buy you a beer the next time I'm at PyCon.

regards, Anders