float("nan") in set or as key

Fri Jun 3 19:29:10 EDT 2011

On Fri, 03 Jun 2011 14:52:39 +0000, Grant Edwards wrote:

>> It's arguable that NaN itself simply shouldn't exist in Python; if
>> the FPU ever generates a NaN, Python should raise an exception at
>> that point.
> 
> Sorry, I just don't "get" that argument.  I depend on compliance with
> IEEE-754, and I find the current NaN behavior very useful, and
> labor-saving.

If you're "fluent" in IEEE-754, then you won't find its behaviour
unexpected. OTOH, if you are approach the issue without preconceptions,
you're likely to notice that you effectively have one exception mechanism
for floating-point and another for everything else.

>> But given that NaNs propagate in almost the same manner as
>> exceptions, you could "optimise" this by treating a NaN as a
>> special-case implementation of exceptions, and turn it into a real
>> exception at the point where you can no longer use a NaN (e.g. when
>> using a comparison operator).
>>
>> This would produce the same end result as raising an exception
>> immediately, but would reduce the number of isnan() tests.
> 
> I've never found the number of isnan() checks in my code to be an
> issue -- there just arent that many of them, and when they are there,
> it provides an easy to read and easy to maintain way to handle things.

I think that you misunderstood. What I was saying here was that, if you
wanted exception-on-NaN behaviour from Python, the interpreter wouldn't
need to call isnan() on every value received from the FPU, but rely upon
NaN-propagation and only call it at places where a NaN might disappear
(e.g. comparisons).

>> I mean undefined, in the sense that 0/0 is undefined
> 
> But 0.0/0.0 _is_ defined.  It's NaN.  ;)

Mathematically, it's undefined.

>> (I note that Python actually raises an exception for "0.0/0.0").
> 
> IMHO, that's a bug.  IEEE-754 states explicit that 0.0/0.0 is NaN.
> Pythons claims it implements IEEE-754.  Python got it wrong.

But then IEEE-754 considers integers and floats to be completely different
beasts, while Python makes some effort to maintain a unified "numeric"
interface. If you really want IEEE-754 to-the-letter, that's undesirable,
although I'd question the choice of Python in such situations.

>> The definition is entirely arbitrary.
> 
> I don't agree, but even if was entirely arbitrary, that doesn't make
> the decision meaningless.  IEEE-754 says it's True, and standards
> compliance is valuable.

True, but so are other things. People with a background in mathematics (as
opposed to arithmetic and numerical methods) would probably consider
following the equivalence axioms to be valuable. Someone more used to
Python than IEEE-754 might consider following the "x is y => x == y" axiom
to be valuable.

As for IEEE-754 saying that it's True: they only really had two
choices: either it's True or it's False. NaNs provide "exceptions"
even if the hardware or the language lacks them, but that falls down once
you leave the scope of floating-point. It wouldn't have been within
IEEE-754's ambit to declare that comparing NaNs should return NaB
(Not A Boolean).

>> Actually, "NaN == NaN" makes more sense than "NaN != NaN", as the
>> former upholds the equivalence axioms
> 
> You seem to be talking about reals.  We're talking about floats.

Floats are supposed to approximate reals. They're also a Python
data type, and should make some effort to fit in with the rest of
the language.

>> If anything, it has proven to be a major nuisance. It takes a lot of
>> effort to create (or even specify) code which does the right thing in
>> the presence of NaNs.
> 
> That's not been my experience.  NaNs save a _huge_ amount of effort
> compared to having to pass value+status info around throughout complex
> caclulations.

That's what exceptions are for. NaNs probably save a huge amount of effort
in languages which lack exceptions, but that isn't applicable to Python.
In Python, they result in floats not "fitting in".

Let's remember that the thread started with an oddity relating to using
floats as dictionary keys, which mostly works but fails for NaN because of
the (highly unusual) property that "x == x" is False for NaNs.

Why did the Python developers choose this behaviour? It's quite likely
that they didn't choose it, but just overlooked the fact that NaN
creates this corner-case which breaks code which works for every other
primitive type except floats and even every other float except NaN.

In any case, I should probably re-iterate at this point that I'm not
actually arguing *for* exception-on-NaN or NaN==NaN or similar, just
pointing out that IEEE-754 is not the One True Approach and that other
approaches are not necessarily heresy and may have some merit. To go back
to the point where I entered this thread:

>>> The correct answer to "nan == nan" is to raise an exception,
>>> because you have asked a question for which the answer is nether True
>>> nor False.
>> 
>> Wrong.
>
> That's overstating it.