Rich Comparisons Gotcha

Mon Dec 8 16:51:23 EST 2008

Rhamphoryncus wrote:
> On Dec 8, 1:04 pm, Robert Kern <robert.k... at gmail.com> wrote:
>> Rhamphoryncus wrote:
>>> On Dec 8, 11:54 am, Robert Kern <robert.k... at gmail.com> wrote:
>>>> Rhamphoryncus wrote:
>>>>> On Dec 7, 4:20 pm, Steven D'Aprano <st... at REMOVE-THIS-
>>>>> cybersource.com.au> wrote:
>>>>>> On Sun, 07 Dec 2008 15:32:53 -0600, Robert Kern wrote:
>>>>>>> Rasmus Fogh wrote:
>>>>>>>> Current behaviour is both inconsistent and counterintuitive, as these
>>>>>>>> examples show.
>>>>>>>>>>> x = float('NaN')
>>>>>>>>>>> x == x
>>>>>>>> False
>>>>>>> Blame IEEE for that one. Rich comparisons have nothing to do with that
>>>>>>> one.
>>>>>> There is nothing to blame them for. This is the correct behaviour. NaNs
>>>>>> should *not* compare equal to themselves, that's mathematically
>>>>>> incoherent.
>>>>> Mathematically, NaNs shouldn't be comparable at all.  They should
>>>>> raise an exception when compared.  In fact, they should raise an
>>>>> exception when *created*.  But that's not what we want.  What we want
>>>>> is a dummy value that silently plods through our calculations.  For a
>>>>> dummy value it makes a lot more sense to pick an arbitrary yet
>>>>> consistent sort order (I suggest just above -Inf), rather than quietly
>>>>> screwing up the sort.
>>>> Well, there are explicitly two kinds of NaNs: signalling NaNs and quiet NaNs, to
>>>> accommodate both requirements. Additionally, there is significant flexibility in
>>>> trapping the signals.
>>> Right, but most of that's lower level.  By the time it reaches Python
>>> we only care about quiet NaNs.
>> No, signaling NaNs raise the exception that you are asking for. You're right
>> that if you get a Python float object that is a NaN, it is probably going to be
>> quiet, but signaling NaNs can affect Python in the way that you want.
>>
>>>>> Regarding the mythical IEEE 754, although it's extremely rare to find
>>>>> quotations, I have one on just this subject.  And it does NOT say "x
>>>>> == NaN gives false".  It says it gives *unordered*.  It is C and
>>>>> probably most other languages that turn that into false (as they want
>>>>> a dummy value, not an error.)
>>>>> http://groups.google.ca/group/sci.math.num-analysis/browse_thread/thr...
>>>> Table 4 on page 9 of the standard is pretty clear on the subject. When the two
>>>> operands are unordered, the operator == returns False. The standard defines how
>>>> to do comparisons notionally; two operands can be "greater than", "less than",
>>>> "equal" or "unordered". It then goes on to map these notional concepts to
>>>> programming language boolean predicates.
>>> Ahh, interesting.  Still though, does it give an explanation for such
>>> behaviour, or use cases?  There must be some situation where blindly
>>> returning false is enough benefit to trump screwing up sorting.
>> Well, the standard was written in the days of Fortran. You didn't really have
>> generic sorting routines. You *could* implement whatever ordering you wanted
>> because you *had* to implement the ordering yourself. You didn't have to use a
>> limited boolean predicate.
>>
>> Basically, the boolean predicates have to return either True or False. Neither
>> one is really satisfactory, but that's the constraint you're under.
> 
> "We've always done it that way" is NOT a use case!  Certainly, it's a
> factor, but it seems quite weak compared to the sort use case.

I didn't say it was. I was explaining that sorting was probably *not* a use case
for the boolean predicates at the time the standard was written. In fact, the
standard suggests implementing a Compare() function that returns "greater than",
"less than", "equal" or "unordered" in addition to the boolean predicates. That
Python chose to use a generic boolean predicate as the basis of its sorting
routine many years after the IEEE-754 standard was written is another matter
entirely.
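
For illustration only, here is a minimal sketch of how a sort in Python can
sidestep the boolean predicate by supplying an explicit ordering, using the
"just above -Inf" placement proposed earlier in the thread. The helper name
nan_above_neg_inf is hypothetical, not something from the standard or from
Python itself.

```python
import math

def nan_above_neg_inf(x):
    """Hypothetical sort key: -inf first, then NaN, then everything else."""
    if math.isnan(x):
        return (1, 0.0)        # NaN sits just above -inf
    if x == float("-inf"):
        return (0, 0.0)        # -inf is the smallest element
    return (2, x)              # all other floats keep their usual order

values = [3.0, float("nan"), -1.0, float("inf"), float("-inf")]
ordered = sorted(values, key=nan_above_neg_inf)
print(ordered)
```

Because the key never compares a NaN against another float directly, the
result does not depend on where the NaN started in the input.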

In any case, the standard itself is quite short, and does not spend much time 
justifying itself in any detail.

> I suppose what I'm hoping for is a small example program (one or a
> few functions) that needs the "always false" behaviour of NaN.

Steven D'Aprano gave one earlier in the thread. Additionally, (x!=x) is a simple 
test for NaNs if an IsNaN(x) function is not available. Really, though, the 
result falls out from the way that IEEE-754 constructed the logic of the 
system. It is not defined that (NaN==NaN) should return False, per se. Rather, 
all of the boolean predicates are defined in terms of that Compare(x,y) 
function. If that function returns "unordered", then (x==y) is False. It doesn't 
matter if one or both are NaNs; in either case, the result is "unordered".
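
A rough sketch of that logic in Python, assuming a four-way compare() that
returns strings for the four relations. The function names and string values
are my own illustration, not the standard's wording, and this models only
quiet NaNs.

```python
import math

def compare(x, y):
    """Hypothetical four-way comparison in the spirit of Compare(x, y)."""
    if math.isnan(x) or math.isnan(y):
        return "unordered"     # one NaN operand is enough
    if x < y:
        return "less than"
    if x > y:
        return "greater than"
    return "equal"

# The boolean predicates are then derived from the four-way result.
def eq(x, y):
    return compare(x, y) == "equal"

def ne(x, y):
    return compare(x, y) != "equal"

def lt(x, y):
    return compare(x, y) == "less than"

nan = float("nan")
print(compare(nan, nan), eq(nan, nan), ne(nan, nan))
print(compare(1.0, nan), eq(1.0, nan), lt(1.0, nan))
print(nan != nan)   # the (x != x) test for NaN mentioned above
```

Under this construction eq() is False and ne() is True whenever either
operand is a NaN, which is why (x != x) works as a NaN test.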

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco