Sorting NaNs

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Jun 2 07:02:45 EDT 2018


On Sat, 02 Jun 2018 10:31:37 +0100, Paul Moore wrote:

> 1. The behaviour of comparisons involving NaN values is weird (not
> undefined, as far as I know NaN behaviour is very well defined, but
> violates a number of normally fundamental properties of comparisons) 

Not so much weird. They're just *unordered* -- no order is defined on 
NANs, so if you ask whether one NAN is less than something, the answer is 
always False.
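
As a quick illustration of "unordered" (a minimal sketch; the names `nan` and `ordering` are only for this example), every ordering comparison against a NAN comes back False, whichever side the NAN is on:

```python
nan = float("nan")

# Every ordering comparison involving a NAN is False, in both directions.
ordering = [nan < 1.0, nan > 1.0, nan <= 1.0, nan >= 1.0, 1.0 < nan, 1.0 > nan]
print(ordering)  # [False, False, False, False, False, False]

# Equality is False as well; only != comes back True.
print(nan == 1.0, nan != 1.0)  # False True
```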

Some maths libraries provide a separate set of comparison functions which 
return a NAN instead of True/False on NAN comparisons, but Python doesn't 
do that.

The violation of reflexivity is weird though :-)

    x = float("nan")
    x == x  # returns False

Don't argue, just accept it :-)
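
Because `==` cannot be used to spot a NAN, the usual test is `math.isnan` from the standard library. A small sketch, assuming an ordinary Python float:

```python
import math

x = float("nan")

print(x == x)         # False: equality is not reflexive for NANs
print(x != x)         # True
print(math.isnan(x))  # True: the reliable way to detect a NAN
```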


> 2. The precise behaviour of the sort algorithm used by Python is not
> mandated by the language.

I don't think that is quite right: Python's sort is defined to be a 
stable, lexicographical sort. That's quite well-defined, for values which 
define a total ordering.

http://mathworld.wolfram.com/TotallyOrderedSet.html

The floats excluding NANs are totally ordered; the floats including NANs 
are not.

For values which are not totally ordered, no guarantees can be made. The 
result of sorting will depend on the precise details of the sort 
algorithm, the semantics of how the values compare, and the initial order 
of the values. 
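
One possible way to get a repeatable result regardless of the initial order is to sort on a key that is itself totally ordered. The sketch below is an assumption of this example, not something proposed in this thread: `nan_last` is a hypothetical helper that pushes every NAN to the end.

```python
import math

def nan_last(value):
    # Hypothetical helper: map each float to a totally ordered key.
    # NANs all get the same key (True, 0.0), which sorts after every
    # (False, value) key because False < True.
    return (True, 0.0) if math.isnan(value) else (False, value)

nan = float("nan")
data_a = [3.0, nan, 1.0, 2.0]
data_b = [nan, 2.0, 3.0, 1.0]

sorted_a = sorted(data_a, key=nan_last)
sorted_b = sorted(data_b, key=nan_last)

print(sorted_a)  # [1.0, 2.0, 3.0, nan]
print(sorted_b)  # [1.0, 2.0, 3.0, nan]
```

Putting the NANs last is an arbitrary choice; swapping the booleans would put them first instead.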


> A consequence of (1) is that there is no meaningful definition of "a
> sorted list of numbers" if that list includes NaN,

Correct.
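
To make that concrete, here is a small sketch that takes "sorted" to mean "every adjacent pair satisfies a <= b" (the helper `is_sorted` is hypothetical) and tries every ordering of a three-item list containing a NAN. None of them qualifies:

```python
from itertools import permutations

def is_sorted(values):
    # "Sorted" here means every adjacent pair satisfies a <= b.
    return all(a <= b for a, b in zip(values, values[1:]))

values = [1.0, float("nan"), 2.0]
candidates = [list(p) for p in permutations(values)]

# The NAN always sits next to at least one other item, and any <=
# comparison involving it is False, so no ordering passes the check.
print(any(is_sorted(c) for c in candidates))  # False
```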


[...]
> So call it an accident of implementation if you like. Or "sorting a list
> with NaNs in it is meaningless" if you prefer. Or "undefined behaviour"
> if you're a fan of the language in the C standard.


While sorting a list containing NANs will return the items in some 
unpredictable or arbitrary order, that last one does not apply. If the C 
definition of undefined behaviour applied, you could write a two-line 
script like this:


    print("Hello")
    result = sorted([1, float("nan"), 2])


and the compiler could *legally* generate code to erase your hard disk 
instead of printing "Hello".
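
By contrast, what actually happens is tame. The sketch below (run against CPython; the variable names are only for illustration) makes no claim about where the NAN ends up, only that the result is an ordinary list holding the same three items:

```python
import math

data = [1, float("nan"), 2]
result = sorted(data)

# The order is arbitrary, but nothing is lost, added, or corrupted.
nan_count = sum(1 for v in result if math.isnan(v))
non_nan = sorted(v for v in result if not math.isnan(v))

print(len(result), nan_count, non_nan)  # 3 1 [1, 2]
```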



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



