NaN, Null, and Sorting

Eelco hoogendoorn.eelco at gmail.com
Mon Jan 16 05:22:55 EST 2012


On 13 jan, 20:04, Ethan Furman <et... at stoneleaf.us> wrote:
> With NaN, it is possible to get a list that will not properly sort:
>
> --> NaN = float('nan')
> --> spam = [1, 2, NaN, 3, NaN, 4, 5, 7, NaN]
> --> sorted(spam)
> [1, 2, nan, 3, nan, 4, 5, 7, nan]
>
> I'm constructing a Null object with the semantics that if the returned
> object is Null, its actual value is unknown.
>
>  From a purist point of view if it is unknown then comparison results
> are also unknown since the actual value might be greater, lesser, or the
> same as the value being compared against.
>
>  From a practical point of view a list with Nulls scattered throughout
> is a pain in the backside.
>
> So I am strongly leaning towards implementing the comparisons such that
> Null objects are less than other objects so they will always sort together.
>
> Thoughts/advice/criticisms/etc?
>
> ~Ethan~

My suggestion would be this: NaNs/Nulls are unordered, so sorting them
is fundamentally an ill-defined notion. What you want, conceptually, is
a sorted list of the sortable entries and a separate list of the
unordered entries. Translated into code, the purest solution would be
to filter the NaNs/Nulls out into their own list first, and then sort
the rest. If the interface demands it, you can concatenate the lists
afterwards, but it is probably most convenient to keep them separate.
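
Something along these lines is what I have in mind. It is only a rough
sketch: the names split_unordered and is_unordered are made up for
illustration, and None stands in for your Null object, whose actual
interface I do not know.

```python
import math

def is_unordered(value):
    # Assumption: None stands in for the Null object from the quoted post.
    return value is None or (isinstance(value, float) and math.isnan(value))

def split_unordered(values):
    """Return (sorted orderable entries, unordered entries) as two lists."""
    unordered = [v for v in values if is_unordered(v)]
    ordered = sorted(v for v in values if not is_unordered(v))
    return ordered, unordered

nan = float('nan')
spam = [5, 2, nan, 7, nan, 1, 4, 3, nan]
ordered, unordered = split_unordered(spam)
print(ordered)         # [1, 2, 3, 4, 5, 7]
print(len(unordered))  # 3
```

If a single list is required, ordered + unordered (or the reverse) gives
one with all the unordered entries grouped at one end.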

Perhaps arbitrarily defining the ordering of Nulls/NaNs is slightly
more efficient than the above, but it should not make a big
difference, and in terms of purity it's no contest.
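
For comparison, the arbitrary-ordering alternative can be approximated
without touching the comparison operators at all, by using a sort key.
Again this is only a sketch under the same assumptions: sort_nulls_first
is a hypothetical name and None is a stand-in for Null.

```python
import math

def is_unordered(value):
    # Assumption: None stands in for the Null object from the quoted post.
    return value is None or (isinstance(value, float) and math.isnan(value))

def sort_nulls_first(values):
    """Sort so that all unordered entries come first, grouped together."""
    def key(value):
        # Unordered entries all share one key, so they are never compared
        # with each other or with real values; sorted() is stable, so
        # they keep their original relative order.
        if is_unordered(value):
            return (0, 0)
        return (1, value)
    return sorted(values, key=key)

nan = float('nan')
spam = [5, 2, nan, 7, nan, 1, 4, 3, nan]
result = sort_nulls_first(spam)
print(result)  # [nan, nan, nan, 1, 2, 3, 4, 5, 7]
```

This is a single pass through sorted() instead of a filter plus a sort,
which is about all the efficiency there is to gain.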
