Sorting NaNs

Richard Damon Richard at Damon-Family.org
Sat Jun 2 20:16:36 EDT 2018


On 6/2/18 7:35 PM, Steven D'Aprano wrote:
> On Sat, 02 Jun 2018 17:28:28 +0200, Peter J. Holzer wrote:
>
>> On 2018-06-02 10:40:48 +0000, Steven D'Aprano wrote:
>>> On Sat, 02 Jun 2018 09:32:05 +0200, Peter J. Holzer wrote:
>>>> Also nope. It looks like NaNs just mess up sorting in an
>>>> unpredictable way. Is this the intended behaviour or just an accident
>>>> of implementation? (I think it's the latter: I can see how a sort
>>>> algorithm which doesn't treat NaN specially would produce such
>>>> results.)
>>>
>>> Neither -- it is a deliberate decision
>> If it was a deliberate decision I would say it was intentional.
>
> As the author of the statistics module, I can absolutely and 
> categorically tell you without even the tiniest doubt that the behaviour
> of statistics.median with NANs is certainly not intentional.
>
> The behaviour of median with NANs is unspecified. It is whatever the 
> implementation happens to do. If it gives the right answer, great, and if 
> it doesn't, it doesn't.
>
> If somebody cares about this use case enough to suggest an alternative 
> behaviour, I'm willing to consider it.
>
The two behaviors that I have heard suggested are:

1) If any of the inputs is a NaN, the median should be a NaN.
(Propagating the NaN as indicator of a numeric error)

2) Remove the NaNs from the input set and process what is left. If
nothing, then return a NaN (treating NaN as a 'No Data' placeholder).

These are very different in interpretation, and neither is hard to write
as a wrapper around the function, so it may not be worth adding either to
the core function and making it a bit slower.
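
As a rough illustration, the two wrappers might look something like the
sketch below. The names median_propagate, median_ignore and _is_nan are
made up for this example and are not part of the statistics module, and
the NaN check here only covers float NaNs (a Decimal NaN, for instance,
would need its own test).

```python
import math
import statistics


def _is_nan(x):
    # Hypothetical helper: only recognises float NaNs.
    return isinstance(x, float) and math.isnan(x)


def median_propagate(data):
    """Option 1: any NaN in the input makes the result NaN."""
    data = list(data)  # allow iterators, since we scan the data twice
    if any(_is_nan(x) for x in data):
        return math.nan
    return statistics.median(data)


def median_ignore(data):
    """Option 2: drop NaNs first; return NaN if nothing is left."""
    cleaned = [x for x in data if not _is_nan(x)]
    if not cleaned:
        return math.nan
    return statistics.median(cleaned)
```

With the input [1.0, nan, 3.0], the first wrapper returns NaN and the
second returns 2.0. Both do one extra pass over the data before calling
statistics.median, which is the kind of overhead the core function
avoids by not checking.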

-- 
Richard Damon