Treatment of NANs in the statistics module

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Mar 17 04:12:48 EDT 2018


On Sat, 17 Mar 2018 16:41:01 +1100, Ben Finney wrote:

>> (4) median() should strip out NANs.
> 
> Too much magic.

Statistically, ignoring NANs is equivalent to treating them as missing
values. That is, for the purposes of calculating some statistic (let's
say the median, although it applies to others as well), the sample data:

    21, 37, 41, NAN, 65, 72

is equivalent to:

    21, 37, 41, 65, 72

That's probably the most mathematically correct thing to do, *if* you 
interpret NANs as missing values.
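
As a rough sketch of that interpretation (the helper name
median_ignoring_nans is hypothetical, not something the statistics
module provides), one could filter the NANs out before calling
statistics.median():

```python
import math
import statistics

def median_ignoring_nans(data):
    """Return the median of data, treating NANs as missing values.

    Hypothetical helper for illustration; not part of the statistics module.
    """
    # Drop every NAN before handing the data to statistics.median().
    cleaned = [x for x in data if not math.isnan(x)]
    return statistics.median(cleaned)

sample = [21, 37, 41, float("nan"), 65, 72]
print(median_ignoring_nans(sample))       # same as the line below
print(statistics.median([21, 37, 41, 65, 72]))
```

Note that if every value is a NAN, the filtered list is empty and
statistics.median() raises StatisticsError, which seems reasonable
for a sample with no usable data.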


Thanks for your feedback.



-- 
Steve



