[SciPy-dev] scipy.stats: test for handling nan values

Bruce Southey bsouthey at gmail.com
Thu Jul 26 12:13:43 EDT 2007


Hi,
Yes, I agree that treating missing as nan would achieve the desired
results if the stats functions are set to ignore nan. But it is not
technically correct to treat missing as nan, because a nan can also
come from operations on non-missing values (like division by zero).
Treating missing as nan makes the bad assumption that the person
using those functions 'knows' this difference.
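
To illustrate the distinction, here is a minimal sketch. The data are made up, and np.nanmean is used only as a stand-in for a nan-ignoring statistic (an assumption; it is not the function under discussion in this thread). Once missing is coded as nan, a nan computed from 0/0 and a nan that marks a missing observation cannot be told apart:

```python
import numpy as np

# Hypothetical data: the last count is missing and has been coded as nan.
observed = np.array([2.0, 0.0, 4.0])
counts = np.array([1.0, 0.0, np.nan])

# 0/0 produces a nan from non-missing inputs; 4/nan propagates the "missing" nan.
with np.errstate(invalid="ignore", divide="ignore"):
    ratios = observed / counts

# Both nans look identical afterwards, so their origin is lost.
n_nan = int(np.isnan(ratios).sum())

# A nan-ignoring statistic silently drops both of them.
mean_ignoring_nan = np.nanmean(ratios)
```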

One solution is to use something like masked arrays, because a
user can set their own coding for missing values.
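
A rough sketch of that idea with numpy.ma follows. The missing-value code -999.0 and the data are hypothetical choices for illustration only. The mask records which entries are missing, while a nan that comes out of a computation stays in the data as an ordinary (unmasked) value:

```python
import numpy as np

# Hypothetical user-chosen code for "missing".
MISSING = -999.0
raw = np.array([1.0, 2.0, MISSING, 4.0])

# Mask every entry equal to the missing-value code.
data = np.ma.masked_values(raw, MISSING)

# Masked entries are skipped, so this is the mean of 1, 2 and 4.
mean_observed = data.mean()

# A nan produced by a computation (0/0) is not the missing code,
# so it is not masked and remains distinguishable from missing.
with np.errstate(invalid="ignore"):
    computed = np.array([0.0, 1.0]) / np.array([0.0, 1.0])
mixed = np.ma.masked_values(np.concatenate([raw, computed]), MISSING)

n_missing = int(np.ma.getmaskarray(mixed).sum())   # entries flagged as missing
n_nan = int(np.isnan(mixed.compressed()).sum())    # nans among unmasked entries
```

Here the count of missing entries and the count of computed nans are tracked separately, which is the information that is lost when missing is coded as nan.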

I think a closely related thread was discussed some time ago on
the Numpy list under the title 'Re: ndarray.fill and
ma.array.filled' by Sasha:
http://projects.scipy.org/pipermail/numpy-discussion/2006-April/007438.html

Regards
Bruce

On 7/25/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
> Hi,
>
>     Trying to solve a few tickets related to nanmean and co, I wanted to
> add tests for those functions, as well as for the general behaviour of
> the basic statistics functions with nan. Part of the test suite (in
> test_stats.py) is based on the Statistical quiz for Wilkinson; missing
> values are not supported. If I finish the test suite by implementing
> MISSING by nan values, is this conceptually correct or not? I wanted to
> be sure before committing the change in the test suite (actually, only
> adding originally disabled tests)
>
>     cheers,
>
>     David