[Numpy-discussion] What should be the result in some statistics corner cases?
Warren Weckesser
warren.weckesser at gmail.com
Sun Jul 14 16:55:08 EDT 2013
On 7/14/13, Charles R Harris <charlesr.harris at gmail.com> wrote:
> Some corner cases in the mean, var, std.
>
> *Empty arrays*
>
> I think these cases should either raise an error or just return nan.
> Warnings seem ineffective to me as they are only issued once by default.
>
> In [3]: ones(0).mean()
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61:
> RuntimeWarning: invalid value encountered in double_scalars
> ret = ret / float(rcount)
> Out[3]: nan
>
> In [4]: ones(0).var()
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76:
> RuntimeWarning: invalid value encountered in true_divide
> out=arrmean, casting='unsafe', subok=False)
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> RuntimeWarning: invalid value encountered in double_scalars
> ret = ret / float(rcount)
> Out[4]: nan
>
> In [5]: ones(0).std()
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76:
> RuntimeWarning: invalid value encountered in true_divide
> out=arrmean, casting='unsafe', subok=False)
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> RuntimeWarning: invalid value encountered in double_scalars
> ret = ret / float(rcount)
> Out[5]: nan
>
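
On the "only issued once" point: the standard `warnings` filters can promote these RuntimeWarnings to exceptions for callers who want a hard failure. A minimal sketch, assuming a wrapper called `strict_mean` (a hypothetical name, not part of NumPy):

```python
import warnings

import numpy as np


def strict_mean(a):
    """Hypothetical helper: np.mean that raises instead of warning."""
    with warnings.catch_warnings():
        # Inside this block every RuntimeWarning becomes an exception,
        # so it cannot be swallowed by the "show once" default filter.
        warnings.simplefilter("error", RuntimeWarning)
        return np.mean(a)


try:
    strict_mean(np.ones(0))
except RuntimeWarning as exc:
    print("raised:", exc)
```

This changes nothing in NumPy itself. It only shows that a caller can opt in to an error without waiting for the default behaviour to change.
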
> *ddof >= number of elements*
>
> I think these should just raise errors. The results for ddof >= #elements
> are happenstance, and certainly negative numbers should never be returned.
>
> In [6]: ones(2).var(ddof=2)
> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100:
> RuntimeWarning: invalid value encountered in double_scalars
> ret = ret / float(rcount)
> Out[6]: nan
>
> In [7]: ones(2).var(ddof=3)
> Out[7]: -0.0
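
The variance divides the sum of squared deviations by N - ddof, so ddof >= N gives a zero or negative divisor. The -0.0 above is consistent with a zero sum divided by 2 - 3 = -1. A minimal sketch of the "just raise" option, assuming a wrapper called `checked_var` (hypothetical, not a NumPy function) and ValueError as the exception type:

```python
import numpy as np


def checked_var(a, ddof=0):
    """Hypothetical helper: variance that rejects ddof >= number of elements."""
    a = np.asarray(a, dtype=float)
    if a.size == 0:
        raise ValueError("variance of an empty array is undefined")
    if ddof >= a.size:
        # The divisor N - ddof would be zero or negative.
        raise ValueError(
            "ddof=%d must be smaller than the number of elements (%d)"
            % (ddof, a.size)
        )
    return a.var(ddof=ddof)


print(checked_var(np.ones(2), ddof=1))

for bad_ddof in (2, 3):
    try:
        checked_var(np.ones(2), ddof=bad_ddof)
    except ValueError as exc:
        print("rejected:", exc)
```

The sketch only handles the flattened case. An axis-aware version would have to compare ddof against the length of the reduced axis instead of `a.size`.
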
>
> *nansum*
>
> Currently returns nan for empty arrays. I suspect it should return nan for
> slices that are all nan, but 0 for empty slices. That would make it
> consistent with sum in the empty case.
>
For nansum, I would expect 0 even in the case of all nans. The point
of these functions is to simply ignore nans, correct? So I would aim
for this behaviour: nanfunc(x) behaves the same as func(x[~isnan(x)]).
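
A minimal sketch of that rule, assuming a generic wrapper called `nan_ignoring` (a hypothetical name used only for illustration, not an existing NumPy function):

```python
import numpy as np


def nan_ignoring(func, x):
    """Hypothetical helper: apply func to the non-nan elements of x."""
    x = np.asarray(x, dtype=float)
    # Boolean indexing drops the nans (and flattens x if it is not 1-D).
    return func(x[~np.isnan(x)])


mixed = np.array([1.0, np.nan, 2.0])
all_nan = np.array([np.nan, np.nan])

print(nan_ignoring(np.sum, mixed))    # sums 1.0 and 2.0
print(nan_ignoring(np.sum, all_nan))  # sums an empty selection
```

Under this rule the all-nan case reduces to the empty case, so a sum gives 0.0 for both, and a mean would inherit whatever the empty-array behaviour of mean turns out to be.
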
Warren
> Chuck
>