[Numpy-discussion] Nansum function behavior

Fri Oct 23 21:43:54 EDT 2015

I saw this thread and I totally disagree with Thouis's argument.
Of course, you can still get NaN when there are only NaNs; thank goodness, there are plenty of ways to do that.
But it is neither convenient nor consistent and, above all, it is logically wrong. NaN does not mean zero, and an operation on nothing but NaNs cannot return a number.
You lose information about your array. It is easier to fill the result of nansum with zeros than to keep a mask of your original array, or whatever workaround you use.

Why is it misleading?
For example, say you want to sum the rows of an array and then take the mean of the result:

import numpy as np

a = np.array([[2, np.nan, 4], [np.nan, np.nan, np.nan]])
b = np.nansum(a, axis=1)  # array([ 6.,  0.])
m = np.nanmean(b)  # 3.0, WRONG: the all-NaN row counts as 0, you wanted 6.0
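
To make the point concrete, here is a minimal sketch of the behavior I am asking for. The name nansum_keep_nan is hypothetical, not a NumPy function; it only wraps np.nansum and puts NaN back wherever a slice held nothing but NaNs. It also shows that going the other way (NaN to zero) is a one-liner with np.nan_to_num.

```python
import numpy as np

def nansum_keep_nan(a, axis=None):
    """Hypothetical helper: like np.nansum, but an all-NaN slice gives NaN."""
    a = np.asarray(a, dtype=float)
    total = np.nansum(a, axis=axis)
    # True wherever every element along `axis` is NaN
    all_nan = np.isnan(a).all(axis=axis)
    return np.where(all_nan, np.nan, total)

a = np.array([[2, np.nan, 4], [np.nan, np.nan, np.nan]])
b = nansum_keep_nan(a, axis=1)    # first row sums to 6.0, second row stays NaN
m = np.nanmean(b)                 # the NaN row is skipped, so the mean is 6.0
zeros_filled = np.nan_to_num(b)   # filling NaN with 0 afterwards is trivial
```

With NaN preserved, the caller decides whether an empty row should count as zero; once nansum has returned 0, that choice is gone.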

> On 24 Oct 2015, at 09:28, Stephan Hoyer <shoyer at gmail.com> wrote:
> 
> Hi Charles,
> 
> You should read the previous discussion about this issue on GitHub:
> https://github.com/numpy/numpy/issues/1721
> 
> For what it's worth, I do think the new definition of nansum is more consistent.
> 
> If you want to preserve NaN if there are no non-NaN values, you can often calculate this desired quantity from nanmean, which does return NaN if there are only NaNs.
> 
> Stephan