[SciPy-user] calculate average by excluding NaN value in Array

Mon Mar 30 09:21:23 EDT 2009

David Cournapeau wrote:
>> nanmean should do what you want,

Actually, I don't think that's what the OP wanted:

>>>> a=array([2.,3,4,5])
>>>> b=([3.,2,4, NaN])
>>>> average(array([a,b],"f"),axis=0)
> array([ 2.5,  2.5,  4. ,  NaN], dtype=float32)
>>>>
> I mean that I need is [2.5,2.5,4,5], instead.

In this case, numpy really is doing the only correct thing: in an 
operation like this, the result should be the same size as the input 
arrays, and the average of NaN and anything else can only be NaN.
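
To make the propagation concrete, here is a minimal sketch (assuming a 
recent NumPy, where the constant is spelled np.nan; the variable names 
are only illustrative):

```python
import numpy as np

# Any arithmetic that touches NaN yields NaN, so a mean that
# includes a NaN value is itself NaN.
pair = np.array([5.0, np.nan])
pair_mean = pair.sum() / pair.size

# NaN never compares equal to anything, itself included,
# so test for it with np.isnan rather than ==.
is_nan = bool(np.isnan(pair_mean))
equals_itself = bool(pair_mean == pair_mean)

print(pair_mean, is_nan, equals_itself)
```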

Imagine there were more than one NaN in the inputs -- how would you ever 
know which values in the resulting array belonged to which inputs? So 
you should probably check for NaN afterwards, and then do what you need 
with them:

 >>> import numpy as np
 >>> a=np.array([2.,3,4,5])
 >>> b=np.array([3.,2,4, np.nan])
 >>> avg = np.average(np.array([a,b],dtype = float),axis=0)

 >>> avg
array([2.5, 2.5, 4. , nan])
 >>> # now check for NaN:

 >>> np.isfinite(avg)
array([ True,  True,  True, False])

 >>> # or
 >>> np.isnan(avg)
array([False, False, False,  True])

 >>> # get a version without the non-finite numbers:
 >>> avg[np.isfinite(avg)]
array([2.5, 2.5, 4. ])
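
If you also need to know where the NaNs came from, something along these 
lines should work. This is only a sketch, assuming a recent NumPy; the 
names stacked, bad_columns and nan_sources are made up for illustration:

```python
import numpy as np

a = np.array([2.0, 3, 4, 5])
b = np.array([3.0, 2, 4, np.nan])

# Stack the inputs as rows: row 0 is a, row 1 is b.
stacked = np.array([a, b], dtype=float)
avg = np.average(stacked, axis=0)

# Column indices whose average came out as NaN.
bad_columns = np.flatnonzero(np.isnan(avg))

# (row, column) pairs of the NaN entries in the inputs themselves.
nan_sources = np.argwhere(np.isnan(stacked))

print(bad_columns)
print(nan_sources)
```

The boolean mask keeps the positional information, so nothing is lost 
until you decide to index the NaNs away.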

So I think what you want is to strip the NaNs out later: