[Numpy-discussion] mean of two or more arrays while ignoring a specific value

Tue Jul 14 15:47:53 EDT 2009

On Tue, Jul 14, 2009 at 14:42, Chris Colbert <sccolbert at gmail.com> wrote:
> for your particular case:
>
>>>> a = np.array([1, 5, 4, 99], 'f')
>>>> b = np.array([3, 7, 2, 8], 'f')
>>>> c = b.copy()
>>>> d = a!=99
>>>> c[d] = (a[d] + b[d])/2.
>>>> c
> array([ 2.,  6.,  3.,  8.], dtype=float32)
>>>>

A more general answer is to use masked arrays: stack the inputs, mask
the value you want to ignore, and take the mean along the stacking axis.

In [5]: a = np.array([1, 5, 4, 99], 'f')

In [6]: b = np.array([3, 7, 2, 8], 'f')

In [7]: c = np.vstack([a,b])

In [8]: d = np.ma.masked_equal(c, 99.0)

In [9]: d
Out[9]:
masked_array(data =
 [[1.0 5.0 4.0 --]
 [3.0 7.0 2.0 8.0]],
             mask =
 [[False False False  True]
 [False False False False]],
       fill_value = 1e+20)

In [10]: d.mean(axis=0)
Out[10]:
masked_array(data = [2.0 6.0 3.0 8.0],
             mask = [False False False False],
       fill_value = 1e+20)
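
If you need this for more than two arrays, the same steps can be wrapped
in a small helper. This is only a sketch: the names mean_ignoring and
sentinel are made up for illustration and are not part of numpy, and it
assumes all inputs are 1-D arrays of the same length.

```python
import numpy as np

def mean_ignoring(arrays, sentinel):
    """Element-wise mean across `arrays`, skipping entries equal to `sentinel`.

    Hypothetical helper, not a numpy function.
    """
    # Stack the inputs as rows so that axis 0 runs across the arrays.
    stacked = np.vstack(arrays)
    # Mask every entry equal to the sentinel value.
    masked = np.ma.masked_equal(stacked, sentinel)
    # Masked entries are left out of both the sum and the count.
    return masked.mean(axis=0)

a = np.array([1, 5, 4, 99], 'f')
b = np.array([3, 7, 2, 8], 'f')
result = mean_ignoring([a, b], 99.0)

# Turn the result into a plain ndarray; a position where every input held
# the sentinel stays masked in `result` and becomes nan here.
plain = result.filled(np.nan)
```

For the arrays above, plain holds the values 2, 6, 3 and 8. Exact
comparison against a float sentinel is fine for a marker like 99 that was
stored exactly; for values that come out of a computation,
np.ma.masked_values, which compares within a tolerance, may be the safer
choice.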

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco