[Numpy-discussion] sum of array for masked area

Scott Sinclair scott.sinclair.za at gmail.com
Thu Nov 28 03:20:47 EST 2013


On 28 November 2013 09:06, questions anon <questions.anon at gmail.com> wrote:
> I have a separate text file for daily rainfall data that covers the whole
> country. I would like to calculate the monthly mean, min, max and the mean
> of the sum for one state.
>
> I can get the max, min and mean for the state, but the mean of the sum keeps
> giving me a result for the whole country rather than just the state, even

> def accumulate_month(year, month):
>     files = glob.glob(GLOBTEMPLATE.format(year=year, month=month))
>     monthlyrain=[]
>     for ifile in files:
>         try:
>             f=np.genfromtxt(ifile,skip_header=6)
>         except:
>             print "ERROR with file:", ifile
>             errors.append(ifile)
>         f=np.flipud(f)
>
>         stateonly_f=np.ma.masked_array(f, mask=newmask.mask) # this masks data to state
>
>
>         print "stateonly_f:", stateonly_f.max(), stateonly_f.mean(), stateonly_f.sum()
>
>         monthlyrain.append(stateonly_f)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
At this point monthlyrain is a list of masked arrays

>     r_sum=np.sum(monthlyrain, axis=0)
                              ^^^^^^^^^^^
Passing a list of masked arrays to np.sum returns an np.ndarray object
(*not* a masked array). The list is first converted to a plain ndarray,
and the masks are lost in that conversion.

>     r_mean_of_sum=MA.mean(r_sum)

Therefore this call to MA.mean returns the mean of all values in the
ndarray r_sum, i.e. for the whole country rather than just the state.
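
A minimal sketch of the problem, using small made-up arrays in place of
your rainfall grids (the names mask, days and list_sum are only for
illustration):

```python
import numpy as np

# Hypothetical stand-in data: two 2x2 "daily" grids sharing one mask.
mask = [[0, 1], [0, 0]]
days = [np.ma.masked_array(np.arange(4).reshape((2, 2)), mask=mask)
        for _ in range(2)]

# Summing the plain Python list: np.sum converts the list to an ndarray
# first, so the masks are dropped.
list_sum = np.sum(days, axis=0)

print(type(list_sum))   # <class 'numpy.ndarray'>
print(list_sum.mean())  # 3.0, the masked cell (value 2) is included
```

The masked cell contributes to the mean here, which is the same effect
you are seeing with the whole-country result.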

To fix: convert your monthlyrain list to a 3D masked array before
calling np.sum(monthlyrain, axis=0). In this case np.sum will call the
masked array's .sum() method, which knows about the mask.

monthlyrain = np.ma.asarray(monthlyrain)
r_sum=np.sum(monthlyrain, axis=0)

Consider the following simplified example:

import numpy as np

alist = []
for k in range(2):
    a = np.arange(4).reshape((2,2))
    alist.append(np.ma.masked_array(a, mask=[[0,1],[0,0]]))

print(alist)
print(type(alist))

alist = np.ma.asarray(alist)
print(alist)
print(type(alist))

asum = np.sum(alist, axis=0)

print(asum)
print(type(asum))

print(asum.mean())

Cheers,
Scott


