[Numpy-discussion] ignore NAN in numpy.true_divide()

Xavier Barthelemy xabart at gmail.com
Mon Dec 5 22:31:31 EST 2011


Well, I would see two solutions:
1- keep your code as it is, with a Python list (you can stack numpy
arrays only if they have the same dimensions):

for filename in netCDF_list:
        ncfile=netCDF4.Dataset(filename)
        TSFC=ncfile.variables['T_SFC'][:]
        fillvalue=ncfile.variables['T_SFC']._FillValue
        TSFC=MA.masked_values(TSFC, fillvalue)
        TSFCWithOutNan=[]
        for a in TSFC:
                indexnonNaN=N.isfinite(a)
                SliceofTotoWithoutNan=a[indexnonNaN]
                print SliceofTotoWithoutNan
                TSFCWithOutNan.append(SliceofTotoWithoutNan)

        for i in xrange(0,len(TSFCWithOutNan)-1,1):
                slice_counter +=1
                #print slice_counter
                try:
                        running_sum=N.add(running_sum, TSFCWithOutNan[i])
                except NameError:
                        print "Initiating the running total of my variable..."
                        running_sum=N.array(TSFCWithOutNan[i])
...

or 2- everything in the same loop:

slice_counter  =0
        for a in TSFC:
                indexnonNaN=N.isfinite(a)
                SliceofTotoWithoutNan=a[indexnonNaN]
                slice_counter +=1
                #print slice_counter
                try:
                        running_sum=N.add(running_sum, SliceofTotoWithoutNan)
                except NameError:
                        print "Initiating the running total of my variable..."
                        running_sum=N.array(SliceofTotoWithoutNan)
TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg

See if it works; it is just a quick guess.
Xavier
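
One caveat on both versions above: a[indexnonNaN] returns a flattened array
holding only the finite values, so slices with different numbers of NaNs end
up with different lengths and N.add may refuse to add them. A possible
alternative, only a sketch and not tested on the real files, is to keep every
slice at its full shape and count the valid values per cell. The function name
nan_aware_running_mean and the variables valid_count and example_mean are
made up for illustration, and it assumes all slices share the same shape:

```python
import numpy as np

def nan_aware_running_mean(slices):
    """Cell-by-cell mean over equally shaped arrays, skipping NaN/masked cells.

    Hypothetical helper for illustration: `slices` is any iterable of
    plain or masked arrays that all have the same shape.
    """
    running_sum = None
    valid_count = None
    for a in slices:
        # Turn masked cells into NaN so a single test covers both cases.
        data = np.ma.filled(np.ma.array(a, dtype=float), np.nan)
        valid = np.isfinite(data)
        if running_sum is None:
            running_sum = np.zeros(data.shape)
            valid_count = np.zeros(data.shape, dtype=int)
        # Invalid cells contribute 0 to the sum and 0 to the count.
        running_sum += np.where(valid, data, 0.0)
        valid_count += valid
    if running_sum is None:
        raise ValueError("no slices were given")
    # Cells that never held a valid value become NaN (0/0) without a warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.true_divide(running_sum, valid_count)

# Small made-up example: one NaN and one fill value (-999.0).
first = np.array([[1.0, np.nan], [3.0, 4.0]])
second = np.ma.masked_values(np.array([[3.0, 2.0], [-999.0, 8.0]]), -999.0)
example_mean = nan_aware_running_mean([first, second])
```

Only one slice is in memory at a time, and the division is done per cell by
the number of values that were actually summed there, instead of by a single
slice_counter.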

> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder +
> '*/02/')+ glob.glob(MainFolder + '*/12/'):
>
>         #print dir
>
>         for ncfile in glob.glob(dir + '*.nc'):
>             netCDF_list.append(ncfile)
>
> slice_counter=0
> print netCDF_list
> for filename in netCDF_list:
>         ncfile=netCDF4.Dataset(filename)
>         TSFC=ncfile.variables['T_SFC'][:]
>         fillvalue=ncfile.variables['T_SFC']._FillValue
>         TSFC=MA.masked_values(TSFC, fillvalue)
>         for a in TSFC:
>                 indexnonNaN=N.isfinite(a)
>                 SliceofTotoWithoutNan=a[indexnonNaN]
>                 print SliceofTotoWithoutNan
>         TSFC=SliceofTotoWithoutNan
>
>
>         for i in xrange(0,len(TSFC)-1,1):
>                 slice_counter +=1
>                 #print slice_counter
>                 try:
>                         running_sum=N.add(running_sum, TSFC[i])
>                 except NameError:
>                         print "Initiating the running total of my variable..."
>                         running_sum=N.array(TSFC[i])
>
> TSFC_avg=N.true_divide(running_sum, slice_counter)
> N.set_printoptions(threshold='nan')
> print "the TSFC_avg is:", TSFC_avg
>
>
>
>
> On Tue, Dec 6, 2011 at 9:50 AM, Xavier Barthelemy <xabart at gmail.com> wrote:
>
>> Hi,
>> I don't know if it is the best choice, but this is what I do in my code:
>>
>> for each slice:
>>   indexnonNaN=np.isfinite(SliceOfToto)
>>   SliceOfTotoWithoutNan=SliceOfToto[indexnonNaN]
>>
>> and then perform all the operations I want on the last array.
>>
>> I hope this answers your question.
>>
>> Xavier
>>
>>
>> 2011/12/6 questions anon <questions.anon at gmail.com>
>>
>>> Maybe I am asking the wrong question, or could go about this another way.
>>> I have thousands of numpy arrays to flick through. Could I just identify
>>> which arrays contain NaNs and, for now, ignore the entire array? Is there
>>> a simple way to do this?
>>> Any feedback will be greatly appreciated.
>>>
>>> On Thu, Dec 1, 2011 at 12:16 PM, questions anon <
>>> questions.anon at gmail.com> wrote:
>>>
>>>> I am trying to calculate the mean across many netcdf files. I cannot
>>>> use numpy.mean because there are too many files to concatenate and I end up
>>>> with a memory error. I have put together the code below to do what I need,
>>>> but I have a few nan values in some of my arrays. Is there a way to ignore
>>>> these somewhere in my code? I seem to face this problem often, so I would
>>>> love a command that ignores blanks in my array before I continue on to the
>>>> next processing step.
>>>> Any feedback is greatly appreciated.
>>>>
>>>>
>>>> netCDF_list=[]
>>>> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder +
>>>> '*/02/')+ glob.glob(MainFolder + '*/12/'):
>>>>         for ncfile in glob.glob(dir + '*.nc'):
>>>>             netCDF_list.append(ncfile)
>>>>
>>>> slice_counter=0
>>>> print netCDF_list
>>>>
>>>> for filename in netCDF_list:
>>>>         ncfile=netCDF4.Dataset(filename)
>>>>         TSFC=ncfile.variables['T_SFC'][:]
>>>>         fillvalue=ncfile.variables['T_SFC']._FillValue
>>>>         TSFC=MA.masked_values(TSFC, fillvalue)
>>>>         for i in xrange(0,len(TSFC)-1,1):
>>>>                 slice_counter +=1
>>>>                 #print slice_counter
>>>>                 try:
>>>>                         running_sum=N.add(running_sum, TSFC[i])
>>>>                 except NameError:
>>>>                         print "Initiating the running total of my variable..."
>>>>                         running_sum=N.array(TSFC[i])
>>>>
>>>> TSFC_avg=N.true_divide(running_sum, slice_counter)
>>>> N.set_printoptions(threshold='nan')
>>>> print "the TSFC_avg is:", TSFC_avg
>>>>
>>>>
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>
>>
>> --
>>  « Quand le gouvernement viole les droits du peuple, l'insurrection est,
>> pour le peuple et pour chaque portion du peuple, le plus sacré des droits
>> et le plus indispensable des devoirs »
>>
>> Déclaration des droits de l'homme et du citoyen, article 35, 1793

