[Numpy-discussion] ignore NAN in numpy.true_divide()

questions anon questions.anon at gmail.com
Mon Dec 5 23:27:37 EST 2011


Thanks again for your response. I must still be doing something wrong!
both options resulted in :
the TSFC_avg is: [-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
-- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
 -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

1st option:

slice_counter=0

for filename in netCDF_list:
        ncfile=netCDF4.Dataset(filename)
        TSFC=ncfile.variables['T_SFC'][:]
        fillvalue=ncfile.variables['T_SFC']._FillValue
        TSFC=MA.masked_values(TSFC, fillvalue)
        TSFCWithOutNan=[]
        for a in TSFC:
                indexnonNaN=N.isfinite(a)
                SliceofTotoWithoutNan=a[indexnonNaN]
                print SliceofTotoWithoutNan
                TSFCWithOutNan.append(SliceofTotoWithoutNan)
        for i in xrange(0,len(TSFCWithOutNan)-1,1):
                slice_counter +=1
                try:
                        running_sum=N.add(running_sum, TSFCWithOutNan[i])
                except NameError:
                        print "Initiating the running total of my variable..."
                        running_sum=N.array(TSFCWithOutNan[i])

TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg
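
A side note on the indexing step used in both options: with plain ndarrays, indexing a 2-D slice with a boolean array of the same shape returns a flattened 1-D array whose length depends on how many values were kept, so slices that lose different numbers of cells no longer line up cell by cell. A minimal sketch, assuming made-up 2x2 arrays (the names `a`, `b` and their values are illustrative only, not taken from the netCDF files):

```python
import numpy as np

# Two made-up 2x2 slices with NaNs in different numbers of cells.
a = np.array([[1.0, np.nan], [3.0, 4.0]])
b = np.array([[np.nan, np.nan], [7.0, 8.0]])

a_kept = a[np.isfinite(a)]   # boolean indexing drops the 2-D shape
b_kept = b[np.isfinite(b)]

print(a_kept.shape)
print(b_kept.shape)

try:
    np.add(a_kept, b_kept)   # 1-D arrays of different lengths cannot be added
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)
```

Even when the kept lengths happen to match, the values being added may come from different grid cells, so a cell-wise average probably needs an approach that keeps the original shape.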



the 2nd option :

for filename in netCDF_list:
        ncfile=netCDF4.Dataset(filename)
        TSFC=ncfile.variables['T_SFC'][:]
        fillvalue=ncfile.variables['T_SFC']._FillValue
        TSFC=MA.masked_values(TSFC, fillvalue)

        slice_counter=0
        for a in TSFC:
                indexnonNaN=N.isfinite(a)
                SliceofTotoWithoutNan=a[indexnonNaN]
                slice_counter +=1
                try:
                        running_sum=N.add(running_sum, SliceofTotoWithoutNan)
                except NameError:
                        print "Initiating the running total of my variable..."
                        running_sum=N.array(SliceofTotoWithoutNan)

TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg
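
One possible explanation for the all-masked result, not checked against the actual files: numpy.ma propagates masks through addition, so a cell that is masked in any one slice ends up masked in the running sum. A hedged sketch of an alternative is to keep a plain running sum plus a per-cell count of valid values and divide at the end. The function name `running_mean_ignoring_missing`, the fill value -999.0, and the two small arrays below are assumptions made for illustration; they stand in for slices of `T_SFC`.

```python
import numpy as np
import numpy.ma as ma

def running_mean_ignoring_missing(slices):
    """Cell-by-cell mean of equally shaped slices, skipping masked/NaN cells."""
    running_sum = None
    valid_count = None
    for a in slices:
        data = ma.getdata(a)
        # A cell is usable if it is neither masked nor NaN/inf.
        valid = ~ma.getmaskarray(a) & np.isfinite(data)
        if running_sum is None:
            running_sum = np.zeros(data.shape, dtype=float)
            valid_count = np.zeros(data.shape, dtype=int)
        running_sum += np.where(valid, data, 0.0)  # missing cells add nothing
        valid_count += valid                       # and are not counted
    # Divide by the per-cell count; cells that never had data stay masked.
    mean = running_sum / np.maximum(valid_count, 1)
    return ma.masked_where(valid_count == 0, mean)

# Made-up stand-ins for two slices, with an assumed fill value of -999.0.
s1 = ma.masked_values(np.array([[1.0, -999.0], [3.0, np.nan]]), -999.0)
s2 = ma.masked_values(np.array([[3.0, 4.0], [-999.0, np.nan]]), -999.0)

union_masked = np.add(s1, s2)   # masked wherever either slice is masked
avg = running_mean_ignoring_missing([s1, s2])
print(avg)
```

Only two arrays the size of one slice are held in memory, so this should avoid the concatenation memory error. Separately, note that `xrange(0, len(x)-1, 1)` in the first option stops one short of the last slice.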





On Tue, Dec 6, 2011 at 2:31 PM, Xavier Barthelemy <xabart at gmail.com> wrote:

> Well, I would see two solutions:
> 1- keep your code as it is, with a Python list (you can stack numpy
> arrays if they have the same dimensions):
>
> for filename in netCDF_list:
>         ncfile=netCDF4.Dataset(filename)
>         TSFC=ncfile.variables['T_SFC'][:]
>         fillvalue=ncfile.variables['T_SFC']._FillValue
>         TSFC=MA.masked_values(TSFC, fillvalue)
>         TSFCWithOutNan=[]
>         for a in TSFC:
>                 indexnonNaN=N.isfinite(a)
>                 SliceofTotoWithoutNan=a[indexnonNaN]
>                 print SliceofTotoWithoutNan
>                 TSFCWithOutNan .append( SliceofTotoWithoutNan )
>
>
>
>         for i in xrange(0,len(TSFCWithOutNan)-1,1):
>                 slice_counter +=1
>                 #print slice_counter
>                 try:
>                         running_sum=N.add(running_sum, TSFCWithOutNan[i])
>                 except NameError:
>                         print "Initiating the running total of my variable..."
>                         running_sum=N.array(TSFCWithOutNan[i])
> ...
>
> or 2- everything in the same loop:
>
> slice_counter=0
>         for a in TSFC:
>                 indexnonNaN=N.isfinite(a)
>                 SliceofTotoWithoutNan=a[indexnonNaN]
>                 slice_counter +=1
>                 #print slice_counter
>                 try:
>                         running_sum=N.add(running_sum, SliceofTotoWithoutNan)
>                 except NameError:
>                         print "Initiating the running total of my variable..."
>                         running_sum=N.array(SliceofTotoWithoutNan)
> TSFC_avg=N.true_divide(running_sum, slice_counter)
> N.set_printoptions(threshold='nan')
> print "the TSFC_avg is:", TSFC_avg
>
> See if it works. It is just a quick guess.
> Xavier
>
>
> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder +
> '*/02/')+ glob.glob(MainFolder + '*/12/'):
>
>>         #print dir
>>
>>         for ncfile in glob.glob(dir + '*.nc'):
>>             netCDF_list.append(ncfile)
>>
>> slice_counter=0
>> print netCDF_list
>> for filename in netCDF_list:
>>         ncfile=netCDF4.Dataset(filename)
>>         TSFC=ncfile.variables['T_SFC'][:]
>>         fillvalue=ncfile.variables['T_SFC']._FillValue
>>         TSFC=MA.masked_values(TSFC, fillvalue)
>>         for a in TSFC:
>>                 indexnonNaN=N.isfinite(a)
>>                 SliceofTotoWithoutNan=a[indexnonNaN]
>>                 print SliceofTotoWithoutNan
>>         TSFC=SliceofTotoWithoutNan
>>
>>
>>         for i in xrange(0,len(TSFC)-1,1):
>>                 slice_counter +=1
>>                 #print slice_counter
>>                 try:
>>                         running_sum=N.add(running_sum, TSFC[i])
>>                 except NameError:
>>                         print "Initiating the running total of my variable..."
>>                         running_sum=N.array(TSFC[i])
>>
>> TSFC_avg=N.true_divide(running_sum, slice_counter)
>> N.set_printoptions(threshold='nan')
>> print "the TSFC_avg is:", TSFC_avg
>>
>>
>>
>>
>> On Tue, Dec 6, 2011 at 9:50 AM, Xavier Barthelemy <xabart at gmail.com> wrote:
>>
>>> Hi,
>>> I don't know if it is the best choice, but this is what I do in my code:
>>>
>>> for each slice:
>>>   indexnonNaN=np.isfinite(SliceOfToto)
>>>   SliceOfTotoWithoutNan=SliceOfToto[indexnonNaN]
>>>
>>> and then perform all the operations I want on the last array.
>>>
>>> I hope this answers your question.
>>>
>>> Xavier
>>>
>>>
>>> 2011/12/6 questions anon <questions.anon at gmail.com>
>>>
>>>> Maybe I am asking the wrong question or could go about this another
>>>> way.
>>>> I have thousands of numpy arrays to flick through. Could I just
>>>> identify which arrays have NaNs and, for now, ignore the entire array? Is
>>>> there a simple way to do this?
>>>> Any feedback will be greatly appreciated.
>>>>
>>>> On Thu, Dec 1, 2011 at 12:16 PM, questions anon <
>>>> questions.anon at gmail.com> wrote:
>>>>
>>>>> I am trying to calculate the mean across many netCDF files. I cannot
>>>>> use numpy.mean because there are too many files to concatenate and I end up
>>>>> with a memory error. I have put together the code below to do what I need, but I
>>>>> have a few NaN values in some of my arrays. Is there a way to ignore these
>>>>> somewhere in my code? I seem to face this problem often, so I would love a
>>>>> command that ignores blanks in my array before I continue on to the next
>>>>> processing step.
>>>>> Any feedback is greatly appreciated.
>>>>>
>>>>>
>>>>> netCDF_list=[]
>>>>> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder +
>>>>> '*/02/')+ glob.glob(MainFolder + '*/12/'):
>>>>>         for ncfile in glob.glob(dir + '*.nc'):
>>>>>             netCDF_list.append(ncfile)
>>>>>
>>>>> slice_counter=0
>>>>> print netCDF_list
>>>>>
>>>>> for filename in netCDF_list:
>>>>>         ncfile=netCDF4.Dataset(filename)
>>>>>         TSFC=ncfile.variables['T_SFC'][:]
>>>>>         fillvalue=ncfile.variables['T_SFC']._FillValue
>>>>>         TSFC=MA.masked_values(TSFC, fillvalue)
>>>>>         for i in xrange(0,len(TSFC)-1,1):
>>>>>                 slice_counter +=1
>>>>>                 #print slice_counter
>>>>>                 try:
>>>>>                         running_sum=N.add(running_sum, TSFC[i])
>>>>>                 except NameError:
>>>>>                         print "Initiating the running total of my variable..."
>>>>>                         running_sum=N.array(TSFC[i])
>>>>>
>>>>> TSFC_avg=N.true_divide(running_sum, slice_counter)
>>>>> N.set_printoptions(threshold='nan')
>>>>> print "the TSFC_avg is:", TSFC_avg
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>>
>>>>
>>>
>>>
>>> --
>>>  "When the government violates the rights of the people, insurrection is,
>>> for the people and for each portion of the people, the most sacred of rights
>>> and the most indispensable of duties."
>>>
>>> Declaration of the Rights of Man and of the Citizen, article 35, 1793
>>>
>>
>
>