[Numpy-discussion] ignore NAN in numpy.true_divide()
Xavier Barthelemy
xabart at gmail.com
Mon Dec 5 22:31:31 EST 2011
Well, I see two solutions:
1- keep your code as it is, with a Python list (you can stack numpy
arrays if they have the same dimensions):
for filename in netCDF_list:
    ncfile=netCDF4.Dataset(filename)
    TSFC=ncfile.variables['T_SFC'][:]
    fillvalue=ncfile.variables['T_SFC']._FillValue
    TSFC=MA.masked_values(TSFC, fillvalue)
    TSFCWithOutNan=[]
    for a in TSFC:
        indexnonNaN=N.isfinite(a)
        SliceofTotoWithoutNan=a[indexnonNaN]
        print SliceofTotoWithoutNan
        TSFCWithOutNan.append(SliceofTotoWithoutNan)
    for i in xrange(0,len(TSFCWithOutNan)-1,1):
        slice_counter +=1
        #print slice_counter
        try:
            running_sum=N.add(running_sum, TSFCWithOutNan[i])
        except NameError:
            print "Initiating the running total of my variable..."
            running_sum=N.array(TSFCWithOutNan[i])
...
or 2- everything in the same loop:
slice_counter=0
for a in TSFC:
    indexnonNaN=N.isfinite(a)
    SliceofTotoWithoutNan=a[indexnonNaN]
    slice_counter +=1
    #print slice_counter
    try:
        running_sum=N.add(running_sum, SliceofTotoWithoutNan)
    except NameError:
        print "Initiating the running total of my variable..."
        running_sum=N.array(SliceofTotoWithoutNan)
TSFC_avg=N.true_divide(running_sum, slice_counter)
N.set_printoptions(threshold='nan')
print "the TSFC_avg is:", TSFC_avg
See if it works; it is just a quick guess.
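
One caveat with the loops above, plus a possible variant, offered as a sketch under assumptions rather than a tested fix for your files: boolean indexing such as a[indexnonNaN] returns a flattened 1-D array whose length depends on how many values are finite, so slices with different numbers of NaNs may no longer line up when they are added together. An alternative is to keep every slice at its full shape, add zero where a value is missing, and count the valid values per grid cell. The function name running_mean_ignoring_nan and the sample arrays below are hypothetical, and the sketch is written against a current NumPy (imported as np) rather than the N/MA aliases used in the thread.

```python
import numpy as np


def running_mean_ignoring_nan(slices):
    """Per-cell mean over same-shaped slices, skipping NaN/inf/masked values.

    Only one running sum and one counter array are kept in memory,
    so the slices never have to be concatenated.
    """
    running_sum = None
    valid_count = None
    for a in slices:
        # Masked entries (e.g. from masked_values) become NaN; plain arrays pass through.
        data = np.ma.asarray(a, dtype=float).filled(np.nan)
        finite = np.isfinite(data)
        if running_sum is None:
            running_sum = np.zeros(data.shape)
            valid_count = np.zeros(data.shape, dtype=int)
        # Missing values contribute 0 to the sum and nothing to the count.
        running_sum += np.where(finite, data, 0.0)
        valid_count += finite
    if running_sum is None:
        raise ValueError("no slices given")
    # Cells that never had a valid value stay NaN instead of dividing by zero.
    mean = np.full(running_sum.shape, np.nan)
    np.true_divide(running_sum, valid_count, out=mean, where=valid_count > 0)
    return mean


# Hypothetical 2x2 slices: two with NaNs, one with a masked fill value.
slices = [
    np.array([[1.0, np.nan], [3.0, 4.0]]),
    np.array([[3.0, 2.0], [np.nan, 4.0]]),
    np.ma.masked_values(np.array([[5.0, -999.0], [3.0, 4.0]]), -999.0),
]
mean = running_mean_ignoring_nan(slices)
```

Dividing by a per-cell count rather than by a single slice_counter means a cell that is missing in some slices is averaged only over the slices where it was present.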
Xavier
> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder +
>         '*/02/')+ glob.glob(MainFolder + '*/12/'):
>     #print dir
>
>     for ncfile in glob.glob(dir + '*.nc'):
>         netCDF_list.append(ncfile)
>
> slice_counter=0
> print netCDF_list
> for filename in netCDF_list:
>     ncfile=netCDF4.Dataset(filename)
>     TSFC=ncfile.variables['T_SFC'][:]
>     fillvalue=ncfile.variables['T_SFC']._FillValue
>     TSFC=MA.masked_values(TSFC, fillvalue)
>     for a in TSFC:
>         indexnonNaN=N.isfinite(a)
>         SliceofTotoWithoutNan=a[indexnonNaN]
>         print SliceofTotoWithoutNan
>         TSFC=SliceofTotoWithoutNan
>
>
>     for i in xrange(0,len(TSFC)-1,1):
>         slice_counter +=1
>         #print slice_counter
>         try:
>             running_sum=N.add(running_sum, TSFC[i])
>         except NameError:
>             print "Initiating the running total of my variable..."
>             running_sum=N.array(TSFC[i])
>
> TSFC_avg=N.true_divide(running_sum, slice_counter)
> N.set_printoptions(threshold='nan')
> print "the TSFC_avg is:", TSFC_avg
>
> On Tue, Dec 6, 2011 at 9:50 AM, Xavier Barthelemy <xabart at gmail.com> wrote:
>
>> Hi,
>> I don't know if it is the best choice, but this is what I do in my code:
>>
>> for each slice:
>>     indexnonNaN=np.isfinite(SliceOfToto)
>>     SliceOfTotoWithoutNan=SliceOfToto[indexnonNaN]
>>
>> and then perform all the operations I want on the last array.
>>
>> I hope it answers your question.
>>
>> Xavier
>>
>>
>> 2011/12/6 questions anon <questions.anon at gmail.com>
>>
>>> Maybe I am asking the wrong question or could go about this another way.
>>> I have thousands of numpy arrays to flick through. Could I just identify
>>> which arrays have NaNs and, for now, ignore the entire array? Is there a
>>> simple way to do this?
>>> Any feedback will be greatly appreciated.
>>>
>>> On Thu, Dec 1, 2011 at 12:16 PM, questions anon <
>>> questions.anon at gmail.com> wrote:
>>>
>>>> I am trying to calculate the mean across many netcdf files. I cannot
>>>> use numpy.mean because there are too many files to concatenate and I end up
>>>> with a memory error. I have enabled the below code to do what I need but I
>>>> have a few nan values in some of my arrays. Is there a way to ignore these
>>>> somewhere in my code? I seem to face this problem often, so I would love a
>>>> command that ignores blanks in my array before I continue on to the next
>>>> processing step.
>>>> Any feedback is greatly appreciated.
>>>>
>>>>
>>>> netCDF_list=[]
>>>> for dir in glob.glob(MainFolder + '*/01/')+ glob.glob(MainFolder +
>>>>         '*/02/')+ glob.glob(MainFolder + '*/12/'):
>>>>     for ncfile in glob.glob(dir + '*.nc'):
>>>>         netCDF_list.append(ncfile)
>>>>
>>>> slice_counter=0
>>>> print netCDF_list
>>>>
>>>> for filename in netCDF_list:
>>>>     ncfile=netCDF4.Dataset(filename)
>>>>     TSFC=ncfile.variables['T_SFC'][:]
>>>>     fillvalue=ncfile.variables['T_SFC']._FillValue
>>>>     TSFC=MA.masked_values(TSFC, fillvalue)
>>>>     for i in xrange(0,len(TSFC)-1,1):
>>>>         slice_counter +=1
>>>>         #print slice_counter
>>>>         try:
>>>>             running_sum=N.add(running_sum, TSFC[i])
>>>>         except NameError:
>>>>             print "Initiating the running total of my variable..."
>>>>             running_sum=N.array(TSFC[i])
>>>>
>>>> TSFC_avg=N.true_divide(running_sum, slice_counter)
>>>> N.set_printoptions(threshold='nan')
>>>> print "the TSFC_avg is:", TSFC_avg
>>>>
>>>>
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>
>>
>> --
>> "When the government violates the rights of the people, insurrection is,
>> for the people and for each portion of the people, the most sacred of rights
>> and the most indispensable of duties."
>>
>> Declaration of the Rights of Man and of the Citizen, article 35, 1793