[SciPy-User] memory error - numpy mean - netcdf4

Phil Morefield philmorefield at yahoo.com
Fri Aug 26 13:31:00 EDT 2011


First off, the netCDF4 module has a multi-file class that concatenates multiple netCDF files for you: http://netcdf4-python.googlecode.com/svn/trunk/docs/netCDF4.MFDataset-class.html. That will simplify things a bit.
 
Second, usually the "TIME" dimension is axis=2. Axes 0 and 1 usually correspond to the X and Y dimensions.
 
Finally, you're getting the MemoryError because you're trying to put a ginormous array into memory all at once, and your system can't allocate that much. Instead, loop through each time step and keep a running total and a counter. Then divide your total (which is an array) by your counter (which is an integer or float) and presto: you have your average. It's plenty fast, don't worry.
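
A minimal sketch of that running-total loop, under a few assumptions: the names running_mean and time_slices are made up for illustration, the time dimension is taken to be the first axis (as in your slicing [4::24,:,:]), and the counter is kept per grid cell rather than as a single number so that fill values don't drag the average down. Treat it as a starting point, not a tested drop-in for your files.

```python
import numpy as np


def running_mean(slices):
    """Average an iterable of equally shaped 2-D (masked) arrays, one at a time."""
    total = None
    count = None
    for step in slices:
        step = np.ma.asarray(step, dtype=np.float64)
        valid = ~np.ma.getmaskarray(step)  # True where the cell holds real data
        if total is None:
            total = np.zeros(step.shape, dtype=np.float64)
            count = np.zeros(step.shape, dtype=np.int64)
        total += step.filled(0.0)  # masked cells contribute nothing to the sum
        count += valid             # per-cell counter of valid time steps
    if total is None:
        raise ValueError("no time steps supplied")
    # Cells that never had valid data stay masked in the result.
    return np.ma.masked_array(total / np.maximum(count, 1), mask=(count == 0))


def time_slices(paths, varname="T_SFC", start=4, stride=24):
    """Yield one 2-D time step at a time from each file (hypothetical helper).

    Assumes the variable is laid out as (time, y, x).
    """
    from netCDF4 import Dataset  # imported here so running_mean works without it
    for path in paths:
        ds = Dataset(path, "r")
        try:
            var = ds.variables[varname]
            for t in range(start, var.shape[0], stride):
                yield var[t, :, :]  # only this slice is read into memory
        finally:
            ds.close()
```

With a list of file paths collected from your os.walk loop, something like Mean = running_mean(time_slices(nc_paths)) would replace the append/concatenate/mean steps, and only one time step plus the two accumulators are in memory at any moment.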
 
 
 

From: questions anon <questions.anon at gmail.com>
To: scipy-user at scipy.org
Sent: Tuesday, August 23, 2011 7:00 PM
Subject: [SciPy-User] memory error - numpy mean - netcdf4


Hi All, 
I am receiving a memory error when I try to calculate the Numpy mean across many NetCDF files.
Is there a way to fix this? The code I am using is below.
Any feedback will be greatly appreciated.


from netCDF4 import Dataset
import matplotlib.pyplot as plt
import numpy as N
from mpl_toolkits.basemap import Basemap
from netcdftime import utime
from datetime import datetime
import os

MainFolder=r"E:/GriddedData/T_SFC/"

all_TSFC=[] 
for (path, dirs, files) in os.walk(MainFolder):
    for dir in dirs:
        print dir
    path=path+'/'
    for ncfile in files:
        if ncfile[-3:]=='.nc':
            #print "dealing with ncfiles:", ncfile
            ncfile=os.path.join(path,ncfile)
            ncfile=Dataset(ncfile, 'r+', 'NETCDF4')
            TSFC=ncfile.variables['T_SFC'][4::24,:,:]
            LAT=ncfile.variables['latitude'][:]
            LON=ncfile.variables['longitude'][:]
            TIME=ncfile.variables['time'][:]
            fillvalue=ncfile.variables['T_SFC']._FillValue
            ncfile.close()

            #combine all TSFC to make one array for analyses
            all_TSFC.append(TSFC)
           
big_array=N.ma.concatenate(all_TSFC)
#calculate the mean of the combined array
Mean=big_array.mean(axis=0)
print "the mean is", Mean


#plot output summary stats
map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33,
              llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i')
map.drawcoastlines()
map.drawstates()
x,y=map(*N.meshgrid(LON,LAT))
plt.title('TSFC Mean at 3pm')
ticks=[-5,0,5,10,15,20,25,30,35,40,45,50]
CS = map.contourf(x,y,Mean, cmap=plt.cm.jet)
l,b,w,h =0.1,0.1,0.8,0.8
cax = plt.axes([l+w+0.025, b, 0.025, h])
plt.colorbar(CS,cax=cax, drawedges=True)

plt.savefig((os.path.join(MainFolder, 'Mean.png')))
plt.show()
plt.close()

print "end processing"          



