Memory error

Jamie Mitchell jamiemitchell1604 at gmail.com
Mon Mar 24 07:32:31 EDT 2014


Hello all,

I'm afraid I'm new to all this, so bear with me...

I am looking to test for a statistically significant relationship between two large netCDF data sets.

Firstly, I've loaded the two files into Python:

import netCDF4

swh=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/controlperiod/averages/swh_control_concat.nc', 'r')

swh_2050s=netCDF4.Dataset('/data/cr1/jmitchel/Q0/swh/2050s/averages/swh_2050s_concat.nc', 'r')

I have then isolated the variables I want to perform the Pearson correlation on:

hs=swh.variables['hs']

hs_2050s=swh_2050s.variables['hs']
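
As far as I understand, a netCDF4 Variable is read lazily, i.e. slicing it only pulls that slab from disk:

one_field = hs[0]  # a single (350, 227) time slice, read on demand

So just holding the Variable objects seems to be cheap; the trouble starts when the whole thing is converted at once.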

Here is the metadata for those variables:

print hs
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
    standard_name: significant_height_of_wind_and_swell_waves
    long_name: significant_wave_height
    units: m
    add_offset: 0.0
    scale_factor: 0.002
    _FillValue: -32767
    missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)

print hs_2050s
<type 'netCDF4.Variable'>
int16 hs(time, latitude, longitude)
    standard_name: significant_height_of_wind_and_swell_waves
    long_name: significant_wave_height
    units: m
    add_offset: 0.0
    scale_factor: 0.002
    _FillValue: -32767
    missing_value: -32767
unlimited dimensions: time
current shape = (86400, 350, 227)
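
If I've done the sums right, that shape means each variable holds about 6.9 billion values, which I suspect is the root of the problem:

import numpy as np

print np.prod(hs.shape)  # 6864480000 values per variable

print np.prod(hs.shape) * 8 / 1e9  # ~55 GB each as float64

Since netCDF4 applies the scale_factor/add_offset on read and (I believe) hands back float64, each array would need something like 55 GB of RAM once converted.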


Then, to perform the Pearson correlation:

from scipy.stats.stats import pearsonr

pearsonr(hs,hs_2050s)

I then get a memory error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/sci/lib/python2.7/site-packages/scipy/stats/stats.py", line 2409, in pearsonr
    x = np.asarray(x)
  File "/usr/local/sci/lib/python2.7/site-packages/numpy/core/numeric.py", line 321, in asarray
    return array(a, dtype, copy=False, order=order)
MemoryError

This also happens when I try to create NumPy arrays from the data.
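
One idea I had was to accumulate the sums that Pearson's r is built from in chunks along the time axis, so that only one slab of each file is in memory at a time. A rough, untested sketch (the chunk size of 1000 is just a guess, and I'm not sure the masked-value handling is right):

import numpy as np

n = 0
sx = sy = sxx = syy = sxy = 0.0
step = 1000  # time steps per chunk - tune to available RAM

for start in range(0, hs.shape[0], step):
    # Slicing a netCDF4 Variable reads just this slab from disk
    # (scale_factor/add_offset are applied automatically).
    x = hs[start:start + step]
    y = hs_2050s[start:start + step]
    # Drop points that are masked (fill value) in either file.
    valid = ~(np.ma.getmaskarray(x) | np.ma.getmaskarray(y))
    x = np.asarray(x)[valid].astype(np.float64)
    y = np.asarray(y)[valid].astype(np.float64)
    n += x.size
    sx += x.sum()
    sy += y.sum()
    sxx += np.dot(x, x)
    syy += np.dot(y, y)
    sxy += np.dot(x, y)

r = (n * sxy - sx * sy) / np.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
print r

That would only give r itself; I assume the p-value could then be derived from the usual t statistic, though with n in the billions I'd expect it to be tiny anyway.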

Does anyone know how I can get around these memory errors?

Cheers,

Jamie


