[Numpy-discussion] NetCDF4/numpy question

Olivier Delalleau shish at keba.be
Sat Jan 28 00:28:42 EST 2012


Eric's probably right and it's indexing with a masked array that's causing
you trouble.
Since you seem to say your NaN values correspond to your mask, you should
be able to simply do:

modelData[modelData.mask] = dataMin

Note that in further processing it may then make more sense to drop the
mask, since your array now contains only valid data:
modelData = modelData.data
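
A minimal sketch of the two steps above, assuming (as in this thread) that
every NaN is already covered by the mask. The sample values and the name
plainData are made up for illustration; only modelData and dataMin come from
the original question. np.ma.getmaskarray is used instead of .mask only so
that the index is a full boolean array even when nothing is masked.

```python
import numpy as np

# Hypothetical stand-in for one timestep: NaN marks a dry node.
raw = np.array([0.5, np.nan, 1.5, np.nan, 2.0])
dataMin = 0.5  # known minimum over all timesteps (assumed value)

# masked_invalid masks every NaN/inf entry, so the mask matches the NaNs.
modelData = np.ma.masked_invalid(raw)

# Assign through the boolean mask; with the default soft mask this also
# clears the mask at the positions that were written.
modelData[np.ma.getmaskarray(modelData)] = dataMin

# Nothing is masked any more, so the underlying ndarray can be used directly.
plainData = np.asarray(modelData.data)
```

Because the index here is an ordinary boolean array rather than the masked
result of np.isnan, the assignment should be unambiguous.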

-=- Olivier

On 27 January 2012 17:37, Howard <howard at renci.org> wrote:

>  On 1/27/12 5:21 PM, Eric Firing wrote:
>
> On 01/27/2012 11:18 AM, Howard wrote:
>
>  Hi all
>
> I am a fairly recent convert to python and I have got a question that's
> got me stumped. I hope this is the right mailing list: here goes :)
>
> I am reading some time series data out of a netcdf file a single
> timestep at a time. If the data is NaN, I want to reset it to the
> minimum of the dataset over all timesteps (which I already know). The
> data is in a variable of type numpy.ma.core.MaskedArray called modelData.
>
> If I do this:
>
> for i in range(len(modelData)):
>     if math.isnan(modelData[i]):
>         modelData[i] = dataMin
>
> I get the effect I want. If I do this:
>
> modelData[np.isnan(modelData)] = dataMin
>
> it doesn't seem to be working. Of course I could just do the first one,
> but len(modelData) is about 3.5 million, and it's taking about 20
> seconds to run. This is happening inside of a rendering loop, so I'd
> like it to be as fast as possible, and I thought the second one might be
> faster, and maybe it is, but it doesn't seem to be working! :)
>
>  It would help if you would say explicitly what you mean by "doesn't seem
> to be working", ideally by providing a minimal complete example
> illustrating the problem.
>
>  Hi Eric
>
> Thanks for the reply.  Yes, I can be a little more specific about the
> issue.  I am reading data from a storm surge model out of a NetCDF file so
> I can render it with tricontourf. The model data has both a triangulation
> and a set of lat, lon points that are invariant for the entire model run,
> as well as data for each time step. As the model runs, triangles in the
> coastal plain wet and dry: the dry values are indicated by NaN values in
> the data and should not be rendered.  Those I mask off previous to this
> code. I have found, in using tricontourf, that in the mapping from data
> values to color values, the range of the data seems to include even the
> data from the masked triangles.  This causes the data to be either
> monochromatic or bi-chromatic (the high and low colors in the map).
> However, once the triangles are masked, if I set the corresponding data
> values to the known dataMin (or in fact, any value in the valid data range)
> the render proceeds correctly.  So in the case of the first piece of code,
> I get reasonable images: using the second I do not.
>
>
>  Does modelData have masked values that you want to keep separate from
> your NaN values?  If not, you can do this:
>
>
> No I don't think so.
>
> y = np.ma.masked_invalid(modelData).filled(dataMin)
>
> Then y will be an ordinary ndarray.  If this is not satisfactory because
> you need to keep separate some initially masked values, then you may
> need to save the initial mask and use it to turn y back into a masked array.
>
> You may be running into trouble with your initial approach because using
> np.isnan on a masked array is giving a masked array, and I think trying
> to index with a masked array is not advised.
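
To illustrate the two points above, here is a small sketch with made-up
sample values. The first part is the masked_invalid(...).filled(...)
one-liner; the second builds a plain boolean index from the underlying data
(np.ma.getdata is chosen here for illustration and is not something proposed
in the thread), so that no masked array is used as an index. The names
nan_idx and fixed are hypothetical.

```python
import numpy as np

# Hypothetical sample: one NaN hidden by the mask, one NaN left unmasked.
modelData = np.ma.array([1.0, np.nan, 2.0, np.nan],
                        mask=[False, True, False, False])
dataMin = 0.5  # assumed value

# Mask every invalid entry (in addition to the existing mask), then
# replace all masked entries with dataMin; the result is a plain ndarray.
y = np.ma.masked_invalid(modelData).filled(dataMin)

# Alternative: compute the NaN test on the raw data so the index is an
# ordinary boolean ndarray rather than a masked array.
nan_idx = np.isnan(np.ma.getdata(modelData))
fixed = modelData.copy()
fixed[nan_idx] = dataMin
```
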
>
>  This could certainly be the issue. I will look into this Monday.
>
> Thanks very much for taking the time to reply.
> Howard
>
>
>  In [2]: np.isnan(np.ma.array([1.0, np.nan, 2.0], mask=[False, False, True]))
> Out[2]:
> masked_array(data = [False True --],
>               mask = [False False  True],
>         fill_value = True)
>
> Eric
>
>
>  Any ideas would be much appreciated.
>
> Thanks
> Howard
>
> --
> Howard Lander <howard at renci.org>
> Senior Research Software Developer
> Renaissance Computing Institute (RENCI) <http://www.renci.org>
> The University of North Carolina at Chapel Hill
> Duke University
> North Carolina State University
> 100 Europa Drive
> Suite 540
> Chapel Hill, NC 27517
> 919-445-9651
>
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion