[Numpy-discussion] Problems with Masked Arrays and NaN values

Nadav Horesh nadavh at VisionSense.com
Tue Nov 4 01:41:11 EST 2003


Look at the numarray package instead of Numeric --- I think it has
better IEEE 754 support, even without MA. In most cases, numarray is
a 1:1 replacement for Numeric.

  Nadav.

On Mon, 2003-11-03 at 19:23, Leo Breebaart wrote:
> Hi all,
> 
> I have a problem, and am looking for help...
> 
> I am trying to use Python as a glue language for passing some
> very large numeric arrays in and out of various C libraries.
> These arrays can contain NaN values to indicate missing elements.
> 
> As long as on the Python level I only use Numeric to pass these
> arrays around as opaque data containers, there is no problem:
> 
> - From C library FOO I obtain the huge array 'data';
> 
> - Using the PyArray_FromDimsAndData() constructor from the
>   Numeric C API, I create a Numeric array that references 'data';
> 
> - In Python, I can pass the Numeric array on to e.g. VTKPython
>   for visualisation. VTK has no problem with the NaNs --
>   everything works.
> 
> The problem arises because I want to allow people to manipulate
> these arrays from within Python as well. As is mentioned in its
> documentation, Numeric does not support NaNs, and advises using
> Masked Arrays instead.
> 
> These would indeed seem to be well-suited for the job (setting
> aside the possible issues of performance and user-friendliness),
> but my problem is that I do not understand how I can create an
> appropriate mask for my array in the first place.
> 
> Something as simple as:
> 
>     import MA
> 
>     nanv = 1e30000/1e30000
>     a = MA.masked_array([0,nanv,nanv,nanv,4], mask=[0,1,1,1,0])
>     print MA.filled(2 * MA.sin(a))
> 
> works quite well, but explicit enumeration is clearly not an
> option for the huge pre-existing arrays I'm dealing with.
> 
> So I would want to do something similar to:
> 
>     a = MA.masked_object([0,1,nanv,3,4], nanv)
> 
> but this simply leads to a.mask() returning None.
> 
> At first I thought this was because 'nanv == nanv' always
> evaluates to False, but it turns out that in Python 2.3.2 it
> actually evaluates to True -- presumably because Python's own
> IEEE 754 support is lacking (if I understand PEP 754 correctly).
> So why doesn't the masked_object() constructor work? Beats me...
> It *does* work if I use e.g. '4' as the value parameter.
> 
> I tried many other approaches as well, including downloading the
> fpconst package mentioned in PEP 754 and trying to use its
> IsNaN() as a condition to the MA.masked_where() constructor --
> which doesn't work either, and gives me an exception somewhere
> deep within the bowels of MA.
> 
> At this point I think I've reached the end of my rope. Does
> anybody reading this have any ideas on how I might beat MA into
> submission, or know of any other solutions I could try that
> would allow me to manipulate large NaN-containing arrays
> efficiently (or even *at all*!) from within Python? Or am I
> perhaps simply (hopefully) missing something obvious?
> 
> I am eagerly looking forward to any help or advice. Many thanks
> in advance,
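
A note on the mask question above: an equality test against a NaN value is
not a reliable way to find NaNs, since IEEE 754 defines NaN as unequal to
everything, itself included, and the post reports that the platform at hand
does not even follow that rule consistently. A predicate that asks "is this
element NaN?" avoids the comparison altogether. The following is only a
minimal sketch in plain, modern Python (math.isnan exists from Python 2.6
onwards, so it was not available in 2.3.2); the helper name nan_mask is
hypothetical and is not part of MA, Numeric or numarray.

```python
import math

# float("nan") is the modern spelling; the post builds the same value
# as 1e30000/1e30000 (inf divided by inf).
nan = float("nan")


def nan_mask(values):
    """Return a 0/1 list: 1 where the element is NaN, 0 elsewhere.

    nan_mask is a hypothetical helper written for this illustration.
    """
    return [1 if math.isnan(float(v)) else 0 for v in values]


data = [0, 1, nan, 3, 4]
mask = nan_mask(data)
print(mask)  # [0, 0, 1, 0, 0]
```

The resulting list has the same form as the explicit mask passed to
MA.masked_array() in the post. A per-element Python loop like this is far too
slow for very large arrays, so it only illustrates the predicate; a real
solution would need a vectorized NaN test from the array library itself.
Later NumPy releases provide one as numpy.isnan, though that postdates this
thread.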




