[Numpy-discussion] Who will use numpy.ma?

Thu Jan 12 04:44:03 EST 2006

> With the recent improvements to the array object in NumPy, 
> the MA library has fallen behind.  There are more than 50 methods 
> in the ndarray object that are not present in ma.array.

I guess maintaining MA means double work, doesn't it? So even if
MA is updated now, it is likely to always lag somewhat behind in the future.
This is perhaps an argument against using this library? I wouldn't
like to make extensive use of a library that will be phased out in a couple of
years because it turns out to be somewhat redundant and/or behind
w.r.t. the rest of numpy.

The fact that with numpy you can do things like

>>> from numpy import *
>>> x = sin(1./arange(3.))        # arbitrary example
>>> x[1] = nan                    # 'masking' an element
>>> x = where(isnan(x), 10., x)   # replacing masks by a number
>>> x = where(x >= 10, nan, x)    # clipping with nan replacement
>>> y = sin(x)                    # sin(nan) = nan

fulfils most of my needs. However, I haven't compared execution times
for large arrays. Note that by using NaNs we cannot distinguish between
NaN and NA (not available). I am not sure this won't bite us somewhere in
the future.
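
To make the NaN versus NA point concrete, here is a minimal sketch. It
assumes the current numpy.ma spellings (ma.array, ma.getmaskarray, filled);
the data values and variable names are made up for illustration. With a NaN
sentinel, a NaN produced by a computation and a value that was never
measured look the same, whereas a masked array keeps the "not available"
flag separate from the data:

```python
import numpy as np
import numpy.ma as ma

# Made-up data: the NaN at index 1 is a genuine computed result,
# while index 2 is a reading that is simply not available.
values = np.array([1.0, np.nan, 3.0, 4.0])
missing = np.array([False, False, True, False])

# NaN-as-mask: overwrite the unavailable entry with NaN.
x = values.copy()
x[missing] = np.nan
n_nan_sentinel = int(np.isnan(x).sum())      # computed NaN and NA are merged

# Masked array: the mask is stored next to the data, not inside it.
m = ma.array(values, mask=missing)
n_missing = int(ma.getmaskarray(m).sum())    # entries flagged as not available
n_nan = int(np.isnan(m.filled(0.0)).sum())   # NaNs left after filling masked slots
```

In the sentinel version both problem entries show up as NaN, so the
information about which one was "not available" is gone; in the masked
version the two counts stay separate.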

I have a related question. numpy introduces the functions nanargmax(), 
nanargmin(), nanmax(), nanmin(), and nansum(). Was there a special 
reason to introduce extra nan... functions rather than adding an option
like ignore_nan = True to the normal functions? Is this a performance 
issue? Will similar nan... equivalents be introduced for the functions
mean() and reduce()?
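
For illustration only, the kind of option I have in mind could be sketched
as a thin wrapper. The function total() and its ignore_nan keyword are
hypothetical and not part of numpy; only sum(), nansum(), nanmax() and
nanargmax() are real numpy functions here:

```python
import numpy as np

def total(a, ignore_nan=False):
    """Hypothetical wrapper, not a numpy function: choose between
    sum() and nansum() with a keyword instead of a separate name."""
    return np.nansum(a) if ignore_nan else np.sum(a)

a = np.array([1.0, np.nan, 0.47])

plain = total(a)                      # NaN propagates through the plain sum
skipped = total(a, ignore_nan=True)   # NaN is treated as absent
largest = np.nanmax(a)                # maximum over the non-NaN entries
where_largest = np.nanargmax(a)       # index of that maximum
```

Whether a keyword like this would cost anything measurable compared with
the separate nan... functions is exactly what I don't know.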

A side remark: 

>>> y = array([1, nan, 0.47])
>>> sort(y)
array([ 1.        ,         nan,  0.47      ])

No exception, no sorting. Is this a feature or a bug? :)
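
A possible explanation, sketched under the assumption that the sort relies
on ordinary comparisons: every ordering comparison involving NaN is false,
so NaN has no well-defined position. One workaround that does not depend on
how sort() treats NaN is to sort only the non-NaN entries:

```python
import numpy as np

y = np.array([1.0, np.nan, 0.47])

# NaN is neither smaller nor larger than any number.
nan_is_less = np.nan < 1.0
nan_is_greater = np.nan > 1.0

# Drop the NaN entries first, then sort what remains.
finite_sorted = np.sort(y[~np.isnan(y)])
```

This loses the positions of the NaN entries, so it only helps when those
positions don't matter.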

J.
