[Numpy-discussion] Thoughts on masked arrays
Michael Haggerty
mhagger at alum.mit.edu
Wed May 9 19:20:37 EDT 2001
Hi,
I've spent several days using the masked arrays that have been added
to NumPy recently. They're a great feature and they were just what I
needed for the little project I was working on (aside from a few bugs
that I found).
However, there were a few things about MA that I found inconvenient
and/or counterintuitive, so I thought I'd post them to the list while
they're fresh in my mind. I'm using Numeric-20.0.0b2.
1. I couldn't find a simple way to tell if all of the cells of a
masked array are unmasked. There are times when you fill an array
incrementally and you want to convert it to a Numeric array but
first make sure that all of the elements have been set.
"m.filled()" is a bit dangerous (in my opinion) because it silently
fills. The shortest idiom I could think of is
>>> assert not logical_or.reduce(ravel(MA.getmaskarray(m)))
which isn't very short :-) and is also awkward because it creates a
mask array even if m.mask() is None. How about an m.is_unmasked()
method, or even giving a special meaning to "m.filled(masked)",
namely that it raises an exception if any cells are still masked?
(As an optimization, this method could set m.__mask = None to speed
up future checks.)
2. I can't reproduce this problem now, but I could swear that the
MaskedArray.__str__() method sometimes printed "typecode='O'" if
masked.enabled() is true. This would be a byproduct of using
Numeric's __str__() method to print the array, at least under the
unknown circumstances in which Numeric.__str__() prints the
typecode. This confused me for a while.
3. I found the semantics of MA.compress(condition,a,axis=0) to be
inconvenient and inconsistent with those of Numeric.compress.
MA.compress() squeezes out not only those elements for which
condition is false, but also those elements that are masked. This
differs from the behavior of Numeric.compress, which always returns
an array with the "axis" dimension equal to the number of nonzero
elements of "condition". The real problem, though, is that
MA.compress can't be used on a multidimensional array with a
nontrivial mask, because squeezing out the masked values is highly
unlikely to result in a rectangular matrix. It is nice to be able
to squeeze masked values out of a 1-d array, but not if the price
is losing the ability to use compress on a multidimensional array. I
suggest giving MA.compress() semantics closer to those of
Numeric.compress(), and adding an optional argument or a separate
method to cause masked elements to be omitted.
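The suggestions in points 1 and 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: it is written against numpy.ma, the present-day descendant of MA, not against Numeric-20.0.0b2, so the behavior may differ from the version discussed here. The helper name require_unmasked is hypothetical and is not part of any library.

```python
import numpy as np
import numpy.ma as ma


def require_unmasked(m):
    """Return m as a plain ndarray, raising if any element is still masked.

    Hypothetical helper: roughly what a strict "m.filled(masked)" might do.
    """
    mask = ma.getmask(m)  # ma.nomask when no mask array was ever allocated
    if mask is not ma.nomask and mask.any():
        raise ValueError("array still has masked elements")
    return np.array(ma.getdata(m))  # copy of the underlying data


# Point 1: fill an array incrementally, convert only once it is complete.
m = ma.masked_all((3,), dtype=int)
m[0], m[1] = 10, 20
try:
    require_unmasked(m)
    complete = True
except ValueError:
    complete = False  # m[2] has not been set yet
m[2] = 30
filled = require_unmasked(m)

# Point 3: compress along an axis of a 2-d array with a nontrivial mask.
a = ma.array([[1, 2], [3, 4], [5, 6]],
             mask=[[False, True], [False, False], [True, False]])
rows = a.compress([True, False, True], axis=0)  # rows 0 and 2, mask carried along
unmasked_values = a.compressed()  # separate method: 1-d array of unmasked data
```

Here the compress call keeps the result rectangular because masked cells stay in place with their mask, while dropping masked elements is left to a separate method that always returns a 1-d array.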
Thanks for a great package!
Yours,
Michael
--
Michael Haggerty
mhagger at alum.mit.edu