[Numpy-discussion] Re: ndarray.fill and ma.array.filled

Tim Hochberg tim.hochberg at cox.net
Fri Apr 7 11:22:05 EDT 2006


Sasha wrote:

> I am posting a reply to my own post in the hope of generating some 
> discussion of the original proposal.
>
> I am proposing to add a "filled" method to ndarray.  This can be a 
> pass-through, an alias to "copy" or a method to replace nans or some 
> other type-specific values.  This will allow code that uses "filled" 
> to work on ndarrays without changes.

In general, I'm skeptical of adding more methods to the ndarray object 
-- there are plenty already.

In addition, it appears that both the method and function versions of 
filled are "dangerous" in the sense that they sometimes return the array 
itself and sometimes a copy.
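
To make the copy-versus-alias concern concrete, here is a small sketch. It assumes the behaviour of numpy.ma in current NumPy releases, which may differ in detail from the 2006 implementation discussed in this thread; the variable names are only for illustration.

```python
import numpy as np

plain = np.array([1.0, 2.0, 3.0])
unmasked = np.ma.array([1.0, 2.0, 3.0])                         # no mask set
masked = np.ma.array([1.0, 2.0, 3.0], mask=[False, True, False])

# Function version on a plain ndarray: the input itself comes back.
func_result = np.ma.filled(plain, 0.0)
returns_same_object = func_result is plain

# Method version: aliasing depends on whether anything is masked.
no_mask_result = unmasked.filled(0.0)
masked_result = masked.filled(0.0)
no_mask_shares = np.shares_memory(no_mask_result, unmasked.data)
masked_shares = np.shares_memory(masked_result, masked.data)
```

Writing into func_result or no_mask_result would modify the original data, while writing into masked_result would not, so the caller cannot tell which case applies without inspecting the mask.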

Finally, changing ndarray to support masked array feels a bit like the 
tail wagging the dog.

Let me throw out an alternative proposal. I will admit up front that 
this proposal is based on exactly zero experience with masked array, so 
there may be some stupidities in it, but perhaps it will lead to an 
alternative solution.

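    def asUnmaskedArray(obj, fill_value=None):
        # Objects without a mask attribute (e.g. plain ndarrays) pass through.
        mask = getattr(obj, 'mask', False)
        if mask is False:
            return obj
        if fill_value is None:
            fill_value = obj.get_fill_value()
        # Copy the underlying data, then overwrite the masked positions.
        newobj = obj.data().copy()
        newobj[mask] = fill_value
        return newobj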

Or something like that anyway. This particular version should work on 
any array, provided that any object exporting a mask attribute also 
exports get_fill_value and data. At least it should once any bugs are 
ironed out; I haven't tested it.

ma would have to be modified to use this instead of using filled 
everywhere, but that seems more appropriate than tacking on another 
method to ndarray IMO.
           
One advantage of this approach is that most array-like objects that don't 
subclass ndarray will work with this automagically. If we keep expanding 
the methods of ndarray, it gets harder and harder to implement other 
array-like objects, since they have to implement more and more methods, 
most of which are irrelevant to their particular case. The more we can implement 
stuff like this in terms of some relatively small set of core 
primitives, the happier we'll all be in the long run. This also builds 
on the idea of trying to push as much of the array/view ambiguity into 
the asXXXArray corner.

Regards,

-tim


>
>
> On 3/22/06, Sasha <ndarray at mac.com> wrote:
>
>     In an ideal world, any function that accepts ndarray would accept
>     ma.array and vice versa.  Moreover, if the ma.array has no masked
>     elements and the same data as ndarray, the result should be the same.
>     Obviously current implementation falls short of this goal, but there
>     is one feature that seems to make this goal unachievable.
>
>     This feature is the "filled" method of ma.array.  Pydoc for this
>     method reports the following:
>
>     |  filled(self, fill_value=None)
>     |      A numeric array with masked values filled. If fill_value is None,
>     |                 use self.fill_value().
>     |
>     |                 If mask is nomask, copy data only if not contiguous.
>     |                 Result is always a contiguous, numeric array.
>     |      # Is contiguous really necessary now?
>
>
>     That is not the best possible description ("filled" is "filled"), but
>     the essence is that the result of a.filled(value) is a contiguous
>     ndarray obtained from the masked array by copying non-masked elements
>     and using value for masked values.
>
>     I would like to propose to add a "filled" method to ndarray.  I see
>     several possibilities and would like  to hear your opinion:
>
>     1. Make filled simply return self.
>
>     2. Make filled return a contiguous copy.
>
>     3. Make filled replace nans with the fill_value if array is of
>     floating point type.
>
>
>     Unfortunately, adding "filled" will result in a rather confusing
>     situation where "fill" and "filled" both exist and have very different
>     meanings.
>
>     I would like to note that "fill" is a somewhat odd ndarray method.
>     AFAICT, it is the only non-special method that mutates the array.  It
>     appears to be just a performance trick: the same result can be
>     achieved with "a[...] = ".
>
>





