[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Matthew Brett matthew.brett at gmail.com
Sat Jun 25 10:52:00 EDT 2011


Hi,

On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> > On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM <pgmdevlist at gmail.com> wrote:
>> >>
>> >> This thread is getting quite long, innit ?
>> >> And I think it's getting a tad confusing, because we're mixing two
>> >> different concepts: missing values and masks.
>> >> There should be support for missing values in numpy.core, I think we
>> >> all agree on that.
>> >> * The suggestion of adding new dtypes (nafloat, naint) is great, but
>> >> why not make it the default, then ?
>> >>
>> >> * Operations involving a NA (whatever the NA actually is, depending on
>> >> the dtype of the input) should result in a NA (whatever NA is defined
>> >> by the output's dtype). That could be done by overloading the existing
>> >> ufuncs to support the new dtypes.
>> >> * There should be some simple methods to retrieve the location of those
>> >> NAs in an array. Whether we just output the indices or a full boolean
>> >> array (w/ True for a NA, False for a non-NA or vice-versa) needs to be
>> >> decided.
>> >> * We can always re-implement masked arrays to use these NAs in a way
>> >> which would be consistent with numpy.ma (so as not to confuse existing
>> >> users of numpy.ma): a mask would be a boolean array with the same shape
>> >> as the underlying ndarray, with True for NA.
>> >> Mark, I'd suggest you modify your proposal, making it clearer that it's
>> >> not to add all of numpy.ma's functionality to the core, but just to
>> >> support these missing values. Using the term 'mask' should be avoided
>> >> as much as possible, use 'missing data' or whatever.
>> >
>> > I think he aims to support both.
>>
>> I don't think Mark is proposing to support both.  He's proposing to
>> implement only array.mask.
>>
>
> I think you are confusing function with implementation. If you look at the
> current NEP, it does NA but does so by using masks behind the scenes in a
> transparent manner.

Yes, there is some confusion; just to be clear, I'm pointing out that
Mark is not proposing to implement na-dtypes, and is proposing to
implement array.mask.
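
To make the distinction concrete, here is a minimal sketch using only
what NumPy already provides: NaN standing in for an NA value stored in
the data itself, next to numpy.ma with its separate boolean mask. This
is not the NEP's proposed API, and the variable names are illustrative
only.

```python
import numpy as np

# Sentinel-style missing value: NaN stands in for NA inside the data itself.
a = np.array([1.0, np.nan, 3.0])
na_sum = a + 1.0                  # NaN propagates through the ufunc
na_where = np.isnan(a)            # boolean array, True where the value is missing

# Mask-style missing value: numpy.ma keeps the data plus a separate boolean mask.
m = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])
ma_sum = m + 1.0                  # the masked element stays masked in the result
ma_where = np.ma.getmaskarray(m)  # True for masked, numpy.ma's convention
hidden = m.data[1]                # the underlying value is still stored
```

In the first case the original value is gone once it is marked missing;
in the second it is still there underneath the mask.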

See you,

Matthew


