[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

Robert Kern robert.kern at gmail.com
Fri Jun 24 10:44:00 EDT 2011


On Fri, Jun 24, 2011 at 09:35, Robert Kern <robert.kern at gmail.com> wrote:
> On Fri, Jun 24, 2011 at 09:24, Keith Goodman <kwgoodman at gmail.com> wrote:
>> On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern <robert.kern at gmail.com> wrote:
>>
>>> The alternative proposal would be to add a few new dtypes that are
>>> NA-aware. E.g. an nafloat64 would reserve a particular NaN value
>>> (there are lots of different NaN bit patterns, we'd just reserve one)
>>> that would represent NA. An naint32 would probably reserve the most
>>> negative int32 value (like R does). Using the NA-aware dtypes signals
>>> that you are using NA values; there is no need for an additional flag.
>>
>> I don't understand the numpy design and maintainability issues, but from
>> a user perspective (mine) nafloat64, etc. sounds nice.
>
> It's worth noting that this is not a replacement for masked arrays,
> nor is it intended to be the be-all, end-all solution to missing data
> problems. It's mostly just intended to be a focused tool to fill in
> the gaps where masked arrays are less convenient for whatever reason;
> e.g. where you're tempted to (ab)use NaNs for the purpose and the
> limitations on the range of values are acceptable. Not every dtype
> would have an NA-aware counterpart. I would suggest just nabool,
> nafloat64, naint32, nastring (a little tricky due to the flexible
> size, but doable), and naobject. Maybe a couple more, if we get
> requests, like naint64 and nacomplex128.

Oh, and nadatetime64 and natimedelta64.
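
To make the reserved-value idea concrete, here is a rough sketch in plain
NumPy of how an NA sentinel could be told apart from an ordinary NaN. None
of this is an existing or proposed API: the names NA_BITS, NA_INT32,
is_na_float64 and is_na_int32 are made up for illustration, and the
particular NaN payload is an arbitrary choice.

```python
import numpy as np

# Hypothetical reserved bit pattern for "NA": a quiet NaN with an
# arbitrary payload (an assumption, not a value NumPy defines).
NA_BITS = np.uint64(0x7FF80000000007A2)

# Hypothetical NA for int32: the most negative value, as R does.
NA_INT32 = np.iinfo(np.int32).min


def is_na_float64(a):
    """True where an element carries exactly the reserved NA bit pattern."""
    a = np.ascontiguousarray(a, dtype=np.float64)
    # Compare raw bits; a floating-point comparison cannot tell NaNs apart.
    return a.view(np.uint64) == NA_BITS


def is_na_int32(a):
    """True where an element equals the reserved int32 sentinel."""
    return np.asarray(a, dtype=np.int32) == NA_INT32


x = np.array([1.0, np.nan, 0.0, 4.0])
x.view(np.uint64)[2] = NA_BITS   # store the NA pattern bit-for-bit

na_mask = is_na_float64(x)       # only the reserved pattern counts as NA
nan_mask = np.isnan(x)           # NA is still a NaN as far as isnan() goes

i = np.array([7, NA_INT32, -1], dtype=np.int32)
int_na_mask = is_na_int32(i)
```

The float sentinel costs only one of the many NaN bit patterns, while the
int32 sentinel gives up one representable value. Whether the payload
survives arithmetic depends on the hardware, so a real dtype would have to
handle propagation itself rather than rely on it.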

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


