[Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array

Mon Apr 17 23:04:04 EDT 2006

Michael Sorich wrote:
> On 4/8/06, *Sasha* <ndarray at mac.com> wrote:
>
>     ...
>
>     See above. For ndarray mask is always False unless an add-on module is
>     loaded that redefines arithmetic to recognize special bit-patterns
>     such as NaN or INT_MIN.
>
>
> Is it possible to implement masked values using these special bit 
> patterns in the ndarray instead of using a separate MA class? If so 
> has there been any thought as to whether this may be the better 
> option. I think it would be preferable if the ability to handle masked 
> data was available in the standard array class (ndarray), as this 
> would increase the likelihood that functions built for numeric arrays 
> will handle masked values well. It seems that ndarray already has 
> decent support for nans (isnan() returns the equivalent of a boolean 
> mask array), indicating that such an approach may be acceptable. How 
> difficult is it to generalise the concept to other data types (int, 
> string, bool)?
>
I don't think the approach can be generalized at all.   It would only 
work with floating-point values and therefore is not particularly exciting.
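
A minimal sketch of that limitation, assuming a present-day NumPy install.
The name INT_MISSING is a hypothetical sentinel chosen for illustration, not
part of any NumPy API. NaN propagates through floating-point arithmetic and
can be located with isnan(), whereas an INT_MIN sentinel in an integer array
is treated by ordinary arithmetic as just another number:

```python
import numpy as np

# Floating point: NaN is a real special value that arithmetic propagates.
floats = np.array([1.0, np.nan, 3.0])
float_mask = np.isnan(floats)           # boolean "mask" of missing entries
float_sum = floats.sum()                # NaN poisons the result
clean_sum = floats[~float_mask].sum()   # sum over the valid entries only

# Integers: there is no NaN, so a sentinel has to be picked by convention.
# INT_MISSING is a hypothetical choice made for this sketch.
INT_MISSING = np.iinfo(np.int32).min
ints = np.array([1, INT_MISSING, 3], dtype=np.int32)
int_mask = ints == INT_MISSING          # the mask must be computed by hand
naive_sum = int(ints.sum())             # sentinel is silently added in
```

Unless arithmetic is redefined to recognize the sentinel, as in the add-on
module Sasha describes, every integer operation has to apply int_mask
explicitly.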

I think masked arrays should ultimately become a C-based sub-class of
ndarray.  For now, the Python-based class is a good environment for
working out how to preserve masked arrays through other functions, if
that is possible.

It seems that masked arrays must do things quite differently from other 
arrays in certain applications, and I'm not altogether clear on how to 
support them in all the NumPy code.  Because masked arrays are not used 
by everybody who uses NumPy arrays, they should be a separate sub-class. 
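
A short sketch of the difficulty, written against the numpy.ma module as it
exists in current NumPy releases (not the 2006 Python-based class discussed
here). The array contents are made up for illustration. A masked array
carries its mask through its own methods, but a function that converts its
argument to a plain ndarray drops the mask:

```python
import numpy as np
import numpy.ma as ma

# -999 stands in for a bad reading; the mask marks it as invalid.
data = np.array([1, -999, 3], dtype=np.int32)
m = ma.masked_array(data, mask=[False, True, False])

total = m.sum()                     # masked entry is skipped
is_sub = isinstance(m, np.ndarray)  # MaskedArray sub-classes ndarray

# Code that calls np.asarray() on its input gets the raw data back,
# so the masked value re-enters the computation.
plain = np.asarray(m)
plain_total = int(plain.sum())
```

Here total counts only the two valid entries, while plain_total includes
the -999 that the mask was meant to hide.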

Ultimately, I hope we will get the basic array object into Python (what 
Tim was calling the super array) before Python 2.6.

-Travis