[Numpy-discussion] NA masks in the next numpy release?

Stéfan van der Walt stefan at sun.ac.za
Fri Oct 28 16:13:12 EDT 2011


On Fri, Oct 28, 2011 at 12:47 PM, Benjamin Root <ben.root at ou.edu> wrote:
>
> 2011/10/28 Stéfan van der Walt <stefan at sun.ac.za>
>> The
>> implementation as it stands essentially gives us a faster and more
>> integrated version of numpy.ma; but it has become clear from this
>> conversation that such an approach overlooks a very common subset of
>> masked-related problems.
>>
> Which are...? (given the history of this discussion, let's not assume
> anything is clear).

The case where the total number of elements in the array vastly
outnumbers the masked elements.  (Images, 3D volumes, large
time series, tables, etc.)

E.g., if you are taking measurements from a sensor, but once in a blue
moon the sensor messes up, you simply want to mark those few values as
missing, but you do not want to allocate a whole new chunk of memory
(a mask as large as the data itself) to do so.
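
To put rough numbers on that, here is a minimal sketch comparing a
full boolean mask, as numpy.ma allocates, with storing only the
positions of the bad samples.  The array size and the glitch
positions in `bad` are made up for illustration; the index array is
just one possible compact representation, not something the current
implementation offers.

```python
import numpy as np

n = 1_000_000
data = np.random.default_rng(0).normal(size=n)

# Hypothetical positions where the sensor glitched.
bad = np.array([10, 5000, 999_999], dtype=np.int64)

# numpy.ma approach: one boolean per element, however few are masked.
mask = np.zeros(n, dtype=bool)
mask[bad] = True
masked = np.ma.masked_array(data, mask=mask)
dense_mask_bytes = masked.mask.nbytes   # one byte per element

# Compact alternative: keep only the indices of the missing values.
sparse_mask_bytes = bad.nbytes          # eight bytes per masked element

print(dense_mask_bytes, sparse_mask_bytes)
```

With three bad samples in a million, the full mask costs about a
megabyte while the index array costs a few dozen bytes.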

I had a chat with JB Poline this morning, who mentioned that sparse
matrix storage of the mask may also be an option.  Those containers
typically trade off insertion vs. lookup speeds, so I'm not sure
whether it'd be feasible, but I like the idea.
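
As a rough illustration of that trade-off, here is a sketch of one
possible sparse layout: a sorted array of masked flat indices.  The
class `SortedIndexMask` and its methods are hypothetical names, not
part of NumPy.  Lookup is a binary search, while each insertion has to
copy the index array, so this layout favours lookups over updates.

```python
import numpy as np

class SortedIndexMask:
    """Hypothetical sparse mask: a sorted array of masked flat indices."""

    def __init__(self):
        self._idx = np.empty(0, dtype=np.int64)

    def mark(self, i):
        # Find the insertion point by binary search, then copy the
        # array with the new index added (cost grows with mask size).
        pos = np.searchsorted(self._idx, i)
        if pos == len(self._idx) or self._idx[pos] != i:
            self._idx = np.insert(self._idx, pos, i)

    def is_masked(self, i):
        # Binary search: logarithmic in the number of masked elements.
        pos = np.searchsorted(self._idx, i)
        return bool(pos < len(self._idx) and self._idx[pos] == i)

    def to_dense(self, n):
        # Expand to a full boolean mask only when one is really needed.
        dense = np.zeros(n, dtype=bool)
        dense[self._idx] = True
        return dense

m = SortedIndexMask()
for i in (5, 2, 5):
    m.mark(i)
print(m.is_masked(2), m.is_masked(3))
```

A hash-based container would flip the balance (cheap insertion, no
ordering), which is the kind of choice that would need benchmarking
before deciding whether any of this is feasible.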

Stéfan


