[Numpy-discussion] NA masks in the next numpy release?

Fri Oct 28 16:27:42 EDT 2011

2011/10/28 Stéfan van der Walt <stefan at sun.ac.za>

> On Fri, Oct 28, 2011 at 12:47 PM, Benjamin Root <ben.root at ou.edu> wrote:
> >
> > 2011/10/28 Stéfan van der Walt <stefan at sun.ac.za>
> >> The
> >> implementation as it stands essentially gives us a faster and more
> >> integrated version of numpy.ma; but it has become clear from this
> >> conversation that such an approach overlooks a very common subset of
> >> mask-related problems.
> >>
> > Which are...? (given the history of this discussion, let's not assume
> > anything is clear).
>
> The case where the number of elements in the array vastly outnumbers
> the number of masked elements.  (Images, 3D volumes, large
> time-series, tables, etc.)
>
> E.g., if you are taking measurements from a sensor, but once in a blue
> moon the sensor messes up, you simply want to mark those values as
> missing, but you do not want to allocate a whole new chunk of memory
> to do so.
>
> I had a chat with JB Poline this morning, who mentioned that sparse
> matrix storage of the mask may also be an option.  Those containers
> typically trade off insertion vs. lookup speeds, so I'm not sure
> whether it'd be feasible, but I like the idea.
>
>
I think simple run-length encoding might work well with masks.
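
A minimal sketch of what that could look like, assuming a 1-D boolean
mask in which True marks a missing value. The names rle_encode,
rle_decode and rle_is_masked are hypothetical helpers for illustration
only; they are not part of NumPy or of the NA-mask implementation under
discussion.

```python
import numpy as np

def rle_encode(mask):
    """Return (starts, lengths) of the True runs in a 1-D boolean mask."""
    mask = np.asarray(mask, dtype=bool).ravel()
    # Pad with False on both ends so every run has a rising and a falling edge.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, stops = edges[::2], edges[1::2]
    return starts, stops - starts

def rle_decode(starts, lengths, size):
    """Rebuild the dense boolean mask from its run-length encoding."""
    mask = np.zeros(size, dtype=bool)
    for start, length in zip(starts, lengths):
        mask[start:start + length] = True
    return mask

def rle_is_masked(starts, lengths, index):
    """Look up one element via binary search over the sorted run starts."""
    i = np.searchsorted(starts, index, side="right") - 1
    return bool(i >= 0 and index < starts[i] + lengths[i])

# A long series with two short bursts of bad sensor readings.
dense = np.zeros(1_000_000, dtype=bool)
dense[1000:1003] = True
dense[500_000] = True
starts, lengths = rle_encode(dense)
```

Here the encoded mask stores two (start, length) pairs instead of a
million booleans. Lookup costs a binary search over the runs, and
masking a new element may require inserting or merging a run, which is
the same insertion vs. lookup trade-off mentioned above for sparse
containers.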

Chuck