[Numpy-discussion] consensus (was: NA masks in the next numpy release?)

Ralf Gommers ralf.gommers at googlemail.com
Sat Oct 29 17:48:03 EDT 2011


On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett <matthew.brett at gmail.com> wrote:

> Hi,
>
> On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
> > Hi,
> >
> > On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
> > <ralf.gommers at googlemail.com> wrote:
> >>
> >>
> >> On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett <matthew.brett at gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
> >>> <ralf.gommers at googlemail.com> wrote:
> >>> >
> >>> >
> >>> > On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
> >>> > <matthew.brett at gmail.com> wrote:
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
> >>> >> <ralf.gommers at googlemail.com> wrote:
> >>> >> >
> >>> >> >
> >>> >> > On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
> >>> >> > <matthew.brett at gmail.com>
> >>> >> > wrote:
> >>> >> >>
> >>> >> >> Hi,
> >>> >> >>
> >>> >> >> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
> >>> >> >> <charlesr.harris at gmail.com> wrote:
> >>> >> >> >>
> >>> >> >>
> >>> >> >> No, that's not what Nathaniel and I are saying at all. Nathaniel was
> >>> >> >> pointing to links for projects that care that everyone agrees before
> >>> >> >> they go ahead.
> >>> >> >
> >>> >> > It looked to me like there was a serious intent to come to an
> >>> >> > agreement, or at least closer together. The discussion in the summer
> >>> >> > was going around in circles though, and was too abstract and complex
> >>> >> > to follow. Therefore Mark's choice of implementing something and then
> >>> >> > asking for feedback made sense to me.
> >>> >>
> >>> >> I should point out that the implementation hasn't - as far as I can
> >>> >> see - changed the discussion.  The discussion was about the API.
> >>> >>
> >>> >> Implementations are useful for agreed APIs because they can point out
> >>> >> where the API does not make sense or cannot be implemented.  In this
> >>> >> case, the API Mark said he was going to implement - he did implement -
> >>> >> at least as far as I can see.  Again, I'm happy to be corrected.
> >>> >
> >>> > Implementations can also help the discussion along, by allowing people
> >>> > to try out some of the proposed changes. They also allow one to
> >>> > construct examples that show weaknesses, possibly to be solved by an
> >>> > alternative API. Maybe you can hold the complete history of this topic
> >>> > in your head and comprehend it, but for me it would be very helpful if
> >>> > someone said:
> >>> > - here's my dataset
> >>> > - this is what I want to do with it
> >>> > - this is the best I can do with the current implementation
> >>> > - here's how API X would allow me to solve this better or simpler
> >>> > This can be done much better with actual data and an actual
> >>> > implementation than with a design proposal. You seem to disagree with
> >>> > this statement. That's fine. I would hope though that you recognize
> >>> > that concrete examples help people like me, and construct one or two
> >>> > to help us out.
> >>> That's what use-cases are for in designing APIs.  There are examples
> >>> of use in the NEP:
> >>>
> >>> https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
> >>>
> >>> the alterNEP:
> >>>
> >>> https://gist.github.com/1056379
> >>>
> >>> and my longer email to Travis:
> >>>
> >>> http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
> >>>
> >>> Mark has done a nice job of documentation:
> >>>
> >>> http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
> >>>
> >>> If you want to understand what the alterNEP case is, I'd suggest the
> >>> email, just because it's the most recent and I think the terminology
> >>> is slightly clearer.
> >>>
> >>> Doing the same examples on a larger array won't make the point easier
> >>> to understand.  The discussion is about what the right concepts are,
> >>> and you can help by looking at the snippets of code in those
> >>> documents, and deciding for yourself whether you think the current
> >>> masking / NA implementation seems natural and easy to explain, or
> >>> rather forced and difficult to explain, and then email back trying to
> >>> explain your impression (which is not always easy).
> >>
> >> If you seriously believe that looking at a few snippets is as helpful and
> >> instructive as being able to play around with them in IPython and modify
> >> them, then I guess we won't make progress in this part of the discussion.
> >> You're just telling me to go back and re-read things I'd already read.
> >
> > The snippets are in ipython or doctest format - aren't they?
>
> Oops - 10 minute rule.  Now I see that you mean that you can't
> experiment with the alternative implementation without working code.
>

Indeed.


> That's true, but I am hoping that the difference between - say:
>
> a[0:2] = np.NA
>
> and
>
> a.mask[0:2] = False
>
> would be easy enough to imagine.


It is in this case. I agree the explicit ``a.mask`` is clearer. This is
quite a specific point that could be improved in the current implementation.
It doesn't require ripping everything out.
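
For anyone who wants to try the two spellings quoted above side by side, here
is a minimal sketch. It assumes the long-standing ``numpy.ma`` module as a
stand-in, not the NA-mask implementation under discussion: ``np.ma.masked``
plays the role of ``np.NA``, and ``numpy.ma`` uses ``True`` to mean "masked",
which is the opposite sense of the ``a.mask[0:2] = False`` line in the quoted
snippet.

```python
import numpy as np

# Style 1: mark elements as missing by assigning a sentinel through the
# array itself. np.ma.masked is a stand-in (assumption) for np.NA.
a = np.ma.array([1.0, 2.0, 3.0, 4.0])
a[0:2] = np.ma.masked

# Style 2: mark elements as missing by writing to the mask explicitly.
# In numpy.ma, True means "masked"; the quoted snippet uses the opposite
# convention (False means missing).
b = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False] * 4)
b.mask[0:2] = True

# Both spellings leave the arrays in the same state, and reductions
# skip the masked elements.
mask_a = a.mask.tolist()
mask_b = b.mask.tolist()
total_a = float(a.sum())
total_b = float(b.sum())
print(mask_a, mask_b, total_a, total_b)
```

The end state is the same either way; the difference is only in whether the
mask is visible at the point of assignment.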

Ralf