[Numpy-discussion] consensus (was: NA masks in the next numpy release?)

Sat Oct 29 22:52:39 EDT 2011

Hi,

On Sat, Oct 29, 2011 at 7:48 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Sat, Oct 29, 2011 at 7:47 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Sat, Oct 29, 2011 at 4:11 PM, Matthew Brett <matthew.brett at gmail.com>
>> wrote:
>> > Hi,
>> >
>> > On Sat, Oct 29, 2011 at 2:59 PM, Charles R Harris
>> > <charlesr.harris at gmail.com> wrote:
>> >>
>> >>
>> >> On Sat, Oct 29, 2011 at 3:55 PM, Matthew Brett
>> >> <matthew.brett at gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> On Sat, Oct 29, 2011 at 2:48 PM, Ralf Gommers
>> >>> <ralf.gommers at googlemail.com> wrote:
>> >>> >
>> >>> >
>> >>> > On Sat, Oct 29, 2011 at 11:36 PM, Matthew Brett
>> >>> > <matthew.brett at gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> Hi,
>> >>> >>
>> >>> >> On Sat, Oct 29, 2011 at 1:48 PM, Matthew Brett
>> >>> >> <matthew.brett at gmail.com>
>> >>> >> wrote:
>> >>> >> > Hi,
>> >>> >> >
>> >>> >> > On Sat, Oct 29, 2011 at 1:44 PM, Ralf Gommers
>> >>> >> > <ralf.gommers at googlemail.com> wrote:
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Sat, Oct 29, 2011 at 9:04 PM, Matthew Brett
>> >>> >> >> <matthew.brett at gmail.com>
>> >>> >> >> wrote:
>> >>> >> >>>
>> >>> >> >>> Hi,
>> >>> >> >>>
>> >>> >> >>> On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
>> >>> >> >>> <ralf.gommers at googlemail.com> wrote:
>> >>> >> >>> >
>> >>> >> >>> >
>> >>> >> >>> > On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett
>> >>> >> >>> > <matthew.brett at gmail.com>
>> >>> >> >>> > wrote:
>> >>> >> >>> >>
>> >>> >> >>> >> Hi,
>> >>> >> >>> >>
>> >>> >> >>> >> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
>> >>> >> >>> >> <ralf.gommers at googlemail.com> wrote:
>> >>> >> >>> >> >
>> >>> >> >>> >> >
>> >>> >> >>> >> > On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
>> >>> >> >>> >> > <matthew.brett at gmail.com>
>> >>> >> >>> >> > wrote:
>> >>> >> >>> >> >>
>> >>> >> >>> >> >> Hi,
>> >>> >> >>> >> >>
>> >>> >> >>> >> >> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
>> >>> >> >>> >> >> <charlesr.harris at gmail.com> wrote:
>> >>> >> >>> >> >> >>
>> >>> >> >>> >> >>
>> >>> >> >>> >> >> No, that's not what Nathaniel and I are saying at all.
>> >>> >> >>> >> >> Nathaniel was pointing to links for projects that care that
>> >>> >> >>> >> >> everyone agrees before they go ahead.
>> >>> >> >>> >> >
>> >>> >> >>> >> > It looked to me like there was a serious intent to come to
>> >>> >> >>> >> > an agreement, or at least closer together. The discussion
>> >>> >> >>> >> > in the summer was going around in circles though, and was
>> >>> >> >>> >> > too abstract and complex to follow. Therefore Mark's choice
>> >>> >> >>> >> > of implementing something and then asking for feedback made
>> >>> >> >>> >> > sense to me.
>> >>> >> >>> >>
>> >>> >> >>> >> I should point out that the implementation hasn't - as far
>> >>> >> >>> >> as I can see - changed the discussion.  The discussion was
>> >>> >> >>> >> about the API.
>> >>> >> >>> >>
>> >>> >> >>> >> Implementations are useful for agreed APIs because they can
>> >>> >> >>> >> point out where the API does not make sense or cannot be
>> >>> >> >>> >> implemented.  In this case, the API Mark said he was going to
>> >>> >> >>> >> implement - he did implement - at least as far as I can see.
>> >>> >> >>> >> Again, I'm happy to be corrected.
>> >>> >> >>> >
>> >>> >> >>> > Implementations can also help the discussion along, by
>> >>> >> >>> > allowing people to try out some of the proposed changes. They
>> >>> >> >>> > also allow one to construct examples that show weaknesses,
>> >>> >> >>> > possibly to be solved by an alternative API. Maybe you can
>> >>> >> >>> > hold the complete history of this topic in your head and
>> >>> >> >>> > comprehend it, but for me it would be very helpful if someone
>> >>> >> >>> > said:
>> >>> >> >>> > - here's my dataset
>> >>> >> >>> > - this is what I want to do with it
>> >>> >> >>> > - this is the best I can do with the current implementation
>> >>> >> >>> > - here's how API X would allow me to solve this better or
>> >>> >> >>> >   simpler
>> >>> >> >>> > This can be done much better with actual data and an actual
>> >>> >> >>> > implementation than with a design proposal. You seem to
>> >>> >> >>> > disagree with this statement. That's fine. I would hope though
>> >>> >> >>> > that you recognize that concrete examples help people like me,
>> >>> >> >>> > and construct one or two to help us out.
>> >>> >> >>> That's what use-cases are for in designing APIs.  There are
>> >>> >> >>> examples of use in the NEP:
>> >>> >> >>>
>> >>> >> >>> https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst
>> >>> >> >>>
>> >>> >> >>> the alterNEP:
>> >>> >> >>>
>> >>> >> >>> https://gist.github.com/1056379
>> >>> >> >>>
>> >>> >> >>> and my longer email to Travis:
>> >>> >> >>>
>> >>> >> >>> http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored
>> >>> >> >>>
>> >>> >> >>> Mark has done a nice job of documentation:
>> >>> >> >>>
>> >>> >> >>> http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html
>> >>> >> >>>
>> >>> >> >>> If you want to understand what the alterNEP case is, I'd
>> >>> >> >>> suggest the email, just because it's the most recent and I
>> >>> >> >>> think the terminology is slightly clearer.
>> >>> >> >>>
>> >>> >> >>> Doing the same examples on a larger array won't make the point
>> >>> >> >>> easier to understand.  The discussion is about what the right
>> >>> >> >>> concepts are, and you can help by looking at the snippets of
>> >>> >> >>> code in those documents, and deciding for yourself whether you
>> >>> >> >>> think the current masking / NA implementation seems natural and
>> >>> >> >>> easy to explain, or rather forced and difficult to explain, and
>> >>> >> >>> then email back trying to explain your impression (which is not
>> >>> >> >>> always easy).
>> >>> >> >>
>> >>> >> >> If you seriously believe that looking at a few snippets is as
>> >>> >> >> helpful and instructive as being able to play around with them
>> >>> >> >> in IPython and modify them, then I guess we won't make progress
>> >>> >> >> in this part of the discussion. You're just telling me to go
>> >>> >> >> back and re-read things I'd already read.
>> >>> >> >
>> >>> >> > The snippets are in ipython or doctest format - aren't they?
>> >>> >>
>> >>> >> Oops - 10 minute rule.  Now I see that you mean that you can't
>> >>> >> experiment with the alternative implementation without working
>> >>> >> code.
>> >>> >
>> >>> > Indeed.
>> >>> >
>> >>> >>
>> >>> >> That's true, but I am hoping that the difference between - say:
>> >>> >>
>> >>> >> a[0:2] = np.NA
>> >>> >>
>> >>> >> and
>> >>> >>
>> >>> >> a.mask[0:2] = False
>> >>> >>
>> >>> >> would be easy enough to imagine.
>> >>> >
>> >>> > It is in this case. I agree the explicit ``a.mask`` is clearer. This
>> >>> > is a quite specific point that could be improved in the current
>> >>> > implementation.
>> >>>
>> >>> Thanks - this is helpful.
>> >>>
>> >>> > It doesn't require ripping everything out.
>> >>>
>> >>> Nathaniel wasn't proposing 'ripping everything out' - but backing off
>> >>> until consensus has been reached.  That's different.    If you think
>> >>> we should not do that, and you are interested, please say why.
>> >>> Second - I was proposing that we do indeed keep the code in the
>> >>> codebase but discuss adaptations that could achieve consensus.
>> >>>
>> >>
>> >> I'm much opposed to ripping the current code out.
>> >
>> > You are repeating the loaded phrase 'ripping the current code out' and
>> > thus making the discussion less sensible and more hostile.
>> >
>> >>  It isn't like it is (known
>> >> to be) buggy, nor has anyone made the case that it isn't a basis on
>> >> which to build other options. It also smacks of gratuitous violence
>> >> committed by someone yet to make a positive contribution to the project.
>> >
>> > This is cheap, rude, and silly.  All I can see from Nathaniel is a
>> > reasonable, fair attempt to discuss the code.  He proposed backing off
>> > the code in good faith.   You are emphatically, and, in my view
>> > childishly, ignoring the substantial points he is making, and
>> > asserting over and over that he deserves no hearing because he has not
>> > contributed code.   This is a terribly destructive way to work.  If I
>> > were a new developer reading this, I would conclude that I had better
>> > be damn careful which side I'm on before I express my opinion,
>> > otherwise I'm going to be made to feel like I don't exist by the other
>> > people on the project.  That is miserable, it is silly, and it's the
>> > wrong way to do business.
>>
>> I conclude that it's bad to drink this much coffee in an afternoon,
>> and that the next time I visit my friend's house, I'll take some
>> decaf.
>>
>> Sorry Chuck - you're right - this was too personal.   I do disagree
>> with you, but I was rude here and I am sorry.  I owe you an expensive
>> drink, as per Ben's excellent suggestion.
>>
>
> Apology accepted.

Thank you, that is gracious of you.

> Let me add an argument for not pulling out the current implementation,
> which is the underlying reason for the "release early, release often" open
> software mantra: if the NA work is off in a branch, no one will use it and
> we will lack useful feedback. Now, I don't have a problem with
> adding a comment to the release notes stating that the API is not completely
> settled and can change due to user feedback. But we do need users, and they
> need to work with it for at least a few weeks. My own initial reaction to
> new software often evolves as: "WTF", followed by hours -- days -- weeks --
> while I wander around muttering "morons, idiots" to myself. That is not the
> best period of time for me to make a balanced assessment, that needs to wait
> until I settle down. Then I adapt and usually things no longer look so bad,
> maybe they even look good, maybe even great. So it goes.

Yes, that's very reasonable.  It may be that we don't have much hope
of resolving the current discussion in the near future, in which case
it would not make much sense to pull the implementation out pending
agreement.

Best (honestly),

Matthew
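
For anyone who would rather run the two spellings quoted above
(`a[0:2] = np.NA` versus `a.mask[0:2] = False`) than imagine them, here
is a rough sketch. It does not use the NA-mask code under discussion.
As an assumption, it uses the existing numpy.ma module as a stand-in:
`np.ma.masked` plays the role of the NA value, and in numpy.ma a mask
value of True means "hidden", which appears to be the opposite sense
from the `False` in the quoted snippet. Treat it as an illustration of
the two styles only, not of either proposed API.

```python
import numpy as np

# Style 1: assign a special "missing" value to the elements themselves.
# np.ma.masked is numpy.ma's sentinel; it stands in here for np.NA.
a = np.ma.array([1.0, 2.0, 3.0, 4.0])
a[0:2] = np.ma.masked

# Style 2: leave the data alone and flip entries in an explicit mask.
# numpy.ma convention (assumed here): True in the mask hides the element.
b = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, False, False, False])
b.mask[0:2] = True

# Both arrays now skip the first two elements in reductions.
print(a.sum(), b.sum())

# In style 2 the underlying values are still stored and can be exposed
# again by clearing the mask entries.
b.mask[0:2] = False
print(b.sum())
```

With style 2 the stored values survive being hidden, which is the kind
of difference the explicit-mask spelling makes easier to see.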