[Numpy-discussion] consensus (was: NA masks in the next numpy release?)

Matthew Brett matthew.brett at gmail.com
Sat Oct 29 15:04:29 EDT 2011


Hi,

On Sat, Oct 29, 2011 at 3:26 AM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
>
>
> On Sat, Oct 29, 2011 at 1:37 AM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> Hi,
>>
>> On Fri, Oct 28, 2011 at 4:21 PM, Ralf Gommers
>> <ralf.gommers at googlemail.com> wrote:
>> >
>> >
>> > On Sat, Oct 29, 2011 at 12:37 AM, Matthew Brett
>> > <matthew.brett at gmail.com>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Fri, Oct 28, 2011 at 3:14 PM, Charles R Harris
>> >> <charlesr.harris at gmail.com> wrote:
>> >> >>
>> >>
>> >> No, that's not what Nathaniel and I are saying at all. Nathaniel was
>> >> pointing to links for projects that care that everyone agrees before
>> >> they go ahead.
>> >
>> > It looked to me like there was a serious intent to come to an
>> > agreement, or at least closer together. The discussion in the summer
>> > was going around in circles though, and was too abstract and complex
>> > to follow. Therefore Mark's choice of implementing something and then
>> > asking for feedback made sense to me.
>>
>> I should point out that the implementation hasn't - as far as I can
>> see - changed the discussion.  The discussion was about the API.
>>
>> Implementations are useful for agreed APIs because they can point out
>> where the API does not make sense or cannot be implemented.  In this
>> case, the API Mark said he was going to implement - he did implement -
>> at least as far as I can see.  Again, I'm happy to be corrected.
>
> Implementations can also help the discussion along, by allowing people to
> try out some of the proposed changes. It also allows one to construct examples
> that show weaknesses, possibly to be solved by an alternative API. Maybe you
> can hold the complete history of this topic in your head and comprehend it,
> but for me it would be very helpful if someone said:
> - here's my dataset
> - this is what I want to do with it
> - this is the best I can do with the current implementation
> - here's how API X would allow me to solve this better or simpler
> This can be done much better with actual data and an actual implementation
> than with a design proposal. You seem to disagree with this statement.
> That's fine. I would hope though that you recognize that concrete examples
> help people like me, and construct one or two to help us out.

That's what use-cases are for in designing APIs.  There are examples
of use in the NEP:

https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst

the alterNEP:

https://gist.github.com/1056379

and my longer email to Travis:

http://article.gmane.org/gmane.comp.python.numeric.general/46544/match=ignored

Mark has done a nice job of documentation:

http://docs.scipy.org/doc/numpy/reference/arrays.maskna.html

If you want to understand what the alterNEP case is, I'd suggest the
email, just because it's the most recent and I think the terminology
is slightly clearer.

Doing the same examples on a larger array won't make the point easier
to understand.  The discussion is about what the right concepts are,
and you can help by looking at the snippets of code in those
documents, and deciding for yourself whether you think the current
masking / NA implementation seems natural and easy to explain, or
rather forced and difficult to explain, and then email back trying to
explain your impression (which is not always easy).

>> >> In saying that we are insisting on our way, you are saying,
>> >> implicitly, 'I am not going to negotiate'.
>> >
>> > That is only your interpretation. The observation that Mark compromised
>> > quite a bit while you didn't seems largely correct to me.
>>
>> The problem here stems from our inability to work towards agreement,
>> rather than standing on set positions.  I set out what changes I think
>> would make the current implementation OK.  Can we please, please have
>> a discussion about those points instead of trying to argue about who
>> has given more ground.
>>
>> > That commitment would of course be good. However, even if that were
>> > possible before writing code and everyone agreed that the ideas of you
>> > and Nathaniel should be implemented in full, it's still not clear that
>> > either of you would be willing to write any code. Agreement without
>> > code still doesn't help us very much.
>>
>> I'm going to return to Nathaniel's point - it is a highly valuable
>> thing to set ourselves the target of resolving substantial discussions
>> by consensus.   The route you are endorsing here is 'implementor
>> wins'.
>
> I'm not. All I want to point out is that design and implementation are
> not completely separated either.

No, they often interact.  I was trying to explain why, in this case,
the implementation hasn't changed the issues substantially, as far as
I can see.   If you think otherwise, then that is helpful information,
because you can feed back about where the initial discussion has been
overtaken by the implementation, and so we can strip down the
discussion to its essential parts.

>> We don't need to do it that way.  We're a mature sensible
>> bunch of adults
>
> Agreed:)

Ah - if only it was that easy :)

>> who can talk out the issues until we agree they are
>> ready for implementation, and then implement.
>
> The history of this discussion doesn't suggest it is straightforward to get a
> design right first time. It's a complex subject.

Right - and it's more complex when only some of the people involved
are interested in the discussion coming to a resolution.   That's
Nathaniel's point - that although it seems inefficient, working
towards a good resolution of big issues like this is very valuable in
getting the ideas right.

> The second part of your statement, "and then implement", sounds so simple.
> The reality is that there are only a handful of developers who have done a
> significant amount of work on the numpy core in the last two years. I
> haven't seen anyone saying they are planning to implement (part of) whatever
> design the outcome of this discussion will be. I don't think it's strange to
> keep this in mind to some extent.

No, but consensus building is a little bit all or none.   I guess we'd
all like consensus, but then sometimes, as Nathaniel points out, it is
inconvenient and annoying.  If we have no stated commitment to
consensus, at some unpredictable point in the discussion, those who
are implementing will - obviously - just duck out and do the
implementation.  I would do that, I guess.  Maybe I have done in the
projects I'm involved in.   The question Nathaniel is raising, and me
too, in a less coherent way, is - is that fine?    Does it matter that
we are short-cutting through substantial discussions?   Is that really
- in the long term - a more efficient way of building both the code
and the community?

Best,

Matthew


