[Numpy-discussion] consensus (was: NA masks in the next numpy release?)
Matthew Brett
matthew.brett at gmail.com
Sat Oct 29 14:43:15 EDT 2011
Hi,
On Fri, Oct 28, 2011 at 8:38 PM, Benjamin Root <ben.root at ou.edu> wrote:
> Matt,
>
> On Friday, October 28, 2011, Matthew Brett <matthew.brett at gmail.com> wrote:
>>
>>> Forget about rudeness or decision processes.
>>
>> No, that's a common mistake, which is to assume that any conversation
>> about things which aren't technical is not important. Nathaniel's
>> point is important. Rudeness is important. The reason we've got into
>> this mess is that we clearly don't have an agreed way of making
>> decisions. That's why countries and open-source projects have
>> constitutions, so this doesn't happen.
>
> Don't get me wrong. In general, you are right. And maybe we all should
> discuss something to that effect for numpy. But I would rather do that when
> there isn't such contention and tempers.
That's a reasonable point.
> As for allegations of rudeness, I believe that we are actually so close to
> consensus that I immediately wanted to squelch any sort of
> meta-meta-disagreements about who was being rude to whom. As a quick
> band-aid, anybody who felt slighted by me gets a drink on me at the next
> scipy conference. From this point on, let's institute a 10 minute rule --
> write your email, wait ten minutes, read it again and edit it.
Good offer. I make the same one.
>>> I will start by saying that I am willing to separate ignore and absent,
>>> but
>>> only on the write side of things. On read, I want a single way to
>>> identify
>>> the missing values. I also want only a single way to perform
>>> calculations
>>> (either skip or propagate).
>>
>> Thank you - that is very helpful.
>>
>> Are you saying that you'd be OK setting missing values like this?
>>
>>>>> a.mask[0:2] = False
>>
>
> Probably not that far, because that would be an attribute that may or may
> not exist. Rather, I might like the idea of an NA that "always" means absent
> (and destroys the data - even through views), and an MA (or some other name)
> that always means ignore (and has the masking behavior with views). This
> ties specific behaviors distinctly to specific objects.
Ah - yes - thank you. I think you and I at least have somewhere to go
for agreement, but I don't know how to work towards a numpy-wide
agreement. Do you have any thoughts?
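For concreteness, here is a minimal sketch of the two behaviours using what
numpy already has: numpy.ma standing in for "ignore", and a plain float array
with NaN standing in for "absent". Neither is the proposed NA or MA object,
and the variable names are made up for illustration.

```python
import numpy as np

# "Ignore": mask the value; the underlying data survives.
ignored = np.ma.masked_array([1.0, 2.0, 3.0], mask=[True, False, False])
hidden_value = ignored.data[0]      # the value is still stored under the mask
ignored.mask = np.ma.nomask         # unmask everything
restored_value = ignored[0]         # the original value is visible again

# "Absent": overwrite the value, here through a view, and it is gone.
absent = np.array([1.0, 2.0, 3.0])
view = absent[:2]                   # basic slicing gives a view
view[0] = np.nan                    # the write reaches the parent array
```

The point of the sketch is only that unmasking gets the value back in the
first case, while nothing can recover it in the second.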
>> For the read side, do you mean you're OK with this
>>
>>>>> a.isna()
>>
>> To identify the missing values, as is currently the case? Or something
>> else?
>>
>
> Yes. A missing value is a missing value, regardless of it being absent or
> marked as ignored. But it is a bit more subtle than that. I should just be
> able to add two arrays together and the "data should know what to do". When
> the core ufuncs get this right (like min, max, sum, cumsum, diff, etc), then
> I don't have to do much to prepare higher level funcs for missing data.
>
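To make the read side and the skip/propagate choice concrete, here is a rough
sketch with current numpy. `is_missing` is a hypothetical helper, not an
existing or proposed numpy function; it treats a masked element ("ignore") and
a NaN ("absent") alike. The reductions use functions that already exist.

```python
import numpy as np

def is_missing(arr):
    """Hypothetical helper: True where a value is masked or NaN."""
    mask = np.ma.getmaskarray(arr)   # all False for a plain ndarray
    data = np.ma.getdata(arr)        # underlying values, mask dropped
    return mask | np.isnan(data)

# One masked ("ignore") element and one NaN ("absent") element.
a = np.ma.masked_array([1.0, np.nan, 3.0, 4.0],
                       mask=[True, False, False, False])
missing = is_missing(a)              # the first two positions are missing

plain = np.array([1.0, np.nan, 3.0])
propagated = np.sum(plain)           # NaN propagates through the sum
skipped = np.nansum(plain)           # NaN is skipped

masked = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])
masked_total = masked.sum()          # the masked value is skipped
```

Here the caller picks skip or propagate by picking the function; the wish above
is for the data itself to carry that choice.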
>> If so, then I think we're very close, it's just a discussion about names.
>>
>
> And what does ignore + absent equal? ;-)
ignore + absent == special_value_of_some_sort :)
Just joking,
See you,
Matthew