[Numpy-discussion] NEP mask code and the 1.7 release

Nathaniel Smith njs at pobox.com
Mon Apr 23 15:57:10 EDT 2012


On Mon, Apr 23, 2012 at 6:18 PM, Ralf Gommers
<ralf.gommers at googlemail.com> wrote:
>
>
> On Mon, Apr 23, 2012 at 12:15 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> We need to decide what to do with the NA masking code currently in
>> master, vis-a-vis the 1.7 release. While this code is great at what it
>> is, we don't actually have consensus yet that it's the best way to
>> give our users what they want/need -- or even an appropriate way. So
>> we need to figure out how to release 1.7 without committing ourselves
>> to supporting this design in the future.
>>
>> Background: what does the code currently in master do?
>> ------------------------------------------------------
>>
>> It adds 3 pointers at the end of the PyArrayObject struct (which is
>> better known as the numpy.ndarray object). These new struct members,
>> and some accessors for them, are exposed as part of the public API.
>> There are also a few additions to the Python-level API (mask= argument
>> to np.array, skipna= argument to ufuncs, etc.)
>>
>> What does this mean for compatibility?
>> --------------------------------------
>>
>> The change in the ndarray struct is not as problematic as it might
>> seem, compatibility-wise, since Python objects are almost always
>> referred to by pointers. Since the initial part of the struct will
>> continue to have the same memory layout, existing source and binary
>> code that works with PyArrayObject *pointers* will continue to work
>> unchanged.
>>
>> One place where the actual struct size matters is for any C-level
>> ndarray subclasses, which will have their memory layout change, and
>> thus will need to be recompiled. (Python-level ndarray subclasses will
>> have their memory layout change as well -- e.g., they will have
>> different __dictoffset__ values -- but it's unlikely that any existing
>> Python code depends on such details.)
>>
>> What if we want to change our minds later?
>> ------------------------------------------
>>
>> For the same reasons as given above, any new code which avoids
>> referencing the new struct fields referring to masks, or using the new
>> masking APIs, will continue to work even if the masking is later
>> removed.
>>
>> Any new code which *does* refer to the new masking APIs, or references
>> the fields directly, will break if masking is later removed.
>> Specifically, source will fail to compile, and existing binaries will
>> silently access memory that is past the end of the PyArrayObject
>> struct, which will have unpredictable consequences. (Most likely
>> segfaults, but no guarantees.) This applies even to code which simply
>> tries to check whether a mask is present.
>>
>> So I think the preconditions for leaving this code as-is for 1.7 are
>> that we must agree:
>>  * We are willing to require a recompile of any C-level ndarray
>> subclasses (do any exist?)
>
>
> As long as it's only subclasses I think this may be OK. Not 100% sure on
> this one though.
>
>>
>>  * We are willing to make absolutely no guarantees about future
>> compatibility for code which uses APIs marked "experimental"
>
>
> That is what I understand "experimental" to mean. Could stay, could change -
> no guarantees.

Earlier you said it meant "some changes are to be expected, but not
complete removal", which seems different from "absolutely no
guarantees":
  http://www.mail-archive.com/numpy-discussion@scipy.org/msg36833.html
So I just wanted to double-check whether you're revising that earlier
opinion, or...?

>>  * We are willing for this breakage to occur in the form of random
>> segfaults
>
>
> This is not OK of course. But it shouldn't apply to the Python API, which I
> think is the most important one here.

Right, this part is specifically about ABI compatibility, not API
compatibility -- segfaults would only occur for extension libraries
that were compiled against one version of numpy and then used with a
different version.
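
To make the ABI point concrete, here is a rough sketch using ctypes to
model the layout question. The struct and field names below (OldArray,
NewArray, mask_a, etc.) are made up for illustration and are not the
real PyArrayObject members; the only thing it is meant to show is that
appending fields leaves the offsets of the existing fields alone while
changing the total size.

```python
import ctypes

# Hypothetical stand-in for the fields that exist both before and after
# the change. These names are illustrative, not the real ndarray layout.
COMMON_FIELDS = [
    ("data", ctypes.c_void_p),
    ("nd", ctypes.c_int),
    ("flags", ctypes.c_int),
]


class OldArray(ctypes.Structure):
    """Struct as seen by code compiled against the version without masks."""
    _fields_ = list(COMMON_FIELDS)


class NewArray(ctypes.Structure):
    """Same struct with three (hypothetical) pointers appended at the end."""
    _fields_ = list(COMMON_FIELDS) + [
        ("mask_a", ctypes.c_void_p),
        ("mask_b", ctypes.c_void_p),
        ("mask_c", ctypes.c_void_p),
    ]


def prefix_layout_matches(old, new, names):
    """True if every named field has the same offset and size in both structs."""
    return all(
        getattr(old, n).offset == getattr(new, n).offset
        and getattr(old, n).size == getattr(new, n).size
        for n in names
    )


common_names = [name for name, _ in COMMON_FIELDS]

# Code that only touches the common fields through a pointer sees the
# same layout with either struct definition.
print("prefix unchanged:", prefix_layout_matches(OldArray, NewArray, common_names))

# The total size differs, which is what matters for subclasses that
# embed the struct.
print("sizes differ:", ctypes.sizeof(OldArray) != ctypes.sizeof(NewArray))

# In this example the first appended field starts at or beyond the end of
# the old struct, so a binary reading it from an old-style object would be
# reading memory that does not belong to that object.
print("new field past old end:", NewArray.mask_a.offset >= ctypes.sizeof(OldArray))
```

The last check is the segfault case: an extension built against the
larger struct but handed an object allocated with the smaller one.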

- N


