[Numpy-discussion] defining a NumPy API standard?

Sun Jun 2 02:59:25 EDT 2019

On Sun, Jun 2, 2019 at 12:35 AM Nathaniel Smith <njs at pobox.com> wrote:

> On Sat, Jun 1, 2019 at 1:05 PM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
> > I think this is potentially useful, but *far* more prescriptive and
> > detailed than I had in mind. Both you and Nathaniel seem to have not
> > understood what I mean by "out of scope", so I think that's my fault in
> > not being explicit enough. I *do not* want to prescribe behavior.
> > Instead, a simple yes/no for each function in numpy and method on
> > ndarray.
>
> So yes/no are the answers. But what's the question?
>
> "If we were redesigning numpy in a fantasy world without external
> constraints or compatibility issues, would we include this function?"
> "Is this function well designed?"
> "Do we think that supporting this function is necessary to achieve
> practical duck-array compatibility?"
> "If someone implements this function, should we give them a 'numpy
> core compliant!' logo to put on their website?"
> "Do we recommend that people use this function in new code?"
> "If we were trying to design a minimal set of primitives and implement
> the rest of numpy in terms of them, then is this function a good
> candidate for a primitive?"
>
> These are all really different things, and useful for solving
> different problems... I feel like you might be lumping them together
> some?
>

No, I feel like you just want to see a real proposal. At this point I've
gotten some really useful feedback, in particular from Marten (thanks!),
and I have a better idea of what to do. So I'll answer a few of your
questions, and propose to leave the rest till I actually have something
more solid to discuss. That will likely answer many of your questions.

> Also, I'm guessing there are a bunch of functions where you think part
> of the interface is fine and part of the interface is broken. (E.g.
> dot's behavior on high-dimensional arrays.)

Indeed, but that's a much harder problem to tackle. Again, there's a reason
I put function behavior explicitly out of scope.

> Do you think this "one
> bool per function" structure will be fine-grained enough for what you
> want to do?
>

yes
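
To make the "one bool per function" structure concrete, here is a minimal
sketch. The names IN_CORE and missing_core_functions, and the verdicts
themselves, are made up for illustration; none of this is an actual list or
an existing NumPy API.

```python
from types import SimpleNamespace

# Hypothetical verdicts, chosen only for illustration; not an actual list.
IN_CORE = {
    "sum": True,
    "matmul": True,
    "dot": True,
    "fv": False,
}


def missing_core_functions(namespace, verdicts=IN_CORE):
    """Return the sorted names marked True that `namespace` does not provide."""
    return sorted(
        name
        for name, in_core in verdicts.items()
        if in_core and not callable(getattr(namespace, name, None))
    )


# Stand-in for an array library that implements only part of the list.
toy_library = SimpleNamespace(sum=sum, dot=lambda a, b: a * b)
print(missing_core_functions(toy_library))
```

The table records only whether a name is on the list; it says nothing about
how the function should behave, which stays out of scope.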

>
> > Two other thoughts:
> > 1. NumPy is not done. Our thinking on how to evolve the NumPy API is
> > fairly muddled. When new functions are proposed, it's decided on a
> > case-by-case basis, usually without a guiding principle. We need to
> > improve that. A "core of NumPy" list could be a part of that puzzle.
>
> I think we do have some rough consensus principles on what's in scope
> and what isn't in scope for numpy,

Very rough perhaps. I don't think we are on the same wavelength at all
about the cost of adding new functions, the cost of deprecations, the use
of submodules and even what's public or private right now.

That can't be solved all at once, but I think my idea will help with some
of these.

> but yeah, articulating them more
> clearly could be useful. Stuff like "output types and shape should be
> predictable from input types and shape", "numpy's core
> responsibilities are the array/dtype/ufunc interfaces, and providing a
> lingua franca for python numerical libraries to interoperate" (and
> therefore: "if it can live outside numpy it probably should"), etc.
>

All of these are valid questions. Most of that probably needs to be in the
scope document (https://www.numpy.org/neps/scope.html), which also needs to
be improved.

> I'm seeing this as a living document (a NEP?)

A NEP would work, although I'd prefer a way to reference some fixed version
of it rather than it being always in flux.

> that tries to capture
> some rules of thumb and that we update as we go. That seems pretty
> different to me than a long list of yes/no checkboxes though?
>
> > 2. We often argue about deprecations. Deprecations are costly, but so
> > is keeping around functions that are not very useful or have a poor
> > design. This may offer a middle ground. Don't let others repeat our
> > mistakes, signal to users that a function is of questionable value,
> > without breaking already written code.
>
> The idea has come up a few times of having a "soft deprecation" level,
> where we put a warning in the docs but not in the code. It seems like
> a reasonable idea to me. It's inherently a kind of case-by-case thing
> that can be done incrementally. But, if someone wants to
> systematically work through all the docs and do the case-by-case
> analysis, that also seems like a reasonable idea to me. I'm not sure
> if that's the same as your proposal or not.
>

Not the same, but related.
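
For reference, the "soft deprecation" level quoted above (a warning in the
docs but not in the code) could be sketched roughly as follows. The
decorator name soft_deprecated and the __soft_deprecated__ attribute are
hypothetical; this is an illustration of the idea, not something NumPy
provides.

```python
import warnings


def soft_deprecated(note):
    """Hypothetical decorator: flag a function in its docstring only."""
    def decorate(func):
        doc = func.__doc__ or ""
        # Prepend a notice to the docs; the function body is left untouched,
        # so calling it emits no DeprecationWarning.
        func.__doc__ = f"Discouraged in new code: {note}\n\n{doc}"
        func.__soft_deprecated__ = True
        return func
    return decorate


@soft_deprecated("prefer `new_helper`")
def old_helper(x):
    """Return x unchanged."""
    return x


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_helper(3)

print(result, len(caught))
print(old_helper.__doc__.splitlines()[0])
```

Existing callers see no change at runtime; only readers of the docs, or
tools that check the marker attribute, see the notice.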

Ralf