[Numpy-discussion] NEP: Dispatch Mechanism for NumPy’s high level API

Tue Jun 5 17:11:23 EDT 2018

On Tue, Jun 5, 2018 at 12:35 PM Marten van Kerkwijk <
m.h.vankerkwijk at gmail.com> wrote:

> Things would, I think, make much more sense if `ndarray.__array_ufunc__`
> (or `*_function__`) actually *were* the implementation for array-only. But
> while that is something I'd like to eventually get to, it seems out of
> scope for the current discussion.
>

If this is a desirable end-state, we should at least consider it now while
we are designing the __array_function__ interface.

With the current proposal, I think this would be nearly impossible. The
challenge is that ndarray.__array_function__ would somehow need to call the
non-overloaded version of the given function, provided that no other
arguments overload __array_function__. However, we currently don't expose
this information in any way.

Some ways this could be done (including some of your prior suggestions):
- Add a coerce=True argument to all NumPy functions, which could be used by
non-overloaded implementations.
- A separate namespace for non-overloaded functions (e.g.,
numpy.array_only).
- Adding another argument to the __array_function__ interface to explicitly
provide the non-overloaded implementation (e.g., func_impl).
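
As a rough sketch of what the third option might look like, again as a
pure-Python toy rather than real NumPy code: `dispatch`, `my_sum`,
`_sum_impl` and `Wrapped` are hypothetical, the handler signature is
simplified, and `func_impl` is just the example name from the list above.

```python
# Toy model of the dispatch protocol; nothing here is real NumPy API.

def _sum_impl(values):
    """Non-overloaded implementation: works on plain sequences only."""
    return sum(list(values))


def dispatch(public_func, func_impl, args):
    """Try each argument's __array_function__, passing func_impl along."""
    for arg in args:
        handler = getattr(type(arg), "__array_function__", None)
        if handler is not None:
            result = handler(arg, public_func, args, func_impl=func_impl)
            if result is not NotImplemented:
                return result
    # No argument overloaded the function: fall back to the plain version.
    return func_impl(*args)


def my_sum(values):
    """Public, overloadable function (a stand-in for a NumPy function)."""
    return dispatch(my_sum, _sum_impl, (values,))


class Wrapped:
    """Hypothetical duck array that defers to the non-overloaded code path."""

    def __init__(self, data):
        self.data = list(data)

    def __array_function__(self, func, args, func_impl=None):
        # Unwrap our own type, then call the implementation we were handed
        # instead of re-entering the dispatcher through func.
        unwrapped = [a.data if isinstance(a, Wrapped) else a for a in args]
        return Wrapped([func_impl(*unwrapped)])


result = my_sum(Wrapped([1, 2, 3]))
print(type(result).__name__, result.data)
```

The cost is that every implementer of __array_function__ has to accept and
understand the extra argument, even if they never use it.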

I don't like any of these options and I'm not sure I agree with your goal,
but the NEP should make clear that we are precluding this possibility.

> Given that, I think that perhaps it is also best not to do
> `NotImplementedButCoercible` - as I think the implementers of
> `__array_function__` perhaps should just do that themselves. But I may well
> swing the other way again... Good examples of non-trivial benefits would
> help.
>

This would also be my default stance, and of course we can always add
NotImplementedButCoercible later.

I can think of two main use cases:
1. Libraries that only want to overload *some* NumPy functions, but want
the rest of NumPy's API by coercing arguments to NumPy arrays.
2. Libraries that want to eventually overload all of NumPy's high level API,
but need to do so incrementally, in a way that preserves backwards
compatibility.

I'm not sure I agree with use case 1. Arguably, libraries that only
overload a limited part of NumPy's API shouldn't encourage their users
to rely on it. This state of affairs is pretty confusing to users.

However, case 2 is valid and potentially important. Consider the case of a
library with existing users that would like to start implementing
__array_function__ (e.g., dask, astropy, xarray, pandas). The right
strategy really depends upon whether the library considers the current
behavior of NumPy functions on their objects (silent coercion to numpy
arrays) a feature or a bug:
- If coercion is a bug and something that the library never intended to
support, then perhaps it would be OK to suddenly change all existing
overloads to return the correct type.
- However, if coercion is a feature (which is probably the attitude of at
least some users), ideally there really should be a graceful way to enable
the new overloaded behavior incrementally. For example, a library might
want to start issuing FutureWarning in version X, before switching over to
the new overloaded behavior in version X+1. I can't think of how to do this
without NotImplementedButCoercible.
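
A minimal sketch of that transition, assuming a sentinel with the proposed
semantics. It is a pure-Python toy: the sentinel object, `total`,
`_total_impl`, `LegacyContainer` and its `to_list` method are all
hypothetical, and the handler signature is simplified.

```python
import warnings

# Hypothetical sentinel meaning "I don't handle this, but you may coerce me".
NotImplementedButCoercible = object()


def _total_impl(values):
    """Non-overloaded implementation on plain lists."""
    return sum(values)


def total(values):
    """Overloadable function in the toy dispatcher."""
    handler = getattr(type(values), "__array_function__", None)
    if handler is None:
        return _total_impl(values)
    result = handler(values, total, (values,))
    if result is NotImplementedButCoercible:
        # Old behavior: coerce the argument and run the plain implementation.
        return _total_impl(values.to_list())
    if result is NotImplemented:
        raise TypeError("no implementation found for total()")
    return result


class LegacyContainer:
    """Version X of a hypothetical library: warn, but keep coercing."""

    def __init__(self, data):
        self.data = list(data)

    def to_list(self):
        return self.data

    def __array_function__(self, func, args):
        warnings.warn(
            "total() will return a LegacyContainer in version X+1",
            FutureWarning,
            stacklevel=3,
        )
        return NotImplementedButCoercible


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = total(LegacyContainer([1, 2, 3]))

print(value, [w.category.__name__ for w in caught])
```

In version X+1 the same handler would drop the warning and return a
LegacyContainer instead of the sentinel.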

For projects like dask and xarray, the benefits of __array_function__ are
so large that we will accept a hard transition that breaks some user code
without warning. But this may not be the case for other projects.