[Numpy-discussion] Should unique types of all arguments be passed on in __array_function__?

Stephan Hoyer shoyer at gmail.com
Sun Nov 4 19:51:07 EST 2018


On Sun, Nov 4, 2018 at 8:03 AM Marten van Kerkwijk <
m.h.vankerkwijk at gmail.com> wrote:

> I thought of this partially as I was wondering how an implementation for
> ndarray itself would look like. For that, it is definitely useful to know
> all unique types, since if it is only ndarray, no casting whatsoever needs
> to be done, while if there are integers, lists, etc, an attempt has to be
> made to turn these into arrays
>

OK, so hypothetically we could invoke versions of each NumPy function that
don't call `as[any]array`, and this would slightly speed up
subclasses that call super().__array_function__?

This feels pretty unlikely for now -- and would be speeding up a
somewhat niche use-case (more niche even than __array_function__ in
general) -- but perhaps I could be convinced.


> (i.e., the `as[any]array` calls currently present in the implementations,
> which really more logically are part of `ndarray.__array_function__`
> dispatch).
>

I can sort of see the reasoning for this, but I suspect the overhead of
actually calling `ndarray.__array_function__` as part of calling every
NumPy function would be prohibitive. It would mean that __array_function__
attributes get checked twice, once for dispatching and once in
`ndarray.__array_function__`.

It would also mean that `ndarray.__array_function__` would need to grow a
general purpose coercion mechanism for converting array-like arguments into
ndarray objects. I suspect this isn't really possible given the diversity
of function signatures in NumPy, e.g., consider the handling of lists in
np.block() (recurse) vs. np.concatenate (pass through) vs ufuncs (coerce to
ndarray). The best we could do would be to add another special function, like
the dispatchers, for handling coercion in each specific NumPy function.
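
As a rough illustration of the three list-handling behaviours named above
(a sketch using only public NumPy calls; the variable names are made up for
this example and nothing here reflects NumPy internals):

```python
import numpy as np

# np.block recurses into nested lists and treats them as a block layout.
nested = [[np.ones((2, 2)), np.zeros((2, 2))],
          [np.zeros((2, 2)), np.ones((2, 2))]]
blocked = np.block(nested)            # one 4x4 block matrix

# A generic "coerce everything" step would instead stack the same lists.
coerced = np.asarray(nested)          # a 4-D array, not a block matrix

# np.concatenate takes the list as a sequence of arrays to join.
joined = np.concatenate([np.arange(3), np.arange(3, 6)])

# ufuncs coerce each list argument to an ndarray before operating.
added = np.add([1, 2, 3], [10, 20, 30])
```

The same nested list means different things to np.block and to a plain
np.asarray call, which is why a single coercion rule would not fit every
function signature.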

> Should we change this? It is quite trivially done, but perhaps I am missing
> a reason for omitting the non-override types.
>

Realistically, without these other changes in NumPy, how would this improve
code using __array_function__? From a general purpose dispatching
perspective, are there cases where you'd want to return NotImplemented
based on types that don't implement __array_function__?

I guess this might help if your alternative array class is super-explicit,
and doesn't automatically call `asmyarray()` on each argument. You could
rely on __array_function__ to return NotImplemented (and thus raise
TypeError) rather than type checking in every function you write for your
alternative arrays.
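
A minimal sketch of such a super-explicit class, assuming the proposed
behaviour in which `types` also lists arguments that do not implement
__array_function__. `ExplicitArray`, `HANDLED_FUNCTIONS` and `_concatenate`
are hypothetical names, and the method is called by hand here so the example
does not depend on what NumPy's dispatcher actually passes:

```python
import numpy as np

HANDLED_FUNCTIONS = {}  # hypothetical registry: NumPy function -> implementation


class ExplicitArray:
    """Hypothetical strict array type that never coerces foreign arguments."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED_FUNCTIONS:
            return NotImplemented
        # Assumption: `types` holds the unique types of *all* arguments,
        # so one check here replaces type checks in every implementation.
        if not all(issubclass(t, ExplicitArray) for t in types):
            return NotImplemented
        return HANDLED_FUNCTIONS[func](*args, **kwargs)


def _concatenate(arrays, axis=0):
    # No type checking needed: the dispatch step already rejected strangers.
    return ExplicitArray(np.concatenate([a.data for a in arrays], axis=axis))


HANDLED_FUNCTIONS[np.concatenate] = _concatenate

x = ExplicitArray([1, 2, 3])
ok = x.__array_function__(np.concatenate, (ExplicitArray,), ([x, x],), {})
rejected = x.__array_function__(
    np.concatenate, (ExplicitArray, list), ([x, [4, 5]],), {})
```

Here `ok` is a new ExplicitArray, while `rejected` is NotImplemented because
`list` appears among the types.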

One minor downside would be speed: now __array_function__ implementations need
to check a longer list of types.

Another minor downside: if users follow the example in the
NDArrayOperatorsMixin docstring, they would now need to explicitly list all
of the scalar types (without __array_function__) that they support,
including builtin types like int and type(None). I suppose this ties into
our recommended best practices for doing type checking in
__array_ufunc__/__array_function__ implementations, which should probably
be updated regardless:
https://github.com/numpy/numpy/issues/12258#issuecomment-432858949
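
To make the second downside concrete, here is a sketch of what an explicit
allow-list might look like if every argument type were passed along. It is
loosely modelled on the `_HANDLED_TYPES` pattern from the
NDArrayOperatorsMixin docstring; `ArrayLike` and `types_supported` are
hypothetical names, and the choice of allowed types is only an assumption:

```python
import numbers

import numpy as np


class ArrayLike:
    """Hypothetical duck array used only to illustrate the type check."""

    # Scalars and None now have to be listed explicitly, since they would
    # show up in `types` even though they lack __array_function__.
    _HANDLED_TYPES = (np.ndarray, numbers.Number, type(None))


def types_supported(types):
    """Return True if every type in `types` is on the allow-list."""
    allowed = ArrayLike._HANDLED_TYPES + (ArrayLike,)
    return all(issubclass(t, allowed) for t in types)


scalars_ok = types_supported((ArrayLike, int, float, type(None)))
numpy_ok = types_supported((ArrayLike, np.ndarray, np.float64))
list_ok = types_supported((ArrayLike, list))
str_ok = types_supported((ArrayLike, str))
```

With this allow-list, builtin numbers, NumPy scalars and None pass, while
`list` and `str` do not.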

Best,
Stephan