[Numpy-discussion] Proposal: add `force=` or `copy=` kwarg to `__array__` interface

Stephan Hoyer shoyer at gmail.com
Fri Apr 24 13:12:08 EDT 2020


On Fri, Apr 24, 2020 at 6:31 AM Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> One thing to note is that `__array__` is actually asked to return a
> copy AFAIK.


The documentation on __array__ seems to be quite limited, unfortunately. The
most I can find are a few sentences here:
https://numpy.org/doc/stable/reference/arrays.classes.html#numpy.class.__array__

I don't see anything about returning copies. My interpretation has always
been that __array__ can return either a copy or a view, like the
np.asarray() constructor.
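
To make the copy-or-view point concrete, here is a rough sketch. ViewDuck
and CopyDuck are hypothetical classes made up for illustration, not taken
from any library. The `copy=None` parameter is accepted only so the
signature also works with newer NumPy releases; this sketch ignores it.

```python
import numpy as np


class ViewDuck:
    """Hypothetical duck array whose __array__ hands back its own buffer."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        # `copy` is accepted but ignored in this sketch.
        return self._data if dtype is None else self._data.astype(dtype)


class CopyDuck(ViewDuck):
    """Hypothetical duck array whose __array__ always returns a fresh copy."""

    def __array__(self, dtype=None, copy=None):
        # np.array() copies by default, so the caller never sees our buffer.
        return np.array(self._data, dtype=dtype)


view_duck = ViewDuck([1.0, 2.0, 3.0])
copy_duck = CopyDuck([1.0, 2.0, 3.0])

# np.asarray() accepts either result; only the memory sharing differs.
shares_view = np.shares_memory(np.asarray(view_duck), view_duck._data)
shares_copy = np.shares_memory(np.asarray(copy_duck), copy_duck._data)
print(shares_view, shares_copy)
```

Both classes satisfy np.asarray(), and the caller cannot tell from the call
alone which behaviour it got.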


> I doubt it always does, but if it does not I assume the
> object should and could provide `__array_interface__`.
>

Objects like xarray.DataArray and pandas.Series sometimes directly wrap
NumPy arrays and sometimes don't.

They both implement __array__ but not __array_interface__. It's very obvious
how to implement a "forwarding" __array__ method (just call `np.asarray()`
on an argument that might implement it). I guess something similar could be
done for __array_interface__, but it's not clear to me that it's right to
implement __array_interface__ when doing so might require a copy.
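
A minimal sketch of what I mean by a "forwarding" __array__ method follows.
The Forwarding class and its `values` attribute are hypothetical, not the
actual xarray or pandas implementation, and the `copy` parameter is accepted
only for compatibility with newer NumPy signatures and is not acted on.

```python
import numpy as np


class Forwarding:
    """Hypothetical container that may or may not hold a NumPy array."""

    def __init__(self, values):
        self.values = values

    def __array__(self, dtype=None, copy=None):
        # Defer to whatever `values` is: an ndarray, a list, or another
        # object that implements __array__ itself.
        return np.asarray(self.values, dtype=dtype)


base = np.arange(4)
wrapped_array = Forwarding(base)
wrapped_list = Forwarding([0, 1, 2, 3])
nested = Forwarding(Forwarding(base))

# Wrapping an ndarray (even through two layers) needs no copy ...
no_copy_direct = np.shares_memory(np.asarray(wrapped_array), base)
no_copy_nested = np.shares_memory(np.asarray(nested), base)
# ... while wrapping a list necessarily allocates a new array.
from_list = np.asarray(wrapped_list)
```

Whether a copy happens depends entirely on what is wrapped, which is why the
same trick is less obviously right for __array_interface__.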


> Under that assumption, it would be an opt-out right now since NumPy
> allows copies by default here.
> Defining things along copy does seem sensible, though I do not know how
> it would play with some of the current array-likes choosing to refuse
> `__array__`.
>
> - Sebastian
>
>
>
> > Eric
> >
> > On Fri, 24 Apr 2020 at 03:00, Juan Nunez-Iglesias <jni at fastmail.com>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > > One bit of expressivity we would miss is “copy if necessary, but
> > > > otherwise don’t bother”, but there are workarounds to this.
> > > >
> > >
> > > After a side discussion with Stéfan van der Walt, we came up with
> > > `allow_copy=True`, which would express to the downstream library that
> > > we don’t mind waiting, but that zero-copy would also be ok.
> > >
> > > This sounds like the sort of thing that is use case driven. If enough
> > > projects want to use it, then I have no objections to adding the
> > > keyword. OTOH, we need to be careful about adding too many
> > > interoperability tricks, as they complicate the code and make it hard
> > > for folks to determine the best solution. Interoperability is a hot
> > > topic and we need to be careful not to leave behind too many
> > > experiments in the NumPy code. Do you have any other ideas of how to
> > > achieve the same effect?
> > >
> > >
> > > Personally, I don’t have any other ideas, but would be happy to hear
> > > some!
> > >
> > > My view regarding API/experiment creep is that `__array__` is the
> > > oldest and most basic of all the interop tricks and that this can be
> > > safely maintained for future generations. Currently it only takes
> > > `dtype=` as a keyword argument, so it is a very lean API. I think this
> > > particular use case is very natural and I’ve encountered the
> > > reluctance to implicitly copy twice, so I expect it is reasonably
> > > common.
> > >
> > > Regarding difficulty in determining the best solution, I would be
> > > happy to contribute to the dispatch basics guide together with the new
> > > kwarg. I agree that the protocols are getting quite numerous and I
> > > couldn’t find a single place that gathers all the best practices
> > > together. But, to reiterate my point: `__array__` is the simplest of
> > > these and I think this keyword is pretty safe to add.
> > >
> > > For ease of discussion, here are the API options discussed so far, as
> > > well as a few extra that I don’t like but might trigger other ideas:
> > >
> > > # default is False, or None -> leave it to the duck array to decide
> > > np.asarray(my_duck_array, allow_copy=True)
> > > # always copies, but, if supported by the duck array, defers to it
> > > # for the copy
> > > np.asarray(my_duck_array, copy=True)
> > > # could take values 'allow', 'force', 'no', True (= 'force'),
> > > # False (= 'no')
> > > np.asarray(my_duck_array, copy='allow')
> > > # separate concepts, but unclear what force_copy=True,
> > > # allow_copy=False means!
> > > np.asarray(my_duck_array, force_copy=False, allow_copy=True)
> > > np.asarray(my_duck_array, force=True)
> > >
> > > Juan.
> > > _______________________________________________
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion at python.org
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > >