[Numpy-discussion] Extracting sub-fields from an array as a view (PR 350)

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Sun Jul 15 10:33:20 EDT 2012


On 07/15/2012 12:31 AM, Travis Oliphant wrote:
>
> In https://github.com/numpy/numpy/pull/350/files ,
>
> javius provides a patch to allow field extraction from a structured
> array to return a view instead of a copy. Generally, this is consistent
> with the desire to have NumPy return views whenever it can. The same
> idea underlies the change to the diagonal method.
>
> Suppose 'myarr' is a structured array with fields ['lat', 'long',
> 'meas1', 'meas2', 'meas3', 'meas4'].
>
> Currently,
>
> myarr[['lat', 'long', 'meas3']] will return a copy of the data in the
> underlying array. The proposal is to have this return a view, but do it
> in a two-stage approach so that a first version returns a copy with the
> WARN_ON_WRITE flag (introduced in NumPy 1.7) set. A later version will
> remove the flag (and the copy).
>
> What are thoughts on this proposal and which version of NumPy it should
> go in?
>

There would at least need to be a deprecation plan where you use 
warnings to get users to insert extra explicit copy() wherever it's 
needed. With some very ugly hacks you could have copy() return "self" if 
the refcount is 1, but that also requires some knowledge of locals() of 
the calling frame to be safe and wouldn't work that well with Cython 
etc., so probably way too ugly.
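
To make the "explicit copy()" point concrete, here is a minimal sketch. The
array contents and the variable names 'sub' and 'lat' are only assumptions
for illustration; the field names are the ones from Travis's example. With
the explicit copy() the result is the same whether the multi-field index
returns a copy or a view:

```python
import numpy as np

# Hypothetical structured array using the field names from the quoted example.
fields = ("lat", "long", "meas1", "meas2", "meas3", "meas4")
myarr = np.zeros(3, dtype=[(name, "f8") for name in fields])

# Explicit copy(): 'sub' owns its data regardless of whether the
# multi-field index itself hands back a copy or a view.
sub = myarr[["lat", "long", "meas3"]].copy()
sub["lat"] = 1.0      # touches only the copy

# Single-field indexing already returns a view, so this writes through.
lat = myarr["lat"]
lat[0] = 5.0

print(myarr["lat"])   # only element 0 was changed, via the view
print(sub["lat"])     # all ones, independent of myarr
```

Code that wants the write to reach myarr would simply drop the copy() once
views are returned; code that relies on the old behaviour keeps it.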

I hesitate to write the below, but if you start going down this road, I 
feel someone should at least mention it:

I would prefer it if NumPy returned views in a lot more situations than 
today. Using the suboffsets idea of PEP 3118 you could also return a 
view for

a[[1, 2, 4]]

which would mean that

y = x[a]
y[...] = 4

would finally mean the same as

x[a] = 4

and be a lot more consistent overall.
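
For reference, a small sketch of how the two spellings differ today (the
values of x and a, and the name 'x_before', are just assumptions for
illustration):

```python
import numpy as np

x = np.zeros(6, dtype=int)
a = [1, 2, 4]

# Fancy indexing on the right-hand side gives a copy ...
y = x[a]
y[...] = 4            # ... so this fills y and leaves x alone
x_before = x.copy()   # snapshot of x at this point: still all zeros

# Fancy indexing on the left-hand side assigns into x itself.
x[a] = 4

print(x_before)
print(x)
```

With a suboffsets-style view, the first form would modify x as well.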

It wouldn't be efficient, it wouldn't be a good idea for most users -- 
but it would still be within the structure of PEP 3118 (using suboffsets 
and allocating a pointer table temporarily), and it would lower the 
learning curve.

I realize the immense backwards compatibility challenges and 
implementation challenges and that this probably won't ever happen, but 
I felt this was the time to at least bring it up.

Dag


