[Numpy-discussion] Views of a different dtype

Thu Jan 29 11:57:57 EST 2015

On Thu, Jan 29, 2015 at 12:56 AM, Jaime Fernández del Río
<jaime.frio at gmail.com> wrote:
[...]
> With all these in mind, my proposal for the new behavior is that taking a
> view of an array with a different dtype would require:
>
> 1. That the newtype and oldtype be compatible, as defined by the algorithm
>    checking object offsets linked above.
> 2. If newtype.itemsize == oldtype.itemsize no more checks are needed, make it
>    happen!
> 3. If the array is C/Fortran contiguous, check that the size in bytes of the
>    last/first dimension is evenly divided by newtype.itemsize. If it is, go
>    for it.
> 4. For non-contiguous arrays:
>    a. Ignoring dimensions of size 1, check that no stride is smaller than
>       either oldtype.itemsize or newtype.itemsize. If any is found this is
>       an as_strided product, sorry, can't do it!
>    b. Ignoring dimensions of size 1, find a contiguous dimension, i.e.
>       stride == oldtype.itemsize.
>       - If found, check that it is the only one with that stride, that it
>         is the minimal stride, and that the size in bytes of that dimension
>         is evenly divided by newtype.itemsize.
>       - If none is found, check if there is a size 1 dimension that is also
>         unique (unless we agree on a default, as mentioned above) and that
>         newtype.itemsize evenly divides oldtype.itemsize.

I'm really wary of this idea that we go grovelling around looking for
some suitable dimension somewhere to absorb the new items. Basically
nothing in numpy produces semantically different arrays (e.g., ones
with different shapes) depending on the *strides* of the input array.
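
To make the concern concrete, here is a minimal sketch of how the quoted
rules could give two equal arrays different result shapes. The helper
proposed_shape is hypothetical (it is not a NumPy function), it only
computes the shape the rules would seem to imply, and it assumes exactly
one contiguous dimension:

```python
import numpy as np


def proposed_shape(a, newtype):
    """Shape the quoted rules would give when viewing `a` as `newtype`.

    Hypothetical helper for illustration only: it computes a shape, never
    a view, and assumes exactly one contiguous dimension exists.
    """
    new = np.dtype(newtype)
    # Dimensions (ignoring size 1) whose stride equals the old item size.
    axes = [i for i, (n, s) in enumerate(zip(a.shape, a.strides))
            if n != 1 and s == a.itemsize]
    if len(axes) != 1:
        raise ValueError("no unique contiguous dimension")
    axis = axes[0]
    nbytes = a.shape[axis] * a.itemsize
    if nbytes % new.itemsize:
        raise ValueError("size in bytes not divisible by newtype.itemsize")
    shape = list(a.shape)
    shape[axis] = nbytes // new.itemsize
    return tuple(shape)


c = np.zeros((4, 4), dtype=np.int16)   # C order, strides (8, 2)
f = np.asfortranarray(c)               # same values, strides (2, 8)
shape_c = proposed_shape(c, np.int32)  # last dimension absorbs the change
shape_f = proposed_shape(f, np.int32)  # first dimension absorbs the change
```

Here c and f compare equal and have the same shape, yet the axis that
would change differs because only their strides differ.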

Could we make it more like: check to see if the last dimension works.
If not, raise an error (and let the user transpose some other
dimension there if that's what they wanted)? Or require the user to
specify which dimension will absorb the shape change? (If we were
doing this from scratch, then it would be tempting to just say that we
always add a new dimension at the end with oldtype.itemsize /
newtype.itemsize entries, or absorb a dimension of newtype.itemsize /
oldtype.itemsize entries if shrinking. As a bonus, this would always
work, regardless of contiguity! Except that when shrinking, the last
dimension would have to be contiguous, of course.)
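
A rough sketch of that "from scratch" rule, assuming plain (non-structured,
non-object) dtypes and non-empty arrays. The function name view_trailing is
made up for illustration and is not NumPy API; it relies only on
numpy.lib.stride_tricks.as_strided and ndarray.view:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def view_trailing(a, newtype):
    """Reinterpret `a` as `newtype`, changing only a trailing dimension.

    Hypothetical helper, not part of NumPy. Assumes a non-empty array and
    plain (non-structured, non-object) dtypes.
    """
    new, old = np.dtype(newtype), a.dtype
    if a.size == 0:
        raise ValueError("empty arrays are not handled in this sketch")
    if new.itemsize == old.itemsize:
        return a.view(new)  # same item size: shape and strides unchanged

    if new.itemsize < old.itemsize:
        # Smaller items: append a dimension of k entries, for any strides.
        k, rem = divmod(old.itemsize, new.itemsize)
        if rem:
            raise ValueError("newtype.itemsize must divide oldtype.itemsize")
        # Reinterpret the first element only; a 1-element array is contiguous.
        first = as_strided(a, shape=(1,), strides=(old.itemsize,)).view(new)
        return as_strided(first, shape=a.shape + (k,),
                          strides=a.strides + (new.itemsize,))

    # Larger items: absorb the last dimension, which must be contiguous and
    # hold exactly one new item.
    k, rem = divmod(new.itemsize, old.itemsize)
    if rem or a.ndim == 0 or a.shape[-1] != k or a.strides[-1] != old.itemsize:
        raise ValueError("last dimension cannot be absorbed into one new item")
    first = as_strided(a, shape=(k,), strides=(old.itemsize,)).view(new)
    return as_strided(first, shape=a.shape[:-1], strides=a.strides[:-1])


a = np.arange(12, dtype=np.int32).reshape(3, 4).T  # shape (4, 3), not C-contiguous
b = view_trailing(a, np.uint8)                     # trailing dimension of 4 bytes
c = view_trailing(b, np.int32)                     # absorbs that dimension again
```

The result shape depends only on the input shape and the two item sizes,
never on the strides, which is the property argued for above.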

I guess the main consideration for this is that we may be stuck with
stuff b/c of backwards compatibility. Can you maybe say a little bit
about what is allowed now, and what constraints that puts on things?
E.g. are we already grovelling around in strides and picking random
dimensions in some cases?

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org