[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

josef.pktd at gmail.com josef.pktd at gmail.com
Tue Apr 2 14:37:39 EDT 2013


On Tue, Apr 2, 2013 at 2:04 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Tue, Apr 2, 2013 at 12:29 PM, Chris Barker - NOAA Federal
> <chris.barker at noaa.gov> wrote:
>> On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>>> Thank you for the compliment, it's more enjoyable than other potential
>>> explanations of my confusion (sigh).
>>>
>>> But, I don't think that is the explanation.
>>
>> well, the core explanation is these are difficult and intertwined
>> concepts...And yes, better names and better docs can help.
>>
>>> Last, as soon as we came to the distinction between index order and
>>> memory layout, it was clear.
>>>
>>> We all agreed that this was an important distinction that would
>>> improve numpy if we made it.
>>
>> yup.
>>
>>> I think you agree that there is potential for confusion, and there
>>> doesn't seem any reason to continue with that confusion if we can come
>>> up with a clearer name.
>>
>> well, changing an API is not to be taken lightly -- we are not
>> discussing how we'd do it if we were to start from fresh here. So any
>> change should make things enough better that it is worth dealing with
>> the process of the change.
>
> Yes, for sure.  I was only trying to point out that we are not talking
> about breaking backwards compatibility.
>
>>> So here is a compromise proposal.
>>
>>> * Preferring the names 'c-style' and 'f-style' for the indexing order
>>> case (ravel, reshape, flatiter)
>>
>>> * Leaving 'C' and 'F' as functional shortcuts, so there is no possible
>>> backwards-compatibility problem.
>>
>> seems reasonable enough -- though even with the backward
>> compatibility, users will be faced with many, many older examples and
>> docs that use 'C' and 'F', while the new ones refer to the new names
>> -- might this be cause for even more confusion (at least for a few
>> years...)
>
> I doubt it would be 'even more' confusion.  They would only have to
> read the docstrings to work out what is meant, and I believe, with
> better names, they'd be less likely to fall into the traps I fell
> into, at least.
>
>> leaving me with an equivocal +0 on that ....
>>
>> another thought:
>>
>> """
>> Definition: np.ravel(a, order='C')
>>
>> A 1-D array, containing the elements of the input, is returned.  A copy is
>> made only if needed.
>>
>> Parameters
>> ----------
>> a : array_like
>>     Input array.  The elements in ``a`` are read in the order specified by
>>     `order`, and packed as a 1-D array.
>> order : {'C','F', 'A', 'K'}, optional
>>     The elements of ``a`` are read in this order. 'C' means to view
>>     the elements in C (row-major) order. 'F' means to view the elements
>>     in Fortran (column-major) order. 'A' means to view the elements
>>     in 'F' order if a is Fortran contiguous, 'C' order otherwise.
>>     'K' means to view the elements in the order they occur in memory,
>>     except for reversing the data when strides are negative.
>>     By default, 'C' order is used.
>> """
>>
>> Does ravel need to support the 'A' and 'K' options? It's kind of an
>> advanced use, and really more suited to .view(), perhaps?
>>
>> What I'm getting at is that this version of ravel() conflates the two
>> concepts: virtual ordering and memory ordering in one function --
>> maybe they should be considered as two different functions altogether
>> -- I think that would make for less confusion.
>
> I think it would conceal the confusion only.   If we don't have 'A'
> and 'K' in there, it allows us to keep the dream of a world where 'C'
> only refers to index ordering, but *only for this docstring*.   As
> soon as somebody does ``np.array(arr, order='C')`` they will find
> themselves in conceptual trouble again.

I still don't see why order is not a general concept, whether it
refers to memory or indexing/iterating.
The qualifier can be made clear in the docstrings (or from the context).

It's all over the documentation:
we can iterate in F-order over an array that is in C-order (*), or vice-versa
(*) or over one that just has some arbitrary strides
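
A minimal sketch of that point, assuming a small 2x3 integer array (the
variable names are only for illustration): the order passed to np.nditer
decides the sequence in which elements are visited, whatever the memory
layout of the array happens to be.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)        # laid out in C order in memory
b = np.asfortranarray(a)              # same values, Fortran layout in memory
s = a[:, ::2]                         # strided view, neither C nor F contiguous

# Iterate in F (column-major) index order over the C-ordered array.
f_over_c = [int(x) for x in np.nditer(a, order='F')]

# Iterate in C (row-major) index order over the F-ordered array.
c_over_f = [int(x) for x in np.nditer(b, order='C')]

# Iterate in F index order over the strided view.
f_over_strided = [int(x) for x in np.nditer(s, order='F')]

print(f_over_c)        # [0, 3, 1, 4, 2, 5]
print(c_over_f)        # [0, 1, 2, 3, 4, 5]
print(f_over_strided)  # [0, 3, 2, 5]
```

In each case the result depends only on the values at each index and on the
iteration order asked for, not on how the data is stored.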

http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html
http://docs.scipy.org/doc/numpy/reference/generated/numpy.nditer.html#numpy.nditer
pure shape
http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#changing-array-shape
shape and copy
http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html#numpy.ndarray.flatten
memory
http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#changing-kind-of-array
http://docs.scipy.org/doc/numpy/reference/routines.array-creation.html#from-existing-data
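
As a rough illustration of the three groups of links above (a sketch on an
assumed 2x3 example array, not taken from the linked pages), the same
`order` keyword shows up in each of them with a different qualifier:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# pure shape: order= selects the index order used to read and place elements
r = a.reshape((3, 2), order='F')

# shape and copy: flatten always copies, reading in the given index order
flat = a.flatten(order='F')

# memory: order= selects the memory layout of the result; values and
# indexing stay the same
m = np.array(a, order='F')

print(r.tolist())                 # [[0, 4], [3, 2], [1, 5]]
print(flat.tolist())              # [0, 3, 1, 4, 2, 5]
print(m.flags['F_CONTIGUOUS'])    # True
print(np.array_equal(m, a))       # True
```

So 'F' means an index order in the first two calls and a memory layout in
the third; the docstring or the context has to say which one applies.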

Josef

>
> Cheers,
>
> Matthew
