[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

Wed Apr 3 21:13:47 EDT 2013

Hi,

On Wed, Apr 3, 2013 at 11:44 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Wed, Apr 3, 2013 at 8:52 AM, Chris Barker - NOAA Federal
> <chris.barker at noaa.gov> wrote:
>> On Wed, Apr 3, 2013 at 6:24 AM, Sebastian Berg
>> <sebastian at sipsolutions.net> wrote:
>>>> the context where it gets applied. So giving the same strategy two
>>>> different names is silly; if anything it's the contexts that should
>>>> have different names.
>>>>
>>>
>>> Yup, thats how I think about it too...
>>
>> me too...
>>
>>> But I would really love if someone would try to make the documentation
>>> simpler!
>>
>> yes, I think this is where the solution lies.
>
> No question that better docs would be an improvement, let's all agree on that.
>
> We all agree that 'order' is used with two different and orthogonal
> meanings in numpy.
>
> I think we are now more or less agreeing that:
>
> np.reshape(a, (3, 4), index_order='F')
>
> is at least as clear as:
>
> np.reshape(a, (3, 4), order='F')

I believe our job here is to come to some consensus.

In that spirit, I think we do agree on these statements above.
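
To make the two meanings concrete, here is a minimal sketch using only
the existing 'order' keyword (the variable names are just for
illustration). In np.reshape, 'order' selects how elements are indexed;
in np.array, it selects the memory layout and leaves the values seen
through indexing unchanged.

```python
import numpy as np

a = np.arange(12)

# Meaning 1: index ordering. The keyword decides which index varies
# fastest when the elements of `a` are placed into the new shape.
c_idx = np.reshape(a, (3, 4), order='C')  # last index varies fastest
f_idx = np.reshape(a, (3, 4), order='F')  # first index varies fastest

# Meaning 2: memory ordering. The keyword decides how the copy is laid
# out in memory; the values seen through indexing do not change.
c_mem = np.array(c_idx, order='C')
f_mem = np.array(c_idx, order='F')

print(c_idx[0])                          # first row under C index order
print(f_idx[0])                          # first row under F index order
print(np.array_equal(c_mem, f_mem))      # same values, different layout
print(f_mem.flags['F_CONTIGUOUS'])       # layout is Fortran-contiguous

# Raveling with C index order gives the same result for both layouts,
# because index order is independent of memory order.
print(np.array_equal(np.ravel(c_mem, order='C'),
                     np.ravel(f_mem, order='C')))
```

The same letter 'F' changes the values in the first case and only the
layout in the second, which is the distinction under discussion.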

Now we have the cost / benefit.

Benefit : Some people may find it easier to understand numpy when
these constructs are separated.

Cost : There might be some confusion because we have changed the
name of the keyword.

Benefit
-----------

What proportion of people would find it easier to understand with the
order constructs separated?   Chris, Josef and Sebastian: I think you
estimate no change in your own understanding, because it was nearly
complete already.

At least I, Paul Ivanov and JB Poline found the current state strikingly
confusing.   I think we have other votes for that position here.  It's
difficult to estimate the proportions now because my original email
and the subsequent discussion are based on the distinction already
being made.  So, it is hard for us to be objective about whether a new
user is likely to get confused.  At least it seems reasonable to say
that some moderate proportion of users will get confused.

In that situation, it seems to me the benefit of separating these
ideas is relatively high, and it will continue over the long term.

Cost
-------

The ravel docstring would look something like this:

index_order : {'C', 'F', 'A', 'K'}, optional
    ...   This keyword used to be called simply 'order', and you can
    also use the keyword 'order' to specify index_order (this parameter).

The problem would then be that, for a while, there will be older code
and docs using 'order' instead of 'index_order'.  I think this would
not cause much trouble.  Reading the docstring will explain the
change.  The old code will continue to work.
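
As a rough sketch of how the old keyword could keep working, assuming
we simply accept both names: the wrapper below is hypothetical
('ravel_compat' is a made-up name, and 'index_order' is the proposed
keyword, not something np.ravel accepts today). It forwards to the
existing np.ravel.

```python
import numpy as np


def ravel_compat(a, index_order=None, order=None):
    """Hypothetical wrapper, not part of numpy.

    Accepts the proposed 'index_order' keyword and the existing 'order'
    keyword as synonyms, then calls the real np.ravel.
    """
    if index_order is not None and order is not None:
        raise TypeError("specify only one of 'index_order' and 'order'")
    chosen = index_order if index_order is not None else order
    if chosen is None:
        chosen = 'C'  # same default as np.ravel
    return np.ravel(a, order=chosen)


a = np.arange(6).reshape(2, 3)

new_style = ravel_compat(a, index_order='F')  # proposed spelling
old_style = ravel_compat(a, order='F')        # existing spelling
print(np.array_equal(new_style, old_style))
```

Whether passing both keywords should raise an error or quietly prefer
one of them is a design choice; raising is just the assumption made in
this sketch.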

This cost will decrease to zero over time.

So, if we are planning for the long-term for numpy, I believe the
benefit to the change considerably outweighs the cost.

I'm happy to do the code changes, so that's not an issue.

Cheers,

Matthew