[Numpy-discussion] C vs. Fortran order -- misleading documentation?

Tue Jun 8 14:36:07 EDT 2010

On 06/08/2010 08:16 AM, Eric Firing wrote:
> On 06/08/2010 05:50 AM, Charles R Harris wrote:
>>
>>
>> On Tue, Jun 8, 2010 at 9:39 AM, David Goldsmith
>> <d.l.goldsmith at gmail.com> wrote:
>>
>>      On Tue, Jun 8, 2010 at 8:27 AM, Pavel Bazant
>>      <MaxPlanck at seznam.cz> wrote:
>>
>>
>>           >  >  Correct me if I am wrong, but the paragraph
>>           >  >
>>           >  >  Note to those used to IDL or Fortran memory order as it relates to
>>           >  >  indexing. Numpy uses C-order indexing. That means that the last index
>>           >  >  usually (see xxx for exceptions) represents the most rapidly changing memory
>>           >  >  location, unlike Fortran or IDL, where the first index represents the most
>>           >  >  rapidly changing location in memory. This difference represents a great
>>           >  >  potential for confusion.
>>           >  >
>>           >  >  in
>>           >  >
>>           >  >  http://docs.scipy.org/doc/numpy/user/basics.indexing.html
>>           >  >
>>           >  >  is quite misleading, as C-order means that the last index changes rapidly,
>>           >  >  not the memory location.
>>           >  >
>>           >  >
>>           >  Any index can change rapidly, depending on whether it is in an inner loop or
>>           >  not. The important distinction between C and Fortran order is how indices
>>           >  translate to memory locations. The documentation seems correct to me,
>>           >  although it might make more sense to say the last index addresses a
>>           >  contiguous range of memory. Of course, with modern processors, actual
>>           >  physical memory can be mapped all over the place.
>>           >
>>           >  Chuck
>>
>>          To me, saying that the last index represents the most rapidly changing memory
>>          location means that if I change the last index, the memory location changes
>>          a lot, which is not true for C-order. So for C-order, supposing one scans the
>>          memory linearly (the desired scenario), it is the last *index* that changes
>>          most rapidly.
>>
>>          The inverted picture looks like this: For C-order, changing the first index
>>          leads to the most rapid jump in *memory*.
>>
>>          I still have the feeling the doc is very misleading on this important issue.
>>
>>          Pavel
>>
>>
>>      The distinction between your two perspectives is that one is using
>>      for-loop traversal of indices, the other is using pointer-increment
>>      traversal of memory; from each of your perspectives, your
>>      conclusions are "correct," but my inclination is that the
>>      pointer-increment traversal of memory perspective is closer to the
>>      "spirit" of the docstring, no?
>>
>>
>> I think the confusion is in "most rapidly changing memory location",
>> which is kind of ambiguous because a change in the indices is always a
>> change in memory location if one hasn't used index tricks and such. So
>> from a time perspective it means nothing, while from a memory
>> perspective the largest address changes come from the leftmost indices.
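
The "memory perspective" can be checked directly from an array's
strides. A minimal sketch, assuming a small int32 array in the default
C order; byte_offset is a hypothetical helper written only for this
illustration, not a NumPy function:

```python
import numpy as np

# A 3x4 array of 4-byte integers, stored in C order (the default).
a = np.zeros((3, 4), dtype=np.int32)

# strides[k] is the number of bytes to step when index k grows by one.
strides = a.strides


def byte_offset(arr, index):
    """Hypothetical helper: byte offset of arr[index] from arr[0, 0, ...]."""
    return sum(i * s for i, s in zip(index, arr.strides))


# Incrementing the last index moves to the adjacent element in memory;
# incrementing the first index jumps over a whole row.
step_last = byte_offset(a, (0, 1)) - byte_offset(a, (0, 0))
step_first = byte_offset(a, (1, 0)) - byte_offset(a, (0, 0))

print(strides, step_last, step_first)
```

Here the step for the last index equals the item size, while the step
for the first index equals the size of a full row, which is the sense
in which the leftmost index produces the largest address change.
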
>
> Exactly.  Rate of change with respect to what, or as you do what?
>
> I suggest something like the following wording, if you don't mind the
> verbosity as a means of conjuring up an image (although putting in
> diagrams would make it even clearer--undoubtedly there are already good
> illustrations somewhere on the web):
>
> ------------
>
> Note to those used to Matlab, IDL, or Fortran memory order as it relates
> to indexing. Numpy uses C-order indexing by default, although a numpy
> array can be designated as using Fortran order. [With C-order,
> sequential memory locations are accessed by incrementing the last

Maybe change "sequential" to "contiguous".

> index.]  For a two-dimensional array, think of it as a table.  With
> C-order indexing the table is stored as a series of rows, so that one is
> reading from left to right, incrementing the column (last) index, and
> jumping ahead in memory to the next row by incrementing the row (first)
> index. With Fortran order, the table is stored as a series of columns,
> so one reads memory sequentially from top to bottom, incrementing the
> first index, and jumps ahead in memory to the next column by
> incrementing the last index.
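
The table picture can be tried out in a few lines. This is only a
sketch: the array values are made up, and it relies on
ravel(order="K") to read the elements back in the order they sit in
memory.

```python
import numpy as np

# A small 2x3 "table": two rows, three columns.
table = np.array([[1, 2, 3],
                  [4, 5, 6]])

c = np.ascontiguousarray(table)  # stored as a series of rows
f = np.asfortranarray(table)     # stored as a series of columns

# Indexing is identical for both; only the memory layout differs.
same_element = (c[0, 2] == f[0, 2] == 3)

# order="K" returns the elements in the order they occur in memory.
c_memory = c.ravel(order="K").tolist()
f_memory = f.ravel(order="K").tolist()

print(c_memory)
print(f_memory)
```

With C order the rows come out one after another; with Fortran order
the columns do, even though c[i, j] and f[i, j] always agree.
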
>
> One more difference to be aware of: numpy, like Python, C, and IDL, uses
> zero-based indexing; Matlab and Fortran (by default) start from one.
>
> -----------------
>
> If you want to keep it short, the key wording is in the sentence in
> brackets, and you can chop out the table illustration.
>
> Eric
>
>
>>
>> Chuck
>>
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion