[Numpy-discussion] Change default order to Fortran order

Mon Aug 3 08:02:15 EDT 2015

On Mon Aug 3 10:49:35 2015 GMT+0200, Matthew Brett wrote:
> Hi,
> 
> On Mon, Aug 3, 2015 at 8:09 AM, Nathaniel Smith <njs at pobox.com> wrote:
> > On Aug 2, 2015 11:06 PM, "Kang Wang" <kwang24 at wisc.edu> wrote:
> >>
> >> This is very good discussion. Thank you all for replying.
> >>
> >> I can see the fundamental difference is that I always
> >> think/talk/read/write a 3D image as I(x, y, z), not (plane, row, column) . I
> >> am coming from MRI (magnetic resonance imaging) research, and I can assure
> >> you that the entire MRI community is using (x, y, z), including books,
> >> journal papers, conference abstracts, presentations, everything. We even
> >> talk about what we called "logical x/y/z" and "physical x/y/z", and the
> >> rotation matrix that converts the two coordinate systems. The radiologists
> >> are also used to (x, y, z). For example, we always say "my image is 256 by
> >> 256 by 20 slices", and we never say "20 by 256 by 256".
> >>
> >> So, basically, at least in MRI, we always talk about an image as I(x, y,
> >> z), and we always assume that "x" is the fastest changing index. That's why
> >> I prefer column-major (because it is more natural).
> >>
> >> Of course, I can totally get my work done by using row-major, I just have
> >> to always remind myself "write last dimension index first" when coding. I
> >> actually have done this before, and I found it would be so much easier if
> >> just using column-major.
> >
> > Why not just use I[x, y, z] like you're used to, and let the computer worry
> > about the physical layout in memory? Sometimes this will be Fortran order
> > and sometimes C order and sometimes something else, but you don't have to
> > know or care; 99% of the time it doesn't matter. The worst case is that when
> > you use a python wrapper to call into a library that can only handle Fortran
> > order, then the wrapper will quietly have to convert the memory order around
> > and it will be slightly slower than if you had used Fortran order in the
> > first place. But in practice you'll barely ever notice this, and when you
> > do, *then* you can tell numpy explicitly what memory layout you want in the
> > situation where it matters.
> 
> Yes - if you are using numpy, you really have to look numpy in the eye and say:
> 
> "I will let you worry about the array element order in memory, and in
> return, you promise to make indexing work as I would expect"
> 
> Just for example, let's say you loaded an MRI image into memory:
> 
> In [1]: import nibabel
> In [2]: img = nibabel.load('my_mri.nii')
> In [3]: data = img.get_data()
> 
> Because NIfTI images are Fortran memory layout, this happens to be the
> memory layout you get for your array:
> 
> In [4]: data.flags
> Out[4]:
>   C_CONTIGUOUS : False
>   F_CONTIGUOUS : True
>   OWNDATA : False
>   WRITEABLE : True
>   ALIGNED : True
>   UPDATEIFCOPY : False
> 
> But now - in Python - all I care about is what data I have on the
> first, second, third axes.  For example, I could do this:
> 
> In [5]: data_copy = data.copy()
> 
> This has exactly the same values as the original array, and at the
> same index positions:
> 
> In [7]: import numpy as np
> In [8]: np.all(data == data)
> Out[8]: memmap(True, dtype=bool)
> 
> but I now have a C memory layout array.
> 
> In [9]: data_copy.flags
> Out[9]:
>   C_CONTIGUOUS : True
>   F_CONTIGUOUS : False
>   OWNDATA : True
>   WRITEABLE : True
>   ALIGNED : True
>   UPDATEIFCOPY : False
>

Yeah, I would like to second those arguments. Most of the time, there is no need to worry about layout. For large chunks you allocate, it may make sense for speed, etc. So you can alias creation functions.  Generally, I would suggest to simply not worry about the memory layout. Also do not *trust* the layout for most function returns. If you need a specific layout to interface other code, always check what you got it.

-Sebastian 

> Worse than that, if I slice my original data array, then I get an
> array that is neither C- or Fortran- compatible in memory:
> 
> In [10]: data_view = data[:, :, ::2]
> In [11]: data_view.flags
> Out[11]:
>   C_CONTIGUOUS : False
>   F_CONTIGUOUS : False
>   OWNDATA : False
>   WRITEABLE : True
>   ALIGNED : True
>   UPDATEIFCOPY : False
> 
> So - if you want every array to be Fortran-contiguous in memory, I
> would not start with numpy at all, I would write your own array
> library.
> 
> The alternative - or "the numpy way" - is to give up on enforcing a
> particular layout in memory, until you need to pass an array to some C
> or C++ or Fortran code that needs some particular layout, in which
> case you get your extension code to copy the array into the required
> layout on entry.   Of course this is what numpy itself has to do when
> interfacing with external libraries like BLAS or LAPACK.
> 
> Cheers,
> 
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> 
>