From chris.barker at noaa.gov Sat Aug 1 18:55:02 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Sat, 1 Aug 2015 15:55:02 -0700 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: <1278336859460085079.353193sturla.molden-gmail.com@news.gmane.org> References: <55B25F1A.70107@googlemail.com> <1281473095460055826.208120sturla.molden-gmail.com@news.gmane.org> <1051175736349575057@unknownmsgid> <1278336859460085079.353193sturla.molden-gmail.com@news.gmane.org> Message-ID: <-2868036826153660775@unknownmsgid> >> Turns out I was passing in numpy arrays that I had typed as "np.int". >> It worked OK two years ago when I was testing only on 32 bit pythons, >> but today I got a bunch of failed tests on 64 bit OS-X -- a np.int is >> now a C long! > > It has always been C long. It is the C long that varies between platforms. Of course, it's that a c long was a c int on the platform I wrote the code on the first time. Which is part of the problem with C -- if two types happen to be the same, the compiler is perfectly happy. But that was an error in the first place, it never should have passed. But that's just me. ;-) Anyway, as far as concrete proposals go. I say we deprecate the Python types in the numpy namespace (i.e int and float) Other than that, I'm not sure there's any problem. -Chris > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sat Aug 1 19:51:16 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 1 Aug 2015 17:51:16 -0600 Subject: [Numpy-discussion] Branching 1.10 Sunday, Aug 2. Message-ID: Hi All, Just a heads up. If anything absolutely needed has been left out, please make a noise. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Aug 1 19:52:47 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 1 Aug 2015 23:52:47 +0000 (UTC) Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? References: <55B25F1A.70107@googlemail.com> <1281473095460055826.208120sturla.molden-gmail.com@news.gmane.org> <1051175736349575057@unknownmsgid> <1278336859460085079.353193sturla.molden-gmail.com@news.gmane.org> <-2868036826153660775@unknownmsgid> Message-ID: <1157332328460165630.410393sturla.molden-gmail.com@news.gmane.org> Chris Barker - NOAA Federal wrote: > Which is part of the problem with C -- if two types happen to be the > same, the compiler is perfectly happy. That int and long int be the same is not more problematic than int and signed int be the same. Sturla From charlesr.harris at gmail.com Sun Aug 2 01:08:10 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 1 Aug 2015 23:08:10 -0600 Subject: [Numpy-discussion] mailmap update Message-ID: Hi All, I'm trying to update the .mailmap file on github and could use some help. The current version seems common to both numpy and scipy, hence the crosspost. Here is what I've got so far. 
Alex Griffing ncsu.edu> alex ncsu.edu> Alex Griffing ncsu.edu> argriffing ncsu.edu> Alex Griffing ncsu.edu> argriffing users.noreply.github.com> Behzad Nouri gmail.com> behzad nouri gmail.com> Carl Kleffner gmail.com> carlkl gmail.com> Christoph Gohlke uci.edu> Christolph Gohlke uci.edu> Christoph Gohlke uci.edu> cgholke ?> Christoph Gohlke uci.edu> cgohlke uci.edu> Han Genuit gmail.com> Han gmail.com> Jaime Fernandez gmail.com> Jaime gmail.com > Jaime Fernandez gmail.com> jaimefrio gmail.com> Mark Wiebe gmail.com> Mark gmail.com> Mark Wiebe gmail.com> Mark Wiebe enthought.com> Mark Wiebe gmail.com> Mark Wiebe georg.(none)> Nathaniel J. Smith pobox.com> njsmith pobox.com> Ond?ej ?ert?k gmail.com> Ondrej Certik gmail.com> Ralf Gommers googlemail.com> rgommers googlemail.com> Saullo Giovani gmail.com> saullogiovani gmail.com> Sebastian Berg sipsolutions.net> seberg sipsolutions.net> Anon abdulmuneer gmail.com> Anon amir gmail.com> Anon cel gmail.com> Anon chebee7i gmail.com> Anon empeeu yahoo.com> Anon endolith gmail.com> Anon hannaro gmx.net> Anon hpaulj myuw.net> Anon immerrr gmail.com> Anon jmrosen155 Jordans-MacBook-Pro.local> Anon jnothman student.usyd.edu.au> Anon kanhua gmail.com> Anon mamikony sig.com> Anon mbyt web.de> Anon mlai begws92.beg.utexas.edu> Anon ryanblak gmail.com> Anon styr gmail.com> Anon tdihp hotmail.com> Anon tpoole gmail.com> Anon wim glenn melbourneit.com.au> The Anon author is just a standing in for unknown author. I can make a guess at some of those, but would prefer it if the people in question could supply their proper name and address. TIA, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From t.b.poole at gmail.com Sun Aug 2 05:04:07 2015 From: t.b.poole at gmail.com (Tom Poole) Date: Sun, 2 Aug 2015 10:04:07 +0100 Subject: [Numpy-discussion] mailmap update In-Reply-To: References: Message-ID: <3EC93933-DC4B-497E-B052-FE359CEA632F@gmail.com> Hi Chuck, Tom Poole gmail.com > tpoole gmail.com > Tom > On 2 Aug 2015, at 06:08, Charles R Harris wrote: > > Hi All, > > I'm trying to update the .mailmap file on github and could use some help. The current version seems common to both numpy and scipy, hence the crosspost. Here is what I've got so far. > > Alex Griffing ncsu.edu > alex ncsu.edu > > Alex Griffing ncsu.edu > argriffing ncsu.edu > > Alex Griffing ncsu.edu > argriffing users.noreply.github.com > > Behzad Nouri gmail.com > behzad nouri gmail.com > > Carl Kleffner gmail.com > carlkl gmail.com > > Christoph Gohlke uci.edu > Christolph Gohlke uci.edu > > Christoph Gohlke uci.edu > cgholke ?> > Christoph Gohlke uci.edu > cgohlke uci.edu > > Han Genuit gmail.com > Han gmail.com > > Jaime Fernandez gmail.com > Jaime gmail.com > > Jaime Fernandez gmail.com > jaimefrio gmail.com > > Mark Wiebe gmail.com > Mark gmail.com > > Mark Wiebe gmail.com > Mark Wiebe enthought.com > > Mark Wiebe gmail.com > Mark Wiebe georg.(none)> > Nathaniel J. 
Smith pobox.com > njsmith pobox.com > > Ond?ej ?ert?k gmail.com > Ondrej Certik gmail.com > > Ralf Gommers googlemail.com > rgommers googlemail.com > > Saullo Giovani gmail.com > saullogiovani gmail.com > > Sebastian Berg sipsolutions.net > seberg sipsolutions.net > > > Anon > abdulmuneer gmail.com > > Anon > amir gmail.com > > Anon > cel gmail.com > > Anon > chebee7i gmail.com > > Anon > empeeu yahoo.com > > Anon > endolith gmail.com > > Anon > hannaro gmx.net > > Anon > hpaulj myuw.net > > Anon > immerrr gmail.com > > Anon > jmrosen155 Jordans-MacBook-Pro.local> > Anon > jnothman student.usyd.edu.au > > Anon > kanhua gmail.com > > Anon > mamikony sig.com > > Anon > mbyt web.de > > Anon > mlai begws92.beg.utexas.edu > > Anon > ryanblak gmail.com > > Anon > styr gmail.com > > Anon > tdihp hotmail.com > > Anon > tpoole gmail.com > > Anon > wim glenn melbourneit.com.au > > > The Anon author is just a standing in for unknown author. I can make a guess at some of those, but would prefer it if the people in question could supply their proper name and address. > > TIA, > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From sturla.molden at gmail.com Sun Aug 2 08:13:14 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 02 Aug 2015 14:13:14 +0200 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: <55BB2611.10003@googlemail.com> References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> Message-ID: On 31/07/15 09:38, Julian Taylor wrote: > A long is only machine word wide on posix, in windows its not. Actually it is the opposite. A pointer is 64 bit on AMD64, but the native integer and pointer offset is only 32 bit. But it does not matter because it is int that should be machine word sized, not long, which it is on both platforms. Sturla From kwang24 at wisc.edu Sun Aug 2 09:55:54 2015 From: kwang24 at wisc.edu (Kang Wang) Date: Sun, 02 Aug 2015 08:55:54 -0500 Subject: [Numpy-discussion] Change default order to Fortran order Message-ID: <7740864542dd.55bddb1a@wiscmail.wisc.edu> Hi, I am an imaging researcher, and a new Python user. My first Python project is to somehow modify NumPy source code such that everything is Fortran column-major by default. I read about the information in the link below, but for us, the fact is that?we absolutely want to use Fortran column major, and we want to make it default. Explicitly writing " order = 'F' " all over the place is not acceptable to us. http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues I tried searching in this email list, as well as google search in general. However, I have not found anything useful. This must be a common request/need, I believe. Can anyone provide any insight/help? Thank you very much, Kang -- Kang Wang, Ph.D. 1111 Highland Ave., Room 1113 Madison, WI 53705-2275 ---------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sturla.molden at gmail.com Sun Aug 2 10:27:08 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 02 Aug 2015 16:27:08 +0200 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <7740864542dd.55bddb1a@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> Message-ID: On 02/08/15 15:55, Kang Wang wrote: > Can anyone provide any insight/help? There is no "default order". There was before, but now all operators control the order of their return arrays from the order of their input array. The only thing that makes C order "default" is the keyword argument to np.empty, np.ones and np.zeros. Just monkey patch those functions and it should be fine. Sturla From sebastian at sipsolutions.net Sun Aug 2 13:19:43 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 2 Aug 2015 17:19:43 +0000 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> Message-ID: Well, numpy has a tendency to prefer C order. There is nothing you can do about that really. But you just cannot be sure what you get in some cases. Often you need something specific for interfaceing other code. But in that case quite often you also do not need to fear the copy. - Sebastian On Sun Aug 2 16:27:08 2015 GMT+0200, Sturla Molden wrote: > On 02/08/15 15:55, Kang Wang wrote: > > > Can anyone provide any insight/help? > > There is no "default order". There was before, but now all operators > control the order of their return arrays from the order of their input > array. The only thing that makes C order "default" is the keyword > argument to np.empty, np.ones and np.zeros. Just monkey patch those > functions and it should be fine. > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From kwang24 at wisc.edu Sun Aug 2 16:14:15 2015 From: kwang24 at wisc.edu (Kang Wang) Date: Sun, 02 Aug 2015 15:14:15 -0500 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <772087c3198ed4.55be79da@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <74b0de5619c9ee.55be77b0@wiscmail.wisc.edu> <77208f8c19abe0.55be77ed@wiscmail.wisc.edu> <7740c3e01988b3.55be782b@wiscmail.wisc.edu> <7740821519faf2.55be7868@wiscmail.wisc.edu> <74b0aa0e19ff5a.55be78a5@wiscmail.wisc.edu> <7610d2e119e175.55be78e3@wiscmail.wisc.edu> <76b0a897198bde.55be7921@wiscmail.wisc.edu> <7450811719dc7d.55be795e@wiscmail.wisc.edu> <75e0cf9619906d.55be799c@wiscmail.wisc.edu> <772087c3198ed4.55be79da@wiscmail.wisc.edu> Message-ID: <7720ae8f19c5b1.55be33c7@wiscmail.wisc.edu> Thank you all for replying! I did a quick test, using python 2.6.6, and the original numpy package on my Linux computer without any change.== x = np.zeros((2,3),dtype=np.int32,order='F') print "x.strides =" print x.strides y = x + 1 print "y.strides =" print y.strides == Output: -------- x.strides = (4, 8) y.strides = (12, 4) -------- So, basically, "x" is Fortran-style column-major (because I explicitly write order='F'), but "y" is C-style row-major. This is going to be very annoying. What I really want is: - I do not have to write order='F' explicitly when declaring "x" - both "x" and "y" are Fortran-style column-major Which file should I modify to achieve this goal? 
Right now, I am just trying to get some basic stuff working with all arrays default to Fortran-style, and I can worry about interfacing with other code/libraries later. Thanks, Kang On 08/02/15, Sebastian Berg wrote: > Well, numpy has a tendency to prefer C order. There is nothing you can do about that really. But you just cannot be sure what you get in some cases. > Often you need something specific for interfaceing other code. But in that case quite often you also do not need to fear the copy. > > - Sebastian > > > On Sun Aug 2 16:27:08 2015 GMT+0200, Sturla Molden wrote: > > On 02/08/15 15:55, Kang Wang wrote: > > > > > Can anyone provide any insight/help? > > > > There is no "default order". There was before, but now all operators > > control the order of their return arrays from the order of their input > > array. The only thing that makes C order "default" is the keyword > > argument to np.empty, np.ones and np.zeros. Just monkey patch those > > functions and it should be fine. > > > > Sturla > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Kang Wang, Ph.D. 1111 Highland Ave., Room 1113 Madison, WI 53705-2275 TEL 608-263-0066 http://www.medphysics.wisc.edu/~kang/ ---------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sun Aug 2 16:22:01 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 02 Aug 2015 22:22:01 +0200 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <7720ae8f19c5b1.55be33c7@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <74b0de5619c9ee.55be77b0@wiscmail.wisc.edu> <77208f8c19abe0.55be77ed@wiscmail.wisc.edu> <7740c3e01988b3.55be782b@wiscmail.wisc.edu> <7740821519faf2.55be7868@wiscmail.wisc.edu> <74b0aa0e19ff5a.55be78a5@wiscmail.wisc.edu> <7610d2e119e175.55be78e3@wiscmail.wisc.edu> <76b0a897198bde.55be7921@wiscmail.wisc.edu> <7450811719dc7d.55be795e@wiscmail.wisc.edu> <75e0cf9619906d.55be799c@wiscmail.wisc.edu> <772087c3198ed4.55be79da@wiscmail.wisc.edu> <7720ae8f19c5b1.55be33c7@wiscmail.wisc.edu> Message-ID: On 02/08/15 22:14, Kang Wang wrote: > Thank you all for replying! > > I did a quick test, using python 2.6.6, and the original numpy package > on my Linux computer without any change. > == > x = np.zeros((2,3),dtype=np.int32,order='F') > print "x.strides =" > print x.strides > > y = x + 1 > print "y.strides =" > print y.strides > == > > Output: > -------- > x.strides = > (4, 8) > y.strides = > (12, 4) > -------- Update NumPy. This is the behavior I talked about that has changed. 
Now NumPy does this: In [21]: x = np.zeros((2,3),dtype=np.int32,order='F') In [22]: y = x + 1 In [24]: x.strides Out[24]: (4, 8) In [25]: y.strides Out[25]: (4, 8) Sturla From bryanv at continuum.io Sun Aug 2 16:28:24 2015 From: bryanv at continuum.io (Bryan Van de Ven) Date: Sun, 2 Aug 2015 15:28:24 -0500 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <74b0de5619c9ee.55be77b0@wiscmail.wisc.edu> <77208f8c19abe0.55be77ed@wiscmail.wisc.edu> <7740c3e01988b3.55be782b@wiscmail.wisc.edu> <7740821519faf2.55be7868@wiscmail.wisc.edu> <74b0aa0e19ff5a.55be78a5@wiscmail.wisc.edu> <7610d2e119e175.55be78e3@wiscmail.wisc.edu> <76b0a897198bde.55be7921@wiscmail.wisc.edu> <7450811719dc7d.55be795e@wiscmail.wisc.edu> <75e0cf9619906d.55be799c@wiscmail.wisc.edu> <772087c3198ed4.55be79da@wiscmail.wisc.edu> <7720ae8f19c5b1.55be33c7@wiscmail.wisc.edu> Message-ID: <83FD48B4-2CF1-4D9E-AED4-FA0F17A07722@continuum.io> And to eliminate the order kwarg, use functools.partial to patch the zeros function (or any others, as needed): In [26]: import numpy as np In [27]: from functools import partial In [28]: np.zeros = partial(np.zeros, order="F") In [29]: x = np.zeros((2,3), dtype=np.int32) In [30]: y = x + 1 In [31]: x.strides Out[31]: (4, 8) In [32]: y.strides Out[32]: (4, 8) In [33]: np.__version__ Out[33]: '1.9.2' Bryan > On Aug 2, 2015, at 3:22 PM, Sturla Molden wrote: > > On 02/08/15 22:14, Kang Wang wrote: >> Thank you all for replying! >> >> I did a quick test, using python 2.6.6, and the original numpy package >> on my Linux computer without any change. >> == >> x = np.zeros((2,3),dtype=np.int32,order='F') >> print "x.strides =" >> print x.strides >> >> y = x + 1 >> print "y.strides =" >> print y.strides >> == >> >> Output: >> -------- >> x.strides = >> (4, 8) >> y.strides = >> (12, 4) >> -------- > > Update NumPy. This is the behavior I talked about that has changed. > > Now NumPy does this: > > > In [21]: x = np.zeros((2,3),dtype=np.int32,order='F') > > In [22]: y = x + 1 > > In [24]: x.strides > Out[24]: (4, 8) > > In [25]: y.strides > Out[25]: (4, 8) > > > > Sturla > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sturla.molden at gmail.com Sun Aug 2 16:46:50 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 02 Aug 2015 22:46:50 +0200 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <83FD48B4-2CF1-4D9E-AED4-FA0F17A07722@continuum.io> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <74b0de5619c9ee.55be77b0@wiscmail.wisc.edu> <77208f8c19abe0.55be77ed@wiscmail.wisc.edu> <7740c3e01988b3.55be782b@wiscmail.wisc.edu> <7740821519faf2.55be7868@wiscmail.wisc.edu> <74b0aa0e19ff5a.55be78a5@wiscmail.wisc.edu> <7610d2e119e175.55be78e3@wiscmail.wisc.edu> <76b0a897198bde.55be7921@wiscmail.wisc.edu> <7450811719dc7d.55be795e@wiscmail.wisc.edu> <75e0cf9619906d.55be799c@wiscmail.wisc.edu> <772087c3198ed4.55be79da@wiscmail.wisc.edu> <7720ae8f19c5b1.55be33c7@wiscmail.wisc.edu> <83FD48B4-2CF1-4D9E-AED4-FA0F17A07722@continuum.io> Message-ID: On 02/08/15 22:28, Bryan Van de Ven wrote: > And to eliminate the order kwarg, use functools.partial to patch the zeros function (or any others, as needed): This will probably break code that depends on NumPy, like SciPy and scikit-image. But if NumPy is all that matters, sure go ahead and monkey patch. 
Otherwise keep the patched functions in another namespace. :-) Sturla From njs at pobox.com Sun Aug 2 18:08:19 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 2 Aug 2015 22:08:19 +0000 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <7740864542dd.55bddb1a@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> Message-ID: On Aug 2, 2015 6:59 AM, "Kang Wang" wrote: > > Hi, > > I am an imaging researcher, and a new Python user. My first Python project is to somehow modify NumPy source code such that everything is Fortran column-major by default. > > I read about the information in the link below, but for us, the fact is that we absolutely want to use Fortran column major, and we want to make it default. Explicitly writing " order = 'F' " all over the place is not acceptable to us. > http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues > > I tried searching in this email list, as well as google search in general. However, I have not found anything useful. This must be a common request/need, I believe. It isn't, I'm afraid. Basically what you're signing up for is to maintain your own copy of numpy all by yourself. You're totally within your rights to do this, but it isn't something I would really recommend as a first python project (to put it mildly). And unfortunately, there are plenty of libraries out there that use numpy and assume they will get C order by default, so your version of numpy will create lots of obscure errors, segfaults, etc. as you start using it with other packages. Obviously this will be a problem for you -- basically you may find yourself having to maintain your own copy of lots of libraries. Less obviously, this would also create a big problem for us, because your users will start filling bug reports on numpy, or on these random third party packages, and it will be massively confusing and a big waste of time because the problem will be with your package, not with any of our code. So if you do do this, please either (a) change the name of your package somehow ('import numpyfortran' or similar) so that everyone using it is clear that it's a non-standard product, or else (b) make sure that you only use it within your own team, don't allow anyone else to use it, and make a rule that no one is allowed to file bug reports, or ask or answer questions on mailing lists or stackoverflow, unless they have first double checked *every* time that what they're saying is also valid when using regular numpy. Again, I strongly recommend you not do this. There are literally millions of users who are using numpy as it currently is, and able to get stuff done. I don't know your specific situation, but maybe if you describe a bit more what it is you're doing and why you think you need all-Fortran-all-the-time, then people will be able to suggest strategies to work around things on your end, or find smaller tweaks to numpy that could go into the standard version. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sun Aug 2 18:17:51 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 2 Aug 2015 22:17:51 +0000 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <7720ae8f19c5b1.55be33c7@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <74b0de5619c9ee.55be77b0@wiscmail.wisc.edu> <77208f8c19abe0.55be77ed@wiscmail.wisc.edu> <7740c3e01988b3.55be782b@wiscmail.wisc.edu> <7740821519faf2.55be7868@wiscmail.wisc.edu> <74b0aa0e19ff5a.55be78a5@wiscmail.wisc.edu> <7610d2e119e175.55be78e3@wiscmail.wisc.edu> <76b0a897198bde.55be7921@wiscmail.wisc.edu> <7450811719dc7d.55be795e@wiscmail.wisc.edu> <75e0cf9619906d.55be799c@wiscmail.wisc.edu> <772087c3198ed4.55be79da@wiscmail.wisc.edu> <7720ae8f19c5b1.55be33c7@wiscmail.wisc.edu> Message-ID: On Aug 2, 2015 1:17 PM, "Kang Wang" wrote: > > Thank you all for replying! > > I did a quick test, using python 2.6.6, There's pretty much no good reason these days to be using python 2.6 (which was released in *2008*). I assume you're using it because you're using redhat or some redhat derivative, and that's what they ship by default? Even redhat engineers officially recommend that users *not* use the default python -- it's basically only intended for use by their own built-in system management scripts. If you're just getting started with python, then at this point I'd recommend starting with python 3.4. Some easy ways to get this installed: - Anaconda: the most popular scientific python distribution -- you pretty much just download one file and get a full, up to date setup of python and all the main scientific packages, in your home directory. Supported on all popular platforms. Trivial to use and requires no special permissions. http://continuum.io/downloads#py34 - One of Anaconda's competitors: http://www.scipy.org/install.html - Software collections: redhat's official way to do things like this: https://www.softwarecollections.org/en/ -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From sank.daniel at gmail.com Sun Aug 2 18:24:51 2015 From: sank.daniel at gmail.com (Daniel Sank) Date: Sun, 2 Aug 2015 15:24:51 -0700 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> Message-ID: Could you please explain why you need 'F' ordering? It's pretty unlikely that you actually care about the internal memory layout, and you'll get better advice if you explain why you think you do care. > My first Python project is to somehow modify NumPy source > code such that everything is Fortran column-major by default. This is the road to pain. You'll have to maintain your own fork and will probably inject bugs when trying to rewrite. Nobody will want to help fix them because everyone else just uses numpy as is. > And to eliminate the order kwarg, use functools.partial to patch the > zeros function (or any others, as needed): Instead of monkey patching, why not just define your own shims: fortran_zeros = partial(np.zeros(order='F')) Seems like this would lead to a lot less confusion (although until you tell us why you care about the in-memory layout I don't know the point of doing this at all). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sun Aug 2 18:52:50 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 2 Aug 2015 22:52:50 +0000 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> Message-ID: On Aug 2, 2015 7:30 AM, "Sturla Molden" wrote: > > On 02/08/15 15:55, Kang Wang wrote: > > > Can anyone provide any insight/help? > > There is no "default order". There was before, but now all operators > control the order of their return arrays from the order of their input > array. This is... overoptimistic. I would not rely on this in code that I wrote. It's true that many numpy operations do preserve the input order. But there are also many operations that don't. And which are which often changes between releases. (Not on purpose usually, but it's an easy bug to introduce. And sometimes it is done intentionally, e.g. to make functions faster. It sucks to have to make a function slower for everyone because someone somewhere is depending on memory layout default details.) And there are operations where it's not even clear what preserving order means (indexing a C array with a Fortran array, add(C, fortran), ...), and even lots of operations that intrinsically break contiguity/ordering (transpose, diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings one way or another in any non-trivial program. Instead, it's better to explicitly specify order= just at the places where you care. That way your code is *locally* correct (you can check it will work by just reading the one function). The alternative is to try and enforce a *global* property on your program ("everyone everywhere is very careful to only use contiguity-preserving operations", where "everyone" includes third party libraries like numpy and others). In software design, local invariants invariants are always better than global invariants -- the most well known example is local variables versus global variables, but the principle is much broader. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kwang24 at wisc.edu Sun Aug 2 22:16:17 2015 From: kwang24 at wisc.edu (Kang Wang) Date: Sun, 02 Aug 2015 21:16:17 -0500 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <74b0c506198959.55beced2@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> Message-ID: <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> Thank you all for replying and providing useful insights and suggestions. The reasons I really want to use column-major are: I am image-oriented user (not matrix-oriented, as explained in?http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues) I am so used to read/write "I(x, y, z)" in textbook and code, and it is very likely that if the environment (row-major environment) forces me to write I(z, y, x), I will write a bug if I am not 100% focused. When this happens, it is difficult to debug, because everything compile and build fine. You will see run time error. Depending on environment, you may get useful error message (i.e. index out of range), but sometimes you just get bad image results. It actually has not too much to do with the actual data layout in memory. In imaging processing, especially medical imaging where I am working in, if you have a 3D image, everyone will agree that in memory, the X index is the fasted changing index, and the Z dimension (we often call it the "slice" dimension) has the largest stride in memory. So, if data?layout?is like this in memory, and image-oriented users are so used to read/write "I(x,y,z)", the only storage order that makes sense is column-major I also write code in MATLAB and C/C++. In MATLAB, matrix is column-major array. In C/C++, we often use ITK, which is also column-major (http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really prefer always read/write column-major code to minimize coding bugs related to storage order. I also prefer index to be 0-based; however, there is nothing I can do about it for MATLAB (which is 1-based). I can see that my original thought about "modifying NumPy source and re-compile" is probably a bad idea. The suggestions about using "fortran_zeros = partial(np.zeros(order='F'))" is probably the best way so far, in my opinion, and I am going to give it a try. Again, thank you all for replying. 
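For reference, a minimal sketch of that shim approach (the extra helper names fortran_ones and fortran_empty below are illustrative additions, not anything from the thread): functools.partial has to be handed the function object itself, as in Bryan's monkey-patch example, rather than the result of calling it, so the working form is partial(np.zeros, order='F') rather than partial(np.zeros(order='F')).

import numpy as np
from functools import partial

# Shims kept in your own module/namespace, so numpy itself is untouched
# and libraries such as SciPy or scikit-image still see the stock functions.
fortran_zeros = partial(np.zeros, order='F')
fortran_ones = partial(np.ones, order='F')
fortran_empty = partial(np.empty, order='F')

x = fortran_zeros((2, 3), dtype=np.int32)
print(x.flags['F_CONTIGUOUS'])   # True
print(x.strides)                 # (4, 8), as in the strides test above
# On a recent NumPy (1.9.2 in Bryan's transcript) y = x + 1 keeps the
# Fortran order, so y.strides is also (4, 8); on older releases it may not.

Keeping the shims in a separate namespace rather than rebinding np.zeros follows Sturla's warning about monkey patching breaking downstream packages.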
Kang On 08/02/15, Nathaniel Smith wrote: > > On Aug 2, 2015 7:30 AM, "Sturla Molden" wrote: > > > > > > On 02/08/15 15:55, Kang Wang wrote: > > > > > > > Can anyone provide any insight/help? > > > > > > There is no "default order". There was before, but now all operators > > > control the order of their return arrays from the order of their input > > > array. > > This is... overoptimistic. I would not rely on this in code that I wrote. > > It's true that many numpy operations do preserve the input order. But there are also many operations that don't. And which are which often changes between releases. (Not on purpose usually, but it's an easy bug to introduce. And sometimes it is done intentionally, e.g. to make functions faster. It sucks to have to make a function slower for everyone because someone somewhere is depending on memory layout default details.) And there are operations where it's not even clear what preserving order means (indexing a C array with a Fortran array, add(C, fortran), ...), and even lots of operations that intrinsically break contiguity/ordering (transpose, diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings one way or another in any non-trivial program. > > Instead, it's better to explicitly specify order= just at the places where you care. That way your code is *locally* correct (you can check it will work by just reading the one function). The alternative is to try and enforce a *global* property on your program ("everyone everywhere is very careful to only use contiguity-preserving operations", where "everyone" includes third party libraries like numpy and others). In software design, local invariants invariants are always better than global invariants -- the most well known example is local variables versus global variables, but the principle is much broader. > > -n > > -- Kang Wang, Ph.D. 1111 Highland Ave., Room 1113 Madison, WI 53705-2275 ---------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nevion at gmail.com Sun Aug 2 22:55:53 2015 From: nevion at gmail.com (Jason Newton) Date: Sun, 2 Aug 2015 19:55:53 -0700 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> Message-ID: Just chiming in with my 2 cents, in direct response to your points... - Image oriented processing is most typically done with row-major storage layout. From hardware to general software implementations. - Well really think of it as [slice,] row, column (logical)... you don't actually have to be concerned about the layout unless you want higher performance - in which case for a better access pattern you process a fundamental image-line at a time. I also find it helps me avoid bugs with xyz semantics by working with rows and columns only and remembering x=col, y = row. - I'm most familiar with having slice first like the above. - ITK is stored as row-major actually, but it's index type has dimensions specified as column,row, slice . Matlab does alot of things column order and thus acts different from implementations which can result in different outputs, but matlab seems perfectly happy living on an island where it's the only implementation providing a specific answer given a specific input. - Numpy is 0 based...? Good luck keeping it all sane though, -Jason On Sun, Aug 2, 2015 at 7:16 PM, Kang Wang wrote: > Thank you all for replying and providing useful insights and suggestions. > > The reasons I really want to use column-major are: > > - I am image-oriented user (not matrix-oriented, as explained in > http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues > ) > - I am so used to read/write "I(x, y, z)" in textbook and code, and it > is very likely that if the environment (row-major environment) forces me to > write I(z, y, x), I will write a bug if I am not 100% focused. When this > happens, it is difficult to debug, because everything compile and build > fine. You will see run time error. Depending on environment, you may get > useful error message (i.e. index out of range), but sometimes you just get > bad image results. > - It actually has not too much to do with the actual data layout in > memory. 
In imaging processing, especially medical imaging where I am > working in, if you have a 3D image, everyone will agree that in memory, the > X index is the fasted changing index, and the Z dimension (we often call it > the "slice" dimension) has the largest stride in memory. So, if > data layout is like this in memory, and image-oriented users are so used to > read/write "I(x,y,z)", the only storage order that makes sense is > column-major > - I also write code in MATLAB and C/C++. In MATLAB, matrix is > column-major array. In C/C++, we often use ITK, which is also column-major ( > http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really > prefer always read/write column-major code to minimize coding bugs related > to storage order. > - I also prefer index to be 0-based; however, there is nothing I can > do about it for MATLAB (which is 1-based). > > I can see that my original thought about "modifying NumPy source and > re-compile" is probably a bad idea. The suggestions about using > "fortran_zeros = partial(np.zeros(order='F'))" is probably the best way so > far, in my opinion, and I am going to give it a try. > > Again, thank you all for replying. > > Kang > > On 08/02/15, *Nathaniel Smith * wrote: > > On Aug 2, 2015 7:30 AM, "Sturla Molden" wrote: > > > > On 02/08/15 15:55, Kang Wang wrote: > > > > > Can anyone provide any insight/help? > > > > There is no "default order". There was before, but now all operators > > control the order of their return arrays from the order of their input > > array. > > This is... overoptimistic. I would not rely on this in code that I wrote. > > It's true that many numpy operations do preserve the input order. But > there are also many operations that don't. And which are which often > changes between releases. (Not on purpose usually, but it's an easy bug to > introduce. And sometimes it is done intentionally, e.g. to make functions > faster. It sucks to have to make a function slower for everyone because > someone somewhere is depending on memory layout default details.) And there > are operations where it's not even clear what preserving order means > (indexing a C array with a Fortran array, add(C, fortran), ...), and even > lots of operations that intrinsically break contiguity/ordering (transpose, > diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings > one way or another in any non-trivial program. > > Instead, it's better to explicitly specify order= just at the places where > you care. That way your code is *locally* correct (you can check it will > work by just reading the one function). The alternative is to try and > enforce a *global* property on your program ("everyone everywhere is very > careful to only use contiguity-preserving operations", where "everyone" > includes third party libraries like numpy and others). In software design, > local invariants invariants are always better than global invariants -- the > most well known example is local variables versus global variables, but the > principle is much broader. > > -n > > -- > *Kang Wang, Ph.D.* > 1111 Highland Ave., Room 1113 > Madison, WI 53705-2275 > ---------------------------------------- > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sun Aug 2 23:22:38 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 2 Aug 2015 21:22:38 -0600 Subject: [Numpy-discussion] 1.10.x is branched Message-ID: Hi All, Numpy 1.10.x is branched. There is still some cleanup to do before the alpha release, but that should be coming in a couple of days. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sank.daniel at gmail.com Mon Aug 3 00:27:53 2015 From: sank.daniel at gmail.com (Daniel Sank) Date: Sun, 2 Aug 2015 21:27:53 -0700 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> Message-ID: Kang, Thank you for explaining your motivation. It's clear from your last note, as you said, that your desire for column-first indexing has nothing to do with in-memory data layout. That being the case, I strongly urge you to just use bare numpy and do not use the "fortran_zeros" function I recommended before. Changing the in-memory layout via the "order" keyword in numpy.zeros will not change the way indexing works at all. You gain absolutely nothing by changing the in-memory order unless you are writing some C or Fortran code which will interact with the data in memory. To see what I mean, consider the following examples: x = np.array([1, 2, 3], [4, 5, 6]]) x.shape >>> (2, 3) and x = np.array([1, 2, 3], [4, 5, 6]], order='F') x.shape >>> (2, 3) You see that changing the in-memory order has nothing whatsoever to do with the array's shape or how you access it. > You will see run time error. Depending on environment, you may get useful error message > (i.e. index out of range), but sometimes you just get bad image results. Could you give a very simple example of what you mean? I can't think of how this could ever happen and your fear here makes me think there's a fundamental misunderstanding about how array operations in numpy and other programming languages work. As an example, iteration in numpy goes through the first index: x = np.array([[1, 2, 3], [4, 5, 6]]) for foo in x: ... Inside the for loop, foo takes on the values [1, 2, 3] on the first iteration and [4, 5, 6] on the second. 
If you want to iterate through the columns just do this instead x = np.array([[1, 2, 3], [4, 5, 6]]) for foo in x.T: ... If your complaint is that you want np.array([[1, 2, 3], [4, 5, 6]]) to produce an array with shape (3, 2) then you should own up to the fact that the array constructor expects it the other way around and do this x = np.array([[1, 2, 3], [4, 5, 6]]).T instead. This is infinity times better than trying to write a shim function or patch numpy because with .T you're using (fast) built-in functionality which other people your code will understand. The real message here is that whether the first index runs over rows or columns is actually meaningless. The only places the row versus column issue has any meaning is when doing input/output (in which case you should use the transpose if you actually need it), or when doing iteration. One thing that would make sense if you're reading from a binary file format which uses column-major format would be to write your own reader function: def read_fortran_style_binary_file(file): return np.fromfile(file).T Note that if you do this then you already have a column major array in numpy and you don't have to worry about any other transposes (except, again, when doing more I/O or passing to something like a plotting function). On Sun, Aug 2, 2015 at 7:16 PM, Kang Wang wrote: > Thank you all for replying and providing useful insights and suggestions. > > The reasons I really want to use column-major are: > > - I am image-oriented user (not matrix-oriented, as explained in > http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues > ) > - I am so used to read/write "I(x, y, z)" in textbook and code, and it > is very likely that if the environment (row-major environment) forces me to > write I(z, y, x), I will write a bug if I am not 100% focused. When this > happens, it is difficult to debug, because everything compile and build > fine. You will see run time error. Depending on environment, you may get > useful error message (i.e. index out of range), but sometimes you just get > bad image results. > - It actually has not too much to do with the actual data layout in > memory. In imaging processing, especially medical imaging where I am > working in, if you have a 3D image, everyone will agree that in memory, the > X index is the fasted changing index, and the Z dimension (we often call it > the "slice" dimension) has the largest stride in memory. So, if > data layout is like this in memory, and image-oriented users are so used to > read/write "I(x,y,z)", the only storage order that makes sense is > column-major > - I also write code in MATLAB and C/C++. In MATLAB, matrix is > column-major array. In C/C++, we often use ITK, which is also column-major ( > http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really > prefer always read/write column-major code to minimize coding bugs related > to storage order. > - I also prefer index to be 0-based; however, there is nothing I can > do about it for MATLAB (which is 1-based). > > I can see that my original thought about "modifying NumPy source and > re-compile" is probably a bad idea. The suggestions about using > "fortran_zeros = partial(np.zeros(order='F'))" is probably the best way so > far, in my opinion, and I am going to give it a try. > > Again, thank you all for replying. 
> > Kang > > On 08/02/15, *Nathaniel Smith * wrote: > > On Aug 2, 2015 7:30 AM, "Sturla Molden" wrote: > > > > On 02/08/15 15:55, Kang Wang wrote: > > > > > Can anyone provide any insight/help? > > > > There is no "default order". There was before, but now all operators > > control the order of their return arrays from the order of their input > > array. > > This is... overoptimistic. I would not rely on this in code that I wrote. > > It's true that many numpy operations do preserve the input order. But > there are also many operations that don't. And which are which often > changes between releases. (Not on purpose usually, but it's an easy bug to > introduce. And sometimes it is done intentionally, e.g. to make functions > faster. It sucks to have to make a function slower for everyone because > someone somewhere is depending on memory layout default details.) And there > are operations where it's not even clear what preserving order means > (indexing a C array with a Fortran array, add(C, fortran), ...), and even > lots of operations that intrinsically break contiguity/ordering (transpose, > diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings > one way or another in any non-trivial program. > > Instead, it's better to explicitly specify order= just at the places where > you care. That way your code is *locally* correct (you can check it will > work by just reading the one function). The alternative is to try and > enforce a *global* property on your program ("everyone everywhere is very > careful to only use contiguity-preserving operations", where "everyone" > includes third party libraries like numpy and others). In software design, > local invariants invariants are always better than global invariants -- the > most well known example is local variables versus global variables, but the > principle is much broader. > > -n > > -- > *Kang Wang, Ph.D.* > 1111 Highland Ave., Room 1113 > Madison, WI 53705-2275 > ---------------------------------------- > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Daniel Sank -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jni.soma at gmail.com Mon Aug 3 00:54:31 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Mon, 03 Aug 2015 04:54:31 +0000 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> Message-ID: Hi Kang, Feel free to come chat about your application on the scikit-image list [1]! I'll note that we've been through the array order discussion many times there and even have a doc page about it [2]. The short version is that you'll save yourself a lot of pain by starting to think of your images as (plane, row, column) instead of (x, y, z). The syntax actually becomes friendlier too. For example, to do something to each slice of data, you do: for plane in image: plane += foo instead of for z in image.shape[2]: image[:, :, z] += foo for example. Juan. [1] scikit-image at googlegroups.com [2] http://scikit-image.org/docs/dev/user_guide/numpy_images.html#coordinate-conventions PS: As to the renamed Fortran-ordered numpy, may I suggest "funpy". The F is for Fortran and the fun is for all the fun you'll have maintaining it. =P On Mon, 3 Aug 2015 at 6:28 am Daniel Sank wrote: > Kang, > > Thank you for explaining your motivation. It's clear from your last note, > as you said, that your desire for column-first indexing has nothing to do > with in-memory data layout. That being the case, I strongly urge you to > just use bare numpy and do not use the "fortran_zeros" function I > recommended before. Changing the in-memory layout via the "order" keyword > in numpy.zeros will not change the way indexing works at all. You gain > absolutely nothing by changing the in-memory order unless you are writing > some C or Fortran code which will interact with the data in memory. > > To see what I mean, consider the following examples: > > x = np.array([1, 2, 3], [4, 5, 6]]) > x.shape > >>> (2, 3) > > and > > x = np.array([1, 2, 3], [4, 5, 6]], order='F') > x.shape > >>> (2, 3) > > You see that changing the in-memory order has nothing whatsoever to do > with the array's shape or how you access it. > > > You will see run time error. Depending on environment, you may get > useful error message > > (i.e. index out of range), but sometimes you just get bad image results. > > Could you give a very simple example of what you mean? 
I can't think of > how this could ever happen and your fear here makes me think there's a > fundamental misunderstanding about how array operations in numpy and other > programming languages work. As an example, iteration in numpy goes through > the first index: > > x = np.array([[1, 2, 3], [4, 5, 6]]) > for foo in x: > ... > > Inside the for loop, foo takes on the values [1, 2, 3] on the first > iteration and [4, 5, 6] on the second. If you want to iterate through the > columns just do this instead > > x = np.array([[1, 2, 3], [4, 5, 6]]) > for foo in x.T: > ... > > If your complaint is that you want np.array([[1, 2, 3], [4, 5, 6]]) to > produce an array with shape (3, 2) then you should own up to the fact that > the array constructor expects it the other way around and do this > > x = np.array([[1, 2, 3], [4, 5, 6]]).T > > instead. This is infinity times better than trying to write a shim > function or patch numpy because with .T you're using (fast) built-in > functionality which other people your code will understand. > > The real message here is that whether the first index runs over rows or > columns is actually meaningless. The only places the row versus column > issue has any meaning is when doing input/output (in which case you should > use the transpose if you actually need it), or when doing iteration. One > thing that would make sense if you're reading from a binary file format > which uses column-major format would be to write your own reader function: > > def read_fortran_style_binary_file(file): > return np.fromfile(file).T > > Note that if you do this then you already have a column major array in > numpy and you don't have to worry about any other transposes (except, > again, when doing more I/O or passing to something like a plotting > function). > > > > > On Sun, Aug 2, 2015 at 7:16 PM, Kang Wang wrote: > >> Thank you all for replying and providing useful insights and suggestions. >> >> The reasons I really want to use column-major are: >> >> - I am image-oriented user (not matrix-oriented, as explained in >> http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues >> ) >> - I am so used to read/write "I(x, y, z)" in textbook and code, and >> it is very likely that if the environment (row-major environment) forces me >> to write I(z, y, x), I will write a bug if I am not 100% focused. When >> this happens, it is difficult to debug, because everything compile and >> build fine. You will see run time error. Depending on environment, you may >> get useful error message (i.e. index out of range), but sometimes you just >> get bad image results. >> - It actually has not too much to do with the actual data layout in >> memory. In imaging processing, especially medical imaging where I am >> working in, if you have a 3D image, everyone will agree that in memory, the >> X index is the fasted changing index, and the Z dimension (we often call it >> the "slice" dimension) has the largest stride in memory. So, if >> data layout is like this in memory, and image-oriented users are so used to >> read/write "I(x,y,z)", the only storage order that makes sense is >> column-major >> - I also write code in MATLAB and C/C++. In MATLAB, matrix is >> column-major array. In C/C++, we often use ITK, which is also column-major ( >> http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really >> prefer always read/write column-major code to minimize coding bugs related >> to storage order. 
>> - I also prefer index to be 0-based; however, there is nothing I can >> do about it for MATLAB (which is 1-based). >> >> I can see that my original thought about "modifying NumPy source and >> re-compile" is probably a bad idea. The suggestions about using >> "fortran_zeros = partial(np.zeros(order='F'))" is probably the best way so >> far, in my opinion, and I am going to give it a try. >> >> Again, thank you all for replying. >> >> Kang >> >> On 08/02/15, *Nathaniel Smith * wrote: >> >> On Aug 2, 2015 7:30 AM, "Sturla Molden" wrote: >> > >> > On 02/08/15 15:55, Kang Wang wrote: >> > >> > > Can anyone provide any insight/help? >> > >> > There is no "default order". There was before, but now all operators >> > control the order of their return arrays from the order of their input >> > array. >> >> This is... overoptimistic. I would not rely on this in code that I wrote. >> >> It's true that many numpy operations do preserve the input order. But >> there are also many operations that don't. And which are which often >> changes between releases. (Not on purpose usually, but it's an easy bug to >> introduce. And sometimes it is done intentionally, e.g. to make functions >> faster. It sucks to have to make a function slower for everyone because >> someone somewhere is depending on memory layout default details.) And there >> are operations where it's not even clear what preserving order means >> (indexing a C array with a Fortran array, add(C, fortran), ...), and even >> lots of operations that intrinsically break contiguity/ordering (transpose, >> diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings >> one way or another in any non-trivial program. >> >> Instead, it's better to explicitly specify order= just at the places >> where you care. That way your code is *locally* correct (you can check it >> will work by just reading the one function). The alternative is to try and >> enforce a *global* property on your program ("everyone everywhere is very >> careful to only use contiguity-preserving operations", where "everyone" >> includes third party libraries like numpy and others). In software design, >> local invariants invariants are always better than global invariants -- the >> most well known example is local variables versus global variables, but the >> principle is much broader. >> >> -n >> >> -- >> *Kang Wang, Ph.D.* >> 1111 Highland Ave., Room 1113 >> Madison, WI 53705-2275 >> ---------------------------------------- >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Daniel Sank > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kwang24 at wisc.edu Mon Aug 3 02:02:42 2015 From: kwang24 at wisc.edu (Kang Wang) Date: Mon, 03 Aug 2015 01:02:42 -0500 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <75e0a7e7199aa0.55bf03c5@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <7790f915199428.55befb9a@wiscmail.wisc.edu> <7790906919a1f1.55befbd8@wiscmail.wisc.edu> <7790852f19d571.55befc17@wiscmail.wisc.edu> <77209c0219e46d.55befc54@wiscmail.wisc.edu> <7720f14319db76.55befc92@wiscmail.wisc.edu> <7740cea919c20b.55befcd0@wiscmail.wisc.edu> <75e091da1983c6.55befd0d@wiscmail.wisc.edu> <75e092e119d0d4.55befd4c@wiscmail.wisc.edu> <74508dfa19c665.55befd89@wiscmail.wisc.edu> <779098271984ef.55befdc7@wiscmail.wisc.edu> <7790a0c619efcf.55befe05@wiscmail.wisc.edu> <7790e85019cb3c.55befe43@wiscmail.wisc.edu> <774081cb19c91f.55befe81@wiscmail.wisc.edu> <7740fc9119942d.55befebf@wiscmail.wisc.edu> <7740800619f499.55befefc@wiscmail.wisc.edu> <77408ff719acd0.55beffef@wiscmail.wisc.edu> <7720f55919b92b.55bf0068@wiscmail.wisc.edu> <7720c87719822d.55bf00a6@wiscmail.wisc.edu> <7720ae66199b7c.55bf00e4@wiscmail.wisc.edu> <7720bc3819aa29.55bf0122@wiscmail.wisc.edu> <7620dfc219e016.55bf0160@wiscmail.wisc.edu> <7620af8f19a909.55bf019e@wiscmail.wisc.edu> <7690986219ed18.55bf01dc@wiscmail.wisc.edu> <7690bc7419a786.55bf0219@wiscmail.wisc.edu> <7690ea4e19b283.55bf0258@wiscmail.wisc.edu> <76008e80199c29.55bf0296@wiscmail.wisc.edu> <75e0cd8219e481.55bf034b@wiscmail.wisc.edu> <75e0a7e7199aa0.55bf03c5@wiscmail.wisc.edu> Message-ID: <75e08cd219963d.55bebdb2@wiscmail.wisc.edu> This is very good discussion. Thank you all for replying. I can see the fundamental difference is that I always think/talk/read/write a 3D image as I(x, y, z), not (plane, row, column) . I am coming from MRI (magnetic resonance imaging) research, and I can assure you that the entire MRI community is using (x, y, z), including books, journal papers, conference abstracts, presentations, everything. We even talk about what we called "logical x/y/z" and "physical x/y/z", and the rotation matrix that converts the two coordinate systems. The radiologists are also used to (x, y, z). For example, we always say "my image is 256 by 256 by 20 slices", and we never say "20 by 256 by 256". 
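A minimal sketch of what that convention means on the NumPy side (the 256 x 256 x 20 shape is just the example above, and float64 is an arbitrary choice): a column-major allocation does put x on the smallest stride.

import numpy as np

vol = np.zeros((256, 256, 20), order='F')   # (x, y, z), Fortran/column-major
print(vol.flags['F_CONTIGUOUS'])            # True
print(vol.strides)                          # (8, 2048, 524288): x varies fastest in memory
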
So, basically, at least in MRI, we always talk about an image as I(x, y, z), and we always assume that "x" is the fastest changing index. That's why I prefer column-major (because it is more natural). Of course, I can totally get my work done by using row-major, I just have to always remind myself "write last dimension index first" when coding. I actually have done this before, and I found it would be so much easier if just using column-major. Kang On 08/02/15, Juan Nunez-Iglesias wrote: > Hi Kang, > > Feel free to come chat about your application on the scikit-image list [1]! I'll note that we've been through the array order discussion many times there and even have a doc page about it [2]. > > The short version is that you'll save yourself a lot of pain by starting to think of your images as (plane, row, column) instead of (x, y, z). The syntax actually becomes friendlier too. For example, to do something to each slice of data, you do: > > for plane in image: > plane?+= foo > > > instead of > > > for z in image.shape[2]: > image[:, :, z]?+= foo > > > for example. > > > Juan. > > > [1] scikit-image at googlegroups.com > [2]?http://scikit-image.org/docs/dev/user_guide/numpy_images.html#coordinate-conventions > > > PS: As to the renamed Fortran-ordered numpy, may I suggest "funpy". The F is for Fortran and the fun is for all the fun you'll have maintaining it. =P > > On Mon, 3 Aug 2015 at 6:28 am Daniel Sank wrote: > > > > Kang, > > > > Thank you for explaining your motivation. It's clear from your last note, as you said, that your desire for column-first indexing has nothing to do with in-memory data layout. That being the case, I strongly urge you to just use bare numpy and do not use the "fortran_zeros" function I recommended before. Changing the in-memory layout via the "order" keyword in numpy.zeros will not change the way indexing works at all. You gain absolutely nothing by changing the in-memory order unless you are writing some C or Fortran code which will interact with the data in memory. > > > > > > To see what I mean, consider the following examples: > > > > > > x = np.array([1, 2, 3], [4, 5, 6]]) > > x.shape > > >>> (2, 3) > > > > > > and > > > > > > x = np.array([1, 2, 3], [4, 5, 6]], order='F') > > x.shape > > >>> (2, 3) > > > > > > > > You see that changing the in-memory order has nothing whatsoever to do with the array's shape or how you access it. > > > > > > > > > You will see run time error. Depending on environment, you may get useful error message > > > (i.e. index out of range), but sometimes you just get bad image results. > > > > > > > > Could you give a very simple example of what you mean? I can't think of how this could ever happen and your fear here makes me think there's a fundamental misunderstanding about how array operations in numpy and other programming languages work. As an example, iteration in numpy goes through the first index: > > > > > > x = np.array([[1, 2, 3], [4, 5, 6]]) > > for foo in x: > > ... > > > > > > Inside the for loop, foo takes on the values [1, 2, 3] on the first iteration and [4, 5, 6] on the second. If you want to iterate through the columns just do this instead > > > > > > x = np.array([[1, 2, 3], [4, 5, 6]]) > > for foo in x.T: > > ... 
> > > > > > > > If your complaint is that you want np.array([[1, 2, 3], [4, 5, 6]]) to produce an array with shape (3, 2) then you should own up to the fact that the array constructor expects it the other way around and do this > > > > > > > > x = np.array([[1, 2, 3], [4, 5, 6]]).T > > > > > > > > instead. This is infinity times better than trying to write a shim function or patch numpy because with .T you're using (fast) built-in functionality which other people your code will understand. > > > > The real message here is that whether the first index runs over rows or columns is actually meaningless. The only places the row versus column issue has any meaning is when doing input/output (in which case you should use the transpose if you actually need it), or when doing iteration. One thing that would make sense if you're reading from a binary file format which uses column-major format would be to write your own reader function: > > > > > > > > def read_fortran_style_binary_file(file): > > return np.fromfile(file).T > > > > > > Note that if you do this then you already have a column major array in numpy and you don't have to worry about any other transposes (except, again, when doing more I/O or passing to something like a plotting function). > > > > > > > > > > > > > > > > > > > > > > On Sun, Aug 2, 2015 at 7:16 PM, Kang Wang wrote: > > > > > > > > > Thank you all for replying and providing useful insights and suggestions. > > > > > > The reasons I really want to use column-major are: > > > > > > I am image-oriented user (not matrix-oriented, as explained in?http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues) > > > I am so used to read/write "I(x, y, z)" in textbook and code, and it is very likely that if the environment (row-major environment) forces me to write I(z, y, x), I will write a bug if I am not 100% focused. When this happens, it is difficult to debug, because everything compile and build fine. You will see run time error. Depending on environment, you may get useful error message (i.e. index out of range), but sometimes you just get bad image results. > > > It actually has not too much to do with the actual data layout in memory. In imaging processing, especially medical imaging where I am working in, if you have a 3D image, everyone will agree that in memory, the X index is the fasted changing index, and the Z dimension (we often call it the "slice" dimension) has the largest stride in memory. So, if data?layout?is like this in memory, and image-oriented users are so used to read/write "I(x,y,z)", the only storage order that makes sense is column-major > > > I also write code in MATLAB and C/C++. In MATLAB, matrix is column-major array. In C/C++, we often use ITK, which is also column-major (http://www.itk.org/Doxygen/html/classitk_1_1Image.html). I really prefer always read/write column-major code to minimize coding bugs related to storage order. > > > I also prefer index to be 0-based; however, there is nothing I can do about it for MATLAB (which is 1-based). > > > > > > I can see that my original thought about "modifying NumPy source and re-compile" is probably a bad idea. The suggestions about using "fortran_zeros = partial(np.zeros(order='F'))" is probably the best way so far, in my opinion, and I am going to give it a try. > > > > > > > > > Again, thank you all for replying. 
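For reference, a working form of the "fortran_zeros" helper suggested above: functools.partial needs the function and the keyword passed separately, i.e. partial(np.zeros, order='F') rather than partial(np.zeros(order='F')).

from functools import partial
import numpy as np

fortran_zeros = partial(np.zeros, order='F')   # note the comma: function first, keyword second

a = fortran_zeros((256, 256, 20))
print(a.flags['F_CONTIGUOUS'])                 # True
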
> > > > > > > > > Kang > > > > > > On 08/02/15, Nathaniel Smith wrote: > > > > > > > > On Aug 2, 2015 7:30 AM, "Sturla Molden" wrote: > > > > > > > > > > > > > > > > > > On 02/08/15 15:55, Kang Wang wrote: > > > > > > > > > > > > > > > > > > > Can anyone provide any insight/help? > > > > > > > > > > > > > > > > > > There is no "default order". There was before, but now all operators > > > > > > > > > control the order of their return arrays from the order of their input > > > > > > > > > array. > > > > > > > > This is... overoptimistic. I would not rely on this in code that I wrote. > > > > > > > > It's true that many numpy operations do preserve the input order. But there are also many operations that don't. And which are which often changes between releases. (Not on purpose usually, but it's an easy bug to introduce. And sometimes it is done intentionally, e.g. to make functions faster. It sucks to have to make a function slower for everyone because someone somewhere is depending on memory layout default details.) And there are operations where it's not even clear what preserving order means (indexing a C array with a Fortran array, add(C, fortran), ...), and even lots of operations that intrinsically break contiguity/ordering (transpose, diagonal, slicing, swapaxes, ...), so you will end up with mixed orderings one way or another in any non-trivial program. > > > > > > > > Instead, it's better to explicitly specify order= just at the places where you care. That way your code is *locally* correct (you can check it will work by just reading the one function). The alternative is to try and enforce a *global* property on your program ("everyone everywhere is very careful to only use contiguity-preserving operations", where "everyone" includes third party libraries like numpy and others). In software design, local invariants invariants are always better than global invariants -- the most well known example is local variables versus global variables, but the principle is much broader. > > > > > > > > -n > > > > > > > > > > > > > > > > > -- > > > Kang Wang, Ph.D. > > > 1111 Highland Ave., Room 1113 > > > Madison, WI 53705-2275 > > > ---------------------------------------- > > > > > > > > > _______________________________________________ > > > > > > NumPy-Discussion mailing list > > > > > > NumPy-Discussion at scipy.org > > > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > > > > > > > > > > > > -- > > Daniel Sank > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > -- Kang Wang, Ph.D. 1111 Highland Ave., Room 1113 Madison, WI 53705-2275 ---------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Aug 3 02:42:05 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 3 Aug 2015 08:42:05 +0200 Subject: [Numpy-discussion] 1.10.x is branched In-Reply-To: References: Message-ID: On Mon, Aug 3, 2015 at 5:22 AM, Charles R Harris wrote: > Hi All, > > Numpy 1.10.x is branched. There is still some cleanup to do before the > alpha release, but that should be coming in a couple of days. > > Thanks Chuck. Looks like it's shaping up nicely. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Mon Aug 3 03:09:00 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 3 Aug 2015 07:09:00 +0000 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <75e08cd219963d.55bebdb2@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <7790f915199428.55befb9a@wiscmail.wisc.edu> <7790906919a1f1.55befbd8@wiscmail.wisc.edu> <7790852f19d571.55befc17@wiscmail.wisc.edu> <77209c0219e46d.55befc54@wiscmail.wisc.edu> <7720f14319db76.55befc92@wiscmail.wisc.edu> <7740cea919c20b.55befcd0@wiscmail.wisc.edu> <75e091da1983c6.55befd0d@wiscmail.wisc.edu> <75e092e119d0d4.55befd4c@wiscmail.wisc.edu> <74508dfa19c665.55befd89@wiscmail.wisc.edu> <779098271984ef.55befdc7@wiscmail.wisc.edu> <7790a0c619efcf.55befe05@wiscmail.wisc.edu> <7790e85019cb3c.55befe43@wiscmail.wisc.edu> <774081cb19c91f.55befe81@wiscmail.wisc.edu> <7740fc9119942d.55befebf@wiscmail.wisc.edu> <7740800619f499.55befefc@wiscmail.wisc.edu> <77408ff719acd0.55beffef@wiscmail.wisc.edu> <7720f55919b92b.55bf0068@wiscmail.wisc.edu> <7720c87719822d.55bf00a6@wiscmail.wisc.edu> <7720ae66199b7c.55bf00e4@wiscmail.wisc.edu> <7720bc3819aa29.55bf0122@wiscmail.wisc.edu> <7620dfc219e016.55bf0160@wiscmail.wisc.edu> <7620af8f19a909.55bf019e@wiscmail.wisc.edu> <7690986219ed18.55bf01dc@wiscmail.wisc.edu> <7690bc7419a786.55bf0219@wiscmail.wisc.edu> <7690ea4e19b283.55bf0258@wiscmail.wisc.edu> <76008e80199c29.55bf0296@wiscmail.wisc.edu> <75e0cd8219e481.55bf034b@wiscmail.wisc.edu> <75e0a7e7199aa0.55bf03c5@wiscmail.wisc.edu> <75e08cd219963d.55bebdb2@wiscmail.wisc.edu> Message-ID: On Aug 2, 2015 11:06 PM, "Kang Wang" wrote: > > This is very good discussion. Thank you all for replying. > > I can see the fundamental difference is that I always think/talk/read/write a 3D image as I(x, y, z), not (plane, row, column) . I am coming from MRI (magnetic resonance imaging) research, and I can assure you that the entire MRI community is using (x, y, z), including books, journal papers, conference abstracts, presentations, everything. We even talk about what we called "logical x/y/z" and "physical x/y/z", and the rotation matrix that converts the two coordinate systems. The radiologists are also used to (x, y, z). For example, we always say "my image is 256 by 256 by 20 slices", and we never say "20 by 256 by 256". 
> > So, basically, at least in MRI, we always talk about an image as I(x, y, z), and we always assume that "x" is the fastest changing index. That's why I prefer column-major (because it is more natural). > > Of course, I can totally get my work done by using row-major, I just have to always remind myself "write last dimension index first" when coding. I actually have done this before, and I found it would be so much easier if just using column-major. Why not just use I[x, y, z] like you're used to, and let the computer worry about the physical layout in memory? Sometimes this will be Fortran order and sometimes C order and sometimes something else, but you don't have to know or care; 99% of the time it doesn't matter. The worst case is that when you use a python wrapper to call into a library that can only handle Fortran order, then the wrapper will quietly have to convert the memory order around and it will be slightly slower than if you had used Fortran order in the first place. But in practice you'll barely ever notice this, and when you do, *then* you can tell numpy explicitly what memory layout you want in the situation where it matters. General principle: do what's easiest for the programmer, not what's easiest for the computer. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Aug 3 04:49:35 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 3 Aug 2015 09:49:35 +0100 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <7790f915199428.55befb9a@wiscmail.wisc.edu> <7790906919a1f1.55befbd8@wiscmail.wisc.edu> <7790852f19d571.55befc17@wiscmail.wisc.edu> <77209c0219e46d.55befc54@wiscmail.wisc.edu> <7720f14319db76.55befc92@wiscmail.wisc.edu> <7740cea919c20b.55befcd0@wiscmail.wisc.edu> <75e091da1983c6.55befd0d@wiscmail.wisc.edu> <75e092e119d0d4.55befd4c@wiscmail.wisc.edu> <74508dfa19c665.55befd89@wiscmail.wisc.edu> <779098271984ef.55befdc7@wiscmail.wisc.edu> <7790a0c619efcf.55befe05@wiscmail.wisc.edu> <7790e85019cb3c.55befe43@wiscmail.wisc.edu> <774081cb19c91f.55befe81@wiscmail.wisc.edu> <7740fc9119942d.55befebf@wiscmail.wisc.edu> <7740800619f499.55befefc@wiscmail.wisc.edu> <77408ff719acd0.55beffef@wiscmail.wisc.edu> <7720f55919b92b.55bf0068@wiscmail.wisc.edu> 
<7720c87719822d.55bf00a6@wiscmail.wisc.edu> <7720ae66199b7c.55bf00e4@wiscmail.wisc.edu> <7720bc3819aa29.55bf0122@wiscmail.wisc.edu> <7620dfc219e016.55bf0160@wiscmail.wisc.edu> <7620af8f19a909.55bf019e@wiscmail.wisc.edu> <7690986219ed18.55bf01dc@wiscmail.wisc.edu> <7690bc7419a786.55bf0219@wiscmail.wisc.edu> <7690ea4e19b283.55bf0258@wiscmail.wisc.edu> <76008e80199c29.55bf0296@wiscmail.wisc.edu> <75e0cd8219e481.55bf034b@wiscmail.wisc.edu> <75e0a7e7199aa0.55bf03c5@wiscmail.wisc.edu> <75e08cd219963d.55bebdb2@wiscmail.wisc.edu> Message-ID: Hi, On Mon, Aug 3, 2015 at 8:09 AM, Nathaniel Smith wrote: > On Aug 2, 2015 11:06 PM, "Kang Wang" wrote: >> >> This is very good discussion. Thank you all for replying. >> >> I can see the fundamental difference is that I always >> think/talk/read/write a 3D image as I(x, y, z), not (plane, row, column) . I >> am coming from MRI (magnetic resonance imaging) research, and I can assure >> you that the entire MRI community is using (x, y, z), including books, >> journal papers, conference abstracts, presentations, everything. We even >> talk about what we called "logical x/y/z" and "physical x/y/z", and the >> rotation matrix that converts the two coordinate systems. The radiologists >> are also used to (x, y, z). For example, we always say "my image is 256 by >> 256 by 20 slices", and we never say "20 by 256 by 256". >> >> So, basically, at least in MRI, we always talk about an image as I(x, y, >> z), and we always assume that "x" is the fastest changing index. That's why >> I prefer column-major (because it is more natural). >> >> Of course, I can totally get my work done by using row-major, I just have >> to always remind myself "write last dimension index first" when coding. I >> actually have done this before, and I found it would be so much easier if >> just using column-major. > > Why not just use I[x, y, z] like you're used to, and let the computer worry > about the physical layout in memory? Sometimes this will be Fortran order > and sometimes C order and sometimes something else, but you don't have to > know or care; 99% of the time it doesn't matter. The worst case is that when > you use a python wrapper to call into a library that can only handle Fortran > order, then the wrapper will quietly have to convert the memory order around > and it will be slightly slower than if you had used Fortran order in the > first place. But in practice you'll barely ever notice this, and when you > do, *then* you can tell numpy explicitly what memory layout you want in the > situation where it matters. Yes - if you are using numpy, you really have to look numpy in the eye and say: "I will let you worry about the array element order in memory, and in return, you promise to make indexing work as I would expect" Just for example, let's say you loaded an MRI image into memory: In [1]: import nibabel In [2]: img = nibabel.load('my_mri.nii') In [3]: data = img.get_data() Because NIfTI images are Fortran memory layout, this happens to be the memory layout you get for your array: In [4]: data.flags Out[4]: C_CONTIGUOUS : False F_CONTIGUOUS : True OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False But now - in Python - all I care about is what data I have on the first, second, third axes. 
For example, I could do this: In [5]: data_copy = data.copy() This has exactly the same values as the original array, and at the same index positions: In [7]: import numpy as np In [8]: np.all(data == data) Out[8]: memmap(True, dtype=bool) but I now have a C memory layout array. In [9]: data_copy.flags Out[9]: C_CONTIGUOUS : True F_CONTIGUOUS : False OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False Worse than that, if I slice my original data array, then I get an array that is neither C- or Fortran- compatible in memory: In [10]: data_view = data[:, :, ::2] In [11]: data_view.flags Out[11]: C_CONTIGUOUS : False F_CONTIGUOUS : False OWNDATA : False WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False So - if you want every array to be Fortran-contiguous in memory, I would not start with numpy at all, I would write your own array library. The alternative - or "the numpy way" - is to give up on enforcing a particular layout in memory, until you need to pass an array to some C or C++ or Fortran code that needs some particular layout, in which case you get your extension code to copy the array into the required layout on entry. Of course this is what numpy itself has to do when interfacing with external libraries like BLAS or LAPACK. Cheers, Matthew From sebastian at sipsolutions.net Mon Aug 3 08:02:15 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 3 Aug 2015 12:02:15 +0000 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <7790f915199428.55befb9a@wiscmail.wisc.edu> <7790906919a1f1.55befbd8@wiscmail.wisc.edu> <7790852f19d571.55befc17@wiscmail.wisc.edu> <77209c0219e46d.55befc54@wiscmail.wisc.edu> <7720f14319db76.55befc92@wiscmail.wisc.edu> <7740cea919c20b.55befcd0@wiscmail.wisc.edu> <75e091da1983c6.55befd0d@wiscmail.wisc.edu> <75e092e119d0d4.55befd4c@wiscmail.wisc.edu> <74508dfa19c665.55befd89@wiscmail.wisc.edu> <779098271984ef.55befdc7@wiscmail.wisc.edu> <7790a0c619efcf.55befe05@wiscmail.wisc.edu> <7790e85019cb3c.55befe43@wiscmail.wisc.edu> <774081cb19c91f.55befe81@wiscmail.wisc.edu> <7740fc9119942d.55befebf@wiscmail.wisc.edu> <7740800619f499.55befefc@wiscmail.wisc.edu> <77408ff719acd0.55beffef@wiscmail.wisc.edu> <7720f55919b92b.55bf0068@wiscmail.wisc.edu> <7720c87719822d.55bf00a6@wiscmail.wisc.edu> 
<7720ae66199b7c.55bf00e4@wiscmail.wisc.edu> <7720bc3819aa29.55bf0122@wiscmail.wisc.edu> <7620dfc219e016.55bf0160@wiscmail.wisc.edu> <7620af8f19a909.55bf019e@wiscmail.wisc.edu> <7690986219ed18.55bf01dc@wiscmail.wisc.edu> <7690bc7419a786.55bf0219@wiscmail.wisc.edu> <7690ea4e19b283.55bf0258@wiscmail.wisc.edu> <76008e80199c29.55bf0296@wiscmail.wisc.edu> <75e0cd8219e481.55bf034b@wiscmail.wisc.edu> <75e0a7e7199aa0.55bf03c5@wiscmail.wisc.edu> <75e08cd219963d.55bebdb2@wiscmail.wisc.edu> Message-ID: On Mon Aug 3 10:49:35 2015 GMT+0200, Matthew Brett wrote: > Hi, > > On Mon, Aug 3, 2015 at 8:09 AM, Nathaniel Smith wrote: > > On Aug 2, 2015 11:06 PM, "Kang Wang" wrote: > >> > >> This is very good discussion. Thank you all for replying. > >> > >> I can see the fundamental difference is that I always > >> think/talk/read/write a 3D image as I(x, y, z), not (plane, row, column) . I > >> am coming from MRI (magnetic resonance imaging) research, and I can assure > >> you that the entire MRI community is using (x, y, z), including books, > >> journal papers, conference abstracts, presentations, everything. We even > >> talk about what we called "logical x/y/z" and "physical x/y/z", and the > >> rotation matrix that converts the two coordinate systems. The radiologists > >> are also used to (x, y, z). For example, we always say "my image is 256 by > >> 256 by 20 slices", and we never say "20 by 256 by 256". > >> > >> So, basically, at least in MRI, we always talk about an image as I(x, y, > >> z), and we always assume that "x" is the fastest changing index. That's why > >> I prefer column-major (because it is more natural). > >> > >> Of course, I can totally get my work done by using row-major, I just have > >> to always remind myself "write last dimension index first" when coding. I > >> actually have done this before, and I found it would be so much easier if > >> just using column-major. > > > > Why not just use I[x, y, z] like you're used to, and let the computer worry > > about the physical layout in memory? Sometimes this will be Fortran order > > and sometimes C order and sometimes something else, but you don't have to > > know or care; 99% of the time it doesn't matter. The worst case is that when > > you use a python wrapper to call into a library that can only handle Fortran > > order, then the wrapper will quietly have to convert the memory order around > > and it will be slightly slower than if you had used Fortran order in the > > first place. But in practice you'll barely ever notice this, and when you > > do, *then* you can tell numpy explicitly what memory layout you want in the > > situation where it matters. > > Yes - if you are using numpy, you really have to look numpy in the eye and say: > > "I will let you worry about the array element order in memory, and in > return, you promise to make indexing work as I would expect" > > Just for example, let's say you loaded an MRI image into memory: > > In [1]: import nibabel > In [2]: img = nibabel.load('my_mri.nii') > In [3]: data = img.get_data() > > Because NIfTI images are Fortran memory layout, this happens to be the > memory layout you get for your array: > > In [4]: data.flags > Out[4]: > C_CONTIGUOUS : False > F_CONTIGUOUS : True > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > But now - in Python - all I care about is what data I have on the > first, second, third axes. 
For example, I could do this: > > In [5]: data_copy = data.copy() > > This has exactly the same values as the original array, and at the > same index positions: > > In [7]: import numpy as np > In [8]: np.all(data == data) > Out[8]: memmap(True, dtype=bool) > > but I now have a C memory layout array. > > In [9]: data_copy.flags > Out[9]: > C_CONTIGUOUS : True > F_CONTIGUOUS : False > OWNDATA : True > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > Yeah, I would like to second those arguments. Most of the time, there is no need to worry about layout. For large chunks you allocate, it may make sense for speed, etc. So you can alias creation functions. Generally, I would suggest to simply not worry about the memory layout. Also do not *trust* the layout for most function returns. If you need a specific layout to interface other code, always check what you got it. -Sebastian > Worse than that, if I slice my original data array, then I get an > array that is neither C- or Fortran- compatible in memory: > > In [10]: data_view = data[:, :, ::2] > In [11]: data_view.flags > Out[11]: > C_CONTIGUOUS : False > F_CONTIGUOUS : False > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > So - if you want every array to be Fortran-contiguous in memory, I > would not start with numpy at all, I would write your own array > library. > > The alternative - or "the numpy way" - is to give up on enforcing a > particular layout in memory, until you need to pass an array to some C > or C++ or Fortran code that needs some particular layout, in which > case you get your extension code to copy the array into the required > layout on entry. Of course this is what numpy itself has to do when > interfacing with external libraries like BLAS or LAPACK. 
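Coming back to the "always check what you got" advice above, a minimal sketch of such a check (the array here is just a throwaway example):

import numpy as np

x = np.ones((4, 5))[:, ::2]        # a sliced view: neither C- nor F-contiguous
print(x.flags['F_CONTIGUOUS'])     # False

y = np.asfortranarray(x)           # copies only when the layout is not already Fortran
print(y.flags['F_CONTIGUOUS'])     # True
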
> > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From grlee77 at gmail.com Mon Aug 3 10:13:56 2015 From: grlee77 at gmail.com (Gregory Lee) Date: Mon, 3 Aug 2015 10:13:56 -0400 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <7790f915199428.55befb9a@wiscmail.wisc.edu> <7790906919a1f1.55befbd8@wiscmail.wisc.edu> <7790852f19d571.55befc17@wiscmail.wisc.edu> <77209c0219e46d.55befc54@wiscmail.wisc.edu> <7720f14319db76.55befc92@wiscmail.wisc.edu> <7740cea919c20b.55befcd0@wiscmail.wisc.edu> <75e091da1983c6.55befd0d@wiscmail.wisc.edu> <75e092e119d0d4.55befd4c@wiscmail.wisc.edu> <74508dfa19c665.55befd89@wiscmail.wisc.edu> <779098271984ef.55befdc7@wiscmail.wisc.edu> <7790a0c619efcf.55befe05@wiscmail.wisc.edu> <7790e85019cb3c.55befe43@wiscmail.wisc.edu> <774081cb19c91f.55befe81@wiscmail.wisc.edu> <7740fc9119942d.55befebf@wiscmail.wisc.edu> <7740800619f499.55befefc@wiscmail.wisc.edu> <77408ff719acd0.55beffef@wiscmail.wisc.edu> <7720f55919b92b.55bf0068@wiscmail.wisc.edu> <7720c87719822d.55bf00a6@wiscmail.wisc.edu> <7720ae66199b7c.55bf00e4@wiscmail.wisc.edu> <7720bc3819aa29.55bf0122@wiscmail.wisc.edu> <7620dfc219e016.55bf0160@wiscmail.wisc.edu> <7620af8f19a909.55bf019e@wiscmail.wisc.edu> <7690986219ed18.55bf01dc@wiscmail.wisc.edu> <7690bc7419a786.55bf0219@wiscmail.wisc.edu> <7690ea4e19b283.55bf0258@wiscmail.wisc.edu> <76008e80199c29.55bf0296@wiscmail.wisc.edu> <75e0cd8219e481.55bf034b@wiscmail.wisc.edu> <75e0a7e7199aa0.55bf03c5@wiscmail.wisc.edu> <75e08cd219963d.55bebdb2@wiscmail.wisc.edu> Message-ID: I agree that often you don't need to worry about the memory order. However, it is not uncommon in medical imaging to go back and forth between a 2D or 3D image representation and a 1D array representation (e.g. as often used in image reconstruction algorithms). I found that the main time it was necessary to pay careful attention to the memory layout was when converting Matlab scripts that involve reshaping operations. 
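A tiny sketch of the difference that usually shows up when porting such MATLAB reshapes (the array is just a toy example): NumPy's default reshape/ravel is row-major, while order='F' reproduces the MATLAB-style column-major traversal.

import numpy as np

a = np.arange(6).reshape(2, 3)

print(a.ravel())             # [0 1 2 3 4 5]  -- C order, last axis fastest
print(a.ravel(order='F'))    # [0 3 1 4 2 5]  -- Fortran order, first axis fastest (MATLAB-like)
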
On Mon, Aug 3, 2015 at 8:02 AM, Sebastian Berg wrote: > On Mon Aug 3 10:49:35 2015 GMT+0200, Matthew Brett wrote: > > Hi, > > > > On Mon, Aug 3, 2015 at 8:09 AM, Nathaniel Smith wrote: > > > On Aug 2, 2015 11:06 PM, "Kang Wang" wrote: > > >> > > >> This is very good discussion. Thank you all for replying. > > >> > > >> I can see the fundamental difference is that I always > > >> think/talk/read/write a 3D image as I(x, y, z), not (plane, row, > column) . I > > >> am coming from MRI (magnetic resonance imaging) research, and I can > assure > > >> you that the entire MRI community is using (x, y, z), including books, > > >> journal papers, conference abstracts, presentations, everything. We > even > > >> talk about what we called "logical x/y/z" and "physical x/y/z", and > the > > >> rotation matrix that converts the two coordinate systems. The > radiologists > > >> are also used to (x, y, z). For example, we always say "my image is > 256 by > > >> 256 by 20 slices", and we never say "20 by 256 by 256". > > >> > > >> So, basically, at least in MRI, we always talk about an image as I(x, > y, > > >> z), and we always assume that "x" is the fastest changing index. > That's why > > >> I prefer column-major (because it is more natural). > > >> > > >> Of course, I can totally get my work done by using row-major, I just > have > > >> to always remind myself "write last dimension index first" when > coding. I > > >> actually have done this before, and I found it would be so much > easier if > > >> just using column-major. > > > > > > Why not just use I[x, y, z] like you're used to, and let the computer > worry > > > about the physical layout in memory? Sometimes this will be Fortran > order > > > and sometimes C order and sometimes something else, but you don't have > to > > > know or care; 99% of the time it doesn't matter. The worst case is > that when > > > you use a python wrapper to call into a library that can only handle > Fortran > > > order, then the wrapper will quietly have to convert the memory order > around > > > and it will be slightly slower than if you had used Fortran order in > the > > > first place. But in practice you'll barely ever notice this, and when > you > > > do, *then* you can tell numpy explicitly what memory layout you want > in the > > > situation where it matters. > > > > Yes - if you are using numpy, you really have to look numpy in the eye > and say: > > > > "I will let you worry about the array element order in memory, and in > > return, you promise to make indexing work as I would expect" > > > > Just for example, let's say you loaded an MRI image into memory: > > > > In [1]: import nibabel > > In [2]: img = nibabel.load('my_mri.nii') > > In [3]: data = img.get_data() > > > > Because NIfTI images are Fortran memory layout, this happens to be the > > memory layout you get for your array: > > > > In [4]: data.flags > > Out[4]: > > C_CONTIGUOUS : False > > F_CONTIGUOUS : True > > OWNDATA : False > > WRITEABLE : True > > ALIGNED : True > > UPDATEIFCOPY : False > > > > But now - in Python - all I care about is what data I have on the > > first, second, third axes. For example, I could do this: > > > > In [5]: data_copy = data.copy() > > > > This has exactly the same values as the original array, and at the > > same index positions: > > > > In [7]: import numpy as np > > In [8]: np.all(data == data) > > Out[8]: memmap(True, dtype=bool) > > > > but I now have a C memory layout array. 
> > > > In [9]: data_copy.flags > > Out[9]: > > C_CONTIGUOUS : True > > F_CONTIGUOUS : False > > OWNDATA : True > > WRITEABLE : True > > ALIGNED : True > > UPDATEIFCOPY : False > > > > Yeah, I would like to second those arguments. Most of the time, there is > no need to worry about layout. For large chunks you allocate, it may make > sense for speed, etc. So you can alias creation functions. Generally, I > would suggest to simply not worry about the memory layout. Also do not > *trust* the layout for most function returns. If you need a specific layout > to interface other code, always check what you got it. > > -Sebastian > > > > Worse than that, if I slice my original data array, then I get an > > array that is neither C- or Fortran- compatible in memory: > > > > In [10]: data_view = data[:, :, ::2] > > In [11]: data_view.flags > > Out[11]: > > C_CONTIGUOUS : False > > F_CONTIGUOUS : False > > OWNDATA : False > > WRITEABLE : True > > ALIGNED : True > > UPDATEIFCOPY : False > > > > So - if you want every array to be Fortran-contiguous in memory, I > > would not start with numpy at all, I would write your own array > > library. > > > > The alternative - or "the numpy way" - is to give up on enforcing a > > particular layout in memory, until you need to pass an array to some C > > or C++ or Fortran code that needs some particular layout, in which > > case you get your extension code to copy the array into the required > > layout on entry. Of course this is what numpy itself has to do when > > interfacing with external libraries like BLAS or LAPACK. > > > > Cheers, > > > > Matthew > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Mon Aug 3 10:26:10 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 3 Aug 2015 15:26:10 +0100 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <7790f915199428.55befb9a@wiscmail.wisc.edu> <7790906919a1f1.55befbd8@wiscmail.wisc.edu> <7790852f19d571.55befc17@wiscmail.wisc.edu> <77209c0219e46d.55befc54@wiscmail.wisc.edu> <7720f14319db76.55befc92@wiscmail.wisc.edu> <7740cea919c20b.55befcd0@wiscmail.wisc.edu> <75e091da1983c6.55befd0d@wiscmail.wisc.edu> <75e092e119d0d4.55befd4c@wiscmail.wisc.edu> <74508dfa19c665.55befd89@wiscmail.wisc.edu> <779098271984ef.55befdc7@wiscmail.wisc.edu> <7790a0c619efcf.55befe05@wiscmail.wisc.edu> <7790e85019cb3c.55befe43@wiscmail.wisc.edu> <774081cb19c91f.55befe81@wiscmail.wisc.edu> <7740fc9119942d.55befebf@wiscmail.wisc.edu> <7740800619f499.55befefc@wiscmail.wisc.edu> <77408ff719acd0.55beffef@wiscmail.wisc.edu> <7720f55919b92b.55bf0068@wiscmail.wisc.edu> <7720c87719822d.55bf00a6@wiscmail.wisc.edu> <7720ae66199b7c.55bf00e4@wiscmail.wisc.edu> <7720bc3819aa29.55bf0122@wiscmail.wisc.edu> <7620dfc219e016.55bf0160@wiscmail.wisc.edu> <7620af8f19a909.55bf019e@wiscmail.wisc.edu> <7690986219ed18.55bf01dc@wiscmail.wisc.edu> <7690bc7419a786.55bf0219@wiscmail.wisc.edu> <7690ea4e19b283.55bf0258@wiscmail.wisc.edu> <76008e80199c29.55bf0296@wiscmail.wisc.edu> <75e0cd8219e481.55bf034b@wiscmail.wisc.edu> <75e0a7e7199aa0.55bf03c5@wiscmail.wisc.edu> <75e08cd219963d.55bebdb2@wiscmail.wisc.edu> Message-ID: Hi, On Mon, Aug 3, 2015 at 3:13 PM, Gregory Lee wrote: > I agree that often you don't need to worry about the memory order. However, > it is not uncommon in medical imaging to go back and forth between a 2D or > 3D image representation and a 1D array representation (e.g. as often used in > image reconstruction algorithms). I found that the main time it was > necessary to pay careful attention to the memory layout was when converting > Matlab scripts that involve reshaping operations. Yes, good point. 
A typical example would be this kind of thing: # data is a 4D array with time / volume axis last data_2d = data.reshape((-1, data.shape[-1]) For MATLAB, the columns of this array would (by default) have the values on the first axis fastest changing, then the second, then the third, whereas numpy's default is the other way round. I find I usually don't have to worry about this, because I'm later going to do: data_processed_4d = data_2d.reshape(data.shape) which will reverse the previous reshape in the correct way. But in any case - this is not directly to do with the array memory layout. You will get the same output from reshape whether the memory layout of `data` was Fortran or C. Cheers, Matthew From sebastian at sipsolutions.net Mon Aug 3 10:53:23 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 3 Aug 2015 14:53:23 +0000 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <7610d3f519bbd6.55bec8c3@wiscmail.wisc.edu> <762089a419b1da.55bec901@wiscmail.wisc.edu> <7690e69c19c6f5.55bec93e@wiscmail.wisc.edu> <74508bf519b933.55bec97c@wiscmail.wisc.edu> <7600e48b19abd0.55bec9b9@wiscmail.wisc.edu> <76109628198969.55bec9f7@wiscmail.wisc.edu> <7620c52619dbc7.55beca36@wiscmail.wisc.edu> <7740b86c19a9af.55beca74@wiscmail.wisc.edu> <74b0c96f19dd2e.55becab2@wiscmail.wisc.edu> <7720e87c19fe7f.55becaf0@wiscmail.wisc.edu> <76b0c5ac19a331.55becb2e@wiscmail.wisc.edu> <7620b4de199872.55becb6d@wiscmail.wisc.edu> <74b0bbd319feec.55becbab@wiscmail.wisc.edu> <7620d52d19bee2.55becbeb@wiscmail.wisc.edu> <76b0e43919dfdb.55becc29@wiscmail.wisc.edu> <75e0a7dc19bdab.55becc68@wiscmail.wisc.edu> <75e0c028198583.55becca6@wiscmail.wisc.edu> <74b0a09d19c79c.55becce4@wiscmail.wisc.edu> <76b0808619d856.55becd22@wiscmail.wisc.edu> <7610891d1992e0.55becd5f@wiscmail.wisc.edu> <7740b16819811d.55becd9d@wiscmail.wisc.edu> <75e0d05a19a210.55becddb@wiscmail.wisc.edu> <7620f86d19cdfb.55bece19@wiscmail.wisc.edu> <74b0e26619d88e.55bece57@wiscmail.wisc.edu> <7450aa2219d85b.55bece94@wiscmail.wisc.edu> <74b0c506198959.55beced2@wiscmail.wisc.edu> <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <7790f915199428.55befb9a@wiscmail.wisc.edu> <7790906919a1f1.55befbd8@wiscmail.wisc.edu> <7790852f19d571.55befc17@wiscmail.wisc.edu> <77209c0219e46d.55befc54@wiscmail.wisc.edu> <7720f14319db76.55befc92@wiscmail.wisc.edu> <7740cea919c20b.55befcd0@wiscmail.wisc.edu> <75e091da1983c6.55befd0d@wiscmail.wisc.edu> <75e092e119d0d4.55befd4c@wiscmail.wisc.edu> <74508dfa19c665.55befd89@wiscmail.wisc.edu> <779098271984ef.55befdc7@wiscmail.wisc.edu> <7790a0c619efcf.55befe05@wiscmail.wisc.edu> <7790e85019cb3c.55befe43@wiscmail.wisc.edu> <774081cb19c91f.55befe81@wiscmail.wisc.edu> <7740fc9119942d.55befebf@wiscmail.wisc.edu> <7740800619f499.55befefc@wiscmail.wisc.edu> <77408ff719acd0.55beffef@wiscmail.wisc.edu> <7720f55919b92b.55bf0068@wiscmail.wisc.edu> <7720c87719822d.55bf00a6@wiscmail.wisc.edu> <7720ae66199b7c.55bf00e4@wiscmail.wisc.edu> <7720bc3819aa29.55bf0122@wiscmail.wisc.edu> <7620dfc219e016.55bf0160@wiscmail.wisc.edu> <7620af8f19a909.55bf019e@wiscmail.wisc.edu> <7690986219ed18.55bf01dc@wiscmail.wisc.edu> <7690bc7419a786.55bf0219@wiscmail.wisc.edu> <7690ea4e19b283.55bf0258@wiscmail.wisc.edu> <76008e80199c29.55bf0296@wiscmail.wisc.edu> <75e0cd8219e481.55bf034b@wiscmail.wisc.edu> <75e0a7e7199aa0.55bf03c5@wiscmail.wisc.edu> <75e08cd219963d.55bebdb2@wiscmail.wisc.edu> Message-ID: On Mon Aug 3 16:26:10 2015 GMT+0200, Matthew Brett wrote: > Hi, > > On Mon, Aug 3, 2015 at 
3:13 PM, Gregory Lee wrote: > > I agree that often you don't need to worry about the memory order. However, > > it is not uncommon in medical imaging to go back and forth between a 2D or > > 3D image representation and a 1D array representation (e.g. as often used in > > image reconstruction algorithms). I found that the main time it was > > necessary to pay careful attention to the memory layout was when converting > > Matlab scripts that involve reshaping operations. > > Yes, good point. A typical example would be this kind of thing: > > # data is a 4D array with time / volume axis last > data_2d = data.reshape((-1, data.shape[-1]) > > For MATLAB, the columns of this array would (by default) have the > values on the first axis fastest changing, then the second, then the > third, whereas numpy's default is the other way round. > > I find I usually don't have to worry about this, because I'm later going to do: > > data_processed_4d = data_2d.reshape(data.shape) > > which will reverse the previous reshape in the correct way. > > But in any case - this is not directly to do with the array memory > layout. You will get the same output from reshape whether the memory > layout of `data` was Fortran or C. > Just as a remark. Reshape has an (iteration not really memory) order parameter, thou it may do more copies if those do not match. - Sebastian > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From sturla.molden at gmail.com Mon Aug 3 10:55:04 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 3 Aug 2015 14:55:04 +0000 (UTC) Subject: [Numpy-discussion] Change default order to Fortran order References: <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> Message-ID: <1809670063460305340.813019sturla.molden-gmail.com@news.gmane.org> Juan Nunez-Iglesias wrote: > The short version is that you'll save yourself a lot of pain by starting to > think of your images as (plane, row, column) instead of (x, y, z). There are several things to consider here. 1. The vertices in computer graphics (OpenGL) are (x,y,z). 2. OpenGL rotation matrices and projection matrice are stored in column major order. 3. OpenGL frame buffers are indexed (x,y) in column major order with (0,0) being lower left. 4. ITK and VTK depends on OpenGL and are thus using column major order. 5. Those who use Matlab or Fortran in addition to Python prefer column major order. 6. BLAS and LAPACK use column major order. 7. The common notation in image prorcessing (as opposed to computer graphics in geberal) is indexing (row, column), in row major order, with (0,0) being upper left. All in all, this is a strong case for prefering column major order and the common mathematical notation (x,y,z). Also notice how the ususal notation in image pricessing differs from OpenGL. Sturla From c99.smruti at gmail.com Mon Aug 3 11:00:27 2015 From: c99.smruti at gmail.com (SMRUTI RANJAN SAHOO) Date: Mon, 3 Aug 2015 20:30:27 +0530 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <7740864542dd.55bddb1a@wiscmail.wisc.edu> References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> Message-ID: well its really great idea. i can help on python but i don't have any knowledge on fortran. On Sun, Aug 2, 2015 at 7:25 PM, Kang Wang wrote: > Hi, > > I am an imaging researcher, and a new Python user. 
My first Python project > is to somehow modify NumPy source code such that everything is Fortran > column-major by default. > > I read about the information in the link below, but for us, the fact is > that *we absolutely want to use Fortran column major, and we want to > make it default. Explicitly writing " order = 'F' " all over the place is > not acceptable to us*. > > http://docs.scipy.org/doc/numpy/reference/internals.html#multidimensional-array-indexing-order-issues > > I tried searching in this email list, as well as google search in general. > However, I have not found anything useful. This must be a common > request/need, I believe. > > Can anyone provide any insight/help? > > Thank you very much, > > Kang > > -- > *Kang Wang, Ph.D.* > 1111 Highland Ave., Room 1113 > Madison, WI 53705-2275 > ---------------------------------------- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Aug 3 11:16:02 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 3 Aug 2015 16:16:02 +0100 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <1809670063460305340.813019sturla.molden-gmail.com@news.gmane.org> References: <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <1809670063460305340.813019sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Aug 3, 2015 at 3:55 PM, Sturla Molden wrote: > Juan Nunez-Iglesias wrote: > >> The short version is that you'll save yourself a lot of pain by starting to >> think of your images as (plane, row, column) instead of (x, y, z). > > There are several things to consider here. > > 1. The vertices in computer graphics (OpenGL) are (x,y,z). > > 2. OpenGL rotation matrices and projection matrice are stored in column > major order. > > 3. OpenGL frame buffers are indexed (x,y) in column major order with (0,0) > being lower left. > > 4. ITK and VTK depends on OpenGL and are thus using column major order. > > 5. Those who use Matlab or Fortran in addition to Python prefer column > major order. > > 6. BLAS and LAPACK use column major order. > > 7. The common notation in image prorcessing (as opposed to computer > graphics in geberal) is indexing (row, column), in row major order, with > (0,0) being upper left. > > All in all, this is a strong case for prefering column major order and the > common mathematical notation (x,y,z). > > Also notice how the ususal notation in image pricessing differs from > OpenGL. Sure, but to avoid confusion, maybe move the discussion of image indexing order to another thread? I think this thread is about memory layout, which is a different issue. Cheers, Matthew From sturla.molden at gmail.com Mon Aug 3 11:42:03 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 3 Aug 2015 15:42:03 +0000 (UTC) Subject: [Numpy-discussion] Change default order to Fortran order References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> Message-ID: <111980011460308224.682392sturla.molden-gmail.com@news.gmane.org> SMRUTI RANJAN SAHOO wrote: > well its really great idea. i can help on python but i don't have any > knowledge on fortran. I have been thinking in these lines too. But I have always thought it would be too much work for very little in return, and it might not interop properly with libraries written for NumPy (though PEP3118 might have changed that). 
I am not sure using Fortran in addition to Cython is a good idea, but it might be. At least if we limit the number of dimenstions to, say, 4 or less, ot would be easy to implement most of the code in vectorized Fortran. Sturla From sturla.molden at gmail.com Mon Aug 3 12:01:09 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 3 Aug 2015 16:01:09 +0000 (UTC) Subject: [Numpy-discussion] Change default order to Fortran order References: <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <1809670063460305340.813019sturla.molden-gmail.com@news.gmane.org> Message-ID: <240594636460309376.989533sturla.molden-gmail.com@news.gmane.org> Matthew Brett wrote: > Sure, but to avoid confusion, maybe move the discussion of image > indexing order to another thread? > > I think this thread is about memory layout, which is a different issue. It is actually a bit convoluted and not completely orthogonal. Memory layout does not matter for 2d ndexing, i.e. (x,y) vs. (row, column), if you are careful when iterating, but it does matter for Nd indexing. There is a reason to prefer (x,y,z,t,r) in column major order or (recording, time, slice, row, column) in row major order. Otherwise you can get very inefficient memory traversals. Then if you work with visualization libraries that expects (x,y,z) and column major order, e.g. ITK, VTK and OpenGL, this is really what you want to use. And the choise of indexing (x,y,z) cannot be seen as independent of the memory layout. Remember, it is not just a matter of mapping coordinates to pixels. The data sets are so large in MRI processing that memory layout does matter. Sturla From matthew.brett at gmail.com Mon Aug 3 12:24:51 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 3 Aug 2015 17:24:51 +0100 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: <240594636460309376.989533sturla.molden-gmail.com@news.gmane.org> References: <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <1809670063460305340.813019sturla.molden-gmail.com@news.gmane.org> <240594636460309376.989533sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Aug 3, 2015 at 5:01 PM, Sturla Molden wrote: > Matthew Brett wrote: > >> Sure, but to avoid confusion, maybe move the discussion of image >> indexing order to another thread? >> >> I think this thread is about memory layout, which is a different issue. > > It is actually a bit convoluted and not completely orthogonal. Memory > layout does not matter for 2d ndexing, i.e. (x,y) vs. (row, column), if you > are careful when iterating, but it does matter for Nd indexing. There is a > reason to prefer (x,y,z,t,r) in column major order or (recording, time, > slice, row, column) in row major order. Otherwise you can get very > inefficient memory traversals. Then if you work with visualization > libraries that expects (x,y,z) and column major order, e.g. ITK, VTK and > OpenGL, this is really what you want to use. And the choise of indexing > (x,y,z) cannot be seen as independent of the memory layout. Remember, it is > not just a matter of mapping coordinates to pixels. The data sets are so > large in MRI processing that memory layout does matter. I completely agree that memory layout can affect performance, and this can be important. On the other hand, I think you agree that the relationship of axis order to memory layout is just a matter of mapping coordinates to pixels. So you can change the axis ordering without changing the memory layout and the memory layout without changing the axis ordering. 
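Concretely, a small sketch of those two independent changes (the shapes are arbitrary):

import numpy as np

a = np.zeros((3, 4, 5))                    # C-order by default

b = a.transpose(2, 1, 0)                   # axis order changes, memory is untouched (a view)
print(b.shape, b.base is a)                # (5, 4, 3) True

c = np.asfortranarray(a)                   # memory layout changes, indexing order stays the same
print(c.shape, c.flags['F_CONTIGUOUS'])    # (3, 4, 5) True
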
Of course you could argue that it would be simpler to fuse the two issues, and enforce one memory layout - say Fortran. The result might well be easier think about, but it wouldn't be much like numpy, and it would have lots of performance and memory disadvantages. Cheers, Matthew From chris.barker at noaa.gov Mon Aug 3 12:25:22 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 3 Aug 2015 09:25:22 -0700 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> Message-ID: On Sun, Aug 2, 2015 at 5:13 AM, Sturla Molden wrote: > > A long is only machine word wide on posix, in windows its not. > > Actually it is the opposite. A pointer is 64 bit on AMD64, but the > native integer and pointer offset is only 32 bit. But it does not matter > because it is int that should be machine word sized, not long, which it > is on both platforms. > All this illustrates that there is a lot of platform independence and complexity to the "standard" C types. I suppose it's a good thing -- you can use something like "int" in C code, and presto! more precision in the future when you re-compile on a newer system. However, for any code that needs some kind of binary compatibility between systems (or is dynamic, like python -- i.e. types are declared at run-time, not compile time), the "fixed width types are a lot safer (or at least easier to reason about). So we have tow issue with numpy: 1) confusing python types with C types -- e.g. np.int is currently a python integer, NOT a C int -- I think this is a litte too confusing, and should be depricated. (and np.long -- even more confusing!!!) 2) The vagaries of the standard C types: int, long, etc (spelled np.intc, which is a int32 on my machine, anyway) [NOTE: is there a C long dtype? I can't find it at the moment...] It's probably a good idea to keep these, particularly for interfacing with C code (like my example of calling C code that use int). Though it would be good to make sure the docstring make it clear what they are. However, I"d like to see a recommended practice of using sized types wherevver you can: uint8 int32 float32 float54 etc.... not sure how to propagate that practice, but I'd love to see it become common. Should we add aliases for the stdint names? np.int_32_t, etc??? might be good to adhere to an established standard. -CHB > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris.barker at noaa.gov Mon Aug 3 12:30:06 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 3 Aug 2015 09:30:06 -0700 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <7740864542dd.55bddb1a@wiscmail.wisc.edu> <74b0de5619c9ee.55be77b0@wiscmail.wisc.edu> <77208f8c19abe0.55be77ed@wiscmail.wisc.edu> <7740c3e01988b3.55be782b@wiscmail.wisc.edu> <7740821519faf2.55be7868@wiscmail.wisc.edu> <74b0aa0e19ff5a.55be78a5@wiscmail.wisc.edu> <7610d2e119e175.55be78e3@wiscmail.wisc.edu> <76b0a897198bde.55be7921@wiscmail.wisc.edu> <7450811719dc7d.55be795e@wiscmail.wisc.edu> <75e0cf9619906d.55be799c@wiscmail.wisc.edu> <772087c3198ed4.55be79da@wiscmail.wisc.edu> <7720ae8f19c5b1.55be33c7@wiscmail.wisc.edu> <83FD48B4-2CF1-4D9E-AED4-FA0F17A07722@continuum.io> Message-ID: On Sun, Aug 2, 2015 at 1:46 PM, Sturla Molden wrote: > On 02/08/15 22:28, Bryan Van de Ven wrote: > > And to eliminate the order kwarg, use functools.partial to patch the > zeros function (or any others, as needed): > > This will probably break code that depends on NumPy, like SciPy and > scikit-image. But if NumPy is all that matters, sure go ahead and monkey > patch. Otherwise keep the patched functions in another namespace. > I"d be really careful about this -- sure it's annoying, but a kind of global change of behavior could wreak havok. I'd create a set of Fortran-order constructors -- if it were me, I do: fzeros fones, etc..... but you could, I suppose, create a namespace and ut hem all there, then create a fnumpy that would write those over: import numpy as np and away you go -- but that wouldn't change any code that imports numpy in the usual way. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Mon Aug 3 13:03:58 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Mon, 3 Aug 2015 13:03:58 -0400 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> Message-ID: <55BF9EFE.4030503@gmail.com> On 08/03/2015 12:25 PM, Chris Barker wrote: > 2) The vagaries of the standard C types: int, long, etc (spelled > np.intc, which is a int32 on my machine, anyway) > [NOTE: is there a C long dtype? I can't find it at the moment...] Numpy does define "the platform dependent C integer types short, long, longlong and their unsigned versions" according to the docs. size_t is the same size as intc. Even though float and double are virtually always IEEE single and double precision, maybe for consistency we should also define np.floatc, np.doublec and np.longdoublec? 
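Those platform-dependent aliases are easy to inspect directly; the itemsizes printed
below are whatever the local build reports, and they differ between platforms, which is
rather the point of the thread:

import numpy as np

for name in ('short', 'intc', 'int_', 'longlong', 'intp',
             'uint8', 'int32', 'int64', 'float32', 'float64'):
    dt = np.dtype(getattr(np, name))
    print(name, dt, dt.itemsize)    # e.g. intc -> int32 (4 bytes) on most current platforms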
Allan From charlesr.harris at gmail.com Mon Aug 3 13:11:02 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Aug 2015 11:11:02 -0600 Subject: [Numpy-discussion] Change default order to Fortran order In-Reply-To: References: <74b0e2ba198afc.55be88a1@wiscmail.wisc.edu> <1809670063460305340.813019sturla.molden-gmail.com@news.gmane.org> <240594636460309376.989533sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Aug 3, 2015 at 10:24 AM, Matthew Brett wrote: > On Mon, Aug 3, 2015 at 5:01 PM, Sturla Molden > wrote: > > Matthew Brett wrote: > > > >> Sure, but to avoid confusion, maybe move the discussion of image > >> indexing order to another thread? > >> > >> I think this thread is about memory layout, which is a different issue. > > > > It is actually a bit convoluted and not completely orthogonal. Memory > > layout does not matter for 2d ndexing, i.e. (x,y) vs. (row, column), if > you > > are careful when iterating, but it does matter for Nd indexing. There is > a > > reason to prefer (x,y,z,t,r) in column major order or (recording, time, > > slice, row, column) in row major order. Otherwise you can get very > > inefficient memory traversals. Then if you work with visualization > > libraries that expects (x,y,z) and column major order, e.g. ITK, VTK and > > OpenGL, this is really what you want to use. And the choise of indexing > > (x,y,z) cannot be seen as independent of the memory layout. Remember, it > is > > not just a matter of mapping coordinates to pixels. The data sets are so > > large in MRI processing that memory layout does matter. > > I completely agree that memory layout can affect performance, and this > can be important. > > On the other hand, I think you agree that the relationship of axis > order to memory layout is just a matter of mapping coordinates to > pixels. > > So you can change the axis ordering without changing the memory layout > and the memory layout without changing the axis ordering. > > Of course you could argue that it would be simpler to fuse the two > issues, and enforce one memory layout - say Fortran. The result > might well be easier think about, but it wouldn't be much like numpy, > and it would have lots of performance and memory disadvantages. > > Cheers, > I would also strongly suggest that once you have decided on a convention that you thoroughly document it somewhere. That will not only help you, but anyone who later needs to maintain the code will bless you rather than d*nm you to eternal torture by the seven demons of Ipsos. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Aug 3 14:05:57 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 03 Aug 2015 20:05:57 +0200 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> Message-ID: On 03/08/15 18:25, Chris Barker wrote: > 2) The vagaries of the standard C types: int, long, etc (spelled > np.intc, which is a int32 on my machine, anyway) > [NOTE: is there a C long dtype? I can't find it at the moment...] There is, it is called np.int. This just illustrates the problem... Sturla From chris.barker at noaa.gov Mon Aug 3 14:51:27 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 3 Aug 2015 11:51:27 -0700 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? 
In-Reply-To: References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> Message-ID: On Mon, Aug 3, 2015 at 11:05 AM, Sturla Molden wrote: > On 03/08/15 18:25, Chris Barker wrote: > > > [NOTE: is there a C long dtype? I can't find it at the moment...] > > There is, it is called np.int. well, IIUC, np.int is the python integer type, which is a C long in all the implemtations of cPython that I know about -- but is that a guarantee? in the future as well? For instance, if it were up to me, I'd use an int_64_t on all 64 bit platforms, rather than having that odd 32 bit on Windows, 64 bit on *nix silliness.... This just illustrates the problem... So another minor proposal: add a numpy.longc type, which would be platform C long. (and probably just an alias to something already there). -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Aug 3 15:32:47 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 03 Aug 2015 21:32:47 +0200 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> Message-ID: On 03/08/15 20:51, Chris Barker wrote: > well, IIUC, np.int is the python integer type, which is > a C long in all the implemtations of cPython that I know about -- but is > that a guarantee?in the future as well? It is a Python int on Python 2. On Python 3 dtype=np.int means the dtype will be C long, because a Python int has no size limit. But np.int aliases Python int. And creating an array with dype=int therefore does not create an array of Python int, it creates an array of C long. To actually get dtype=int we have to write dtype=object, which is just crazy. Sturla From antonio.valentino at tiscali.it Tue Aug 4 01:35:38 2015 From: antonio.valentino at tiscali.it (Antonio Valentino) Date: Tue, 4 Aug 2015 07:35:38 +0200 Subject: [Numpy-discussion] [pytables-dev] ANN: PyTables 3.2.1 released In-Reply-To: <40397670-4BC4-4272-A5D9-D24B3005F896@andreabedini.com> References: <40397670-4BC4-4272-A5D9-D24B3005F896@andreabedini.com> Message-ID: <05EA129D-2931-4D9C-84C6-665D5ED39962@tiscali.it> Good job. Thanks Andrea -- Antonio Valentino > Il giorno 04/ago/2015, alle ore 02:38, Andrea Bedini ha scritto: > > =========================== > Announcing PyTables 3.2.1 > =========================== > > We are happy to announce PyTables 3.2.1. > > > What's new > ========== > > This is a bug fix release. It contains a fix for a segv fault in > indexesextension.keysort(). > > In case you want to know more in detail what has changed in this > version, please refer to: http://www.pytables.org/release_notes.html > > For an online version of the manual, visit: > http://www.pytables.org/usersguide/index.html > > > What it is? > =========== > > PyTables is a library for managing hierarchical datasets and > designed to efficiently cope with extremely large amounts of data with > support for full 64-bit file addressing. PyTables runs on top of > the HDF5 library and NumPy package for achieving maximum throughput and > convenient use. PyTables includes OPSI, a new indexing technology, > allowing to perform data lookups in tables exceeding 10 gigarows > (10**10 rows) in less than a tenth of a second. 
> > > Resources > ========= > > About PyTables: http://www.pytables.org > > About the HDF5 library: http://hdfgroup.org/HDF5/ > > About NumPy: http://numpy.scipy.org/ > > > Acknowledgments > =============== > > Thanks to many users who provided feature improvements, patches, bug > reports, support and suggestions. See the ``THANKS`` file in the > distribution package for a (incomplete) list of contributors. Most > specially, a lot of kudos go to the HDF5 and NumPy makers. > Without them, PyTables simply would not exist. > > > Share your experience > ===================== > > Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. > > > ---- > > **Enjoy data!** > > -- The PyTables Developers > > > -- > You received this message because you are subscribed to the Google Groups "pytables-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an email to pytables-dev+unsubscribe at googlegroups.com. > Visit this group at http://groups.google.com/group/pytables-dev. > For more options, visit https://groups.google.com/d/optout. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 842 bytes Desc: Message signed with OpenPGP using GPGMail URL: From sebastian at sipsolutions.net Tue Aug 4 04:39:56 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 04 Aug 2015 10:39:56 +0200 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> Message-ID: <1438677596.2070.6.camel@sipsolutions.net> On Mo, 2015-08-03 at 21:32 +0200, Sturla Molden wrote: > On 03/08/15 20:51, Chris Barker wrote: > > > well, IIUC, np.int is the python integer type, which is > > a C long in all the implemtations of cPython that I know about -- but is > > that a guarantee?in the future as well? > > It is a Python int on Python 2. > > On Python 3 dtype=np.int means the dtype will be C long, because a > Python int has no size limit. But np.int aliases Python int. And > creating an array with dype=int therefore does not create an array of > Python int, it creates an array of C long. To actually get dtype=int we > have to write dtype=object, which is just crazy. > Since it seemes there may be a few half truths flying around in this thread. See http://docs.scipy.org/doc/numpy/user/basics.types.html and also note the sentence below the table (maybe the table should also note these): Additionally to intc the platform dependent C integer types short, long, longlong and their unsigned versions are defined. - Sebastian > > Sturla > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Tue Aug 4 04:41:49 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 04 Aug 2015 10:41:49 +0200 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? 
In-Reply-To: References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> Message-ID: <1438677709.2070.7.camel@sipsolutions.net> On Mo, 2015-08-03 at 21:32 +0200, Sturla Molden wrote: > On 03/08/15 20:51, Chris Barker wrote: > > > well, IIUC, np.int is the python integer type, which is > > a C long in all the implemtations of cPython that I know about -- but is > > that a guarantee?in the future as well? > > It is a Python int on Python 2. > > On Python 3 dtype=np.int means the dtype will be C long, because a > Python int has no size limit. But np.int aliases Python int. And > creating an array with dype=int therefore does not create an array of > Python int, it creates an array of C long. To actually get dtype=int we > have to write dtype=object, which is just crazy. > PS: I guess longdouble/complexlongdouble (and its floatXXX variants) are missing. And it might be a good place to note that floatXXX is not IEEE floatXXX. > > Sturla > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From josef.pktd at gmail.com Tue Aug 4 05:57:40 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 4 Aug 2015 05:57:40 -0400 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: <1438677596.2070.6.camel@sipsolutions.net> References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> <1438677596.2070.6.camel@sipsolutions.net> Message-ID: On Tue, Aug 4, 2015 at 4:39 AM, Sebastian Berg wrote: > On Mo, 2015-08-03 at 21:32 +0200, Sturla Molden wrote: > > On 03/08/15 20:51, Chris Barker wrote: > > > > > well, IIUC, np.int is the python integer type, which > is > > > a C long in all the implemtations of cPython that I know about -- but > is > > > that a guarantee?in the future as well? > > > > It is a Python int on Python 2. > > > > On Python 3 dtype=np.int means the dtype will be C long, because a > > Python int has no size limit. But np.int aliases Python int. And > > creating an array with dype=int therefore does not create an array of > > Python int, it creates an array of C long. To actually get dtype=int we > > have to write dtype=object, which is just crazy. > > > > Since it seemes there may be a few half truths flying around in this > thread. See http://docs.scipy.org/doc/numpy/user/basics.types.html Quote: "Note that, above, we use the *Python* float object as a dtype. NumPy knows that int refers to np.int_, bool meansnp.bool_, that float is np.float_ and complex is np.complex_. The other data-types do not have Python equivalents." Is there a conflict with the current thread? Josef (I'm not a C person, so most of this is outside my scope, except for watching bugfixes to make older code work for larger datasets. Use `intp`, Luke.) > > > and also note the sentence below the table (maybe the table should also > note these): > > Additionally to intc the platform dependent C integer types short, long, > longlong and their unsigned versions are defined. 
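A short session makes the distinction concrete (output shown for a 64-bit Linux build of
this era; Windows differs because its C long is 32-bit):

import numpy as np

print(np.dtype(int))        # int64 -- dtype=int maps to C long, not to Python's unlimited int
print(np.dtype(np.intc))    # int32 -- the C int
print(np.dtype(np.intp))    # int64 -- pointer-sized, the safe choice for indexing
print(np.array([1, 2, 3], dtype=object).dtype)   # object -- actual Python ints, as Sturla says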
> > - Sebastian > > > > > Sturla > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Aug 4 06:20:57 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 04 Aug 2015 12:20:57 +0200 Subject: [Numpy-discussion] Proposal: Deprecate np.int, np.float, etc.? In-Reply-To: References: <55B25F1A.70107@googlemail.com> <55BB2611.10003@googlemail.com> <1438677596.2070.6.camel@sipsolutions.net> Message-ID: <1438683657.2070.12.camel@sipsolutions.net> On Di, 2015-08-04 at 05:57 -0400, josef.pktd at gmail.com wrote: > > > On Tue, Aug 4, 2015 at 4:39 AM, Sebastian Berg > wrote: > On Mo, 2015-08-03 at 21:32 +0200, Sturla Molden wrote: > > On 03/08/15 20:51, Chris Barker wrote: > > > > > well, IIUC, np.int is the python integer > type, which is > > > a C long in all the implemtations of cPython that I know > about -- but is > > > that a guarantee?in the future as well? > > > > It is a Python int on Python 2. > > > > On Python 3 dtype=np.int means the dtype will be C long, > because a > > Python int has no size limit. But np.int aliases Python int. > And > > creating an array with dype=int therefore does not create an > array of > > Python int, it creates an array of C long. To actually get > dtype=int we > > have to write dtype=object, which is just crazy. > > > > Since it seemes there may be a few half truths flying around > in this > thread. See > http://docs.scipy.org/doc/numpy/user/basics.types.html > > > > > Quote: > > > "Note that, above, we use the Python float object as a dtype. NumPy > knows that int refers to np.int_, bool meansnp.bool_, > that float is np.float_ and complex is np.complex_. The other > data-types do not have Python equivalents." > > > Is there a conflict with the current thread? > No, but I had the impression that the C compatible type names "short", "cint", "long", etc. where forgotten. > > Josef > > (I'm not a C person, so most of this is outside my scope, except for > watching bugfixes to make older code work for larger datasets. Use > `intp`, Luke.) > > > > and also note the sentence below the table (maybe the table > should also > note these): > > Additionally to intc the platform dependent C integer types > short, long, > longlong and their unsigned versions are defined. > > - Sebastian > > > > > Sturla > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From scopatz at gmail.com Tue Aug 4 10:13:03 2015 From: scopatz at gmail.com (Anthony Scopatz) Date: Tue, 04 Aug 2015 14:13:03 +0000 Subject: [Numpy-discussion] [pytables-dev] ANN: PyTables 3.2.1 released In-Reply-To: <40397670-4BC4-4272-A5D9-D24B3005F896@andreabedini.com> References: <40397670-4BC4-4272-A5D9-D24B3005F896@andreabedini.com> Message-ID: Congrats! On Mon, Aug 3, 2015 at 7:38 PM Andrea Bedini wrote: > =========================== > Announcing PyTables 3.2.1 > =========================== > > We are happy to announce PyTables 3.2.1. > > > What's new > ========== > > This is a bug fix release. It contains a fix for a segv fault in > indexesextension.keysort(). > > In case you want to know more in detail what has changed in this > version, please refer to: http://www.pytables.org/release_notes.html > > For an online version of the manual, visit: > http://www.pytables.org/usersguide/index.html > > > What it is? > =========== > > PyTables is a library for managing hierarchical datasets and > designed to efficiently cope with extremely large amounts of data with > support for full 64-bit file addressing. PyTables runs on top of > the HDF5 library and NumPy package for achieving maximum throughput and > convenient use. PyTables includes OPSI, a new indexing technology, > allowing to perform data lookups in tables exceeding 10 gigarows > (10**10 rows) in less than a tenth of a second. > > > Resources > ========= > > About PyTables: http://www.pytables.org > > About the HDF5 library: http://hdfgroup.org/HDF5/ > > About NumPy: http://numpy.scipy.org/ > > > Acknowledgments > =============== > > Thanks to many users who provided feature improvements, patches, bug > reports, support and suggestions. See the ``THANKS`` file in the > distribution package for a (incomplete) list of contributors. Most > specially, a lot of kudos go to the HDF5 and NumPy makers. > Without them, PyTables simply would not exist. > > > Share your experience > ===================== > > Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. > > > ---- > > **Enjoy data!** > > -- The PyTables Developers > > > -- > You received this message because you are subscribed to the Google Groups > "pytables-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to pytables-dev+unsubscribe at googlegroups.com. > Visit this group at http://groups.google.com/group/pytables-dev. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Aug 6 18:17:08 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2015 16:17:08 -0600 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. Message-ID: Anyone know how to fix this? I've run into it before and never got it figured out. [192.168.121.189:22] out: File "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in query_vcvarsall [192.168.121.189:22] out: [192.168.121.189:22] out: raise DistutilsPlatformError("Unable to find vcvarsall.bat") [192.168.121.189:22] out: [192.168.121.189:22] out: distutils.errors.DistutilsPlatformError: Unable to find vcvarsall.bat Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournape at gmail.com Thu Aug 6 18:22:47 2015 From: cournape at gmail.com (David Cournapeau) Date: Thu, 6 Aug 2015 23:22:47 +0100 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: Sorry if that's obvious, but do you have Visual Studio 2010 installed ? On Thu, Aug 6, 2015 at 11:17 PM, Charles R Harris wrote: > Anyone know how to fix this? I've run into it before and never got it > figured out. > > [192.168.121.189:22] out: File > "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in query_vcvarsall > [192.168.121.189:22] out: > [192.168.121.189:22] out: raise DistutilsPlatformError("Unable to > find vcvarsall.bat") > [192.168.121.189:22] out: > [192.168.121.189:22] out: distutils.errors.DistutilsPlatformError: Unable > to find vcvarsall.bat > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Aug 6 19:11:01 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2015 17:11:01 -0600 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: On Thu, Aug 6, 2015 at 4:22 PM, David Cournapeau wrote: > Sorry if that's obvious, but do you have Visual Studio 2010 installed ? > > On Thu, Aug 6, 2015 at 11:17 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Anyone know how to fix this? I've run into it before and never got it >> figured out. >> >> [192.168.121.189:22] out: File >> "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in query_vcvarsall >> [192.168.121.189:22] out: >> [192.168.121.189:22] out: raise DistutilsPlatformError("Unable to >> find vcvarsall.bat") >> [192.168.121.189:22] out: >> [192.168.121.189:22] out: distutils.errors.DistutilsPlatformError: >> Unable to find vcvarsall.bat >> >> Chuck >> >> >> I'm running numpy-vendor, which is running wine. I think it is all mingw with a few installed dll's. The error is coming from the Python distutils as part of `has_cblas`. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Aug 6 19:19:07 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2015 17:19:07 -0600 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: On Thu, Aug 6, 2015 at 5:11 PM, Charles R Harris wrote: > > > On Thu, Aug 6, 2015 at 4:22 PM, David Cournapeau > wrote: > >> Sorry if that's obvious, but do you have Visual Studio 2010 installed ? >> >> On Thu, Aug 6, 2015 at 11:17 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> Anyone know how to fix this? I've run into it before and never got it >>> figured out. >>> >>> [192.168.121.189:22] out: File >>> "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in query_vcvarsall >>> [192.168.121.189:22] out: >>> [192.168.121.189:22] out: raise DistutilsPlatformError("Unable to >>> find vcvarsall.bat") >>> [192.168.121.189:22] out: >>> [192.168.121.189:22] out: distutils.errors.DistutilsPlatformError: >>> Unable to find vcvarsall.bat >>> >>> Chuck >>> >>> >>> > I'm running numpy-vendor, which is running wine. I think it is all mingw > with a few installed dll's. The error is coming from the Python distutils > as part of `has_cblas`. 
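A hedged sketch of why that happens: distutils' new_compiler() called with no arguments
hands back the platform default (MSVC on Windows), so a feature probe compiled through it
goes hunting for vcvarsall.bat even when the rest of the build was told to use mingw32.
The selectable names, including 'mingw32', are listed in distutils itself:

import distutils.ccompiler as ccompiler

cc = ccompiler.new_compiler()            # platform default: UnixCCompiler on Linux, MSVCCompiler on Windows
print(type(cc).__name__)
print(sorted(ccompiler.compiler_class))  # 'mingw32' is one of the names that can be passed through

Passing the chosen compiler name into that probe, rather than relying on the default, is
presumably the direction any fix would take.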
> > It's not impossible that we have changed the build somewhere along the line. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Aug 6 20:44:02 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Aug 2015 18:44:02 -0600 Subject: [Numpy-discussion] numpy-vendor cythonize problem Message-ID: I note that current numpy-vendor fails to cythonize in windows builds. Cython is installed, but I assume it needs to also be installed in each of the python versions in wine. Because the need to cythonize was already present in 1.9, I assume that the problem has been solved but the solution is not present in numpy-vendor in the numpy repos. Julian, do you have a solution for that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Thu Aug 6 22:59:15 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Thu, 06 Aug 2015 22:59:15 -0400 Subject: [Numpy-discussion] improving structured array assignment Message-ID: <55C41F03.2060706@gmail.com> Hello all, I've written up a tentative PR which tidies up structured array assignment, https://github.com/numpy/numpy/pull/6053 It has a backward incompatible change which I'd especially like to get some feedback on: Structure assignment now always works "by field position" instead of "by field name". Consider the following assignment: >>> v1 = np.array([(1,2,3)], ... dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'i4')]) >>> v2 = np.array([(4,5,6)], ... dtype=[('b', 'i4'), ('a', 'i4'), ('c', 'i4')]) >>> v1[:] = v2 Previously, v1 would be set to "(5,4,6)" but with the PR it is set to "(4,5,6)". This might seem like negligible improvement, but assignment "by field name" has lots of inconsistent/broken edge cases which I've listed in the PR, which disappear with assignment "by field position". The PR doesn't seem to break much of anything in scipy, pandas, and astropy. If possible, I'd like to try getting a deprecation warning for this change into 1.10. I also changed a few more minor things about structure assignment, expanded the docs on structured arrays, and made a multi-field index (arr[['f1', 'f0']]) return a view instead of a copy, which had been planned for 1.10 but didn't get in because of the strange behavior of structure assignment. Allan From ralf.gommers at gmail.com Fri Aug 7 02:38:56 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 7 Aug 2015 08:38:56 +0200 Subject: [Numpy-discussion] numpy-vendor cythonize problem In-Reply-To: References: Message-ID: On Fri, Aug 7, 2015 at 2:44 AM, Charles R Harris wrote: > I note that current numpy-vendor fails to cythonize in windows builds. > Cython is installed, but I assume it needs to also be installed in each of > the python versions in wine. Because the need to cythonize was already > present in 1.9, I assume that the problem has been solved but the solution > is not present in numpy-vendor in the numpy repos. > It's easy to work around by running cythonize in the Linux env you're using numpy-vendor in, then the files don't need to be generated in the Windows build. I think I've done that before. You only need one Windows Cython installed if the "cython" script is found. If not, you go to this except clause which indeed needs a Cython for every Python version: https://github.com/numpy/numpy/commit/dd220014373f A change similar to the f2py fix in https://github.com/numpy/numpy/commit/dd220014373f will likely fix it. 
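Back on the structured-assignment proposal: code that has to behave identically before
and after such a change can spell the by-name copy out, since an explicit field loop is
unambiguous under either rule (the values below are just Allan's example data):

import numpy as np

v1 = np.array([(1, 2, 3)], dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'i4')])
v2 = np.array([(4, 5, 6)], dtype=[('b', 'i4'), ('a', 'i4'), ('c', 'i4')])

for name in v1.dtype.names:   # copy field by field, matching on the name
    v1[name] = v2[name]

print(v1)                     # [(5, 4, 6)] -- by-name semantics, whichever version is installed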
Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Fri Aug 7 03:44:51 2015 From: srean.list at gmail.com (srean) Date: Fri, 7 Aug 2015 13:14:51 +0530 Subject: [Numpy-discussion] Shared memory check on in-place modification. In-Reply-To: References: Message-ID: Wait, when assignments and slicing mix wasn't the behavior supposed to be equivalent to copying the RHS to a temporary and then assigning using the temporary. Is that a false memory ? Or has the behavior changed ? As long as the behavior is well defined and succinct it should be ok On Tuesday, July 28, 2015, Sebastian Berg wrote: > > On Mon Jul 27 22:51:52 2015 GMT+0200, Sturla Molden wrote: > > On 27/07/15 22:10, Anton Akhmerov wrote: > > > Hi everyone, > > > > > > I have encountered an initially rather confusing problem in a piece of > > > code that attempted to symmetrize a matrix: `h += h.T` > > > The problem of course appears due to `h.T` being a view of `h`, and > > > some elements being overwritten during the __iadd__ call. > > > > I think the typical proposal is to raise a warning. Note there is > np.may_share_memoty. But the logic to give the warning is possibly not > quite easy, since this is ok to use sometimes. If someone figures it out > (mostly) I would be very happy zo see such warnings. > > > > Here is another example > > > > >>> a = np.ones(10) > > >>> a[1:] += a[:-1] > > >>> a > > array([ 1., 2., 3., 2., 3., 2., 3., 2., 3., 2.]) > > > > I am not sure I totally dislike this behavior. If it could be made > > constent it could be used to vectorize recursive algorithms. In the case > > above I would prefer the output to be: > > > > array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]) > > > > It does not happen because we do not enforce that the result of one > > operation is stored before the next two operands are read. The only way > > to speed up recursive equations today is to use compiled code. > > > > > > Sturla > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Aug 7 04:08:48 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 07 Aug 2015 10:08:48 +0200 Subject: [Numpy-discussion] Shared memory check on in-place modification. In-Reply-To: References: Message-ID: <1438934928.18578.87.camel@sipsolutions.net> On Fr, 2015-08-07 at 13:14 +0530, srean wrote: > Wait, when assignments and slicing mix wasn't the behavior supposed to > be equivalent to copying the RHS to a temporary and then assigning > using the temporary. Is that a false memory ? Or has the behavior > changed ? As long as the behavior is well defined and succinct it > should be ok > No, NumPy has never done that as far as I know. And since SIMD instructions etc. make this even less predictable (you used to be able to abuse in-place logic, even if usually the same can be done with ufunc.accumulate so it was a bad idea anyway), you have to avoid it. Pauli is working currently on implementing the logic needed to find if such a copy is necessary [1] which is very cool indeed. So I think it is likely we will such copy logic in NumPy 1.11. 
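Putting the hazard and the unambiguous spellings side by side (small arrays, purely for
illustration):

import numpy as np

h = np.arange(9.0).reshape(3, 3)

risky = h.copy()
risky += risky.T                   # RHS is a view of the LHS; elements may be read after being overwritten

safe = h.copy()
safe = safe + safe.T               # RHS evaluated into a fresh array before the assignment
print(np.allclose(safe, safe.T))   # True: a correctly symmetrized matrix

a = np.ones(10)
a[1:] += a[:-1]                    # overlapping views: not the cumulative sum
print(a)
print(np.cumsum(np.ones(10)))      # the recurrence written out explicitly: 1, 2, ..., 10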
- Sebastian [1] See https://github.com/numpy/numpy/pull/6166 it is not an easy problem. > On Tuesday, July 28, 2015, Sebastian Berg > wrote: > > > > On Mon Jul 27 22:51:52 2015 GMT+0200, Sturla Molden wrote: > > On 27/07/15 22:10, Anton Akhmerov wrote: > > > Hi everyone, > > > > > > I have encountered an initially rather confusing problem > in a piece of > > > code that attempted to symmetrize a matrix: `h += h.T` > > > The problem of course appears due to `h.T` being a view of > `h`, and > > > some elements being overwritten during the __iadd__ call. > > > > I think the typical proposal is to raise a warning. Note there > is np.may_share_memoty. But the logic to give the warning is > possibly not quite easy, since this is ok to use sometimes. If > someone figures it out (mostly) I would be very happy zo see > such warnings. > > > > Here is another example > > > > >>> a = np.ones(10) > > >>> a[1:] += a[:-1] > > >>> a > > array([ 1., 2., 3., 2., 3., 2., 3., 2., 3., 2.]) > > > > I am not sure I totally dislike this behavior. If it could > be made > > constent it could be used to vectorize recursive algorithms. > In the case > > above I would prefer the output to be: > > > > array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]) > > > > It does not happen because we do not enforce that the result > of one > > operation is stored before the next two operands are read. > The only way > > to speed up recursive equations today is to use compiled > code. > > > > > > Sturla > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From cournape at gmail.com Fri Aug 7 05:33:21 2015 From: cournape at gmail.com (David Cournapeau) Date: Fri, 7 Aug 2015 10:33:21 +0100 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: Which command exactly did you run to have that error ? Normally, the code in msvc9compiler should not be called if you call the setup.py with the mingw compiler as expected by distutils On Fri, Aug 7, 2015 at 12:19 AM, Charles R Harris wrote: > > > On Thu, Aug 6, 2015 at 5:11 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Thu, Aug 6, 2015 at 4:22 PM, David Cournapeau >> wrote: >> >>> Sorry if that's obvious, but do you have Visual Studio 2010 installed ? >>> >>> On Thu, Aug 6, 2015 at 11:17 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> Anyone know how to fix this? I've run into it before and never got it >>>> figured out. 
>>>> >>>> [192.168.121.189:22] out: File >>>> "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in query_vcvarsall >>>> [192.168.121.189:22] out: >>>> [192.168.121.189:22] out: raise DistutilsPlatformError("Unable to >>>> find vcvarsall.bat") >>>> [192.168.121.189:22] out: >>>> [192.168.121.189:22] out: distutils.errors.DistutilsPlatformError: >>>> Unable to find vcvarsall.bat >>>> >>>> Chuck >>>> >>>> >>>> >> I'm running numpy-vendor, which is running wine. I think it is all mingw >> with a few installed dll's. The error is coming from the Python distutils >> as part of `has_cblas`. >> >> > It's not impossible that we have changed the build somewhere along the > line. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Fri Aug 7 06:18:27 2015 From: srean.list at gmail.com (srean) Date: Fri, 7 Aug 2015 15:48:27 +0530 Subject: [Numpy-discussion] Shared memory check on in-place modification. In-Reply-To: <1438934928.18578.87.camel@sipsolutions.net> References: <1438934928.18578.87.camel@sipsolutions.net> Message-ID: I got_misled_by (extrapolated erroneously from) this description of temporaries in the documentation http://docs.scipy.org/doc/numpy/user/basics.indexing.html#assigning-values-to-indexed-arrays ,,,])]" ... new array is extracted from the original (as a temporary) containing the values at 1, 1, 3, 1, then the value 1 is added to the temporary, and then the temporary is assigned back to the original array. Thus the value of the array at x[1]+1 is assigned to x[1] three times, rather than being incremented 3 times." It is talking about a slightly different scenario of course, the temporary corresponds to the LHS. Anyhow, as long as the behavior is defined rigorously it should not be a problem. Now, I vaguely remember abusing ufuncs and aliasing in interactive sessions for some weird cumsum like operations (I plead bashfully guilty). On Fri, Aug 7, 2015 at 1:38 PM, Sebastian Berg wrote: > On Fr, 2015-08-07 at 13:14 +0530, srean wrote: > > Wait, when assignments and slicing mix wasn't the behavior supposed to > > be equivalent to copying the RHS to a temporary and then assigning > > using the temporary. Is that a false memory ? Or has the behavior > > changed ? As long as the behavior is well defined and succinct it > > should be ok > > > > No, NumPy has never done that as far as I know. And since SIMD > instructions etc. make this even less predictable (you used to be able > to abuse in-place logic, even if usually the same can be done with > ufunc.accumulate so it was a bad idea anyway), you have to avoid it. > > Pauli is working currently on implementing the logic needed to find if > such a copy is necessary [1] which is very cool indeed. So I think it is > likely we will such copy logic in NumPy 1.11. > > - Sebastian > > > [1] See https://github.com/numpy/numpy/pull/6166 it is not an easy > problem. 
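The fancy-indexing behaviour quoted from the docs, next to the unbuffered ufunc method
that really does accumulate over repeated indices:

import numpy as np

x = np.zeros(5, dtype=int)
x[[1, 1, 3, 1]] += 1
print(x)                        # [0 1 0 1 0] -- index 1 is assigned once, not incremented three times

y = np.zeros(5, dtype=int)
np.add.at(y, [1, 1, 3, 1], 1)   # unbuffered: applies the add for every occurrence of the index
print(y)                        # [0 3 0 1 0]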
> > > > On Tuesday, July 28, 2015, Sebastian Berg > > wrote: > > > > > > > > On Mon Jul 27 22:51:52 2015 GMT+0200, Sturla Molden wrote: > > > On 27/07/15 22:10, Anton Akhmerov wrote: > > > > Hi everyone, > > > > > > > > I have encountered an initially rather confusing problem > > in a piece of > > > > code that attempted to symmetrize a matrix: `h += h.T` > > > > The problem of course appears due to `h.T` being a view of > > `h`, and > > > > some elements being overwritten during the __iadd__ call. > > > > > > > I think the typical proposal is to raise a warning. Note there > > is np.may_share_memoty. But the logic to give the warning is > > possibly not quite easy, since this is ok to use sometimes. If > > someone figures it out (mostly) I would be very happy zo see > > such warnings. > > > > > > > Here is another example > > > > > > >>> a = np.ones(10) > > > >>> a[1:] += a[:-1] > > > >>> a > > > array([ 1., 2., 3., 2., 3., 2., 3., 2., 3., 2.]) > > > > > > I am not sure I totally dislike this behavior. If it could > > be made > > > constent it could be used to vectorize recursive algorithms. > > In the case > > > above I would prefer the output to be: > > > > > > array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]) > > > > > > It does not happen because we do not enforce that the result > > of one > > > operation is stored before the next two operands are read. > > The only way > > > to speed up recursive equations today is to use compiled > > code. > > > > > > > > > Sturla > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Fri Aug 7 09:29:36 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 7 Aug 2015 06:29:36 -0700 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: On Fri, Aug 7, 2015 at 2:33 AM, David Cournapeau wrote: > Which command exactly did you run to have that error ? Normally, the code > in msvc9compiler should not be called if you call the setup.py with the > mingw compiler as expected by distutils > FWIW, the incantation that works for me to compile numpy on Windows with mingw is: python setup.py config --compiler=mingw32 build --compiler=mingw32 install but I am not sure I have ever tried it with Python 3. I think my source for this was: http://nipy.sourceforge.net/nipy/devel/devel/install/windows_scipy_build.html Jaime > > On Fri, Aug 7, 2015 at 12:19 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Thu, Aug 6, 2015 at 5:11 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Thu, Aug 6, 2015 at 4:22 PM, David Cournapeau >>> wrote: >>> >>>> Sorry if that's obvious, but do you have Visual Studio 2010 installed ? 
>>>> >>>> On Thu, Aug 6, 2015 at 11:17 PM, Charles R Harris < >>>> charlesr.harris at gmail.com> wrote: >>>> >>>>> Anyone know how to fix this? I've run into it before and never got it >>>>> figured out. >>>>> >>>>> [192.168.121.189:22] out: File >>>>> "C:\Python34\lib\distutils\msvc9compiler.py", line 259, in query_vcvarsall >>>>> [192.168.121.189:22] out: >>>>> [192.168.121.189:22] out: raise DistutilsPlatformError("Unable to >>>>> find vcvarsall.bat") >>>>> [192.168.121.189:22] out: >>>>> [192.168.121.189:22] out: distutils.errors.DistutilsPlatformError: >>>>> Unable to find vcvarsall.bat >>>>> >>>>> Chuck >>>>> >>>>> >>>>> >>> I'm running numpy-vendor, which is running wine. I think it is all mingw >>> with a few installed dll's. The error is coming from the Python distutils >>> as part of `has_cblas`. >>> >>> >> It's not impossible that we have changed the build somewhere along the >> line. >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Aug 7 10:02:53 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Aug 2015 08:02:53 -0600 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: On Fri, Aug 7, 2015 at 3:33 AM, David Cournapeau wrote: > Which command exactly did you run to have that error ? Normally, the code > in msvc9compiler should not be called if you call the setup.py with the > mingw compiler as expected by distutils > I'm running numpy-vendor which is running wine inside ubuntu inside a vm. The relevant commands are run("rm -rf ../local") run("paver sdist") run("python setup.py install --prefix ../local") run("paver pdf") run("paver bdist_superpack -p 3.4") run("paver bdist_superpack -p 3.3") run("paver bdist_superpack -p 2.7") run("paver write_release_and_log") run("paver bdist_wininst_simple -p 2.7") run("paver bdist_wininst_simple -p 3.3") run("paver bdist_wininst_simple -p 3.4") Which don't look suspicious. I think we may have changed something in numpy/distutils, possibly as part of https://github.com/numpy/numpy/pull/6152 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Aug 7 10:16:06 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Aug 2015 08:16:06 -0600 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: On Fri, Aug 7, 2015 at 8:02 AM, Charles R Harris wrote: > > > On Fri, Aug 7, 2015 at 3:33 AM, David Cournapeau > wrote: > >> Which command exactly did you run to have that error ? Normally, the code >> in msvc9compiler should not be called if you call the setup.py with the >> mingw compiler as expected by distutils >> > > I'm running numpy-vendor which is running wine inside ubuntu inside a vm. 
> The relevant commands are > > run("rm -rf ../local") > run("paver sdist") > run("python setup.py install --prefix ../local") > run("paver pdf") > run("paver bdist_superpack -p 3.4") > run("paver bdist_superpack -p 3.3") > run("paver bdist_superpack -p 2.7") > run("paver write_release_and_log") > run("paver bdist_wininst_simple -p 2.7") > run("paver bdist_wininst_simple -p 3.3") > run("paver bdist_wininst_simple -p 3.4") > > Which don't look suspicious. I think we may have changed something in > numpy/distutils, possibly as part of > https://github.com/numpy/numpy/pull/6152 > Actually, looks like b6d0263239926e8b14ebc26a0d7b9469fa7866d4. Hmm..., strange. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Aug 7 11:15:58 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Aug 2015 09:15:58 -0600 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: On Fri, Aug 7, 2015 at 8:16 AM, Charles R Harris wrote: > > > On Fri, Aug 7, 2015 at 8:02 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Fri, Aug 7, 2015 at 3:33 AM, David Cournapeau >> wrote: >> >>> Which command exactly did you run to have that error ? Normally, the >>> code in msvc9compiler should not be called if you call the setup.py with >>> the mingw compiler as expected by distutils >>> >> >> I'm running numpy-vendor which is running wine inside ubuntu inside a vm. >> The relevant commands are >> >> run("rm -rf ../local") >> run("paver sdist") >> run("python setup.py install --prefix ../local") >> run("paver pdf") >> run("paver bdist_superpack -p 3.4") >> run("paver bdist_superpack -p 3.3") >> run("paver bdist_superpack -p 2.7") >> run("paver write_release_and_log") >> run("paver bdist_wininst_simple -p 2.7") >> run("paver bdist_wininst_simple -p 3.3") >> run("paver bdist_wininst_simple -p 3.4") >> >> Which don't look suspicious. I think we may have changed something in >> numpy/distutils, possibly as part of >> https://github.com/numpy/numpy/pull/6152 >> > > Actually, looks like b6d0263239926e8b14ebc26a0d7b9469fa7866d4. Hmm..., > strange. > OK, that just leads to an earlier cythonize error because random.pyx changed, so not the root cause. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Aug 7 11:36:38 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Aug 2015 09:36:38 -0600 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: So the problem comes from the has_cblas function def has_cblas(self): # primitive cblas check by looking for the header res = False c = distutils.ccompiler.new_compiler() tmpdir = tempfile.mkdtemp() s = """#include """ src = os.path.join(tmpdir, 'source.c') try: with open(src, 'wt') as f: f.write(s) try: c.compile([src], output_dir=tmpdir, include_dirs=self.get_include_dirs()) res = True except distutils.ccompiler.CompileError: res = False finally: shutil.rmtree(tmpdir) return res The problem is the test compile, which does not use the mingw compiler, but falls back to the compiler found in python distutils. Not sure what the fix is. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Fri Aug 7 12:17:27 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Aug 2015 10:17:27 -0600 Subject: [Numpy-discussion] Numpy-vendor vcvarsall.bat problem. In-Reply-To: References: Message-ID: On Fri, Aug 7, 2015 at 9:36 AM, Charles R Harris wrote: > So the problem comes from the has_cblas function > > def has_cblas(self): > # primitive cblas check by looking for the header > res = False > c = distutils.ccompiler.new_compiler() > tmpdir = tempfile.mkdtemp() > s = """#include """ > src = os.path.join(tmpdir, 'source.c') > try: > with open(src, 'wt') as f: > f.write(s) > try: > c.compile([src], output_dir=tmpdir, > include_dirs=self.get_include_dirs()) > res = True > except distutils.ccompiler.CompileError: > res = False > finally: > shutil.rmtree(tmpdir) > return res > > The problem is the test compile, which does not use the mingw compiler, > but falls back to the compiler found in python distutils. Not sure what the > fix is. > See #6175 . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Mon Aug 10 12:09:13 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 10 Aug 2015 12:09:13 -0400 Subject: [Numpy-discussion] np.in1d() & sets, bug? Message-ID: Just came across this one today: >>> np.in1d([1], set([0, 1, 2]), assume_unique=True) array([ False], dtype=bool) >>> np.in1d([1], [0, 1, 2], assume_unique=True) array([ True], dtype=bool) I am assuming this has something to do with the fact that order is not guaranteed with set() objects? I was kind of hoping that setting "assume_unique=True" would be sufficient to overcome that problem. Should sets be rejected as an error? This was using v1.9.0 Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Aug 10 13:10:04 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 10 Aug 2015 19:10:04 +0200 Subject: [Numpy-discussion] np.in1d() & sets, bug? In-Reply-To: References: Message-ID: <1439226604.3041.8.camel@sipsolutions.net> On Mo, 2015-08-10 at 12:09 -0400, Benjamin Root wrote: > Just came across this one today: > > >>> np.in1d([1], set([0, 1, 2]), assume_unique=True) > array([ False], dtype=bool) > > >>> np.in1d([1], [0, 1, 2], assume_unique=True) > > array([ True], dtype=bool) > > > I am assuming this has something to do with the fact that order is not > guaranteed with set() objects? I was kind of hoping that setting > "assume_unique=True" would be sufficient to overcome that problem. > Should sets be rejected as an error? > Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))`` returns an object array and 1 is not the same as ``set([1, 2, 3])``. I think earlier numpy versions may have had "short cuts" for short lists or something so this may have worked in some cases.... - Sebastian > > This was using v1.9.0 > > > Cheers! > > Ben Root > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Mon Aug 10 13:38:18 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 10 Aug 2015 10:38:18 -0700 Subject: [Numpy-discussion] np.in1d() & sets, bug? 
In-Reply-To: <1439226604.3041.8.camel@sipsolutions.net> References: <1439226604.3041.8.camel@sipsolutions.net> Message-ID: Another case where refusing to implicitly create object arrays would have avoided a lot of confusion... On Aug 10, 2015 10:13 AM, "Sebastian Berg" wrote: > On Mo, 2015-08-10 at 12:09 -0400, Benjamin Root wrote: > > Just came across this one today: > > > > >>> np.in1d([1], set([0, 1, 2]), assume_unique=True) > > array([ False], dtype=bool) > > > > >>> np.in1d([1], [0, 1, 2], assume_unique=True) > > > > array([ True], dtype=bool) > > > > > > I am assuming this has something to do with the fact that order is not > > guaranteed with set() objects? I was kind of hoping that setting > > "assume_unique=True" would be sufficient to overcome that problem. > > Should sets be rejected as an error? > > > > Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))`` > returns an object array and 1 is not the same as ``set([1, 2, 3])``. > > I think earlier numpy versions may have had "short cuts" for short lists > or something so this may have worked in some cases.... > > - Sebastian > > > > > > This was using v1.9.0 > > > > > > Cheers! > > > > Ben Root > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Mon Aug 10 13:40:38 2015 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 10 Aug 2015 13:40:38 -0400 Subject: [Numpy-discussion] np.in1d() & sets, bug? In-Reply-To: <1439226604.3041.8.camel@sipsolutions.net> References: <1439226604.3041.8.camel@sipsolutions.net> Message-ID: > Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))`` > returns an object array Holy crap! To be pedantic, it looks like it turns it into a numpy scalar, but still! I wouldn't have expected np.asarray() on a set (or dictionary, for that matter) to work because order is not guaranteed. Is this expected behavior? Digging into the implementation of in1d(), I can see now how passing a set() wouldn't be useful at all (as an aside, pretty clever algorithm). I know sets aren't array-like, but the code that used this seemed to work at first, and this problem wasn't revealed until I created some unit tests to exercise some possible corner cases. Silently producing possibly erroneous results is dangerous. Don't know if better documentation or some better sanity checking would be called for here, though. Ben Root On Mon, Aug 10, 2015 at 1:10 PM, Sebastian Berg wrote: > On Mo, 2015-08-10 at 12:09 -0400, Benjamin Root wrote: > > Just came across this one today: > > > > >>> np.in1d([1], set([0, 1, 2]), assume_unique=True) > > array([ False], dtype=bool) > > > > >>> np.in1d([1], [0, 1, 2], assume_unique=True) > > > > array([ True], dtype=bool) > > > > > > I am assuming this has something to do with the fact that order is not > > guaranteed with set() objects? I was kind of hoping that setting > > "assume_unique=True" would be sufficient to overcome that problem. > > Should sets be rejected as an error? > > > > Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))`` > returns an object array and 1 is not the same as ``set([1, 2, 3])``. 
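What np.asarray actually produces for a set, and one way to get the membership test that
was intended (convert the set to a list or array first):

import numpy as np

s = {0, 1, 2}
wrapped = np.asarray(s)
print(wrapped.dtype, wrapped.shape)   # object () -- a 0-d object array holding the set itself
print(np.in1d([1], list(s)))          # [ True]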
> > I think earlier numpy versions may have had "short cuts" for short lists > or something so this may have worked in some cases.... > > - Sebastian > > > > > > This was using v1.9.0 > > > > > > Cheers! > > > > Ben Root > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Aug 10 14:08:07 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 10 Aug 2015 14:08:07 -0400 Subject: [Numpy-discussion] np.in1d() & sets, bug? In-Reply-To: References: <1439226604.3041.8.camel@sipsolutions.net> Message-ID: On Mon, Aug 10, 2015 at 1:40 PM, Benjamin Root wrote: > > Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))`` > > returns an object array > > Holy crap! To be pedantic, it looks like it turns it into a numpy scalar, > but still! I wouldn't have expected np.asarray() on a set (or dictionary, > for that matter) to work because order is not guaranteed. Is this expected > behavior? > > Digging into the implementation of in1d(), I can see now how passing a > set() wouldn't be useful at all (as an aside, pretty clever algorithm). I > know sets aren't array-like, but the code that used this seemed to work at > first, and this problem wasn't revealed until I created some unit tests to > exercise some possible corner cases. Silently producing possibly erroneous > results is dangerous. Don't know if better documentation or some better > sanity checking would be called for here, though. > > Ben Root > > > On Mon, Aug 10, 2015 at 1:10 PM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> On Mo, 2015-08-10 at 12:09 -0400, Benjamin Root wrote: >> > Just came across this one today: >> > >> > >>> np.in1d([1], set([0, 1, 2]), assume_unique=True) >> > array([ False], dtype=bool) >> > >> > >>> np.in1d([1], [0, 1, 2], assume_unique=True) >> > >> > array([ True], dtype=bool) >> > >> > >> > I am assuming this has something to do with the fact that order is not >> > guaranteed with set() objects? I was kind of hoping that setting >> > "assume_unique=True" would be sufficient to overcome that problem. >> > Should sets be rejected as an error? >> > >> >> Not really, it is "simply" because ``np.asarray(set([1, 2, 3]))`` >> returns an object array and 1 is not the same as ``set([1, 2, 3])``. >> >> I think earlier numpy versions may have had "short cuts" for short lists >> or something so this may have worked in some cases.... >> > is it possible to get at least a UserWarning when creating an object array and dtype object hasn't been explicitly requested or underlying data is already in an object dtype? Josef > >> - Sebastian >> >> >> > >> > This was using v1.9.0 >> > >> > >> > Cheers! 
>> > >> > Ben Root >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 10 18:34:59 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 10 Aug 2015 16:34:59 -0600 Subject: [Numpy-discussion] mingw32 and numpy 1.10 Message-ID: Mingw32 will not compile current numpy due to initialization of a static structure slot with a Python C-API function. The function is not considered a constant expression by the old gcc in mingw32. Compilation does work with more recent compilers; evidently the meaning of "constant expression" is up to the vendor. So, this is fixable if we initialize the slot with 0, but that loses some precision/functionality. The question is, do we want to support mingw32, and numpy-vendor as well, for numpy 1.10.0? I think the answer is probably "yes", but we may want to reconsider for numpy 1.11, when we may want to use Carl's mingw64 toolchain instead. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Aug 10 18:53:46 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 10 Aug 2015 15:53:46 -0700 Subject: [Numpy-discussion] mingw32 and numpy 1.10 In-Reply-To: References: Message-ID: On Aug 10, 2015 3:38 PM, "Charles R Harris" wrote: > > Mingw32 will not compile current numpy due to initialization of a static structure slot with a Python C-API function. The function is not considered a constant expression by the old gcc in mingw32. Compilation does work with more recent compilers; evidently the meaning of "constant expression" is up to the vendor. I think in this particular case, we should be able to fill in the slot with an assignment just before calling PyType_Ready? > So, this is fixable if we initialize the slot with 0, but that loses some precision/functionality. The question is, do we want to support mingw32, and numpy-vendor as well, for numpy 1.10.0? I think the answer is probably "yes", but we may want to reconsider for numpy 1.11, when we may want to use Carl's mingw64 toolchain instead. While it's obviously not what we want to do in the long run, if these problems turn out to be intractable then yeah, IMO it wouldn't be the end of the world to temporarily give up on providing the sourceforge win32 downloads, given that we already don't provide win64, the current win32 build strategy is almost certainly a dead end going forward, and that win32 and win64 builds are widely available elsewhere. Esp. since time spent trying to keep our win32 builds limping along both delays the release for everyone and wastes time that you could probably find other things to do with... I'm not sure this particular problem is the tipping point, but it's a calculation we should keep in mind. -n -------------- next part -------------- An HTML attachment was scrubbed... 
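Returning briefly to the np.in1d()/set() thread above: a minimal sketch of the pitfall and one possible workaround, assuming the behaviour Sebastian describes (np.asarray(set(...)) silently becomes a 0-d object array, so the membership test never sees the individual elements):

    import numpy as np

    needles = [1]
    haystack = set([0, 1, 2])

    # The set is wrapped in a 0-d object array, so each needle is compared
    # against the set object itself and the result is wrongly all-False.
    np.in1d(needles, haystack, assume_unique=True)
    # -> array([False], dtype=bool)

    # Converting the set to a real array first restores the expected answer.
    np.in1d(needles, np.array(sorted(haystack)), assume_unique=True)
    # -> array([ True], dtype=bool)
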
URL: From charlesr.harris at gmail.com Mon Aug 10 19:21:41 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 10 Aug 2015 17:21:41 -0600 Subject: [Numpy-discussion] mingw32 and numpy 1.10 In-Reply-To: References: Message-ID: On Mon, Aug 10, 2015 at 4:53 PM, Nathaniel Smith wrote: > On Aug 10, 2015 3:38 PM, "Charles R Harris" > wrote: > > > > Mingw32 will not compile current numpy due to initialization of a static > structure slot with a Python C-API function. The function is not considered > a constant expression by the old gcc in mingw32. Compilation does work with > more recent compilers; evidently the meaning of "constant expression" is up > to the vendor. > > I think in this particular case, we should be able to fill in the slot > with an assignment just before calling PyType_Ready? > > > So, this is fixable if we initialize the slot with 0, but that loses > some precision/functionality. The question is, do we want to support > mingw32, and numpy-vendor as well, for numpy 1.10.0? I think the answer is > probably "yes", but we may want to reconsider for numpy 1.11, when we may > want to use Carl's mingw64 toolchain instead. > > While it's obviously not what we want to do in the long run, if these > problems turn out to be intractable then yeah, IMO it wouldn't be the end > of the world to temporarily give up on providing the sourceforge win32 > downloads, given that we already don't provide win64, the current win32 > build strategy is almost certainly a dead end going forward, and that win32 > and win64 builds are widely available elsewhere. Esp. since time spent > trying to keep our win32 builds limping along both delays the release for > everyone and wastes time that you could probably find other things to do > with... I'm not sure this particular problem is the tipping point, but it's > a calculation we should keep in mind. > See https://github.com/numpy/numpy/pull/6190. I don't have a problem reinitializing the slot later if that looks like the best way to go. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pieter.eendebak at gmail.com Tue Aug 11 04:36:23 2015 From: pieter.eendebak at gmail.com (Pieter Eendebak) Date: Tue, 11 Aug 2015 10:36:23 +0200 Subject: [Numpy-discussion] overhead in np.matrix Message-ID: The overhead of the np.matrix class is quite high for small matrices. See for example the following code: import time import math import numpy as np def rot2D(phi): c=math.cos(phi); return np.matrix(c) _b=np.matrix(np.zeros( (1,))) def rot2Dx(phi): global _b r=_b.copy() c=math.cos(phi); r.itemset(0, c) return r phi=.023 %timeit rot2D(phi) %timeit rot2Dx(phi) The second implementation performs much better by using a copy instead of a constructor. Is there a way to efficiency create a new np.matrix object? For other functions in my code I do not have the option to copy an existing matrix, but I need to construct a new object or perform a cast from np.array to np.matrix. I am already aware of two alternatives: - Using the new multiplication operator ( https://www.python.org/dev/peps/pep-0465/). This is a good solution, but only python 3.5 - Using the .dot functions from np.array. This works, but personally I like the notation using np.matrix much better. I also created an issue on github: https://github.com/numpy/numpy/issues/6186 With kind regards, Pieter Eendebak -------------- next part -------------- An HTML attachment was scrubbed... 
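On the np.matrix overhead question above, one hedged workaround sketch (the function names here are illustrative, not part of numpy): build the small result as a plain ndarray and only view it as np.matrix at the end, since ndarray.view() bypasses most of the np.matrix constructor's argument parsing:

    import math
    import numpy as np

    def rot2d_array(phi):
        # plain 2x2 rotation built as an ndarray; no np.matrix constructor
        c, s = math.cos(phi), math.sin(phi)
        return np.array([[c, -s], [s, c]])

    def rot2d_matrix(phi):
        # reinterpret the same data as np.matrix only where matrix
        # semantics (e.g. the * operator) are actually wanted
        return rot2d_array(phi).view(np.matrix)

Whether this is fast enough for the use case above would need timing with %timeit, as in the original message.
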
URL: From michael.klemm at intel.com Tue Aug 11 10:37:38 2015 From: michael.klemm at intel.com (Klemm, Michael) Date: Tue, 11 Aug 2015 14:37:38 +0000 Subject: [Numpy-discussion] ANN: pyMIC v0.6 Released Message-ID: <0DAB4B4FC42EAA41802458ADA9C2F82444DB2BBF@IRSMSX104.ger.corp.intel.com> Announcement: pyMIC v0.6 ========================= I'm happy to announce the release of pyMIC v0.6. pyMIC is a Python module to offload computation in a Python program to the Intel Xeon Phi coprocessor. It contains offloadable arrays and device management functions. It supports invocation of native kernels (C/C++, Fortran) and blends in with Numpy's array types for float, complex, and int data types. For more information and downloads please visit pyMIC's Github page: https://github.com/01org/pyMIC. You can find pyMIC's mailinglist at https://lists.01.org/mailman/listinfo/pymic. Full change log: ================= Version 0.6 ---------------------------- - Experimental support for the Windows operating system. - Switched to Cython to generate the glue code for pyMIC. - Now using Markdown for README and CHANGELOG. - Introduced PYMIC_DEBUG=3 to trace argument passing for kernels. - Bugfix: added back the translate_device_pointer() function. - Bugfix: example SVD now respects order of the passed matrices when applying the `dgemm` routine. - Bugfix: fixed memory leak when invoking kernels. - Bugfix: fixed broken translation of fake pointers. - Refactoring: simplified bridge between pyMIC and LIBXSTREAM. Version 0.5 ---------------------------- - Introduced new kernel API that avoids insane pointer unpacking. - pyMIC now uses libxstreams as the offload back-end (https://github.com/hfp/libxstream). - Added smart pointers to make handling of fake pointers easier. Version 0.4 ---------------------------- - New low-level API to allocate, deallocate, and transfer data (see OffloadStream). - Support for in-place binary operators. - New internal design to handle offloads. Version 0.3 ---------------------------- - Improved handling of libraries and kernel invocation. - Trace collection (PYMIC_TRACE=1, PYMIC_TRACE_STACKS={none,compact,full}). - Replaced the device-centric API with a stream API. - Refactoring to better match PEP8 recommendations. - Added support for int(int64) and complex(complex128) data types. - Reworked the benchmarks and examples to fit the new API. - Bugfix: fixed syntax errors in OffloadArray. Version 0.2 ---------------------------- - Small improvements to the README files. - New example: Singular Value Decomposition. - Some documentation for the API functions. - Added a basic testsuite for unit testing (WIP). - Bugfix: benchmarks now use the latest interface. - Bugfix: numpy.ndarray does not offer an attribute 'order'. - Bugfix: number_of_devices was not visible after import. - Bugfix: member offload_array.device is now initialized. - Bugfix: use exception for errors w/ invoke_kernel & load_library. Version 0.1 ---------------------------- Initial release. Dr.-Ing. Michael Klemm Senior Application Engineer Software and Services Group Developer Relations Division Phone +49 89 9914 2340 Cell +49 174 2417583 Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Prof. Dr. 
Hermann Eul Chairperson of the Supervisory Board: Tiffany Doon Silva Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 From charlesr.harris at gmail.com Tue Aug 11 17:23:08 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 11 Aug 2015 15:23:08 -0600 Subject: [Numpy-discussion] ANN: Numpy 1.10.0b1 release Message-ID: Hi All, give this release a whirl and report any problems either on the numpy-discussion list or by opening an issue on github. I'm pleased to announce the first beta release of Numpy 1.10.0. There is over a year's worth of enhancements and bug fixes in the 1.10.0 release, so please give this release a whirl and report any problems either on the numpy-discussion list or by opening an issue on github. Tarballs, installers, and release notes may be found in the usual place at Sourceforge . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Tue Aug 11 17:44:45 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Tue, 11 Aug 2015 16:44:45 -0500 Subject: [Numpy-discussion] ANN: Numpy 1.10.0b1 release In-Reply-To: References: Message-ID: Maybe this is just me, I tried to build the tarball I got from sourceforge in a fresh virtualenv on my mac and received the following error: clang: numpy/core/src/multiarray/buffer.c clang: src/multiarray/cblasfuncs.c clang: error: no such file or directory: 'src/multiarray/cblasfuncs.c' clang: error: no input files clang: error: no such file or directory: 'src/multiarray/cblasfuncs.c' clang: error: no input files Indeed, that file is not present in the tarball. My download of the tarball from sourceforge has sha1 hash 424bcee49507a655260068e767bd198094fc5604. It looks like the tarball from github ( https://github.com/numpy/numpy/archive/v1.10.0b1.tar.gz) has the needed file. On Tue, Aug 11, 2015 at 4:23 PM, Charles R Harris wrote: > Hi All, > > give this release a whirl and report any problems either on the > numpy-discussion list or by opening an issue on github. > I'm pleased to announce the first beta release of Numpy 1.10.0. There is > over a year's worth of enhancements and bug fixes in the 1.10.0 release, so > please give this release a whirl and report any problems either on the > numpy-discussion list or by opening an issue on github. Tarballs, > installers, and release notes may be found in the usual place at > Sourceforge > . > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 11 18:13:23 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 11 Aug 2015 16:13:23 -0600 Subject: [Numpy-discussion] ANN: Numpy 1.10.0b1 release In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 3:44 PM, Nathan Goldbaum wrote: > Maybe this is just me, I tried to build the tarball I got from sourceforge > in a fresh virtualenv on my mac and received the following error: > > clang: numpy/core/src/multiarray/buffer.c > clang: src/multiarray/cblasfuncs.c > clang: error: no such file or directory: 'src/multiarray/cblasfuncs.c' > clang: error: no input files > clang: error: no such file or directory: 'src/multiarray/cblasfuncs.c' > clang: error: no input files > > Indeed, that file is not present in the tarball. 
My download of the > tarball from sourceforge has sha1 > hash 424bcee49507a655260068e767bd198094fc5604. > > It looks like the tarball from github ( > https://github.com/numpy/numpy/archive/v1.10.0b1.tar.gz) has the needed > file. > Hmm, interesting. I'll take a look. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 11 18:22:13 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 11 Aug 2015 16:22:13 -0600 Subject: [Numpy-discussion] ANN: Numpy 1.10.0b1 release In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 4:13 PM, Charles R Harris wrote: > > > On Tue, Aug 11, 2015 at 3:44 PM, Nathan Goldbaum > wrote: > >> Maybe this is just me, I tried to build the tarball I got from >> sourceforge in a fresh virtualenv on my mac and received the following >> error: >> >> clang: numpy/core/src/multiarray/buffer.c >> clang: src/multiarray/cblasfuncs.c >> clang: error: no such file or directory: 'src/multiarray/cblasfuncs.c' >> clang: error: no input files >> clang: error: no such file or directory: 'src/multiarray/cblasfuncs.c' >> clang: error: no input files >> >> Indeed, that file is not present in the tarball. My download of the >> tarball from sourceforge has sha1 >> hash 424bcee49507a655260068e767bd198094fc5604. >> >> It looks like the tarball from github ( >> https://github.com/numpy/numpy/archive/v1.10.0b1.tar.gz) has the needed >> file. >> > > Hmm, interesting. I'll take a look. > I'm uploading replacements for the tar and zip files.. Not sure why numpy-vendor didn't do the job. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sandro.tosi at gmail.com Tue Aug 11 18:32:47 2015 From: sandro.tosi at gmail.com (Sandro Tosi) Date: Tue, 11 Aug 2015 23:32:47 +0100 Subject: [Numpy-discussion] ANN: Numpy 1.10.0b1 release In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 11:22 PM, Charles R Harris wrote: > I'm uploading replacements for the tar and zip files.. Not sure why > numpy-vendor didn't do the job. if so then please upload a b2, so it will avoid confusion with those who have downloaded a b1-pre-fix and those with a b1-post-fix Regards, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From charlesr.harris at gmail.com Tue Aug 11 18:46:31 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 11 Aug 2015 16:46:31 -0600 Subject: [Numpy-discussion] ANN: Numpy 1.10.0b1 release In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 4:32 PM, Sandro Tosi wrote: > On Tue, Aug 11, 2015 at 11:22 PM, Charles R Harris > wrote: > > I'm uploading replacements for the tar and zip files.. Not sure why > > numpy-vendor didn't do the job. > > if so then please upload a b2, so it will avoid confusion with those > who have downloaded a b1-pre-fix and those with a b1-post-fix > > I think it was caught early enough that it won't be much of a problem. I expect there will be other problems that will require a beta 2 soon enough ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
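For anyone trying to tell the original and re-uploaded 1.10.0b1 tarballs apart, a small sketch using only the standard library (the filename is hypothetical):

    import hashlib

    def sha1_of(path, chunk=1 << 20):
        # stream the file in blocks so large tarballs need not fit in memory
        h = hashlib.sha1()
        with open(path, 'rb') as f:
            for block in iter(lambda: f.read(chunk), b''):
                h.update(block)
        return h.hexdigest()

    print(sha1_of('numpy-1.10.0b1.tar.gz'))
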
URL: From sandro.tosi at gmail.com Tue Aug 11 20:04:53 2015 From: sandro.tosi at gmail.com (Sandro Tosi) Date: Wed, 12 Aug 2015 01:04:53 +0100 Subject: [Numpy-discussion] ANN: Numpy 1.10.0b1 release In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 11:46 PM, Charles R Harris wrote: > On Tue, Aug 11, 2015 at 4:32 PM, Sandro Tosi wrote: >> >> On Tue, Aug 11, 2015 at 11:22 PM, Charles R Harris >> wrote: >> > I'm uploading replacements for the tar and zip files.. Not sure why >> > numpy-vendor didn't do the job. >> >> if so then please upload a b2, so it will avoid confusion with those >> who have downloaded a b1-pre-fix and those with a b1-post-fix >> > > I think it was caught early enough that it won't be much of a problem. I > expect there will be other problems that will require a beta 2 soon enough > ;) we can agree to disagree, but nevermind :) I gave it a quick build test on debian unstable amd64 and it's building correctly, I will finalize the package in the next few days and upload to the various weird architectures Debian supports and see where it fails. Regards, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From jensj at fysik.dtu.dk Wed Aug 12 03:41:57 2015 From: jensj at fysik.dtu.dk (=?windows-1252?Q?Jens_J=F8rgen_Mortensen?=) Date: Wed, 12 Aug 2015 09:41:57 +0200 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Numpy 1.10.0b1 release In-Reply-To: References: Message-ID: <55CAF8C5.8060202@fysik.dtu.dk> On 08/11/2015 11:23 PM, Charles R Harris wrote: > Hi All, > > give this release a whirl and report any problems either on the > numpy-discussion list or by opening an issue on github. > I'm pleased to announce the first beta release of Numpy 1.10.0. There > is over a year's worth of enhancements and bug fixes in the 1.10.0 > release, so please give this release a whirl and report any problems > either on the numpy-discussion list or by opening an issue on github. > Tarballs, installers, and release notes may be found in the usual > place at Sourceforge > . This looks a bit strange: Python 2.7.9 (default, Apr 2 2015, 15:33:21) [GCC 4.9.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.zeros(1).strides (9223372036854775807,) >>> np.zeros(42).strides (8,) >>> np.__version__ '1.10.0b1' This is on Ubuntu 15.04. Jens J?rgen > > Chuck > > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Aug 12 03:51:39 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 12 Aug 2015 09:51:39 +0200 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Numpy 1.10.0b1 release In-Reply-To: <55CAF8C5.8060202@fysik.dtu.dk> References: <55CAF8C5.8060202@fysik.dtu.dk> Message-ID: <1439365899.17032.7.camel@sipsolutions.net> On Mi, 2015-08-12 at 09:41 +0200, Jens J?rgen Mortensen wrote: > On 08/11/2015 11:23 PM, Charles R Harris wrote: > > Hi All, > > > > give this release a whirl and report any problems either on the > > numpy-discussion list or by opening an issue on github. > > > > I'm pleased to announce the first beta release of Numpy 1.10.0. 
> > There is over a year's worth of enhancements and bug fixes in the > > 1.10.0 release, so please give this release a whirl and report any > > problems either on the numpy-discussion list or by opening an issue > > on github. Tarballs, installers, and release notes may be found in > > the usual place at Sourceforge. > > > > This looks a bit strange: > It is intentional, it will not be the case in the final release. And thanks Chuck for all the release work! - Sebastian > Python 2.7.9 (default, Apr 2 2015, 15:33:21) > [GCC 4.9.2] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy as np > >>> np.zeros(1).strides > (9223372036854775807,) > >>> np.zeros(42).strides > (8,) > >>> np.__version__ > '1.10.0b1' > > This is on Ubuntu 15.04. > > Jens J?rgen > > > > > > > Chuck > > > > > > > > > > > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Wed Aug 12 04:07:37 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 12 Aug 2015 01:07:37 -0700 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Numpy 1.10.0b1 release In-Reply-To: <1439365899.17032.7.camel@sipsolutions.net> References: <55CAF8C5.8060202@fysik.dtu.dk> <1439365899.17032.7.camel@sipsolutions.net> Message-ID: On Wed, Aug 12, 2015 at 12:51 AM, Sebastian Berg wrote: > On Mi, 2015-08-12 at 09:41 +0200, Jens J?rgen Mortensen wrote: >> On 08/11/2015 11:23 PM, Charles R Harris wrote: >> > Hi All, >> > >> > give this release a whirl and report any problems either on the >> > numpy-discussion list or by opening an issue on github. >> > >> > I'm pleased to announce the first beta release of Numpy 1.10.0. >> > There is over a year's worth of enhancements and bug fixes in the >> > 1.10.0 release, so please give this release a whirl and report any >> > problems either on the numpy-discussion list or by opening an issue >> > on github. Tarballs, installers, and release notes may be found in >> > the usual place at Sourceforge. >> > >> >> This looks a bit strange: >> >> Python 2.7.9 (default, Apr 2 2015, 15:33:21) >> [GCC 4.9.2] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >> >>> import numpy as np >> >>> np.zeros(1).strides >> (9223372036854775807,) >> >>> np.zeros(42).strides >> (8,) >> >>> np.__version__ >> '1.10.0b1' > > It is intentional, it will not be the case in the final release. Given how quickly this surprised someone, it looks like it would be helpful to have some single link we could give people to explain what's going on here. Do we have such a thing? In a few minutes of searching all I was able to find was http://docs.scipy.org/doc/numpy/release.html#npy-relaxed-strides-checking https://github.com/numpy/numpy/blob/master/doc/release/1.10.0-notes.rst#relaxed-stride-checking which together kinda sorta hint at what's going on if you squint, but not really? Maybe we should add a paragraph to the 1.10 release notes? > And thanks Chuck for all the release work! Indeed! -n -- Nathaniel J. 
Smith -- http://vorpus.org From sebastian at sipsolutions.net Wed Aug 12 07:23:46 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 12 Aug 2015 13:23:46 +0200 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Numpy 1.10.0b1 release In-Reply-To: References: <55CAF8C5.8060202@fysik.dtu.dk> <1439365899.17032.7.camel@sipsolutions.net> Message-ID: <1439378626.17032.15.camel@sipsolutions.net> On Mi, 2015-08-12 at 01:07 -0700, Nathaniel Smith wrote: > On Wed, Aug 12, 2015 at 12:51 AM, Sebastian Berg > wrote: > > On Mi, 2015-08-12 at 09:41 +0200, Jens J?rgen Mortensen wrote: > >> On 08/11/2015 11:23 PM, Charles R Harris wrote: > >> > Hi All, > >> > > >> > give this release a whirl and report any problems either on the > >> > numpy-discussion list or by opening an issue on github. > >> > > >> > I'm pleased to announce the first beta release of Numpy 1.10.0. > >> > There is over a year's worth of enhancements and bug fixes in the > >> > 1.10.0 release, so please give this release a whirl and report any > >> > problems either on the numpy-discussion list or by opening an issue > >> > on github. Tarballs, installers, and release notes may be found in > >> > the usual place at Sourceforge. > >> > > >> > >> This looks a bit strange: > >> > >> Python 2.7.9 (default, Apr 2 2015, 15:33:21) > >> [GCC 4.9.2] on linux2 > >> Type "help", "copyright", "credits" or "license" for more information. > >> >>> import numpy as np > >> >>> np.zeros(1).strides > >> (9223372036854775807,) > >> >>> np.zeros(42).strides > >> (8,) > >> >>> np.__version__ > >> '1.10.0b1' > > > > It is intentional, it will not be the case in the final release. > > Given how quickly this surprised someone, it looks like it would be > helpful to have some single link we could give people to explain > what's going on here. Do we have such a thing? In a few minutes of > searching all I was able to find was > > http://docs.scipy.org/doc/numpy/release.html#npy-relaxed-strides-checking > https://github.com/numpy/numpy/blob/master/doc/release/1.10.0-notes.rst#relaxed-stride-checking > > which together kinda sorta hint at what's going on if you squint, but > not really? Maybe we should add a paragraph to the 1.10 release notes? > True, frankly, after I hit send I thought I should have explained more in any case. I think relaxed strides is explained (though 1.10 could possible link/include more). The issue is that I/we forgot to mention the "funny" stride messing up to expose bugs/help debugging.... So in case someone wonders. When relaxed strides is active, we intentionally give this funny stride that Jens saw (we will not do this in a final release) because failure to work correctly with it hints to general bugs and tests are likely to miss without this "help". Of course some software that is totally fine might still stumble on these strides, though I expect making it work fine with them should not be hard and make it more robust in any case. - Sebastian > > And thanks Chuck for all the release work! > > Indeed! > > -n > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From deen at mpia.de Wed Aug 12 12:12:28 2015 From: deen at mpia.de (Casey Deen) Date: Wed, 12 Aug 2015 18:12:28 +0200 Subject: [Numpy-discussion] f2py and callbacks with variables Message-ID: <55CB706C.2020502@mpia.de> Hi all- I've run into what I think might be a bug in f2py and callbacks to python. 
Or, maybe I'm not using things correctly. I have created a very minimal example which illustrates my problem at: https://github.com/soylentdeen/fluffy-kumquat The issue seems to affect call backs with variables, but only when they are called indirectly (i.e. from other fortran routines). For example, if I have a python function def show_number(n): print("%d" % n) and I setup a callback in a fortran routine: subroutine cb cf2py intent(callback, hide) blah external blah call blah(5) end and connect it to the python routine fortranObject.blah = show_number I can successfully call the cb routine from python: >fortranObject.cb 5 However, if I call the cb routine from within another fortran routine, it seems to lose its marbles subroutine no_cb call cb end capi_return is NULL Call-back cb_blah_in_cb__user__routines failed. For more information, please have a look at the github repository. I've reproduced the behavior on both linux and mac. I'm not sure if this is an error in the way I'm using the code, or if it is an actual bug. Any and all help would be very much appreciated. Cheers, Casey -- Dr. Casey Deen Post-doctoral Researcher deen at mpia.de +49-6221-528-375 Max Planck Institut f?r Astronomie (MPIA) K?nigstuhl 17 D-69117 Heidelberg, Germany From christian.engwer at uni-muenster.de Wed Aug 12 12:23:16 2015 From: christian.engwer at uni-muenster.de (Christian Engwer) Date: Wed, 12 Aug 2015 18:23:16 +0200 Subject: [Numpy-discussion] Problems using add_npy_pkg_config Message-ID: <20150812162047.GA26389@sansibar.localdomain> Dear all, I'm trying to use the numpy distutils to install native C libraries. These are part of a larger roject and should be usable standalone. I managed to install headers and libs, but now I experience problems writing the corresponding pkg file. I first tried to do the trick without numpy, but getting all the pathes right in all different setups is really a mess. Please a find a m.w.e. attached to this mail. It consists of foo.c foo.ini.in and setup.py. I'm sure I missed some important part, but somehow the distribution variable in build_src seems to be uniinitalized. Calling > python setup.py install --prefix=/tmp/foo.inst fils with ... File "/usr/lib/python2.7/dist-packages/numpy/distutils/command/build_src.py", line 257, in build_npy_pkg_config pkg_path = self.distribution.package_dir[pkg] TypeError: 'NoneType' object has no attribute '__getitem__' I also tried to adopt parts of the numpy setup, but these use sub-modules, which I don't need... might this the the cause of my problems? Any help is highly appreciated ;-) Cheers Christian -------------- next part -------------- A non-text attachment was scrubbed... Name: foo.c Type: text/x-csrc Size: 25 bytes Desc: not available URL: -------------- next part -------------- [meta] Name=@foo@ Version=1.0 Description=dummy description [default] Cflags=-I at prefix@/include Libs= -------------- next part -------------- A non-text attachment was scrubbed... 
Name: setup.py Type: text/x-python Size: 525 bytes Desc: not available URL: From ralf.gommers at gmail.com Wed Aug 12 12:50:52 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 12 Aug 2015 18:50:52 +0200 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: <20150812162047.GA26389@sansibar.localdomain> References: <20150812162047.GA26389@sansibar.localdomain> Message-ID: On Wed, Aug 12, 2015 at 6:23 PM, Christian Engwer < christian.engwer at uni-muenster.de> wrote: > Dear all, > > I'm trying to use the numpy distutils to install native C > libraries. These are part of a larger roject and should be usable > standalone. I managed to install headers and libs, but now I > experience problems writing the corresponding pkg file. I first tried > to do the trick without numpy, but getting all the pathes right in all > different setups is really a mess. > This doesn't answer your question but: why? If you're not distributing a Python project, there is no reason to use distutils instead of a sane build system. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 12 13:23:01 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 12 Aug 2015 11:23:01 -0600 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: References: <20150812162047.GA26389@sansibar.localdomain> Message-ID: On Wed, Aug 12, 2015 at 10:50 AM, Ralf Gommers wrote: > > > On Wed, Aug 12, 2015 at 6:23 PM, Christian Engwer < > christian.engwer at uni-muenster.de> wrote: > >> Dear all, >> >> I'm trying to use the numpy distutils to install native C >> libraries. These are part of a larger roject and should be usable >> standalone. I managed to install headers and libs, but now I >> experience problems writing the corresponding pkg file. I first tried >> to do the trick without numpy, but getting all the pathes right in all >> different setups is really a mess. >> > > This doesn't answer your question but: why? If you're not distributing a > Python project, there is no reason to use distutils instead of a sane build > system. > Believe it or not, distutils *is* one of the saner build systems when you want something cross platform. Sad, isn't it... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Aug 12 13:35:06 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 12 Aug 2015 19:35:06 +0200 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: References: <20150812162047.GA26389@sansibar.localdomain> Message-ID: On Wed, Aug 12, 2015 at 7:23 PM, Charles R Harris wrote: > > > On Wed, Aug 12, 2015 at 10:50 AM, Ralf Gommers > wrote: > >> >> >> On Wed, Aug 12, 2015 at 6:23 PM, Christian Engwer < >> christian.engwer at uni-muenster.de> wrote: >> >>> Dear all, >>> >>> I'm trying to use the numpy distutils to install native C >>> libraries. These are part of a larger roject and should be usable >>> standalone. I managed to install headers and libs, but now I >>> experience problems writing the corresponding pkg file. I first tried >>> to do the trick without numpy, but getting all the pathes right in all >>> different setups is really a mess. >>> >> >> This doesn't answer your question but: why? If you're not distributing a >> Python project, there is no reason to use distutils instead of a sane build >> system. 
>> > > Believe it or not, distutils *is* one of the saner build systems when you > want something cross platform. Sad, isn't it... > Come on. We don't take it seriously, and neither do the Python core devs. It's also pretty much completely unsupported. Numpy.distutils is a bit better in that respect than Python distutils, which doesn't even get sane patches merged. Try Scons, Tup, Gradle, Shake, Waf or anything else that's at least somewhat modern and supported. Do not use numpy.distutils unless there's no other mature choice (i.e. you're developing a Python project). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pearu.peterson at gmail.com Wed Aug 12 15:34:08 2015 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Wed, 12 Aug 2015 22:34:08 +0300 Subject: [Numpy-discussion] f2py and callbacks with variables In-Reply-To: <55CB706C.2020502@mpia.de> References: <55CB706C.2020502@mpia.de> Message-ID: Hi Casey, What you observe, is not a f2py bug. When f2py sees a code like subroutine foo call bar end subroutine foo then it will not make an attempt to analyze bar because of implicit assumption that all statements that has no references to foo arguments are irrelevant for wrapper function generation. For your example, f2py needs some help. Try the following signature in .pyf file: subroutine barney ! in :flintstone:nocallback.f use test__user__routines, fred=>fred, bambam=>bambam intent(callback, hide) fred external fred intent(callback,hide) bambam external bambam end subroutine barney Btw, instead of f2py -c -m flintstone flintstone.pyf callback.f nocallback.f use f2py -c flintstone.pyf callback.f nocallback.f because module name comes from the .pyf file. HTH, Pearu On Wed, Aug 12, 2015 at 7:12 PM, Casey Deen wrote: > Hi all- > > I've run into what I think might be a bug in f2py and callbacks to > python. Or, maybe I'm not using things correctly. I have created a > very minimal example which illustrates my problem at: > > https://github.com/soylentdeen/fluffy-kumquat > > The issue seems to affect call backs with variables, but only when they > are called indirectly (i.e. from other fortran routines). For example, > if I have a python function > > def show_number(n): > print("%d" % n) > > and I setup a callback in a fortran routine: > > subroutine cb > cf2py intent(callback, hide) blah > external blah > call blah(5) > end > > and connect it to the python routine > fortranObject.blah = show_number > > I can successfully call the cb routine from python: > > >fortranObject.cb > 5 > > However, if I call the cb routine from within another fortran routine, > it seems to lose its marbles > > subroutine no_cb > call cb > end > > capi_return is NULL > Call-back cb_blah_in_cb__user__routines failed. > > For more information, please have a look at the github repository. I've > reproduced the behavior on both linux and mac. I'm not sure if this is > an error in the way I'm using the code, or if it is an actual bug. Any > and all help would be very much appreciated. > > Cheers, > Casey > > > -- > Dr. Casey Deen > Post-doctoral Researcher > deen at mpia.de +49-6221-528-375 > Max Planck Institut f?r Astronomie (MPIA) > K?nigstuhl 17 D-69117 Heidelberg, Germany > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
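A short Python-side sketch of how the wrapped module would then be used, mirroring the pattern from Casey's first message and Pearu's .pyf signature above (the module and callback names come from that example repository and should be treated as assumptions):

    import flintstone

    def show_number(n):
        print("%d" % n)

    # attach python callables to the hidden callback slots before calling in
    flintstone.fred = show_number
    flintstone.bambam = show_number

    # the fortran routine, and any routine it calls, can now call back into python
    flintstone.barney()
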
URL: From edisongustavo at gmail.com Wed Aug 12 15:53:33 2015 From: edisongustavo at gmail.com (Edison Gustavo Muenz) Date: Wed, 12 Aug 2015 16:53:33 -0300 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: References: <20150812162047.GA26389@sansibar.localdomain> Message-ID: Why don't you use CMake ? It's pretty standard for C/C++. On Wed, Aug 12, 2015 at 2:35 PM, Ralf Gommers wrote: > > > On Wed, Aug 12, 2015 at 7:23 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, Aug 12, 2015 at 10:50 AM, Ralf Gommers >> wrote: >> >>> >>> >>> On Wed, Aug 12, 2015 at 6:23 PM, Christian Engwer < >>> christian.engwer at uni-muenster.de> wrote: >>> >>>> Dear all, >>>> >>>> I'm trying to use the numpy distutils to install native C >>>> libraries. These are part of a larger roject and should be usable >>>> standalone. I managed to install headers and libs, but now I >>>> experience problems writing the corresponding pkg file. I first tried >>>> to do the trick without numpy, but getting all the pathes right in all >>>> different setups is really a mess. >>>> >>> >>> This doesn't answer your question but: why? If you're not distributing a >>> Python project, there is no reason to use distutils instead of a sane build >>> system. >>> >> >> Believe it or not, distutils *is* one of the saner build systems when you >> want something cross platform. Sad, isn't it... >> > > Come on. We don't take it seriously, and neither do the Python core devs. > It's also pretty much completely unsupported. Numpy.distutils is a bit > better in that respect than Python distutils, which doesn't even get sane > patches merged. > > Try Scons, Tup, Gradle, Shake, Waf or anything else that's at least > somewhat modern and supported. Do not use numpy.distutils unless there's no > other mature choice (i.e. you're developing a Python project). > > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Wed Aug 12 17:03:13 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Wed, 12 Aug 2015 16:03:13 -0500 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? Message-ID: Hi all, I've been testing the package I spend most of my time on, yt, under numpy 1.10b1 since the announcement went out. I think I've narrowed down and fixed all of the test failures that cropped up except for one last issue. It seems that the behavior of np.digitize with respect to ndarray subclasses has changed since the NumPy 1.9 series. Consider the following test script: ```python import numpy as np class MyArray(np.ndarray): def __new__(cls, *args, **kwargs): return np.ndarray.__new__(cls, *args, **kwargs) data = np.arange(100) bins = np.arange(100) + 0.5 data = data.view(MyArray) bins = bins.view(MyArray) digits = np.digitize(data, bins) print type(digits) ``` Under NumPy 1.9.2, this prints "", but under the 1.10 beta, it prints "" I'm curious why this change was made. Since digitize outputs index arrays, it doesn't make sense to me why it should return anything but a plain ndarray. I see in the release notes that digitize now uses searchsorted under the hood. Is this related? We can "fix" this in our codebase by wrapping digitize or by adding numpy version checks in places where the output type matters. 
Is it also possible for me to customize the return type here by exploiting the ufunc machinery and the __array_wrap__ and __array_finalize__ functions? Thanks for any help or advice you might have, Nathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Wed Aug 12 17:08:58 2015 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 12 Aug 2015 10:08:58 -1100 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: References: <20150812162047.GA26389@sansibar.localdomain> Message-ID: I used to use scons, but I've been pretty happy with switching to waf. (Very limited use in both cases: two relatively simple packages.) One of the nicest things is how light it is--no external dependencies, everything can be included in the package itself. From deen at mpia.de Wed Aug 12 16:46:42 2015 From: deen at mpia.de (Casey Deen) Date: Wed, 12 Aug 2015 22:46:42 +0200 Subject: [Numpy-discussion] f2py and callbacks with variables In-Reply-To: References: <55CB706C.2020502@mpia.de> Message-ID: <55CBB0B2.9060202@mpia.de> Hi Pearu- Thanks so much! This works! Can you point me to a reference for the format of the .pyf files? My ~day of searching found a few pages on the scipy website, but nothing which went into this amount of detail. I also asked Stackoverflow, and unless you object, I'd like to add your explanation and mark it as SOLVED for future poor souls wrestling with this problem. I'll also update the github repository with before and after versions of the .pyf file. Cheers, Casey On 08/12/2015 09:34 PM, Pearu Peterson wrote: > Hi Casey, > > What you observe, is not a f2py bug. When f2py sees a code like > > subroutine foo > call bar > end subroutine foo > > then it will not make an attempt to analyze bar because of implicit > assumption that all statements that has no references to foo arguments > are irrelevant for wrapper function generation. > For your example, f2py needs some help. Try the following signature in > .pyf file: > > subroutine barney ! in :flintstone:nocallback.f > use test__user__routines, fred=>fred, bambam=>bambam > intent(callback, hide) fred > external fred > intent(callback,hide) bambam > external bambam > end subroutine barney > > Btw, instead of > > f2py -c -m flintstone flintstone.pyf callback.f nocallback.f > > use > > f2py -c flintstone.pyf callback.f nocallback.f > > because module name comes from the .pyf file. > > HTH, > Pearu > > On Wed, Aug 12, 2015 at 7:12 PM, Casey Deen > wrote: > > Hi all- > > I've run into what I think might be a bug in f2py and callbacks to > python. Or, maybe I'm not using things correctly. I have created a > very minimal example which illustrates my problem at: > > https://github.com/soylentdeen/fluffy-kumquat > > The issue seems to affect call backs with variables, but only when they > are called indirectly (i.e. from other fortran routines). For example, > if I have a python function > > def show_number(n): > print("%d" % n) > > and I setup a callback in a fortran routine: > > subroutine cb > cf2py intent(callback, hide) blah > external blah > call blah(5) > end > > and connect it to the python routine > fortranObject.blah = show_number > > I can successfully call the cb routine from python: > > >fortranObject.cb > 5 > > However, if I call the cb routine from within another fortran routine, > it seems to lose its marbles > > subroutine no_cb > call cb > end > > capi_return is NULL > Call-back cb_blah_in_cb__user__routines failed. 
> > For more information, please have a look at the github repository. I've > reproduced the behavior on both linux and mac. I'm not sure if this is > an error in the way I'm using the code, or if it is an actual bug. Any > and all help would be very much appreciated. > > Cheers, > Casey > > > -- > Dr. Casey Deen > Post-doctoral Researcher > deen at mpia.de > +49-6221-528-375 > Max Planck Institut f?r Astronomie (MPIA) > K?nigstuhl 17 D-69117 Heidelberg, Germany > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Dr. Casey Deen Post-doctoral Researcher deen at mpia.de +49-6221-528-375 Max Planck Institut f?r Astronomie (MPIA) K?nigstuhl 17 D-69117 Heidelberg, Germany From njs at pobox.com Thu Aug 13 01:42:46 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 12 Aug 2015 22:42:46 -0700 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? In-Reply-To: References: Message-ID: On Aug 12, 2015 2:06 PM, "Nathan Goldbaum" wrote: > > Hi all, > > I've been testing the package I spend most of my time on, yt, under numpy 1.10b1 since the announcement went out. > > I think I've narrowed down and fixed all of the test failures that cropped up except for one last issue. This doesn't respond to your main question -- sorry! -- but is there a list of the changes you had to make somewhere? We generally do want to know when we break things -- that's why we do pre-releases! -- but it's often hard to know :-). -n From jaime.frio at gmail.com Thu Aug 13 02:09:17 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Wed, 12 Aug 2015 23:09:17 -0700 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? In-Reply-To: References: Message-ID: On Wed, Aug 12, 2015 at 2:03 PM, Nathan Goldbaum wrote: > Hi all, > > I've been testing the package I spend most of my time on, yt, under numpy > 1.10b1 since the announcement went out. > > I think I've narrowed down and fixed all of the test failures that cropped > up except for one last issue. It seems that the behavior of np.digitize > with respect to ndarray subclasses has changed since the NumPy 1.9 series. > Consider the following test script: > > ```python > import numpy as np > > > class MyArray(np.ndarray): > def __new__(cls, *args, **kwargs): > return np.ndarray.__new__(cls, *args, **kwargs) > > data = np.arange(100) > > bins = np.arange(100) + 0.5 > > data = data.view(MyArray) > > bins = bins.view(MyArray) > > digits = np.digitize(data, bins) > > print type(digits) > ``` > > Under NumPy 1.9.2, this prints "", but under the > 1.10 beta, it prints "" > > I'm curious why this change was made. Since digitize outputs index arrays, > it doesn't make sense to me why it should return anything but a plain > ndarray. I see in the release notes that digitize now uses searchsorted > under the hood. Is this related? 
> It is indeed searchsorted's fault, as it returns an object of the same type as the needle (the items to search for): >>> import numpy as np >>> class A(np.ndarray): pass >>> class B(np.ndarray): pass >>> np.arange(10).view(A).searchsorted(np.arange(5).view(B)) B([0, 1, 2, 3, 4]) I am all for making index-returning functions always return a base ndarray, and will be more than happy to send a PR fixing this if there is some agreement. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Aug 13 07:47:27 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 13 Aug 2015 11:47:27 +0000 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? In-Reply-To: References: Message-ID: On Aug 12, 2015 11:12 PM, "Jaime Fern?ndez del R?o" wrote: > > On Wed, Aug 12, 2015 at 2:03 PM, Nathan Goldbaum wrote: >> >> Hi all, >> >> I've been testing the package I spend most of my time on, yt, under numpy 1.10b1 since the announcement went out. >> >> I think I've narrowed down and fixed all of the test failures that cropped up except for one last issue. It seems that the behavior of np.digitize with respect to ndarray subclasses has changed since the NumPy 1.9 series. Consider the following test script: >> >> ```python >> import numpy as np >> >> >> class MyArray(np.ndarray): >> def __new__(cls, *args, **kwargs): >> return np.ndarray.__new__(cls, *args, **kwargs) >> >> data = np.arange(100) >> >> bins = np.arange(100) + 0.5 >> >> data = data.view(MyArray) >> >> bins = bins.view(MyArray) >> >> digits = np.digitize(data, bins) >> >> print type(digits) >> ``` >> >> Under NumPy 1.9.2, this prints "", but under the 1.10 beta, it prints "" >> >> I'm curious why this change was made. Since digitize outputs index arrays, it doesn't make sense to me why it should return anything but a plain ndarray. I see in the release notes that digitize now uses searchsorted under the hood. Is this related? > > > It is indeed searchsorted's fault, as it returns an object of the same type as the needle (the items to search for): > > >>> import numpy as np > >>> class A(np.ndarray): pass > >>> class B(np.ndarray): pass > >>> np.arange(10).view(A).searchsorted(np.arange(5).view(B)) > B([0, 1, 2, 3, 4]) > > I am all for making index-returning functions always return a base ndarray, and will be more than happy to send a PR fixing this if there is some agreement. Makes sense to me. I won't be surprised if someone else then shows up saying that of course they depend on index array return types matching the input, but if that happens then I guess we can let them and Nathan fight it out :-). -n -------------- next part -------------- An HTML attachment was scrubbed... 
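Until something like that lands in numpy itself, a minimal user-side sketch of the wrapper Nathan mentioned (assuming the only problem is the subclass of the returned index array):

    import numpy as np

    def digitize_plain(x, bins, right=False):
        # np.asarray() drops any ndarray subclass from the returned indices,
        # so callers always get a plain ndarray of bin indices back
        return np.asarray(np.digitize(x, bins, right=right))
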
URL: From matthew.brett at gmail.com Thu Aug 13 08:04:34 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 13 Aug 2015 13:04:34 +0100 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Numpy 1.10.0b1 release In-Reply-To: <1439378626.17032.15.camel@sipsolutions.net> References: <55CAF8C5.8060202@fysik.dtu.dk> <1439365899.17032.7.camel@sipsolutions.net> <1439378626.17032.15.camel@sipsolutions.net> Message-ID: Hi, On Wed, Aug 12, 2015 at 12:23 PM, Sebastian Berg wrote: > On Mi, 2015-08-12 at 01:07 -0700, Nathaniel Smith wrote: >> On Wed, Aug 12, 2015 at 12:51 AM, Sebastian Berg >> wrote: >> > On Mi, 2015-08-12 at 09:41 +0200, Jens J?rgen Mortensen wrote: >> >> On 08/11/2015 11:23 PM, Charles R Harris wrote: >> >> > Hi All, >> >> > >> >> > give this release a whirl and report any problems either on the >> >> > numpy-discussion list or by opening an issue on github. >> >> > >> >> > I'm pleased to announce the first beta release of Numpy 1.10.0. >> >> > There is over a year's worth of enhancements and bug fixes in the >> >> > 1.10.0 release, so please give this release a whirl and report any >> >> > problems either on the numpy-discussion list or by opening an issue >> >> > on github. Tarballs, installers, and release notes may be found in >> >> > the usual place at Sourceforge. I'm getting test errors on the standard OSX numpy / scipy compilation rig: Python.org Python OSX 10.9 clang gfortran 4.2.3 Compiling from the `maintenance/1.10.x` branch (is there a 1.10.0b1 tag)? ====================================================================== ERROR: test_accelerate_framework_sgemv_fix (test_multiarray.TestDot) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", line 4218, in test_accelerate_framework_sgemv_fix m = aligned_array(100, 15, np.float32) File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", line 4200, in aligned_array d = np.dtype() TypeError: Required argument 'dtype' (pos 1) not found This one should be fixed by https://github.com/numpy/numpy/pull/6202 ====================================================================== ERROR: test_callback.TestF77Callback.test_string_callback ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/nose/case.py", line 381, in setUp try_run(self.inst, ('setup', 'setUp')) File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/nose/util.py", line 471, in try_run return func() File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 362, in setUp module_name=self.module_name) File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 79, in wrapper memo[key] = func(*a, **kw) File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 170, in build_code module_name=module_name) File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 79, in wrapper memo[key] = func(*a, **kw) File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", line 150, in build_module __import__(module_name) ImportError: dlopen(/var/folders/s7/r25pn2xj48n4cm76_mgsb78h0000gn/T/tmpa39XPB/_test_ext_module_5403.so, 2): Symbol not found: _func0_ Referenced from: 
/var/folders/s7/r25pn2xj48n4cm76_mgsb78h0000gn/T/tmpa39XPB/_test_ext_module_5403.so Expected in: dynamic lookup Any ideas about this second one? Cheers, Matthew From charlesr.harris at gmail.com Thu Aug 13 10:44:25 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Aug 2015 08:44:25 -0600 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? In-Reply-To: References: Message-ID: On Thu, Aug 13, 2015 at 12:09 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Wed, Aug 12, 2015 at 2:03 PM, Nathan Goldbaum > wrote: > >> Hi all, >> >> I've been testing the package I spend most of my time on, yt, under numpy >> 1.10b1 since the announcement went out. >> >> I think I've narrowed down and fixed all of the test failures that >> cropped up except for one last issue. It seems that the behavior of >> np.digitize with respect to ndarray subclasses has changed since the NumPy >> 1.9 series. Consider the following test script: >> >> ```python >> import numpy as np >> >> >> class MyArray(np.ndarray): >> def __new__(cls, *args, **kwargs): >> return np.ndarray.__new__(cls, *args, **kwargs) >> >> data = np.arange(100) >> >> bins = np.arange(100) + 0.5 >> >> data = data.view(MyArray) >> >> bins = bins.view(MyArray) >> >> digits = np.digitize(data, bins) >> >> print type(digits) >> ``` >> >> Under NumPy 1.9.2, this prints "", but under the >> 1.10 beta, it prints "" >> >> I'm curious why this change was made. Since digitize outputs index >> arrays, it doesn't make sense to me why it should return anything but a >> plain ndarray. I see in the release notes that digitize now uses >> searchsorted under the hood. Is this related? >> > > It is indeed searchsorted's fault, as it returns an object of the same > type as the needle (the items to search for): > > >>> import numpy as np > >>> class A(np.ndarray): pass > >>> class B(np.ndarray): pass > >>> np.arange(10).view(A).searchsorted(np.arange(5).view(B)) > B([0, 1, 2, 3, 4]) > > I am all for making index-returning functions always return a base > ndarray, and will be more than happy to send a PR fixing this if there is > some agreement. > I think that is the right thing to do. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Thu Aug 13 10:59:33 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Thu, 13 Aug 2015 09:59:33 -0500 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? In-Reply-To: References: Message-ID: On Thu, Aug 13, 2015 at 9:44 AM, Charles R Harris wrote: > > > On Thu, Aug 13, 2015 at 12:09 AM, Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> On Wed, Aug 12, 2015 at 2:03 PM, Nathan Goldbaum >> wrote: >> >>> Hi all, >>> >>> I've been testing the package I spend most of my time on, yt, under >>> numpy 1.10b1 since the announcement went out. >>> >>> I think I've narrowed down and fixed all of the test failures that >>> cropped up except for one last issue. It seems that the behavior of >>> np.digitize with respect to ndarray subclasses has changed since the NumPy >>> 1.9 series. 
Consider the following test script: >>> >>> ```python >>> import numpy as np >>> >>> >>> class MyArray(np.ndarray): >>> def __new__(cls, *args, **kwargs): >>> return np.ndarray.__new__(cls, *args, **kwargs) >>> >>> data = np.arange(100) >>> >>> bins = np.arange(100) + 0.5 >>> >>> data = data.view(MyArray) >>> >>> bins = bins.view(MyArray) >>> >>> digits = np.digitize(data, bins) >>> >>> print type(digits) >>> ``` >>> >>> Under NumPy 1.9.2, this prints "", but under the >>> 1.10 beta, it prints "" >>> >>> I'm curious why this change was made. Since digitize outputs index >>> arrays, it doesn't make sense to me why it should return anything but a >>> plain ndarray. I see in the release notes that digitize now uses >>> searchsorted under the hood. Is this related? >>> >> >> It is indeed searchsorted's fault, as it returns an object of the same >> type as the needle (the items to search for): >> >> >>> import numpy as np >> >>> class A(np.ndarray): pass >> >>> class B(np.ndarray): pass >> >>> np.arange(10).view(A).searchsorted(np.arange(5).view(B)) >> B([0, 1, 2, 3, 4]) >> >> I am all for making index-returning functions always return a base >> ndarray, and will be more than happy to send a PR fixing this if there is >> some agreement. >> > > I think that is the right thing to do. > Awesome, I'd appreciate having a PR to fix this. Arguably the return type *could* be the same type as the inputs, but given that it's a behavior change I agree that it's best to add a patch so the output of serachsorted is "sanitized" to be an ndarray before it's returned by digitize. To answer Nathaniel's question, I opened an issue on yt's bitbucket page to record the test failures: https://bitbucket.org/yt_analysis/yt/issues/1063/new-test-failures-using-numpy-110-beta I've fixed two of the classes of errors in that bug in yt itself, since it looks like we were relying on buggy or deprecated behavior in NumPy. Here are the PRs for those fixes: https://bitbucket.org/yt_analysis/yt/pull-requests/1697/cast-enzo-grid-start-index-to-int-arrays/diff https://bitbucket.org/yt_analysis/yt/pull-requests/1696/add-assert_allclose_units-like/diff > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Thu Aug 13 11:52:22 2015 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 13 Aug 2015 15:52:22 +0000 Subject: [Numpy-discussion] Development workflow (not git tutorial) Message-ID: Hi, What is a sensible way to work on (modify, compile, and test) numpy? There is documentation about "contributing to numpy" at: http://docs.scipy.org/doc/numpy-dev/dev/index.html and: http://docs.scipy.org/doc/numpy-dev/dev/gitwash/development_workflow.html but these are entirely focused on using git. I have no problem with that aspect. It is building and testing that I am looking for the Right Way to do. My current approach is to build an empty virtualenv, pip install nose, and from the numpy root directory do "python setup.py build_ext --inplace" and "python -c 'import numpy; numpy.test()'". This works, for my stock system python, though I get a lot of weird messages suggesting distutils problems (for example "python setup.py develop", although suggested by setup.py itself, claims that "develop" is not a command). 
But I don't know how (for example) to test with python3 without starting from a separate clean source tree. What do you recommend: use virtualenvs? Is building inplace the way to go? Is there a better way to run all tests? Are there other packages that should go into the virtualenv? What is the best way to test on multiple python versions? Switch cleanly between feature branches? Surely I can't be the only person wishing for advice on a sensible way to work with an in-development version of numpy? Perhaps this would be a good addition to CONTRIBUTING.md or the website? Thanks, Anne -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Aug 13 12:00:10 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 13 Aug 2015 18:00:10 +0200 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: Message-ID: <1439481610.10782.24.camel@sipsolutions.net> On Do, 2015-08-13 at 15:52 +0000, Anne Archibald wrote: > Hi, > > > What is a sensible way to work on (modify, compile, and test) numpy? > > > There is documentation about "contributing to numpy" at: > http://docs.scipy.org/doc/numpy-dev/dev/index.html > > and: > http://docs.scipy.org/doc/numpy-dev/dev/gitwash/development_workflow.html > > but these are entirely focused on using git. I have no problem with > that aspect. It is building and testing that I am looking for the > Right Way to do. > > > My current approach is to build an empty virtualenv, pip install nose, > and from the numpy root directory do "python setup.py build_ext > --inplace" and "python -c 'import numpy; numpy.test()'". This works, > for my stock system python, though I get a lot of weird messages > suggesting distutils problems (for example "python setup.py develop", > although suggested by setup.py itself, claims that "develop" is not a > command). But I don't know how (for example) to test with python3 > without starting from a separate clean source tree. > We have the `runtests.py` script which will do exactly this (don't think it gives lots of weird warnings normally). I think that is the only real tip I can give. - Sebastian > > What do you recommend: use virtualenvs? Is building inplace the way to > go? Is there a better way to run all tests? Are there other packages > that should go into the virtualenv? What is the best way to test on > multiple python versions? Switch cleanly between feature branches? > > > Surely I can't be the only person wishing for advice on a sensible way > to work with an in-development version of numpy? Perhaps this would be > a good addition to CONTRIBUTING.md or the website? > > > Thanks, > Anne > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Thu Aug 13 12:32:52 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Aug 2015 10:32:52 -0600 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: <1439481610.10782.24.camel@sipsolutions.net> References: <1439481610.10782.24.camel@sipsolutions.net> Message-ID: On Thu, Aug 13, 2015 at 10:00 AM, Sebastian Berg wrote: > On Do, 2015-08-13 at 15:52 +0000, Anne Archibald wrote: > > Hi, > > > > > > What is a sensible way to work on (modify, compile, and test) numpy? > > > > > > There is documentation about "contributing to numpy" at: > > http://docs.scipy.org/doc/numpy-dev/dev/index.html > > > > and: > > > http://docs.scipy.org/doc/numpy-dev/dev/gitwash/development_workflow.html > > > > but these are entirely focused on using git. I have no problem with > > that aspect. It is building and testing that I am looking for the > > Right Way to do. > > > > > > My current approach is to build an empty virtualenv, pip install nose, > > and from the numpy root directory do "python setup.py build_ext > > --inplace" and "python -c 'import numpy; numpy.test()'". This works, > > for my stock system python, though I get a lot of weird messages > > suggesting distutils problems (for example "python setup.py develop", > > although suggested by setup.py itself, claims that "develop" is not a > > command). But I don't know how (for example) to test with python3 > > without starting from a separate clean source tree. > > > > We have the `runtests.py` script which will do exactly this (don't think > it gives lots of weird warnings normally). I think that is the only real > tip I can give. > +1 for `runtests.py`. Do `python runtests.py --help` to get started. If you want another python, say 3.5, `python3.5 runtests.py`. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Thu Aug 13 12:57:18 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Thu, 13 Aug 2015 09:57:18 -0700 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? In-Reply-To: References: Message-ID: On Thu, Aug 13, 2015 at 7:59 AM, Nathan Goldbaum wrote: > > > On Thu, Aug 13, 2015 at 9:44 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Thu, Aug 13, 2015 at 12:09 AM, Jaime Fern?ndez del R?o < >> jaime.frio at gmail.com> wrote: >> >>> On Wed, Aug 12, 2015 at 2:03 PM, Nathan Goldbaum >>> wrote: >>> >>>> Hi all, >>>> >>>> I've been testing the package I spend most of my time on, yt, under >>>> numpy 1.10b1 since the announcement went out. >>>> >>>> I think I've narrowed down and fixed all of the test failures that >>>> cropped up except for one last issue. It seems that the behavior of >>>> np.digitize with respect to ndarray subclasses has changed since the NumPy >>>> 1.9 series. 
Consider the following test script: >>>> >>>> ```python >>>> import numpy as np >>>> >>>> >>>> class MyArray(np.ndarray): >>>> def __new__(cls, *args, **kwargs): >>>> return np.ndarray.__new__(cls, *args, **kwargs) >>>> >>>> data = np.arange(100) >>>> >>>> bins = np.arange(100) + 0.5 >>>> >>>> data = data.view(MyArray) >>>> >>>> bins = bins.view(MyArray) >>>> >>>> digits = np.digitize(data, bins) >>>> >>>> print type(digits) >>>> ``` >>>> >>>> Under NumPy 1.9.2, this prints "", but under the >>>> 1.10 beta, it prints "" >>>> >>>> I'm curious why this change was made. Since digitize outputs index >>>> arrays, it doesn't make sense to me why it should return anything but a >>>> plain ndarray. I see in the release notes that digitize now uses >>>> searchsorted under the hood. Is this related? >>>> >>> >>> It is indeed searchsorted's fault, as it returns an object of the same >>> type as the needle (the items to search for): >>> >>> >>> import numpy as np >>> >>> class A(np.ndarray): pass >>> >>> class B(np.ndarray): pass >>> >>> np.arange(10).view(A).searchsorted(np.arange(5).view(B)) >>> B([0, 1, 2, 3, 4]) >>> >>> I am all for making index-returning functions always return a base >>> ndarray, and will be more than happy to send a PR fixing this if there is >>> some agreement. >>> >> >> I think that is the right thing to do. >> > > Awesome, I'd appreciate having a PR to fix this. Arguably the return type > *could* be the same type as the inputs, but given that it's a behavior > change I agree that it's best to add a patch so the output of serachsorted > is "sanitized" to be an ndarray before it's returned by digitize. > It is relatively simple to do, just replace Py_TYPE(ap2) with &PyArray_Type in this line: https://github.com/numpy/numpy/blob/maintenance/1.10.x/numpy/core/src/multiarray/item_selection.c#L1725 Then fix all the tests that are expecting searchsorted to return something else than a base ndarray. We already have modified nonzero to return base ndarray's in this release, see the release notes, so it will go with the same theme. For 1.11 I think we should try to extend this "if it returns an index, it will be a base ndarray" to all other functions that don't right now. Then sit back and watch AstroPy come down in flames... ;-))) Seriously, I think this makes a lot of sense, and should be documented as the way NumPy handles index arrays. Anyway, I will try to find time tonight to put this PR together, unless someone beats me to it, which I would be totally fine with. Jaime > > To answer Nathaniel's question, I opened an issue on yt's bitbucket page > to record the test failures: > > > https://bitbucket.org/yt_analysis/yt/issues/1063/new-test-failures-using-numpy-110-beta > > I've fixed two of the classes of errors in that bug in yt itself, since it > looks like we were relying on buggy or deprecated behavior in NumPy. Here > are the PRs for those fixes: > > > https://bitbucket.org/yt_analysis/yt/pull-requests/1697/cast-enzo-grid-start-index-to-int-arrays/diff > > https://bitbucket.org/yt_analysis/yt/pull-requests/1696/add-assert_allclose_units-like/diff > >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- (\__/) ( O.o) ( > <) Este es Conejo. 
Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Thu Aug 13 14:25:09 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 13 Aug 2015 11:25:09 -0700 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: Message-ID: <87lhdfxcze.fsf@berkeley.edu> On 2015-08-13 08:52:22, Anne Archibald wrote: > My current approach is to build an empty virtualenv, pip install > nose, and from the numpy root directory do "python setup.py > build_ext --inplace" and "python -c 'import numpy; > numpy.test()'". This works, for my stock system python, though I > get a lot of weird messages suggesting distutils problems (for > example "python setup.py develop", although suggested by > setup.py itself, claims that "develop" is not a command). But I > don't know how (for example) to test with python3 without > starting from a separate clean source tree. Nowadays, you can use pip install -e . to install an in-place "editable" version of numpy. This should also execute "build_ext" for you. St?fan From sebastian at sipsolutions.net Thu Aug 13 14:34:24 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 13 Aug 2015 20:34:24 +0200 Subject: [Numpy-discussion] Multiarray API size mismatch 301 302? Message-ID: <1439490864.10782.28.camel@sipsolutions.net> Hey, just for hacking/testing, I tried to add to shape.c: /*NUMPY_API * * Checks if memory overlap exists */ NPY_NO_EXPORT int PyArray_ArraysShareMemory(PyArrayObject *arr1, PyArrayObject *arr2, int work) { return solve_may_share_memory(arr1, arr2, work); } and to numpy_api.py: # End 1.10 API 'PyArray_ArraysShareMemory': (301,), But I am getting the error: File "numpy/core/code_generators/generate_numpy_api.py", line 230, in do_generate_api (len(multiarray_api_dict), len(multiarray_api_index))) AssertionError: Multiarray API size mismatch 301 302 It is puzzling me, so anyone got a quick idea? - Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From ben.root at ou.edu Thu Aug 13 14:36:31 2015 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 13 Aug 2015 14:36:31 -0400 Subject: [Numpy-discussion] Multiarray API size mismatch 301 302? In-Reply-To: <1439490864.10782.28.camel@sipsolutions.net> References: <1439490864.10782.28.camel@sipsolutions.net> Message-ID: Did you do a "git clean -fxd" before re-installing? On Thu, Aug 13, 2015 at 2:34 PM, Sebastian Berg wrote: > Hey, > > just for hacking/testing, I tried to add to shape.c: > > > /*NUMPY_API > * > * Checks if memory overlap exists > */ > NPY_NO_EXPORT int > PyArray_ArraysShareMemory(PyArrayObject *arr1, PyArrayObject *arr2, int > work) { > return solve_may_share_memory(arr1, arr2, work); > } > > > > and to numpy_api.py: > > # End 1.10 API > 'PyArray_ArraysShareMemory': (301,), > > > But I am getting the error: > > File "numpy/core/code_generators/generate_numpy_api.py", line 230, in > do_generate_api > (len(multiarray_api_dict), len(multiarray_api_index))) > AssertionError: Multiarray API size mismatch 301 302 > > It is puzzling me, so anyone got a quick idea? 
> > - Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Aug 13 14:42:17 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 13 Aug 2015 20:42:17 +0200 Subject: [Numpy-discussion] Multiarray API size mismatch 301 302? In-Reply-To: References: <1439490864.10782.28.camel@sipsolutions.net> Message-ID: <1439491337.10782.29.camel@sipsolutions.net> On Do, 2015-08-13 at 14:36 -0400, Benjamin Root wrote: > Did you do a "git clean -fxd" before re-installing? > Yup. > > On Thu, Aug 13, 2015 at 2:34 PM, Sebastian Berg > wrote: > Hey, > > just for hacking/testing, I tried to add to shape.c: > > > /*NUMPY_API > * > * Checks if memory overlap exists > */ > NPY_NO_EXPORT int > PyArray_ArraysShareMemory(PyArrayObject *arr1, PyArrayObject > *arr2, int > work) { > return solve_may_share_memory(arr1, arr2, work); > } > > > > and to numpy_api.py: > > # End 1.10 API > 'PyArray_ArraysShareMemory': (301,), > > > But I am getting the error: > > File "numpy/core/code_generators/generate_numpy_api.py", > line 230, in > do_generate_api > (len(multiarray_api_dict), len(multiarray_api_index))) > AssertionError: Multiarray API size mismatch 301 302 > > It is puzzling me, so anyone got a quick idea? > > - Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From christian.engwer at uni-muenster.de Thu Aug 13 14:45:07 2015 From: christian.engwer at uni-muenster.de (Christian Engwer) Date: Thu, 13 Aug 2015 20:45:07 +0200 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: References: <20150812162047.GA26389@sansibar.localdomain> Message-ID: <20150813184507.GA24303@sansibar.localdomain> > >> This doesn't answer your question but: why? If you're not distributing a > >> Python project, there is no reason to use distutils instead of a sane build > >> system. > > Come on. We don't take it seriously, and neither do the Python core devs. > It's also pretty much completely unsupported. Numpy.distutils is a bit > better in that respect than Python distutils, which doesn't even get sane > patches merged. > > Try Scons, Tup, Gradle, Shake, Waf or anything else that's at least > somewhat modern and supported. Do not use numpy.distutils unless there's no > other mature choice (i.e. you're developing a Python project). Sorry, reading my mail again, it seems that I didn't make this point clear. I have a project which is python + c-lib. The later which should be used by other c-projects as well. The minimal working example is without any python code, as I only have problems with the pkg config file. ... and concerning cmake, yes we tried this as well, but using cmake to distribute the python code is also a pita ;-) ... 
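For anyone following along, the setup.py pattern being exercised here is roughly the one from the add_npy_pkg_config docstring. The sketch below uses made-up names ('mypkg', 'foo', foo.ini.in) and shows the pattern that runs into the failure discussed in this thread, not a workaround for it:

```python
# Sketch of a setup.py following the add_npy_pkg_config docstring pattern.
# 'mypkg', 'foo' and foo.ini.in are placeholder names.
def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('mypkg', parent_package, top_path)
    # Build foo.c into an installed static library next to the package ...
    config.add_installed_library('foo', sources=['foo.c'], install_dir='lib')
    # ... and generate lib/foo.ini from the foo.ini.in template, so that
    # other projects can locate the library via
    # numpy.distutils.misc_util.get_info.  The substitution keys depend on
    # what the .ini.in template actually references.
    config.add_npy_pkg_config('foo.ini.in', 'lib', {'foo': config.name})
    return config

if __name__ == '__main__':
    from numpy.distutils.core import setup
    setup(configuration=configuration)
```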
Christian From pearu.peterson at gmail.com Thu Aug 13 15:50:55 2015 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Thu, 13 Aug 2015 22:50:55 +0300 Subject: [Numpy-discussion] f2py and callbacks with variables In-Reply-To: <55CBB0B2.9060202@mpia.de> References: <55CB706C.2020502@mpia.de> <55CBB0B2.9060202@mpia.de> Message-ID: Hi Casey, On Wed, Aug 12, 2015 at 11:46 PM, Casey Deen wrote: > Hi Pearu- > > Thanks so much! This works! Can you point me to a reference for the > format of the .pyf files? My ~day of searching found a few pages on the > scipy website, but nothing which went into this amount of detail. > > Try this: https://sysbio.ioc.ee/projects/f2py2e/usersguide/index.html#signature-file > I also asked Stackoverflow, and unless you object, I'd like to add your > explanation and mark it as SOLVED for future poor souls wrestling with > this problem. I'll also update the github repository with before and > after versions of the .pyf file. > > Go ahead with stackoverflow. Best regards, Pearu Cheers, > Casey > > On 08/12/2015 09:34 PM, Pearu Peterson wrote: > > Hi Casey, > > > > What you observe, is not a f2py bug. When f2py sees a code like > > > > subroutine foo > > call bar > > end subroutine foo > > > > then it will not make an attempt to analyze bar because of implicit > > assumption that all statements that has no references to foo arguments > > are irrelevant for wrapper function generation. > > For your example, f2py needs some help. Try the following signature in > > .pyf file: > > > > subroutine barney ! in :flintstone:nocallback.f > > use test__user__routines, fred=>fred, bambam=>bambam > > intent(callback, hide) fred > > external fred > > intent(callback,hide) bambam > > external bambam > > end subroutine barney > > > > Btw, instead of > > > > f2py -c -m flintstone flintstone.pyf callback.f nocallback.f > > > > use > > > > f2py -c flintstone.pyf callback.f nocallback.f > > > > because module name comes from the .pyf file. > > > > HTH, > > Pearu > > > > On Wed, Aug 12, 2015 at 7:12 PM, Casey Deen > > wrote: > > > > Hi all- > > > > I've run into what I think might be a bug in f2py and callbacks to > > python. Or, maybe I'm not using things correctly. I have created a > > very minimal example which illustrates my problem at: > > > > https://github.com/soylentdeen/fluffy-kumquat > > > > The issue seems to affect call backs with variables, but only when > they > > are called indirectly (i.e. from other fortran routines). For > example, > > if I have a python function > > > > def show_number(n): > > print("%d" % n) > > > > and I setup a callback in a fortran routine: > > > > subroutine cb > > cf2py intent(callback, hide) blah > > external blah > > call blah(5) > > end > > > > and connect it to the python routine > > fortranObject.blah = show_number > > > > I can successfully call the cb routine from python: > > > > >fortranObject.cb > > 5 > > > > However, if I call the cb routine from within another fortran > routine, > > it seems to lose its marbles > > > > subroutine no_cb > > call cb > > end > > > > capi_return is NULL > > Call-back cb_blah_in_cb__user__routines failed. > > > > For more information, please have a look at the github repository. > I've > > reproduced the behavior on both linux and mac. I'm not sure if this > is > > an error in the way I'm using the code, or if it is an actual bug. > Any > > and all help would be very much appreciated. > > > > Cheers, > > Casey > > > > > > -- > > Dr. 
Casey Deen > > Post-doctoral Researcher > > deen at mpia.de > > +49-6221-528-375 > > Max Planck Institut f?r Astronomie (MPIA) > > K?nigstuhl 17 D-69117 Heidelberg, Germany > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Dr. Casey Deen > Post-doctoral Researcher > deen at mpia.de +49-6221-528-375 > Max Planck Institut f?r Astronomie (MPIA) > K?nigstuhl 17 D-69117 Heidelberg, Germany > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Aug 13 16:13:26 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 13 Aug 2015 22:13:26 +0200 Subject: [Numpy-discussion] Multiarray API size mismatch 301 302? In-Reply-To: <1439491337.10782.29.camel@sipsolutions.net> References: <1439490864.10782.28.camel@sipsolutions.net> <1439491337.10782.29.camel@sipsolutions.net> Message-ID: <1439496806.26334.0.camel@sipsolutions.net> So as Julian helped me, it was the wrong style of the function, the curly bracket has to go on the next line for the API generation to pick it up. - Sebastian On Do, 2015-08-13 at 20:42 +0200, Sebastian Berg wrote: > On Do, 2015-08-13 at 14:36 -0400, Benjamin Root wrote: > > Did you do a "git clean -fxd" before re-installing? > > > > Yup. > > > > > On Thu, Aug 13, 2015 at 2:34 PM, Sebastian Berg > > wrote: > > Hey, > > > > just for hacking/testing, I tried to add to shape.c: > > > > > > /*NUMPY_API > > * > > * Checks if memory overlap exists > > */ > > NPY_NO_EXPORT int > > PyArray_ArraysShareMemory(PyArrayObject *arr1, PyArrayObject > > *arr2, int > > work) { > > return solve_may_share_memory(arr1, arr2, work); > > } > > > > > > > > and to numpy_api.py: > > > > # End 1.10 API > > 'PyArray_ArraysShareMemory': (301,), > > > > > > But I am getting the error: > > > > File "numpy/core/code_generators/generate_numpy_api.py", > > line 230, in > > do_generate_api > > (len(multiarray_api_dict), len(multiarray_api_index))) > > AssertionError: Multiarray API size mismatch 301 302 > > > > It is puzzling me, so anyone got a quick idea? > > > > - Sebastian > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From ralf.gommers at gmail.com Thu Aug 13 17:09:24 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 13 Aug 2015 23:09:24 +0200 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: <20150813184507.GA24303@sansibar.localdomain> References: <20150812162047.GA26389@sansibar.localdomain> <20150813184507.GA24303@sansibar.localdomain> Message-ID: On Thu, Aug 13, 2015 at 8:45 PM, Christian Engwer < christian.engwer at uni-muenster.de> wrote: > > >> This doesn't answer your question but: why? If you're not > distributing a > > >> Python project, there is no reason to use distutils instead of a sane > build > > >> system. > > > > Come on. We don't take it seriously, and neither do the Python core devs. > > It's also pretty much completely unsupported. Numpy.distutils is a bit > > better in that respect than Python distutils, which doesn't even get sane > > patches merged. > > > > Try Scons, Tup, Gradle, Shake, Waf or anything else that's at least > > somewhat modern and supported. Do not use numpy.distutils unless there's > no > > other mature choice (i.e. you're developing a Python project). > > Sorry, reading my mail again, it seems that I didn't make this point > clear. I have a project which is python + c-lib. The later which should be > used by other c-projects as well. > Thanks for clarifying. It makes more sense now:) > The minimal working example is without any python code, as I only have > problems with the pkg config file. > I stared at it for a while, and can't figure it out despite you following the example in the add_npy_pkg_config docstring pretty much to the letter. When you see that the error is generated in a function that starts with ``# XXX: another ugly workaround to circumvent distutils brain damage.``, you're usually in trouble..... Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Aug 13 18:34:15 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 13 Aug 2015 23:34:15 +0100 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Numpy 1.10.0b1 release In-Reply-To: References: <55CAF8C5.8060202@fysik.dtu.dk> <1439365899.17032.7.camel@sipsolutions.net> <1439378626.17032.15.camel@sipsolutions.net> Message-ID: On Thu, Aug 13, 2015 at 1:04 PM, Matthew Brett wrote: > Hi, > > On Wed, Aug 12, 2015 at 12:23 PM, Sebastian Berg > wrote: >> On Mi, 2015-08-12 at 01:07 -0700, Nathaniel Smith wrote: >>> On Wed, Aug 12, 2015 at 12:51 AM, Sebastian Berg >>> wrote: >>> > On Mi, 2015-08-12 at 09:41 +0200, Jens J?rgen Mortensen wrote: >>> >> On 08/11/2015 11:23 PM, Charles R Harris wrote: >>> >> > Hi All, >>> >> > >>> >> > give this release a whirl and report any problems either on the >>> >> > numpy-discussion list or by opening an issue on github. >>> >> > >>> >> > I'm pleased to announce the first beta release of Numpy 1.10.0. >>> >> > There is over a year's worth of enhancements and bug fixes in the >>> >> > 1.10.0 release, so please give this release a whirl and report any >>> >> > problems either on the numpy-discussion list or by opening an issue >>> >> > on github. Tarballs, installers, and release notes may be found in >>> >> > the usual place at Sourceforge. 
> > I'm getting test errors on the standard OSX numpy / scipy compilation rig: > > Python.org Python > OSX 10.9 > clang > gfortran 4.2.3 > Compiling from the `maintenance/1.10.x` branch (is there a 1.10.0b1 tag)? > > ====================================================================== > ERROR: test_accelerate_framework_sgemv_fix (test_multiarray.TestDot) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", > line 4218, in test_accelerate_framework_sgemv_fix > m = aligned_array(100, 15, np.float32) > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", > line 4200, in aligned_array > d = np.dtype() > TypeError: Required argument 'dtype' (pos 1) not found > > This one should be fixed by https://github.com/numpy/numpy/pull/6202 > > ====================================================================== > ERROR: test_callback.TestF77Callback.test_string_callback > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/nose/case.py", > line 381, in setUp > try_run(self.inst, ('setup', 'setUp')) > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/nose/util.py", > line 471, in try_run > return func() > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > line 362, in setUp > module_name=self.module_name) > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > line 79, in wrapper > memo[key] = func(*a, **kw) > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > line 170, in build_code > module_name=module_name) > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > line 79, in wrapper > memo[key] = func(*a, **kw) > File "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > line 150, in build_module > __import__(module_name) > ImportError: dlopen(/var/folders/s7/r25pn2xj48n4cm76_mgsb78h0000gn/T/tmpa39XPB/_test_ext_module_5403.so, > 2): Symbol not found: _func0_ > Referenced from: > /var/folders/s7/r25pn2xj48n4cm76_mgsb78h0000gn/T/tmpa39XPB/_test_ext_module_5403.so > Expected in: dynamic lookup > > Any ideas about this second one? I don't get this second error when building with homebrew gfortran 4.8. Is this expected? Should we be raising an error for earlier gfortrans? 
Cheers, Matthew From charlesr.harris at gmail.com Thu Aug 13 19:37:27 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Aug 2015 17:37:27 -0600 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Numpy 1.10.0b1 release In-Reply-To: References: <55CAF8C5.8060202@fysik.dtu.dk> <1439365899.17032.7.camel@sipsolutions.net> <1439378626.17032.15.camel@sipsolutions.net> Message-ID: On Thu, Aug 13, 2015 at 4:34 PM, Matthew Brett wrote: > On Thu, Aug 13, 2015 at 1:04 PM, Matthew Brett > wrote: > > Hi, > > > > On Wed, Aug 12, 2015 at 12:23 PM, Sebastian Berg > > wrote: > >> On Mi, 2015-08-12 at 01:07 -0700, Nathaniel Smith wrote: > >>> On Wed, Aug 12, 2015 at 12:51 AM, Sebastian Berg > >>> wrote: > >>> > On Mi, 2015-08-12 at 09:41 +0200, Jens J?rgen Mortensen wrote: > >>> >> On 08/11/2015 11:23 PM, Charles R Harris wrote: > >>> >> > Hi All, > >>> >> > > >>> >> > give this release a whirl and report any problems either on the > >>> >> > numpy-discussion list or by opening an issue on github. > >>> >> > > >>> >> > I'm pleased to announce the first beta release of Numpy 1.10.0. > >>> >> > There is over a year's worth of enhancements and bug fixes in the > >>> >> > 1.10.0 release, so please give this release a whirl and report any > >>> >> > problems either on the numpy-discussion list or by opening an > issue > >>> >> > on github. Tarballs, installers, and release notes may be found in > >>> >> > the usual place at Sourceforge. > > > > I'm getting test errors on the standard OSX numpy / scipy compilation > rig: > > > > Python.org Python > > OSX 10.9 > > clang > > gfortran 4.2.3 > > Compiling from the `maintenance/1.10.x` branch (is there a 1.10.0b1 tag)? > > > > ====================================================================== > > ERROR: test_accelerate_framework_sgemv_fix (test_multiarray.TestDot) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", > > line 4218, in test_accelerate_framework_sgemv_fix > > m = aligned_array(100, 15, np.float32) > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/core/tests/test_multiarray.py", > > line 4200, in aligned_array > > d = np.dtype() > > TypeError: Required argument 'dtype' (pos 1) not found > > > > This one should be fixed by https://github.com/numpy/numpy/pull/6202 > > > > ====================================================================== > > ERROR: test_callback.TestF77Callback.test_string_callback > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/nose/case.py", > > line 381, in setUp > > try_run(self.inst, ('setup', 'setUp')) > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/nose/util.py", > > line 471, in try_run > > return func() > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > > line 362, in setUp > > module_name=self.module_name) > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > > line 79, in wrapper > > memo[key] = func(*a, **kw) > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > > line 170, in build_code > > module_name=module_name) > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > > 
line 79, in wrapper > > memo[key] = func(*a, **kw) > > File > "/Users/mb312/.virtualenvs/test/lib/python2.7/site-packages/numpy/f2py/tests/util.py", > > line 150, in build_module > > __import__(module_name) > > ImportError: > dlopen(/var/folders/s7/r25pn2xj48n4cm76_mgsb78h0000gn/T/tmpa39XPB/_test_ext_module_5403.so, > > 2): Symbol not found: _func0_ > > Referenced from: > > > /var/folders/s7/r25pn2xj48n4cm76_mgsb78h0000gn/T/tmpa39XPB/_test_ext_module_5403.so > > Expected in: dynamic lookup > > > > Any ideas about this second one? > > I don't get this second error when building with homebrew gfortran > 4.8. Is this expected? Should we be raising an error for earlier > gfortrans? > > Probaby, or we could make the failing bit, if we can find it, gcc version dependent. gcc 4.2 is eight years old, which OS X versions depend on it? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Fri Aug 14 01:11:36 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Thu, 13 Aug 2015 22:11:36 -0700 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? In-Reply-To: References: Message-ID: On Thu, Aug 13, 2015 at 9:57 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Thu, Aug 13, 2015 at 7:59 AM, Nathan Goldbaum > wrote: > >> >> >> On Thu, Aug 13, 2015 at 9:44 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Thu, Aug 13, 2015 at 12:09 AM, Jaime Fern?ndez del R?o < >>> jaime.frio at gmail.com> wrote: >>> >>>> On Wed, Aug 12, 2015 at 2:03 PM, Nathan Goldbaum >>> > wrote: >>>> >>>>> Hi all, >>>>> >>>>> I've been testing the package I spend most of my time on, yt, under >>>>> numpy 1.10b1 since the announcement went out. >>>>> >>>>> I think I've narrowed down and fixed all of the test failures that >>>>> cropped up except for one last issue. It seems that the behavior of >>>>> np.digitize with respect to ndarray subclasses has changed since the NumPy >>>>> 1.9 series. Consider the following test script: >>>>> >>>>> ```python >>>>> import numpy as np >>>>> >>>>> >>>>> class MyArray(np.ndarray): >>>>> def __new__(cls, *args, **kwargs): >>>>> return np.ndarray.__new__(cls, *args, **kwargs) >>>>> >>>>> data = np.arange(100) >>>>> >>>>> bins = np.arange(100) + 0.5 >>>>> >>>>> data = data.view(MyArray) >>>>> >>>>> bins = bins.view(MyArray) >>>>> >>>>> digits = np.digitize(data, bins) >>>>> >>>>> print type(digits) >>>>> ``` >>>>> >>>>> Under NumPy 1.9.2, this prints "", but under the >>>>> 1.10 beta, it prints "" >>>>> >>>>> I'm curious why this change was made. Since digitize outputs index >>>>> arrays, it doesn't make sense to me why it should return anything but a >>>>> plain ndarray. I see in the release notes that digitize now uses >>>>> searchsorted under the hood. Is this related? >>>>> >>>> >>>> It is indeed searchsorted's fault, as it returns an object of the same >>>> type as the needle (the items to search for): >>>> >>>> >>> import numpy as np >>>> >>> class A(np.ndarray): pass >>>> >>> class B(np.ndarray): pass >>>> >>> np.arange(10).view(A).searchsorted(np.arange(5).view(B)) >>>> B([0, 1, 2, 3, 4]) >>>> >>>> I am all for making index-returning functions always return a base >>>> ndarray, and will be more than happy to send a PR fixing this if there is >>>> some agreement. >>>> >>> >>> I think that is the right thing to do. >>> >> >> Awesome, I'd appreciate having a PR to fix this. 
Arguably the return type >> *could* be the same type as the inputs, but given that it's a behavior >> change I agree that it's best to add a patch so the output of serachsorted >> is "sanitized" to be an ndarray before it's returned by digitize. >> > > It is relatively simple to do, just replace Py_TYPE(ap2) with > &PyArray_Type in this line: > > > https://github.com/numpy/numpy/blob/maintenance/1.10.x/numpy/core/src/multiarray/item_selection.c#L1725 > > Then fix all the tests that are expecting searchsorted to return something > else than a base ndarray. We already have modified nonzero to return base > ndarray's in this release, see the release notes, so it will go with the > same theme. > > For 1.11 I think we should try to extend this "if it returns an index, it > will be a base ndarray" to all other functions that don't right now. Then > sit back and watch AstroPy come down in flames... ;-))) > > Seriously, I think this makes a lot of sense, and should be documented as > the way NumPy handles index arrays. > > Anyway, I will try to find time tonight to put this PR together, unless > someone beats me to it, which I would be totally fine with. > PR #6206 it is: https://github.com/numpy/numpy/pull/6206 Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From subbarao.athmuri at gmail.com Fri Aug 14 01:01:27 2015 From: subbarao.athmuri at gmail.com (subro) Date: Thu, 13 Aug 2015 22:01:27 -0700 (MST) Subject: [Numpy-discussion] Help in understanding Message-ID: <1439528487720-40827.post@n7.nabble.com> Hi, I am new to NumPy, Can someone help me in understanding below code. >>> names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe']) >>> data = np.random.random((7,4)) >>> print data [[ 0.85402649 0.12827655 0.5805555 0.86288236] [ 0.30162683 0.45269508 0.98098039 0.1291469 ] [ 0.21229924 0.37497112 0.57367496 0.08607771] [ 0.302866 0.42160468 0.26879288 0.68032467] [ 0.60612492 0.35210577 0.91355096 0.57872181] [ 0.11583826 0.81988882 0.39214077 0.51377566] [ 0.03767641 0.1920532 0.24872009 0.36068313]] >>> data[names == 'Bob'] array([[ 0.85402649, 0.12827655, 0.5805555 , 0.86288236], [ 0.302866 , 0.42160468, 0.26879288, 0.68032467]]) Also, can someone help me where and how to practice NumPy? -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Help-in-understanding-tp40827.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From deadmanwalking.aditya at gmail.com Fri Aug 14 03:06:08 2015 From: deadmanwalking.aditya at gmail.com (Aditya Krishnamurthy) Date: Fri, 14 Aug 2015 12:36:08 +0530 Subject: [Numpy-discussion] Help in understanding In-Reply-To: <1439528487720-40827.post@n7.nabble.com> References: <1439528487720-40827.post@n7.nabble.com> Message-ID: names == 'Bob' returns a boolean array [True, False, False, True, False, False, False], and data[boolean_array] returns all those elements of data where the boolean array is True. data is a list of 7 lists, so the two lists corresponding to True values are returned. Read the Numpy basics and Advanced Numpy chapters from Python for Data Analysis by Wes McKinney, it is available ol. On Fri, Aug 14, 2015 at 10:31 AM, subro wrote: > Hi, > > I am new to NumPy, Can someone help me in understanding below code. 
> > >>> names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe']) > > >>> data = np.random.random((7,4)) > > >>> print data > [[ 0.85402649 0.12827655 0.5805555 0.86288236] > [ 0.30162683 0.45269508 0.98098039 0.1291469 ] > [ 0.21229924 0.37497112 0.57367496 0.08607771] > [ 0.302866 0.42160468 0.26879288 0.68032467] > [ 0.60612492 0.35210577 0.91355096 0.57872181] > [ 0.11583826 0.81988882 0.39214077 0.51377566] > [ 0.03767641 0.1920532 0.24872009 0.36068313]] > > >>> data[names == 'Bob'] > > array([[ 0.85402649, 0.12827655, 0.5805555 , 0.86288236], > [ 0.302866 , 0.42160468, 0.26879288, 0.68032467]]) > > Also, can someone help me where and how to practice NumPy? > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/Help-in-understanding-tp40827.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Fri Aug 14 05:15:39 2015 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Fri, 14 Aug 2015 11:15:39 +0200 Subject: [Numpy-discussion] Changes to np.digitize since NumPy 1.9? In-Reply-To: References: Message-ID: For what it's worth, also from my astropy perspective I think hat any index array should be a base ndarray! -- Marten On Fri, Aug 14, 2015 at 7:11 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Thu, Aug 13, 2015 at 9:57 AM, Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> On Thu, Aug 13, 2015 at 7:59 AM, Nathan Goldbaum >> wrote: >> >>> >>> >>> On Thu, Aug 13, 2015 at 9:44 AM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> On Thu, Aug 13, 2015 at 12:09 AM, Jaime Fern?ndez del R?o < >>>> jaime.frio at gmail.com> wrote: >>>> >>>>> On Wed, Aug 12, 2015 at 2:03 PM, Nathan Goldbaum < >>>>> nathan12343 at gmail.com> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I've been testing the package I spend most of my time on, yt, under >>>>>> numpy 1.10b1 since the announcement went out. >>>>>> >>>>>> I think I've narrowed down and fixed all of the test failures that >>>>>> cropped up except for one last issue. It seems that the behavior of >>>>>> np.digitize with respect to ndarray subclasses has changed since the NumPy >>>>>> 1.9 series. Consider the following test script: >>>>>> >>>>>> ```python >>>>>> import numpy as np >>>>>> >>>>>> >>>>>> class MyArray(np.ndarray): >>>>>> def __new__(cls, *args, **kwargs): >>>>>> return np.ndarray.__new__(cls, *args, **kwargs) >>>>>> >>>>>> data = np.arange(100) >>>>>> >>>>>> bins = np.arange(100) + 0.5 >>>>>> >>>>>> data = data.view(MyArray) >>>>>> >>>>>> bins = bins.view(MyArray) >>>>>> >>>>>> digits = np.digitize(data, bins) >>>>>> >>>>>> print type(digits) >>>>>> ``` >>>>>> >>>>>> Under NumPy 1.9.2, this prints "", but under >>>>>> the 1.10 beta, it prints "" >>>>>> >>>>>> I'm curious why this change was made. Since digitize outputs index >>>>>> arrays, it doesn't make sense to me why it should return anything but a >>>>>> plain ndarray. I see in the release notes that digitize now uses >>>>>> searchsorted under the hood. Is this related? 
>>>>>> >>>>> >>>>> It is indeed searchsorted's fault, as it returns an object of the same >>>>> type as the needle (the items to search for): >>>>> >>>>> >>> import numpy as np >>>>> >>> class A(np.ndarray): pass >>>>> >>> class B(np.ndarray): pass >>>>> >>> np.arange(10).view(A).searchsorted(np.arange(5).view(B)) >>>>> B([0, 1, 2, 3, 4]) >>>>> >>>>> I am all for making index-returning functions always return a base >>>>> ndarray, and will be more than happy to send a PR fixing this if there is >>>>> some agreement. >>>>> >>>> >>>> I think that is the right thing to do. >>>> >>> >>> Awesome, I'd appreciate having a PR to fix this. Arguably the return >>> type *could* be the same type as the inputs, but given that it's a behavior >>> change I agree that it's best to add a patch so the output of serachsorted >>> is "sanitized" to be an ndarray before it's returned by digitize. >>> >> >> It is relatively simple to do, just replace Py_TYPE(ap2) with >> &PyArray_Type in this line: >> >> >> https://github.com/numpy/numpy/blob/maintenance/1.10.x/numpy/core/src/multiarray/item_selection.c#L1725 >> >> Then fix all the tests that are expecting searchsorted to return >> something else than a base ndarray. We already have modified nonzero to >> return base ndarray's in this release, see the release notes, so it will go >> with the same theme. >> >> For 1.11 I think we should try to extend this "if it returns an index, it >> will be a base ndarray" to all other functions that don't right now. Then >> sit back and watch AstroPy come down in flames... ;-))) >> >> Seriously, I think this makes a lot of sense, and should be documented as >> the way NumPy handles index arrays. >> >> Anyway, I will try to find time tonight to put this PR together, unless >> someone beats me to it, which I would be totally fine with. >> > > PR #6206 it is: https://github.com/numpy/numpy/pull/6206 > > Jaime > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes > de dominaci?n mundial. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Aug 14 12:12:46 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 14 Aug 2015 09:12:46 -0700 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: <87lhdfxcze.fsf@berkeley.edu> References: <87lhdfxcze.fsf@berkeley.edu> Message-ID: On Thu, Aug 13, 2015 at 11:25 AM, Stefan van der Walt wrote: > >(for > > example "python setup.py develop", although suggested by > > setup.py itself, claims that "develop" is not a command). develop is a command provided by setuptools, not distutils itself. I find it absolutely invaluable -- it is THE way to go when actively working on any package under development. if numpy doesn't currently use setuptools, it probably should (though maybe it's gets messy with numpy's distutils extensions...) Nowadays, you can use > > pip install -e . > pip "injects" setuptools into the mix -- so this may be develope mode with a different name. but yes, a fine option for a package that doesn't use setuptools out of the box. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Fri Aug 14 13:08:11 2015 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 14 Aug 2015 13:08:11 -0400 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: <87lhdfxcze.fsf@berkeley.edu> Message-ID: I used to be a huge advocate for the "develop" mode, but not anymore. I have run into way too many Heisenbugs that would clear up if I nuked my source tree and re-clone. I should also note that there is currently an open issue with "pip install -e" and namespace packages. This has been reported to matplotlib with regards to mpl_toolkits. Essentially, if you have namespace packages, it doesn't get installed correctly in this mode, and python won't find them. On Fri, Aug 14, 2015 at 12:12 PM, Chris Barker wrote: > On Thu, Aug 13, 2015 at 11:25 AM, Stefan van der Walt < > stefanv at berkeley.edu> wrote: > >> >(for >> > example "python setup.py develop", although suggested by >> > setup.py itself, claims that "develop" is not a command). > > > develop is a command provided by setuptools, not distutils itself. > > I find it absolutely invaluable -- it is THE way to go when actively > working on any package under development. > > if numpy doesn't currently use setuptools, it probably should (though > maybe it's gets messy with numpy's distutils extensions...) > > Nowadays, you can use >> >> pip install -e . >> > > pip "injects" setuptools into the mix -- so this may be develope mode with > a different name. but yes, a fine option for a package that doesn't use > setuptools out of the box. > > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Fri Aug 14 13:45:33 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 14 Aug 2015 13:45:33 -0400 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: Message-ID: <55CE293D.70502@gmail.com> On 08/13/2015 11:52 AM, Anne Archibald wrote: > Hi, > > What is a sensible way to work on (modify, compile, and test) numpy? > > There is documentation about "contributing to numpy" at: > http://docs.scipy.org/doc/numpy-dev/dev/index.html > and: > http://docs.scipy.org/doc/numpy-dev/dev/gitwash/development_workflow.html > but these are entirely focused on using git. I have no problem with that > aspect. It is building and testing that I am looking for the Right Way > to do. Related to this, does anyone know how to debug numpy in gdb with proper symbols/source lines, like I can do with other C extensions? I've tried modifying numpy distutils to try to add the right compiler/linker flags, without success. 
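(The runtests.py route discussed below is the supported way to get a debug build.) As a rough manual alternative, distutils appends the CFLAGS environment variable to its compile flags, so an unoptimized in-place build with symbols can usually be forced like this; a sketch assuming a gcc-compatible compiler, run from the numpy source root:

```python
import os
import subprocess

# CFLAGS from the environment is appended after the default flags, and for
# gcc/clang the last -O flag wins, so this effectively gives -O0 -g.
env = dict(os.environ, CFLAGS='-O0 -g')
subprocess.check_call(
    ['python', 'setup.py', 'build_ext', '--inplace'], env=env)
# then attach the debugger, e.g.:  gdb --args python your_script.py
```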
Allan From pav at iki.fi Fri Aug 14 13:52:24 2015 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 14 Aug 2015 20:52:24 +0300 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: <55CE293D.70502@gmail.com> References: <55CE293D.70502@gmail.com> Message-ID: 14.08.2015, 20:45, Allan Haldane kirjoitti: [clip] > Related to this, does anyone know how to debug numpy in gdb with proper > symbols/source lines, like I can do with other C extensions? I've tried > modifying numpy distutils to try to add the right compiler/linker flags, > without success. runtests.py --help gdb --args python runtests.py -g --python script.py grep env runtests.py From allanhaldane at gmail.com Fri Aug 14 13:57:21 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Fri, 14 Aug 2015 13:57:21 -0400 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: <55CE293D.70502@gmail.com> Message-ID: <55CE2C01.4000504@gmail.com> On 08/14/2015 01:52 PM, Pauli Virtanen wrote: > 14.08.2015, 20:45, Allan Haldane kirjoitti: > [clip] >> Related to this, does anyone know how to debug numpy in gdb with proper >> symbols/source lines, like I can do with other C extensions? I've tried >> modifying numpy distutils to try to add the right compiler/linker flags, >> without success. > > runtests.py --help > > gdb --args python runtests.py -g --python script.py > > grep env runtests.py Oh! Thank you, I missed that. From njs at pobox.com Fri Aug 14 15:19:16 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 14 Aug 2015 12:19:16 -0700 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: <87lhdfxcze.fsf@berkeley.edu> Message-ID: On Aug 14, 2015 09:16, "Chris Barker" wrote: > > On Thu, Aug 13, 2015 at 11:25 AM, Stefan van der Walt < stefanv at berkeley.edu> wrote: >> >> >(for >> > example "python setup.py develop", although suggested by >> > setup.py itself, claims that "develop" is not a command). > > > develop is a command provided by setuptools, not distutils itself. > > I find it absolutely invaluable -- it is THE way to go when actively working on any package under development. > > if numpy doesn't currently use setuptools, it probably should (though maybe it's gets messy with numpy's distutils extensions...) Regarding using setuptools by default, one problem is that it actually acts rather differently from distutils by default. See https://bitbucket.org/pypa/setuptools/issues/371/setuptools-and-state-of-pep-376 >> Nowadays, you can use >> >> pip install -e . > > > pip "injects" setuptools into the mix -- so this may be develope mode with a different name. but yes, a fine option for a package that doesn't use setuptools out of the box. The version of setuptools that pip injects is also monkeypatched by pip to fix some of setuptools more obnoxious defaults. (The ones described in that bug report.) Using pip is also is the only way to reliably install all the right metadata needed to avoid problems later -- in particular pip will record the information needed to do uninstall/upgrades correctly, which neither distutils nor setuptools will do if you run setup.py directly. Basically this means running 'setup.py install' is always broken, for all projects and no matter how setup.py is written, and you should always run a pip command instead, even when building from the source tree. This is true for every python package, though, not just numpy. So setuptools doesn't provide much that's compelling for us... 
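Whichever of these setups is used (in-place build, develop/-e install, or a plain install into a virtualenv), a quick sanity check that often saves confusion is to confirm which copy of numpy is actually being imported; a minimal sketch:

```python
import numpy as np

# The reported path shows whether the interpreter picked up the in-place
# source tree, a develop/-e install, or a copy in site-packages.
print(np.__version__)
print(np.__file__)
```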
I believe if you really want it, though, you can run numpy's setupegg.py, which is the same as setup.py but using setuptools. Or something like that? I share Benjamin's doubts about the whole 'develop' approach, though, however accessed. For pure python packages, just importing from the source tree directly works fine and is way less error prone. For non-pure packages, I don't trust develop much anyway... build_ext --inplace can work nicely, or for numpy in particular runtests solves all my problems. (Though even then I still sometimes need to nuke the build directory or run clean manually.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Aug 14 16:19:56 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 14 Aug 2015 13:19:56 -0700 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: <87lhdfxcze.fsf@berkeley.edu> Message-ID: On Fri, Aug 14, 2015 at 10:08 AM, Benjamin Root wrote: > I used to be a huge advocate for the "develop" mode, but not anymore. I > have run into way too many Heisenbugs that would clear up if I nuked my > source tree and re-clone. > well, you do need to remember to clean out once in a while, when somethign weird is happening... But I prefer that to the other options, which are: * re-builda nd re-install with every frikin' change * do sys.path manipulations, which is ugly,, error prone, and has the same problems as develop mode anyway * rely on relative imports for all your tests and the like -- error prone and ugly -- oh, and you still have the problems above... I should also note that there is currently an open issue with "pip install > -e" and namespace packages. > yeah, I actually gave up on namespace packages due to them not working right with develop mode. (I'm not sure if -e and develop mode are exactly the same or not...) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jenshnielsen at gmail.com Fri Aug 14 16:34:16 2015 From: jenshnielsen at gmail.com (Jens Nielsen) Date: Fri, 14 Aug 2015 20:34:16 +0000 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: <87lhdfxcze.fsf@berkeley.edu> Message-ID: I think it's clear that develop/-e does not work well together with namespace packages. As noted on the relevant matplotlib issue https://github.com/matplotlib/matplotlib/issues/4907 I think the issue with namespace packages is essentially this well known one https://github.com/pypa/pip/issues/3 which I think I agree with Chris is enough to drop namespace packages if possible. >From the output of pip install -e I would say that it clear that it calls develop. Since pip install -e and and pip install uses fundamentally different ways of manage namespace packages they can't work together. In the case of matplotlib issue #4907 basemap is probably installed into the namespace with pip install while matplotlib is installed with pip install -e which clearly triggers the issue in https://github.com/pypa/pip/issues/3 best Jens fre. 14. aug. 2015 kl. 21.21 skrev Chris Barker : > On Fri, Aug 14, 2015 at 10:08 AM, Benjamin Root wrote: > >> I used to be a huge advocate for the "develop" mode, but not anymore. 
I >> have run into way too many Heisenbugs that would clear up if I nuked my >> source tree and re-clone. >> > > well, you do need to remember to clean out once in a while, when somethign > weird is happening... > > But I prefer that to the other options, which are: > > * re-builda nd re-install with every frikin' change > > * do sys.path manipulations, which is ugly,, error prone, and has the same > problems as develop mode anyway > > * rely on relative imports for all your tests and the like -- error prone > and ugly -- oh, and you still have the problems above... > > > I should also note that there is currently an open issue with "pip install >> -e" and namespace packages. >> > > yeah, I actually gave up on namespace packages due to them not working > right with develop mode. > > (I'm not sure if -e and develop mode are exactly the same or not...) > > -CHB > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Fri Aug 14 13:15:14 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Fri, 14 Aug 2015 10:15:14 -0700 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: <87lhdfxcze.fsf@berkeley.edu> Message-ID: <87lhddx04d.fsf@berkeley.edu> On 2015-08-14 10:08:11, Benjamin Root wrote: > I should also note that there is currently an open issue with > "pip install -e" and namespace packages. This has been reported > to matplotlib with regards to mpl_toolkits. Essentially, if you > have namespace packages, it doesn't get installed correctly in > this mode, and python won't find them. There are lots of issues with namespace packages, which is why what used to be scikits.learn and scikits.image are now all standalone packages. Perhaps mpl_toolkits should think of becoming mpl_3d, mpl_basemaps, etc.? St?fan From christian.engwer at uni-muenster.de Fri Aug 14 17:25:33 2015 From: christian.engwer at uni-muenster.de (Christian Engwer) Date: Fri, 14 Aug 2015 23:25:33 +0200 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: References: <20150812162047.GA26389@sansibar.localdomain> <20150813184507.GA24303@sansibar.localdomain> Message-ID: <20150814212533.GA8656@sansibar.localdomain> Dear Ralf, > I stared at it for a while, and can't figure it out despite you following > the example in the add_npy_pkg_config docstring pretty much to the letter. > When you see that the error is generated in a function that starts with ``# > XXX: another ugly workaround to circumvent distutils brain damage.``, > you're usually in trouble..... what a pity... do you have an alternative suggestion? Is there a good alternative, e.g. using cmake, to distribute python modules? 
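For reference, the pattern I have been trying to follow is essentially the one from the add_npy_pkg_config docstring, roughly like this ('mypkg', foo.c and foo.ini.in are placeholder names, not the real ones):

    # setup.py -- condensed sketch of the add_npy_pkg_config docstring example
    def configuration(parent_package='', top_path=None):
        from numpy.distutils.misc_util import Configuration
        config = Configuration('mypkg', parent_package, top_path)
        # build a static 'foo' library and install it alongside the package
        config.add_installed_library('foo', sources=['foo.c'], install_dir='lib')
        # generate foo.ini from the template, so that other packages can later
        # query the build flags with numpy.distutils.misc_util.get_info('foo')
        config.add_npy_pkg_config('foo.ini.in', 'lib', subst_dict={'version': '1.0'})
        return config

    if __name__ == '__main__':
        from numpy.distutils.core import setup
        setup(configuration=configuration)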
Ciao Christian From chris.barker at noaa.gov Fri Aug 14 18:44:24 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 14 Aug 2015 15:44:24 -0700 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: <87lhddx04d.fsf@berkeley.edu> References: <87lhdfxcze.fsf@berkeley.edu> <87lhddx04d.fsf@berkeley.edu> Message-ID: On Fri, Aug 14, 2015 at 10:15 AM, Stefan van der Walt wrote: > Perhaps mpl_toolkits should think of > becoming mpl_3d, mpl_basemaps, etc.? > namespace packages are a fine idea, but the implementation(s) are just one big kludge... I think so, but we're getting off-topic here. numpy doesn't use namespace packages, so develop mode works there. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri Aug 14 19:08:05 2015 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 15 Aug 2015 02:08:05 +0300 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: <87lhdfxcze.fsf@berkeley.edu> <87lhddx04d.fsf@berkeley.edu> Message-ID: 15.08.2015, 01:44, Chris Barker kirjoitti: [clip] > numpy doesn't use namespace packages, so develop mode works there. The develop mode is mainly useful with a virtualenv. Otherwise, you install work-in-progress development version into your ~/.local which then breaks everything else. In addition to this, "python setupegg.py develop --uninstall" says "Note: you must uninstall or replace scripts manually!", and since the scripts end up with dev version requirement hardcoded, and you have to delete the scripts manually. Virtualenvs are annoying to manage, and at least for me personally it's easier to just deal with pythonpath, especially as runtests.py manages that. Anyway, TIMTOWTDI From ralf.gommers at gmail.com Sat Aug 15 04:19:24 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 15 Aug 2015 10:19:24 +0200 Subject: [Numpy-discussion] Development workflow (not git tutorial) In-Reply-To: References: <87lhdfxcze.fsf@berkeley.edu> <87lhddx04d.fsf@berkeley.edu> Message-ID: On Sat, Aug 15, 2015 at 1:08 AM, Pauli Virtanen wrote: > 15.08.2015, 01:44, Chris Barker kirjoitti: > [clip] > > numpy doesn't use namespace packages, so develop mode works there. > > The develop mode is mainly useful with a virtualenv. > > Otherwise, you install work-in-progress development version into your > ~/.local which then breaks everything else. In addition to this, "python > setupegg.py develop --uninstall" says "Note: you must uninstall or > replace scripts manually!", and since the scripts end up with dev > version requirement hardcoded, and you have to delete the scripts manually. > > Virtualenvs are annoying to manage, and at least for me personally it's > easier to just deal with pythonpath, especially as runtests.py manages > that. > I completely agree. Virtualenv/pip/setuptools all too many issues and corner cases where things don't quite work for development purposes. Using runtests.py is the most reliable approach for working on numpy (or just use an in-place build + pythonpath management if you prefer). 
To get back to the original question of Anne (as well as the gdb one): most of what was said and recommended in this thread is fairly well documented in https://github.com/numpy/numpy/blob/master/doc/source/dev/development_environment.rst Unfortunately it doesn't yet show up in http://docs.scipy.org/doc/numpy-dev/dev/index.html because that hasn't been updated in a while. If someone who knows how to do that could push an update, that would be great. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Aug 15 05:19:51 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 15 Aug 2015 11:19:51 +0200 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: <20150814212533.GA8656@sansibar.localdomain> References: <20150812162047.GA26389@sansibar.localdomain> <20150813184507.GA24303@sansibar.localdomain> <20150814212533.GA8656@sansibar.localdomain> Message-ID: On Fri, Aug 14, 2015 at 11:25 PM, Christian Engwer < christian.engwer at uni-muenster.de> wrote: > Dear Ralf, > > > I stared at it for a while, and can't figure it out despite you following > > the example in the add_npy_pkg_config docstring pretty much to the > letter. > > When you see that the error is generated in a function that starts with > ``# > > XXX: another ugly workaround to circumvent distutils brain damage.``, > > you're usually in trouble..... > > what a pity... do you have an alternative suggestion? Is there a good > alternative, e.g. using cmake, to distribute python modules? > I wouldn't give up on distutils here (yet). For distributing/installing python packages, PyPi + pip are the de-facto standard and pip is currently tied to distutils/setuptools unfortunately. That I can't figure out your issue in 20 minutes doesn't mean it's not fixable, it just means that I'm not smart enough to keep the distutils "design" in my head:) The code you're trying to use isn't well tested because while a lot of packages use numpy.distutils with compiled code, very few Python packages expose a C API. For example Scipy doesn't use `add_npy_pkg_config` or `add_installed_library` at all. Those functions work for numpy itself though, so they can't be completely broken. If no one has an answer here, what I would do if I were you is break out your debugger and figure out what's in `pkg` when you build numpy itself and why it's None when you build your own code. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Aug 15 05:27:27 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 15 Aug 2015 11:27:27 +0200 Subject: [Numpy-discussion] Problems using add_npy_pkg_config In-Reply-To: References: <20150812162047.GA26389@sansibar.localdomain> <20150813184507.GA24303@sansibar.localdomain> <20150814212533.GA8656@sansibar.localdomain> Message-ID: On Sat, Aug 15, 2015 at 11:19 AM, Ralf Gommers wrote: > > > > On Fri, Aug 14, 2015 at 11:25 PM, Christian Engwer < > christian.engwer at uni-muenster.de> wrote: > >> Dear Ralf, >> >> > I stared at it for a while, and can't figure it out despite you >> following >> > the example in the add_npy_pkg_config docstring pretty much to the >> letter. >> > When you see that the error is generated in a function that starts with >> ``# >> > XXX: another ugly workaround to circumvent distutils brain damage.``, >> > you're usually in trouble..... >> >> what a pity... do you have an alternative suggestion? Is there a good >> alternative, e.g. 
using cmake, to distribute python modules? >> > > > I wouldn't give up on distutils here (yet). For distributing/installing > python packages, PyPi + pip are the de-facto standard and pip is currently > tied to distutils/setuptools unfortunately. > Correction: the above is only completely true if you rely on source builds. You can't avoid those with PyPi on Linux, but if you only need to support Windows and OS X nowadays you can get away with no disutils if you upload only binary wheels for those OSes to PyPi. Regarding alternatives, this discussion is a bit older but mostly still relevant: http://article.gmane.org/gmane.comp.python.numeric.general/27788 Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Sun Aug 16 14:59:05 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Sun, 16 Aug 2015 20:59:05 +0200 Subject: [Numpy-discussion] Numpy 100 exercices Message-ID: Hi all, I've just updated the collection of numpy exercices (collected from this list and stack overflow) that lives at: https://github.com/rougier/numpy-100 http://www.labri.fr/perso/nrougier/teaching/numpy.100/index.html Unfortunately, I also realized there are currently "only" 60 exercices... So, if you remember a nice question that has been answered on this list (or elsewhere)... Or you can also make a PR on github. Nicolas From charlesr.harris at gmail.com Sun Aug 16 16:04:33 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 16 Aug 2015 14:04:33 -0600 Subject: [Numpy-discussion] Numpy 1.11 Message-ID: Hi All, While waiting for Christoph to drop the other shoe on 1.10.0b1, I thought I'd try again to start a discussion on the 1.11 release. If we want to get the next release out in a timely manner there should be some advance planning to cover at least the following three items - Release manager: I can do that again, but would be more than happy to let someone else give it a shot. - Goals to meet: - __numpy_ufunc__ - masked array and recarray fixups for astropy - ? - Release date to shoot for, I'd suggest Feb 1, 2016 Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Aug 17 04:53:01 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 17 Aug 2015 10:53:01 +0200 Subject: [Numpy-discussion] Numpy 1.11 In-Reply-To: References: Message-ID: <1439801581.1893.29.camel@sipsolutions.net> On So, 2015-08-16 at 14:04 -0600, Charles R Harris wrote: > Hi All, > > > While waiting for Christoph to drop the other shoe on 1.10.0b1, I > thought I'd try again to start a discussion on the 1.11 release. If we > want to get the next release out in a timely manner there should be > some advance planning to cover at least the following three items > > * Release manager: I can do that again, but would be more than > happy to let someone else give it a shot. > * Goals to meet: > * __numpy_ufunc__ > * masked array and recarray fixups for astropy > * ? > * Release date to shoot for, I'd suggest Feb 1, 2016 > Sounds good to me. Feb. 1 probably would mean a feature freeze in the beginning of January? I guess we could try to push a bit harder to a stricter 4 months cycle, but with the common and of year stress and holiday season it sounds reasonable. Can't think of any goal to explicitly meet [1], but I guess that is mostly downstream wishes anyway ;). 
Personally, I would like to see Pauli's overlap detection and the proposed indexing changes by the next release. That should be doable, but they are not explicit goals. Our Austin discussions might have some smaller stuff/wishes that should have some priority. - Sebastian [1] However, I would very much like to see the organizational stuff to be finalized this year. But I know it is probably quite a bit of work which can drag on. > Thoughts? > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From mailinglists at xgm.de Mon Aug 17 07:11:43 2015 From: mailinglists at xgm.de (Florian Lindner) Date: Mon, 17 Aug 2015 13:11:43 +0200 Subject: [Numpy-discussion] NPY_DOUBLE not declared Message-ID: Hello, I am trying to convert a piece of C code to the new NumPy API to get rid of the deprecation warning. #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" As a first step I replaced arrayobject.h by numpy/npy_math.h: #include #include But this gives errors that NPY_DOUBLE is not declared. http://docs.scipy.org/doc/numpy/reference/c-api.dtype.html#c.NPY_DOUBLE gives no information where NPY_DOUBLE is declared, so I used the standard npy_math.h header. src/action/PythonAction.cpp:90:51: error: 'NPY_DOUBLE' was not declared in this scope PyArray_SimpleNewFromData(1, sourceDim, NPY_DOUBLE, sourceValues); Including numpy/npy_common.h does not change it either.
> > Thanks, > Florian > From ralf.gommers at gmail.com Mon Aug 17 15:53:06 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 17 Aug 2015 21:53:06 +0200 Subject: [Numpy-discussion] Numpy 1.11 In-Reply-To: <1439801581.1893.29.camel@sipsolutions.net> References: <1439801581.1893.29.camel@sipsolutions.net> Message-ID: On Mon, Aug 17, 2015 at 10:53 AM, Sebastian Berg wrote: > On So, 2015-08-16 at 14:04 -0600, Charles R Harris wrote: > > Hi All, > > > > > > While waiting for Christoph to drop the other shoe on 1.10.0b1, I > > thought I'd try again to start a discussion on the 1.11 release. If we > > want to get the next release out in a timely manner there should be > > some advance planning to cover at least the following three items > > > > * Release manager: I can do that again, but would be more than > > happy to let someone else give it a shot. > While I don't want to do the job, I will volunteer to help fix the documentation on how to do this job where needed. > > * Goals to meet: > > * __numpy_ufunc__ > > * masked array and recarray fixups for astropy > > * ? > > * Release date to shoot for, I'd suggest Feb 1, 2016 > > > > Sounds good to me. Feb. 1 probably would mean a feature freeze in the > beginning of January? I guess we could try to push a bit harder to a > stricter 4 months cycle, but with the common and of year stress and > holiday season it sounds reasonable. > That means branching early to mid December, sounds fine to me. > Can't think of any goal to explicitly meet [1], but I guess that is > mostly downstream wishes anyway ;). > Personally, I would like to see Pauli's overlap detection and the > proposed indexing changes by the next release. That should be doable, > but they are not that explicit goals. > Our Austin discussions might have some smaller stuff/wishes that should > have some priority. > > - Sebastian > > > [1] However, I would very much like to see the organizational stuff to > be finalized this year. But I know it is probably quite a bit of work > which can drag on. > not related to a release, but +10 for this goal Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 18 16:07:17 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Aug 2015 14:07:17 -0600 Subject: [Numpy-discussion] Proposal to remove the Bento build. Message-ID: Hi All, . I'm bringing up this topic again on account of the discussion at https://github.com/numpy/numpy/pull/6199. The proposal is to stop (trying) to support the Bento build system for Numpy and remove it. Votes and discussion welcome. Along the same lines, Pauli has suggested removing the single file builds, but Nathaniel has pointed out that it may be the only way to produce static python + numpy builds. If anyone does that or has more information about it, please comment. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Aug 18 19:15:39 2015 From: cournape at gmail.com (David Cournapeau) Date: Wed, 19 Aug 2015 00:15:39 +0100 Subject: [Numpy-discussion] Proposal to remove the Bento build. In-Reply-To: References: Message-ID: If everybody wants to remove bento, we should remove it. Regarding single file builds, why would it help for static builds ? I understand it would make things slightly easier to have one .o per extension, but it does not change the fundamental process as the exported symbols are the same in the end ? 
David On Tue, Aug 18, 2015 at 9:07 PM, Charles R Harris wrote: > Hi All, > . > I'm bringing up this topic again on account of the discussion at > https://github.com/numpy/numpy/pull/6199. The proposal is to stop > (trying) to support the Bento build system for Numpy and remove it. Votes > and discussion welcome. > > Along the same lines, Pauli has suggested removing the single file builds, > but Nathaniel has pointed out that it may be the only way to produce static > python + numpy builds. If anyone does that or has more information about > it, please comment. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Aug 18 20:22:17 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Aug 2015 17:22:17 -0700 Subject: [Numpy-discussion] Proposal to remove the Bento build. In-Reply-To: References: Message-ID: On Tue, Aug 18, 2015 at 4:15 PM, David Cournapeau wrote: > If everybody wants to remove bento, we should remove it. FWIW, I don't really have an opinion either way on bento versus distutils, I just feel that we shouldn't maintain two build systems unless we're actively planning to get rid of one of them, and for several years now we haven't really been learning anything by keeping the bento build working, nor has there been any movement towards switching to bento as the one-and-only build system, or even a clear consensus that this would be a good thing. (Obviously distutils and numpy.distutils are junk, so that's a point in bento's favor, but it isn't *totally* cut and dried -- we know numpy.distutils works and we have to maintain it regardless for backcompat, while bento doesn't seem to have any activity upstream or any other users...). So I'd be totally in favor of adding bento back later if/when such a plan materializes; I just don't think it makes sense to keep continuously investing effort into it just in case such a plan materializes later. > Regarding single file builds, why would it help for static builds ? I > understand it would make things slightly easier to have one .o per > extension, but it does not change the fundamental process as the exported > symbols are the same in the end ? IIUC they aren't: with the multi-file build we control exported symbols using __attribute__((visibility("hidden")) or equivalent, which hides symbols from the shared object export table, but not from other translation units that are statically linked. So if you want to statically link cpython and numpy, you need some other way to let numpy .o files see each others's symbols without exposing them to cpython's .o files, and the single-file build provides one mechanism to do that: make the numpy symbols 'static' and then combine them all into a single translation unit. I would love to be wrong about this though. The single file build is pretty klugey :-). -n -- Nathaniel J. Smith -- http://vorpus.org From cournape at gmail.com Wed Aug 19 08:43:04 2015 From: cournape at gmail.com (David Cournapeau) Date: Wed, 19 Aug 2015 13:43:04 +0100 Subject: [Numpy-discussion] Proposal to remove the Bento build. In-Reply-To: References: Message-ID: On Wed, Aug 19, 2015 at 1:22 AM, Nathaniel Smith wrote: > On Tue, Aug 18, 2015 at 4:15 PM, David Cournapeau > wrote: > > If everybody wants to remove bento, we should remove it. 
> > FWIW, I don't really have an opinion either way on bento versus > distutils, I just feel that we shouldn't maintain two build systems > unless we're actively planning to get rid of one of them, and for > several years now we haven't really been learning anything by keeping > the bento build working, nor has there been any movement towards > switching to bento as the one-and-only build system, or even a clear > consensus that this would be a good thing. (Obviously distutils and > numpy.distutils are junk, so that's a point in bento's favor, but it > isn't *totally* cut and dried -- we know numpy.distutils works and we > have to maintain it regardless for backcompat, while bento doesn't > seem to have any activity upstream or any other users...). > > So I'd be totally in favor of adding bento back later if/when such a > plan materializes; I just don't think it makes sense to keep > continuously investing effort into it just in case such a plan > materializes later. > > > Regarding single file builds, why would it help for static builds ? I > > understand it would make things slightly easier to have one .o per > > extension, but it does not change the fundamental process as the exported > > symbols are the same in the end ? > > IIUC they aren't: with the multi-file build we control exported > symbols using __attribute__((visibility("hidden")) or equivalent, > which hides symbols from the shared object export table, but not from > other translation units that are statically linked. So if you want to > statically link cpython and numpy, you need some other way to let > numpy .o files see each others's symbols without exposing them to > cpython's .o files, It is less a problem than in shared linking because you can detect the conflicts at linking time (instead of loading time). and the single-file build provides one mechanism > to do that: make the numpy symbols 'static' and then combine them all > into a single translation unit. > > I would love to be wrong about this though. The single file build is > pretty klugey :-). > I know, it took me a while to split the files to go out of single file build in the first place :) David > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rainwoodman at gmail.com Wed Aug 19 16:10:51 2015 From: rainwoodman at gmail.com (Feng Yu) Date: Wed, 19 Aug 2015 13:10:51 -0700 Subject: [Numpy-discussion] Fwd: Reverse(DESC)-ordered sorting In-Reply-To: References: Message-ID: Dear list, This is forwarded from issue 6217 https://github.com/numpy/numpy/issues/6217 "What is the way to implement DESC ordering in the sorting routines of numpy?" (I am borrowing DESC/ASC from the SQL notation) For a stable DESC ordering sort, one can not revert the sorted array via argsort()[::-1] . I propose the following API change to argsorts/sort. (haven't thought about lexsort yet) I will use argsort as an example. Currently, argsort supports sorting by keys ('order') and by 'axis'. These two somewhat orthonal interfaces need to be treated differently. 1. by axis. Since there is just one sorting key, a single 'reversed' keyword argument is sufficient: a.argsort(axis=0, kind='merge', reversed=True) Jaime suggested this can be implemented efficiently as a post-processing step. 
(https://github.com/numpy/numpy/issues/6217#issuecomment-132604920) Is there a reference to the algorithm? Because all of the sorting algorithms for 'atomic' dtypes are using _LT functions, a post processing step seems to be the only viable solution, if possible. 2. by fields, ('order' kwarg) A single 'reversed' keyword argument will not work, because some keys are ASC but others are DESC, for example, sorting my LastName ASC, then Salary DESC. a.argsort(kind='merge', order=[('LastName', ('FirstName', 'asc'), ('Salary', 'desc'))]) The parsing rule of order is: - if an item is tuple, the first item is the fieldname, the second item is DESC/ASC - if an item is scalar, the fieldname is the item, the ordering is ASC. This part of the code already goes to VOID_compare, which walks a temporary copy of a.dtype to call the comparison functions. If I understood the purpose of c_metadata (numpy 1.7+) correctly, the ASC/DESC flags, offsets, and comparison functions can all be pre-compiled and passed into VOID_compare via c_metadata of the temporary type-descriptor. By just looking this will actually make VOID_compare faster by avoiding calling several Python C-API functions. negating the return value of cmp seems to be a negligable overhead in such a complex function. 3. If both reversed and order is given, the ASC/DESC fields in 'order' are effectively reversed. Any comments? Best, Yu From jaime.frio at gmail.com Thu Aug 20 00:43:04 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Wed, 19 Aug 2015 21:43:04 -0700 Subject: [Numpy-discussion] Fwd: Reverse(DESC)-ordered sorting In-Reply-To: References: Message-ID: On Wed, Aug 19, 2015 at 1:10 PM, Feng Yu wrote: > Dear list, > > This is forwarded from issue 6217 > https://github.com/numpy/numpy/issues/6217 > > "What is the way to implement DESC ordering in the sorting routines of > numpy?" > > (I am borrowing DESC/ASC from the SQL notation) > > For a stable DESC ordering sort, one can not revert the sorted array via > argsort()[::-1] . > > I propose the following API change to argsorts/sort. (haven't thought > about lexsort yet) I will use argsort as an example. > > Currently, argsort supports sorting by keys ('order') and by 'axis'. > These two somewhat orthonal interfaces need to be treated differently. > > 1. by axis. > > Since there is just one sorting key, a single 'reversed' keyword > argument is sufficient: > > a.argsort(axis=0, kind='merge', reversed=True) > > Jaime suggested this can be implemented efficiently as a > post-processing step. > (https://github.com/numpy/numpy/issues/6217#issuecomment-132604920) Is > there a reference to the algorithm? > My thinking was that, for native types, you can stably reverse a sorted permutation in-place by first reversing item-by-item, then reversing every chunk of repeated entries. Sort of the way you would reverse the words in a sentence in-place: first reverse every letter, then reverse everything bounded by spaces: TURN ME AROUND -> DNUORA EM NRUT -> AROUND EM NRUT -> AROUND ME NRUT -> AROUND ME TURN We could even add a type-specific function to do this for each of the native types in the npy_sort library. As I mentioned in Yu's very nice PR , probably it is best to leave the signature of the function alone, and have something like order='desc' be the trigger for the proposed reversed=True. Jaime > > Because all of the sorting algorithms for 'atomic' dtypes are using > _LT functions, a post processing step seems to be the only viable > solution, if possible. > > 2. 
by fields, ('order' kwarg) > > A single 'reversed' keyword argument will not work, because some keys > are ASC but others are DESC, for example, sorting my LastName ASC, > then Salary DESC. > > a.argsort(kind='merge', order=[('LastName', ('FirstName', 'asc'), > ('Salary', 'desc'))]) > > The parsing rule of order is: > > - if an item is tuple, the first item is the fieldname, the second > item is DESC/ASC > - if an item is scalar, the fieldname is the item, the ordering is ASC. > > This part of the code already goes to VOID_compare, which walks a > temporary copy of a.dtype to call the comparison functions. > > If I understood the purpose of c_metadata (numpy 1.7+) correctly, the > ASC/DESC flags, offsets, and comparison functions can all be > pre-compiled and passed into VOID_compare via c_metadata of the > temporary type-descriptor. > > By just looking this will actually make VOID_compare faster by > avoiding calling several Python C-API functions. negating the return > value of cmp seems to be a negligable overhead in such a complex > function. > 3. If both reversed and order is given, the ASC/DESC fields in 'order' > are effectively reversed. > > Any comments? > > Best, > > Yu > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fabien.maussion at gmail.com Sun Aug 23 13:54:29 2015 From: fabien.maussion at gmail.com (Fabien) Date: Sun, 23 Aug 2015 19:54:29 +0200 Subject: [Numpy-discussion] Numpy helper function for __getitem__? Message-ID: Folks, My search engine was not able to help me on this one, possibly because I don't know exactly *what* I am looking for. I need to override __getitem__ for a class that wrapps a numpy array. I know the dimensions of my array (which can be variable from instance to instance), and I know what I want to do: for one preselected dimension, I need to select another slice than requested by the user, do something with the data, and return the variable. I am looking for a function that helps me to "clean" the input of __getitem__. There are so many possible cases, when the user uses [:] or [..., 1:2] or [0, ..., :] and so forth. But all these cases have an equivalent index array of len(ndimensions) with only valid slice() objects in it. This array would be much easier for me to work with. in pseudo code: def __getitem__(self, item): # clean input item = np.clean_item(item, ndimensions=4) # Ok now item is guaranteed to be of len 4 item[2] = slice() # Continue etc. Is there such a function in numpy? I hope I have been clear enough... Thanks a lot! Fabien From shoyer at gmail.com Sun Aug 23 14:08:02 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 23 Aug 2015 11:08:02 -0700 (PDT) Subject: [Numpy-discussion] Numpy helper function for __getitem__? In-Reply-To: References: Message-ID: <1440353282711.d9fa3274@Nodemailer> I don't think NumPy has a function like this (at least, not exposed to Python), but I wrote one for xray, "expanded_indexer", that you are welcome to borrow: https://github.com/xray/xray/blob/v0.6.0/xray/core/indexing.py#L10 ?Stephan On Sunday, Aug 23, 2015 at 7:54 PM, Fabien , wrote: Folks, My search engine was not able to help me on this one, possibly because I don't know exactly *what* I am looking for. 
I need to override __getitem__ for a class that wrapps a numpy array. I know the dimensions of my array (which can be variable from instance to instance), and I know what I want to do: for one preselected dimension, I need to select another slice than requested by the user, do something with the data, and return the variable. I am looking for a function that helps me to "clean" the input of __getitem__. There are so many possible cases, when the user uses [:] or [..., 1:2] or [0, ..., :] and so forth. But all these cases have an equivalent index array of len(ndimensions) with only valid slice() objects in it. This array would be much easier for me to work with. in pseudo code: def __getitem__(self, item): # clean input item = np.clean_item(item, ndimensions=4) # Ok now item is guaranteed to be of len 4 item[2] = slice() # Continue etc. Is there such a function in numpy? I hope I have been clear enough... Thanks a lot! Fabien _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From fabien.maussion at gmail.com Sun Aug 23 17:24:06 2015 From: fabien.maussion at gmail.com (Fabien) Date: Sun, 23 Aug 2015 23:24:06 +0200 Subject: [Numpy-discussion] Numpy helper function for __getitem__? In-Reply-To: <1440353282711.d9fa3274@Nodemailer> References: <1440353282711.d9fa3274@Nodemailer> Message-ID: On 08/23/2015 08:08 PM, Stephan Hoyer wrote: > I don't think NumPy has a function like this (at least, not exposed to > Python), but I wrote one for xray, "expanded_indexer", that you are > welcome to borrow: Hi Stephan, that's perfect, thanks! Fabien From chris.laumann at gmail.com Sun Aug 23 18:02:07 2015 From: chris.laumann at gmail.com (Chris Laumann) Date: Sun, 23 Aug 2015 18:02:07 -0400 Subject: [Numpy-discussion] py2/py3 pickling Message-ID: <406F1C7C-7B83-4F78-A614-84DEF2857495@gmail.com> Hi all- Is there documentation about the limits and workarounds for py2/py3 pickle/np.save/load compatibility? I haven't found anything except developer bug tracking discussions (eg. #4879 in github numpy). The kinds of errors you get can be really obscure when save/loading complicated objects or pickles containing numpy scalars. It's really unclear to me why the following shouldn't work -- it doesn't have anything apparent to do with string handling and unicode. Run in py2: import pickle import numpy as np a = np.float64(0.99) pickle.dump(a, open('test.pkl', 'wb')) And then in py3: import pickle import numpy as np b = pickle.load(open('test.pkl', 'rb')) And you get: UnicodeDecodeError: 'ascii' codec can't decode byte 0xae in position 0: ordinal not in range(128) If you force encoding='bytes' in the load, it works. Is this explained anywhere? Best, C From sebastian at sipsolutions.net Mon Aug 24 04:23:22 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 24 Aug 2015 10:23:22 +0200 Subject: [Numpy-discussion] Numpy helper function for __getitem__? 
In-Reply-To: <1440353282711.d9fa3274@Nodemailer> References: <1440353282711.d9fa3274@Nodemailer> Message-ID: <1440404602.2051.14.camel@sipsolutions.net> On So, 2015-08-23 at 11:08 -0700, Stephan Hoyer wrote: > I don't think NumPy has a function like this (at least, not exposed to > Python), but I wrote one for xray, "expanded_indexer", that you are > welcome to borrow: > https://github.com/xray/xray/blob/v0.6.0/xray/core/indexing.py#L10 > > Yeah, we have no such functionality. We do have a function which does all of this in C but it is somewhat more complex not exposed in any case. That function seems nice, though on its own not complete? It does not seem to handle `np.newaxis`/`None` or boolean indexing arrays well. One other thing which is not really important, we are deprecating the use of multiple ellipsis. Fabien, just to make sure you are aware. If you are overriding `__getitem__`, you should also implement `__setitem__`. NumPy does some magic if you do not. That will seem to make `__setitem__` work fine, but breaks down if you have advanced indexing involved (or if you return copies, though it spits warnings in that case). - Sebastian > > > ?Stephan > > > > On Sunday, Aug 23, 2015 at 7:54 PM, Fabien > , wrote: > Folks, > > My search engine was not able to help me on this one, possibly > because I > don't know exactly *what* I am looking for. > > I need to override __getitem__ for a class that wrapps a numpy > array. I > know the dimensions of my array (which can be variable from > instance to > instance), and I know what I want to do: for one preselected > dimension, > I need to select another slice than requested by the user, do > something > with the data, and return the variable. > > I am looking for a function that helps me to "clean" the input > of > __getitem__. There are so many possible cases, when the user > uses [:] or > [..., 1:2] or [0, ..., :] and so forth. But all these cases > have an > equivalent index array of len(ndimensions) with only valid > slice() > objects in it. This array would be much easier for me to work > with. > > in pseudo code: > > def __getitem__(self, item): > # clean input > item = np.clean_item(item, ndimensions=4) > # Ok now item is guaranteed to be of len 4 > item[2] = slice() > # Continue > etc. > > Is there such a function in numpy? > > I hope I have been clear enough... Thanks a lot! > > Fabien > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From pav at iki.fi Mon Aug 24 12:25:49 2015 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 24 Aug 2015 19:25:49 +0300 Subject: [Numpy-discussion] py2/py3 pickling In-Reply-To: <406F1C7C-7B83-4F78-A614-84DEF2857495@gmail.com> References: <406F1C7C-7B83-4F78-A614-84DEF2857495@gmail.com> Message-ID: 24.08.2015, 01:02, Chris Laumann kirjoitti: [clip] > Is there documentation about the limits and workarounds for py2/py3 > pickle/np.save/load compatibility? I haven't found anything except > developer bug tracking discussions (eg. #4879 in github numpy). 
Not sure if it's written down somewhere but: - You should consider pickles not portable between Py2/3. - Setting encoding='bytes' or encoding='latin1' should produce correct results for numerical data. However, neither is "safe" because the option also affects other data than numpy arrays that you may have possibly saved. - np.save/np.load are portable, as long as you don't save object arrays or anything that gets converted to one by np.array (these are saved by pickling) From njs at pobox.com Mon Aug 24 14:30:14 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 24 Aug 2015 11:30:14 -0700 Subject: [Numpy-discussion] py2/py3 pickling In-Reply-To: References: <406F1C7C-7B83-4F78-A614-84DEF2857495@gmail.com> Message-ID: On Aug 24, 2015 9:29 AM, "Pauli Virtanen" wrote: > > 24.08.2015, 01:02, Chris Laumann kirjoitti: > [clip] > > Is there documentation about the limits and workarounds for py2/py3 > > pickle/np.save/load compatibility? I haven't found anything except > > developer bug tracking discussions (eg. #4879 in github numpy). > > Not sure if it's written down somewhere but: > > - You should consider pickles not portable between Py2/3. > > - Setting encoding='bytes' or encoding='latin1' should produce correct > results for numerical data. However, neither is "safe" because the > option also affects other data than numpy arrays that you may have > possibly saved. For those wondering what's going on here: if you pickled a str in python 2, then python 3 wants to unpickle it as a str. But in python 2 str was a vector of arbitrary bytes in some assumed encoding, and in python 3 str is a vector of Unicode characters. So it needs to know what encoding to use, which is fine and what you'd expect for the py2->py3 transition. But: when pickling arrays, numpy on py2 used a str to store the raw memory of your array. Trying to run this data through a character decoder then obviously makes a mess of everything. So the fundamental problem is that on py2, there's no way to distinguish between a string of text and a string of bytes -- they're encoded in exactly the same way in the pickle file -- and the python 3 unpickler just has to guess. You can tell it to guess in a way that works for raw bytes -- that's what the encoding= options Pauli mentions above do -- but obviously this will then be incorrect if you have any actual non-latin1 textual strings in your pickle, and you can't get it to handle both correctly at the same time. If you're desperate, it should be possible to get your data out of py2 pickles by loading then with one of the encoding options above, and then going through the resulting object and converting all the actual textual strings back to the correct encoding by hand. No data is actually lost. And of course even this is unnecessary if your file contains only ASCII/latin1. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.laumann at gmail.com Mon Aug 24 18:15:26 2015 From: chris.laumann at gmail.com (Chris Laumann) Date: Mon, 24 Aug 2015 18:15:26 -0400 Subject: [Numpy-discussion] py2/py3 pickling In-Reply-To: References: <406F1C7C-7B83-4F78-A614-84DEF2857495@gmail.com> Message-ID: <9159236E-D8A9-4C7B-87E3-AD802EF57A18@gmail.com> Hi- Would it be possible then (in relatively short order) to create a py2 -> py3 numpy pickle converter? This would run in py2, np.load or unpickle a pickle in the usual way and then repickle and/or save using a pickler that uses an explicit pickle type for encoding the bytes associated with numpy dtypes. 
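To make the "save" option concrete, a rough sketch of what I mean (run under py2, reusing the test.pkl from my first mail; as Pauli notes, np.save/np.load are portable for plain numerical data, so this half needs nothing new):

    # Python 2: read the legacy pickle, then re-save as .npy, which py3's
    # np.load reads back without any encoding guesswork.
    import pickle
    import numpy as np

    with open('test.pkl', 'rb') as f:
        a = pickle.load(f)              # loads fine under py2

    np.save('test.npy', np.asarray(a))  # portable, as long as it is not an object array

The repickle option would instead keep everything in pickle format, with the raw data written as an explicit bytes type.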
The numpy unpickler in py3 would then know what to do. IE. is there a way to make the numpy py2 pickler be explicit about byte strings? Presumably this would cover most use-cases even for complicated pickled objects and could be used transparently within py2 or py3. Best, C > On Aug 24, 2015, at 2:30 PM, Nathaniel Smith wrote: > > On Aug 24, 2015 9:29 AM, "Pauli Virtanen" > wrote: > > > > 24.08.2015, 01:02, Chris Laumann kirjoitti: > > [clip] > > > Is there documentation about the limits and workarounds for py2/py3 > > > pickle/np.save/load compatibility? I haven't found anything except > > > developer bug tracking discussions (eg. #4879 in github numpy). > > > > Not sure if it's written down somewhere but: > > > > - You should consider pickles not portable between Py2/3. > > > > - Setting encoding='bytes' or encoding='latin1' should produce correct > > results for numerical data. However, neither is "safe" because the > > option also affects other data than numpy arrays that you may have > > possibly saved. > > For those wondering what's going on here: if you pickled a str in python 2, then python 3 wants to unpickle it as a str. But in python 2 str was a vector of arbitrary bytes in some assumed encoding, and in python 3 str is a vector of Unicode characters. So it needs to know what encoding to use, which is fine and what you'd expect for the py2->py3 transition. > > But: when pickling arrays, numpy on py2 used a str to store the raw memory of your array. Trying to run this data through a character decoder then obviously makes a mess of everything. So the fundamental problem is that on py2, there's no way to distinguish between a string of text and a string of bytes -- they're encoded in exactly the same way in the pickle file -- and the python 3 unpickler just has to guess. You can tell it to guess in a way that works for raw bytes -- that's what the encoding= options Pauli mentions above do -- but obviously this will then be incorrect if you have any actual non-latin1 textual strings in your pickle, and you can't get it to handle both correctly at the same time. > > If you're desperate, it should be possible to get your data out of py2 pickles by loading then with one of the encoding options above, and then going through the resulting object and converting all the actual textual strings back to the correct encoding by hand. No data is actually lost. And of course even this is unnecessary if your file contains only ASCII/latin1. > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Aug 25 06:03:41 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Aug 2015 03:03:41 -0700 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 Message-ID: Hi all, These are the notes from the NumPy dev meeting held July 7, 2015, at the SciPy conference in Austin, presented here so the list can keep up with what happens, and so you can give feedback. Please do give feedback, none of this is final! (Also, if anyone who was there notices anything I left out or mischaracterized, please speak up -- these are a lot of notes I'm trying to gather together, so I could easily have missed something!) 
Thanks to Jill Cowan and the rest of the SciPy organizers for donating space and organizing logistics for us, and to the Berkeley Institute for Data Science for funding travel for Jaime, Nathaniel, and Sebastian. Attendees ========= Present in the room for all or part: Daniel Allan, Chris Barker, Sebastian Berg, Thomas Caswell, Jeff Reback, Jaime Fern?ndez del R?o, Chuck Harris, Nathaniel Smith, St?fan van der Walt. (Note: I'm pretty sure this list is incomplete) Joining remotely for all or part: Stephan Hoyer, Julian Taylor. Formalizing our governance/decision making ========================================== This was a major focus of discussion. At a high level, the consensus was to steal IPython's governance document ("IPEP 29") and modify it to remove its use of a BDFL as a "backstop" to normal community consensus-based decision, and replace it with a new "backstop" based on Apache-project-style consensus voting amongst the core team. I'll send out a proper draft of this shortly for further discussion. Development roadmap =================== General consensus: Let's assume NumPy is going to remain important indefinitely, and try to make it better, instead of waiting for something better to come along. (This is unlikely to be wasted effort even if something better does come along, and it's hardly a sure thing that that will happen anyway.) Let's focus on evolving numpy as far as we can without major break-the-world changes (no "numpy 2.0", at least in the foreseeable future). And, as a target for that evolution, let's change our focus from numpy as "NumPy is the library that gives you the np.ndarray object (plus some attached infrastructure)", to "NumPy provides the standard framework for working with arrays and array-like objects in Python" This means, creating defined interfaces between array-like objects / ufunc objects / dtype objects, so that it becomes possible for third parties to add their own and mix-and-match. Right now ufuncs are pretty good at this, but if you want a new array class or dtype then in most cases you pretty much have to modify numpy itself. Vision: instead of everyone who wants a new container type having to reimplement all of numpy, Alice can implement an array class using (sparse / distributed / compressed / tiled / gpu / out-of-core / delayed / ...) storage, pass it to code that was written using direct calls to np.* functions, and it just works. (Instead of np.sin being "the way you calculate the sine of an ndarray", it's "the way you calculate the sine of any array-like container object".) Vision: Darryl can implement a new dtype for (categorical data / astronomical dates / integers-with-missing-values / ...) without having to touch the numpy core. Vision: Chandni can then come along and combine them by doing a = alice_array([...], dtype=darryl_dtype) and it just works. Vision: no-one is tempted to subclass ndarray, because anything you can do with an ndarray subclass you can also easily do by defining your own new class that implements the "array protocol". Supporting third-party array types ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Sub-goals: - Get __numpy_ufunc__ done, which will cover a good chunk of numpy's API right there. - Go through the rest of the stuff in numpy, and figure out some story for how to let it handle third-party array classes: - ufunc ALL the things: Some things can be converted directly into (g)ufuncs and then use __numpy_ufunc__ (e.g., np.std); some things could be converted into (g)ufuncs if we extended the (g)ufunc interface a bit (e.g. 
np.sort, np.matmul). - Some things probably need their own __numpy_ufunc__-like extensions (__numpy_concatenate__?) - Provide tools to make it easier to implement the more complicated parts of an array object (e.g. the bazillion different methods, many of which are ufuncs in disguise, or indexing) - Longer-run interesting research project: __numpy_ufunc__ requires that one or the other object have explicit knowledge of how to handle the other, so to handle binary ufuncs with N array types you need something like N**2 __numpy_ufunc__ code paths. As an alternative, if there were some interface that an object could export that provided the operations nditer needs to efficiently iterate over (chunks of) it, then you would only need N implementations of this interface to handle all N**2 operations. This would solve a lot of problems for projects like: - blosc - dask - distarray - numpy.ma - pandas - scipy.sparse - xray Supporting third-party dtypes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We already have something like a C level "dtype protocol". Conceptually, the way you define a new dtype is by defining a new class whose instances have data attributes defining the parameters of the dtype (what fields are in *this* record dtype, how many characters are in *this* string dtype, what units are used for *this* datetime64, etc.), and you define a bunch of methods to do things like convert an object from a Python object to your dtype or vice-versa, to copy an array of your dtype from one place to another, to cast to and from your new dtype, etc. This part is great. The problem is, in the current implementation, we don't actually use the Python object system to define these classes / attributes / methods. Instead, all possible dtypes are jammed into a single Python-level class, whose struct has fields for the union of all possible dtype's attributes, and instead of Python-style method slots there's just a big table of function pointers attached to each object. So the main proposal is that we keep the basic design, but switch it so that the float64 dtype, the int64 dtype, etc. actually literally are subclasses of np.dtype, each implementing their own fields and Python-style methods. Some of the pieces involved in doing this: - The current dtype methods should be cleaned up -- e.g. 'dot' and 'less_than' are both dtype methods, when conceptually they're much more like ufuncs. - The ufunc inner-loop interface currently does not get a reference to the dtype object, so they can't see its attributes and this is a big obstacle to many interesting dtypes (e.g., it's hard to implement np.equal for categoricals if you don't know what categories each has). So we need to add new arguments to the core ufunc loop signature. (Fortunately this can be done in a backwards-compatible way.) - We need to figure out what exactly the dtype methods should be, and add them to the dtype class (possibly with backwards compatibility shims for anyone who is accessing PyArray_ArrFuncs directly). - Casting will be possibly the trickiest thing to work out, though the basic idea of using dunder-dispatch-like __cast__ and __rcast__ methods seems workable. (Encouragingly, this is also exactly what dynd also does, though unfortunately dynd does not yet support user-defined dtypes even to the extent that numpy does, so there isn't much else we can steal from them.) - We may also want to rethink the casting rules while we're at it, since they have some very weird corners right now (e.g. 
Supporting third-party dtypes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We already have something like a C level "dtype protocol". Conceptually, the way you define a new dtype is by defining a new class whose instances have data attributes defining the parameters of the dtype (what fields are in *this* record dtype, how many characters are in *this* string dtype, what units are used for *this* datetime64, etc.), and you define a bunch of methods to do things like convert an object from a Python object to your dtype or vice-versa, to copy an array of your dtype from one place to another, to cast to and from your new dtype, etc. This part is great. The problem is, in the current implementation, we don't actually use the Python object system to define these classes / attributes / methods. Instead, all possible dtypes are jammed into a single Python-level class, whose struct has fields for the union of all possible dtype's attributes, and instead of Python-style method slots there's just a big table of function pointers attached to each object. So the main proposal is that we keep the basic design, but switch it so that the float64 dtype, the int64 dtype, etc. actually literally are subclasses of np.dtype, each implementing their own fields and Python-style methods. Some of the pieces involved in doing this: - The current dtype methods should be cleaned up -- e.g. 'dot' and 'less_than' are both dtype methods, when conceptually they're much more like ufuncs. - The ufunc inner-loop interface currently does not get a reference to the dtype object, so the inner loops can't see its attributes, and this is a big obstacle to many interesting dtypes (e.g., it's hard to implement np.equal for categoricals if you don't know what categories each has). So we need to add new arguments to the core ufunc loop signature. (Fortunately this can be done in a backwards-compatible way.) - We need to figure out what exactly the dtype methods should be, and add them to the dtype class (possibly with backwards compatibility shims for anyone who is accessing PyArray_ArrFuncs directly). - Casting will be possibly the trickiest thing to work out, though the basic idea of using dunder-dispatch-like __cast__ and __rcast__ methods seems workable. (Encouragingly, this is also exactly what dynd does, though unfortunately dynd does not yet support user-defined dtypes even to the extent that numpy does, so there isn't much else we can steal from them.) - We may also want to rethink the casting rules while we're at it, since they have some very weird corners right now (e.g. see [https://github.com/numpy/numpy/issues/6240]) - We need to migrate the current dtypes over to the new system, which can be done in stages: - First stick them all in a single "legacy dtype" class whose methods just dispatch to the PyArray_ArrFuncs per-object "method table" - Then move each of them into their own classes - We should provide a Python-level wrapper for the protocol, so that you can call dtype methods from Python - And vice-versa, it should be possible to subclass dtype at the Python level - etc. Fortunately, AFAICT pretty much all of this can be done while maintaining backwards compatibility (though we may want to break some obscure cases to avoid expending *too* much effort with weird backcompat contortions that will only help a vanishingly small proportion of the userbase), and a lot of the above changes can be done as semi-independent mini-projects, so there's no need for some branch to go off and spend a year rewriting the world. Obviously there are still a lot of details to work out, though. But overall, there was widespread agreement that this is one of the #1 pain points for our users (e.g. it's the single main request from pandas), and fixing it is very high priority. Some features that would become straightforward to implement (e.g. even in third-party libraries) if this were fixed: - missing value support - physical unit tracking (meters / seconds -> array of velocity; meters + seconds -> error) - better and more diverse datetime representations (e.g. datetimes with attached timezones, or using funky geophysical or astronomical calendars) - categorical data - variable length strings - strings-with-encodings (e.g. latin1) - forward mode automatic differentiation (write a function that computes f(x) where x is an array of float64; pass that function an array with a special dtype and get out both f(x) and f'(x)) - probably others I'm forgetting right now
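[Editor's note: as a purely hypothetical illustration of what "dtypes as real Python classes" could feel like for one of the features above (categorical data): none of these hooks, method names, or base classes exist in numpy today; the sketch only shows the shape of the proposed protocol, written as a plain Python class so that it actually runs.]

    import numpy as np

    class CategoricalDtype(object):  # in the proposal this would subclass np.dtype
        def __init__(self, categories):
            # Per-instance parameters, like the fields of a record dtype
            # or the unit of a datetime64.
            self.categories = list(categories)
            self.itemsize = 8            # stored as an int64 code per element

        # Python object <-> array element (hypothetical method names)
        def to_element(self, obj):
            return self.categories.index(obj)

        def from_element(self, code):
            return self.categories[code]

        # Hypothetical casting hook, in the spirit of the __cast__/__rcast__
        # idea discussed above.
        def __cast__(self, other_dtype, codes):
            if other_dtype == np.dtype(object):
                return np.array([self.from_element(c) for c in codes],
                                dtype=object)
            return NotImplemented

    colors = CategoricalDtype(["red", "green", "blue"])
    colors.to_element("green")    # -> 1
    colors.from_element(2)        # -> "blue"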
I should also note that there was one substantial objection to this plan, from Travis Oliphant (in discussions later in the conference). I'm not confident I understand his objections well enough to reproduce them here, though -- perhaps he'll elaborate. Money ===== There was an extensive discussion on the topic of: "if we had money, what would we do with it?" This is partially motivated by the realization that there are a number of sources that we could probably get money from, if we had a good story for what we wanted to do, so it's not just an idle question. Points of general agreement: - Doing the in-person meeting was a good thing. We should plan to do that again, at least once a year. So one thing to spend money on is travel subsidies to make sure that happens and is productive. - While it's tempting to imagine hiring junior people for the more frustrating/boring work like maintaining buildbots, release infrastructure, updating docs, etc., this seems difficult to do realistically with our current resources -- how do we hire for this, who would manage them, etc.? - On the other hand, the general feeling was that if we found the money to hire a few more senior people who could take care of themselves more, then that would be good and we could realistically absorb that extra work without totally unbalancing the project. - A major open question is how we would recruit someone for a position like this, since apparently all the obvious candidates who are already active on the NumPy team already have other things going on. [For calibration on how hard this can be: NYU has apparently had an open position for a year with the job description of "come work at NYU full-time with a private-industry-competitive-salary on whatever your personal open-source scientific project is" (!) and still is having an extremely difficult time filling it: [http://cds.nyu.edu/research-engineer/]] - General consensus, though, was that there isn't much to be done about this, except try it and see. - (By the way, if you're someone who's reading this and potentially interested in like a postdoc or better working on numpy, then let's talk...) More specific changes to numpy that had general consensus, but don't really fit into a high-level roadmap ========================================================================================================= - Resolved: we should merge multiarray.so and umath.so into a single extension module, so that they can share utility code without the current awkward contortions. - Resolved: we should start hiding new fields in the ufunc and dtype structs as soon as possible going forward. (I.e. they would not be present in the version of the structs that are exposed through the C API, but internally we would use a more detailed struct.) - Mayyyyyybe we should even go ahead and hide the subset of the existing fields that are really internal details that no-one should be using. If we did this without changing anything else then it would preserve ABI (the fields would still be where existing compiled extensions expect them to be, if any such extensions exist) while breaking API (trying to compile such extensions would give a clear error), so would be a smoother ramp if we think we need to eventually break those fields for real. (As discussed above, there are a bunch of fields in the dtype base class that only make sense for specific dtype subclasses, e.g. only record dtypes need a list of field names, but right now all dtypes have one anyway. So it would be nice to remove these from the base class entirely, but that is potentially ABI-breaking.) - Resolved: np.array should never return an object array unless explicitly requested (e.g. with dtype=object); it just causes too many surprising problems. - First step: add a deprecation warning - Eventually: make it an error. - The matrix class - Resolved: We won't add warnings yet, but we will prominently document that it is deprecated and should be avoided wherever possible. - Stéfan van der Walt volunteers to do this. - We'd all like to deprecate it properly, but the feeling was that the precondition for this is for scipy.sparse to provide sparse "arrays" that don't return np.matrix objects on ordinary operations. Until that happens we can't reasonably tell people that using np.matrix is a bug. - Resolved: we should add a similar prominent note to the "subclassing ndarray" documentation, warning people that this is painful and barely works and please don't do it if you have any alternatives. - Resolved: we want more, smaller releases -- every 6 months at least, aiming to go even faster (every 4 months?) - On the question of using Cython inside numpy core: - Everyone agrees that there are places where this would be an improvement (e.g., Python<->C interfaces, and places "when you want to do computer science", e.g.
complicated algorithmic stuff like graph traversals) - Chuck wanted it to be clear though that he doesn't think it would be a good goal to try and rewrite all of numpy in Cython -- there also exist places where Cython ends up being "an uglier version of C". No-one disagreed. - Our text reader is apparently not very functional on Python 3, and generally slow and hard to work with. - Resolved: We should extract Pandas's awesome text reader/parser and convert it into its own package, that could then become a new backend for both pandas and numpy.loadtxt. - Jeff thinks this is a great idea - Thomas Caswell volunteers to do the extraction. - We should work on improving our tools for evolving the ABI, so that we will eventually be less constrained by decisions made decades ago. - One idea that had a lot of support was to switch from our current append-only C-API to a "sliding window" API based on explicit versions. So a downstream package might say #define NUMPY_API_VERSION 4 and they'd get the functions and behaviour provided in "version 4" of the numpy C api. If they wanted to get access to new stuff that was added in version 5, then they'd need to switch that #define, and at the same time clean up any usage of stuff that was removed or changed in version 5. And to provide a smooth migration path, one version of numpy would support multiple versions at once, gradually deprecating and dropping old versions. - If anyone wants to help bring pip up to scratch WRT tracking ABI dependencies (e.g., 'pip install numpy==' -> triggers rebuild of scipy against the new ABI), then that would be an extremely useful thing. Policies that should be documented ================================== ...together with some notes about what the contents of the document should be: How we manage bugs in the bug tracker. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Github "milestones" should *only* be assigned to release-blocker bugs (which mostly means "regression from the last release"). In particular, if you're tempted to push a bug forward to the next release... then it's clearly not a blocker, so don't set it to the next release's milestone, just remove the milestone entirely. (Obvious exception to this: deprecation followup bugs where we decide that we want to keep the deprecation around a bit longer are a case where a bug actually does switch from being a blocker to release 1.x to being a blocker for release 1.(x+1).) - Don't hesitate to close an issue if there's no way forward -- e.g. a PR where the author has disappeared. Just post a link to this policy and close, with a polite note that we need to keep our tracker useful as a todo list, but they're welcome to re-open if things change. Deprecations and breakage policy: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - How long do we need to keep DeprecationWarnings around before we break things? This is tricky because on the one hand an aggressive (short) deprecation period lets us deliver new features and important cleanups more quickly, but on the other hand a too-aggressive deprecation period is difficult for our more conservative downstream users. - Idea that had the most support: pick a somewhat-aggressive warning period as our default, and make a rule that if someone asks for an extension during the beta cycle for the release that removes it, then we put it back for another release or two worth of grace period. (While also possibly upgrading the warning to be more visible during the grace period.) 
This gives us deprecation periods that are more adaptive on a case-by-case basis. - Lament: it would be really nice if we could get more people to test our beta releases, because in practice right now 1.x.0 ends up being where we actually discover all the bugs, and 1.x.1 is where it actually becomes usable. Which sucks, and makes it difficult to have a solid policy about what counts as a regression, etc. Is there anything we can do about this? - ABI breakage: we distinguish between an ABI break that breaks everything (e.g., "import scipy" segfaults), versus an ABI break that breaks an occasional rare case (e.g., only apps that poke around in some obscure corner of some struct are affected). - The "break-the-world" type remains off-limits for now: the pain is still too large (conda helps, but there are lots of people who don't use conda!), and there aren't really any compelling improvements that this would enable anyway. - For the "break-0.1%-of-users" type, it is *not* ruled out by fiat, though we remain conservative: we should treat it like other API breaks in principle, and do a careful case-by-case analysis of the details of the situation, taking into account what kind of code would be broken, how common these cases are, how important the benefits are, whether there are any specific mitigation strategies we can use, etc. -- with this process of course taking into account that a segfault is nastier than a Python exception. Other points that were discussed ================================ - There was inconclusive discussion of what we should do with dot() in the places where it disagrees with the PEP 465 matmul semantics (specifically this is when both arguments have ndim >= 3, or one argument has ndim == 0). - The concern is that the current behavior is not very useful, and as far as we can tell no-one is using it; but, as people get used to the more-useful PEP 465 behavior, they will increasingly try to use it on the assumption that np.dot will work the same way, and this will create pain for lots of people. So Nathaniel argued that we should start at least issuing a visible warning when people invoke the corner-case behavior. - But OTOH, np.dot is such a core piece of infrastructure, and there's such a large landscape of code out there using numpy that we can't see, that others were reasonably wary of making any change. - For now: document prominently, but no change in behavior.
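[Editor's note: for readers who have not hit this corner, a small example of where the two semantics diverge for stacked (ndim >= 3) operands, using np.matmul as it appears in the 1.10 betas; the shapes in the comments are the whole point of the example.]

    import numpy as np

    a = np.ones((2, 3, 4))
    b = np.ones((2, 4, 5))

    np.dot(a, b).shape      # (2, 3, 2, 5): dot pairs every "matrix" in a
                            # with every "matrix" in b (an outer product
                            # over the stacked axes)
    np.matmul(a, b).shape   # (2, 3, 5): PEP 465 broadcasts the leading
                            # axis and multiplies the stacks elementwise

    np.dot(2.0, a).shape    # (2, 3, 4): dot silently accepts a scalar...
    # np.matmul(2.0, a)     # ...while matmul raises an error, since
                            # PEP 465 deliberately rejects scalar operands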
Links to raw notes ================== Main page: [https://github.com/numpy/numpy/wiki/SciPy-2015-developer-meeting] Notes from the meeting proper: [https://docs.google.com/document/d/1IJcYdsHtk8MVAM4AZqFDBSf_nVG-mrB4Tv2bh9u1g4Y/edit?usp=sharing] Slides from the followup BoF: [https://gist.github.com/njsmith/eb42762054c88e810786/raw/b74f978ce10a972831c582485c80fb5b8e68183b/future-of-numpy-bof.odp] Notes from the followup BoF: [https://docs.google.com/document/d/11AuTPms5dIPo04JaBOWEoebXfk-tUzEZ-CvFnLIt33w/edit] -n -- Nathaniel J. Smith -- http://vorpus.org From fabien.maussion at gmail.com Tue Aug 25 11:41:36 2015 From: fabien.maussion at gmail.com (Fabien) Date: Tue, 25 Aug 2015 17:41:36 +0200 Subject: [Numpy-discussion] Numpy helper function for __getitem__? In-Reply-To: <1440404602.2051.14.camel@sipsolutions.net> References: <1440353282711.d9fa3274@Nodemailer> <1440404602.2051.14.camel@sipsolutions.net> Message-ID: On 08/24/2015 10:23 AM, Sebastian Berg wrote: > Fabien, just to make sure you are aware. If you are overriding > `__getitem__`, you should also implement `__setitem__`. NumPy does some magic if you do not. That will seem to make `__setitem__` work fine, but breaks down if you have advanced indexing involved (or if you return copies, though it spits warnings in that case). Hi Sebastian, thanks for the info. I am writing a duck NetCDF4 Variable object, and therefore I am not trying to override Numpy arrays. I think that Stephan's function for xray is very useful. A possible improvement (probably at a certain performance cost) would be to be able to provide a shape instead of a number of dimensions. The output would then be slices with valid start and ends. Current behavior: In[9]: expanded_indexer(slice(None), 2) Out[9]: (slice(None, None, None), slice(None, None, None)) With shape: In[9]: expanded_indexer(slice(None), (3, 4)) Out[9]: (slice(0, 3, 1), slice(0, 4, 1)) But if nobody needed something like this before me, I think that I might have a design problem in my code (still quite new to python). Cheers and thanks, Fabien
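[Editor's note: a minimal sketch of the kind of helper Fabien describes -- the name and behaviour are illustrative only (this is not an existing xray or numpy function). It leans on slice.indices() to clip open slices to a concrete shape; Ellipsis and np.newaxis handling are left out for brevity.]

    def expanded_indexer_with_shape(key, shape):
        # Expand `key` to one entry per dimension of `shape`, turning open
        # slices into slices with explicit, in-bounds start/stop/step.
        if not isinstance(key, tuple):
            key = (key,)
        # Implicit trailing dimensions get full slices.
        key = key + (slice(None),) * (len(shape) - len(key))
        out = []
        for k, n in zip(key, shape):
            if isinstance(k, slice):
                out.append(slice(*k.indices(n)))  # clip to axis length n
            else:
                out.append(k)
        return tuple(out)

    expanded_indexer_with_shape(slice(None), (3, 4))
    # -> (slice(0, 3, 1), slice(0, 4, 1))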
From pav at iki.fi Tue Aug 25 12:12:30 2015 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 25 Aug 2015 19:12:30 +0300 Subject: [Numpy-discussion] py2/py3 pickling In-Reply-To: <9159236E-D8A9-4C7B-87E3-AD802EF57A18@gmail.com> References: <406F1C7C-7B83-4F78-A614-84DEF2857495@gmail.com> <9159236E-D8A9-4C7B-87E3-AD802EF57A18@gmail.com> Message-ID: 25.08.2015, 01:15, Chris Laumann kirjoitti: > Would it be possible then (in relatively short order) to create > a py2 -> py3 numpy pickle converter? You probably need to modify the pickle stream directly, replacing *STRING opcodes with *BYTES opcodes when it comes to objects that are needed for constructing Numpy arrays. https://hg.python.org/cpython/file/tip/Modules/_pickle.c#l82 Or, use a custom pickler class that emits the new opcodes when it comes to data that is part of Numpy arrays, as Python 2 pickler doesn't know how to write bytes opcodes. It's probably doable, although likely annoying to implement. The pickles created won't be loadable on Py2, only Py3. You'd need to find a volunteer who wants to work on this or just do it yourself, though. From charlesr.harris at gmail.com Tue Aug 25 12:26:02 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Aug 2015 10:26:02 -0600 Subject: [Numpy-discussion] 1.10.0rc1 Message-ID: Hi All, The silence after the 1.10 beta has been eerie. Consequently, I'm thinking of making a first release candidate this weekend. If you haven't yet tested the beta, please do so. It would be good to discover as many problems as we can before the first release. Chuck From solipsis at pitrou.net Tue Aug 25 12:26:31 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 25 Aug 2015 18:26:31 +0200 Subject: [Numpy-discussion] py2/py3 pickling References: <406F1C7C-7B83-4F78-A614-84DEF2857495@gmail.com> <9159236E-D8A9-4C7B-87E3-AD802EF57A18@gmail.com> Message-ID: <20150825182631.77815a92@fsol> On Tue, 25 Aug 2015 19:12:30 +0300 Pauli Virtanen wrote: > 25.08.2015, 01:15, Chris Laumann kirjoitti: > > Would it be possible then (in relatively short order) to create > > a py2 -> py3 numpy pickle converter? > > You probably need to modify the pickle stream directly, replacing > *STRING opcodes with *BYTES opcodes when it comes to objects that are > needed for constructing Numpy arrays. > > https://hg.python.org/cpython/file/tip/Modules/_pickle.c#l82 > > Or, use a custom pickler class that emits the new opcodes when it comes > to data that is part of Numpy arrays, as Python 2 pickler doesn't know > how to write bytes opcodes. > > It's probably doable, although likely annoying to implement. The pickles > created won't be loadable on Py2, only Py3. One could take a look at how the built-in bytearray type achieves pickle compatibility between 2.x and 3.x. The solution is to serialize the binary data as a latin-1 decoded unicode string, and to return the right reconstructor from __reduce__. The solution is less space-efficient than pure bytes pickling, since the unicode string is serialized as utf-8 (so bytes > 0x80 are multibyte-encoded). There's also some CPU overhead, due to the successive decoding and encoding steps. You can take a look at the bytearray_reduce() function in Objects/bytearrayobject.c, both for 2.x and 3.x. (also note how the 3.x version does it only for protocols < 3, to achieve better efficiency on newer protocol versions) Another possibility would be a custom Unpickler class for 3.x, dealing specifically with 2.x-produced Numpy array pickles. That way the pickles themselves could be cross-version. Regards Antoine.
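[Editor's note: for the most common direction (arrays pickled on Python 2, loaded on Python 3), a minimal sketch of the workaround implied by the discussion above. The helper names and file paths are invented; pickle.load's encoding argument is standard Python 3, and 'latin1' is what keeps the raw array bytes intact. Any genuine py2 text strings in the pickle will come back latin-1 decoded, so non-array payloads deserve a check.]

    import pickle

    def load_py2_numpy_pickle(path):
        # On Python 3, decode py2 str objects as latin-1 so that the byte
        # strings backing numpy arrays round-trip unchanged.
        with open(path, 'rb') as f:
            return pickle.load(f, encoding='latin1')

    def convert_to_py3_pickle(src, dst):
        # One-shot converter: read a py2-written pickle, re-dump it so that
        # future loads on Python 3 need no special handling.
        obj = load_py2_numpy_pickle(src)
        with open(dst, 'wb') as f:
            pickle.dump(obj, f, protocol=2)

    # arrays = load_py2_numpy_pickle('data_py2.pkl')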
From charlesr.harris at gmail.com Tue Aug 25 12:43:19 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Aug 2015 10:43:19 -0600 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 4:03 AM, Nathaniel Smith wrote: > Hi all, > > These are the notes from the NumPy dev meeting held July 7, 2015, at > the SciPy conference in Austin, presented here so the list can keep up > with what happens, and so you can give feedback. Please do give > feedback, none of this is final! [full quote of Nathaniel's notes snipped -- it duplicates the post above] Hi Nathaniel. Thanks for putting this together. Chuck
URL: From nathan12343 at gmail.com Tue Aug 25 12:52:42 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Tue, 25 Aug 2015 11:52:42 -0500 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 5:03 AM, Nathaniel Smith wrote: > Hi all, > > These are the notes from the NumPy dev meeting held July 7, 2015, at > the SciPy conference in Austin, presented here so the list can keep up > with what happens, and so you can give feedback. Please do give > feedback, none of this is final! > > (Also, if anyone who was there notices anything I left out or > mischaracterized, please speak up -- these are a lot of notes I'm > trying to gather together, so I could easily have missed something!) > > Thanks to Jill Cowan and the rest of the SciPy organizers for donating > space and organizing logistics for us, and to the Berkeley Institute > for Data Science for funding travel for Jaime, Nathaniel, and > Sebastian. > > > Attendees > ========= > > Present in the room for all or part: Daniel Allan, Chris Barker, > Sebastian Berg, Thomas Caswell, Jeff Reback, Jaime Fern?ndez del > R?o, Chuck Harris, Nathaniel Smith, St?fan van der Walt. (Note: I'm > pretty sure this list is incomplete) > > Joining remotely for all or part: Stephan Hoyer, Julian Taylor. > > > Formalizing our governance/decision making > ========================================== > > This was a major focus of discussion. At a high level, the consensus > was to steal IPython's governance document ("IPEP 29") and modify it > to remove its use of a BDFL as a "backstop" to normal community > consensus-based decision, and replace it with a new "backstop" based > on Apache-project-style consensus voting amongst the core team. > > I'll send out a proper draft of this shortly for further discussion. > > > Development roadmap > =================== > > General consensus: > > Let's assume NumPy is going to remain important indefinitely, and > try to make it better, instead of waiting for something better to > come along. (This is unlikely to be wasted effort even if something > better does come along, and it's hardly a sure thing that that will > happen anyway.) > > Let's focus on evolving numpy as far as we can without major > break-the-world changes (no "numpy 2.0", at least in the foreseeable > future). > > And, as a target for that evolution, let's change our focus from > numpy as "NumPy is the library that gives you the np.ndarray object > (plus some attached infrastructure)", to "NumPy provides the > standard framework for working with arrays and array-like objects in > Python" > > This means, creating defined interfaces between array-like objects / > ufunc objects / dtype objects, so that it becomes possible for third > parties to add their own and mix-and-match. Right now ufuncs are > pretty good at this, but if you want a new array class or dtype then > in most cases you pretty much have to modify numpy itself. > > Vision: instead of everyone who wants a new container type having to > reimplement all of numpy, Alice can implement an array class using > (sparse / distributed / compressed / tiled / gpu / out-of-core / > delayed / ...) storage, pass it to code that was written using > direct calls to np.* functions, and it just works. (Instead of > np.sin being "the way you calculate the sine of an ndarray", it's > "the way you calculate the sine of any array-like container > object".) 
> > Vision: Darryl can implement a new dtype for (categorical data / > astronomical dates / integers-with-missing-values / ...) without > having to touch the numpy core. > > Vision: Chandni can then come along and combine them by doing > > a = alice_array([...], dtype=darryl_dtype) > > and it just works. > > Vision: no-one is tempted to subclass ndarray, because anything you > can do with an ndarray subclass you can also easily do by defining > your own new class that implements the "array protocol". > > > Supporting third-party array types > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Sub-goals: > - Get __numpy_ufunc__ done, which will cover a good chunk of numpy's > API right there. > - Go through the rest of the stuff in numpy, and figure out some > story for how to let it handle third-party array classes: > - ufunc ALL the things: Some things can be converted directly into > (g)ufuncs and then use __numpy_ufunc__ (e.g., np.std); some > things could be converted into (g)ufuncs if we extended the > (g)ufunc interface a bit (e.g. np.sort, np.matmul). > - Some things probably need their own __numpy_ufunc__-like > extensions (__numpy_concatenate__?) > - Provide tools to make it easier to implement the more complicated > parts of an array object (e.g. the bazillion different methods, > many of which are ufuncs in disguise, or indexing) > - Longer-run interesting research project: __numpy_ufunc__ requires > that one or the other object have explicit knowledge of how to > handle the other, so to handle binary ufuncs with N array types > you need something like N**2 __numpy_ufunc__ code paths. As an > alternative, if there were some interface that an object could > export that provided the operations nditer needs to efficiently > iterate over (chunks of) it, then you would only need N > implementations of this interface to handle all N**2 operations. > > This would solve a lot of problems for projects like: > - blosc > - dask > - distarray > - numpy.ma > - pandas > - scipy.sparse > - xray > > > Supporting third-party dtypes > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > We already have something like a C level "dtype > protocol". Conceptually, the way you define a new dtype is by > defining a new class whose instances have data attributes defining > the parameters of the dtype (what fields are in *this* record dtype, > how many characters are in *this* string dtype, what units are used > for *this* datetime64, etc.), and you define a bunch of methods to > do things like convert an object from a Python object to your dtype > or vice-versa, to copy an array of your dtype from one place to > another, to cast to and from your new dtype, etc. This part is > great. > > The problem is, in the current implementation, we don't actually use > the Python object system to define these classes / attributes / > methods. Instead, all possible dtypes are jammed into a single > Python-level class, whose struct has fields for the union of all > possible dtype's attributes, and instead of Python-style method > slots there's just a big table of function pointers attached to each > object. > > So the main proposal is that we keep the basic design, but switch it > so that the float64 dtype, the int64 dtype, etc. actually literally > are subclasses of np.dtype, each implementing their own fields and > Python-style methods. > > Some of the pieces involved in doing this: > > - The current dtype methods should be cleaned up -- e.g. 'dot' and > 'less_than' are both dtype methods, when conceptually they're much > more like ufuncs. 
> > - The ufunc inner-loop interface currently does not get a reference > to the dtype object, so they can't see its attributes and this is > a big obstacle to many interesting dtypes (e.g., it's hard to > implement np.equal for categoricals if you don't know what > categories each has). So we need to add new arguments to the core > ufunc loop signature. (Fortunately this can be done in a > backwards-compatible way.) > > - We need to figure out what exactly the dtype methods should be, > and add them to the dtype class (possibly with backwards > compatibility shims for anyone who is accessing PyArray_ArrFuncs > directly). > > - Casting will be possibly the trickiest thing to work out, though > the basic idea of using dunder-dispatch-like __cast__ and > __rcast__ methods seems workable. (Encouragingly, this is also > exactly what dynd also does, though unfortunately dynd does not > yet support user-defined dtypes even to the extent that numpy > does, so there isn't much else we can steal from them.) > - We may also want to rethink the casting rules while we're at it, > since they have some very weird corners right now (e.g. see > [https://github.com/numpy/numpy/issues/6240]) > > - We need to migrate the current dtypes over to the new system, > which can be done in stages: > > - First stick them all in a single "legacy dtype" class whose > methods just dispatch to the PyArray_ArrFuncs per-object "method > table" > > - Then move each of them into their own classes > > - We should provide a Python-level wrapper for the protocol, so that > you can call dtype methods from Python > > - And vice-versa, it should be possible to subclass dtype at the > Python level > > - etc. > > Fortunately, AFAICT pretty much all of this can be done while > maintaining backwards compatibility (though we may want to break > some obscure cases to avoid expending *too* much effort with weird > backcompat contortions that will only help a vanishingly small > proportion of the userbase), and a lot of the above changes can be > done as semi-independent mini-projects, so there's no need for some > branch to go off and spend a year rewriting the world. > > Obviously there are still a lot of details to work out, though. But > overall, there was widespread agreement that this is one of the #1 > pain points for our users (e.g. it's the single main request from > pandas), and fixing it is very high priority. > > Some features that would become straightforward to implement > (e.g. even in third-party libraries) if this were fixed: > - missing value support > - physical unit tracking (meters / seconds -> array of velocity; > meters + seconds -> error) > - better and more diverse datetime representations (e.g. datetimes > with attached timezones, or using funky geophysical or > astronomical calendars) > - categorical data > - variable length strings > - strings-with-encodings (e.g. latin1) > - forward mode automatic differentiation (write a function that > computes f(x) where x is an array of float64; pass that function > an array with a special dtype and get out both f(x) and f'(x)) > - probably others I'm forgetting right now > > I should also note that there was one substantial objection to this > plan, from Travis Oliphant (in discussions later in the > conference). I'm not confident I understand his objections well > enough to reproduce them here, though -- perhaps he'll elaborate. > > > Money > ===== > > There was an extensive discussion on the topic of: "if we had money, > what would we do with it?" 
> > This is partially motivated by the realization that there are a > number of sources that we could probably get money from, if we had a > good story for what we wanted to do, so it's not just an idle > question. > > Points of general agreement: > > - Doing the in-person meeting was a good thing. We should plan do > that again, at least once a year. So one thing to spend money on > is travel subsidies to make sure that happens and is productive. > > - While it's tempting to imagine hiring junior people for the more > frustrating/boring work like maintaining buildbots, release > infrastructure, updating docs, etc., this seems difficult to do > realistically with our current resources -- how do we hire for > this, who would manage them, etc.? > > - On the other hand, the general feeling was that if we found the > money to hire a few more senior people who could take care of > themselves more, then that would be good and we could > realistically absorb that extra work without totally unbalancing > the project. > > - A major open question is how we would recruit someone for a > position like this, since apparently all the obvious candidates > who are already active on the NumPy team already have other > things going on. [For calibration on how hard this can be: NYU > has apparently had an open position for a year with the job > description of "come work at NYU full-time with a > private-industry-competitive-salary on whatever your personal > open-source scientific project is" (!) and still is having an > extremely difficult time filling it: > [http://cds.nyu.edu/research-engineer/]] > > - General consensus though was that there isn't much to be done > about this though, except try it and see. > > - (By the way, if you're someone who's reading this and > potentially interested in like a postdoc or better working on > numpy, then let's talk...) > > > More specific changes to numpy that had general consensus, but don't > really fit into a high-level roadmap > > ========================================================================================================= > > - Resolved: we should merge multiarray.so and umath.so into a single > extension module, so that they can share utility code without the > current awkward contortions. > > - Resolved: we should start hiding new fields in the ufunc and dtype > structs as soon as possible going forward. (I.e. they would not be > present in the version of the structs that are exposed through the > C API, but internally we would use a more detailed struct.) > - Mayyyyyybe we should even go ahead and hide the subset of the > existing fields that are really internal details that no-one > should be using. If we did this without changing anything else > then it would preserve ABI (the fields would still be where > existing compiled extensions expect them to be, if any such > extensions exist) while breaking API (trying to compile such > extensions would give a clear error), so would be a smoother > ramp if we think we need to eventually break those fields for > real. (As discussed above, there are a bunch of fields in the > dtype base class that only make sense for specific dtype > subclasses, e.g. only record dtypes need a list of field names, > but right now all dtypes have one anyway. So it would be nice to > remove these from the base class entirely, but that is > potentially ABI-breaking.) > > - Resolved: np.array should never return an object array unless > explicitly requested (e.g. 
with dtype=object); it just causes too > many surprising problems. > - First step: add a deprecation warning > - Eventually: make it an error. > > - The matrix class > - Resolved: We won't add warnings yet, but we will prominently > document that it is deprecated and should be avoided where-ever > possible. > - St?fan van der Walt volunteers to do this. > - We'd all like to deprecate it properly, but the feeling was that > the precondition for this is for scipy.sparse to provide sparse > "arrays" that don't return np.matrix objects on ordinary > operatoins. Until that happens we can't reasonably tell people > that using np.matrix is a bug. > > - Resolved: we should add a similar prominent note to the > "subclassing ndarray" documentation, warning people that this is > painful and barely works and please don't do it if you have any > alternatives. > > - Resolved: we want more, smaller releases -- every 6 months at > least, aiming to go even faster (every 4 months?) > > - On the question of using Cython inside numpy core: > - Everyone agrees that there are places where this would be an > improvement (e.g., Python<->C interfaces, and places "when you > want to do computer science", e.g. complicated algorithmic stuff > like graph traversals) > - Chuck wanted it to be clear though that he doesn't think it > would be a good goal to try and rewrite all of numpy in Cython > -- there also exist places where Cython ends up being "an uglier > version of C". No-one disagreed. > > - Our text reader is apparently not very functional on Python 3, and > generally slow and hard to work with. > - Resolved: We should extract Pandas's awesome text reader/parser > and convert it into its own package, that could then become a > new backend for both pandas and numpy.loadtxt. > - Jeff thinks this is a great idea > - Thomas Caswell volunteers to do the extraction. > > - We should work on improving our tools for evolving the ABI, so > that we will eventually be less constrained by decisions made > decades ago. > - One idea that had a lot of support was to switch from our > current append-only C-API to a "sliding window" API based on > explicit versions. So a downstream package might say > > #define NUMPY_API_VERSION 4 > > and they'd get the functions and behaviour provided in "version > 4" of the numpy C api. If they wanted to get access to new stuff > that was added in version 5, then they'd need to switch that > #define, and at the same time clean up any usage of stuff that > was removed or changed in version 5. And to provide a smooth > migration path, one version of numpy would support multiple > versions at once, gradually deprecating and dropping old > versions. > > - If anyone wants to help bring pip up to scratch WRT tracking ABI > dependencies (e.g., 'pip install numpy==' > -> triggers rebuild of scipy against the new ABI), then that > would be an extremely useful thing. > > > Policies that should be documented > ================================== > > ...together with some notes about what the contents of the document > should be: > > > How we manage bugs in the bug tracker. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > - Github "milestones" should *only* be assigned to release-blocker > bugs (which mostly means "regression from the last release"). > > In particular, if you're tempted to push a bug forward to the next > release... then it's clearly not a blocker, so don't set it to the > next release's milestone, just remove the milestone entirely. 
> > (Obvious exception to this: deprecation followup bugs where we > decide that we want to keep the deprecation around a bit longer > are a case where a bug actually does switch from being a blocker > to release 1.x to being a blocker for release 1.(x+1).) > > - Don't hesitate to close an issue if there's no way forward -- > e.g. a PR where the author has disappeared. Just post a link to > this policy and close, with a polite note that we need to keep our > tracker useful as a todo list, but they're welcome to re-open if > things change. > > > Deprecations and breakage policy: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > - How long do we need to keep DeprecationWarnings around before we > break things? This is tricky because on the one hand an aggressive > (short) deprecation period lets us deliver new features and > important cleanups more quickly, but on the other hand a > too-aggressive deprecation period is difficult for our more > conservative downstream users. > > - Idea that had the most support: pick a somewhat-aggressive > warning period as our default, and make a rule that if someone > asks for an extension during the beta cycle for the release that > removes it, then we put it back for another release or two worth > of grace period. (While also possibly upgrading the warning to > be more visible during the grace period.) This gives us > deprecation periods that are more adaptive on a case-by-case > basis. > > - Lament: it would be really nice if we could get more people to > test our beta releases, because in practice right now 1.x.0 ends > up being where we actually discover all the bugs, and 1.x.1 is > where it actually becomes usable. Which sucks, and makes it > difficult to have a solid policy about what counts as a > regression, etc. Is there anything we can do about this? > Just a note in here - have you all thought about running the test suites for downstream projects as part of the numpy test suite? Thanks so much for the summary - lots of interesting ideas in here! > > - ABI breakage: we distinguish between an ABI break that breaks > everything (e.g., "import scipy" segfaults), versus an ABI break > that breaks an occasional rare case (e.g., only apps that poke > around in some obscure corner of some struct are affected). > > - The "break-the-world" type remains off-limits for now: the pain > is still too large (conda helps, but there are lots of people > who don't use conda!), and there aren't really any compelling > improvements that this would enable anyway. 
> - The concern is that the current behavior is not very useful, and > as far as we can tell no-one is using it; but, as people get > used to the more-useful PEP 465 behavior, they will increasingly > try to use it on the assumption that np.dot will work the same > way, and this will create pain for lots of people. So Nathaniel > argued that we should start at least issuing a visible warning > when people invoke the corner-case behavior. > - But OTOH, np.dot is such a core piece of infrastructure, and > there's such a large landscape of code out there using numpy > that we can't see, that others were reasonably wary of making > any change. > - For now: document prominently, but no change in behavior. > > > Links to raw notes > ================== > > Main page: > [https://github.com/numpy/numpy/wiki/SciPy-2015-developer-meeting] > > Notes from the meeting proper: > [ > https://docs.google.com/document/d/1IJcYdsHtk8MVAM4AZqFDBSf_nVG-mrB4Tv2bh9u1g4Y/edit?usp=sharing > ] > > Slides from the followup BoF: > [ > https://gist.github.com/njsmith/eb42762054c88e810786/raw/b74f978ce10a972831c582485c80fb5b8e68183b/future-of-numpy-bof.odp > ] > > Notes from the followup BoF: > [ > https://docs.google.com/document/d/11AuTPms5dIPo04JaBOWEoebXfk-tUzEZ-CvFnLIt33w/edit > ] > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Aug 25 15:00:54 2015 From: travis at continuum.io (Travis Oliphant) Date: Tue, 25 Aug 2015 14:00:54 -0500 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: Thanks for the write-up Nathaniel. There is a lot of great detail and interesting ideas here. I've am very eager to understand how to help NumPy and the wider community move forward however I can (my passions on this have not changed since 1999, though what I myself spend time on has changed). There are a lot of ways to think about approaching this, though. It's hard to get all the ideas on the table, and it was unfortunate we couldn't get everybody wyho are core NumPy devs together in person to have this discussion as there are still a lot of questions unanswered and a lot of thought that has gone into other approaches that was not brought up or represented in the meeting (how does Numba fit into this, what about data-shape, dynd, memory-views and Python type system, etc.). If NumPy becomes just an interface-specification, then why don't we just do that *outside* NumPy itself in a way that doesn't jeopardize the stability of NumPy today. These are some of the real questions I have. I will try to write up my thoughts in more depth soon, but I won't be able to respond in-depth right now. I just wanted to comment because Nathaniel said I disagree which is only partly true. The three most important things for me are 1) let's make sure we have representation from as wide of the community as possible (this is really hard), 2) let's look around at the broader community and the prior art that is happening in this space right now and 3) let's not pretend we are going to be able to make all this happen without breaking ABI compatibility. 
Let's just break ABI compatibility with NumPy 2.0 *and* have as much fidelity with the API and semantics of current NumPy as possible (though there will be some changes necessary long-term). I don't think we should intentionally break ABI if we can avoid it, but I also don't think we should spend in-ordinate amounts of time trying to pretend that we won't break ABI (for at least some people), and most importantly we should not pretend *not* to break the ABI when we actually do. We did this once before with the roll-out of date-time, and it was really un-necessary. When I released NumPy 1.0, there were several things that I knew should be fixed very soon (NumPy was never designed to not break ABI). Those problems are still there. Now, that we have quite a bit better understanding of what NumPy *should* be (there have been tremendous strides in understanding and community size over the past 10 years), let's actually make the infrastructure we think will last for the next 20 years (instead of trying to shoe-horn new ideas into a 20-year old code-base that wasn't designed for it). NumPy is a hard code-base. It has been since Numeric days in 1995. I could be wrong, but my guess is that we will be passed by as a community if we don't seize the opportunity to build something better than we can build if we are forced to use a 20 year old code-base. It is more important to not break people's code and to be clear when a re-compile is necessary for dependencies. Those to me are the most important constraints. There are a lot of great ideas that we all have about what we want NumPy to be able to do. Some of this are pretty transformational (and the more exciting they are, the harder I think they are going to be to implement without breaking at least the ABI). There is probably some CAP-like theorem around Stability-Features-Speed-of-Development (pick 2) when it comes to Open Source Software development and making feature-progress with NumPy *is going* to create in-stability which concerns me. I would like to see a little-bit-of-pain one time with a NumPy 2.0, rather than a constant pain because of constant churn over many years approach that Nathaniel seems to advocate. To me NumPy 2.0 is an ABI-breaking release that is as API-compatible as possible and whose semantics are not dramatically different. There are at least 3 areas of compatibility (ABI, API, and semantic). ABI-compatibility is a non-feature in today's world. There are so many distributions of the NumPy stack (and conda makes it trivial for anyone to build their own or for you to build one yourself). Making less-optimal software-engineering choices because of fear of breaking the ABI is not something I'm supportive of at all. We should not break ABI every release, but a release every 3 years that breaks ABI is not a problem. API compatibility should be much more sacrosanct, but it is also something that can also be managed. Any NumPy 2.0 should definitely support the full NumPy API (though there could be deprecated swaths). I think the community has done well in using deprecation and limiting the public API to make this more manageable and I would love to see a NumPy 2.0 that solidifies a future-oriented API along with a back-ward compatible API that is also available. Semantic compatibility is the hardest. We have already broken this on multiple occasions throughout the 1.x NumPy releases. Every time you change the code, this can change. This is what I fear causing deep instability over the course of many years. 
These are things like the casting rule details, the effect of indexing changes, any change to the calculations approaches. It is and has been the most at risk during any code-changes. My view is that a NumPy 2.0 (with a new low-level architecture) minimizes these changes to a single release rather than unavoidably spreading them out over many, many releases. I think that summarizes my main concerns. I will write-up more forward thinking ideas for what else is possible in the coming weeks. In the mean time, thanks for keeping the discussion going. It is extremely exciting to see the help people have continued to provide to maintain and improve NumPy. It will be exciting to see what the next few years bring as well. Best, -Travis On Tue, Aug 25, 2015 at 5:03 AM, Nathaniel Smith wrote: > Hi all, > > These are the notes from the NumPy dev meeting held July 7, 2015, at > the SciPy conference in Austin, presented here so the list can keep up > with what happens, and so you can give feedback. Please do give > feedback, none of this is final! > > (Also, if anyone who was there notices anything I left out or > mischaracterized, please speak up -- these are a lot of notes I'm > trying to gather together, so I could easily have missed something!) > > Thanks to Jill Cowan and the rest of the SciPy organizers for donating > space and organizing logistics for us, and to the Berkeley Institute > for Data Science for funding travel for Jaime, Nathaniel, and > Sebastian. > > > Attendees > ========= > > Present in the room for all or part: Daniel Allan, Chris Barker, > Sebastian Berg, Thomas Caswell, Jeff Reback, Jaime Fern?ndez del > R?o, Chuck Harris, Nathaniel Smith, St?fan van der Walt. (Note: I'm > pretty sure this list is incomplete) > > Joining remotely for all or part: Stephan Hoyer, Julian Taylor. > > > Formalizing our governance/decision making > ========================================== > > This was a major focus of discussion. At a high level, the consensus > was to steal IPython's governance document ("IPEP 29") and modify it > to remove its use of a BDFL as a "backstop" to normal community > consensus-based decision, and replace it with a new "backstop" based > on Apache-project-style consensus voting amongst the core team. > > I'll send out a proper draft of this shortly for further discussion. > > > Development roadmap > =================== > > General consensus: > > Let's assume NumPy is going to remain important indefinitely, and > try to make it better, instead of waiting for something better to > come along. (This is unlikely to be wasted effort even if something > better does come along, and it's hardly a sure thing that that will > happen anyway.) > > Let's focus on evolving numpy as far as we can without major > break-the-world changes (no "numpy 2.0", at least in the foreseeable > future). > > And, as a target for that evolution, let's change our focus from > numpy as "NumPy is the library that gives you the np.ndarray object > (plus some attached infrastructure)", to "NumPy provides the > standard framework for working with arrays and array-like objects in > Python" > > This means, creating defined interfaces between array-like objects / > ufunc objects / dtype objects, so that it becomes possible for third > parties to add their own and mix-and-match. Right now ufuncs are > pretty good at this, but if you want a new array class or dtype then > in most cases you pretty much have to modify numpy itself. 
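For concreteness, here is a minimal sketch of the kind of third-party hook this "defined interfaces" idea is pointing at -- the __numpy_ufunc__ protocol named in the sub-goals below. The ScaledArray class and the direct call at the end are illustrative assumptions only: the hook had not shipped in a released NumPy at the time of this thread, and its final signature could still change.

    import numpy as np

    class ScaledArray(object):
        """Toy array-like container: data plus a lazy scale factor."""

        def __init__(self, data, scale=1.0):
            self.data = np.asarray(data, dtype=np.float64)
            self.scale = float(scale)

        def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
            # Proposed hook: numpy would call this instead of coercing
            # `self` to an ndarray; `i` is the position of self in `inputs`.
            if method != "__call__":
                return NotImplemented
            args = [x.scale * x.data if isinstance(x, ScaledArray) else x
                    for x in inputs]
            return ScaledArray(ufunc(*args, **kwargs))

    a = ScaledArray([1.0, 2.0, 3.0], scale=10.0)
    # Once the hook is wired up, np.sin(a) would return a ScaledArray.
    # Released NumPy does not dispatch yet, so call the hook directly
    # to show the intended control flow:
    result = a.__numpy_ufunc__(np.sin, "__call__", 0, (a,))
    print(result.data)

The point of the protocol is that code written against plain np.* calls keeps working when handed such a container, which is exactly the mix-and-match goal described in the visions that follow.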
> > Vision: instead of everyone who wants a new container type having to > reimplement all of numpy, Alice can implement an array class using > (sparse / distributed / compressed / tiled / gpu / out-of-core / > delayed / ...) storage, pass it to code that was written using > direct calls to np.* functions, and it just works. (Instead of > np.sin being "the way you calculate the sine of an ndarray", it's > "the way you calculate the sine of any array-like container > object".) > > Vision: Darryl can implement a new dtype for (categorical data / > astronomical dates / integers-with-missing-values / ...) without > having to touch the numpy core. > > Vision: Chandni can then come along and combine them by doing > > a = alice_array([...], dtype=darryl_dtype) > > and it just works. > > Vision: no-one is tempted to subclass ndarray, because anything you > can do with an ndarray subclass you can also easily do by defining > your own new class that implements the "array protocol". > > > Supporting third-party array types > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Sub-goals: > - Get __numpy_ufunc__ done, which will cover a good chunk of numpy's > API right there. > - Go through the rest of the stuff in numpy, and figure out some > story for how to let it handle third-party array classes: > - ufunc ALL the things: Some things can be converted directly into > (g)ufuncs and then use __numpy_ufunc__ (e.g., np.std); some > things could be converted into (g)ufuncs if we extended the > (g)ufunc interface a bit (e.g. np.sort, np.matmul). > - Some things probably need their own __numpy_ufunc__-like > extensions (__numpy_concatenate__?) > - Provide tools to make it easier to implement the more complicated > parts of an array object (e.g. the bazillion different methods, > many of which are ufuncs in disguise, or indexing) > - Longer-run interesting research project: __numpy_ufunc__ requires > that one or the other object have explicit knowledge of how to > handle the other, so to handle binary ufuncs with N array types > you need something like N**2 __numpy_ufunc__ code paths. As an > alternative, if there were some interface that an object could > export that provided the operations nditer needs to efficiently > iterate over (chunks of) it, then you would only need N > implementations of this interface to handle all N**2 operations. > > This would solve a lot of problems for projects like: > - blosc > - dask > - distarray > - numpy.ma > - pandas > - scipy.sparse > - xray > > > Supporting third-party dtypes > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > We already have something like a C level "dtype > protocol". Conceptually, the way you define a new dtype is by > defining a new class whose instances have data attributes defining > the parameters of the dtype (what fields are in *this* record dtype, > how many characters are in *this* string dtype, what units are used > for *this* datetime64, etc.), and you define a bunch of methods to > do things like convert an object from a Python object to your dtype > or vice-versa, to copy an array of your dtype from one place to > another, to cast to and from your new dtype, etc. This part is > great. > > The problem is, in the current implementation, we don't actually use > the Python object system to define these classes / attributes / > methods. 
Instead, all possible dtypes are jammed into a single > Python-level class, whose struct has fields for the union of all > possible dtype's attributes, and instead of Python-style method > slots there's just a big table of function pointers attached to each > object. > > So the main proposal is that we keep the basic design, but switch it > so that the float64 dtype, the int64 dtype, etc. actually literally > are subclasses of np.dtype, each implementing their own fields and > Python-style methods. > > Some of the pieces involved in doing this: > > - The current dtype methods should be cleaned up -- e.g. 'dot' and > 'less_than' are both dtype methods, when conceptually they're much > more like ufuncs. > > - The ufunc inner-loop interface currently does not get a reference > to the dtype object, so they can't see its attributes and this is > a big obstacle to many interesting dtypes (e.g., it's hard to > implement np.equal for categoricals if you don't know what > categories each has). So we need to add new arguments to the core > ufunc loop signature. (Fortunately this can be done in a > backwards-compatible way.) > > - We need to figure out what exactly the dtype methods should be, > and add them to the dtype class (possibly with backwards > compatibility shims for anyone who is accessing PyArray_ArrFuncs > directly). > > - Casting will be possibly the trickiest thing to work out, though > the basic idea of using dunder-dispatch-like __cast__ and > __rcast__ methods seems workable. (Encouragingly, this is also > exactly what dynd also does, though unfortunately dynd does not > yet support user-defined dtypes even to the extent that numpy > does, so there isn't much else we can steal from them.) > - We may also want to rethink the casting rules while we're at it, > since they have some very weird corners right now (e.g. see > [https://github.com/numpy/numpy/issues/6240]) > > - We need to migrate the current dtypes over to the new system, > which can be done in stages: > > - First stick them all in a single "legacy dtype" class whose > methods just dispatch to the PyArray_ArrFuncs per-object "method > table" > > - Then move each of them into their own classes > > - We should provide a Python-level wrapper for the protocol, so that > you can call dtype methods from Python > > - And vice-versa, it should be possible to subclass dtype at the > Python level > > - etc. > > Fortunately, AFAICT pretty much all of this can be done while > maintaining backwards compatibility (though we may want to break > some obscure cases to avoid expending *too* much effort with weird > backcompat contortions that will only help a vanishingly small > proportion of the userbase), and a lot of the above changes can be > done as semi-independent mini-projects, so there's no need for some > branch to go off and spend a year rewriting the world. > > Obviously there are still a lot of details to work out, though. But > overall, there was widespread agreement that this is one of the #1 > pain points for our users (e.g. it's the single main request from > pandas), and fixing it is very high priority. > > Some features that would become straightforward to implement > (e.g. even in third-party libraries) if this were fixed: > - missing value support > - physical unit tracking (meters / seconds -> array of velocity; > meters + seconds -> error) > - better and more diverse datetime representations (e.g. 
datetimes > with attached timezones, or using funky geophysical or > astronomical calendars) > - categorical data > - variable length strings > - strings-with-encodings (e.g. latin1) > - forward mode automatic differentiation (write a function that > computes f(x) where x is an array of float64; pass that function > an array with a special dtype and get out both f(x) and f'(x)) > - probably others I'm forgetting right now > > I should also note that there was one substantial objection to this > plan, from Travis Oliphant (in discussions later in the > conference). I'm not confident I understand his objections well > enough to reproduce them here, though -- perhaps he'll elaborate. > > > Money > ===== > > There was an extensive discussion on the topic of: "if we had money, > what would we do with it?" > > This is partially motivated by the realization that there are a > number of sources that we could probably get money from, if we had a > good story for what we wanted to do, so it's not just an idle > question. > > Points of general agreement: > > - Doing the in-person meeting was a good thing. We should plan do > that again, at least once a year. So one thing to spend money on > is travel subsidies to make sure that happens and is productive. > > - While it's tempting to imagine hiring junior people for the more > frustrating/boring work like maintaining buildbots, release > infrastructure, updating docs, etc., this seems difficult to do > realistically with our current resources -- how do we hire for > this, who would manage them, etc.? > > - On the other hand, the general feeling was that if we found the > money to hire a few more senior people who could take care of > themselves more, then that would be good and we could > realistically absorb that extra work without totally unbalancing > the project. > > - A major open question is how we would recruit someone for a > position like this, since apparently all the obvious candidates > who are already active on the NumPy team already have other > things going on. [For calibration on how hard this can be: NYU > has apparently had an open position for a year with the job > description of "come work at NYU full-time with a > private-industry-competitive-salary on whatever your personal > open-source scientific project is" (!) and still is having an > extremely difficult time filling it: > [http://cds.nyu.edu/research-engineer/]] > > - General consensus though was that there isn't much to be done > about this though, except try it and see. > > - (By the way, if you're someone who's reading this and > potentially interested in like a postdoc or better working on > numpy, then let's talk...) > > > More specific changes to numpy that had general consensus, but don't > really fit into a high-level roadmap > > ========================================================================================================= > > - Resolved: we should merge multiarray.so and umath.so into a single > extension module, so that they can share utility code without the > current awkward contortions. > > - Resolved: we should start hiding new fields in the ufunc and dtype > structs as soon as possible going forward. (I.e. they would not be > present in the version of the structs that are exposed through the > C API, but internally we would use a more detailed struct.) > - Mayyyyyybe we should even go ahead and hide the subset of the > existing fields that are really internal details that no-one > should be using. 
If we did this without changing anything else > then it would preserve ABI (the fields would still be where > existing compiled extensions expect them to be, if any such > extensions exist) while breaking API (trying to compile such > extensions would give a clear error), so would be a smoother > ramp if we think we need to eventually break those fields for > real. (As discussed above, there are a bunch of fields in the > dtype base class that only make sense for specific dtype > subclasses, e.g. only record dtypes need a list of field names, > but right now all dtypes have one anyway. So it would be nice to > remove these from the base class entirely, but that is > potentially ABI-breaking.) > > - Resolved: np.array should never return an object array unless > explicitly requested (e.g. with dtype=object); it just causes too > many surprising problems. > - First step: add a deprecation warning > - Eventually: make it an error. > > - The matrix class > - Resolved: We won't add warnings yet, but we will prominently > document that it is deprecated and should be avoided where-ever > possible. > - St?fan van der Walt volunteers to do this. > - We'd all like to deprecate it properly, but the feeling was that > the precondition for this is for scipy.sparse to provide sparse > "arrays" that don't return np.matrix objects on ordinary > operatoins. Until that happens we can't reasonably tell people > that using np.matrix is a bug. > > - Resolved: we should add a similar prominent note to the > "subclassing ndarray" documentation, warning people that this is > painful and barely works and please don't do it if you have any > alternatives. > > - Resolved: we want more, smaller releases -- every 6 months at > least, aiming to go even faster (every 4 months?) > > - On the question of using Cython inside numpy core: > - Everyone agrees that there are places where this would be an > improvement (e.g., Python<->C interfaces, and places "when you > want to do computer science", e.g. complicated algorithmic stuff > like graph traversals) > - Chuck wanted it to be clear though that he doesn't think it > would be a good goal to try and rewrite all of numpy in Cython > -- there also exist places where Cython ends up being "an uglier > version of C". No-one disagreed. > > - Our text reader is apparently not very functional on Python 3, and > generally slow and hard to work with. > - Resolved: We should extract Pandas's awesome text reader/parser > and convert it into its own package, that could then become a > new backend for both pandas and numpy.loadtxt. > - Jeff thinks this is a great idea > - Thomas Caswell volunteers to do the extraction. > > - We should work on improving our tools for evolving the ABI, so > that we will eventually be less constrained by decisions made > decades ago. > - One idea that had a lot of support was to switch from our > current append-only C-API to a "sliding window" API based on > explicit versions. So a downstream package might say > > #define NUMPY_API_VERSION 4 > > and they'd get the functions and behaviour provided in "version > 4" of the numpy C api. If they wanted to get access to new stuff > that was added in version 5, then they'd need to switch that > #define, and at the same time clean up any usage of stuff that > was removed or changed in version 5. And to provide a smooth > migration path, one version of numpy would support multiple > versions at once, gradually deprecating and dropping old > versions. 
> > - If anyone wants to help bring pip up to scratch WRT tracking ABI > dependencies (e.g., 'pip install numpy==' > -> triggers rebuild of scipy against the new ABI), then that > would be an extremely useful thing. > > > Policies that should be documented > ================================== > > ...together with some notes about what the contents of the document > should be: > > > How we manage bugs in the bug tracker. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > - Github "milestones" should *only* be assigned to release-blocker > bugs (which mostly means "regression from the last release"). > > In particular, if you're tempted to push a bug forward to the next > release... then it's clearly not a blocker, so don't set it to the > next release's milestone, just remove the milestone entirely. > > (Obvious exception to this: deprecation followup bugs where we > decide that we want to keep the deprecation around a bit longer > are a case where a bug actually does switch from being a blocker > to release 1.x to being a blocker for release 1.(x+1).) > > - Don't hesitate to close an issue if there's no way forward -- > e.g. a PR where the author has disappeared. Just post a link to > this policy and close, with a polite note that we need to keep our > tracker useful as a todo list, but they're welcome to re-open if > things change. > > > Deprecations and breakage policy: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > - How long do we need to keep DeprecationWarnings around before we > break things? This is tricky because on the one hand an aggressive > (short) deprecation period lets us deliver new features and > important cleanups more quickly, but on the other hand a > too-aggressive deprecation period is difficult for our more > conservative downstream users. > > - Idea that had the most support: pick a somewhat-aggressive > warning period as our default, and make a rule that if someone > asks for an extension during the beta cycle for the release that > removes it, then we put it back for another release or two worth > of grace period. (While also possibly upgrading the warning to > be more visible during the grace period.) This gives us > deprecation periods that are more adaptive on a case-by-case > basis. > > - Lament: it would be really nice if we could get more people to > test our beta releases, because in practice right now 1.x.0 ends > up being where we actually the discover all the bugs, and 1.x.1 is > where it actually becomes usable. Which sucks, and makes it > difficult to have a solid policy about what counts as a > regression, etc. Is there anything we can do about this? > > - ABI breakage: we distinguish between an ABI break that breaks > everything (e.g., "import scipy" segfaults), versus an ABI break > that breaks an occasional rare case (e.g., only apps that poke > around in some obscure corner of some struct are affected). > > - The "break-the-world" type remains off-limit for now: the pain > is still too large (conda helps, but there are lots of people > who don't use conda!), and there aren't really any compelling > improvements that this would enable anyway. 
> > - For the "break-0.1%-of-users" type, it is *not* ruled out by > fiat, though we remain conservative: we should treat it like > other API breaks in principle, and do a careful case-by-case > analysis of the details of the situation, taking into account > what kind of code would be broken, how common these cases are, > how important the benefits are, whether there are any specific > mitigation strategies we can use, etc. -- with this process of > course taking into account that a segfault is nastier than a > Python exception. > > > Other points that were discussed > ================================ > > - There was inconclusive discussion of what we should do with dot() > in the places where it disagrees with the PEP 465 matmul semantics > (specifically this is when both arguments have ndim >= 3, or one > argument has ndim == 0). > - The concern is that the current behavior is not very useful, and > as far as we can tell no-one is using it; but, as people get > used to the more-useful PEP 465 behavior, they will increasingly > try to use it on the assumption that np.dot will work the same > way, and this will create pain for lots of people. So Nathaniel > argued that we should start at least issuing a visible warning > when people invoke the corner-case behavior. > - But OTOH, np.dot is such a core piece of infrastructure, and > there's such a large landscape of code out there using numpy > that we can't see, that others were reasonably wary of making > any change. > - For now: document prominently, but no change in behavior. > > > Links to raw notes > ================== > > Main page: > [https://github.com/numpy/numpy/wiki/SciPy-2015-developer-meeting] > > Notes from the meeting proper: > [ > https://docs.google.com/document/d/1IJcYdsHtk8MVAM4AZqFDBSf_nVG-mrB4Tv2bh9u1g4Y/edit?usp=sharing > ] > > Slides from the followup BoF: > [ > https://gist.github.com/njsmith/eb42762054c88e810786/raw/b74f978ce10a972831c582485c80fb5b8e68183b/future-of-numpy-bof.odp > ] > > Notes from the followup BoF: > [ > https://docs.google.com/document/d/11AuTPms5dIPo04JaBOWEoebXfk-tUzEZ-CvFnLIt33w/edit > ] > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *Travis Oliphant* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Aug 25 15:21:59 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 25 Aug 2015 21:21:59 +0200 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 References: Message-ID: <20150825212159.39cee394@fsol> On Tue, 25 Aug 2015 03:03:41 -0700 Nathaniel Smith wrote: > > Supporting third-party dtypes > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > [...] > > Some features that would become straightforward to implement > (e.g. even in third-party libraries) if this were fixed: > - missing value support > - physical unit tracking (meters / seconds -> array of velocity; > meters + seconds -> error) > - better and more diverse datetime representations (e.g. datetimes > with attached timezones, or using funky geophysical or > astronomical calendars) > - categorical data > - variable length strings > - strings-with-encodings (e.g. 
latin1) > - forward mode automatic differentiation (write a function that > computes f(x) where x is an array of float64; pass that function > an array with a special dtype and get out both f(x) and f'(x)) > - probably others I'm forgetting right now It should also be the opportunity to streamline datetime64 and timedelta64 dtypes. Currently the unit information is IIRC hidden in some weird metadata thing called the PyArray_DatetimeMetaData. Also, thanks the notes. It has been an interesting read. Regards Antoine. From rainwoodman at gmail.com Tue Aug 25 15:46:11 2015 From: rainwoodman at gmail.com (Feng Yu) Date: Tue, 25 Aug 2015 12:46:11 -0700 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: <20150825212159.39cee394@fsol> References: <20150825212159.39cee394@fsol> Message-ID: Hi Nathaniel, Thanks for the notes. In some sense, the new dtype class(es) will provided a way of formalizing these `weird` metadata, and probably exposing them to Python. May I add that please consider adding a way to declare the sorting order (priority and direction) of fields in a structured array in the new dtype as well? Regards, Yu On Tue, Aug 25, 2015 at 12:21 PM, Antoine Pitrou wrote: > On Tue, 25 Aug 2015 03:03:41 -0700 > Nathaniel Smith wrote: >> >> Supporting third-party dtypes >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> > [...] >> >> Some features that would become straightforward to implement >> (e.g. even in third-party libraries) if this were fixed: >> - missing value support >> - physical unit tracking (meters / seconds -> array of velocity; >> meters + seconds -> error) >> - better and more diverse datetime representations (e.g. datetimes >> with attached timezones, or using funky geophysical or >> astronomical calendars) >> - categorical data >> - variable length strings >> - strings-with-encodings (e.g. latin1) >> - forward mode automatic differentiation (write a function that >> computes f(x) where x is an array of float64; pass that function >> an array with a special dtype and get out both f(x) and f'(x)) >> - probably others I'm forgetting right now > > It should also be the opportunity to streamline datetime64 and > timedelta64 dtypes. Currently the unit information is IIRC hidden in > some weird metadata thing called the PyArray_DatetimeMetaData. > > Also, thanks the notes. It has been an interesting read. > > Regards > > Antoine. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Tue Aug 25 16:58:46 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Aug 2015 14:58:46 -0600 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 1:00 PM, Travis Oliphant wrote: > Thanks for the write-up Nathaniel. There is a lot of great detail and > interesting ideas here. > > I've am very eager to understand how to help NumPy and the wider community > move forward however I can (my passions on this have not changed since > 1999, though what I myself spend time on has changed). > > There are a lot of ways to think about approaching this, though. 
It's > hard to get all the ideas on the table, and it was unfortunate we couldn't > get everybody wyho are core NumPy devs together in person to have this > discussion as there are still a lot of questions unanswered and a lot of > thought that has gone into other approaches that was not brought up or > represented in the meeting (how does Numba fit into this, what about > data-shape, dynd, memory-views and Python type system, etc.). If NumPy > becomes just an interface-specification, then why don't we just do that > *outside* NumPy itself in a way that doesn't jeopardize the stability of > NumPy today. These are some of the real questions I have. I will try > to write up my thoughts in more depth soon, but I won't be able to respond > in-depth right now. I just wanted to comment because Nathaniel said I > disagree which is only partly true. > > The three most important things for me are 1) let's make sure we have > representation from as wide of the community as possible (this is really > hard), 2) let's look around at the broader community and the prior art that > is happening in this space right now and 3) let's not pretend we are going > to be able to make all this happen without breaking ABI compatibility. > Let's just break ABI compatibility with NumPy 2.0 *and* have as much > fidelity with the API and semantics of current NumPy as possible (though > there will be some changes necessary long-term). > > I don't think we should intentionally break ABI if we can avoid it, but I > also don't think we should spend in-ordinate amounts of time trying to > pretend that we won't break ABI (for at least some people), and most > importantly we should not pretend *not* to break the ABI when we actually > do. We did this once before with the roll-out of date-time, and it was > really un-necessary. When I released NumPy 1.0, there were several > things that I knew should be fixed very soon (NumPy was never designed to > not break ABI). Those problems are still there. Now, that we have > quite a bit better understanding of what NumPy *should* be (there have been > tremendous strides in understanding and community size over the past 10 > years), let's actually make the infrastructure we think will last for the > next 20 years (instead of trying to shoe-horn new ideas into a 20-year old > code-base that wasn't designed for it). > > NumPy is a hard code-base. It has been since Numeric days in 1995. I > could be wrong, but my guess is that we will be passed by as a community if > we don't seize the opportunity to build something better than we can build > if we are forced to use a 20 year old code-base. > > It is more important to not break people's code and to be clear when a > re-compile is necessary for dependencies. Those to me are the most > important constraints. There are a lot of great ideas that we all have > about what we want NumPy to be able to do. Some of this are pretty > transformational (and the more exciting they are, the harder I think they > are going to be to implement without breaking at least the ABI). There > is probably some CAP-like theorem around > Stability-Features-Speed-of-Development (pick 2) when it comes to Open > Source Software development and making feature-progress with NumPy *is > going* to create in-stability which concerns me. > > I would like to see a little-bit-of-pain one time with a NumPy 2.0, rather > than a constant pain because of constant churn over many years approach > that Nathaniel seems to advocate. 
To me NumPy 2.0 is an ABI-breaking > release that is as API-compatible as possible and whose semantics are not > dramatically different. > > There are at least 3 areas of compatibility (ABI, API, and semantic). > ABI-compatibility is a non-feature in today's world. There are so many > distributions of the NumPy stack (and conda makes it trivial for anyone to > build their own or for you to build one yourself). Making less-optimal > software-engineering choices because of fear of breaking the ABI is not > something I'm supportive of at all. We should not break ABI every > release, but a release every 3 years that breaks ABI is not a problem. > > API compatibility should be much more sacrosanct, but it is also something > that can also be managed. Any NumPy 2.0 should definitely support the > full NumPy API (though there could be deprecated swaths). I think the > community has done well in using deprecation and limiting the public API to > make this more manageable and I would love to see a NumPy 2.0 that > solidifies a future-oriented API along with a back-ward compatible API that > is also available. > > Semantic compatibility is the hardest. We have already broken this on > multiple occasions throughout the 1.x NumPy releases. Every time you > change the code, this can change. This is what I fear causing deep > instability over the course of many years. These are things like the > casting rule details, the effect of indexing changes, any change to the > calculations approaches. It is and has been the most at risk during any > code-changes. My view is that a NumPy 2.0 (with a new low-level > architecture) minimizes these changes to a single release rather than > unavoidably spreading them out over many, many releases. > > I think that summarizes my main concerns. I will write-up more forward > thinking ideas for what else is possible in the coming weeks. In the mean > time, thanks for keeping the discussion going. It is extremely exciting to > see the help people have continued to provide to maintain and improve > NumPy. It will be exciting to see what the next few years bring as > well. > I think the only thing that looks even a little bit like a numpy 2.0 at this time is dynd. Rewriting numpy, let alone producing numpy 2.0 is a major project. Dynd is 2.5+ years old, 3500+ commits in, and still in progress. If there is a decision to pursue Dynd I could support that, but I think we would want to think deeply about how to make the transition as painless as possible. It would be good at this point to get some feedback from people currently using dynd. IIRC, part of the reason for starting dynd was the perception that is was not possible to evolve numpy without running into compatibility road blocks. Travis, could you perhaps summarize the thinking that went into the decision to make dynd a separate project? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Aug 25 20:53:00 2015 From: cournape at gmail.com (David Cournapeau) Date: Wed, 26 Aug 2015 01:53:00 +0100 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: Thanks for the good summary Nathaniel. Regarding dtype machinery, I agree casting is the hardest part. Unless the code has changed dramatically, this was the main reason why you could not make most of the dtypes separate from numpy codebase (I tried to move the datetime dtype out of multiarray into a separate C extension some years ago). 
Being able to separate the dtypes from the multiarray module would be an obvious way to drive the internal API change. Regarding the use of cython in numpy, was there any discussion about the compilation/size cost of using cython, and talking to the cython team to improve this ? Or was that considered acceptable with current cython for numpy. I am convinced cleanly separating the low level parts from the python C API plumbing would be the single most important thing one could do to make the codebase more amenable. David On Tue, Aug 25, 2015 at 9:58 PM, Charles R Harris wrote: > > > On Tue, Aug 25, 2015 at 1:00 PM, Travis Oliphant > wrote: > >> Thanks for the write-up Nathaniel. There is a lot of great detail and >> interesting ideas here. >> >> I've am very eager to understand how to help NumPy and the wider >> community move forward however I can (my passions on this have not changed >> since 1999, though what I myself spend time on has changed). >> >> There are a lot of ways to think about approaching this, though. It's >> hard to get all the ideas on the table, and it was unfortunate we couldn't >> get everybody wyho are core NumPy devs together in person to have this >> discussion as there are still a lot of questions unanswered and a lot of >> thought that has gone into other approaches that was not brought up or >> represented in the meeting (how does Numba fit into this, what about >> data-shape, dynd, memory-views and Python type system, etc.). If NumPy >> becomes just an interface-specification, then why don't we just do that >> *outside* NumPy itself in a way that doesn't jeopardize the stability of >> NumPy today. These are some of the real questions I have. I will try >> to write up my thoughts in more depth soon, but I won't be able to respond >> in-depth right now. I just wanted to comment because Nathaniel said I >> disagree which is only partly true. >> >> The three most important things for me are 1) let's make sure we have >> representation from as wide of the community as possible (this is really >> hard), 2) let's look around at the broader community and the prior art that >> is happening in this space right now and 3) let's not pretend we are going >> to be able to make all this happen without breaking ABI compatibility. >> Let's just break ABI compatibility with NumPy 2.0 *and* have as much >> fidelity with the API and semantics of current NumPy as possible (though >> there will be some changes necessary long-term). >> >> I don't think we should intentionally break ABI if we can avoid it, but I >> also don't think we should spend in-ordinate amounts of time trying to >> pretend that we won't break ABI (for at least some people), and most >> importantly we should not pretend *not* to break the ABI when we actually >> do. We did this once before with the roll-out of date-time, and it was >> really un-necessary. When I released NumPy 1.0, there were several >> things that I knew should be fixed very soon (NumPy was never designed to >> not break ABI). Those problems are still there. Now, that we have >> quite a bit better understanding of what NumPy *should* be (there have been >> tremendous strides in understanding and community size over the past 10 >> years), let's actually make the infrastructure we think will last for the >> next 20 years (instead of trying to shoe-horn new ideas into a 20-year old >> code-base that wasn't designed for it). >> >> NumPy is a hard code-base. It has been since Numeric days in 1995. 
I >> could be wrong, but my guess is that we will be passed by as a community if >> we don't seize the opportunity to build something better than we can build >> if we are forced to use a 20 year old code-base. >> >> It is more important to not break people's code and to be clear when a >> re-compile is necessary for dependencies. Those to me are the most >> important constraints. There are a lot of great ideas that we all have >> about what we want NumPy to be able to do. Some of this are pretty >> transformational (and the more exciting they are, the harder I think they >> are going to be to implement without breaking at least the ABI). There >> is probably some CAP-like theorem around >> Stability-Features-Speed-of-Development (pick 2) when it comes to Open >> Source Software development and making feature-progress with NumPy *is >> going* to create in-stability which concerns me. >> >> I would like to see a little-bit-of-pain one time with a NumPy 2.0, >> rather than a constant pain because of constant churn over many years >> approach that Nathaniel seems to advocate. To me NumPy 2.0 is an >> ABI-breaking release that is as API-compatible as possible and whose >> semantics are not dramatically different. >> >> There are at least 3 areas of compatibility (ABI, API, and semantic). >> ABI-compatibility is a non-feature in today's world. There are so many >> distributions of the NumPy stack (and conda makes it trivial for anyone to >> build their own or for you to build one yourself). Making less-optimal >> software-engineering choices because of fear of breaking the ABI is not >> something I'm supportive of at all. We should not break ABI every >> release, but a release every 3 years that breaks ABI is not a problem. >> >> API compatibility should be much more sacrosanct, but it is also >> something that can also be managed. Any NumPy 2.0 should definitely >> support the full NumPy API (though there could be deprecated swaths). I >> think the community has done well in using deprecation and limiting the >> public API to make this more manageable and I would love to see a NumPy 2.0 >> that solidifies a future-oriented API along with a back-ward compatible API >> that is also available. >> >> Semantic compatibility is the hardest. We have already broken this on >> multiple occasions throughout the 1.x NumPy releases. Every time you >> change the code, this can change. This is what I fear causing deep >> instability over the course of many years. These are things like the >> casting rule details, the effect of indexing changes, any change to the >> calculations approaches. It is and has been the most at risk during any >> code-changes. My view is that a NumPy 2.0 (with a new low-level >> architecture) minimizes these changes to a single release rather than >> unavoidably spreading them out over many, many releases. >> >> I think that summarizes my main concerns. I will write-up more forward >> thinking ideas for what else is possible in the coming weeks. In the mean >> time, thanks for keeping the discussion going. It is extremely exciting to >> see the help people have continued to provide to maintain and improve >> NumPy. It will be exciting to see what the next few years bring as >> well. >> > > I think the only thing that looks even a little bit like a numpy 2.0 at > this time is dynd. Rewriting numpy, let alone producing numpy 2.0 is a > major project. Dynd is 2.5+ years old, 3500+ commits in, and still in > progress. 
If there is a decision to pursue Dynd I could support that, but > I think we would want to think deeply about how to make the transition as > painless as possible. It would be good at this point to get some feedback > from people currently using dynd. IIRC, part of the reason for starting > dynd was the perception that is was not possible to evolve numpy without > running into compatibility road blocks. Travis, could you perhaps summarize > the thinking that went into the decision to make dynd a separate project? > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Aug 25 23:34:25 2015 From: travis at continuum.io (Travis Oliphant) Date: Tue, 25 Aug 2015 22:34:25 -0500 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 3:58 PM, Charles R Harris wrote: > > > On Tue, Aug 25, 2015 at 1:00 PM, Travis Oliphant > wrote: > >> Thanks for the write-up Nathaniel. There is a lot of great detail and >> interesting ideas here. >> >> >> > > There are at least 3 areas of compatibility (ABI, API, and semantic). >> ABI-compatibility is a non-feature in today's world. There are so many >> distributions of the NumPy stack (and conda makes it trivial for anyone to >> build their own or for you to build one yourself). Making less-optimal >> software-engineering choices because of fear of breaking the ABI is not >> something I'm supportive of at all. We should not break ABI every >> release, but a release every 3 years that breaks ABI is not a problem. >> >> API compatibility should be much more sacrosanct, but it is also >> something that can also be managed. Any NumPy 2.0 should definitely >> support the full NumPy API (though there could be deprecated swaths). I >> think the community has done well in using deprecation and limiting the >> public API to make this more manageable and I would love to see a NumPy 2.0 >> that solidifies a future-oriented API along with a back-ward compatible API >> that is also available. >> >> Semantic compatibility is the hardest. We have already broken this on >> multiple occasions throughout the 1.x NumPy releases. Every time you >> change the code, this can change. This is what I fear causing deep >> instability over the course of many years. These are things like the >> casting rule details, the effect of indexing changes, any change to the >> calculations approaches. It is and has been the most at risk during any >> code-changes. My view is that a NumPy 2.0 (with a new low-level >> architecture) minimizes these changes to a single release rather than >> unavoidably spreading them out over many, many releases. >> >> I think that summarizes my main concerns. I will write-up more forward >> thinking ideas for what else is possible in the coming weeks. In the mean >> time, thanks for keeping the discussion going. It is extremely exciting to >> see the help people have continued to provide to maintain and improve >> NumPy. It will be exciting to see what the next few years bring as >> well. >> > > I think the only thing that looks even a little bit like a numpy 2.0 at > this time is dynd. Rewriting numpy, let alone producing numpy 2.0 is a > major project. Dynd is 2.5+ years old, 3500+ commits in, and still in > progress. 
If there is a decision to pursue Dynd I could support that, but > I think we would want to think deeply about how to make the transition as > painless as possible. It would be good at this point to get some feedback > from people currently using dynd. IIRC, part of the reason for starting > dynd was the perception that is was not possible to evolve numpy without > running into compatibility road blocks. Travis, could you perhaps summarize > the thinking that went into the decision to make dynd a separate project? > Thanks Chuck. I'll do this in a separate email, but I just wanted to point out that when I say NumPy 2.0, I'm actually only specifically talking about a release of NumPy that breaks ABI compatibility --- not some potential re-write. I'm not ruling that out, but I'm not necessarily implying such a thing by saying NumPy 2.0. > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *Travis Oliphant* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Wed Aug 26 00:55:49 2015 From: travis at continuum.io (Travis Oliphant) Date: Tue, 25 Aug 2015 23:55:49 -0500 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 3:58 PM, Charles R Harris wrote: > > > On Tue, Aug 25, 2015 at 1:00 PM, Travis Oliphant > wrote: > >> Thanks for the write-up Nathaniel. There is a lot of great detail and >> interesting ideas here. >> >> >> > > I think that summarizes my main concerns. I will write-up more forward >> thinking ideas for what else is possible in the coming weeks. In the mean >> time, thanks for keeping the discussion going. It is extremely exciting to >> see the help people have continued to provide to maintain and improve >> NumPy. It will be exciting to see what the next few years bring as >> well. >> > > I think the only thing that looks even a little bit like a numpy 2.0 at > this time is dynd. Rewriting numpy, let alone producing numpy 2.0 is a > major project. Dynd is 2.5+ years old, 3500+ commits in, and still in > progress. If there is a decision to pursue Dynd I could support that, but > I think we would want to think deeply about how to make the transition as > painless as possible. It would be good at this point to get some feedback > from people currently using dynd. IIRC, part of the reason for starting > dynd was the perception that is was not possible to evolve numpy without > running into compatibility road blocks. Travis, could you perhaps summarize > the thinking that went into the decision to make dynd a separate project? > I think it would be best if Mark Wiebe speaks up here. I can explain why Continuum supported DyND with some fraction of Mark's time for a few years and give my perspective, but ultimately DyND is Mark's story to tell (and a few talented people have now joined him in the effort). Mark Wiebe was a productive NumPy developer. He was one of a few people that jumped in on the code-base and made substantial and significant changes and came to understand just how hard it can be to develop in the NumPy code-base. He also is a C++ developer who really likes the beauty and power of that language (which definitely biases his NumPy work, but he did put a lot of effort into making NumPy better). 
Before Peter and I started Continuum, Mark had begun the DyND project as an example of a general-purpose dynamic array library that could be used by any dynamic language to make arrays. In the early days of Continuum, we spent time from at least Mark W, Bryan Van de Ven, Jay Borque, and Francesc Alted looking at how to extend NumPy to add 1) categorical data-types, 2) variable-length strings, and 3) better date-time types. Bryan, a good developer, who has gone on to be a primary developer of Bokeh spent quite a bit of time and had a prototype of categoricals *nearly* working. He did not like working on the NumPy code-base "at all". He struggled with it and found it very difficult to extend. He worked closely with Mark Wiebe who helped him the best he could. What took him 4 weeks in NumPy took him 3 days in DyND to build. I think that experience, convinced him and Mark W both that working with NumPy code-base would take too long to make significant progress. Also, during 2012 I was trying to help with release-management (though I ended up just hiring Ondrej Certek to actually do the work and he did a great job of getting a release of NumPy out the door --- thanks to much help from many of you). At that point, I realized very clearly, that what I could best do at this point was to try and get more resources for open source and for the NumPy stack rather than work on the code directly. We also did work with several clients that helped me realize just how many disruptive changes had happened from 1.4 to 1.7 for extensive users of NumPy (much more than would be justified from a "we don't break the ABI" mantra that was the stated goal). We also realized that the kind of experimentation we wanted to do in the first 2 years of Continuum would just not be possible on the NumPy code-base and the need for getting community buy-in on every decision would slow us down too much --- as we had to iterate rapidly on so many things and find our center as a startup. It also would not be fair to the NumPy community. Our decision to do *all* of our exploration outside the NumPy code base was basically 1) the kinds of changes we wanted ultimately were potentially dramatic and disruptive, 2) it would be too difficult and time-consuming to decide all things in public discussions with the NumPy community --- especially when some things were experimental 3) tying ourselves to releases of NumPy would be difficult at that time, and 4) the design of the NumPy code-base makes it difficult to contribute to --- both Mark W and Bryan V felt they could make progress *much* faster in a new code-base. Continuum did not have enough start-up funding to devote significant time on DyND in the early days. So Mark rallied what resources he could and we supported him the best we could and he made progress. My only real requirement with sponsoring his work when we did was that it must have a python interface that did not use Boost. He stretched Cython and found a lot of holes in it and that took a bit of his time as well. I think he is now a "just write your own wrapper believer" but I shouldn't put words in his mouth or digress. DyND became part of the Blaze effort once we received DARPA money (though the grant was primarily for Bokeh but we also received permission to use some of the funds for Numba and Blaze development). Because of the other work around Numba and Blaze, DyND work was delayed quite often. 
For the Blaze project, mostly DyND became another implementation of the data-shape data description mechanism and a way to proto-type computed columns and remote arrays (now in Blaze server). The Blaze team struggled for the first 18 months with the lack of a gelled team and a concrete vision for what it should be exactly. Thanks to Andy Terrel, Phillip Cloud, Mark Wiebe, and Matt Rocklin as well as others who are currently on the project, Blaze is now much more clear in its goals as a high-level array and table logical object for scientists, data-scientists, and engineers that can be backed by larger-than-memory (i.e. Dask) and cluster-based computational systems (i.e. Spark and Impala). This clarity was not present as we looked for people to collaborate with and explored the space of code-compilation, delayed evaluation, and data-type-systems that are necessary and useful for distributed array-systems generally. If you look today at Ibis and Bolt-project you see other examples of what Blaze is. I see massive overlap between Blaze and these projects. I think the description of those projects can help you understand Blaze which is why I mention them. In that confusion, Mark continued to make progress on his C++-based container-type (at one point we even called it "Blaze-local") that had the advantage of not requiring a Python-runtime and could fully parse the data-shape data-description system that is a generalization of NumPy dtypes (some on Continuum time, some on his own time). Last year, he attracted the attention of Irwin Zaid who added GPU-computation capability. Last fall, Pandas was able to make DyND an optional dependency because DyND has better support for some of the key things Pandas needs and does not require the full NumPy API. In January, Mark W left Continuum to go back to work in the digital effects industry on his old code-base though he continues to take interest in DyND. A month ago, Continuum began to again sponsor Irwin to work on DyND in order to continue its development at least sufficient to support 1) Pandas and 2) processing of semi-structured data (like a collection of JSON objects). DyND is a bigger system than NumPy (as it doesn't rely on Python at all for its core functionality). The Python-interface has not always been as up to date as it could be and Irwin is currently working on that as well as making it easier to install. I'm sure he would love the help if anyone wants to join him. At the same time in 2012, I became very enamored with Numba and the potential for how Numba could make it possible to not even *have* to depend on a single container library like NumPy. I often say that If Numba and Conda had existed 15 years ago, there would not even *be* a SciPy library. Instead there would be a collection of numba-modules that do all the same things. We might not even have Julia, as well --- but that is a longer and more controversial conversation. With Numba you can write your own array-code as needed. We moved the basic array-type into an llvm specification (llvm_array.py) in old llvm.py: https://github.com/llvmpy/llvmpy/blob/master/llvm_array/array.py. (Note that llvm.py is no longer maintained, though). At this point quite a bit of the NumPy API is implemented outside of NumPy in Numba (there is still much more to do, though). As Numba has developed, I have seen how *both* DyND *and* Numba could independently be an architecture to underly a new array abstraction that could effectively replace NumPy for people. 
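(A rough illustration, not from the message above, of "the NumPy API implemented outside of NumPy in Numba": a jitted function that allocates and fills an array in nopython mode. The function name is made up, and it assumes a Numba version recent enough to support array allocation under @njit.)

    import numpy as np
    from numba import njit

    @njit
    def linspace_like(start, stop, n):
        # the allocation and the loop are compiled; nothing here touches
        # NumPy's C API beyond handing the finished array back to Python
        out = np.empty(n, dtype=np.float64)
        step = (stop - start) / (n - 1)
        for i in range(n):
            out[i] = start + i * step
        return out

    print(linspace_like(0.0, 1.0, 5))   # [0.   0.25 0.5  0.75 1.  ]
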
A combination of the two would be quite powerful -- especially when combined now with Dask. Numba needs 2 things presently before I can confidently say that a numpy module could be built that is fully backwards API compatible with current NumPy in about 6 months (though not necessarily semantically in all corner cases). These 2 things are currently on the near-term Numba road-map: 1) the ability to ship a Python extension module that does not require numba to be installed, and 2) jit-classes (so that you can build native-classes and have that be part of the type-specification. So, basically you have 2 additional options for NumPy future besides what Nathaniel laid out: 1) DyND-based or 2) Numba-based. A combination of the two (DyND for a pre-compiled run-time library) and Numba for JIT extensions is also a corollary. A third approach has even more potential to change super-charge Python 3.X for array-oriented programming. This approach could also be combined with DyND and/or Numba as desired. This approach is to use the fact that the buffer protocol in Python exists and therefore we *can* have more than one array-type. In fact, the basic array-structure exists as the memory-view object in Python (rescued from its unfinished form by Antoine and now supported in Cython). The main problem with it as an underlying array-type for computation 1) it's type-system is low-level struct-string syntax that is hard to build-on and 2) there are no basic computations on memory-views. These are both easily remedied. So, the approach would be to: 1) build a Python-type-to-struct-string syntax translator that would allow you to create memory-views from a Python-based type-system that replaces dtype 2) make a new gufunc sub-system that works with memory-views as containers. I think this would be an interesting project in it's own right and could borrow from current NumPy a great deal --- I think it would be simpler than the re-factor of gufuncs that Nathaniel proposes to enable dtype-information to be available to the low-level multi-methods. You can basically eliminate NumPy with something that provides those 2 things --- and that is potentially something you could rally PyPy and Jython and any other Python implementation behind (rather than numpypy and/or numpy4j). If anyone is interested in pursuing this last idea, please let me know. It hit me like a brick at PyCon this year after talking with Nathaniel about what he wanted to do with dtypes and watching Guido's talk on type-hinting now in Python 3. Finally, as I've been thinking more and more about *big* data and the needs of scaling, I've toned-down my infatuation with "typed pointers" (which NumPy basically is). The real value of "typed pointers" is that there is so much low-level code out there that does interesting things that use "typed pointers" for their basic shared abstraction. However, what we really need shared abstractions around are "typed iterators" and a whole lot of code that uses these "typed iterators" for all kinds of calculations. The problem is that there is no C-ABI equivalent for typed iterators. Where is the BLAS or LAPACK for typed-iterators that doesn't rely on a particular C++ compiler to get the memory-layout?. Every language stack implements iterators in their own way --- so you have silos and not shared abstractions across run-times. The NumPy stack on typed-iterators is now a *whole lot* harder to build. This is part of why I want to see jit-classes on Numba -- I want to end up with a defined ABI for abstractions. 
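(To make the memory-view approach sketched above a bit more concrete -- a minimal toy, not Travis's design; the translator and its type table are invented for illustration. It only covers step 1, turning a Python-level type description into a buffer-protocol format string and viewing raw memory through it; the gufunc layer of step 2 is where the real work would be.)

    from struct import calcsize

    # assumed mapping from Python-level types to struct-module format codes
    _FORMATS = {int: 'q', float: 'd', bool: '?'}

    def to_format(spec):
        """Translate a toy Python-level type spec into a struct format string."""
        if isinstance(spec, dict):              # e.g. {'x': float, 'y': float}
            return ''.join(to_format(t) for t in spec.values())
        return _FORMATS[spec]

    fmt = to_format(float)                      # -> 'd'
    buf = bytearray(calcsize(fmt) * 4)          # raw, untyped bytes
    view = memoryview(buf).cast(fmt)            # typed view over the same memory
    view[0] = 3.14
    print(view.tolist())                        # [3.14, 0.0, 0.0, 0.0]

    # record-like specs yield multi-field format strings; a real system would
    # need the full struct syntax (and computations!) layered on top of this
    print(to_format({'x': float, 'y': float}))  # 'dd'
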
Abstractions are great. Shared abstractions can be *viral* and are exponentially better. We need more of those! My plea to anyone reading this is: Please make more shared abstractions ;-) Of course no one person can make a shared abstraction --- they have to emerge! One person can make abstractions though --- and that is the pre-requisite to getting them adopted by others and therefore shared. I know this is a dump of a lot of information. Some of it might even make sense and perhaps a little bit might be useful to some of you. Now for a blatant plea -- if you are interested in working on NumPy (with ideas from whatever source --- not just mine), please talk to me --- we are hiring and I can arrange for some of your time to be spent contributing to any of these ideas (including what Nathaniel wrote about --- as long as we plan for ABI breakage). Guido offered this for Python, and I will offer it for NumPy --- if you are a woman with the right back-ground I will personally commit to training you to be able to work more on NumPy. But, be warned, working on NumPy is not the path to riches and fame is fleeting ;-) Best, -Travis > > > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *Travis Oliphant* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Wed Aug 26 01:24:21 2015 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 25 Aug 2015 22:24:21 -0700 Subject: [Numpy-discussion] Python extensions for Python 3.5 - useful info... Message-ID: Just an FYI for the upcoming Python release, a very detailed post from Steve Dower, the Microsoft developer who is now in charge of the Windows releases for Python, on how the build process will change in 3.5 regarding extensions: http://stevedower.id.au/blog/building-for-python-3-5/ Cheers, f -- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Aug 26 02:41:16 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Aug 2015 23:41:16 -0700 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: Hi Travis, Thanks for taking the time to write up your thoughts! I have many thoughts in return, but I will try to restrict myself to two main ones :-). 1) On the question of whether work should be directed towards improving NumPy-as-it-is or instead towards a compatibility-breaking replacement: There's plenty of room for debate about whether it's better engineering practice to try and evolve an existing system in place versus starting over, and I guess we have some fundamental disagreements there, but I actually think this debate is a distraction -- we can agree to disagree, because in fact we have to try both. At a practical level: NumPy *is* going to continue to evolve, because it has users and people interested in evolving it; similarly, dynd and other alternatives libraries will also continue to evolve, because they also have people interested in doing it. And at a normative level, this is a good thing! 
If NumPy and dynd both get better, than that's awesome: the worst case is that NumPy adds the new features that we talked about at the meeting, and dynd simultaneously becomes so awesome that everyone wants to switch to it, and the result of this would be... that those NumPy features are exactly the ones that will make the transition to dynd easier. Or if some part of that plan goes wrong, then well, NumPy will still be there as a fallback, and in the mean time we've actually fixed the major pain points our users are begging us to fix. You seem to be urging us all to make a double-or-nothing wager that your extremely ambitious plans will all work out, with the entire numerical Python ecosystem as the stakes. I think this ambition is awesome, but maybe it'd be wise to hedge our bets a bit? 2) You really emphasize this idea of an ABI-breaking (but not API-breaking) release, and I think this must indicate some basic gap in how we're looking at things. Where I'm getting stuck here is that... I actually can't think of anything important that we can't do now, but could if we were allowed to break ABI compatibility. The kinds of things that break ABI but keep API are like... rearranging what order the fields in a struct fall in, or changing the numeric value of opaque constants like NPY_ARRAY_WRITEABLE. The biggest win I can think of is that we could save a few bytes per array by arranging the fields inside the ndarray struct more optimally, but that's hardly a feature to hang a 2.0 on. You seem to have a vision of this ABI-breaking release as being something very different from that, and I'm not clear on what this vision is. The main reason I personally am against having a big ABI-breaking release is not that I hate ABI breakage a priori, it's that all the big features that I care about and the are users are asking for seem to be ones that... don't actually require doing that. At most they seem to get a mild benefit from breaking some obscure corner cases. So the cost/benefits don't make any sense to me. So: can you give a concrete example of a change you have in mind where breaking ABI would be the key enabler? (I guess you might also be thinking of a separate issue that you sort of allude to: Perhaps we will try to make changes which we think don't involve breaking the ABI, but discover too late that we have failed to fully understand the implications and have broken it by mistake. IIUC this is what happened in the 1.4 timeframe when datetime64 was merged and accidentally renumbered some of the NPY_* constants. Partially I am less worried about this because I have a fair amount of confidence that our review and QA process has improved these days to the point that we would not let a change like that slip through by accident -- we have a lot more active reviewers, people are sensitized to the issues, we've successfully landed intrusive changes like Sebastian's indexing rewrite, ... though this is very much second-hand impressions on my part, and I'd welcome input from folks like Chuck who have a clearer view on how things have changed from then to now. But more importantly, even if this is true, then I can't see how your proposal helps. If we aren't good enough at our jobs to predict when we'll break ABI, then by assumption it makes no sense to pick one release and decide that this is the one time that we'll break ABI.) On Tue, Aug 25, 2015 at 12:00 PM, Travis Oliphant wrote: > Thanks for the write-up Nathaniel. There is a lot of great detail and > interesting ideas here. 
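(For readers less familiar with the distinction being drawn here, a small made-up illustration of why reordering struct fields breaks ABI but not API: a compiled extension bakes in byte offsets rather than field names, so only stale binaries misread the struct, and a recompile fixes them. Field names and values below are invented.)

    import struct

    # "version 1" layout of a hypothetical object header: (nd, flags, data_ptr).
    # An extension compiled against v1 hard-codes the byte offset of `flags`.
    OFFSET_OF_FLAGS_V1 = struct.calcsize('i')                # 4 bytes in

    # "version 2" swaps the first two fields: same field names (API intact),
    # different byte offsets (ABI broken).
    v2_header = struct.pack('iiq', 0x0400, 2, 0xdeadbeef)    # (flags, nd, data_ptr)

    # A stale binary still reads offset 4 and silently gets `nd` where it
    # expected `flags`; a rebuilt extension would pick up the new offset.
    stale_read = struct.unpack_from('i', v2_header, OFFSET_OF_FLAGS_V1)[0]
    print(hex(stale_read))                                   # 0x2 -- wrong field
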
> > I've am very eager to understand how to help NumPy and the wider community > move forward however I can (my passions on this have not changed since > 1999, though what I myself spend time on has changed). > > There are a lot of ways to think about approaching this, though. It's > hard to get all the ideas on the table, and it was unfortunate we couldn't > get everybody wyho are core NumPy devs together in person to have this > discussion as there are still a lot of questions unanswered and a lot of > thought that has gone into other approaches that was not brought up or > represented in the meeting (how does Numba fit into this, what about > data-shape, dynd, memory-views and Python type system, etc.). If NumPy > becomes just an interface-specification, then why don't we just do that > *outside* NumPy itself in a way that doesn't jeopardize the stability of > NumPy today. These are some of the real questions I have. I will try > to write up my thoughts in more depth soon, but I won't be able to respond > in-depth right now. I just wanted to comment because Nathaniel said I > disagree which is only partly true. > > The three most important things for me are 1) let's make sure we have > representation from as wide of the community as possible (this is really > hard), 2) let's look around at the broader community and the prior art that > is happening in this space right now and 3) let's not pretend we are going > to be able to make all this happen without breaking ABI compatibility. > Let's just break ABI compatibility with NumPy 2.0 *and* have as much > fidelity with the API and semantics of current NumPy as possible (though > there will be some changes necessary long-term). > > I don't think we should intentionally break ABI if we can avoid it, but I > also don't think we should spend in-ordinate amounts of time trying to > pretend that we won't break ABI (for at least some people), and most > importantly we should not pretend *not* to break the ABI when we actually > do. We did this once before with the roll-out of date-time, and it was > really un-necessary. When I released NumPy 1.0, there were several > things that I knew should be fixed very soon (NumPy was never designed to > not break ABI). Those problems are still there. Now, that we have > quite a bit better understanding of what NumPy *should* be (there have been > tremendous strides in understanding and community size over the past 10 > years), let's actually make the infrastructure we think will last for the > next 20 years (instead of trying to shoe-horn new ideas into a 20-year old > code-base that wasn't designed for it). > > NumPy is a hard code-base. It has been since Numeric days in 1995. I > could be wrong, but my guess is that we will be passed by as a community if > we don't seize the opportunity to build something better than we can build > if we are forced to use a 20 year old code-base. > > It is more important to not break people's code and to be clear when a > re-compile is necessary for dependencies. Those to me are the most > important constraints. There are a lot of great ideas that we all have > about what we want NumPy to be able to do. Some of this are pretty > transformational (and the more exciting they are, the harder I think they > are going to be to implement without breaking at least the ABI). 
There > is probably some CAP-like theorem around > Stability-Features-Speed-of-Development (pick 2) when it comes to Open > Source Software development and making feature-progress with NumPy *is > going* to create in-stability which concerns me. > > I would like to see a little-bit-of-pain one time with a NumPy 2.0, rather > than a constant pain because of constant churn over many years approach > that Nathaniel seems to advocate. To me NumPy 2.0 is an ABI-breaking > release that is as API-compatible as possible and whose semantics are not > dramatically different. > > There are at least 3 areas of compatibility (ABI, API, and semantic). > ABI-compatibility is a non-feature in today's world. There are so many > distributions of the NumPy stack (and conda makes it trivial for anyone to > build their own or for you to build one yourself). Making less-optimal > software-engineering choices because of fear of breaking the ABI is not > something I'm supportive of at all. We should not break ABI every > release, but a release every 3 years that breaks ABI is not a problem. > > API compatibility should be much more sacrosanct, but it is also something > that can also be managed. Any NumPy 2.0 should definitely support the > full NumPy API (though there could be deprecated swaths). I think the > community has done well in using deprecation and limiting the public API to > make this more manageable and I would love to see a NumPy 2.0 that > solidifies a future-oriented API along with a back-ward compatible API that > is also available. > > Semantic compatibility is the hardest. We have already broken this on > multiple occasions throughout the 1.x NumPy releases. Every time you > change the code, this can change. This is what I fear causing deep > instability over the course of many years. These are things like the > casting rule details, the effect of indexing changes, any change to the > calculations approaches. It is and has been the most at risk during any > code-changes. My view is that a NumPy 2.0 (with a new low-level > architecture) minimizes these changes to a single release rather than > unavoidably spreading them out over many, many releases. > > I think that summarizes my main concerns. I will write-up more forward > thinking ideas for what else is possible in the coming weeks. In the mean > time, thanks for keeping the discussion going. It is extremely exciting to > see the help people have continued to provide to maintain and improve > NumPy. It will be exciting to see what the next few years bring as well. > > > Best, > > -Travis > > > > > > > On Tue, Aug 25, 2015 at 5:03 AM, Nathaniel Smith wrote: > >> Hi all, >> >> These are the notes from the NumPy dev meeting held July 7, 2015, at >> the SciPy conference in Austin, presented here so the list can keep up >> with what happens, and so you can give feedback. Please do give >> feedback, none of this is final! >> >> (Also, if anyone who was there notices anything I left out or >> mischaracterized, please speak up -- these are a lot of notes I'm >> trying to gather together, so I could easily have missed something!) >> >> Thanks to Jill Cowan and the rest of the SciPy organizers for donating >> space and organizing logistics for us, and to the Berkeley Institute >> for Data Science for funding travel for Jaime, Nathaniel, and >> Sebastian. 
>> >> >> Attendees >> ========= >> >> Present in the room for all or part: Daniel Allan, Chris Barker, >> Sebastian Berg, Thomas Caswell, Jeff Reback, Jaime Fern?ndez del >> R?o, Chuck Harris, Nathaniel Smith, St?fan van der Walt. (Note: I'm >> pretty sure this list is incomplete) >> >> Joining remotely for all or part: Stephan Hoyer, Julian Taylor. >> >> >> Formalizing our governance/decision making >> ========================================== >> >> This was a major focus of discussion. At a high level, the consensus >> was to steal IPython's governance document ("IPEP 29") and modify it >> to remove its use of a BDFL as a "backstop" to normal community >> consensus-based decision, and replace it with a new "backstop" based >> on Apache-project-style consensus voting amongst the core team. >> >> I'll send out a proper draft of this shortly for further discussion. >> >> >> Development roadmap >> =================== >> >> General consensus: >> >> Let's assume NumPy is going to remain important indefinitely, and >> try to make it better, instead of waiting for something better to >> come along. (This is unlikely to be wasted effort even if something >> better does come along, and it's hardly a sure thing that that will >> happen anyway.) >> >> Let's focus on evolving numpy as far as we can without major >> break-the-world changes (no "numpy 2.0", at least in the foreseeable >> future). >> >> And, as a target for that evolution, let's change our focus from >> numpy as "NumPy is the library that gives you the np.ndarray object >> (plus some attached infrastructure)", to "NumPy provides the >> standard framework for working with arrays and array-like objects in >> Python" >> >> This means, creating defined interfaces between array-like objects / >> ufunc objects / dtype objects, so that it becomes possible for third >> parties to add their own and mix-and-match. Right now ufuncs are >> pretty good at this, but if you want a new array class or dtype then >> in most cases you pretty much have to modify numpy itself. >> >> Vision: instead of everyone who wants a new container type having to >> reimplement all of numpy, Alice can implement an array class using >> (sparse / distributed / compressed / tiled / gpu / out-of-core / >> delayed / ...) storage, pass it to code that was written using >> direct calls to np.* functions, and it just works. (Instead of >> np.sin being "the way you calculate the sine of an ndarray", it's >> "the way you calculate the sine of any array-like container >> object".) >> >> Vision: Darryl can implement a new dtype for (categorical data / >> astronomical dates / integers-with-missing-values / ...) without >> having to touch the numpy core. >> >> Vision: Chandni can then come along and combine them by doing >> >> a = alice_array([...], dtype=darryl_dtype) >> >> and it just works. >> >> Vision: no-one is tempted to subclass ndarray, because anything you >> can do with an ndarray subclass you can also easily do by defining >> your own new class that implements the "array protocol". >> >> >> Supporting third-party array types >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> Sub-goals: >> - Get __numpy_ufunc__ done, which will cover a good chunk of numpy's >> API right there. 
>> - Go through the rest of the stuff in numpy, and figure out some >> story for how to let it handle third-party array classes: >> - ufunc ALL the things: Some things can be converted directly into >> (g)ufuncs and then use __numpy_ufunc__ (e.g., np.std); some >> things could be converted into (g)ufuncs if we extended the >> (g)ufunc interface a bit (e.g. np.sort, np.matmul). >> - Some things probably need their own __numpy_ufunc__-like >> extensions (__numpy_concatenate__?) >> - Provide tools to make it easier to implement the more complicated >> parts of an array object (e.g. the bazillion different methods, >> many of which are ufuncs in disguise, or indexing) >> - Longer-run interesting research project: __numpy_ufunc__ requires >> that one or the other object have explicit knowledge of how to >> handle the other, so to handle binary ufuncs with N array types >> you need something like N**2 __numpy_ufunc__ code paths. As an >> alternative, if there were some interface that an object could >> export that provided the operations nditer needs to efficiently >> iterate over (chunks of) it, then you would only need N >> implementations of this interface to handle all N**2 operations. >> >> This would solve a lot of problems for projects like: >> - blosc >> - dask >> - distarray >> - numpy.ma >> - pandas >> - scipy.sparse >> - xray >> >> >> Supporting third-party dtypes >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> We already have something like a C level "dtype >> protocol". Conceptually, the way you define a new dtype is by >> defining a new class whose instances have data attributes defining >> the parameters of the dtype (what fields are in *this* record dtype, >> how many characters are in *this* string dtype, what units are used >> for *this* datetime64, etc.), and you define a bunch of methods to >> do things like convert an object from a Python object to your dtype >> or vice-versa, to copy an array of your dtype from one place to >> another, to cast to and from your new dtype, etc. This part is >> great. >> >> The problem is, in the current implementation, we don't actually use >> the Python object system to define these classes / attributes / >> methods. Instead, all possible dtypes are jammed into a single >> Python-level class, whose struct has fields for the union of all >> possible dtype's attributes, and instead of Python-style method >> slots there's just a big table of function pointers attached to each >> object. >> >> So the main proposal is that we keep the basic design, but switch it >> so that the float64 dtype, the int64 dtype, etc. actually literally >> are subclasses of np.dtype, each implementing their own fields and >> Python-style methods. >> >> Some of the pieces involved in doing this: >> >> - The current dtype methods should be cleaned up -- e.g. 'dot' and >> 'less_than' are both dtype methods, when conceptually they're much >> more like ufuncs. >> >> - The ufunc inner-loop interface currently does not get a reference >> to the dtype object, so they can't see its attributes and this is >> a big obstacle to many interesting dtypes (e.g., it's hard to >> implement np.equal for categoricals if you don't know what >> categories each has). So we need to add new arguments to the core >> ufunc loop signature. (Fortunately this can be done in a >> backwards-compatible way.) 
>> >> - We need to figure out what exactly the dtype methods should be, >> and add them to the dtype class (possibly with backwards >> compatibility shims for anyone who is accessing PyArray_ArrFuncs >> directly). >> >> - Casting will be possibly the trickiest thing to work out, though >> the basic idea of using dunder-dispatch-like __cast__ and >> __rcast__ methods seems workable. (Encouragingly, this is also >> exactly what dynd also does, though unfortunately dynd does not >> yet support user-defined dtypes even to the extent that numpy >> does, so there isn't much else we can steal from them.) >> - We may also want to rethink the casting rules while we're at it, >> since they have some very weird corners right now (e.g. see >> [https://github.com/numpy/numpy/issues/6240]) >> >> - We need to migrate the current dtypes over to the new system, >> which can be done in stages: >> >> - First stick them all in a single "legacy dtype" class whose >> methods just dispatch to the PyArray_ArrFuncs per-object "method >> table" >> >> - Then move each of them into their own classes >> >> - We should provide a Python-level wrapper for the protocol, so that >> you can call dtype methods from Python >> >> - And vice-versa, it should be possible to subclass dtype at the >> Python level >> >> - etc. >> >> Fortunately, AFAICT pretty much all of this can be done while >> maintaining backwards compatibility (though we may want to break >> some obscure cases to avoid expending *too* much effort with weird >> backcompat contortions that will only help a vanishingly small >> proportion of the userbase), and a lot of the above changes can be >> done as semi-independent mini-projects, so there's no need for some >> branch to go off and spend a year rewriting the world. >> >> Obviously there are still a lot of details to work out, though. But >> overall, there was widespread agreement that this is one of the #1 >> pain points for our users (e.g. it's the single main request from >> pandas), and fixing it is very high priority. >> >> Some features that would become straightforward to implement >> (e.g. even in third-party libraries) if this were fixed: >> - missing value support >> - physical unit tracking (meters / seconds -> array of velocity; >> meters + seconds -> error) >> - better and more diverse datetime representations (e.g. datetimes >> with attached timezones, or using funky geophysical or >> astronomical calendars) >> - categorical data >> - variable length strings >> - strings-with-encodings (e.g. latin1) >> - forward mode automatic differentiation (write a function that >> computes f(x) where x is an array of float64; pass that function >> an array with a special dtype and get out both f(x) and f'(x)) >> - probably others I'm forgetting right now >> >> I should also note that there was one substantial objection to this >> plan, from Travis Oliphant (in discussions later in the >> conference). I'm not confident I understand his objections well >> enough to reproduce them here, though -- perhaps he'll elaborate. >> >> >> Money >> ===== >> >> There was an extensive discussion on the topic of: "if we had money, >> what would we do with it?" >> >> This is partially motivated by the realization that there are a >> number of sources that we could probably get money from, if we had a >> good story for what we wanted to do, so it's not just an idle >> question. >> >> Points of general agreement: >> >> - Doing the in-person meeting was a good thing. We should plan do >> that again, at least once a year. 
So one thing to spend money on >> is travel subsidies to make sure that happens and is productive. >> >> - While it's tempting to imagine hiring junior people for the more >> frustrating/boring work like maintaining buildbots, release >> infrastructure, updating docs, etc., this seems difficult to do >> realistically with our current resources -- how do we hire for >> this, who would manage them, etc.? >> >> - On the other hand, the general feeling was that if we found the >> money to hire a few more senior people who could take care of >> themselves more, then that would be good and we could >> realistically absorb that extra work without totally unbalancing >> the project. >> >> - A major open question is how we would recruit someone for a >> position like this, since apparently all the obvious candidates >> who are already active on the NumPy team already have other >> things going on. [For calibration on how hard this can be: NYU >> has apparently had an open position for a year with the job >> description of "come work at NYU full-time with a >> private-industry-competitive-salary on whatever your personal >> open-source scientific project is" (!) and still is having an >> extremely difficult time filling it: >> [http://cds.nyu.edu/research-engineer/]] >> >> - General consensus though was that there isn't much to be done >> about this though, except try it and see. >> >> - (By the way, if you're someone who's reading this and >> potentially interested in like a postdoc or better working on >> numpy, then let's talk...) >> >> >> More specific changes to numpy that had general consensus, but don't >> really fit into a high-level roadmap >> >> ========================================================================================================= >> >> - Resolved: we should merge multiarray.so and umath.so into a single >> extension module, so that they can share utility code without the >> current awkward contortions. >> >> - Resolved: we should start hiding new fields in the ufunc and dtype >> structs as soon as possible going forward. (I.e. they would not be >> present in the version of the structs that are exposed through the >> C API, but internally we would use a more detailed struct.) >> - Mayyyyyybe we should even go ahead and hide the subset of the >> existing fields that are really internal details that no-one >> should be using. If we did this without changing anything else >> then it would preserve ABI (the fields would still be where >> existing compiled extensions expect them to be, if any such >> extensions exist) while breaking API (trying to compile such >> extensions would give a clear error), so would be a smoother >> ramp if we think we need to eventually break those fields for >> real. (As discussed above, there are a bunch of fields in the >> dtype base class that only make sense for specific dtype >> subclasses, e.g. only record dtypes need a list of field names, >> but right now all dtypes have one anyway. So it would be nice to >> remove these from the base class entirely, but that is >> potentially ABI-breaking.) >> >> - Resolved: np.array should never return an object array unless >> explicitly requested (e.g. with dtype=object); it just causes too >> many surprising problems. >> - First step: add a deprecation warning >> - Eventually: make it an error. >> >> - The matrix class >> - Resolved: We won't add warnings yet, but we will prominently >> document that it is deprecated and should be avoided where-ever >> possible. 
>> - St?fan van der Walt volunteers to do this. >> - We'd all like to deprecate it properly, but the feeling was that >> the precondition for this is for scipy.sparse to provide sparse >> "arrays" that don't return np.matrix objects on ordinary >> operatoins. Until that happens we can't reasonably tell people >> that using np.matrix is a bug. >> >> - Resolved: we should add a similar prominent note to the >> "subclassing ndarray" documentation, warning people that this is >> painful and barely works and please don't do it if you have any >> alternatives. >> >> - Resolved: we want more, smaller releases -- every 6 months at >> least, aiming to go even faster (every 4 months?) >> >> - On the question of using Cython inside numpy core: >> - Everyone agrees that there are places where this would be an >> improvement (e.g., Python<->C interfaces, and places "when you >> want to do computer science", e.g. complicated algorithmic stuff >> like graph traversals) >> - Chuck wanted it to be clear though that he doesn't think it >> would be a good goal to try and rewrite all of numpy in Cython >> -- there also exist places where Cython ends up being "an uglier >> version of C". No-one disagreed. >> >> - Our text reader is apparently not very functional on Python 3, and >> generally slow and hard to work with. >> - Resolved: We should extract Pandas's awesome text reader/parser >> and convert it into its own package, that could then become a >> new backend for both pandas and numpy.loadtxt. >> - Jeff thinks this is a great idea >> - Thomas Caswell volunteers to do the extraction. >> >> - We should work on improving our tools for evolving the ABI, so >> that we will eventually be less constrained by decisions made >> decades ago. >> - One idea that had a lot of support was to switch from our >> current append-only C-API to a "sliding window" API based on >> explicit versions. So a downstream package might say >> >> #define NUMPY_API_VERSION 4 >> >> and they'd get the functions and behaviour provided in "version >> 4" of the numpy C api. If they wanted to get access to new stuff >> that was added in version 5, then they'd need to switch that >> #define, and at the same time clean up any usage of stuff that >> was removed or changed in version 5. And to provide a smooth >> migration path, one version of numpy would support multiple >> versions at once, gradually deprecating and dropping old >> versions. >> >> - If anyone wants to help bring pip up to scratch WRT tracking ABI >> dependencies (e.g., 'pip install numpy==' >> -> triggers rebuild of scipy against the new ABI), then that >> would be an extremely useful thing. >> >> >> Policies that should be documented >> ================================== >> >> ...together with some notes about what the contents of the document >> should be: >> >> >> How we manage bugs in the bug tracker. >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> - Github "milestones" should *only* be assigned to release-blocker >> bugs (which mostly means "regression from the last release"). >> >> In particular, if you're tempted to push a bug forward to the next >> release... then it's clearly not a blocker, so don't set it to the >> next release's milestone, just remove the milestone entirely. >> >> (Obvious exception to this: deprecation followup bugs where we >> decide that we want to keep the deprecation around a bit longer >> are a case where a bug actually does switch from being a blocker >> to release 1.x to being a blocker for release 1.(x+1).) 
>> >> - Don't hesitate to close an issue if there's no way forward -- >> e.g. a PR where the author has disappeared. Just post a link to >> this policy and close, with a polite note that we need to keep our >> tracker useful as a todo list, but they're welcome to re-open if >> things change. >> >> >> Deprecations and breakage policy: >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> - How long do we need to keep DeprecationWarnings around before we >> break things? This is tricky because on the one hand an aggressive >> (short) deprecation period lets us deliver new features and >> important cleanups more quickly, but on the other hand a >> too-aggressive deprecation period is difficult for our more >> conservative downstream users. >> >> - Idea that had the most support: pick a somewhat-aggressive >> warning period as our default, and make a rule that if someone >> asks for an extension during the beta cycle for the release that >> removes it, then we put it back for another release or two worth >> of grace period. (While also possibly upgrading the warning to >> be more visible during the grace period.) This gives us >> deprecation periods that are more adaptive on a case-by-case >> basis. >> >> - Lament: it would be really nice if we could get more people to >> test our beta releases, because in practice right now 1.x.0 ends >> up being where we actually the discover all the bugs, and 1.x.1 is >> where it actually becomes usable. Which sucks, and makes it >> difficult to have a solid policy about what counts as a >> regression, etc. Is there anything we can do about this? >> >> - ABI breakage: we distinguish between an ABI break that breaks >> everything (e.g., "import scipy" segfaults), versus an ABI break >> that breaks an occasional rare case (e.g., only apps that poke >> around in some obscure corner of some struct are affected). >> >> - The "break-the-world" type remains off-limit for now: the pain >> is still too large (conda helps, but there are lots of people >> who don't use conda!), and there aren't really any compelling >> improvements that this would enable anyway. >> >> - For the "break-0.1%-of-users" type, it is *not* ruled out by >> fiat, though we remain conservative: we should treat it like >> other API breaks in principle, and do a careful case-by-case >> analysis of the details of the situation, taking into account >> what kind of code would be broken, how common these cases are, >> how important the benefits are, whether there are any specific >> mitigation strategies we can use, etc. -- with this process of >> course taking into account that a segfault is nastier than a >> Python exception. >> >> >> Other points that were discussed >> ================================ >> >> - There was inconclusive discussion of what we should do with dot() >> in the places where it disagrees with the PEP 465 matmul semantics >> (specifically this is when both arguments have ndim >= 3, or one >> argument has ndim == 0). >> - The concern is that the current behavior is not very useful, and >> as far as we can tell no-one is using it; but, as people get >> used to the more-useful PEP 465 behavior, they will increasingly >> try to use it on the assumption that np.dot will work the same >> way, and this will create pain for lots of people. So Nathaniel >> argued that we should start at least issuing a visible warning >> when people invoke the corner-case behavior. 
>> - But OTOH, np.dot is such a core piece of infrastructure, and >> there's such a large landscape of code out there using numpy >> that we can't see, that others were reasonably wary of making >> any change. >> - For now: document prominently, but no change in behavior. >> >> >> Links to raw notes >> ================== >> >> Main page: >> [https://github.com/numpy/numpy/wiki/SciPy-2015-developer-meeting] >> >> Notes from the meeting proper: >> [ >> https://docs.google.com/document/d/1IJcYdsHtk8MVAM4AZqFDBSf_nVG-mrB4Tv2bh9u1g4Y/edit?usp=sharing >> ] >> >> Slides from the followup BoF: >> [ >> https://gist.github.com/njsmith/eb42762054c88e810786/raw/b74f978ce10a972831c582485c80fb5b8e68183b/future-of-numpy-bof.odp >> ] >> >> Notes from the followup BoF: >> [ >> https://docs.google.com/document/d/11AuTPms5dIPo04JaBOWEoebXfk-tUzEZ-CvFnLIt33w/edit >> ] >> >> -n >> >> -- >> Nathaniel J. Smith -- http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > > *Travis Oliphant* > *Co-founder and CEO* > > > @teoliphant > 512-222-5440 > http://www.continuum.io > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Nathaniel J. Smith -- http://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Aug 26 02:42:10 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Aug 2015 23:42:10 -0700 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: <20150825212159.39cee394@fsol> References: <20150825212159.39cee394@fsol> Message-ID: On Tue, Aug 25, 2015 at 12:21 PM, Antoine Pitrou wrote: > On Tue, 25 Aug 2015 03:03:41 -0700 > Nathaniel Smith wrote: >> >> Supporting third-party dtypes >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> > [...] >> >> Some features that would become straightforward to implement >> (e.g. even in third-party libraries) if this were fixed: >> - missing value support >> - physical unit tracking (meters / seconds -> array of velocity; >> meters + seconds -> error) >> - better and more diverse datetime representations (e.g. datetimes >> with attached timezones, or using funky geophysical or >> astronomical calendars) >> - categorical data >> - variable length strings >> - strings-with-encodings (e.g. latin1) >> - forward mode automatic differentiation (write a function that >> computes f(x) where x is an array of float64; pass that function >> an array with a special dtype and get out both f(x) and f'(x)) >> - probably others I'm forgetting right now > > It should also be the opportunity to streamline datetime64 and > timedelta64 dtypes. Currently the unit information is IIRC hidden in > some weird metadata thing called the PyArray_DatetimeMetaData. Yeah, and PyArray_DatetimeMetaData is an "NpyAuxData", which is its own personal little object system implemented in C with its own reference counting system... the design of dtypes has great bones, but the current implementation has a lot of, um, historical baggage. -n -- Nathaniel J. 
Smith -- http://vorpus.org From njs at pobox.com Wed Aug 26 02:59:30 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Aug 2015 23:59:30 -0700 Subject: [Numpy-discussion] testing numpy with downstream testsuites (was: Re: Notes from the numpy dev meeting at scipy 2015) Message-ID: [Popping this off to its own thread to try and keep things easier to follow] On Tue, Aug 25, 2015 at 9:52 AM, Nathan Goldbaum wrote: >> - Lament: it would be really nice if we could get more people to >> test our beta releases, because in practice right now 1.x.0 ends >> up being where we actually the discover all the bugs, and 1.x.1 is >> where it actually becomes usable. Which sucks, and makes it >> difficult to have a solid policy about what counts as a >> regression, etc. Is there anything we can do about this? > > Just a note in here - have you all thought about running the test suites for > downstream projects as part of the numpy test suite? I don't think it came up, but it's not a bad idea! The main problems I can foresee are: 1) Since we don't know the downstream code, it can be hard to interpret test suite failures. OTOH for changes we're uncertain of we already do often end up running some downstream test suites by hand, so it can only be an improvement on that... 2) Sometimes everyone including downstream agrees that breaking something is actually a good idea and they should just deal, but what do you do then? These both seem solvable though. I guess a good strategy would be to compile a travis-compatible wheel of $PACKAGE version $latest-stable against numpy 1.x, and then in the 1.(x+1) development period numpy would have an additional travis run which, instead of running the numpy test suite, instead does: pip install . pip install $PACKAGE-$latest-stable.whl python -c 'import package; package.test()' # adjust as necessary ? Where $PACKAGE is something like scipy / pandas / astropy / ... matplotlib would be nice but maybe impractical...? Maybe someone else will have objections but it seems like a reasonable idea to me. Want to put together a PR? Asides from fame and fortune and our earnest appreciation, your reward is you get to make sure that the packages you care about are included so that we break them less often in the future ;-). -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Wed Aug 26 03:05:41 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Aug 2015 00:05:41 -0700 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 5:53 PM, David Cournapeau wrote: > Thanks for the good summary Nathaniel. > > Regarding dtype machinery, I agree casting is the hardest part. Unless the > code has changed dramatically, this was the main reason why you could not > make most of the dtypes separate from numpy codebase (I tried to move the > datetime dtype out of multiarray into a separate C extension some years > ago). Being able to separate the dtypes from the multiarray module would be > an obvious way to drive the internal API change. For practical reasons I don't imagine we'll ever want to actually move the core dtypes out of multiarray -- if nothing else they will always remain a little bit special, like np.array([1.0, 2.0]) will just "know" that this should use the float64 dtype. But yeah, in general a good heuristic would be that -- aside from a few limited cases like that -- we want to make built-in dtypes and user-defined dtypes use the same APIs. 
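(A toy picture of what "the same APIs" could mean, in the spirit of the dtype redesign from the quoted meeting notes -- a self-contained mock, not NumPy code, with all names invented: each dtype is an ordinary class carrying its own parameters and methods, instead of one slot in a giant table of function pointers.)

    import struct

    class DType:
        """Toy stand-in for a dtype base class; the real design is TBD."""
        itemsize = None
        def to_python(self, raw): raise NotImplementedError
        def from_python(self, obj): raise NotImplementedError

    class Float64(DType):
        itemsize = 8
        def to_python(self, raw): return struct.unpack('<d', raw)[0]
        def from_python(self, obj): return struct.pack('<d', float(obj))

    class Categorical(DType):
        itemsize = 1
        def __init__(self, categories):
            self.categories = list(categories)    # per-instance dtype parameters
        def to_python(self, raw): return self.categories[raw[0]]
        def from_python(self, obj): return bytes([self.categories.index(obj)])

    c = Categorical(['red', 'green', 'blue'])
    print(c.to_python(c.from_python('green')))                # 'green'
    print(Float64().to_python(Float64().from_python(2.5)))    # 2.5
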
> Regarding the use of cython in numpy, was there any discussion about the > compilation/size cost of using cython, and talking to the cython team to > improve this ? Or was that considered acceptable with current cython for > numpy. I am convinced cleanly separating the low level parts from the python > C API plumbing would be the single most important thing one could do to make > the codebase more amenable. It's still a more blue-sky idea than that... the discussion was more at the level of "is this something that is even worth trying to make work and seeing where the problems are?" The big immediate problem, before we got into code size issues, would be that we would need to be able to compile a mix of .pyx files and .c files into a single .so, while cython generated code currently makes some strong assumptions about how each .pyx file will live in its own .so. From playing around with it I suspect the first version of making this work will be klugey indeed. But yeah, the thing to do would be for someone to dig in and make the kluges and then decide how to clean them up once you know where they are. -n -- Nathaniel J. Smith -- http://vorpus.org From solipsis at pitrou.net Wed Aug 26 04:28:22 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 26 Aug 2015 10:28:22 +0200 Subject: [Numpy-discussion] SHA256 mismatch on SourceForge downloads Message-ID: <20150826102822.40f83fef@fsol> Hello, The SourceForge download page for 1.10.0b1 mentions: 89e467cec774527dd254c1e039801726db1367433053801f0d8bc68deac74009 numpy-1.10.0b1.tar.gz But after downloading the file I get: $ sha256sum numpy-1.10.0b1.tar.gz 855695405092686264dc8ce7b3f5c939a6cf1a5639833e841a5bb6fb799cd6a8 numpy-1.10.0b1.tar.gz Also, since SouceForge doesn't provide any HTTPS downloads (it actually redirects HTTPS to HTTP (*)), this all looks a bit pointless. (*) seems like SourceForge is becoming a poster child of worst practices... Regards Antoine. From sebastian at sipsolutions.net Wed Aug 26 04:57:57 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 26 Aug 2015 10:57:57 +0200 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: <1440579477.20263.60.camel@sipsolutions.net> On Mi, 2015-08-26 at 00:05 -0700, Nathaniel Smith wrote: > On Tue, Aug 25, 2015 at 5:53 PM, David Cournapeau wrote: > > Thanks for the good summary Nathaniel. > > > > Regarding dtype machinery, I agree casting is the hardest part. Unless the > > code has changed dramatically, this was the main reason why you could not > > make most of the dtypes separate from numpy codebase (I tried to move the > > datetime dtype out of multiarray into a separate C extension some years > > ago). Being able to separate the dtypes from the multiarray module would be > > an obvious way to drive the internal API change. > > For practical reasons I don't imagine we'll ever want to actually move > the core dtypes out of multiarray -- if nothing else they will always > remain a little bit special, like np.array([1.0, 2.0]) will just > "know" that this should use the float64 dtype. But yeah, in general a > good heuristic would be that -- aside from a few limited cases like > that -- we want to make built-in dtypes and user-defined dtypes use > the same APIs. > Well, casting is the conceptional hardest part. Marrying it to the rest of numpy is probably just as hard ;). With the chance of not having thought this through enough, maybe some points about the general discussion. 
I think I would like some more clarity of what we want and especially *need* [1]. From SciPy, there were two things I particularly remember: 1. the dtype/scalar issue 2. making an interface to make array-likes interaction more sane (this I think can go quite far, and we are already going part of it) The dtypes/scalars seem a particularly dark corner of numpy and if it is feasible for us to replace it with something new, then I would be willing to do some breaks for it (admittingly, given protest, I would back down from that and another solution would be needed). The point for me is, I currently think a dtype/scalar could get numpy a big way, especially from the point of view of downstream packages. Of course it would be harder to do in numpy then in something new, but it should also be of much more immediate use. Maybe I am going a bit too far with this right now, but I could imagine that if we cannot clean up the dtype/scalars, numpy may indeed be doomed or at least a brick slowing down a lot of other people. And if it is not possible to do this without a numpy 2, then likely that is the way to go. But I am not convinced we should aim to fix all the other stuff at the same time. I am afraid it would just accumulate to grow over everyones heads. In other words, I think if we can muster the resources I would like to see this problem attacked within numpy. If this proves impossible a new dtype abstraction may well be reason for numpy 2, or used by a DyND or similar? But I do believe we should not give up on Numpy here from the start, at least I do not see a compelling reason to do. Instead giving up on numpy seems like the last way out of a misery. And much of the different opinions to me seem to be whether we think this will clearly happen or not or has already happened (or maybe whether it is too costly to do in numpy). Cleaning it up, would open doors to many things. Note that I think it would make the numpy source much less scary, because I think it is the one big piece of code that is maybe not clearly a separate chunk [2]. After making it sane, I would argue that numpy does become much more maintainable and extensible. From my current view, probably enough so for a long time. Also, I think it would give us abstraction to make different/new projects work together better and if done well enough, some grand new project set to replace numpy could reuse it. Of course it is entirely possible that more things need to be changed in numpy and that some others would be just as hard or even harder to do. But if we can identify this as the "one big thing that gets us 90%" then I refuse to give up hope of doing it in numpy just yet. - Sebastian [1] Travis has said quite a lot about it, but it is not yet clear to me what is a priority/real pain point. Take "datashape" for example. By now I think that the datashape is likely a good idea to make structured arrays nicer, since it moves the "structured" part into the array object and not the dtype, which makes sense to me. However, I am not convinced that the datashape is something that would make numpy a compelling amount better. In fact I could imagine that for many things it would make it unnecessarily more complicated for users. [2] Take indexing, I like to think I did not break that much when redoing it (except on purpose, which I hope did not create much trouble). In some sense indexing was simple to redo, because it does not overlap at all with anything else directly. 
If we get dtypes/scalars more separated, I think we are at a point where this is possible with pretty much any part of numpy. > > Regarding the use of cython in numpy, was there any discussion about the > > compilation/size cost of using cython, and talking to the cython team to > > improve this ? Or was that considered acceptable with current cython for > > numpy. I am convinced cleanly separating the low level parts from the python > > C API plumbing would be the single most important thing one could do to make > > the codebase more amenable. > > It's still a more blue-sky idea than that... the discussion was more > at the level of "is this something that is even worth trying to make > work and seeing where the problems are?" > > The big immediate problem, before we got into code size issues, would > be that we would need to be able to compile a mix of .pyx files and .c > files into a single .so, while cython generated code currently makes > some strong assumptions about how each .pyx file will live in its own > .so. From playing around with it I suspect the first version of making > this work will be klugey indeed. But yeah, the thing to do would be > for someone to dig in and make the kluges and then decide how to clean > them up once you know where they are. > > -n > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From jtaylor.debian at googlemail.com Wed Aug 26 05:17:10 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Aug 2015 11:17:10 +0200 Subject: [Numpy-discussion] SHA256 mismatch on SourceForge downloads In-Reply-To: <20150826102822.40f83fef@fsol> References: <20150826102822.40f83fef@fsol> Message-ID: <55DD8416.1090609@googlemail.com> The file is also not signed so the checksums are not trustworthy anyway. Please sign the releases as we did in the past. On 08/26/2015 10:28 AM, Antoine Pitrou wrote: > > Hello, > > The SourceForge download page for 1.10.0b1 mentions: > > 89e467cec774527dd254c1e039801726db1367433053801f0d8bc68deac74009 > numpy-1.10.0b1.tar.gz > > But after downloading the file I get: > > $ sha256sum numpy-1.10.0b1.tar.gz > 855695405092686264dc8ce7b3f5c939a6cf1a5639833e841a5bb6fb799cd6a8 > numpy-1.10.0b1.tar.gz > > > Also, since SouceForge doesn't provide any HTTPS downloads (it > actually redirects HTTPS to HTTP (*)), this all looks a bit pointless. > > (*) seems like SourceForge is becoming a poster child of worst > practices... > > Regards > > Antoine. From jenshnielsen at gmail.com Wed Aug 26 06:38:39 2015 From: jenshnielsen at gmail.com (Jens Nielsen) Date: Wed, 26 Aug 2015 10:38:39 +0000 Subject: [Numpy-discussion] testing numpy with downstream testsuites (was: Re: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: Message-ID: As a Matplotlib developer I try to test our code manually with all betas and rc of new numpy versions. (And already pushed fixed a few new deprecation warnings with 1.10beta1 which otherwise passes our test suite. I forgot to report this back since there were no issues to report ) However, we could actually do this automatically if numpy betas were uploaded as prereleases on pypi. 
We are already using Travis's allow failure mode to test python 3.5 betas and rc's along with all our dependencies installed with `pip --pre` https://pip.pypa.io/en/latest/reference/pip_install.html#pre-release-versions Putting prereleases on pypi would thus automate most of the testing of new Numpy versions for us. Best Jens ons. 26. aug. 2015 kl. 07.59 skrev Nathaniel Smith : > [Popping this off to its own thread to try and keep things easier to > follow] > > On Tue, Aug 25, 2015 at 9:52 AM, Nathan Goldbaum > wrote: > >> - Lament: it would be really nice if we could get more people to > >> test our beta releases, because in practice right now 1.x.0 ends > >> up being where we actually the discover all the bugs, and 1.x.1 is > >> where it actually becomes usable. Which sucks, and makes it > >> difficult to have a solid policy about what counts as a > >> regression, etc. Is there anything we can do about this? > > > > Just a note in here - have you all thought about running the test suites > for > > downstream projects as part of the numpy test suite? > > I don't think it came up, but it's not a bad idea! The main problems I > can foresee are: > 1) Since we don't know the downstream code, it can be hard to > interpret test suite failures. OTOH for changes we're uncertain of we > already do often end up running some downstream test suites by hand, > so it can only be an improvement on that... > 2) Sometimes everyone including downstream agrees that breaking > something is actually a good idea and they should just deal, but what > do you do then? > > These both seem solvable though. > > I guess a good strategy would be to compile a travis-compatible wheel > of $PACKAGE version $latest-stable against numpy 1.x, and then in the > 1.(x+1) development period numpy would have an additional travis run > which, instead of running the numpy test suite, instead does: > pip install . > pip install $PACKAGE-$latest-stable.whl > python -c 'import package; package.test()' # adjust as necessary > ? Where $PACKAGE is something like scipy / pandas / astropy / ... > matplotlib would be nice but maybe impractical...? > > Maybe someone else will have objections but it seems like a reasonable > idea to me. Want to put together a PR? Asides from fame and fortune > and our earnest appreciation, your reward is you get to make sure that > the packages you care about are included so that we break them less > often in the future ;-). > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Wed Aug 26 07:14:19 2015 From: faltet at gmail.com (Francesc Alted) Date: Wed, 26 Aug 2015 13:14:19 +0200 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: Hi, Thanks Nathaniel and others for sparking this discussion as I think it is very timely. 2015-08-25 12:03 GMT+02:00 Nathaniel Smith : > Let's focus on evolving numpy as far as we can without major > break-the-world changes (no "numpy 2.0", at least in the foreseeable > future). 
> > And, as a target for that evolution, let's change our focus from > numpy as "NumPy is the library that gives you the np.ndarray object > (plus some attached infrastructure)", to "NumPy provides the > standard framework for working with arrays and array-like objects in > Python" >
Sorry to disagree here, but in my opinion NumPy *already* provides the standard framework for working with arrays and array-like objects in Python, as its huge popularity shows. If what you mean is that there are too many efforts trying to provide other, specialized data containers (things like DataFrame in pandas, DataArray/Dataset in xarray or carray/ctable in bcolz, just to mention a few), then let me say that I am of the opinion that there can't be a silver bullet for tackling all the problems that the PyData community is facing.

The libraries using specialized data containers (pandas, xray, bcolz...) may have more or less machinery on top of them, so that conversion to NumPy does not necessarily happen internally (many times we don't want conversions, for efficiency), but it is the capability of producing NumPy arrays out of them (or parts of them) that makes these specialized containers incredibly more useful to users, because they can use NumPy to fill the missing gaps, or just use NumPy as an intermediate container that acts as input for other libraries.

On the subject of why I don't think a universal data container is feasible for PyData, you just have to look at how many data structures Python provides in the language itself (tuples, lists, dicts, sets...), and how many are added in the standard library (like those in the collections sub-package). Every data container is designed to do a couple of things (maybe three) well, and for other use cases it is the responsibility of the user to choose the most appropriate one depending on her needs. In the same vein, I also think that it makes little sense to try to come up with a standard solution that is going to satisfy everyone's needs. IMHO, and despite all efforts, neither NumPy, NumPy 2.0, DyND, bcolz nor any other is going to offer the universal data container.

Instead of that, let me summarize what users/developers like me need from NumPy in order to continue creating more specialized data containers:

1) Keep NumPy simple. NumPy is truly the cornerstone of PyData right now, and it will be for the foreseeable future, so please keep it usable and *minimal*. Before adding any more features, the increase in complexity should be carefully weighed.

2) Make NumPy more flexible. Any rewrite that allows arrays or dtypes to be subclassed and extended more easily will be a huge win. *But* if, in order to allow flexibility, you have to make NumPy much more complex, then point 1) should prevail.

3) Make NumPy a sustainable project. Historically NumPy has depended on the heroic efforts of individuals to make it what it is now: *an industry standard*. But individual efforts, while laudable, are not enough, so please, please, please continue the effort of constituting a governance team that ensures the future of NumPy (and with it, the whole PyData community).

Finally, the question of whether NumPy 2.0 or projects like DyND should be chosen instead for implementing new features is still legitimate, and while I have my own opinions (favourable to DyND), I still see (such is the price of technological debt) a distant future where we will find NumPy as we know it, allowing more innovation to happen in the Python data space.
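To make the point above about specialized containers producing NumPy arrays concrete: a container only has to implement the array protocol for np.asarray() and most NumPy functions to consume it. A toy sketch, not any real library's API:

```
import numpy as np

class MyColumn:
    """Toy stand-in for a pandas/bcolz-style container column."""
    def __init__(self, data):
        self._data = list(data)

    def __array__(self, dtype=None):
        # Hand NumPy an ndarray view of the data on demand.
        return np.array(self._data, dtype=dtype)

col = MyColumn([1.0, 2.5, 4.0])
a = np.asarray(col)   # conversion goes through __array__
b = np.sin(col)       # ufuncs coerce unfamiliar inputs the same way
```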
Again, thanks to all those braves that are allowing others to build on top of NumPy's shoulders. -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Aug 26 07:14:29 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 Aug 2015 12:14:29 +0100 Subject: [Numpy-discussion] testing numpy with downstream testsuites (was: Re: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: Message-ID: Hi, On Wed, Aug 26, 2015 at 7:59 AM, Nathaniel Smith wrote: > [Popping this off to its own thread to try and keep things easier to follow] > > On Tue, Aug 25, 2015 at 9:52 AM, Nathan Goldbaum wrote: >>> - Lament: it would be really nice if we could get more people to >>> test our beta releases, because in practice right now 1.x.0 ends >>> up being where we actually the discover all the bugs, and 1.x.1 is >>> where it actually becomes usable. Which sucks, and makes it >>> difficult to have a solid policy about what counts as a >>> regression, etc. Is there anything we can do about this? >> >> Just a note in here - have you all thought about running the test suites for >> downstream projects as part of the numpy test suite? > > I don't think it came up, but it's not a bad idea! The main problems I > can foresee are: > 1) Since we don't know the downstream code, it can be hard to > interpret test suite failures. OTOH for changes we're uncertain of we > already do often end up running some downstream test suites by hand, > so it can only be an improvement on that... > 2) Sometimes everyone including downstream agrees that breaking > something is actually a good idea and they should just deal, but what > do you do then? > > These both seem solvable though. > > I guess a good strategy would be to compile a travis-compatible wheel > of $PACKAGE version $latest-stable against numpy 1.x, and then in the > 1.(x+1) development period numpy would have an additional travis run > which, instead of running the numpy test suite, instead does: > pip install . > pip install $PACKAGE-$latest-stable.whl > python -c 'import package; package.test()' # adjust as necessary > ? Where $PACKAGE is something like scipy / pandas / astropy / ... > matplotlib would be nice but maybe impractical...? > > Maybe someone else will have objections but it seems like a reasonable > idea to me. Want to put together a PR? Asides from fame and fortune > and our earnest appreciation, your reward is you get to make sure that > the packages you care about are included so that we break them less > often in the future ;-). 
One simple way to get going would be for the release manager to trigger a build from this repo: https://github.com/matthew-brett/travis-wheel-builder This build would then upload a wheel to: http://travis-wheels.scikit-image.org/ The upstream packages would have a test grid which included an entry with something like: pip install -f http://travis-wheels.scikit-image.org --pre numpy Cheers, Matthew From charlesr.harris at gmail.com Wed Aug 26 08:42:15 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Aug 2015 06:42:15 -0600 Subject: [Numpy-discussion] SHA256 mismatch on SourceForge downloads In-Reply-To: <20150826102822.40f83fef@fsol> References: <20150826102822.40f83fef@fsol> Message-ID: On Wed, Aug 26, 2015 at 2:28 AM, Antoine Pitrou wrote: > > Hello, > > The SourceForge download page for 1.10.0b1 mentions: > > 89e467cec774527dd254c1e039801726db1367433053801f0d8bc68deac74009 > numpy-1.10.0b1.tar.gz > > But after downloading the file I get: > > $ sha256sum numpy-1.10.0b1.tar.gz > 855695405092686264dc8ce7b3f5c939a6cf1a5639833e841a5bb6fb799cd6a8 > numpy-1.10.0b1.tar.gz > > > Also, since SouceForge doesn't provide any HTTPS downloads (it > actually redirects HTTPS to HTTP (*)), this all looks a bit pointless. > > (*) seems like SourceForge is becoming a poster child of worst > practices... > > I know what happened there The original tarball generated by numpy-vendor was missing a file, so I uploaded new tar and zip files but neglected to change the sha256 signature. My bad. I'll try to do better for the 1.10.0rc1 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Wed Aug 26 09:03:45 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Wed, 26 Aug 2015 09:03:45 -0400 Subject: [Numpy-discussion] testing numpy with downstream testsuites (was: Re: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: Message-ID: Pandas has for quite a while has a travis build where we install numpy master and then run our test suite. e.g. here: https://travis-ci.org/pydata/pandas/jobs/77256007 Over the last year this has uncovered a couple of changes which affected pandas (mainly using something deprecated which was turned off :) This was pretty simple to setup. Note that this adds 2+ minutes to the build (though our builds take a while anyhow so its not a big deal). On Wed, Aug 26, 2015 at 7:14 AM, Matthew Brett wrote: > Hi, > > On Wed, Aug 26, 2015 at 7:59 AM, Nathaniel Smith wrote: > > [Popping this off to its own thread to try and keep things easier to > follow] > > > > On Tue, Aug 25, 2015 at 9:52 AM, Nathan Goldbaum > wrote: > >>> - Lament: it would be really nice if we could get more people to > >>> test our beta releases, because in practice right now 1.x.0 ends > >>> up being where we actually the discover all the bugs, and 1.x.1 is > >>> where it actually becomes usable. Which sucks, and makes it > >>> difficult to have a solid policy about what counts as a > >>> regression, etc. Is there anything we can do about this? > >> > >> Just a note in here - have you all thought about running the test > suites for > >> downstream projects as part of the numpy test suite? > > > > I don't think it came up, but it's not a bad idea! The main problems I > > can foresee are: > > 1) Since we don't know the downstream code, it can be hard to > > interpret test suite failures. 
OTOH for changes we're uncertain of we > > already do often end up running some downstream test suites by hand, > > so it can only be an improvement on that... > > 2) Sometimes everyone including downstream agrees that breaking > > something is actually a good idea and they should just deal, but what > > do you do then? > > > > These both seem solvable though. > > > > I guess a good strategy would be to compile a travis-compatible wheel > > of $PACKAGE version $latest-stable against numpy 1.x, and then in the > > 1.(x+1) development period numpy would have an additional travis run > > which, instead of running the numpy test suite, instead does: > > pip install . > > pip install $PACKAGE-$latest-stable.whl > > python -c 'import package; package.test()' # adjust as necessary > > ? Where $PACKAGE is something like scipy / pandas / astropy / ... > > matplotlib would be nice but maybe impractical...? > > > > Maybe someone else will have objections but it seems like a reasonable > > idea to me. Want to put together a PR? Asides from fame and fortune > > and our earnest appreciation, your reward is you get to make sure that > > the packages you care about are included so that we break them less > > often in the future ;-). > > One simple way to get going would be for the release manager to > trigger a build from this repo: > > https://github.com/matthew-brett/travis-wheel-builder > > This build would then upload a wheel to: > > http://travis-wheels.scikit-image.org/ > > The upstream packages would have a test grid which included an entry > with something like: > > pip install -f http://travis-wheels.scikit-image.org --pre numpy > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Aug 26 09:11:41 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 26 Aug 2015 15:11:41 +0200 Subject: [Numpy-discussion] 1.10.0rc1 References: Message-ID: <20150826151141.17db3046@fsol> On Tue, 25 Aug 2015 10:26:02 -0600 Charles R Harris wrote: > Hi All, > > The silence after the 1.10 beta has been eerie. Consequently, I'm thinking > of making a first release candidate this weekend. If you haven't yet tested > the beta, please do so. It would be good to discover as many problems as we > can before the first release. Has typing of ufunc parameters become much stricter? I can't find anything in the release notes, but see (1.10b1): >>> arr = np.linspace(0, 5, 10) >>> out = np.empty_like(arr, dtype=np.intp) >>> np.round(arr, out=out) Traceback (most recent call last): File "", line 1, in File "/home/antoine/np110/lib/python3.4/site-packages/numpy/core/fromnumeric.py", line 2778, in round_ return round(decimals, out) TypeError: ufunc 'rint' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind'' It used to work (1.9): >>> arr = np.linspace(0, 5, 10) >>> out = np.empty_like(arr, dtype=np.intp) >>> np.round(arr, out=out) array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) >>> out array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) Regards Antoine. 
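For code that trips over this under the stricter default, one way to adapt is to spell out the lossy cast; a sketch against the example above (the traceback shows that np.round on a float array goes through the rint ufunc):

```
import numpy as np

# Same setup as the example in the previous message.
arr = np.linspace(0, 5, 10)
out = np.empty_like(arr, dtype=np.intp)

# Request the float -> integer cast explicitly at the ufunc level...
np.rint(arr, out=out, casting='unsafe')

# ...or keep np.round and let assignment perform the cast instead.
out[...] = np.round(arr)
```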
From charlesr.harris at gmail.com Wed Aug 26 09:31:45 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Aug 2015 07:31:45 -0600 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: <20150826151141.17db3046@fsol> References: <20150826151141.17db3046@fsol> Message-ID: On Wed, Aug 26, 2015 at 7:11 AM, Antoine Pitrou wrote: > On Tue, 25 Aug 2015 10:26:02 -0600 > Charles R Harris wrote: > > Hi All, > > > > The silence after the 1.10 beta has been eerie. Consequently, I'm > thinking > > of making a first release candidate this weekend. If you haven't yet > tested > > the beta, please do so. It would be good to discover as many problems as > we > > can before the first release. > > Has typing of ufunc parameters become much stricter? I can't find > anything in the release notes, but see (1.10b1): > > >>> arr = np.linspace(0, 5, 10) > >>> out = np.empty_like(arr, dtype=np.intp) > >>> np.round(arr, out=out) > Traceback (most recent call last): > File "", line 1, in > File > "/home/antoine/np110/lib/python3.4/site-packages/numpy/core/fromnumeric.py", > line 2778, in round_ > return round(decimals, out) > TypeError: ufunc 'rint' output (typecode 'd') could not be coerced to > provided output parameter (typecode 'l') according to the casting rule > ''same_kind'' > > > It used to work (1.9): > > >>> arr = np.linspace(0, 5, 10) > >>> out = np.empty_like(arr, dtype=np.intp) > >>> np.round(arr, out=out) > array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > >>> out > array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > The default casting mode has been changed. I think this has been raising a warning since 1.7 and was mentioned as a future change in 1.10, but you are right, it needs to be mentioned in the 1.10 release notes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 26 09:32:30 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Aug 2015 07:32:30 -0600 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: References: <20150826151141.17db3046@fsol> Message-ID: On Wed, Aug 26, 2015 at 7:31 AM, Charles R Harris wrote: > > > On Wed, Aug 26, 2015 at 7:11 AM, Antoine Pitrou > wrote: > >> On Tue, 25 Aug 2015 10:26:02 -0600 >> Charles R Harris wrote: >> > Hi All, >> > >> > The silence after the 1.10 beta has been eerie. Consequently, I'm >> thinking >> > of making a first release candidate this weekend. If you haven't yet >> tested >> > the beta, please do so. It would be good to discover as many problems >> as we >> > can before the first release. >> >> Has typing of ufunc parameters become much stricter? I can't find >> anything in the release notes, but see (1.10b1): >> >> >>> arr = np.linspace(0, 5, 10) >> >>> out = np.empty_like(arr, dtype=np.intp) >> >>> np.round(arr, out=out) >> Traceback (most recent call last): >> File "", line 1, in >> File >> "/home/antoine/np110/lib/python3.4/site-packages/numpy/core/fromnumeric.py", >> line 2778, in round_ >> return round(decimals, out) >> TypeError: ufunc 'rint' output (typecode 'd') could not be coerced to >> provided output parameter (typecode 'l') according to the casting rule >> ''same_kind'' >> >> >> It used to work (1.9): >> >> >>> arr = np.linspace(0, 5, 10) >> >>> out = np.empty_like(arr, dtype=np.intp) >> >>> np.round(arr, out=out) >> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) >> >>> out >> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) >> > > The default casting mode has been changed. 
I think this has been raising a > warning since 1.7 and was mentioned as a future change in 1.10, but you are > right, it needs to be mentioned in the 1.10 release notes. > Make that warned of in the 1.9.0 release notes. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Aug 26 09:52:09 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Aug 2015 07:52:09 -0600 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: References: <20150826151141.17db3046@fsol> Message-ID: On Wed, Aug 26, 2015 at 7:32 AM, Charles R Harris wrote: > > > On Wed, Aug 26, 2015 at 7:31 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, Aug 26, 2015 at 7:11 AM, Antoine Pitrou >> wrote: >> >>> On Tue, 25 Aug 2015 10:26:02 -0600 >>> Charles R Harris wrote: >>> > Hi All, >>> > >>> > The silence after the 1.10 beta has been eerie. Consequently, I'm >>> thinking >>> > of making a first release candidate this weekend. If you haven't yet >>> tested >>> > the beta, please do so. It would be good to discover as many problems >>> as we >>> > can before the first release. >>> >>> Has typing of ufunc parameters become much stricter? I can't find >>> anything in the release notes, but see (1.10b1): >>> >>> >>> arr = np.linspace(0, 5, 10) >>> >>> out = np.empty_like(arr, dtype=np.intp) >>> >>> np.round(arr, out=out) >>> Traceback (most recent call last): >>> File "", line 1, in >>> File >>> "/home/antoine/np110/lib/python3.4/site-packages/numpy/core/fromnumeric.py", >>> line 2778, in round_ >>> return round(decimals, out) >>> TypeError: ufunc 'rint' output (typecode 'd') could not be coerced to >>> provided output parameter (typecode 'l') according to the casting rule >>> ''same_kind'' >>> >>> >>> It used to work (1.9): >>> >>> >>> arr = np.linspace(0, 5, 10) >>> >>> out = np.empty_like(arr, dtype=np.intp) >>> >>> np.round(arr, out=out) >>> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) >>> >>> out >>> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) >>> >> >> The default casting mode has been changed. I think this has been raising >> a warning since 1.7 and was mentioned as a future change in 1.10, but you >> are right, it needs to be mentioned in the 1.10 release notes. >> > > Make that warned of in the 1.9.0 release notes. > > Here it is in 1.9.0 with deprecation warning made visible. ``` In [3]: import warnings In [4]: warnings.simplefilter('always') In [5]: arr = np.linspace(0, 5, 10) In [6]: out = np.empty_like(arr, dtype=np.intp) In [7]: np.round(arr, out=out) /home/charris/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py:2640: DeprecationWarning: Implicitly casting between incompatible kinds. In a future numpy release, this will raise an error. Use casting="unsafe" if this is intentional. return round(decimals, out) Out[7]: array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) ``` Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Wed Aug 26 10:06:14 2015 From: travis at continuum.io (Travis Oliphant) Date: Wed, 26 Aug 2015 09:06:14 -0500 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 1:41 AM, Nathaniel Smith wrote: > Hi Travis, > > Thanks for taking the time to write up your thoughts! > > I have many thoughts in return, but I will try to restrict myself to two > main ones :-). 
> > 1) On the question of whether work should be directed towards improving > NumPy-as-it-is or instead towards a compatibility-breaking replacement: > There's plenty of room for debate about whether it's better engineering > practice to try and evolve an existing system in place versus starting > over, and I guess we have some fundamental disagreements there, but I > actually think this debate is a distraction -- we can agree to disagree, > because in fact we have to try both. > Yes, on this we agree. I think NumPy can improve *and* we can have new innovative array objects. I don't disagree about that. > > At a practical level: NumPy *is* going to continue to evolve, because it > has users and people interested in evolving it; similarly, dynd and other > alternatives libraries will also continue to evolve, because they also have > people interested in doing it. And at a normative level, this is a good > thing! If NumPy and dynd both get better, than that's awesome: the worst > case is that NumPy adds the new features that we talked about at the > meeting, and dynd simultaneously becomes so awesome that everyone wants to > switch to it, and the result of this would be... that those NumPy features > are exactly the ones that will make the transition to dynd easier. Or if > some part of that plan goes wrong, then well, NumPy will still be there as > a fallback, and in the mean time we've actually fixed the major pain points > our users are begging us to fix. > > You seem to be urging us all to make a double-or-nothing wager that your > extremely ambitious plans will all work out, with the entire numerical > Python ecosystem as the stakes. I think this ambition is awesome, but maybe > it'd be wise to hedge our bets a bit? > You are mis-characterizing my view. I think NumPy can evolve (though I would personally rather see a bigger change to the underlying system like I outlined before). But, I don't believe it can even evolve easily in the direction needed without breaking ABI and that insisting on not breaking it or even putting too much effort into not breaking it will continue to create less-optimal solutions that are harder to maintain and do not take advantage of knowledge this community now has. I'm also very concerned that 'evolving' NumPy will create a situation where there are regular semantic and subtle API changes that will cause NumPy to be less stable for it's user-base. I've watched this happen. This at a time that people are already looking around for new and different approaches anyway. > > 2) You really emphasize this idea of an ABI-breaking (but not > API-breaking) release, and I think this must indicate some basic gap in how > we're looking at things. Where I'm getting stuck here is that... I actually > can't think of anything important that we can't do now, but could if we > were allowed to break ABI compatibility. The kinds of things that break ABI > but keep API are like... rearranging what order the fields in a struct fall > in, or changing the numeric value of opaque constants like > NPY_ARRAY_WRITEABLE. The biggest win I can think of is that we could save a > few bytes per array by arranging the fields inside the ndarray struct more > optimally, but that's hardly a feature to hang a 2.0 on. You seem to have a > vision of this ABI-breaking release as being something very different from > that, and I'm not clear on what this vision is. > > We already broke the ABI with date-time changes --- it's still broken for a certain percentage of users last I checked. 
So, part of my disagreement is that we've tried this and it didn't work --- even though smart people thought it would. I've had to deal with this personally and I'm not enthusiastic about having to deal with it for the next 5 years because of even more attempts to make changes while not breaking the ABI. I think the group is more careful now --- but I still think the API is broad enough and uses of NumPy deep enough that the effort involved in trying not to break the ABI is just not worth it (because ABI compatibility is a non-feature today). Adding new dtypes without breaking the ABI is tricky (and doing it that way is ugly). I also continue to believe that putting out a new ABI-breaking NumPy will allow re-compiling *once* (with some porting changes needed), rather than subtle breakages requiring code changes every time a release is made. If subtle changes aren't made, then the new features won't come. Right now, I'd rather have stability from NumPy than new features. New features can come from other libraries.

One specific change that could easily be made in NumPy 2.0 (the current code but with an ABI change) is that dtypes should become true type objects and array-scalars (which are the current type objects) should become instances of those dtypes. That is the biggest clean-up needed on the array front, I think. There should not be *both* array-scalars and dtype objects; they are fundamentally the same thing, and it was a mistake to have both of them. I don't see how to make that change without breaking the ABI. Perhaps it could be done in a creative way --- but why put the effort into that and end up with an even more hacky code-base.

NumPy's ABI was influenced by and evolved from Numeric and Numarray. It was not "designed" to last 30 years. I think the dtype "types" should potentially have different member-structures. The ufunc sub-system needs an overhaul --- its member structures need upgrades. With generalized ufuncs and the iteration protocols of Mark Wiebe we know a whole lot more about ufuncs now. Ufuncs are the same 1995 structure that Jim Hugunin wrote. I suppose you *could* just tack new functions on the end of the structure and keep growing the list (while leaving old, unused structures as unused or deprecated) --- or you can take the opportunity to tidy up a bit. The longer you leave everything the same, the harder you make the code-base and the more costly maintenance becomes. I just don't see the value there --- and I see a lot of pain.

Regarding the ufunc subsystem: we've argued before about the lack of multi-methods in NumPy. Continuing to add dunder-methods to try and get around it will continue to make the system harder to maintain and more brittle.

You mention making NumPy an interface to multiple things, along with many other ideas. I don't believe you can get there without real changes that break things (at the very least semantic changes). I'm not excited about those changes causing instability (which they will cause --- to me the burden of proof that they won't is on you, who wants to make the change, not on me to say how they will). I also think it will take much longer to get there incrementally (if at all) than just creating something on top of newer ideas. > The main reason I personally am against having a big ABI-breaking release > is not that I hate ABI breakage a priori, it's that all the big features > that I care about and that users are asking for seem to be ones that... > don't actually require doing that.
At most they seem to get a mild benefit > from breaking some obscure corner cases. So the cost/benefits don't make > any sense to me. > > So: can you give a concrete example of a change you have in mind where > breaking ABI would be the key enabler? > > (I guess you might also be thinking of a separate issue that you sort of > allude to: Perhaps we will try to make changes which we think don't involve > breaking the ABI, but discover too late that we have failed to fully > understand the implications and have broken it by mistake. IIUC this is > what happened in the 1.4 timeframe when datetime64 was merged and > accidentally renumbered some of the NPY_* constants. > Yes, this is what I'm mainly worried about. But, more than that, I'm concerned about general *semantic* and API changes at a rapid pace for a community that is just looking for stability and bug-fixes from NumPy itself --- with innovation happening elsewhere. > Partially I am less worried about this because I have a fair amount of > confidence that our review and QA process has improved these days to the > point that we would not let a change like that slip through by accident -- > we have a lot more active reviewers, people are sensitized to the issues, > we've successfully landed intrusive changes like Sebastian's indexing > rewrite, ... though this is very much second-hand impressions on my part, > and I'd welcome input from folks like Chuck who have a clearer view on how > things have changed from then to now. > > But more importantly, even if this is true, then I can't see how your > proposal helps. If we aren't good enough at our jobs to predict when we'll > break ABI, then by assumption it makes no sense to pick one release and > decide that this is the one time that we'll break ABI.) > I don't understand your point. Picking a release to break the ABI allows you to actually do things like change macros to functions and move structures around to be more consistent with a new design that is easier to maintain and allows more growth. It has nothing to do with "whether you are good at your job". Everyone has strengths and weaknesses. This kind of clean-up may be needed regularly --- every 3 years would not be a crazy pattern, but it could also be every 5 years if you wanted more discipline. I already knew we needed to break the ABI "soonish" when I released NumPy 1.0. The fact that we haven't officially done it yet (but have done it unofficially) is a great injustice to "what could be" and has slowed development of NumPy tremendously. We've gone back and forth on this. I'm fine if we disagree, but I just hope the disagreement doesn't lead to lack of cooperation as we both have the same ultimate interests in seeing array-computing in Python improve. I just don't support *major* changes without breaking the ABI without a whole lot of proof that it is possible (without hackiness). You have mentioned on your roadmap a lot of what I would consider *major* changes. Some of it you describe how to get there. The most important change (improving the dtype system) you don't. Part of my point is that we now *know* how to improve the dtype system. Let's do it. Let's not try "yet again" to do it differently inside an old system designed by a scientist who didn't understand type-theory or type systems (that was me by the way). Look at data-shape in the blaze project. Take that and build a Python type-system that also outputs struct-string syntax for memory-views. 
That's the data-description system that NumPy should be using --- not trying to hack on a mixed array-scalar, dtype-object system that may never support everything we now know is needed. Trying to incrementing from where we are now will only lead to a sub-optimal outcome and unfortunate instability when we already know what to do differently. I doubt I will convince you --- certainly not via email. I apologize in advance that I likely won't be able to respond in depth to any more questions that are really just "prove to me that I can't" kind of questions. Of course I can't prove that. All I'm saying is that to me the evidence and my experience leads me to not be able to support major changes like you have proposed without also intentionally breaking the ABI (and thus calling it NumPy 2.0). If I find time to write, I will try to use it to outline more specifically what I think is a better approach to array- and table-computing in Python that keeps the stability of NumPy and adds new features using different approaches. -Travis > > On Tue, Aug 25, 2015 at 12:00 PM, Travis Oliphant > wrote: > >> Thanks for the write-up Nathaniel. There is a lot of great detail and >> interesting ideas here. >> >> I've am very eager to understand how to help NumPy and the wider >> community move forward however I can (my passions on this have not changed >> since 1999, though what I myself spend time on has changed). >> >> There are a lot of ways to think about approaching this, though. It's >> hard to get all the ideas on the table, and it was unfortunate we couldn't >> get everybody wyho are core NumPy devs together in person to have this >> discussion as there are still a lot of questions unanswered and a lot of >> thought that has gone into other approaches that was not brought up or >> represented in the meeting (how does Numba fit into this, what about >> data-shape, dynd, memory-views and Python type system, etc.). If NumPy >> becomes just an interface-specification, then why don't we just do that >> *outside* NumPy itself in a way that doesn't jeopardize the stability of >> NumPy today. These are some of the real questions I have. I will try >> to write up my thoughts in more depth soon, but I won't be able to respond >> in-depth right now. I just wanted to comment because Nathaniel said I >> disagree which is only partly true. >> >> The three most important things for me are 1) let's make sure we have >> representation from as wide of the community as possible (this is really >> hard), 2) let's look around at the broader community and the prior art that >> is happening in this space right now and 3) let's not pretend we are going >> to be able to make all this happen without breaking ABI compatibility. >> Let's just break ABI compatibility with NumPy 2.0 *and* have as much >> fidelity with the API and semantics of current NumPy as possible (though >> there will be some changes necessary long-term). >> >> I don't think we should intentionally break ABI if we can avoid it, but I >> also don't think we should spend in-ordinate amounts of time trying to >> pretend that we won't break ABI (for at least some people), and most >> importantly we should not pretend *not* to break the ABI when we actually >> do. We did this once before with the roll-out of date-time, and it was >> really un-necessary. When I released NumPy 1.0, there were several >> things that I knew should be fixed very soon (NumPy was never designed to >> not break ABI). Those problems are still there. 
Now, that we have >> quite a bit better understanding of what NumPy *should* be (there have been >> tremendous strides in understanding and community size over the past 10 >> years), let's actually make the infrastructure we think will last for the >> next 20 years (instead of trying to shoe-horn new ideas into a 20-year old >> code-base that wasn't designed for it). >> >> NumPy is a hard code-base. It has been since Numeric days in 1995. I >> could be wrong, but my guess is that we will be passed by as a community if >> we don't seize the opportunity to build something better than we can build >> if we are forced to use a 20 year old code-base. >> >> It is more important to not break people's code and to be clear when a >> re-compile is necessary for dependencies. Those to me are the most >> important constraints. There are a lot of great ideas that we all have >> about what we want NumPy to be able to do. Some of this are pretty >> transformational (and the more exciting they are, the harder I think they >> are going to be to implement without breaking at least the ABI). There >> is probably some CAP-like theorem around >> Stability-Features-Speed-of-Development (pick 2) when it comes to Open >> Source Software development and making feature-progress with NumPy *is >> going* to create in-stability which concerns me. >> >> I would like to see a little-bit-of-pain one time with a NumPy 2.0, >> rather than a constant pain because of constant churn over many years >> approach that Nathaniel seems to advocate. To me NumPy 2.0 is an >> ABI-breaking release that is as API-compatible as possible and whose >> semantics are not dramatically different. >> >> There are at least 3 areas of compatibility (ABI, API, and semantic). >> ABI-compatibility is a non-feature in today's world. There are so many >> distributions of the NumPy stack (and conda makes it trivial for anyone to >> build their own or for you to build one yourself). Making less-optimal >> software-engineering choices because of fear of breaking the ABI is not >> something I'm supportive of at all. We should not break ABI every >> release, but a release every 3 years that breaks ABI is not a problem. >> >> API compatibility should be much more sacrosanct, but it is also >> something that can also be managed. Any NumPy 2.0 should definitely >> support the full NumPy API (though there could be deprecated swaths). I >> think the community has done well in using deprecation and limiting the >> public API to make this more manageable and I would love to see a NumPy 2.0 >> that solidifies a future-oriented API along with a back-ward compatible API >> that is also available. >> >> Semantic compatibility is the hardest. We have already broken this on >> multiple occasions throughout the 1.x NumPy releases. Every time you >> change the code, this can change. This is what I fear causing deep >> instability over the course of many years. These are things like the >> casting rule details, the effect of indexing changes, any change to the >> calculations approaches. It is and has been the most at risk during any >> code-changes. My view is that a NumPy 2.0 (with a new low-level >> architecture) minimizes these changes to a single release rather than >> unavoidably spreading them out over many, many releases. >> >> I think that summarizes my main concerns. I will write-up more forward >> thinking ideas for what else is possible in the coming weeks. In the mean >> time, thanks for keeping the discussion going. 
It is extremely exciting to >> see the help people have continued to provide to maintain and improve >> NumPy. It will be exciting to see what the next few years bring as well. >> >> >> Best, >> >> -Travis >> >> >> >> >> >> >> On Tue, Aug 25, 2015 at 5:03 AM, Nathaniel Smith wrote: >> >>> Hi all, >>> >>> These are the notes from the NumPy dev meeting held July 7, 2015, at >>> the SciPy conference in Austin, presented here so the list can keep up >>> with what happens, and so you can give feedback. Please do give >>> feedback, none of this is final! >>> >>> (Also, if anyone who was there notices anything I left out or >>> mischaracterized, please speak up -- these are a lot of notes I'm >>> trying to gather together, so I could easily have missed something!) >>> >>> Thanks to Jill Cowan and the rest of the SciPy organizers for donating >>> space and organizing logistics for us, and to the Berkeley Institute >>> for Data Science for funding travel for Jaime, Nathaniel, and >>> Sebastian. >>> >>> >>> Attendees >>> ========= >>> >>> Present in the room for all or part: Daniel Allan, Chris Barker, >>> Sebastian Berg, Thomas Caswell, Jeff Reback, Jaime Fern?ndez del >>> R?o, Chuck Harris, Nathaniel Smith, St?fan van der Walt. (Note: I'm >>> pretty sure this list is incomplete) >>> >>> Joining remotely for all or part: Stephan Hoyer, Julian Taylor. >>> >>> >>> Formalizing our governance/decision making >>> ========================================== >>> >>> This was a major focus of discussion. At a high level, the consensus >>> was to steal IPython's governance document ("IPEP 29") and modify it >>> to remove its use of a BDFL as a "backstop" to normal community >>> consensus-based decision, and replace it with a new "backstop" based >>> on Apache-project-style consensus voting amongst the core team. >>> >>> I'll send out a proper draft of this shortly for further discussion. >>> >>> >>> Development roadmap >>> =================== >>> >>> General consensus: >>> >>> Let's assume NumPy is going to remain important indefinitely, and >>> try to make it better, instead of waiting for something better to >>> come along. (This is unlikely to be wasted effort even if something >>> better does come along, and it's hardly a sure thing that that will >>> happen anyway.) >>> >>> Let's focus on evolving numpy as far as we can without major >>> break-the-world changes (no "numpy 2.0", at least in the foreseeable >>> future). >>> >>> And, as a target for that evolution, let's change our focus from >>> numpy as "NumPy is the library that gives you the np.ndarray object >>> (plus some attached infrastructure)", to "NumPy provides the >>> standard framework for working with arrays and array-like objects in >>> Python" >>> >>> This means, creating defined interfaces between array-like objects / >>> ufunc objects / dtype objects, so that it becomes possible for third >>> parties to add their own and mix-and-match. Right now ufuncs are >>> pretty good at this, but if you want a new array class or dtype then >>> in most cases you pretty much have to modify numpy itself. >>> >>> Vision: instead of everyone who wants a new container type having to >>> reimplement all of numpy, Alice can implement an array class using >>> (sparse / distributed / compressed / tiled / gpu / out-of-core / >>> delayed / ...) storage, pass it to code that was written using >>> direct calls to np.* functions, and it just works. 
(Instead of >>> np.sin being "the way you calculate the sine of an ndarray", it's >>> "the way you calculate the sine of any array-like container >>> object".) >>> >>> Vision: Darryl can implement a new dtype for (categorical data / >>> astronomical dates / integers-with-missing-values / ...) without >>> having to touch the numpy core. >>> >>> Vision: Chandni can then come along and combine them by doing >>> >>> a = alice_array([...], dtype=darryl_dtype) >>> >>> and it just works. >>> >>> Vision: no-one is tempted to subclass ndarray, because anything you >>> can do with an ndarray subclass you can also easily do by defining >>> your own new class that implements the "array protocol". >>> >>> >>> Supporting third-party array types >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> Sub-goals: >>> - Get __numpy_ufunc__ done, which will cover a good chunk of numpy's >>> API right there. >>> - Go through the rest of the stuff in numpy, and figure out some >>> story for how to let it handle third-party array classes: >>> - ufunc ALL the things: Some things can be converted directly into >>> (g)ufuncs and then use __numpy_ufunc__ (e.g., np.std); some >>> things could be converted into (g)ufuncs if we extended the >>> (g)ufunc interface a bit (e.g. np.sort, np.matmul). >>> - Some things probably need their own __numpy_ufunc__-like >>> extensions (__numpy_concatenate__?) >>> - Provide tools to make it easier to implement the more complicated >>> parts of an array object (e.g. the bazillion different methods, >>> many of which are ufuncs in disguise, or indexing) >>> - Longer-run interesting research project: __numpy_ufunc__ requires >>> that one or the other object have explicit knowledge of how to >>> handle the other, so to handle binary ufuncs with N array types >>> you need something like N**2 __numpy_ufunc__ code paths. As an >>> alternative, if there were some interface that an object could >>> export that provided the operations nditer needs to efficiently >>> iterate over (chunks of) it, then you would only need N >>> implementations of this interface to handle all N**2 operations. >>> >>> This would solve a lot of problems for projects like: >>> - blosc >>> - dask >>> - distarray >>> - numpy.ma >>> - pandas >>> - scipy.sparse >>> - xray >>> >>> >>> Supporting third-party dtypes >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> We already have something like a C level "dtype >>> protocol". Conceptually, the way you define a new dtype is by >>> defining a new class whose instances have data attributes defining >>> the parameters of the dtype (what fields are in *this* record dtype, >>> how many characters are in *this* string dtype, what units are used >>> for *this* datetime64, etc.), and you define a bunch of methods to >>> do things like convert an object from a Python object to your dtype >>> or vice-versa, to copy an array of your dtype from one place to >>> another, to cast to and from your new dtype, etc. This part is >>> great. >>> >>> The problem is, in the current implementation, we don't actually use >>> the Python object system to define these classes / attributes / >>> methods. Instead, all possible dtypes are jammed into a single >>> Python-level class, whose struct has fields for the union of all >>> possible dtype's attributes, and instead of Python-style method >>> slots there's just a big table of function pointers attached to each >>> object. >>> >>> So the main proposal is that we keep the basic design, but switch it >>> so that the float64 dtype, the int64 dtype, etc. 
actually literally >>> are subclasses of np.dtype, each implementing their own fields and >>> Python-style methods. >>> >>> Some of the pieces involved in doing this: >>> >>> - The current dtype methods should be cleaned up -- e.g. 'dot' and >>> 'less_than' are both dtype methods, when conceptually they're much >>> more like ufuncs. >>> >>> - The ufunc inner-loop interface currently does not get a reference >>> to the dtype object, so they can't see its attributes and this is >>> a big obstacle to many interesting dtypes (e.g., it's hard to >>> implement np.equal for categoricals if you don't know what >>> categories each has). So we need to add new arguments to the core >>> ufunc loop signature. (Fortunately this can be done in a >>> backwards-compatible way.) >>> >>> - We need to figure out what exactly the dtype methods should be, >>> and add them to the dtype class (possibly with backwards >>> compatibility shims for anyone who is accessing PyArray_ArrFuncs >>> directly). >>> >>> - Casting will be possibly the trickiest thing to work out, though >>> the basic idea of using dunder-dispatch-like __cast__ and >>> __rcast__ methods seems workable. (Encouragingly, this is also >>> exactly what dynd also does, though unfortunately dynd does not >>> yet support user-defined dtypes even to the extent that numpy >>> does, so there isn't much else we can steal from them.) >>> - We may also want to rethink the casting rules while we're at it, >>> since they have some very weird corners right now (e.g. see >>> [https://github.com/numpy/numpy/issues/6240]) >>> >>> - We need to migrate the current dtypes over to the new system, >>> which can be done in stages: >>> >>> - First stick them all in a single "legacy dtype" class whose >>> methods just dispatch to the PyArray_ArrFuncs per-object "method >>> table" >>> >>> - Then move each of them into their own classes >>> >>> - We should provide a Python-level wrapper for the protocol, so that >>> you can call dtype methods from Python >>> >>> - And vice-versa, it should be possible to subclass dtype at the >>> Python level >>> >>> - etc. >>> >>> Fortunately, AFAICT pretty much all of this can be done while >>> maintaining backwards compatibility (though we may want to break >>> some obscure cases to avoid expending *too* much effort with weird >>> backcompat contortions that will only help a vanishingly small >>> proportion of the userbase), and a lot of the above changes can be >>> done as semi-independent mini-projects, so there's no need for some >>> branch to go off and spend a year rewriting the world. >>> >>> Obviously there are still a lot of details to work out, though. But >>> overall, there was widespread agreement that this is one of the #1 >>> pain points for our users (e.g. it's the single main request from >>> pandas), and fixing it is very high priority. >>> >>> Some features that would become straightforward to implement >>> (e.g. even in third-party libraries) if this were fixed: >>> - missing value support >>> - physical unit tracking (meters / seconds -> array of velocity; >>> meters + seconds -> error) >>> - better and more diverse datetime representations (e.g. datetimes >>> with attached timezones, or using funky geophysical or >>> astronomical calendars) >>> - categorical data >>> - variable length strings >>> - strings-with-encodings (e.g. 
latin1) >>> - forward mode automatic differentiation (write a function that >>> computes f(x) where x is an array of float64; pass that function >>> an array with a special dtype and get out both f(x) and f'(x)) >>> - probably others I'm forgetting right now >>> >>> I should also note that there was one substantial objection to this >>> plan, from Travis Oliphant (in discussions later in the >>> conference). I'm not confident I understand his objections well >>> enough to reproduce them here, though -- perhaps he'll elaborate. >>> >>> >>> Money >>> ===== >>> >>> There was an extensive discussion on the topic of: "if we had money, >>> what would we do with it?" >>> >>> This is partially motivated by the realization that there are a >>> number of sources that we could probably get money from, if we had a >>> good story for what we wanted to do, so it's not just an idle >>> question. >>> >>> Points of general agreement: >>> >>> - Doing the in-person meeting was a good thing. We should plan do >>> that again, at least once a year. So one thing to spend money on >>> is travel subsidies to make sure that happens and is productive. >>> >>> - While it's tempting to imagine hiring junior people for the more >>> frustrating/boring work like maintaining buildbots, release >>> infrastructure, updating docs, etc., this seems difficult to do >>> realistically with our current resources -- how do we hire for >>> this, who would manage them, etc.? >>> >>> - On the other hand, the general feeling was that if we found the >>> money to hire a few more senior people who could take care of >>> themselves more, then that would be good and we could >>> realistically absorb that extra work without totally unbalancing >>> the project. >>> >>> - A major open question is how we would recruit someone for a >>> position like this, since apparently all the obvious candidates >>> who are already active on the NumPy team already have other >>> things going on. [For calibration on how hard this can be: NYU >>> has apparently had an open position for a year with the job >>> description of "come work at NYU full-time with a >>> private-industry-competitive-salary on whatever your personal >>> open-source scientific project is" (!) and still is having an >>> extremely difficult time filling it: >>> [http://cds.nyu.edu/research-engineer/]] >>> >>> - General consensus though was that there isn't much to be done >>> about this though, except try it and see. >>> >>> - (By the way, if you're someone who's reading this and >>> potentially interested in like a postdoc or better working on >>> numpy, then let's talk...) >>> >>> >>> More specific changes to numpy that had general consensus, but don't >>> really fit into a high-level roadmap >>> >>> ========================================================================================================= >>> >>> - Resolved: we should merge multiarray.so and umath.so into a single >>> extension module, so that they can share utility code without the >>> current awkward contortions. >>> >>> - Resolved: we should start hiding new fields in the ufunc and dtype >>> structs as soon as possible going forward. (I.e. they would not be >>> present in the version of the structs that are exposed through the >>> C API, but internally we would use a more detailed struct.) >>> - Mayyyyyybe we should even go ahead and hide the subset of the >>> existing fields that are really internal details that no-one >>> should be using. 
If we did this without changing anything else >>> then it would preserve ABI (the fields would still be where >>> existing compiled extensions expect them to be, if any such >>> extensions exist) while breaking API (trying to compile such >>> extensions would give a clear error), so would be a smoother >>> ramp if we think we need to eventually break those fields for >>> real. (As discussed above, there are a bunch of fields in the >>> dtype base class that only make sense for specific dtype >>> subclasses, e.g. only record dtypes need a list of field names, >>> but right now all dtypes have one anyway. So it would be nice to >>> remove these from the base class entirely, but that is >>> potentially ABI-breaking.) >>> >>> - Resolved: np.array should never return an object array unless >>> explicitly requested (e.g. with dtype=object); it just causes too >>> many surprising problems. >>> - First step: add a deprecation warning >>> - Eventually: make it an error. >>> >>> - The matrix class >>> - Resolved: We won't add warnings yet, but we will prominently >>> document that it is deprecated and should be avoided where-ever >>> possible. >>> - St?fan van der Walt volunteers to do this. >>> - We'd all like to deprecate it properly, but the feeling was that >>> the precondition for this is for scipy.sparse to provide sparse >>> "arrays" that don't return np.matrix objects on ordinary >>> operatoins. Until that happens we can't reasonably tell people >>> that using np.matrix is a bug. >>> >>> - Resolved: we should add a similar prominent note to the >>> "subclassing ndarray" documentation, warning people that this is >>> painful and barely works and please don't do it if you have any >>> alternatives. >>> >>> - Resolved: we want more, smaller releases -- every 6 months at >>> least, aiming to go even faster (every 4 months?) >>> >>> - On the question of using Cython inside numpy core: >>> - Everyone agrees that there are places where this would be an >>> improvement (e.g., Python<->C interfaces, and places "when you >>> want to do computer science", e.g. complicated algorithmic stuff >>> like graph traversals) >>> - Chuck wanted it to be clear though that he doesn't think it >>> would be a good goal to try and rewrite all of numpy in Cython >>> -- there also exist places where Cython ends up being "an uglier >>> version of C". No-one disagreed. >>> >>> - Our text reader is apparently not very functional on Python 3, and >>> generally slow and hard to work with. >>> - Resolved: We should extract Pandas's awesome text reader/parser >>> and convert it into its own package, that could then become a >>> new backend for both pandas and numpy.loadtxt. >>> - Jeff thinks this is a great idea >>> - Thomas Caswell volunteers to do the extraction. >>> >>> - We should work on improving our tools for evolving the ABI, so >>> that we will eventually be less constrained by decisions made >>> decades ago. >>> - One idea that had a lot of support was to switch from our >>> current append-only C-API to a "sliding window" API based on >>> explicit versions. So a downstream package might say >>> >>> #define NUMPY_API_VERSION 4 >>> >>> and they'd get the functions and behaviour provided in "version >>> 4" of the numpy C api. If they wanted to get access to new stuff >>> that was added in version 5, then they'd need to switch that >>> #define, and at the same time clean up any usage of stuff that >>> was removed or changed in version 5. 
And to provide a smooth >>> migration path, one version of numpy would support multiple >>> versions at once, gradually deprecating and dropping old >>> versions. >>> >>> - If anyone wants to help bring pip up to scratch WRT tracking ABI >>> dependencies (e.g., 'pip install numpy==' >>> -> triggers rebuild of scipy against the new ABI), then that >>> would be an extremely useful thing. >>> >>> >>> Policies that should be documented >>> ================================== >>> >>> ...together with some notes about what the contents of the document >>> should be: >>> >>> >>> How we manage bugs in the bug tracker. >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> - Github "milestones" should *only* be assigned to release-blocker >>> bugs (which mostly means "regression from the last release"). >>> >>> In particular, if you're tempted to push a bug forward to the next >>> release... then it's clearly not a blocker, so don't set it to the >>> next release's milestone, just remove the milestone entirely. >>> >>> (Obvious exception to this: deprecation followup bugs where we >>> decide that we want to keep the deprecation around a bit longer >>> are a case where a bug actually does switch from being a blocker >>> to release 1.x to being a blocker for release 1.(x+1).) >>> >>> - Don't hesitate to close an issue if there's no way forward -- >>> e.g. a PR where the author has disappeared. Just post a link to >>> this policy and close, with a polite note that we need to keep our >>> tracker useful as a todo list, but they're welcome to re-open if >>> things change. >>> >>> >>> Deprecations and breakage policy: >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> - How long do we need to keep DeprecationWarnings around before we >>> break things? This is tricky because on the one hand an aggressive >>> (short) deprecation period lets us deliver new features and >>> important cleanups more quickly, but on the other hand a >>> too-aggressive deprecation period is difficult for our more >>> conservative downstream users. >>> >>> - Idea that had the most support: pick a somewhat-aggressive >>> warning period as our default, and make a rule that if someone >>> asks for an extension during the beta cycle for the release that >>> removes it, then we put it back for another release or two worth >>> of grace period. (While also possibly upgrading the warning to >>> be more visible during the grace period.) This gives us >>> deprecation periods that are more adaptive on a case-by-case >>> basis. >>> >>> - Lament: it would be really nice if we could get more people to >>> test our beta releases, because in practice right now 1.x.0 ends >>> up being where we actually the discover all the bugs, and 1.x.1 is >>> where it actually becomes usable. Which sucks, and makes it >>> difficult to have a solid policy about what counts as a >>> regression, etc. Is there anything we can do about this? >>> >>> - ABI breakage: we distinguish between an ABI break that breaks >>> everything (e.g., "import scipy" segfaults), versus an ABI break >>> that breaks an occasional rare case (e.g., only apps that poke >>> around in some obscure corner of some struct are affected). >>> >>> - The "break-the-world" type remains off-limit for now: the pain >>> is still too large (conda helps, but there are lots of people >>> who don't use conda!), and there aren't really any compelling >>> improvements that this would enable anyway. 
>>> >>> - For the "break-0.1%-of-users" type, it is *not* ruled out by >>> fiat, though we remain conservative: we should treat it like >>> other API breaks in principle, and do a careful case-by-case >>> analysis of the details of the situation, taking into account >>> what kind of code would be broken, how common these cases are, >>> how important the benefits are, whether there are any specific >>> mitigation strategies we can use, etc. -- with this process of >>> course taking into account that a segfault is nastier than a >>> Python exception. >>> >>> >>> Other points that were discussed >>> ================================ >>> >>> - There was inconclusive discussion of what we should do with dot() >>> in the places where it disagrees with the PEP 465 matmul semantics >>> (specifically this is when both arguments have ndim >= 3, or one >>> argument has ndim == 0). >>> - The concern is that the current behavior is not very useful, and >>> as far as we can tell no-one is using it; but, as people get >>> used to the more-useful PEP 465 behavior, they will increasingly >>> try to use it on the assumption that np.dot will work the same >>> way, and this will create pain for lots of people. So Nathaniel >>> argued that we should start at least issuing a visible warning >>> when people invoke the corner-case behavior. >>> - But OTOH, np.dot is such a core piece of infrastructure, and >>> there's such a large landscape of code out there using numpy >>> that we can't see, that others were reasonably wary of making >>> any change. >>> - For now: document prominently, but no change in behavior. >>> >>> >>> Links to raw notes >>> ================== >>> >>> Main page: >>> [https://github.com/numpy/numpy/wiki/SciPy-2015-developer-meeting] >>> >>> Notes from the meeting proper: >>> [ >>> https://docs.google.com/document/d/1IJcYdsHtk8MVAM4AZqFDBSf_nVG-mrB4Tv2bh9u1g4Y/edit?usp=sharing >>> ] >>> >>> Slides from the followup BoF: >>> [ >>> https://gist.github.com/njsmith/eb42762054c88e810786/raw/b74f978ce10a972831c582485c80fb5b8e68183b/future-of-numpy-bof.odp >>> ] >>> >>> Notes from the followup BoF: >>> [ >>> https://docs.google.com/document/d/11AuTPms5dIPo04JaBOWEoebXfk-tUzEZ-CvFnLIt33w/edit >>> ] >>> >>> -n >>> >>> -- >>> Nathaniel J. Smith -- http://vorpus.org >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> >> -- >> >> *Travis Oliphant* >> *Co-founder and CEO* >> >> >> @teoliphant >> 512-222-5440 >> http://www.continuum.io >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Nathaniel J. Smith -- http://vorpus.org > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- *Travis Oliphant* *Co-founder and CEO* @teoliphant 512-222-5440 http://www.continuum.io -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From faltet at gmail.com Wed Aug 26 11:38:20 2015
From: faltet at gmail.com (Francesc Alted)
Date: Wed, 26 Aug 2015 17:38:20 +0200
Subject: [Numpy-discussion] UTC-based datetime64
Message-ID:

Hi,

We've found that NumPy uses the local TZ for printing datetime64 timestamps:

In [22]: t = datetime.utcnow()

In [23]: print t
2015-08-26 11:52:10.662745

In [24]: np.array([t], dtype="datetime64[s]")
Out[24]: array(['2015-08-26T13:52:10+0200'], dtype='datetime64[s]')

Googling for a way to print UTC out of the box, the best thing I could find is:

In [40]: [str(i.item()) for i in np.array([t], dtype="datetime64[s]")]
Out[40]: ['2015-08-26 11:52:10']

Now, is there a better way to specify that I want the datetimes printed always in UTC?

Thanks,

-- Francesc Alted
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
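[One possible answer, added here as a sketch rather than taken from the thread: np.datetime_as_string accepts a timezone argument (assuming a NumPy version recent enough to provide it; check the docs for the version at hand), which avoids the detour through .item():]

import numpy as np
from datetime import datetime

t = datetime.utcnow()
arr = np.array([t], dtype="datetime64[s]")

# Render the timestamps in UTC (with a trailing 'Z') rather than the local
# timezone. This only affects the strings returned here; it does not change
# how repr() of the array itself formats the values.
utc_strings = np.datetime_as_string(arr, timezone='UTC')

So this is a per-call rendering choice rather than a global "always print UTC" switch.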
From izaid at continuum.io Wed Aug 26 12:45:51 2015
From: izaid at continuum.io (Irwin Zaid)
Date: Wed, 26 Aug 2015 16:45:51 +0000 (UTC)
Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015
References:
Message-ID:

Hello everyone,

Mark and I thought it would be good to weigh in here and also be explicitly around to discuss DyND. To be clear, neither of us has strong feelings on what NumPy *should* do -- we are both long-time NumPy users and we both see NumPy being around for a while. But, as Francesc mentioned, there is also the open question of where the community should be implementing new features. It would certainly be nice to not have duplication of effort, but a decision like that can only arise naturally from a broad consensus.

Travis covered DyND's history and its relationship with Continuum pretty well, so what's really missing here is what DyND is, where it is going, and how long we think it'll take to get there. We'll try to stick to those topics.

We designed DyND to fill what we saw as fundamental gaps in NumPy. These are not only missing features, but also limitations of its architecture. Many of these gaps have been mentioned several times before in this thread and elsewhere, but a brief list would include: better support for missing values, variable-length strings, GPUs, more extensible types, categoricals, more datetime features, ... Some of these were indeed on Nathaniel's list and many of them are already working (albeit sometimes partially) in DyND.

And, yes, we strongly feel that NumPy's fundamental dependence on Python itself is a limitation. Why should we not take the fantastic success of NumPy and generalize it across other languages?

So, we see DyND as having a twofold purpose. The first is to expand upon the kinds of data that NumPy can represent and do computations upon. The second is to provide a standard array package that can cross the language barrier and easily interoperate between C++, Python, or whatever you want.

DyND, at the moment, is quite functional in some areas and lacking a bit in others. There is no doubt that it is still "experimental" and a bit unstable. But, it has advanced by a lot recently, and we are steadily working towards something like a version 1.0. In fact, DyND's internal C++ architecture stabilized some time ago -- what's missing now is really solid coverage of some common use cases, alongside up-to-date Python bindings and an easy installation process. All of these are in progress and advancing as quick as we can make them.

On the other hand, we are also building out some other features. To give just one example that might excite people, DyND now has Numba interoperability -- one can write DyND's equivalent of a ufunc in Python and, with a single decorator, have a broadcasting or reduction callable that gets JITed or (soon) ahead-of-time compiled.

Over the next few months, we are hopeful that we can get DyND into a state where it is largely usable by those familiar with NumPy semantics. The reason why we can be a bit more aggressive in our timeline now is because of the great support we are getting from Continuum.

With all that said, we are happy to be a part of any broader conversation involving NumPy and the community.

All the best,

Irwin and Mark

From solipsis at pitrou.net Wed Aug 26 13:11:01 2015
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 26 Aug 2015 19:11:01 +0200
Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015
References:
Message-ID: <20150826191101.3cf475e1@fsol>

On Wed, 26 Aug 2015 16:45:51 +0000 (UTC) Irwin Zaid wrote: > > So, we see DyND is having a twofold purpose. The first is to expand upon the > kinds of data that NumPy can represent and do computations upon. The second > is to provide a standard array package that can cross the language barrier > and easily interoperate between C++, Python, or whatever you want.

One possible limitation is that the lingua franca for language interoperability is C, not C++. DyND doesn't have to be written in C, but exposing a nice C API may help make it attractive to the various language runtimes out there.

(even those languages whose runtime doesn't have a compile-time interface to C generally have some kind of cffi or ctypes equivalent to load external C routines at runtime)

Regards

Antoine.

From izaid at continuum.io Wed Aug 26 13:20:13 2015
From: izaid at continuum.io (Irwin Zaid)
Date: Wed, 26 Aug 2015 18:20:13 +0100
Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015
In-Reply-To: <20150826191101.3cf475e1@fsol>
References: <20150826191101.3cf475e1@fsol>
Message-ID:

On Wed, Aug 26, 2015 at 6:11 PM, Antoine Pitrou wrote: > One possible limitation is that the lingua franca for language > interoperability is C, not C++. DyND doesn't have to be written in C, > but exposing a nice C API may help make it attractive to the various > language runtimes out there. >

That is absolutely true and a C API is on the long-term roadmap. At the moment, a C API is not needed for DyND to be stable and usable from Python, which is one reason we aren't doing it now.

Irwin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mwwiebe at gmail.com Wed Aug 26 13:41:59 2015
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Wed, 26 Aug 2015 10:41:59 -0700
Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015
In-Reply-To:
References:
Message-ID:

I thought I'd add a little more specifically about the kind of graphics/point cloud work I'm doing right now at Thinkbox, and how it relates.

To echo Francesc's point about NumPy already being an industry standard, within the VFX/graphics industry there is a reference platform definition on Linux, and the most recent iteration of that specifies a version of NumPy. It also includes a bunch of other open source libraries worth taking a look at if you haven't seen them before: http://www.vfxplatform.com/

Point cloud/particle system data, mesh geometry, numerical grids (both dense and sparse), and many other primitive components in graphics are built out of arrays.
What NumPy represents for that kind of data is amazing. The extra baggage of an API tied to the CPython GIL can be a hard pill to swallow, though, and this is one of the reasons I'm hopeful that as DyND continues maturing, it can make inroads into places NumPy hasn't been able to. Thanks, Mark On Wed, Aug 26, 2015 at 9:45 AM, Irwin Zaid wrote: > Hello everyone, > > Mark and I thought it would be good to weigh in here and also be explicitly > around to discuss DyND. To be clear, neither of us has strong feelings on > what NumPy *should* do -- we are both long-time NumPy users and we both see > NumPy being around for a while. But, as Francesc mentioned, there is also > the open question of where the community should be implementing new > features. It would certainly be nice to not have duplication of effort, but > a decision like that can only arise naturally from a broad consensus. > > Travis covered DyND's history and it's relationship with Continuum pretty > well, so what's really missing here is what DyND is, where it is going, and > how long we think it'll take to get there. We'll try to stick to those > topics. > > We designed DyND to fill what we saw as fundamental gaps in NumPy. These > are > not only missing features, but also limitations of its architecture. Many > of > these gaps have been mentioned several times before in this thread and > elsewhere, but a brief list would include: better support for missing > values, variable-length strings, GPUs, more extensible types, categoricals, > more datetime features, ... Some of these were indeed on Nathaniel's list > and many of them are already working (albeit sometimes partially) in DyND. > > And, yes, we strongly feel that NumPy's fundamental dependence on Python > itself is a limitation. Why should we not take the fantastic success of > NumPy and generalize it across other languages? > > So, we see DyND is having a twofold purpose. The first is to expand upon > the > kinds of data that NumPy can represent and do computations upon. The second > is to provide a standard array package that can cross the language barrier > and easily interoperate between C++, Python, or whatever you want. > > DyND, at the moment, is quite functional in some areas and lacking a bit in > others. There is no doubt that it is still "experimental" and a bit > unstable. But, it has advanced by a lot recently, and we are steadily > working towards something like a version 1.0. In fact, DyND's internal C++ > architecture stabilized some time ago -- what's missing now is really solid > coverage of some common use cases, alongside up-to-date Python bindings and > an easy installation process. All of these are in progress and advancing as > quick as we can make them. > > On the other hand, we are also building out some other features. To give > just one example that might excite people, DyND now has Numba > interoperability -- one can write DyND's equivalent of a ufunc in Python > and, with a single decorator, have a broadcasting or reduction callable > that > gets JITed or (soon) ahead-of-time compiled. > > Over the next few months, we are hopeful that we can get DyND into a state > where it is largely usable by those familiar with NumPy semantics. The > reason why we can be a bit more aggressive in our timeline now is because > of > the great support we are getting from Continuum. > > With all that said, we are happy to be a part of of any broader > conversation > involving NumPy and the community. 
> > All the best, > > Irwin and Mark > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Aug 26 13:44:19 2015 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 26 Aug 2015 10:44:19 -0700 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: <20150826191101.3cf475e1@fsol> References: <20150826191101.3cf475e1@fsol> Message-ID: On Wed, Aug 26, 2015 at 10:11 AM, Antoine Pitrou wrote: > On Wed, 26 Aug 2015 16:45:51 +0000 (UTC) > Irwin Zaid wrote: > > > > So, we see DyND is having a twofold purpose. The first is to expand upon > the > > kinds of data that NumPy can represent and do computations upon. The > second > > is to provide a standard array package that can cross the language > barrier > > and easily interoperate between C++, Python, or whatever you want. > > One possible limitation is that the lingua franca for language > interoperability is C, not C++. DyND doesn't have to be written in C, > but exposing a nice C API may help make it attractive to the various > language runtimes out there. > > (even those languages whose runtime doesn't have a compile-time > interface to C generally have some kind of cffi or ctypes equivalent to > load external C routines at runtime) > I kind of like the path LLVM has chosen here, of a stable C API and an unstable C++ API. This has both pros and cons though, so I'm not sure what will be right for DyND in the long term. -Mark > Regards > > Antoine. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Aug 26 13:50:47 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 Aug 2015 18:50:47 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) Message-ID: Hi, Splitting this one off too because it's a rather different discussion, although related. On Tue, Aug 25, 2015 at 11:03 AM, Nathaniel Smith wrote: [snip] > Formalizing our governance/decision making > ========================================== > > This was a major focus of discussion. At a high level, the consensus > was to steal IPython's governance document ("IPEP 29") and modify it > to remove its use of a BDFL as a "backstop" to normal community > consensus-based decision, and replace it with a new "backstop" based > on Apache-project-style consensus voting amongst the core team. Here's a plea to avoid a 'core' structure if at all possible. Historically it seems to have some severe risks, and experienced people have blamed this structure for the decline of various projects including NetBSD and Xfree86, summaries here: http://asterisk.dynevor.org/melting-core.html http://asterisk.dynevor.org/xfree-forked.html In short, the core structure seems to be characteristically associated with a conservatism and lack of vision that causes the project to stagnate. There's also evidence from the NetBSD / OpenBSD split [1] and the XFree86 / X.org split [2] - that the core structure can lead to bad decisions being taken in private that no or few members of the core group are prepared to defend. 
I guess what is happening is that distributed responsibility leads to poor accountability, and therefore poor decisions. So, I hope very much we can avoid that trap in our own governance. Best, Matthew [1] http://mail-index.netbsd.org/netbsd-users/1994/12/23/0000.html [2] http://www.xfree86.org/pipermail/forum/2003-March/001997.html From pav at iki.fi Wed Aug 26 13:58:26 2015 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 26 Aug 2015 20:58:26 +0300 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: 26.08.2015, 14:14, Francesc Alted kirjoitti: [clip] > 2015-08-25 12:03 GMT+02:00 Nathaniel Smith : >> Let's focus on evolving numpy as far as we can without major >> break-the-world changes (no "numpy 2.0", at least in the foreseeable >> future). >> >> And, as a target for that evolution, let's change our focus from >> numpy as "NumPy is the library that gives you the np.ndarray object >> (plus some attached infrastructure)", to "NumPy provides the >> standard framework for working with arrays and array-like objects in >> Python" > > Sorry to disagree here, but in my opinion NumPy *already* provides the > standard framework for working with arrays and array-like objects in Python > as its huge popularity shows. If what you mean is that there are too many > efforts trying to provide other, specialized data containers (things like > DataFrame in pandas, DataArray/Dataset in xarray or carray/ctable in bcolz > just to mention a few), then let me say that I am of the opinion that there > can't be a silver bullet for tackling all the problems that the PyData > community is facing. My reading of the above was that this was about multimethods, and allowing different types of containers to interoperate beyond the array interface and Python's builtin operator hooks. The exact performance details of course vary, and an algorithm written for in-memory arrays just fails for too large on-disk or distributed arrays. However, a case for a minimal common API probably could be made, esp. in algorithms mainly relying on linear algebra. This is to a degree different from subclassing, as many of the array-like objects you might want do not have a simple strided memory model. Pauli From shoyer at gmail.com Wed Aug 26 13:59:32 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 26 Aug 2015 10:59:32 -0700 Subject: [Numpy-discussion] Numpy helper function for __getitem__? In-Reply-To: References: <1440353282711.d9fa3274@Nodemailer> <1440404602.2051.14.camel@sipsolutions.net> Message-ID: Indeed, the helper function I wrote for xray was not designed to handle None/np.newaxis or non-1d Boolean indexers, because those are not valid indexers for xray objects. I think it could be straightforwardly extended to handle None simply by not counting them towards the total number of dimensions. On Tue, Aug 25, 2015 at 8:41 AM, Fabien wrote: > I think that Stephan's function for xray is very useful. A possible > improvement (probably at a certain performance cost) would be to be able > to provide a shape instead of a number of dimensions. The output would > then be slices with valid start and ends. > > Current behavior: > In[9]: expanded_indexer(slice(None), 2) > Out[9]: (slice(None, None, None), slice(None, None, None)) > > With shape: > In[9]: expanded_indexer(slice(None), (3, 4)) > Out[9]: (slice(0, 4, 1), slice(0, 5, 1)) > > But if nobody needed something like this before me, I think that I might > have a design problem in my code (still quite new to python). 
> Glad you found it helpful! Python's slice object has the indices method which implements this logic, e.g.,

In [15]: s = slice(None, 10)

In [16]: s.indices(100)
Out[16]: (0, 10, 1)

Cheers,
Stephan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefanv at berkeley.edu Wed Aug 26 18:46:55 2015
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Wed, 26 Aug 2015 15:46:55 -0700
Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015)
In-Reply-To:
References:
Message-ID: <87y4gxllb4.fsf@berkeley.edu>

Hi Matthew

On 2015-08-26 10:50:47, Matthew Brett wrote: > In short, the core structure seems to be characteristically > associated with a conservatism and lack of vision that causes > the project to stagnate.

Can you describe how a democratic governance structure would look? It's not clear from the discussions linked where successful examples are to be found.

Thanks
Stéfan

From daniel.p.bliss at gmail.com Wed Aug 26 18:51:27 2015
From: daniel.p.bliss at gmail.com (Daniel Bliss)
Date: Wed, 26 Aug 2015 15:51:27 -0700
Subject: [Numpy-discussion] Defining a white noise process using numpy
Message-ID:

Hi all,

Can anyone give me some advice for translating this equation into code using numpy?

eta(t) = lim(dt -> 0) N(0, 1/sqrt(dt)),

where N(a, b) is a Gaussian random variable of mean a and variance b**2.

This is a heuristic definition of a white noise process.

Thanks,
Dan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
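[A minimal sketch of one way to read that definition, added here for reference; the grid spacing dt, the sample count, and the seed are arbitrary illustrative choices, not part of the question. On a time grid with spacing dt, the white noise is approximated by independent draws from N(0, 1/sqrt(dt)), i.e. standard deviation 1/sqrt(dt) in the mean/standard-deviation convention used above, so that the time-integral of eta behaves like a Wiener process as dt -> 0.]

import numpy as np

dt = 1e-3                        # grid spacing; eta is only defined in the dt -> 0 limit
n = 10000                        # number of samples to draw
rng = np.random.RandomState(0)   # fixed seed, purely for reproducibility

# Independent draws from N(0, 1/sqrt(dt)): mean 0, standard deviation 1/sqrt(dt).
eta = rng.normal(loc=0.0, scale=1.0 / np.sqrt(dt), size=n)

# Sanity check: the running integral of eta approximates a Wiener process,
# so w[k] should have variance close to the elapsed time (k + 1) * dt.
w = dt * np.cumsum(eta)

Scaled this way, the integrated noise has variance equal to the elapsed time, which is the usual behaviour wanted from a discretized white-noise forcing term in a stochastic differential equation.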
From ben.v.root at gmail.com Wed Aug 26 21:59:43 2015
From: ben.v.root at gmail.com (Benjamin Root)
Date: Wed, 26 Aug 2015 21:59:43 -0400
Subject: [Numpy-discussion] 1.10.0rc1
In-Reply-To:
References: <20150826151141.17db3046@fsol>
Message-ID:

Just a data point, I just tested 1.9.0rc1 (built from source) with matplotlib master, and things appear to be fine there. In fact, matplotlib was built against 1.7.x (I was hunting down a regression), and worked against the 1.9.0 install, so the ABI appears intact.

Cheers!
Ben Root

On Wed, Aug 26, 2015 at 9:52 AM, Charles R Harris wrote: > > > On Wed, Aug 26, 2015 at 7:32 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Wed, Aug 26, 2015 at 7:31 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Wed, Aug 26, 2015 at 7:11 AM, Antoine Pitrou >>> wrote: >>> >>>> On Tue, 25 Aug 2015 10:26:02 -0600 >>>> Charles R Harris wrote: >>>> > Hi All, >>>> > >>>> > The silence after the 1.10 beta has been eerie. Consequently, I'm >>>> thinking >>>> > of making a first release candidate this weekend. If you haven't yet >>>> tested >>>> > the beta, please do so. It would be good to discover as many problems >>>> as we >>>> > can before the first release. >>>> >>>> Has typing of ufunc parameters become much stricter? I can't find >>>> anything in the release notes, but see (1.10b1): >>>> >>>> >>> arr = np.linspace(0, 5, 10) >>>> >>> out = np.empty_like(arr, dtype=np.intp) >>>> >>> np.round(arr, out=out) >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> File >>>> "/home/antoine/np110/lib/python3.4/site-packages/numpy/core/fromnumeric.py", >>>> line 2778, in round_ >>>> return round(decimals, out) >>>> TypeError: ufunc 'rint' output (typecode 'd') could not be coerced to >>>> provided output parameter (typecode 'l') according to the casting rule >>>> ''same_kind'' >>>> >>>> >>>> It used to work (1.9): >>>> >>>> >>> arr = np.linspace(0, 5, 10) >>>> >>> out = np.empty_like(arr, dtype=np.intp) >>>> >>> np.round(arr, out=out) >>>> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) >>>> >>> out >>>> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) >>>> >>> >>> The default casting mode has been changed. I think this has been raising >>> a warning since 1.7 and was mentioned as a future change in 1.10, but you >>> are right, it needs to be mentioned in the 1.10 release notes. >>> >> >> Make that warned of in the 1.9.0 release notes. >> >> > Here it is in 1.9.0 with deprecation warning made visible. > ``` > In [3]: import warnings > > In [4]: warnings.simplefilter('always') > > In [5]: arr = np.linspace(0, 5, 10) > > In [6]: out = np.empty_like(arr, dtype=np.intp) > > In [7]: np.round(arr, out=out) > /home/charris/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py:2640: > DeprecationWarning: Implicitly casting between incompatible kinds. In a > future numpy release, this will raise an error. Use casting="unsafe" if > this is intentional. > return round(decimals, out) > Out[7]: array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > ``` > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com Wed Aug 26 22:52:53 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 26 Aug 2015 19:52:53 -0700
Subject: [Numpy-discussion] 1.10.0rc1
In-Reply-To:
References: <20150826151141.17db3046@fsol>
Message-ID:

On Aug 26, 2015 7:03 PM, "Benjamin Root" wrote:
> > Just a data point, I just tested 1.9.0rc1 (built from source) with matplotlib master, and things appear to be fine there. In fact, matplotlib was built against 1.7.x (I was hunting down a regression), and worked against the 1.9.0 install, so the ABI appears intact.

1.9, or 1.10?
> > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Aug 27 02:33:51 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Aug 2015 23:33:51 -0700 Subject: [Numpy-discussion] [DRAFT] numpy governance document Message-ID: Hi all, Here's a first draft of a governance document for NumPy. A few people have seen sneak peeks and have suggested possibly reorganizing it either by taking some of the "how consensus works" stuff out into a separate document, or alternatively keeping that stuff in the foreground and moving some of the more legalistic bits into footnotes or something. I think there's some value in having a single document that sorta lays out everything a new contributor needs to know, both how things work normally and how they work when we need to fall back on formal processes, and have tried to make the different aspects fit together better in this draft, but I'm happy to reorganize however if that's what people want -- I just figured that I should at least get this out so the content is available for discussion, even if the form isn't perfect yet :-). In case it's useful for context, large chunks of text are taken from the Jupyter/IPython project's IPEP 29: https://github.com/ipython/ipython/wiki/IPEP-29:-Project-Governance There are some random tweaks throughout, but the parts that are mostly new are the "Summary", "Consensus-based decision making by the community", and "Council decision making" sections. -n --------------------- The purpose of this document is to formalize the governance process used by the NumPy project in both ordinary and extraordinary situations, and to clarify how decisions are made and how the various elements of our community interact, including the relationship between open source collaborative development and work that may be funded by for-profit or non-profit entities. Summary ======= NumPy is a community-owned and community-run project. To the maximum extent possible, decisions about project direction are made by community consensus (but note that "consensus" here has a somewhat technical meaning that might not match everyone's expectations -- see below). Some members of the community additionally contribute by serving on the NumPy steering council, where they are responsible for facilitating the establishment of community consensus, for stewarding project resources, and -- in extreme cases -- for making project decisions if the normal community-based process breaks down. The Project =========== The NumPy Project (The Project) is an open source software project affiliated with the 501(c)3 NumFocus Foundation. The goal of The Project is to develop open source software for array-based computing in Python, and in particular the `numpy` package, along with related software such as `f2py` and the NumPy Sphinx extensions. The Software developed by The Project is released under the BSD (or similar) open source license, developed openly and hosted on public GitHub repositories under the `numpy` GitHub organization. The Project is developed by a team of distributed developers, called Contributors. Contributors are individuals who have contributed code, documentation, designs or other work to the Project. Anyone can be a Contributor. Contributors can be affiliated with any legal entity or none. 
Contributors participate in the project by submitting, reviewing and discussing GitHub Pull Requests and Issues and participating in open and public Project discussions on GitHub, mailing lists, and other channels. The foundation of Project participation is openness and transparency. Here is a list of the current Contributors to the main NumPy repository: [https://github.com/numpy/numpy/graphs/contributors](https://github.com/numpy/numpy/graphs/contributors) The Project Community consists of all Contributors and Users of the Project. Contributors work on behalf of and are responsible to the larger Project Community and we strive to keep the barrier between Contributors and Users as low as possible. The Project is formally affiliated with the 501(c)3 NumFOCUS Foundation ([http://numfocus.org](http://numfocus.org)), which serves as its fiscal sponsor, may hold project trademarks and other intellectual property, helps manage project donations and acts as a parent legal entity. NumFOCUS is the only legal entity that has a formal relationship with the project (see Institutional Partners section below). Governance ========== This section describes the governance and leadership model of The Project. The foundations of Project governance are: - Openness & Transparency - Active Contribution - Institutional Neutrality Consensus-based decision making by the community ------------------------------------------------ Normally, all project decisions will be made by consensus of all interested Contributors. The primary goal of this approach is to ensure that the people who are most affected by and involved in any given change can contribute their knowledge in the confidence that their voices will be heard, because thoughtful review from a broad community is the best mechanism we know of for creating high-quality software. The mechanism we use to accomplish this goal may be unfamiliar for those who are not experienced with the cultural norms around free/open-source software development. We provide a summary here, and highly recommend that all Contributors additionally read [Chapter 4: Social and Political Infrastructure](http://producingoss.com/en/producingoss.html#social-infrastructure) of Karl Fogel's classic *Producing Open Source Software*, and in particular the section on [Consensus-based Democracy](http://producingoss.com/en/producingoss.html#consensus-democracy), for a more detailed discussion. In this context, consensus does *not* require: - that we wait to solicit everybody's opinion on every change, - that we ever hold a vote on anything, - or that everybody is happy or agrees with every decision. For us, what consensus means is that we entrust *everyone* with the right to veto any change if they feel it necessary. While this may sound like a recipe for obstruction and pain, this is not what happens. Instead, we find that most people take this responsibility seriously, and only invoke their veto when they judge that a serious problem is being ignored, and that their veto is necessary to protect the project. And in practice, it turns out that such vetoes are almost never formally invoked, because their mere possibility ensures that Contributors are motivated from the start to find some solution that everyone can live with -- thus accomplishing our goal of ensuring that all interested perspectives are taken into account. How do we know when consensus has been achieved? 
In principle, this is rather difficult, since consensus is defined by the absence of vetos, which requires us to somehow prove a negative. In practice, we use a combination of our best judgement (e.g., a simple and uncontroversial bug fix posted on GitHub and reviewed by a core developer is probably fine) and best efforts (e.g., all substantive API changes must be posted to the mailing list in order to give the broader community a chance to catch any problems and suggest improvements; we assume that anyone who cares enough about NumPy to invoke their veto right should be on the mailing list). If no-one bothers to comment on the mailing list after a few days, then it's probably fine. And worst case, if a change is more controversial than expected, or a crucial critique is delayed because someone was on vacation, then it's no big deal: we apologize for misjudging the situation, [back up, and sort things out](http://producingoss.com/en/producingoss.html#version-control-relaxation). If one does need to invoke a formal veto, then it should consist of: - an unambiguous statement that a veto is being invoked, - an explanation of why it is being invoked, and - a description of what conditions (if any) would convince the vetoer to withdraw their veto. If all proposals for resolving some issue are vetoed, then the status quo wins by default. In the worst case, if a Contributor is genuinely misusing their veto in an obstructive fashion to the detriment of the project, then they can be ejected from the project by consensus of the Steering Council -- see below. Steering Council ---------------- The Project will have a Steering Council that consists of Project Contributors who have produced contributions that are substantial in quality and quantity, and sustained over at least one year. The overall role of the Council is to ensure, with input from the Community, the long-term well-being of the project, both technically and as a community. During the everyday project activities, council members participate in all discussions, code review and other project activities as peers with all other Contributors and the Community. In these everyday activities, Council Members do not have any special power or privilege through their membership on the Council. However, it is expected that because of the quality and quantity of their contributions and their expert knowledge of the Project Software and Services that Council Members will provide useful guidance, both technical and in terms of project direction, to potentially less experienced contributors. The Steering Council and its Members play a special role in certain situations. In particular, the Council may, if necessary: - Make decisions about the overall scope, vision and direction of the project. - Make decisions about strategic collaborations with other organizations or individuals. - Make decisions about specific technical issues, features, bugs and pull requests. They are the primary mechanism of guiding the code review process and merging pull requests. - Make decisions about the Services that are run by The Project and manage those Services for the benefit of the Project and Community. - Update policy documents such as this one. - Make decisions when regular community discussion doesn?t produce consensus on an issue in a reasonable time frame. However, the Council's primary responsibility is to facilitate the ordinary community-based decision making procedure described above. 
If we ever have to step in and formally override the community for the health of the Project, then we will do so, but we will consider reaching this point to indicate a failure in our leadership. ### Council decision making If it becomes necessary for the Steering Council to produce a formal decision, then they will use a form of the [Apache Foundation voting process](https://www.apache.org/foundation/voting.html). This is a formalized version of consensus, in which +1 votes indicate agreement, -1 votes are vetoes (and must be accompanied with a rationale, as above), and one can also vote fractionally (e.g. -0.5, +0.5) if one wishes to express an opinion without registering a full veto. These numeric votes are also often used informally as a way of getting a general sense of people's feelings on some issue, and should not normally be taken as formal votes. A formal vote only occurs if explicitly declared, and if this does occur then the vote should be held open for long enough to give all interested Council Members a chance to respond -- at least one week. In practice, we anticipate that for most Steering Council decisions (e.g., voting in new members) a more informal process will suffice. ### Council membership To become eligible to join the Steering Council, an individual must be a Project Contributor who has produced contributions that are substantial in quality and quantity, and sustained over at least one year. Potential Council Members are nominated by existing Council members and voted upon by the existing Council after asking if the potential Member is interested and willing to serve in that capacity. The Council will be initially formed from the set of existing Core Developers who, as of late 2015, have been significantly active over the last year. When considering potential Members, the Council will look at candidates with a comprehensive view of their contributions. This will include but is not limited to code, code review, infrastructure work, mailing list and chat participation, community help/building, education and outreach, design work, etc. We are deliberately not setting arbitrary quantitative metrics (like ?100 commits in this repo?) to avoid encouraging behavior that plays to the metrics rather than the project?s overall well-being. We want to encourage a diverse array of backgrounds, viewpoints and talents in our team, which is why we explicitly do not define code as the sole metric on which council membership will be evaluated. If a Council member becomes inactive in the project for a period of one year, they will be considered for removal from the Council. Before removal, inactive Member will be approached to see if they plan on returning to active participation. If not they will be removed immediately upon a Council vote. If they plan on returning to active participation soon, they will be given a grace period of one year. If they don?t return to active participation within that time period they will be removed by vote of the Council without further grace period. All former Council members can be considered for membership again at any time in the future, like any other Project Contributor. Retired Council members will be listed on the project website, acknowledging the period during which they were active in the Council. The Council reserves the right to eject current Members, if they are deemed to be actively harmful to the project?s well-being, and attempts at communication and conflict resolution have failed. This requires the consensus of the remaining Members. 
[We also have to decide on the initial membership for the Council. While the above text makes pains to distinguish between "committer" and "Council Member", in the past we've pretty much treated them as the same. So to keep things simple and deterministic, I propose that we seed the Council with everyone who has reviewed/merged a pull request since Jan 1, 2014, and move those who haven't used their commit bit in >1.5 years to the emeritus list. Based on the output of git log --grep="^Merge pull request" --since 2014-01-01 | grep Author: | sort -u I believe this would give us an initial Steering Council of: @argriffing, Sebastian Berg, Jaime Fern?ndez del R?o, Ralf Gommers, Charles Harris, Nathaniel Smith, Julian Taylor, and Pauli Virtanen (assuming everyone on that list is interested/willing to serve).] ### Conflict of interest It is expected that the Council Members will be employed at a wide range of companies, universities and non-profit organizations. Because of this, it is possible that Members will have conflict of interests. Such conflict of interests include, but are not limited to: - Financial interests, such as investments, employment or contracting work, outside of The Project that may influence their work on The Project. - Access to proprietary information of their employer that could potentially leak into their work with the Project. All members of the Council shall disclose to the rest of the Council any conflict of interest they may have. Members with a conflict of interest in a particular issue may participate in Council discussions on that issue, but must recuse themselves from voting on the issue. ### Private communications of the Council Unless specifically required, all Council discussions and activities will be public and done in collaboration and discussion with the Project Contributors and Community. The Council will have a private mailing list that will be used sparingly and only when a specific matter requires privacy. When private communications and decisions are needed, the Council will do its best to summarize those to the Community after eliding personal/private/sensitive information that should not be posted to the public internet. ### Subcommittees The Council can create subcommittees that provide leadership and guidance for specific aspects of the project. Like the Council as a whole, subcommittees should conduct their business in an open and public manner unless privacy is specifically called for. Private subcommittee communications should happen on the main private mailing list of the Council unless specifically called for. ### NumFOCUS Subcommittee The Council will maintain one narrowly focused subcommittee to manage its interactions with NumFOCUS. - The NumFOCUS Subcommittee is comprised of 5 persons who manage project funding that comes through NumFOCUS. It is expected that these funds will be spent in a manner that is consistent with the non-profit mission of NumFOCUS and the direction of the Project as determined by the full Council. - This Subcommittee shall NOT make decisions about the direction, scope or technical direction of the Project. - This Subcommittee will have 5 members, 4 of whom will be current Council Members and 1 of whom will be external to the Steering Council. No more than 2 Subcommitee Members can report to one person through employment or contracting work (including the reportee, i.e. the reportee + 1 is the max). This avoids effective majorities resting on one person. 
[Initially, the NumFOCUS subcommittee will consist of: Chuck Harris, Ralf Gommers, Nathaniel Smith, and ???? as internal members, and Thomas Caswell as external member.] Institutional Partners and Funding ================================== The Steering Council are the primary leadership for the project. No outside institution, individual or legal entity has the ability to own, control, usurp or influence the project other than by participating in the Project as Contributors and Council Members. However, because institutions can be an important funding mechanism for the project, it is important to formally acknowledge institutional participation in the project. These are Institutional Partners. An Institutional Contributor is any individual Project Contributor who contributes to the project as part of their official duties at an Institutional Partner. Likewise, an Institutional Council Member is any Project Steering Council Member who contributes to the project as part of their official duties at an Institutional Partner. With these definitions, an Institutional Partner is any recognized legal entity in the United States or elsewhere that employs at least 1 Institutional Contributor of Institutional Council Member. Institutional Partners can be for-profit or non-profit entities. Institutions become eligible to become an Institutional Partner by employing individuals who actively contribute to The Project as part of their official duties. To state this another way, the only way for a Partner to influence the project is by actively contributing to the open development of the project, in equal terms to any other member of the community of Contributors and Council Members. Merely using Project Software in institutional context does not allow an entity to become an Institutional Partner. Financial gifts do not enable an entity to become an Institutional Partner. Once an institution becomes eligible for Institutional Partnership, the Steering Council must nominate and approve the Partnership. If an existing Institutional Partner no longer has a contributing employee, they will be given a 1 year grace period for remaining employees to begin contributing. An Institutional Partner is free to pursue funding for their work on The Project through any legal means. This could involve a non-profit organization raising money from private foundations and donors or a for-profit company building proprietary products and services that leverage Project Software and Services. Funding acquired by Institutional Partners to work on The Project is called Institutional Funding. However, no funding obtained by an Institutional Partner can override the Steering Council. If a Partner has funding to do NumPy work and the Council decides to not pursue that work as a project, the Partner is free to pursue it on their own. However in this situation, that part of the Partner?s work will not be under the NumPy umbrella and cannot use the Project trademarks in a way that suggests a formal relationship. Institutional Partner benefits are: - Acknowledgement on the NumPy websites, in talks and T-shirts. - Ability to acknowledge their own funding sources on the NumPy websites, in talks and T-shirts. - Ability to influence the project through the participation of their Council Member. - Council Members invited to NumPy Developer Meetings. 
Existing Institutional Partners: - UC Berkeley (Nathaniel Smith) Acknowledgements ================ Substantial portions of this document were ~~inspired~~ stolen wholesale from the Jupyter/IPython project's governance document, [IPEP 29](https://github.com/ipython/ipython/wiki/IPEP-29:-Project-Governance). -- Nathaniel J. Smith -- http://vorpus.org From matthew.brett at gmail.com Thu Aug 27 04:36:27 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 09:36:27 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: <87y4gxllb4.fsf@berkeley.edu> References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: Hi, On Wed, Aug 26, 2015 at 11:46 PM, Stefan van der Walt wrote: > Hi Matthew > > On 2015-08-26 10:50:47, Matthew Brett > wrote: >> In short, the core structure seems to be characteristically >> associated with a conservatism and lack of vision that causes >> the project to stagnate. > > Can you describe how a democratic governance structure would look? > It's not clear from the discussions linked where successful > examples are to be found. Ah yes - as I was writing at the top of the xfree86 summary, it's difficult to assess governance models, because you cannot tell if a project that has a particular governance model would have been more successful with another model. For example, would clang be competing so successfully with gcc, if gcc had had a different governance model? Would Apache be further ahead of the many competitors in the web-server space with different management? Difficult to know. The advantage of studying forks is that they usually arise from disagreements about how a project is managed. All other things being equal, we might expect a fork to fail, given the general aversion to forks and the considerable new work that has to be done to get one going. So, if a fork succeeds in the long term, that is probably an indication that the governance / management of the fork is indeed an improvement on the previous model. So, in answer to your question, it's difficult to know if a particular governance model is successful. It isn't enough that a project has lasted, or is still active, because there are so many factors in play. On the other hand, I think it is possible to point to models that have a tendency to fail in particular ways, and the by-invitation meritocratic 'core' group (I think this is close to the 'steering committee' in our current draft) is the model that failed for NetBSD and XFree86, with a particular pattern of poor or absent accountability and lack of project vision. Cheers, Matthew From matthew.brett at gmail.com Thu Aug 27 04:44:05 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 09:44:05 +0100 Subject: [Numpy-discussion] [DRAFT] numpy governance document In-Reply-To: References: Message-ID: Hi, On Thu, Aug 27, 2015 at 7:33 AM, Nathaniel Smith wrote: > Hi all, > > Here's a first draft of a governance document for NumPy. Thanks for this. I wasn't sure from your email whether you were asking for feedback as to whether this was the right governance model? I mean that - for code - I think the usual procedure would be to discuss various potential solutions on the mailing list, and then follow up with something like a NEP that lays out the various alternatives with their pros and cons. But I have the impression here that you consider the general form to be set, and that you are asking for comments on the detail. Is that right? 
Cheers, Matthew From bryanv at continuum.io Thu Aug 27 05:15:36 2015 From: bryanv at continuum.io (Bryan Van de Ven) Date: Thu, 27 Aug 2015 10:15:36 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: > On Aug 27, 2015, at 9:36 AM, Matthew Brett wrote: > > > So, in answer to your question, it's difficult to know if a particular > governance model is successful. It isn't enough that a project has > lasted, or is still active, because there are so many factors in play. > On the other hand, I think it is possible to point to models that > have a tendency to fail in particular ways, and the by-invitation > meritocratic 'core' group (I think this is close to the 'steering > committee' in our current draft) is the model that failed for NetBSD > and XFree86, with a particular pattern of poor or absent > accountability and lack of project vision. Anecdotes about two projects is not compelling evidence of anything unless you can also point to a comparison of the corresponding success rate. Two failures out of three is suggestive. Two failures out of three hundred is significantly less interesting. More useful would be actual details of an alternative proposal or pointers to examples of alternative arrangements that could be modeled. Bryan From matthew.brett at gmail.com Thu Aug 27 05:16:31 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 10:16:31 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: On Thu, Aug 27, 2015 at 9:36 AM, Matthew Brett wrote: > Hi, > > On Wed, Aug 26, 2015 at 11:46 PM, Stefan van der Walt > wrote: >> Hi Matthew >> >> On 2015-08-26 10:50:47, Matthew Brett >> wrote: >>> In short, the core structure seems to be characteristically >>> associated with a conservatism and lack of vision that causes >>> the project to stagnate. >> >> Can you describe how a democratic governance structure would look? >> It's not clear from the discussions linked where successful >> examples are to be found. > > Ah yes - as I was writing at the top of the xfree86 summary, it's > difficult to assess governance models, because you cannot tell if a > project that has a particular governance model would have been more > successful with another model. For example, would clang be competing > so successfully with gcc, if gcc had had a different governance model? > Would Apache be further ahead of the many competitors in the > web-server space with different management? Difficult to know. > > The advantage of studying forks is that they usually arise from > disagreements about how a project is managed. All other things being > equal, we might expect a fork to fail, given the general aversion to > forks and the considerable new work that has to be done to get one > going. So, if a fork succeeds in the long term, that is probably an > indication that the governance / management of the fork is indeed an > improvement on the previous model. > > So, in answer to your question, it's difficult to know if a particular > governance model is successful. It isn't enough that a project has > lasted, or is still active, because there are so many factors in play. 
> On the other hand, I think it is possible to point to models that > have a tendency to fail in particular ways, and the by-invitation > meritocratic 'core' group (I think this is close to the 'steering > committee' in our current draft) is the model that failed for NetBSD > and XFree86, with a particular pattern of poor or absent > accountability and lack of project vision. Sorry to follow up on my own email, but: I'm just speculating here, without data, but I suspect one the key elements that led to the decline and fall of NetBSD and XFree86 was the perception that there was no way for the community to depose the government. It seems these projects managed to combine aspects of the dictatorship model, with lots of emphasis on personal loyalty and expected gratitude, with a dysfunctional oligarchy, in which no-one felt able or willing to change the project direction, when the project was failing. The other problem with the meritocracy / invitation model, is that some people are terrible managers. In the XFree86 project, for example, I think David Dawes did a terrible job of guiding the project when it ran into trouble. He was in the position he was in because of his huge commitment and contributions to the project, but I think he was not the right person to manage the project. The standard 'core' model, doesn't take that into account. For example, I suspect that, if we had a David Dawes, no matter how terrible we thought they were at managing, we would feel obliged to put them onto the steering committee. It is much easier to count or review commits than it is to assess someone for their qualities as a leader or manager. We most of us would hate to be the person to make that assessment, and it's very tempting to negotiate ourselves into a world-view where this assessment is not necessary. So, I speculate, that a good governance model would have: * one 'president' who has to take final responsibility for all decisions; * this president might well have a fixed term, maybe with limits on the number of terms they can serve. * the president would be chosen by community vote and explicitly on the basis that they were good managers as well as coders; * for the presidential election, the candidates should set out what their vision for the project is, and how they plan to achieve that vision; The point about these features is that we explicitly emphasize accountability, vision and management ability. Instead of a small number of people being in the position of assessing their peers for their ability to manage, the whole community (somehow defined) takes responsibility for that assessment, therefore making it easier to think about without distracting issues of personal loyalty or implied obligation. See you, Matthew From matthew.brett at gmail.com Thu Aug 27 05:22:32 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 10:22:32 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: Hi, On Thu, Aug 27, 2015 at 10:15 AM, Bryan Van de Ven wrote: > >> On Aug 27, 2015, at 9:36 AM, Matthew Brett wrote: >> >> >> So, in answer to your question, it's difficult to know if a particular >> governance model is successful. It isn't enough that a project has >> lasted, or is still active, because there are so many factors in play. 
>> On the other hand, I think it is possible to point to models that >> have a tendency to fail in particular ways, and the by-invitation >> meritocratic 'core' group (I think this is close to the 'steering >> committee' in our current draft) is the model that failed for NetBSD >> and XFree86, with a particular pattern of poor or absent >> accountability and lack of project vision. > > Anecdotes about two projects is not compelling evidence of anything unless you can also point to a comparison of the corresponding success rate. Two failures out of three is suggestive. Two failures out of three hundred is significantly less interesting. More useful would be actual details of an alternative proposal or pointers to examples of alternative arrangements that could be modeled. > Unfortunately, I don't think we have much choice but to do our best in sifting through the anecdotal evidence we have available, weak and contradictory as it is. Successful forks in large projects are pretty rare, and as I was arguing before, they are particularly useful as evidence about governance models. In the case of the 'core' model, we have some compelling testimony from someone with a great deal of experience: """ Much of this early structure (CVS, web site, cabal ["core" group], etc.) was copied verbatim by other open source (this term not being in wide use yet) projects -- even the form of the project name and the term "core". This later became a kind of standard template for starting up an open source project. [...] I'm sorry to say that I helped create this problem, and that most of the projects which modeled themselves after NetBSD (probably due to its high popularity in 1993 and 1994) have suffered similar problems. FreeBSD and XFree86, for example, have both forked successor projects (Dragonfly and X.org) for very similar reasons. """ http://mail-index.netbsd.org/netbsd-users/2006/08/30/0016.html Cheers, Matthew From bryanv at continuum.io Thu Aug 27 05:35:01 2015 From: bryanv at continuum.io (Bryan Van de Ven) Date: Thu, 27 Aug 2015 10:35:01 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: > On Aug 27, 2015, at 10:22 AM, Matthew Brett wrote: > > In the case of the 'core' model, we have some compelling testimony > from someone with a great deal of experience: > > """ > Much of this early structure (CVS, web site, cabal ["core" group], > etc.) was copied verbatim by other open source (this term not being in > wide use yet) projects -- even the form of the project name and the > term "core". This later became a kind of standard template for > starting up an open source project. [...] I'm sorry to say that I > helped create this problem, and that most of the projects which > modeled themselves after NetBSD (probably due to its high popularity > in 1993 and 1994) have suffered similar problems. FreeBSD and XFree86, > for example, have both forked successor projects (Dragonfly and X.org) > for very similar reasons. > """ Who goes on to propose: 7) The "core" group must be replaced with people who are actually competent and dedicated enough to review proposals, accept feedback, and make good decisions. More to the point, though, the "core" group must only act when *needed* -- most technical decisions should be left to the community to hash out; it must not preempt the community from developing better solutions. 
(This is how the "core" group worked during most of the project's growth period.) Which, AFAICT, is exactly in line with the Numpy proposal: """ During the everyday project activities, council members participate in all discussions, code review and other project activities as peers with all other Contributors and the Community. In these everyday activities, Council Members do not have any special power or privilege through their membership on the Council. ... However, the Council's primary responsibility is to facilitate the ordinary community-based decision making procedure described above. If we ever have to step in and formally override the community for the health of the Project, then we will do so, but we will consider reaching this point to indicate a failure in our leadership. """ Bryan From matthew.brett at gmail.com Thu Aug 27 05:45:48 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 10:45:48 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: Hi, On Thu, Aug 27, 2015 at 10:35 AM, Bryan Van de Ven wrote: > >> On Aug 27, 2015, at 10:22 AM, Matthew Brett wrote: >> >> In the case of the 'core' model, we have some compelling testimony >> from someone with a great deal of experience: >> >> """ >> Much of this early structure (CVS, web site, cabal ["core" group], >> etc.) was copied verbatim by other open source (this term not being in >> wide use yet) projects -- even the form of the project name and the >> term "core". This later became a kind of standard template for >> starting up an open source project. [...] I'm sorry to say that I >> helped create this problem, and that most of the projects which >> modeled themselves after NetBSD (probably due to its high popularity >> in 1993 and 1994) have suffered similar problems. FreeBSD and XFree86, >> for example, have both forked successor projects (Dragonfly and X.org) >> for very similar reasons. >> """ > > Who goes on to propose: > > 7) The "core" group must be replaced with people who are actually > competent and dedicated enough to review proposals, accept feedback, > and make good decisions. More to the point, though, the "core" group > must only act when *needed* -- most technical decisions should be > left to the community to hash out; it must not preempt the community > from developing better solutions. (This is how the "core" group > worked during most of the project's growth period.) Sure. I think it's reasonable to give high weight to Hannum's assessment of the failure of the core group, but less weight to his proposal for a replacement, because at the time, I don't believe he was in a good position to assess whether his (apparent) alternative would run into the same trouble. It's always tempting to blame the people rather than the system, but in this case, I strongly suspect that it was the system that was fundamentally flawed, therefore either promoting the wrong people or putting otherwise competent people into situations where they are no longer getting useful feedback. It would be great, and very convenient, if the only management we needed was getting out of the way, but I doubt very much that that is the case. 
Cheers, Matthew From njs at pobox.com Thu Aug 27 06:05:11 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 27 Aug 2015 03:05:11 -0700 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: On Thu, Aug 27, 2015 at 2:16 AM, Matthew Brett wrote: > So, I speculate, that a good governance model would have: > > * one 'president' who has to take final responsibility for all decisions; > * this president might well have a fixed term, maybe with limits on > the number of terms they can serve. > * the president would be chosen by community vote and explicitly on > the basis that they were good managers as well as coders; > * for the presidential election, the candidates should set out what > their vision for the project is, and how they plan to achieve that > vision; We actually discussed some variants on this kind of idea at the meeting, and I think the general sense of those present was we didn't want to go there (for whatever that's worth). At least personally, I have to admit that the idea of a governance model involving elections fills me with creeping horror. The reason is that whole point of having a governance model (IMHO) is to (a) minimize the rise of interpersonal drama, (b) when some amount of interpersonal drama does inevitably arise anyway, provide some regulated channel for it, hopefully one that leads to a drama sink. But elections are a huge massive drama source. No-one wants to spend time campaigning or wondering how some technical proposal will effect their re-election chances, we want to get this sorted out so that we can stop thinking about it and go back to solving actually interesting problems... As for evidence... there are obviously projects that have had serious problems with some variant of core team model, but there are also many many successful projects that are also using variants of this model, and the document I sent around attempts to incorporate the lessons that have been learned in the process. OTOH after wracking my brain I think the only project I'm familiar with that has elections at all like this is Fedora, which elects... a "core team" (FESCo). Given that we don't have the problem of trying to manage thousands of contributors, I'm not sure their experience is really relevant. Or I guess Debian's use of General Resolutions as a decision-making procedure of last resort is kinda relevant, but... pretty different. (They also elect the project leader, which is more similar to what you're describing, but the project leader has no technical authority; in Debian the final authority short of a GR is the CTTE, which is explicitly designed as a classic beholden-to-nobody institution -- and even overriding the CTTE requires a supermajority.) I kinda feel like... as a rule of thumb, if your description of your governance model starts with the words "I speculate that...", then NumPy is probably not a good project to use for your experiment? -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Thu Aug 27 06:10:32 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 27 Aug 2015 03:10:32 -0700 Subject: [Numpy-discussion] [DRAFT] numpy governance document In-Reply-To: References: Message-ID: On Thu, Aug 27, 2015 at 1:44 AM, Matthew Brett wrote: > Hi, > > On Thu, Aug 27, 2015 at 7:33 AM, Nathaniel Smith wrote: >> Hi all, >> >> Here's a first draft of a governance document for NumPy. > > Thanks for this. 
> > I wasn't sure from your email whether you were asking for feedback as > to whether this was the right governance model? > > I mean that - for code - I think the usual procedure would be to > discuss various potential solutions on the mailing list, and then > follow up with something like a NEP that lays out the various > alternatives with their pros and cons. But I have the impression here > that you consider the general form to be set, and that you are asking > for comments on the detail. Is that right? I believe that the draft I sent around does reflect the consensus of those who were present at the dev meeting, but (as the document itself emphasizes!) of course it's helpful to hear critiques and concerns and ideas for how to do better... -- Nathaniel J. Smith -- http://vorpus.org From matthew.brett at gmail.com Thu Aug 27 06:15:44 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 11:15:44 +0100 Subject: [Numpy-discussion] [DRAFT] numpy governance document In-Reply-To: References: Message-ID: Hi, On Thu, Aug 27, 2015 at 11:10 AM, Nathaniel Smith wrote: > On Thu, Aug 27, 2015 at 1:44 AM, Matthew Brett wrote: >> Hi, >> >> On Thu, Aug 27, 2015 at 7:33 AM, Nathaniel Smith wrote: >>> Hi all, >>> >>> Here's a first draft of a governance document for NumPy. >> >> Thanks for this. >> >> I wasn't sure from your email whether you were asking for feedback as >> to whether this was the right governance model? >> >> I mean that - for code - I think the usual procedure would be to >> discuss various potential solutions on the mailing list, and then >> follow up with something like a NEP that lays out the various >> alternatives with their pros and cons. But I have the impression here >> that you consider the general form to be set, and that you are asking >> for comments on the detail. Is that right? > > I believe that the draft I sent around does reflect the consensus of > those who were present at the dev meeting, but (as the document itself > emphasizes!) of course it's helpful to hear critiques and concerns and > ideas for how to do better... I imagine this document was designed to be uncontroversial, in the sense that it more or less formalizes the status quo? I think it would be useful to set out alternatives that were or could be considered. It would be a shame to drift into the wrong governance model for lack of considering others. Cheers, Matthew From matthew.brett at gmail.com Thu Aug 27 06:21:54 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 11:21:54 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: Hi, On Thu, Aug 27, 2015 at 11:05 AM, Nathaniel Smith wrote: > On Thu, Aug 27, 2015 at 2:16 AM, Matthew Brett wrote: >> So, I speculate, that a good governance model would have: >> >> * one 'president' who has to take final responsibility for all decisions; >> * this president might well have a fixed term, maybe with limits on >> the number of terms they can serve. 
>> * the president would be chosen by community vote and explicitly on >> the basis that they were good managers as well as coders; >> * for the presidential election, the candidates should set out what >> their vision for the project is, and how they plan to achieve that >> vision; > > We actually discussed some variants on this kind of idea at the > meeting, and I think the general sense of those present was we didn't > want to go there (for whatever that's worth). > > At least personally, I have to admit that the idea of a governance > model involving elections fills me with creeping horror. The reason is > that whole point of having a governance model (IMHO) is to (a) > minimize the rise of interpersonal drama Right - I think this is key to the problem in this model. It is designed not to cause any trouble, and to keep things running as they are without controversy. It works OK on average as long as 'no change' is the desired outcome. In general the core group know each other fairly well, and feel a sense of shared loyalty to the group. This loyalty is exercised when some outside or inside force challenges the direction of the project. This was what made it so hard for the XFree86 core group to pull back from the course they had set. The question is - is avoiding the potential controversy important enough to force us into a model that has (in my opinion) a high risk of tending to conservatism and stagnation? > As for evidence... there are obviously projects that have had serious > problems with some variant of core team model, but there are also many > many successful projects that are also using variants of this model, > and the document I sent around attempts to incorporate the lessons > that have been learned in the process. OTOH after wracking my brain I > think the only project I'm familiar with that has elections at all > like this is Fedora, which elects... a "core team" (FESCo). Given that > we don't have the problem of trying to manage thousands of > contributors, I'm not sure their experience is really relevant. Or I > guess Debian's use of General Resolutions as a decision-making > procedure of last resort is kinda relevant, but... pretty different. > (They also elect the project leader, which is more similar to what > you're describing, but the project leader has no technical authority; > in Debian the final authority short of a GR is the CTTE, which is > explicitly designed as a classic beholden-to-nobody institution -- and > even overriding the CTTE requires a supermajority.) > > I kinda feel like... as a rule of thumb, if your description of your > governance model starts with the words "I speculate that...", then > NumPy is probably not a good project to use for your experiment? So my argument would be that our current data on success (lots of projects use this model and many of them are OK) is much less useful than the data from successful forks. I suppose my question ends up being - do you agree that the core model does have these risks? Do they worry you? What do you think we can do to guard against them? 
Cheers, Matthew From solipsis at pitrou.net Thu Aug 27 06:44:33 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 27 Aug 2015 12:44:33 +0200 Subject: [Numpy-discussion] 1.10.0rc1 References: <20150826151141.17db3046@fsol> Message-ID: <20150827124433.51dde6b6@fsol> Hi again, The change seems to have possibly unforeseen consequences because some ufuncs don't declare all possible types, e.g.: >>> a = np.arange(10, dtype=np.int32) >>> out = np.zeros_like(a) >>> np.fabs(a, out=out) Traceback (most recent call last): File "", line 1, in TypeError: ufunc 'fabs' output (typecode 'd') could not be coerced to provided output parameter (typecode 'i') according to the casting rule ''same_kind'' >>> np.fabs.types ['e->e', 'f->f', 'd->d', 'g->g', 'O->O'] (while fabs() wouldn't necessarily make sense on complex numbers, it does make sense on integers... and, ah, I've just noticed that np.abs() also exists with more input types, which is confusing...) Regards Antoine. On Wed, 26 Aug 2015 07:52:09 -0600 Charles R Harris wrote: > On Wed, Aug 26, 2015 at 7:32 AM, Charles R Harris > wrote: > > > > > > > On Wed, Aug 26, 2015 at 7:31 AM, Charles R Harris < > > charlesr.harris at gmail.com> wrote: > > > >> > >> > >> On Wed, Aug 26, 2015 at 7:11 AM, Antoine Pitrou > >> wrote: > >> > >>> On Tue, 25 Aug 2015 10:26:02 -0600 > >>> Charles R Harris wrote: > >>> > Hi All, > >>> > > >>> > The silence after the 1.10 beta has been eerie. Consequently, I'm > >>> thinking > >>> > of making a first release candidate this weekend. If you haven't yet > >>> tested > >>> > the beta, please do so. It would be good to discover as many problems > >>> as we > >>> > can before the first release. > >>> > >>> Has typing of ufunc parameters become much stricter? I can't find > >>> anything in the release notes, but see (1.10b1): > >>> > >>> >>> arr = np.linspace(0, 5, 10) > >>> >>> out = np.empty_like(arr, dtype=np.intp) > >>> >>> np.round(arr, out=out) > >>> Traceback (most recent call last): > >>> File "", line 1, in > >>> File > >>> "/home/antoine/np110/lib/python3.4/site-packages/numpy/core/fromnumeric.py", > >>> line 2778, in round_ > >>> return round(decimals, out) > >>> TypeError: ufunc 'rint' output (typecode 'd') could not be coerced to > >>> provided output parameter (typecode 'l') according to the casting rule > >>> ''same_kind'' > >>> > >>> > >>> It used to work (1.9): > >>> > >>> >>> arr = np.linspace(0, 5, 10) > >>> >>> out = np.empty_like(arr, dtype=np.intp) > >>> >>> np.round(arr, out=out) > >>> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > >>> >>> out > >>> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > >>> > >> > >> The default casting mode has been changed. I think this has been raising > >> a warning since 1.7 and was mentioned as a future change in 1.10, but you > >> are right, it needs to be mentioned in the 1.10 release notes. > >> > > > > Make that warned of in the 1.9.0 release notes. > > > > > Here it is in 1.9.0 with deprecation warning made visible. > ``` > In [3]: import warnings > > In [4]: warnings.simplefilter('always') > > In [5]: arr = np.linspace(0, 5, 10) > > In [6]: out = np.empty_like(arr, dtype=np.intp) > > In [7]: np.round(arr, out=out) > /home/charris/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py:2640: > DeprecationWarning: Implicitly casting between incompatible kinds. In a > future numpy release, this will raise an error. Use casting="unsafe" if > this is intentional. 
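# --- illustrative aside (not part of the quoted session above) ---
# Assuming the 1.10 default ufunc casting rule of 'same_kind' discussed in this
# thread, a float-valued ufunc can no longer write implicitly into an integer
# out= array.  Asking the underlying ufunc for 'unsafe' casting restores the
# old behaviour explicitly:
import numpy as np
arr = np.linspace(0, 5, 10)
out = np.empty_like(arr, dtype=np.intp)
np.rint(arr, out=out, casting='unsafe')   # np.round itself takes no casting argument
# Or round in floating point and cast afterwards:
rounded = np.round(arr).astype(np.intp)
# For integer input to fabs, np.abs provides integer loops, so no cast is needed:
a = np.arange(10, dtype=np.int32)
result = np.abs(a)
# --- end of aside ---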
> return round(decimals, out) > Out[7]: array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > ``` > > Chuck > From sebastian at sipsolutions.net Thu Aug 27 07:11:13 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 27 Aug 2015 13:11:13 +0200 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> Message-ID: <1440673873.1694.33.camel@sipsolutions.net> On Do, 2015-08-27 at 10:45 +0100, Matthew Brett wrote: > Hi, > > On Thu, Aug 27, 2015 at 10:35 AM, Bryan Van de Ven wrote: > > > >> On Aug 27, 2015, at 10:22 AM, Matthew Brett wrote: > >> > >> In the case of the 'core' model, we have some compelling testimony > >> from someone with a great deal of experience: > >> > >> """ > >> Much of this early structure (CVS, web site, cabal ["core" group], > >> etc.) was copied verbatim by other open source (this term not being in > >> wide use yet) projects -- even the form of the project name and the > >> term "core". This later became a kind of standard template for > >> starting up an open source project. [...] I'm sorry to say that I > >> helped create this problem, and that most of the projects which > >> modeled themselves after NetBSD (probably due to its high popularity > >> in 1993 and 1994) have suffered similar problems. FreeBSD and XFree86, > >> for example, have both forked successor projects (Dragonfly and X.org) > >> for very similar reasons. > >> """ > > > > Who goes on to propose: > > > > 7) The "core" group must be replaced with people who are actually > > competent and dedicated enough to review proposals, accept feedback, > > and make good decisions. More to the point, though, the "core" group > > must only act when *needed* -- most technical decisions should be > > left to the community to hash out; it must not preempt the community > > from developing better solutions. (This is how the "core" group > > worked during most of the project's growth period.) > > Sure. I think it's reasonable to give high weight to Hannum's > assessment of the failure of the core group, but less weight to his > proposal for a replacement, because at the time, I don't believe he > was in a good position to assess whether his (apparent) alternative > would run into the same trouble. > > It's always tempting to blame the people rather than the system, but > in this case, I strongly suspect that it was the system that was > fundamentally flawed, therefore either promoting the wrong people or > putting otherwise competent people into situations where they are no > longer getting useful feedback. Maybe so. I do not know much at all about these models, but I am not sure how much applies here to numpy. Isn't at least FreeBSD a magnitude larger then numpy? We do need to have some formality about how to give out commit rights, and do final decision when all else fails. One thing I do not know is how a "community vote" could work at all, considering I do not even know how to count its members. Votes and presidents make sense to me for large projects with hundrets of developers on different corners (think of the gnome foundation, debian probably) [1]. One thing I could imagine adding is that the community should be encouraged to ask for/propose new members for the "core" team. Nobody is particularly in love with this model, but maybe out of our own ignorance, we do not see many alternatives after ruling out a BDFL. Yes, it is a lot fixing a status quo, but we have to fixate something. 
Any alternative suggestions are welcome and would be even after deciding on this. Though maybe that takes away some momentum. - Sebastian [1] If we were to have a central governance for the SciPy stack the story would seem very different to me. > > It would be great, and very convenient, if the only management we > needed was getting out of the way, but I doubt very much that that is > the case. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From ndbecker2 at gmail.com Thu Aug 27 07:23:25 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 27 Aug 2015 07:23:25 -0400 Subject: [Numpy-discussion] Defining a white noise process using numpy References: Message-ID: Daniel Bliss wrote: > Hi all, > > Can anyone give me some advice for translating this equation into code > using numpy? > > eta(t) = lim(dt -> 0) N(0, 1/sqrt(dt)), > > where N(a, b) is a Gaussian random variable of mean a and variance b**2. > > This is a heuristic definition of a white noise process. > > Thanks, > Dan You want noise with infinite variance? That doesn't make sense. From archibald at astron.nl Thu Aug 27 07:37:54 2015 From: archibald at astron.nl (Anne Archibald) Date: Thu, 27 Aug 2015 11:37:54 +0000 Subject: [Numpy-discussion] Defining a white noise process using numpy In-Reply-To: References: Message-ID: On Thu, Aug 27, 2015 at 12:51 AM Daniel Bliss wrote: Can anyone give me some advice for translating this equation into code > using numpy? > > eta(t) = lim(dt -> 0) N(0, 1/sqrt(dt)), > > where N(a, b) is a Gaussian random variable of mean a and variance b**2. > > This is a heuristic definition of a white noise process. > This is an abstract definition. How to express it in numpy will depend on what you want to do with it. The easiest and most likely thing you could want would be a time series, with N time steps dt, in which sample i is the average value of the white noise process from i*dt to (i+1)*dt. This is very easy to write in numpy: 1/np.sqrt(dt) * np.random.randn(N) Anne -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Aug 27 07:44:24 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 27 Aug 2015 13:44:24 +0200 Subject: [Numpy-discussion] 1.10.0rc1 References: <20150826151141.17db3046@fsol> Message-ID: <20150827134424.38ec9eac@fsol> The change also seems to have made datetime64 computations stricter: >>> np.datetime64('2010') - np.datetime64('2000-01-01') numpy.timedelta64(3653,'D') >>> np.datetime64('2010') - np.datetime64('2000-01-01T00:00:00Z') Traceback (most recent call last): File "", line 1, in TypeError: Cannot cast ufunc subtract input from dtype(' wrote: > On Wed, Aug 26, 2015 at 7:32 AM, Charles R Harris > wrote: > > > > > > > On Wed, Aug 26, 2015 at 7:31 AM, Charles R Harris < > > charlesr.harris at gmail.com> wrote: > > > >> > >> > >> On Wed, Aug 26, 2015 at 7:11 AM, Antoine Pitrou > >> wrote: > >> > >>> On Tue, 25 Aug 2015 10:26:02 -0600 > >>> Charles R Harris wrote: > >>> > Hi All, > >>> > > >>> > The silence after the 1.10 beta has been eerie. Consequently, I'm > >>> thinking > >>> > of making a first release candidate this weekend. 
If you haven't yet > >>> tested > >>> > the beta, please do so. It would be good to discover as many problems > >>> as we > >>> > can before the first release. > >>> > >>> Has typing of ufunc parameters become much stricter? I can't find > >>> anything in the release notes, but see (1.10b1): > >>> > >>> >>> arr = np.linspace(0, 5, 10) > >>> >>> out = np.empty_like(arr, dtype=np.intp) > >>> >>> np.round(arr, out=out) > >>> Traceback (most recent call last): > >>> File "", line 1, in > >>> File > >>> "/home/antoine/np110/lib/python3.4/site-packages/numpy/core/fromnumeric.py", > >>> line 2778, in round_ > >>> return round(decimals, out) > >>> TypeError: ufunc 'rint' output (typecode 'd') could not be coerced to > >>> provided output parameter (typecode 'l') according to the casting rule > >>> ''same_kind'' > >>> > >>> > >>> It used to work (1.9): > >>> > >>> >>> arr = np.linspace(0, 5, 10) > >>> >>> out = np.empty_like(arr, dtype=np.intp) > >>> >>> np.round(arr, out=out) > >>> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > >>> >>> out > >>> array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > >>> > >> > >> The default casting mode has been changed. I think this has been raising > >> a warning since 1.7 and was mentioned as a future change in 1.10, but you > >> are right, it needs to be mentioned in the 1.10 release notes. > >> > > > > Make that warned of in the 1.9.0 release notes. > > > > > Here it is in 1.9.0 with deprecation warning made visible. > ``` > In [3]: import warnings > > In [4]: warnings.simplefilter('always') > > In [5]: arr = np.linspace(0, 5, 10) > > In [6]: out = np.empty_like(arr, dtype=np.intp) > > In [7]: np.round(arr, out=out) > /home/charris/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py:2640: > DeprecationWarning: Implicitly casting between incompatible kinds. In a > future numpy release, this will raise an error. Use casting="unsafe" if > this is intentional. > return round(decimals, out) > Out[7]: array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]) > ``` > > Chuck > From matthew.brett at gmail.com Thu Aug 27 08:57:46 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 13:57:46 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: <1440673873.1694.33.camel@sipsolutions.net> References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: Hi, On Thu, Aug 27, 2015 at 12:11 PM, Sebastian Berg wrote: > On Do, 2015-08-27 at 10:45 +0100, Matthew Brett wrote: >> Hi, >> >> On Thu, Aug 27, 2015 at 10:35 AM, Bryan Van de Ven wrote: >> > >> >> On Aug 27, 2015, at 10:22 AM, Matthew Brett wrote: >> >> >> >> In the case of the 'core' model, we have some compelling testimony >> >> from someone with a great deal of experience: >> >> >> >> """ >> >> Much of this early structure (CVS, web site, cabal ["core" group], >> >> etc.) was copied verbatim by other open source (this term not being in >> >> wide use yet) projects -- even the form of the project name and the >> >> term "core". This later became a kind of standard template for >> >> starting up an open source project. [...] I'm sorry to say that I >> >> helped create this problem, and that most of the projects which >> >> modeled themselves after NetBSD (probably due to its high popularity >> >> in 1993 and 1994) have suffered similar problems. FreeBSD and XFree86, >> >> for example, have both forked successor projects (Dragonfly and X.org) >> >> for very similar reasons. 
>> >> """ >> > >> > Who goes on to propose: >> > >> > 7) The "core" group must be replaced with people who are actually >> > competent and dedicated enough to review proposals, accept feedback, >> > and make good decisions. More to the point, though, the "core" group >> > must only act when *needed* -- most technical decisions should be >> > left to the community to hash out; it must not preempt the community >> > from developing better solutions. (This is how the "core" group >> > worked during most of the project's growth period.) >> >> Sure. I think it's reasonable to give high weight to Hannum's >> assessment of the failure of the core group, but less weight to his >> proposal for a replacement, because at the time, I don't believe he >> was in a good position to assess whether his (apparent) alternative >> would run into the same trouble. >> >> It's always tempting to blame the people rather than the system, but >> in this case, I strongly suspect that it was the system that was >> fundamentally flawed, therefore either promoting the wrong people or >> putting otherwise competent people into situations where they are no >> longer getting useful feedback. > > Maybe so. I do not know much at all about these models, but I am not > sure how much applies here to numpy. Isn't at least FreeBSD a magnitude > larger then numpy? It seems to me that numpy suffers from the same risks of poor accountability, stagnation and conservatism that larger projects do. Is there a reason that would not be the case? > We do need to have some formality about how to give out commit rights, > and do final decision when all else fails. Yes, sure, something formal is probably but not certainly better than nothing, depending on what the 'something formal' is. > One thing I do not know is how a "community vote" could work at all, > considering I do not even know how to count its members. Votes and > presidents make sense to me for large projects with hundrets of > developers on different corners (think of the gnome foundation, debian > probably) [1]. The 'president' idea is to get at the problem of lack of accountability, along with selection for leadership skill rather than coding ability. It's trying to get at the advantages of the BDFL model in our situation where there is no obvious BDFL. For the me the problem is that, at the moment, if the formal or informal governing body makes a bad decision, then no member will feel responsible for that decision or its consequences. That tends to lead to an atmosphere of - "oh well, what could we do, X wouldn't agree to A and Y wouldn't agree to B so we're stuck". It seems to me we need a system such that whoever is in charge feels so strongly that it is their job to make numpy as good as possible, that they will take whatever difficult or sensitive decisions are necessary to make that happen. On the other hand the 'core' system seems to function on a model of mutual deference and personal loyalty that I believe is destructive of good management. Cheers, Matthew From ben.v.root at gmail.com Thu Aug 27 09:52:31 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 27 Aug 2015 09:52:31 -0400 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: <20150827134424.38ec9eac@fsol> References: <20150826151141.17db3046@fsol> <20150827134424.38ec9eac@fsol> Message-ID: Ok, I tested matplotlib master against numpy master, and there were no errors. 
I did get a bunch of new deprecation warnings though such as: "/nas/home/broot/centos6/lib/python2.7/site-packages/matplotlib-1.5.dev1-py2.7-linux-x86_64.egg/matplotlib/colorbar.py:539: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 5 but corresponding boolean dimension is 3 colors = np.asarray(colors)[igood]" The message isn't exactly clear. I suspect the problem is a shape mismatch, like colors is 5x3, and igood is just 3 for some reason. Could somebody shine some light on this, please? Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Aug 27 10:04:51 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 27 Aug 2015 08:04:51 -0600 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: References: <20150826151141.17db3046@fsol> <20150827134424.38ec9eac@fsol> Message-ID: On Thu, Aug 27, 2015 at 7:52 AM, Benjamin Root wrote: > > Ok, I tested matplotlib master against numpy master, and there were no > errors. I did get a bunch of new deprecation warnings though such as: > > "/nas/home/broot/centos6/lib/python2.7/site-packages/matplotlib-1.5.dev1-py2.7-linux-x86_64.egg/matplotlib/colorbar.py:539: > VisibleDeprecationWarning: boolean index did not match indexed array along > dimension 0; dimension is 5 but corresponding boolean dimension is 3 > colors = np.asarray(colors)[igood]" > > The message isn't exactly clear. I suspect the problem is a shape > mismatch, like colors is 5x3, and igood is just 3 for some reason. Could > somebody shine some light on this, please? > IIRC, Boolean indexing would fill out the dimension, i.e., len 3 would be expanded to len 5 in this case. That behavior is deprecated. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Aug 27 10:34:24 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 Aug 2015 10:34:24 -0400 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: On Thu, Aug 27, 2015 at 8:57 AM, Matthew Brett wrote: > Hi, > > On Thu, Aug 27, 2015 at 12:11 PM, Sebastian Berg > wrote: > > On Do, 2015-08-27 at 10:45 +0100, Matthew Brett wrote: > >> Hi, > >> > >> On Thu, Aug 27, 2015 at 10:35 AM, Bryan Van de Ven > wrote: > >> > > >> >> On Aug 27, 2015, at 10:22 AM, Matthew Brett > wrote: > >> >> > >> >> In the case of the 'core' model, we have some compelling testimony > >> >> from someone with a great deal of experience: > >> >> > >> >> """ > >> >> Much of this early structure (CVS, web site, cabal ["core" group], > >> >> etc.) was copied verbatim by other open source (this term not being > in > >> >> wide use yet) projects -- even the form of the project name and the > >> >> term "core". This later became a kind of standard template for > >> >> starting up an open source project. [...] I'm sorry to say that I > >> >> helped create this problem, and that most of the projects which > >> >> modeled themselves after NetBSD (probably due to its high popularity > >> >> in 1993 and 1994) have suffered similar problems. FreeBSD and > XFree86, > >> >> for example, have both forked successor projects (Dragonfly and > X.org) > >> >> for very similar reasons. 
> >> >> """ > >> > > >> > Who goes on to propose: > >> > > >> > 7) The "core" group must be replaced with people who are actually > >> > competent and dedicated enough to review proposals, accept > feedback, > >> > and make good decisions. More to the point, though, the "core" > group > >> > must only act when *needed* -- most technical decisions should be > >> > left to the community to hash out; it must not preempt the > community > >> > from developing better solutions. (This is how the "core" group > >> > worked during most of the project's growth period.) > >> > >> Sure. I think it's reasonable to give high weight to Hannum's > >> assessment of the failure of the core group, but less weight to his > >> proposal for a replacement, because at the time, I don't believe he > >> was in a good position to assess whether his (apparent) alternative > >> would run into the same trouble. > >> > >> It's always tempting to blame the people rather than the system, but > >> in this case, I strongly suspect that it was the system that was > >> fundamentally flawed, therefore either promoting the wrong people or > >> putting otherwise competent people into situations where they are no > >> longer getting useful feedback. > > > > Maybe so. I do not know much at all about these models, but I am not > > sure how much applies here to numpy. Isn't at least FreeBSD a magnitude > > larger then numpy? > > It seems to me that numpy suffers from the same risks of poor > accountability, stagnation and conservatism that larger projects do. > Is there a reason that would not be the case? > > > We do need to have some formality about how to give out commit rights, > > and do final decision when all else fails. > > Yes, sure, something formal is probably but not certainly better than > nothing, depending on what the 'something formal' is. > > > One thing I do not know is how a "community vote" could work at all, > > considering I do not even know how to count its members. Votes and > > presidents make sense to me for large projects with hundrets of > > developers on different corners (think of the gnome foundation, debian > > probably) [1]. > > The 'president' idea is to get at the problem of lack of > accountability, along with selection for leadership skill rather than > coding ability. It's trying to get at the advantages of the BDFL > model in our situation where there is no obvious BDFL. For the me > the problem is that, at the moment, if the formal or informal > governing body makes a bad decision, then no member will feel > responsible for that decision or its consequences. That tends to lead > to an atmosphere of - "oh well, what could we do, X wouldn't agree to > A and Y wouldn't agree to B so we're stuck". It seems to me we need > a system such that whoever is in charge feels so strongly that it is > their job to make numpy as good as possible, that they will take > whatever difficult or sensitive decisions are necessary to make that > happen. On the other hand the 'core' system seems to function on a > model of mutual deference and personal loyalty that I believe is > destructive of good management. > I don't really see a problem with "codifying" the status quo. It might become necessary to have something like an administrative director if numpy becomes a more formal organization with funding, but for the development of the project I don't see any need for a president. If there is no obvious BDFL, then I guess there is also no obvious president. 
(I would vote for Ralf as president of everything, but I don't think he's available.) As the current debate shows, it's possible to have a public discussion about the direction of the project without having to delegate providing a vision to a president. Given the current pattern, all critical issues end up in a public debate on the mailing list. What numpy (and scipy) need is to have someone as a tie breaker to make any final decisions if there is no clear consensus; if there is no BDFL for it, then having the "core" group make those decisions looks appropriate to me. > "On the other hand the 'core' system seems to function on a model of mutual deference and personal loyalty that I believe is destructive of good management." That sounds actually like a good basis for teamwork to me. Also, that has "mutual" in it instead of just deferring and being loyal to a president. Since I know scipy development much better: scipy has made huge progress in the last 5 or 6 years since I've been following it, both in terms of code, in terms of development workflow, and in the number of developers. (When I started, I was essentially alone in scipy.stats; now there are 3 to 5 "core" developers that at least partially work on it, everything goes through PRs with public discussion and with critical issues additionally raised on the mailing list.) Ralf and Pauli are the de facto BDFLs for scipy overall, but all decisions in recent years have been made without a fight, though not without lots of arguments, and, given the size and breadth of scipy, there are field experts (although not enough of those) to help in the discussion. There are still stalled PRs, blocked proposals, and decisions that I didn't like, but I think that's unavoidable. One feature of the current "core" system is that it is relatively open, almost all discussions are public, and it is relatively (?) easy to get into it for new developers. I currently don't worry that a closed clique is taking over the numpy or scipy "core" group. Josef > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bryanv at continuum.io Thu Aug 27 10:43:28 2015 From: bryanv at continuum.io (Bryan Van de Ven) Date: Thu, 27 Aug 2015 15:43:28 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: > On Aug 27, 2015, at 1:57 PM, Matthew Brett wrote: > > The 'president' idea ...seems to be predicated on a steady stream of people who: actually want the job, don't mind campaigning, are willing to accept any and all blame, and have the technical experience to make "final decisions". As others have pointed out, the active developer community for NumPy is not measured in the hundreds (or even the tens, really). So: what is your proposed recourse if you hold an election and no-one shows up to run?
Bryan From sebastian at sipsolutions.net Thu Aug 27 10:44:34 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 27 Aug 2015 16:44:34 +0200 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: References: <20150826151141.17db3046@fsol> <20150827134424.38ec9eac@fsol> Message-ID: <1440686674.11529.5.camel@sipsolutions.net> On Do, 2015-08-27 at 08:04 -0600, Charles R Harris wrote: > > > On Thu, Aug 27, 2015 at 7:52 AM, Benjamin Root > wrote: > > > Ok, I tested matplotlib master against numpy master, and there > were no errors. I did get a bunch of new deprecation warnings > though such as: > > "/nas/home/broot/centos6/lib/python2.7/site-packages/matplotlib-1.5.dev1-py2.7-linux-x86_64.egg/matplotlib/colorbar.py:539: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 5 but corresponding boolean dimension is 3 > colors = np.asarray(colors)[igood]" > > > The message isn't exactly clear. I suspect the problem is a > shape mismatch, like colors is 5x3, and igood is just 3 for > some reason. Could somebody shine some light on this, please? > > > > IIRC, Boolean indexing would fill out the dimension, i.e., len 3 would > be expanded to len 5 in this case. That behavior is deprecated. > Yes, this is exactly the case, you have something like: arr = np.zeros((5, 3)) ind = np.array([True, False, False]) arr[ind, :] and numpy nowadays thinks that such code is likely a bug (when the ind is shorter than arr it is somewhat OK, the other way around gets more creepy). If you have an idea of how to make the error message clearer, or objections to the change, I am happy to hear it! - Sebastian > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From josef.pktd at gmail.com Thu Aug 27 11:03:46 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 Aug 2015 11:03:46 -0400 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 10:06 AM, Travis Oliphant wrote: > > > On Wed, Aug 26, 2015 at 1:41 AM, Nathaniel Smith wrote: > >> Hi Travis, >> >> Thanks for taking the time to write up your thoughts! >> >> I have many thoughts in return, but I will try to restrict myself to two >> main ones :-). >> >> 1) On the question of whether work should be directed towards improving >> NumPy-as-it-is or instead towards a compatibility-breaking replacement: >> There's plenty of room for debate about whether it's better engineering >> practice to try and evolve an existing system in place versus starting >> over, and I guess we have some fundamental disagreements there, but I >> actually think this debate is a distraction -- we can agree to disagree, >> because in fact we have to try both. >> > > Yes, on this we agree. I think NumPy can improve *and* we can have new > innovative array objects. I don't disagree about that. > > >> >> At a practical level: NumPy *is* going to continue to evolve, because it >> has users and people interested in evolving it; similarly, dynd and other >> alternatives libraries will also continue to evolve, because they also have >> people interested in doing it. And at a normative level, this is a good >> thing! 
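For anyone hitting the same VisibleDeprecationWarning that Ben reports earlier in this thread, here is a minimal, self-contained sketch of the pattern Sebastian describes above; the array and mask names are invented for illustration (not taken from matplotlib), and the fix shown is simply to build a boolean mask of the full axis length.

```python
import numpy as np

colors = np.zeros((5, 3))               # e.g. 5 RGB rows
igood = np.array([True, False, True])   # mask of length 3, but axis 0 has length 5

# Old behaviour: the short mask was silently padded with False to length 5,
# so this selected rows 0 and 2.  NumPy 1.10 warns about the mismatch, and
# newer releases raise an IndexError instead.
# colors[igood]

# Explicit version: make the mask as long as the axis it indexes.
mask = np.zeros(len(colors), dtype=bool)
mask[:len(igood)] = igood
selected = colors[mask]                 # shape (2, 3), no warning
```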
If NumPy and dynd both get better, than that's awesome: the worst >> case is that NumPy adds the new features that we talked about at the >> meeting, and dynd simultaneously becomes so awesome that everyone wants to >> switch to it, and the result of this would be... that those NumPy features >> are exactly the ones that will make the transition to dynd easier. Or if >> some part of that plan goes wrong, then well, NumPy will still be there as >> a fallback, and in the mean time we've actually fixed the major pain points >> our users are begging us to fix. >> >> You seem to be urging us all to make a double-or-nothing wager that your >> extremely ambitious plans will all work out, with the entire numerical >> Python ecosystem as the stakes. I think this ambition is awesome, but maybe >> it'd be wise to hedge our bets a bit? >> > > You are mis-characterizing my view. I think NumPy can evolve (though I > would personally rather see a bigger change to the underlying system like I > outlined before). But, I don't believe it can even evolve easily in the > direction needed without breaking ABI and that insisting on not breaking it > or even putting too much effort into not breaking it will continue to > create less-optimal solutions that are harder to maintain and do not take > advantage of knowledge this community now has. > > I'm also very concerned that 'evolving' NumPy will create a situation > where there are regular semantic and subtle API changes that will cause > NumPy to be less stable for it's user-base. I've watched this happen. > This at a time that people are already looking around for new and different > approaches anyway. > > >> >> 2) You really emphasize this idea of an ABI-breaking (but not >> API-breaking) release, and I think this must indicate some basic gap in how >> we're looking at things. Where I'm getting stuck here is that... I actually >> can't think of anything important that we can't do now, but could if we >> were allowed to break ABI compatibility. The kinds of things that break ABI >> but keep API are like... rearranging what order the fields in a struct fall >> in, or changing the numeric value of opaque constants like >> NPY_ARRAY_WRITEABLE. The biggest win I can think of is that we could save a >> few bytes per array by arranging the fields inside the ndarray struct more >> optimally, but that's hardly a feature to hang a 2.0 on. You seem to have a >> vision of this ABI-breaking release as being something very different from >> that, and I'm not clear on what this vision is. >> >> > We already broke the ABI with date-time changes --- it's still broken for > a certain percentage of users last I checked. So, part of my > disagreement is that we've tried this and it didn't work --- even though > smart people thought it would. I've had to deal with this personally and > I'm not enthusiastic about having to deal with this for the next 5 years > because of even more attempts to make changes while not breaking the ABI. > I think the group is more careful now --- but I still think the API is > broad enough and uses of NumPy deep enough that the effort involved in > trying not to break the ABI is just not worth the effort (because it's a > non-feature today). Adding new dtypes without breaking the ABI is tricky > (and to do it without breaking the ABI is ugly). 
I also continue to > believe that putting out a new ABI-breaking NumPy will allow re-compiling > *once* (with some porting changes needed) and not subtle breakages > requiring code-changes every time a release is made. If subtle changes > aren't made, then the new features won't come. Right now, I'd rather have > stability from NumPy than new features. New features can come from other > libraries. > > One specific change that could easily be made in NumPy 2.0 (the current > code but with an ABI change) is that Dtypes should become true type objects > and array-scalars (which are the current type-objects) should become > instances of those dtypes. That is the biggest clean-up needed, I think on > the array-front. There should not be *both* array-scalars and dtype > objects. They are the same thing fundamentally. It was a mistake to > have both of them. I don't see how to make that change without breaking > the ABI. Perhaps it could be done in a creative way --- but why put the > effort into that and end up with an even more hacky code-base. > > NumPy's ABI was influenced by and evolved from Numeric and Numarray. It > was not "designed" to last 30 years. > > I think the dtype "types" should potentially have different > member-structures. The ufunc sub-system needs an overhaul --- it's > member structures need upgrades. With generalized ufuncs and the > iteration protocols of Mark Wiebe we know a whole lot more about ufuncs > now. Ufuncs are the same 1995 structure that Jim Hugunin wrote. I > suppose you *could* just tack new functions on the end of structure and > keep growing the list (while leaving old, unused structures as unused or > deprecated) --- or you can take the opportunity to tidy up a bit. The > longer you leave everything the same, the harder you make the code-base and > the more costly maintenance becomes. I just don't see the value there > --- and I see a lot of pain. > > Regarding the ufunc subsystem. We've argued before about the lack of > mulit-methods in NumPy. Continuing to add dunder-methods to try and get > around it will continue to make the system harder to maintain and more > brittle. > > You mention making NumPy an interface to multiple things along with many > other ideas. I don't believe you can get there without real changes that > break things (at the very least semantic changes). I'm not excited about > those changes causing instability (which they will cause ---- to me the > burden of proof that they won't is on you who wants to make the change and > not on me to say how they will). I also think it will take much > longer to get there incrementally (if at all) than just creating something > on top of newer ideas. > > > >> The main reason I personally am against having a big ABI-breaking release >> is not that I hate ABI breakage a priori, it's that all the big features >> that I care about and the are users are asking for seem to be ones that... >> don't actually require doing that. At most they seem to get a mild benefit >> from breaking some obscure corner cases. So the cost/benefits don't make >> any sense to me. >> >> So: can you give a concrete example of a change you have in mind where >> breaking ABI would be the key enabler? >> >> (I guess you might also be thinking of a separate issue that you sort of >> allude to: Perhaps we will try to make changes which we think don't involve >> breaking the ABI, but discover too late that we have failed to fully >> understand the implications and have broken it by mistake. 
IIUC this is >> what happened in the 1.4 timeframe when datetime64 was merged and >> accidentally renumbered some of the NPY_* constants. >> > > Yes, this is what I'm mainly worried about. But, more than that, I'm > concerned about general *semantic* and API changes at a rapid pace for a > community that is just looking for stability and bug-fixes from NumPy > itself --- with innovation happening elsewhere. > > >> Partially I am less worried about this because I have a fair amount of >> confidence that our review and QA process has improved these days to the >> point that we would not let a change like that slip through by accident -- >> we have a lot more active reviewers, people are sensitized to the issues, >> we've successfully landed intrusive changes like Sebastian's indexing >> rewrite, ... though this is very much second-hand impressions on my part, >> and I'd welcome input from folks like Chuck who have a clearer view on how >> things have changed from then to now. >> >> But more importantly, even if this is true, then I can't see how your >> proposal helps. If we aren't good enough at our jobs to predict when we'll >> break ABI, then by assumption it makes no sense to pick one release and >> decide that this is the one time that we'll break ABI.) >> > > I don't understand your point. Picking a release to break the ABI allows > you to actually do things like change macros to functions and move > structures around to be more consistent with a new design that is easier to > maintain and allows more growth. It has nothing to do with "whether you > are good at your job". Everyone has strengths and weaknesses. > > This kind of clean-up may be needed regularly --- every 3 years would not > be a crazy pattern, but it could also be every 5 years if you wanted more > discipline. I already knew we needed to break the ABI "soonish" when I > released NumPy 1.0. The fact that we haven't officially done it yet (but > have done it unofficially) is a great injustice to "what could be" and has > slowed development of NumPy tremendously. > > We've gone back and forth on this. I'm fine if we disagree, but I just > hope the disagreement doesn't lead to lack of cooperation as we both have > the same ultimate interests in seeing array-computing in Python improve. > I just don't support *major* changes without breaking the ABI without a > whole lot of proof that it is possible (without hackiness). You have > mentioned on your roadmap a lot of what I would consider *major* changes. > Some of it you describe how to get there. The most important change > (improving the dtype system) you don't. > > Part of my point is that we now *know* how to improve the dtype system. > Let's do it. Let's not try "yet again" to do it differently inside an old > system designed by a scientist who didn't understand type-theory or type > systems (that was me by the way). Look at data-shape in the blaze > project. Take that and build a Python type-system that also outputs > struct-string syntax for memory-views. That's the data-description system > that NumPy should be using --- not trying to hack on a mixed array-scalar, > dtype-object system that may never support everything we now know is > needed. > > Trying to incrementing from where we are now will only lead to a > sub-optimal outcome and unfortunate instability when we already know what > to do differently. I doubt I will convince you --- certainly not via > email. 
I apologize in advance that I likely won't be able to respond in > depth to any more questions that are really just "prove to me that I can't" > kind of questions. Of course I can't prove that. All I'm saying is that > to me the evidence and my experience leads me to not be able to support > major changes like you have proposed without also intentionally breaking > the ABI (and thus calling it NumPy 2.0). > > If I find time to write, I will try to use it to outline more specifically > what I think is a better approach to array- and table-computing in Python > that keeps the stability of NumPy and adds new features using different > approaches. > > -Travis > > From my perspective the incremental evolutionary approach in numpy (and scipy) in the last few years has worked quite well, and I'm optimistic that it will work in future if the developers can pull it off. The main changes that I remember that needed adjustment in scipy (as observer) or statsmodels (as maintainer) came from becoming more strict in several cases. This mainly affects corner cases or cases where the downstream code wasn't "clean". Some API breaking (with deprecation) and some semantic changes are still needed independent of any big changes that may or may not be arriving anytime soon. This way we get improvements in a core library with the requirement that every once in a while we need to adjust our code. (And with the occasional unintended side effect where test coverage is not enough.) The advantage is that we are getting the improvements with the regular release cycles, and they keep numpy alive and competitive for another 10 years or more. In the meantime, other packages like pandas can cater and expand to other use cases, or other packages can develop generic arrays and out of core and distributed arrays. I'm partially following some of the Julia mailing lists. Starting something from scratch is a lot of work, and my guess is that similar approaches in python will take some time to become mainstream. In the meantime we can build something on an improving numpy. --- The only thing I'm not so happy about in the last years is the proliferation of object arrays, both in numpy code and in pandas. And I hope that the (dtype) proposals help to get rid of some of those object arrays. Josef > > > > >> >> On Tue, Aug 25, 2015 at 12:00 PM, Travis Oliphant >> wrote: >> >>> Thanks for the write-up Nathaniel. There is a lot of great detail and >>> interesting ideas here. >>> >>> I am very eager to understand how to help NumPy and the wider >>> community move forward however I can (my passions on this have not changed >>> since 1999, though what I myself spend time on has changed). >>> >>> There are a lot of ways to think about approaching this, though. It's >>> hard to get all the ideas on the table, and it was unfortunate we couldn't >>> get everybody who are core NumPy devs together in person to have this >>> discussion as there are still a lot of questions unanswered and a lot of >>> thought that has gone into other approaches that was not brought up or >>> represented in the meeting (how does Numba fit into this, what about >>> data-shape, dynd, memory-views and Python type system, etc.). If NumPy >>> becomes just an interface-specification, then why don't we just do that >>> *outside* NumPy itself in a way that doesn't jeopardize the stability of >>> NumPy today. These are some of the real questions I have. I will try >>> to write up my thoughts in more depth soon, but I won't be able to respond >>> in-depth right now. 
I just wanted to comment because Nathaniel said I >>> disagree which is only partly true. >>> >>> The three most important things for me are 1) let's make sure we have >>> representation from as wide of the community as possible (this is really >>> hard), 2) let's look around at the broader community and the prior art that >>> is happening in this space right now and 3) let's not pretend we are going >>> to be able to make all this happen without breaking ABI compatibility. >>> Let's just break ABI compatibility with NumPy 2.0 *and* have as much >>> fidelity with the API and semantics of current NumPy as possible (though >>> there will be some changes necessary long-term). >>> >>> I don't think we should intentionally break ABI if we can avoid it, but >>> I also don't think we should spend in-ordinate amounts of time trying to >>> pretend that we won't break ABI (for at least some people), and most >>> importantly we should not pretend *not* to break the ABI when we actually >>> do. We did this once before with the roll-out of date-time, and it was >>> really un-necessary. When I released NumPy 1.0, there were several >>> things that I knew should be fixed very soon (NumPy was never designed to >>> not break ABI). Those problems are still there. Now, that we have >>> quite a bit better understanding of what NumPy *should* be (there have been >>> tremendous strides in understanding and community size over the past 10 >>> years), let's actually make the infrastructure we think will last for the >>> next 20 years (instead of trying to shoe-horn new ideas into a 20-year old >>> code-base that wasn't designed for it). >>> >>> NumPy is a hard code-base. It has been since Numeric days in 1995. >>> I could be wrong, but my guess is that we will be passed by as a community >>> if we don't seize the opportunity to build something better than we can >>> build if we are forced to use a 20 year old code-base. >>> >>> It is more important to not break people's code and to be clear when a >>> re-compile is necessary for dependencies. Those to me are the most >>> important constraints. There are a lot of great ideas that we all have >>> about what we want NumPy to be able to do. Some of this are pretty >>> transformational (and the more exciting they are, the harder I think they >>> are going to be to implement without breaking at least the ABI). There >>> is probably some CAP-like theorem around >>> Stability-Features-Speed-of-Development (pick 2) when it comes to Open >>> Source Software development and making feature-progress with NumPy *is >>> going* to create in-stability which concerns me. >>> >>> I would like to see a little-bit-of-pain one time with a NumPy 2.0, >>> rather than a constant pain because of constant churn over many years >>> approach that Nathaniel seems to advocate. To me NumPy 2.0 is an >>> ABI-breaking release that is as API-compatible as possible and whose >>> semantics are not dramatically different. >>> >>> There are at least 3 areas of compatibility (ABI, API, and semantic). >>> ABI-compatibility is a non-feature in today's world. There are so many >>> distributions of the NumPy stack (and conda makes it trivial for anyone to >>> build their own or for you to build one yourself). Making less-optimal >>> software-engineering choices because of fear of breaking the ABI is not >>> something I'm supportive of at all. We should not break ABI every >>> release, but a release every 3 years that breaks ABI is not a problem. 
>>> >>> API compatibility should be much more sacrosanct, but it is also >>> something that can also be managed. Any NumPy 2.0 should definitely >>> support the full NumPy API (though there could be deprecated swaths). I >>> think the community has done well in using deprecation and limiting the >>> public API to make this more manageable and I would love to see a NumPy 2.0 >>> that solidifies a future-oriented API along with a backward-compatible API >>> that is also available. >>> >>> Semantic compatibility is the hardest. We have already broken this on >>> multiple occasions throughout the 1.x NumPy releases. Every time you >>> change the code, this can change. This is what I fear causing deep >>> instability over the course of many years. These are things like the >>> casting rule details, the effect of indexing changes, any change to the >>> calculations approaches. It is and has been the most at risk during any >>> code-changes. My view is that a NumPy 2.0 (with a new low-level >>> architecture) minimizes these changes to a single release rather than >>> unavoidably spreading them out over many, many releases. >>> >>> I think that summarizes my main concerns. I will write up more forward >>> thinking ideas for what else is possible in the coming weeks. In the mean >>> time, thanks for keeping the discussion going. It is extremely exciting to >>> see the help people have continued to provide to maintain and improve >>> NumPy. It will be exciting to see what the next few years bring as well. >>> >>> >>> Best, >>> >>> -Travis >>> >>> >>> >>> >>> >>> >>> On Tue, Aug 25, 2015 at 5:03 AM, Nathaniel Smith wrote: >>> >>>> Hi all, >>>> >>>> These are the notes from the NumPy dev meeting held July 7, 2015, at >>>> the SciPy conference in Austin, presented here so the list can keep up >>>> with what happens, and so you can give feedback. Please do give >>>> feedback, none of this is final! >>>> >>>> (Also, if anyone who was there notices anything I left out or >>>> mischaracterized, please speak up -- these are a lot of notes I'm >>>> trying to gather together, so I could easily have missed something!) >>>> >>>> Thanks to Jill Cowan and the rest of the SciPy organizers for donating >>>> space and organizing logistics for us, and to the Berkeley Institute >>>> for Data Science for funding travel for Jaime, Nathaniel, and >>>> Sebastian. >>>> >>>> >>>> Attendees >>>> ========= >>>> >>>> Present in the room for all or part: Daniel Allan, Chris Barker, >>>> Sebastian Berg, Thomas Caswell, Jeff Reback, Jaime Fernández del >>>> Río, Chuck Harris, Nathaniel Smith, Stéfan van der Walt. (Note: I'm >>>> pretty sure this list is incomplete) >>>> >>>> Joining remotely for all or part: Stephan Hoyer, Julian Taylor. >>>> >>>> >>>> Formalizing our governance/decision making >>>> ========================================== >>>> >>>> This was a major focus of discussion. At a high level, the consensus >>>> was to steal IPython's governance document ("IPEP 29") and modify it >>>> to remove its use of a BDFL as a "backstop" to normal community >>>> consensus-based decision, and replace it with a new "backstop" based >>>> on Apache-project-style consensus voting amongst the core team. >>>> >>>> I'll send out a proper draft of this shortly for further discussion. >>>> >>>> >>>> Development roadmap >>>> =================== >>>> >>>> General consensus: >>>> >>>> Let's assume NumPy is going to remain important indefinitely, and >>>> try to make it better, instead of waiting for something better to >>>> come along. 
(This is unlikely to be wasted effort even if something >>>> better does come along, and it's hardly a sure thing that that will >>>> happen anyway.) >>>> >>>> Let's focus on evolving numpy as far as we can without major >>>> break-the-world changes (no "numpy 2.0", at least in the foreseeable >>>> future). >>>> >>>> And, as a target for that evolution, let's change our focus from >>>> numpy as "NumPy is the library that gives you the np.ndarray object >>>> (plus some attached infrastructure)", to "NumPy provides the >>>> standard framework for working with arrays and array-like objects in >>>> Python" >>>> >>>> This means, creating defined interfaces between array-like objects / >>>> ufunc objects / dtype objects, so that it becomes possible for third >>>> parties to add their own and mix-and-match. Right now ufuncs are >>>> pretty good at this, but if you want a new array class or dtype then >>>> in most cases you pretty much have to modify numpy itself. >>>> >>>> Vision: instead of everyone who wants a new container type having to >>>> reimplement all of numpy, Alice can implement an array class using >>>> (sparse / distributed / compressed / tiled / gpu / out-of-core / >>>> delayed / ...) storage, pass it to code that was written using >>>> direct calls to np.* functions, and it just works. (Instead of >>>> np.sin being "the way you calculate the sine of an ndarray", it's >>>> "the way you calculate the sine of any array-like container >>>> object".) >>>> >>>> Vision: Darryl can implement a new dtype for (categorical data / >>>> astronomical dates / integers-with-missing-values / ...) without >>>> having to touch the numpy core. >>>> >>>> Vision: Chandni can then come along and combine them by doing >>>> >>>> a = alice_array([...], dtype=darryl_dtype) >>>> >>>> and it just works. >>>> >>>> Vision: no-one is tempted to subclass ndarray, because anything you >>>> can do with an ndarray subclass you can also easily do by defining >>>> your own new class that implements the "array protocol". >>>> >>>> >>>> Supporting third-party array types >>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> >>>> Sub-goals: >>>> - Get __numpy_ufunc__ done, which will cover a good chunk of numpy's >>>> API right there. >>>> - Go through the rest of the stuff in numpy, and figure out some >>>> story for how to let it handle third-party array classes: >>>> - ufunc ALL the things: Some things can be converted directly into >>>> (g)ufuncs and then use __numpy_ufunc__ (e.g., np.std); some >>>> things could be converted into (g)ufuncs if we extended the >>>> (g)ufunc interface a bit (e.g. np.sort, np.matmul). >>>> - Some things probably need their own __numpy_ufunc__-like >>>> extensions (__numpy_concatenate__?) >>>> - Provide tools to make it easier to implement the more complicated >>>> parts of an array object (e.g. the bazillion different methods, >>>> many of which are ufuncs in disguise, or indexing) >>>> - Longer-run interesting research project: __numpy_ufunc__ requires >>>> that one or the other object have explicit knowledge of how to >>>> handle the other, so to handle binary ufuncs with N array types >>>> you need something like N**2 __numpy_ufunc__ code paths. As an >>>> alternative, if there were some interface that an object could >>>> export that provided the operations nditer needs to efficiently >>>> iterate over (chunks of) it, then you would only need N >>>> implementations of this interface to handle all N**2 operations. 
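As a rough illustration of the dispatch hook being discussed just above: the exact name and signature of __numpy_ufunc__ were still being worked out at the time, so the class and helper names below are illustrative assumptions rather than a settled API. The idea is that a container implements one hook and every ufunc routes through it:

import numpy as np

class CompressedArray(object):
    # Hypothetical array-like container -- just enough to show the idea.
    def __init__(self, data):
        self._data = np.asarray(data)

    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        # Unwrap any CompressedArray arguments, run the ufunc on plain
        # ndarrays, and wrap the result back up.  One hook per container
        # type replaces one code path per pair of container types.
        unwrapped = [x._data if isinstance(x, CompressedArray) else x
                     for x in inputs]
        result = getattr(ufunc, method)(*unwrapped, **kwargs)
        return CompressedArray(result)

With a hook like this, np.sin(CompressedArray([0.0, 1.0])) could be routed through the container's own code once the protocol is actually enabled; released numpy versions do not dispatch this way yet, so treat this as a sketch of the proposal, not working code against current numpy.
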
>>>> >>>> This would solve a lot of problems for projects like: >>>> - blosc >>>> - dask >>>> - distarray >>>> - numpy.ma >>>> - pandas >>>> - scipy.sparse >>>> - xray >>>> >>>> >>>> Supporting third-party dtypes >>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> >>>> We already have something like a C level "dtype >>>> protocol". Conceptually, the way you define a new dtype is by >>>> defining a new class whose instances have data attributes defining >>>> the parameters of the dtype (what fields are in *this* record dtype, >>>> how many characters are in *this* string dtype, what units are used >>>> for *this* datetime64, etc.), and you define a bunch of methods to >>>> do things like convert an object from a Python object to your dtype >>>> or vice-versa, to copy an array of your dtype from one place to >>>> another, to cast to and from your new dtype, etc. This part is >>>> great. >>>> >>>> The problem is, in the current implementation, we don't actually use >>>> the Python object system to define these classes / attributes / >>>> methods. Instead, all possible dtypes are jammed into a single >>>> Python-level class, whose struct has fields for the union of all >>>> possible dtype's attributes, and instead of Python-style method >>>> slots there's just a big table of function pointers attached to each >>>> object. >>>> >>>> So the main proposal is that we keep the basic design, but switch it >>>> so that the float64 dtype, the int64 dtype, etc. actually literally >>>> are subclasses of np.dtype, each implementing their own fields and >>>> Python-style methods. >>>> >>>> Some of the pieces involved in doing this: >>>> >>>> - The current dtype methods should be cleaned up -- e.g. 'dot' and >>>> 'less_than' are both dtype methods, when conceptually they're much >>>> more like ufuncs. >>>> >>>> - The ufunc inner-loop interface currently does not get a reference >>>> to the dtype object, so they can't see its attributes and this is >>>> a big obstacle to many interesting dtypes (e.g., it's hard to >>>> implement np.equal for categoricals if you don't know what >>>> categories each has). So we need to add new arguments to the core >>>> ufunc loop signature. (Fortunately this can be done in a >>>> backwards-compatible way.) >>>> >>>> - We need to figure out what exactly the dtype methods should be, >>>> and add them to the dtype class (possibly with backwards >>>> compatibility shims for anyone who is accessing PyArray_ArrFuncs >>>> directly). >>>> >>>> - Casting will be possibly the trickiest thing to work out, though >>>> the basic idea of using dunder-dispatch-like __cast__ and >>>> __rcast__ methods seems workable. (Encouragingly, this is also >>>> exactly what dynd also does, though unfortunately dynd does not >>>> yet support user-defined dtypes even to the extent that numpy >>>> does, so there isn't much else we can steal from them.) >>>> - We may also want to rethink the casting rules while we're at it, >>>> since they have some very weird corners right now (e.g. 
see >>>> [https://github.com/numpy/numpy/issues/6240]) >>>> >>>> - We need to migrate the current dtypes over to the new system, >>>> which can be done in stages: >>>> >>>> - First stick them all in a single "legacy dtype" class whose >>>> methods just dispatch to the PyArray_ArrFuncs per-object "method >>>> table" >>>> >>>> - Then move each of them into their own classes >>>> >>>> - We should provide a Python-level wrapper for the protocol, so that >>>> you can call dtype methods from Python >>>> >>>> - And vice-versa, it should be possible to subclass dtype at the >>>> Python level >>>> >>>> - etc. >>>> >>>> Fortunately, AFAICT pretty much all of this can be done while >>>> maintaining backwards compatibility (though we may want to break >>>> some obscure cases to avoid expending *too* much effort with weird >>>> backcompat contortions that will only help a vanishingly small >>>> proportion of the userbase), and a lot of the above changes can be >>>> done as semi-independent mini-projects, so there's no need for some >>>> branch to go off and spend a year rewriting the world. >>>> >>>> Obviously there are still a lot of details to work out, though. But >>>> overall, there was widespread agreement that this is one of the #1 >>>> pain points for our users (e.g. it's the single main request from >>>> pandas), and fixing it is very high priority. >>>> >>>> Some features that would become straightforward to implement >>>> (e.g. even in third-party libraries) if this were fixed: >>>> - missing value support >>>> - physical unit tracking (meters / seconds -> array of velocity; >>>> meters + seconds -> error) >>>> - better and more diverse datetime representations (e.g. datetimes >>>> with attached timezones, or using funky geophysical or >>>> astronomical calendars) >>>> - categorical data >>>> - variable length strings >>>> - strings-with-encodings (e.g. latin1) >>>> - forward mode automatic differentiation (write a function that >>>> computes f(x) where x is an array of float64; pass that function >>>> an array with a special dtype and get out both f(x) and f'(x)) >>>> - probably others I'm forgetting right now >>>> >>>> I should also note that there was one substantial objection to this >>>> plan, from Travis Oliphant (in discussions later in the >>>> conference). I'm not confident I understand his objections well >>>> enough to reproduce them here, though -- perhaps he'll elaborate. >>>> >>>> >>>> Money >>>> ===== >>>> >>>> There was an extensive discussion on the topic of: "if we had money, >>>> what would we do with it?" >>>> >>>> This is partially motivated by the realization that there are a >>>> number of sources that we could probably get money from, if we had a >>>> good story for what we wanted to do, so it's not just an idle >>>> question. >>>> >>>> Points of general agreement: >>>> >>>> - Doing the in-person meeting was a good thing. We should plan do >>>> that again, at least once a year. So one thing to spend money on >>>> is travel subsidies to make sure that happens and is productive. >>>> >>>> - While it's tempting to imagine hiring junior people for the more >>>> frustrating/boring work like maintaining buildbots, release >>>> infrastructure, updating docs, etc., this seems difficult to do >>>> realistically with our current resources -- how do we hire for >>>> this, who would manage them, etc.? 
>>>> >>>> - On the other hand, the general feeling was that if we found the >>>> money to hire a few more senior people who could take care of >>>> themselves more, then that would be good and we could >>>> realistically absorb that extra work without totally unbalancing >>>> the project. >>>> >>>> - A major open question is how we would recruit someone for a >>>> position like this, since apparently all the obvious candidates >>>> who are already active on the NumPy team already have other >>>> things going on. [For calibration on how hard this can be: NYU >>>> has apparently had an open position for a year with the job >>>> description of "come work at NYU full-time with a >>>> private-industry-competitive-salary on whatever your personal >>>> open-source scientific project is" (!) and still is having an >>>> extremely difficult time filling it: >>>> [http://cds.nyu.edu/research-engineer/]] >>>> >>>> - General consensus though was that there isn't much to be done >>>> about this though, except try it and see. >>>> >>>> - (By the way, if you're someone who's reading this and >>>> potentially interested in like a postdoc or better working on >>>> numpy, then let's talk...) >>>> >>>> >>>> More specific changes to numpy that had general consensus, but don't >>>> really fit into a high-level roadmap >>>> >>>> ========================================================================================================= >>>> >>>> - Resolved: we should merge multiarray.so and umath.so into a single >>>> extension module, so that they can share utility code without the >>>> current awkward contortions. >>>> >>>> - Resolved: we should start hiding new fields in the ufunc and dtype >>>> structs as soon as possible going forward. (I.e. they would not be >>>> present in the version of the structs that are exposed through the >>>> C API, but internally we would use a more detailed struct.) >>>> - Mayyyyyybe we should even go ahead and hide the subset of the >>>> existing fields that are really internal details that no-one >>>> should be using. If we did this without changing anything else >>>> then it would preserve ABI (the fields would still be where >>>> existing compiled extensions expect them to be, if any such >>>> extensions exist) while breaking API (trying to compile such >>>> extensions would give a clear error), so would be a smoother >>>> ramp if we think we need to eventually break those fields for >>>> real. (As discussed above, there are a bunch of fields in the >>>> dtype base class that only make sense for specific dtype >>>> subclasses, e.g. only record dtypes need a list of field names, >>>> but right now all dtypes have one anyway. So it would be nice to >>>> remove these from the base class entirely, but that is >>>> potentially ABI-breaking.) >>>> >>>> - Resolved: np.array should never return an object array unless >>>> explicitly requested (e.g. with dtype=object); it just causes too >>>> many surprising problems. >>>> - First step: add a deprecation warning >>>> - Eventually: make it an error. >>>> >>>> - The matrix class >>>> - Resolved: We won't add warnings yet, but we will prominently >>>> document that it is deprecated and should be avoided wherever >>>> possible. >>>> - Stéfan van der Walt volunteers to do this. >>>> - We'd all like to deprecate it properly, but the feeling was that >>>> the precondition for this is for scipy.sparse to provide sparse >>>> "arrays" that don't return np.matrix objects on ordinary >>>> operations. 
Until that happens we can't reasonably tell people >>>> that using np.matrix is a bug. >>>> >>>> - Resolved: we should add a similar prominent note to the >>>> "subclassing ndarray" documentation, warning people that this is >>>> painful and barely works and please don't do it if you have any >>>> alternatives. >>>> >>>> - Resolved: we want more, smaller releases -- every 6 months at >>>> least, aiming to go even faster (every 4 months?) >>>> >>>> - On the question of using Cython inside numpy core: >>>> - Everyone agrees that there are places where this would be an >>>> improvement (e.g., Python<->C interfaces, and places "when you >>>> want to do computer science", e.g. complicated algorithmic stuff >>>> like graph traversals) >>>> - Chuck wanted it to be clear though that he doesn't think it >>>> would be a good goal to try and rewrite all of numpy in Cython >>>> -- there also exist places where Cython ends up being "an uglier >>>> version of C". No-one disagreed. >>>> >>>> - Our text reader is apparently not very functional on Python 3, and >>>> generally slow and hard to work with. >>>> - Resolved: We should extract Pandas's awesome text reader/parser >>>> and convert it into its own package, that could then become a >>>> new backend for both pandas and numpy.loadtxt. >>>> - Jeff thinks this is a great idea >>>> - Thomas Caswell volunteers to do the extraction. >>>> >>>> - We should work on improving our tools for evolving the ABI, so >>>> that we will eventually be less constrained by decisions made >>>> decades ago. >>>> - One idea that had a lot of support was to switch from our >>>> current append-only C-API to a "sliding window" API based on >>>> explicit versions. So a downstream package might say >>>> >>>> #define NUMPY_API_VERSION 4 >>>> >>>> and they'd get the functions and behaviour provided in "version >>>> 4" of the numpy C api. If they wanted to get access to new stuff >>>> that was added in version 5, then they'd need to switch that >>>> #define, and at the same time clean up any usage of stuff that >>>> was removed or changed in version 5. And to provide a smooth >>>> migration path, one version of numpy would support multiple >>>> versions at once, gradually deprecating and dropping old >>>> versions. >>>> >>>> - If anyone wants to help bring pip up to scratch WRT tracking ABI >>>> dependencies (e.g., 'pip install numpy==' >>>> -> triggers rebuild of scipy against the new ABI), then that >>>> would be an extremely useful thing. >>>> >>>> >>>> Policies that should be documented >>>> ================================== >>>> >>>> ...together with some notes about what the contents of the document >>>> should be: >>>> >>>> >>>> How we manage bugs in the bug tracker. >>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> >>>> - Github "milestones" should *only* be assigned to release-blocker >>>> bugs (which mostly means "regression from the last release"). >>>> >>>> In particular, if you're tempted to push a bug forward to the next >>>> release... then it's clearly not a blocker, so don't set it to the >>>> next release's milestone, just remove the milestone entirely. >>>> >>>> (Obvious exception to this: deprecation followup bugs where we >>>> decide that we want to keep the deprecation around a bit longer >>>> are a case where a bug actually does switch from being a blocker >>>> to release 1.x to being a blocker for release 1.(x+1).) >>>> >>>> - Don't hesitate to close an issue if there's no way forward -- >>>> e.g. a PR where the author has disappeared. 
Just post a link to >>>> this policy and close, with a polite note that we need to keep our >>>> tracker useful as a todo list, but they're welcome to re-open if >>>> things change. >>>> >>>> >>>> Deprecations and breakage policy: >>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>> >>>> - How long do we need to keep DeprecationWarnings around before we >>>> break things? This is tricky because on the one hand an aggressive >>>> (short) deprecation period lets us deliver new features and >>>> important cleanups more quickly, but on the other hand a >>>> too-aggressive deprecation period is difficult for our more >>>> conservative downstream users. >>>> >>>> - Idea that had the most support: pick a somewhat-aggressive >>>> warning period as our default, and make a rule that if someone >>>> asks for an extension during the beta cycle for the release that >>>> removes it, then we put it back for another release or two worth >>>> of grace period. (While also possibly upgrading the warning to >>>> be more visible during the grace period.) This gives us >>>> deprecation periods that are more adaptive on a case-by-case >>>> basis. >>>> >>>> - Lament: it would be really nice if we could get more people to >>>> test our beta releases, because in practice right now 1.x.0 ends >>>> up being where we actually discover all the bugs, and 1.x.1 is >>>> where it actually becomes usable. Which sucks, and makes it >>>> difficult to have a solid policy about what counts as a >>>> regression, etc. Is there anything we can do about this? >>>> >>>> - ABI breakage: we distinguish between an ABI break that breaks >>>> everything (e.g., "import scipy" segfaults), versus an ABI break >>>> that breaks an occasional rare case (e.g., only apps that poke >>>> around in some obscure corner of some struct are affected). >>>> >>>> - The "break-the-world" type remains off-limits for now: the pain >>>> is still too large (conda helps, but there are lots of people >>>> who don't use conda!), and there aren't really any compelling >>>> improvements that this would enable anyway. >>>> >>>> - For the "break-0.1%-of-users" type, it is *not* ruled out by >>>> fiat, though we remain conservative: we should treat it like >>>> other API breaks in principle, and do a careful case-by-case >>>> analysis of the details of the situation, taking into account >>>> what kind of code would be broken, how common these cases are, >>>> how important the benefits are, whether there are any specific >>>> mitigation strategies we can use, etc. -- with this process of >>>> course taking into account that a segfault is nastier than a >>>> Python exception. >>>> >>>> >>>> Other points that were discussed >>>> ================================ >>>> >>>> - There was inconclusive discussion of what we should do with dot() >>>> in the places where it disagrees with the PEP 465 matmul semantics >>>> (specifically this is when both arguments have ndim >= 3, or one >>>> argument has ndim == 0). >>>> - The concern is that the current behavior is not very useful, and >>>> as far as we can tell no-one is using it; but, as people get >>>> used to the more-useful PEP 465 behavior, they will increasingly >>>> try to use it on the assumption that np.dot will work the same >>>> way, and this will create pain for lots of people. So Nathaniel >>>> argued that we should start at least issuing a visible warning >>>> when people invoke the corner-case behavior. 
>>>> - But OTOH, np.dot is such a core piece of infrastructure, and >>>> there's such a large landscape of code out there using numpy >>>> that we can't see, that others were reasonably wary of making >>>> any change. >>>> - For now: document prominently, but no change in behavior. >>>> >>>> >>>> Links to raw notes >>>> ================== >>>> >>>> Main page: >>>> [https://github.com/numpy/numpy/wiki/SciPy-2015-developer-meeting] >>>> >>>> Notes from the meeting proper: >>>> [ >>>> https://docs.google.com/document/d/1IJcYdsHtk8MVAM4AZqFDBSf_nVG-mrB4Tv2bh9u1g4Y/edit?usp=sharing >>>> ] >>>> >>>> Slides from the followup BoF: >>>> [ >>>> https://gist.github.com/njsmith/eb42762054c88e810786/raw/b74f978ce10a972831c582485c80fb5b8e68183b/future-of-numpy-bof.odp >>>> ] >>>> >>>> Notes from the followup BoF: >>>> [ >>>> https://docs.google.com/document/d/11AuTPms5dIPo04JaBOWEoebXfk-tUzEZ-CvFnLIt33w/edit >>>> ] >>>> >>>> -n >>>> >>>> -- >>>> Nathaniel J. Smith -- http://vorpus.org >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>> >>> >>> >>> -- >>> >>> *Travis Oliphant* >>> *Co-founder and CEO* >>> >>> >>> @teoliphant >>> 512-222-5440 >>> http://www.continuum.io >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> >> -- >> Nathaniel J. Smith -- http://vorpus.org >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > > *Travis Oliphant* > *Co-founder and CEO* > > > @teoliphant > 512-222-5440 > http://www.continuum.io > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Aug 27 11:04:48 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 16:04:48 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: Hi, On Thu, Aug 27, 2015 at 3:34 PM, wrote: [snip] > I don't really see a problem with "codifying" the status quo. That's an excellent point. If we believe that the current situation is the best possible, both now and in the future, then codifying the status quo is an excellent idea. So, we should probably first start by asking ourselves: * what numpy is doing well; * what numpy could do better; and then ask, is there some way we could make it more likely we will improve over time. [snip] > As the current debate shows it's possible to have a public discussion about > the direction of the project without having to delegate providing a vision > to a president. The idea of a president that I had in mind, was not someone who makes all decisions, but the person who holds themselves responsible for the performance of the project. If the project has a coherent vision already, the president has no need to provide one, but it's the president's job to worry about whether we have vision or not, and do what they need to, to make sure we don't lose track of that. 
If you don't know it already, I highly recommend Jim Collins' work on 'level 5 leadership' [1] > Given the current pattern all critical issues end up in a public debate on > the mailing list. What numpy (and scipy) need is to have someone as a tie > breaker to make any final decisions if there is no clear consensus, if there > is no BDFL for it, then having the "core" group making those decisions looks > appropriate to me. > >> "On the other hand the 'core' system seems to function on a > model of mutual deference and personal loyalty that I believe is > destructive of good management." > > That sounds actually like a good basis for team work to me. Also that has > "mutual" in it instead of just deferring and being loyal to a president. > > Since I know scipy development much better: > > scipy has made a huge progress in the last 5 or 6 years since I've been > following it, both in terms of code, in terms of development workflow, and > in the number of developers. (When I started, I was essentially alone in > scipy.stats, now there are 3 to 5 "core" developers that at least partially > work on it, everything goes through PRs with public discussion and with > critical issues additionally raised on the mailing list.) > > Ralf and Pauli are the defacto BDFLs for scipy overall, but all decisions in > recent years have been without a fight, but not without lots of arguments, > and, given the size and breadth of scipy, there are field experts (although > not enough of those) to help in the discussion. I agree entirely, I think scipy is a good example where stability and clarity of leadership has made a huge difference to the health of the project. Cheers, Matthew [1] https://hbr.org/2005/07/level-5-leadership-the-triumph-of-humility-and-fierce-resolve From matthew.brett at gmail.com Thu Aug 27 11:12:28 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 16:12:28 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: Hi, On Thu, Aug 27, 2015 at 3:43 PM, Bryan Van de Ven wrote: > >> On Aug 27, 2015, at 1:57 PM, Matthew Brett wrote: >> >> The 'president' idea > > ...seems to be predicated on a steady stream of people who: actually want job, don't mind campaigning, are willing to accept any and all blame, and have the technical experience to make "final decisions". As others have pointed out the active developer community for NumPy is not measured in the hundreds (or even the tens, really). So: what is your proposed recourse if you hold an election and no-one shows up to run? > That seems to me a soluble problem, if there's agreement that the president idea is a sensible one. One very simple idea would be to revert to a 'core' system for the term for which there were no candidates. On the other hand, I suspect that there are people who care enough about numpy that they are prepared to step up and take the blame if things go wrong. Cheers, Matthew From ben.v.root at gmail.com Thu Aug 27 11:15:40 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 27 Aug 2015 11:15:40 -0400 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: <1440686674.11529.5.camel@sipsolutions.net> References: <20150826151141.17db3046@fsol> <20150827134424.38ec9eac@fsol> <1440686674.11529.5.camel@sipsolutions.net> Message-ID: Ok, I just wanted to make sure I understood the issue before going bug hunting. 
Chances are, it has been a bug on our end for a while now. Just to make sure, is the following valid? arr = np.zeros((5, 3)) ind = np.array([True, True, True, False, True]) arr[ind] # gives a 4x3 result Running that at the REPL doesn't produce a warning, so i am guessing that it is valid. Ben Root On Thu, Aug 27, 2015 at 10:44 AM, Sebastian Berg wrote: > On Do, 2015-08-27 at 08:04 -0600, Charles R Harris wrote: > > > > > > On Thu, Aug 27, 2015 at 7:52 AM, Benjamin Root > > wrote: > > > > > > Ok, I tested matplotlib master against numpy master, and there > > were no errors. I did get a bunch of new deprecation warnings > > though such as: > > > > > "/nas/home/broot/centos6/lib/python2.7/site-packages/matplotlib-1.5.dev1-py2.7-linux-x86_64.egg/matplotlib/colorbar.py:539: > VisibleDeprecationWarning: boolean index did not match indexed array along > dimension 0; dimension is 5 but corresponding boolean dimension is 3 > > colors = np.asarray(colors)[igood]" > > > > > > The message isn't exactly clear. I suspect the problem is a > > shape mismatch, like colors is 5x3, and igood is just 3 for > > some reason. Could somebody shine some light on this, please? > > > > > > > > IIRC, Boolean indexing would fill out the dimension, i.e., len 3 would > > be expanded to len 5 in this case. That behavior is deprecated. > > > > Yes, this is exactly the case, you have something like: > > arr = np.zeros((5, 3)) > ind = np.array([True, False, False]) > arr[ind, :] > > and numpy nowadays thinks that such code is likely a bug (when the ind > is shorter than arr it is somewhat OK, the other way around gets more > creepy). If you have an idea of how to make the error message clearer, > or objections to the change, I am happy to hear it! > > - Sebastian > > > > > > Chuck > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Aug 27 11:33:16 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 27 Aug 2015 17:33:16 +0200 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: References: <20150826151141.17db3046@fsol> <20150827134424.38ec9eac@fsol> <1440686674.11529.5.camel@sipsolutions.net> Message-ID: <1440689596.11529.7.camel@sipsolutions.net> On Do, 2015-08-27 at 11:15 -0400, Benjamin Root wrote: > Ok, I just wanted to make sure I understood the issue before going bug > hunting. Chances are, it has been a bug on our end for a while now. > Just to make sure, is the following valid? > > > arr = np.zeros((5, 3)) > > ind = np.array([True, True, True, False, True]) > > arr[ind] # gives a 4x3 result > > > Running that at the REPL doesn't produce a warning, so i am guessing > that it is valid. > Sure, that is perfect (you can add the slice and write `arr[ind, :]` to make it a bit more clear if you like I guess). - Sebastian > > Ben Root > > > On Thu, Aug 27, 2015 at 10:44 AM, Sebastian Berg > wrote: > On Do, 2015-08-27 at 08:04 -0600, Charles R Harris wrote: > > > > > > On Thu, Aug 27, 2015 at 7:52 AM, Benjamin Root > > > wrote: > > > > > > Ok, I tested matplotlib master against numpy master, > and there > > were no errors. 
I did get a bunch of new deprecation > warnings > > though such as: > > > > > "/nas/home/broot/centos6/lib/python2.7/site-packages/matplotlib-1.5.dev1-py2.7-linux-x86_64.egg/matplotlib/colorbar.py:539: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 5 but corresponding boolean dimension is 3 > > colors = np.asarray(colors)[igood]" > > > > > > The message isn't exactly clear. I suspect the > problem is a > > shape mismatch, like colors is 5x3, and igood is > just 3 for > > some reason. Could somebody shine some light on > this, please? > > > > > > > > IIRC, Boolean indexing would fill out the dimension, i.e., > len 3 would > > be expanded to len 5 in this case. That behavior is > deprecated. > > > > Yes, this is exactly the case, you have something like: > > arr = np.zeros((5, 3)) > ind = np.array([True, False, False]) > arr[ind, :] > > and numpy nowadays thinks that such code is likely a bug (when > the ind > is shorter than arr it is somewhat OK, the other way around > gets more > creepy). If you have an idea of how to make the error message > clearer, > or objections to the change, I am happy to hear it! > > - Sebastian > > > > > > Chuck > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From ben.v.root at gmail.com Thu Aug 27 11:49:22 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 27 Aug 2015 11:49:22 -0400 Subject: [Numpy-discussion] 1.10.0rc1 In-Reply-To: <1440689596.11529.7.camel@sipsolutions.net> References: <20150826151141.17db3046@fsol> <20150827134424.38ec9eac@fsol> <1440686674.11529.5.camel@sipsolutions.net> <1440689596.11529.7.camel@sipsolutions.net> Message-ID: The reason why we don't have that extra slice is because we may not know ahead of time that we are dealing with a 2D array. It could be a 1D array. I guess we could use ellipses, but I wanted to make sure that the numpy devs consider the above to be perfectly valid semantics because it is entrenched in our codebase. Ben Root On Thu, Aug 27, 2015 at 11:33 AM, Sebastian Berg wrote: > On Do, 2015-08-27 at 11:15 -0400, Benjamin Root wrote: > > Ok, I just wanted to make sure I understood the issue before going bug > > hunting. Chances are, it has been a bug on our end for a while now. > > Just to make sure, is the following valid? > > > > > > arr = np.zeros((5, 3)) > > > > ind = np.array([True, True, True, False, True]) > > > > arr[ind] # gives a 4x3 result > > > > > > Running that at the REPL doesn't produce a warning, so i am guessing > > that it is valid. > > > > Sure, that is perfect (you can add the slice and write `arr[ind, :]` to > make it a bit more clear if you like I guess). 
> > - Sebastian > > > > > > Ben Root > > > > > > On Thu, Aug 27, 2015 at 10:44 AM, Sebastian Berg > > wrote: > > On Do, 2015-08-27 at 08:04 -0600, Charles R Harris wrote: > > > > > > > > > On Thu, Aug 27, 2015 at 7:52 AM, Benjamin Root > > > > > wrote: > > > > > > > > > Ok, I tested matplotlib master against numpy master, > > and there > > > were no errors. I did get a bunch of new deprecation > > warnings > > > though such as: > > > > > > > > > "/nas/home/broot/centos6/lib/python2.7/site-packages/matplotlib-1.5.dev1-py2.7-linux-x86_64.egg/matplotlib/colorbar.py:539: > VisibleDeprecationWarning: boolean index did not match indexed array along > dimension 0; dimension is 5 but corresponding boolean dimension is 3 > > > colors = np.asarray(colors)[igood]" > > > > > > > > > The message isn't exactly clear. I suspect the > > problem is a > > > shape mismatch, like colors is 5x3, and igood is > > just 3 for > > > some reason. Could somebody shine some light on > > this, please? > > > > > > > > > > > > IIRC, Boolean indexing would fill out the dimension, i.e., > > len 3 would > > > be expanded to len 5 in this case. That behavior is > > deprecated. > > > > > > > Yes, this is exactly the case, you have something like: > > > > arr = np.zeros((5, 3)) > > ind = np.array([True, False, False]) > > arr[ind, :] > > > > and numpy nowadays thinks that such code is likely a bug (when > > the ind > > is shorter than arr it is somewhat OK, the other way around > > gets more > > creepy). If you have an idea of how to make the error message > > clearer, > > or objections to the change, I am happy to hear it! > > > > - Sebastian > > > > > > > > > > Chuck > > > > > > > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Aug 27 12:11:56 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 Aug 2015 12:11:56 -0400 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett wrote: > Hi, > > On Thu, Aug 27, 2015 at 3:34 PM, wrote: > [snip] > > I don't really see a problem with "codifying" the status quo. > > That's an excellent point. If we believe that the current situation > is the best possible, both now and in the future, then codifying the > status quo is an excellent idea. > > So, we should probably first start by asking ourselves: > > * what numpy is doing well; > * what numpy could do better; > > and then ask, is there some way we could make it more likely we will > improve over time. 
> > [snip] > > > As the current debate shows it's possible to have a public discussion > about > > the direction of the project without having to delegate providing a > vision > > to a president. > > The idea of a president that I had in mind, was not someone who makes > all decisions, but the person who holds themselves responsible for the > performance of the project. If the project has a coherent vision > already, the president has no need to provide one, but it's the > president's job to worry about whether we have vision or not, and do > what they need to, to make sure we don't lose track of that. If you > don't know it already, I highly recommend Jim Collins' work on 'level > 5 leadership' [1] > Still doesn't sound like the need for a president to me " the person who holds themselves responsible for the performance of the project" sounds more like the role of the "core" group (adding plural to persons) to me, and cannot be pushed of to an official president. Nathaniel to push and organize the discussion, Chuck for continuity, and several core developers for detailed ideas and implementation, and a large number of contributors. (stylized roles) and noisy mailing list for feedback and discussion. Given the size of the numpy development group, numpy needs individuals for the vision and to push things not a president, vice-presidents and assistant vice-presidents, IMO. (Given the importance of numpy itself, there should be enough remedies if the "core" group ever gets `out of touch` with the very large user base.) Josef > > > Given the current pattern all critical issues end up in a public debate > on > > the mailing list. What numpy (and scipy) need is to have someone as a tie > > breaker to make any final decisions if there is no clear consensus, if > there > > is no BDFL for it, then having the "core" group making those decisions > looks > > appropriate to me. > > > >> "On the other hand the 'core' system seems to function on a > > model of mutual deference and personal loyalty that I believe is > > destructive of good management." > > > > That sounds actually like a good basis for team work to me. Also that has > > "mutual" in it instead of just deferring and being loyal to a president. > > > > Since I know scipy development much better: > > > > scipy has made a huge progress in the last 5 or 6 years since I've been > > following it, both in terms of code, in terms of development workflow, > and > > in the number of developers. (When I started, I was essentially alone in > > scipy.stats, now there are 3 to 5 "core" developers that at least > partially > > work on it, everything goes through PRs with public discussion and with > > critical issues additionally raised on the mailing list.) > > > > Ralf and Pauli are the defacto BDFLs for scipy overall, but all > decisions in > > recent years have been without a fight, but not without lots of > arguments, > > and, given the size and breadth of scipy, there are field experts > (although > > not enough of those) to help in the discussion. > > I agree entirely, I think scipy is a good example where stability and > clarity of leadership has made a huge difference to the health of the > project. 
> > Cheers, > > Matthew > > [1] > https://hbr.org/2005/07/level-5-leadership-the-triumph-of-humility-and-fierce-resolve > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Aug 27 12:22:21 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 17:22:21 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: Hi On Thu, Aug 27, 2015 at 5:11 PM, wrote: > > > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett > wrote: >> >> Hi, >> >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: >> [snip] >> > I don't really see a problem with "codifying" the status quo. >> >> That's an excellent point. If we believe that the current situation >> is the best possible, both now and in the future, then codifying the >> status quo is an excellent idea. >> >> So, we should probably first start by asking ourselves: >> >> * what numpy is doing well; >> * what numpy could do better; >> >> and then ask, is there some way we could make it more likely we will >> improve over time. >> >> [snip] >> >> > As the current debate shows it's possible to have a public discussion >> > about >> > the direction of the project without having to delegate providing a >> > vision >> > to a president. >> >> The idea of a president that I had in mind, was not someone who makes >> all decisions, but the person who holds themselves responsible for the >> performance of the project. If the project has a coherent vision >> already, the president has no need to provide one, but it's the >> president's job to worry about whether we have vision or not, and do >> what they need to, to make sure we don't lose track of that. If you >> don't know it already, I highly recommend Jim Collins' work on 'level >> 5 leadership' [1] > > > Still doesn't sound like the need for a president to me > > " the person who holds themselves responsible for the > performance of the project" > > sounds more like the role of the "core" group (adding plural to persons) to > me, and cannot be pushed of to an official president. Except that, in the past, having multiple people taking decisions has led to the situation where no-one feels themselves accountable for the result, hence this situation tends to lead to stagnation. > Nathaniel to push and organize the discussion, Chuck for continuity, and > several core developers for detailed ideas and implementation, and a large > number of contributors. (stylized roles) > and noisy mailing list for feedback and discussion. > > Given the size of the numpy development group, numpy needs individuals for > the vision and to push things not a president, vice-presidents and assistant > vice-presidents, IMO. Yes, if the roles were honorary and administrative, they would not be useful. 
Cheers, Matthew From josef.pktd at gmail.com Thu Aug 27 13:23:53 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 Aug 2015 13:23:53 -0400 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett wrote: > Hi > > On Thu, Aug 27, 2015 at 5:11 PM, wrote: > > > > > > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett > > > wrote: > >> > >> Hi, > >> > >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: > >> [snip] > >> > I don't really see a problem with "codifying" the status quo. > >> > >> That's an excellent point. If we believe that the current situation > >> is the best possible, both now and in the future, then codifying the > >> status quo is an excellent idea. > >> > >> So, we should probably first start by asking ourselves: > >> > >> * what numpy is doing well; > >> * what numpy could do better; > >> > >> and then ask, is there some way we could make it more likely we will > >> improve over time. > >> > >> [snip] > >> > >> > As the current debate shows it's possible to have a public discussion > >> > about > >> > the direction of the project without having to delegate providing a > >> > vision > >> > to a president. > >> > >> The idea of a president that I had in mind, was not someone who makes > >> all decisions, but the person who holds themselves responsible for the > >> performance of the project. If the project has a coherent vision > >> already, the president has no need to provide one, but it's the > >> president's job to worry about whether we have vision or not, and do > >> what they need to, to make sure we don't lose track of that. If you > >> don't know it already, I highly recommend Jim Collins' work on 'level > >> 5 leadership' [1] > > > > > > Still doesn't sound like the need for a president to me > > > > " the person who holds themselves responsible for the > > performance of the project" > > > > sounds more like the role of the "core" group (adding plural to persons) > to > > me, and cannot be pushed of to an official president. > > Except that, in the past, having multiple people taking decisions has > led to the situation where no-one feels themselves accountable for the > result, hence this situation tends to lead to stagnation. > Is there any evidence for this? First, it's several individuals taking joint decisions, jointly agree or not object (LGTM) to merging PRs, it's still a joint decision and accountability is not exclusive. The PR review process makes decisions much more into a joint decision process than it was with isolated SVN commits. (*) Second, if there are separated decisions, then it could also lead to excess change. All these enthusiastic new developers bringing in whatever they (and the local chief) like, and nobody to stop them. In either case, the developer, or local chief, has to deal with the consequences. You merged this PR, now fix it. or Why are you holding up this PR? I can merge it. (*) Even though I'm not a scipy developer anymore, I still feel partially responsible for it, I'm still reviewing some PRs and comment on them, sometimes as cheerleader in favor of something or sometimes pointing out problems, or just checking that it makes sense, and always with an eye on what downstream impact it might have. 
Another thought: Having an accountable president might actually reduce the feeling of accountability and responsibility of individual developers, so the neteffect is negative. "The president is responsible for this (even though he doesn't have enough time), so I can skip part of this review." Josef > > > Nathaniel to push and organize the discussion, Chuck for continuity, and > > several core developers for detailed ideas and implementation, and a > large > > number of contributors. (stylized roles) > > and noisy mailing list for feedback and discussion. > > > > Given the size of the numpy development group, numpy needs individuals > for > > the vision and to push things not a president, vice-presidents and > assistant > > vice-presidents, IMO. > > Yes, if the roles were honorary and administrative, they would not be > useful. > I'm not sure what you mean here. Given that it's all volunteer work, any president wouldn't have any hard tools. Josef > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Aug 27 13:41:19 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 27 Aug 2015 19:41:19 +0200 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: <1440697279.11529.64.camel@sipsolutions.net> On Do, 2015-08-27 at 17:22 +0100, Matthew Brett wrote: > Hi > > On Thu, Aug 27, 2015 at 5:11 PM, wrote: > > > > > > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: > >> [snip] > >> > I don't really see a problem with "codifying" the status quo. > >> > >> That's an excellent point. If we believe that the current situation > >> is the best possible, both now and in the future, then codifying the > >> status quo is an excellent idea. > >> > >> So, we should probably first start by asking ourselves: > >> > >> * what numpy is doing well; > >> * what numpy could do better; > >> > >> and then ask, is there some way we could make it more likely we will > >> improve over time. > >> > >> [snip] > >> > >> > As the current debate shows it's possible to have a public discussion > >> > about > >> > the direction of the project without having to delegate providing a > >> > vision > >> > to a president. > >> > >> The idea of a president that I had in mind, was not someone who makes > >> all decisions, but the person who holds themselves responsible for the > >> performance of the project. If the project has a coherent vision > >> already, the president has no need to provide one, but it's the > >> president's job to worry about whether we have vision or not, and do > >> what they need to, to make sure we don't lose track of that. If you > >> don't know it already, I highly recommend Jim Collins' work on 'level > >> 5 leadership' [1] > > > > > > Still doesn't sound like the need for a president to me > > > > " the person who holds themselves responsible for the > > performance of the project" > > > > sounds more like the role of the "core" group (adding plural to persons) to > > me, and cannot be pushed of to an official president. 
> > Except that, in the past, having multiple people taking decisions has > led to the situation where no-one feels themselves accountable for the > result, hence this situation tends to lead to stagnation. Frankly, I am failing to see the direction of these arguments. One thing to remember is that a "core" group is much like a BDFL/president with multiple personalities ;), and a "core" group is not a fixed oligarchy. Anyone able and willing should be in it, and the governance document is clear about that I think (of course nothing is perfect, but we can try). Then there is the question of "how". I simply fail to see how the president can even be defined considering the size of the numpy development team (say 10, most of whom are busy with other things most of the time). Also, I fail to see how the president would be any more useful than the agreement of some tasks being handled by some people who are enthusiastic about them (note those do not even have to be in the "core" group for starters, though they should become part of it quickly). This is a community effort and I am starting to feel that the ideas you are giving are from a different management/company context. The goal of the governance is to show how, and hopefully make it easy for, *anyone* to provide vision. We do not need a manager who decides how to focus and allocate resources; instead we must tell everyone that we are happy about any help we can get, and that anyone can pick up a topic they are enthusiastic about and drive numpy ahead. And considering accountability, that help may well amount to saying: "Do NOT do this." A "president" willing to run for such an election should have a specific vision? Why should they be special to implement it? Note this is also the case in BDFL organizations. If you have a vision to improve Python, it does not really matter if you happen to be Guido. You write a PEP and, if people like it, it will be implemented. At the same time we *must* have a well-defined form of governance also for organizational things. Right now we cannot even decide on putting someone in charge of overseeing our NumFOCUS donations. NumPy could not even spend its own money! Sorry, getting way too long :(.... - Sebastian > > Nathaniel to push and organize the discussion, Chuck for continuity, and > > several core developers for detailed ideas and implementation, and a large > > number of contributors. (stylized roles) > > and noisy mailing list for feedback and discussion. > > > > Given the size of the numpy development group, numpy needs individuals for > > the vision and to push things not a president, vice-presidents and assistant > > vice-presidents, IMO. > > Yes, if the roles were honorary and administrative, they would not be useful. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From matthew.brett at gmail.com Thu Aug 27 14:06:10 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Aug 2015 19:06:10 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: Hi, On Thu, Aug 27, 2015 at 6:23 PM, wrote: > > > On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett > wrote: >> >> Hi >> >> On Thu, Aug 27, 2015 at 5:11 PM, wrote: >> > >> > >> > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett >> > >> > wrote: >> >> >> >> Hi, >> >> >> >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: >> >> [snip] >> >> > I don't really see a problem with "codifying" the status quo. >> >> >> >> That's an excellent point. If we believe that the current situation >> >> is the best possible, both now and in the future, then codifying the >> >> status quo is an excellent idea. >> >> >> >> So, we should probably first start by asking ourselves: >> >> >> >> * what numpy is doing well; >> >> * what numpy could do better; >> >> >> >> and then ask, is there some way we could make it more likely we will >> >> improve over time. >> >> >> >> [snip] >> >> >> >> > As the current debate shows it's possible to have a public discussion >> >> > about >> >> > the direction of the project without having to delegate providing a >> >> > vision >> >> > to a president. >> >> >> >> The idea of a president that I had in mind, was not someone who makes >> >> all decisions, but the person who holds themselves responsible for the >> >> performance of the project. If the project has a coherent vision >> >> already, the president has no need to provide one, but it's the >> >> president's job to worry about whether we have vision or not, and do >> >> what they need to, to make sure we don't lose track of that. If you >> >> don't know it already, I highly recommend Jim Collins' work on 'level >> >> 5 leadership' [1] >> > >> > >> > Still doesn't sound like the need for a president to me >> > >> > " the person who holds themselves responsible for the >> > performance of the project" >> > >> > sounds more like the role of the "core" group (adding plural to persons) >> > to >> > me, and cannot be pushed of to an official president. >> >> Except that, in the past, having multiple people taking decisions has >> led to the situation where no-one feels themselves accountable for the >> result, hence this situation tends to lead to stagnation. > > > Is there any evidence for this? Oh - dear - that's the key point, but I'm obviously not making it clearly enough. Yes there is, and that was the evidence I was pointing to before. But anyway - Sebastian is right - this discussion isn't going anywhere useful. So - let's step back. In thinking about governance, we first need to ask what we want to achieve. This includes considering the risks ahead for the project. So, in the spirit of fruitful discussion, can I ask what y'all consider to be the current problems with working on numpy (other than the technical ones). What is numpy doing well, and what is it doing badly? What risks do we have to plan for in the future? 
Cheers, Matthew From stefanv at berkeley.edu Thu Aug 27 15:34:45 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 27 Aug 2015 12:34:45 -0700 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: <87lhcwle3u.fsf@berkeley.edu> On 2015-08-27 11:06:10, Matthew Brett wrote: > So, in the spirit of fruitful discussion, can I ask what y'all > consider to be the current problems with working on numpy (other > than the technical ones). What is numpy doing well, and what > is it doing badly? What risks do we have to plan for in the > future? It looks to me as though the team is doing an excellent job of maintaining NumPy. The growth of the project has stagnated somewhat for numerous reasons---and a lack of ideas on the table is not one of them, rather whether / how to take them forward. The question, and I think what you also highlighted in the earlier part of this discussion, is: how to decide on which vision to adopt, and who takes responsibility for making that happen? Are the two models proposed thus far so different, or can they be merged in a way that makes sense? E.g., can we work as a community to rally behind a vision as set out by one person, and then repeat that process to focus on another a year later? Think of it as the iterative development equivalent of governance. This may just be another way of phrasing a precedency, but with a strong emphasis on its temporary nature, as well as a focus on a group-decided outcome. Alternatively, see it as a community governance model with a strong emphasis on responsibility. St?fan From josef.pktd at gmail.com Thu Aug 27 15:35:07 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 Aug 2015 15:35:07 -0400 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: On Thu, Aug 27, 2015 at 2:06 PM, Matthew Brett wrote: > Hi, > > On Thu, Aug 27, 2015 at 6:23 PM, wrote: > > > > > > On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett > > > wrote: > >> > >> Hi > >> > >> On Thu, Aug 27, 2015 at 5:11 PM, wrote: > >> > > >> > > >> > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett > >> > > >> > wrote: > >> >> > >> >> Hi, > >> >> > >> >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: > >> >> [snip] > >> >> > I don't really see a problem with "codifying" the status quo. > >> >> > >> >> That's an excellent point. If we believe that the current > situation > >> >> is the best possible, both now and in the future, then codifying the > >> >> status quo is an excellent idea. > >> >> > >> >> So, we should probably first start by asking ourselves: > >> >> > >> >> * what numpy is doing well; > >> >> * what numpy could do better; > >> >> > >> >> and then ask, is there some way we could make it more likely we will > >> >> improve over time. > >> >> > >> >> [snip] > >> >> > >> >> > As the current debate shows it's possible to have a public > discussion > >> >> > about > >> >> > the direction of the project without having to delegate providing a > >> >> > vision > >> >> > to a president. > >> >> > >> >> The idea of a president that I had in mind, was not someone who makes > >> >> all decisions, but the person who holds themselves responsible for > the > >> >> performance of the project. 
If the project has a coherent vision > >> >> already, the president has no need to provide one, but it's the > >> >> president's job to worry about whether we have vision or not, and do > >> >> what they need to, to make sure we don't lose track of that. If you > >> >> don't know it already, I highly recommend Jim Collins' work on 'level > >> >> 5 leadership' [1] > >> > > >> > > >> > Still doesn't sound like the need for a president to me > >> > > >> > " the person who holds themselves responsible for the > >> > performance of the project" > >> > > >> > sounds more like the role of the "core" group (adding plural to > persons) > >> > to > >> > me, and cannot be pushed of to an official president. > >> > >> Except that, in the past, having multiple people taking decisions has > >> led to the situation where no-one feels themselves accountable for the > >> result, hence this situation tends to lead to stagnation. > > > > > > Is there any evidence for this? > > Oh - dear - that's the key point, but I'm obviously not making it > clearly enough. Yes there is, and that was the evidence I was > pointing to before. > If you mean the XFree and NetBSD cases, then I don't see any similarity to the numpy or scipy development pattern. If I were to draw any conclusion, it would be that NetBSD had too much formal governance structure and not enough informal governance. It would be difficult to take over the government if there is no government. Just one aside, on "No desire to recruit new users": we are on a mission to take over the world (*). And forks of numpy like pandas turn out to be mostly complementary and increase the userbase of numpy. (R and Python are in friendly, or sometimes unfriendly, competition, but, AFAICS, we are both gaining users because of the others' presence. It's not a zero-sum game in this case.) (*) But that's not in the "mission statement". > > But anyway - Sebastian is right - this discussion isn't going anywhere > useful. > > So - let's step back. > > In thinking about governance, we first need to ask what we want to > achieve. This includes considering the risks ahead for the project. > > So, in the spirit of fruitful discussion, can I ask what y'all > consider to be the current problems with working on numpy (other than > the technical ones). What is numpy doing well, and what is it doing > badly? What risks do we have to plan for in the future? > I thought that was implicit or explicit in the other thread. Josef > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Aug 27 17:45:50 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 27 Aug 2015 23:45:50 +0200 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: <87lhcwle3u.fsf@berkeley.edu> References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> <87lhcwle3u.fsf@berkeley.edu> Message-ID: <1440711950.11529.95.camel@sipsolutions.net> On Do, 2015-08-27 at 12:34 -0700, Stefan van der Walt wrote: > On 2015-08-27 11:06:10, Matthew Brett > wrote: > > So, in the spirit of fruitful discussion, can I ask what y'all > > consider to be the current problems with working on numpy (other > > than the technical ones).
What is numpy doing well, and what > > is it doing badly? What risks do we have to plan for in the > > future? > > It looks to me as though the team is doing an excellent job of > maintaining NumPy. The growth of the project has stagnated > somewhat for numerous reasons---and a lack of ideas on the table > is not one of them, rather whether / how to take them forward. > > The question, and I think what you also highlighted in the earlier > part of this discussion, is: how to decide on which vision to > adopt, and who takes responsibility for making that happen? > > Are the two models proposed thus far so different, or can they be > merged in a way that makes sense? E.g., can we work as a > community to rally behind a vision as set out by one person, and > then repeat that process to focus on another a year later? Think > of it as the iterative development equivalent of governance. > > This may just be another way of phrasing a precedency, but with a > strong emphasis on its temporary nature, as well as a focus on a > group-decided outcome. Alternatively, see it as a community > governance model with a strong emphasis on responsibility. > Agreed. Are not PEP's/NEP's just that (and could possibly be formalized more, not sure how much they are in the current proposal) in some sense? Since they have a sponsor/author who can be said to be assigned to it/responsible once accepted. I will add one more thing which I think is important: The governance has to be create as little hassle as possible and it should be simple/short enough to quickly understand. - Sebastian > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From stefanv at berkeley.edu Thu Aug 27 19:47:36 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 27 Aug 2015 16:47:36 -0700 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: <1440711950.11529.95.camel@sipsolutions.net> References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> <87lhcwle3u.fsf@berkeley.edu> <1440711950.11529.9 5.camel@sipsolutions.net> Message-ID: <87zj1cjntz.fsf@berkeley.edu> Hi Sebastian On 2015-08-27 14:45:50, Sebastian Berg wrote: > Agreed. Are not PEP's/NEP's just that (and could possibly be > formalized more, not sure how much they are in the current > proposal) in some sense? Since they have a sponsor/author who > can be said to be assigned to it/responsible once accepted. I would consider a collection of NEPs for the following year to be such a thing. When implementing a bigger plan, it does help to have one person who owns the entire vision at the helm, pushing it forward. I think of it like a symphony orchestra: we agree to play the same music, and then let the director oversee the execution of the piece as a whole. > The governance has to be create as little hassle as possible and > it should be simple/short enough to quickly understand. I completely agree; and I think bikeshedding (everyone gets to argue their point of view, regardless of the stakes) can be the antithesis to productive focus. 
St?fan From jaime.frio at gmail.com Fri Aug 28 00:59:35 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Thu, 27 Aug 2015 21:59:35 -0700 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: On Thu, Aug 27, 2015 at 11:06 AM, Matthew Brett wrote: > Hi, > > On Thu, Aug 27, 2015 at 6:23 PM, wrote: > > > > > > On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett > > > wrote: > >> > >> Hi > >> > >> On Thu, Aug 27, 2015 at 5:11 PM, wrote: > >> > > >> > > >> > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett > >> > > >> > wrote: > >> >> > >> >> Hi, > >> >> > >> >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: > >> >> [snip] > >> >> > I don't really see a problem with "codifying" the status quo. > >> >> > >> >> That's an excellent point. If we believe that the current > situation > >> >> is the best possible, both now and in the future, then codifying the > >> >> status quo is an excellent idea. > >> >> > >> >> So, we should probably first start by asking ourselves: > >> >> > >> >> * what numpy is doing well; > >> >> * what numpy could do better; > >> >> > >> >> and then ask, is there some way we could make it more likely we will > >> >> improve over time. > >> >> > >> >> [snip] > >> >> > >> >> > As the current debate shows it's possible to have a public > discussion > >> >> > about > >> >> > the direction of the project without having to delegate providing a > >> >> > vision > >> >> > to a president. > >> >> > >> >> The idea of a president that I had in mind, was not someone who makes > >> >> all decisions, but the person who holds themselves responsible for > the > >> >> performance of the project. If the project has a coherent vision > >> >> already, the president has no need to provide one, but it's the > >> >> president's job to worry about whether we have vision or not, and do > >> >> what they need to, to make sure we don't lose track of that. If you > >> >> don't know it already, I highly recommend Jim Collins' work on 'level > >> >> 5 leadership' [1] > >> > > >> > > >> > Still doesn't sound like the need for a president to me > >> > > >> > " the person who holds themselves responsible for the > >> > performance of the project" > >> > > >> > sounds more like the role of the "core" group (adding plural to > persons) > >> > to > >> > me, and cannot be pushed of to an official president. > >> > >> Except that, in the past, having multiple people taking decisions has > >> led to the situation where no-one feels themselves accountable for the > >> result, hence this situation tends to lead to stagnation. > > > > > > Is there any evidence for this? > > Oh - dear - that's the key point, but I'm obviously not making it > clearly enough. Yes there is, and that was the evidence I was > pointing to before. > > But anyway - Sebastian is right - this discussion isn't going anywhere > useful. > > So - let's step back. > > In thinking about governance, we first need to ask what we want to > achieve. This includes considering the risks ahead for the project. > > So, in the spirit of fruitful discussion, can I ask what y'all > consider to be the current problems with working on numpy (other than > the technical ones). What is numpy doing well, and what is it doing > badly? What risks do we have to plan for in the future? 
> Are you trying to prove the point that consensus doesn't work by making it impossible to reach a consensus on this? ;-) One thing we are doing very badly is leveraging resources outside of contributions of work and time from individuals. Getting sponsors to finance work on what is the cornerstone of just about any Python package that has to add two numbers together shouldn't be too hard, especially seeing success stories like Jupyter's, who I believe has several paid developers working full time. That requires formalizing governance, because apparently sponsors are a little wary of giving money to "people on the internet". ;-) Fernando P?rez was extremely emphatic about the size of the opportunity NumPy was letting slip by not formalizing *any* governance model. And it is a necessary first step so that e.g. we have the money to, say a year from now, get the right people together for a couple of days to figure out a better governance model. I'd argue that money would be better spent financing a talented developer to advance e.g. Nathaniel's new dtype system to end all dtype systems, but that's a different story. Largely because of the above, even if Nathaniel's document involved tossing a coin to resolve disputes, I'd rather have that now than something much better never. Because there really is no alternative to Nathaniel's write-up of the status quo, other than the status quo without a write-up: it has taken him two months to put this draft together, **after** we agreed over several hours of face to face discussion on what the model should be. And I'm sure he has hated every minute he has had to put into it. So if we keep going around this in circles, after a few days we will all grow tired and go back to fighting over whether indexing should transpose subspaces or not, and all that other cool stuff we really enjoy. And a year from now we will be in the same place we are now, only a year older and deeper in (technical) debt. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Fri Aug 28 04:24:30 2015 From: faltet at gmail.com (Francesc Alted) Date: Fri, 28 Aug 2015 10:24:30 +0200 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: +10 Very well written down ideas Jaime. 2015-08-28 6:59 GMT+02:00 Jaime Fern?ndez del R?o : > On Thu, Aug 27, 2015 at 11:06 AM, Matthew Brett > wrote: > >> Hi, >> >> On Thu, Aug 27, 2015 at 6:23 PM, wrote: >> > >> > >> > On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett < >> matthew.brett at gmail.com> >> > wrote: >> >> >> >> Hi >> >> >> >> On Thu, Aug 27, 2015 at 5:11 PM, wrote: >> >> > >> >> > >> >> > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett >> >> > >> >> > wrote: >> >> >> >> >> >> Hi, >> >> >> >> >> >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: >> >> >> [snip] >> >> >> > I don't really see a problem with "codifying" the status quo. >> >> >> >> >> >> That's an excellent point. If we believe that the current >> situation >> >> >> is the best possible, both now and in the future, then codifying the >> >> >> status quo is an excellent idea. 
>> >> >> >> >> >> So, we should probably first start by asking ourselves: >> >> >> >> >> >> * what numpy is doing well; >> >> >> * what numpy could do better; >> >> >> >> >> >> and then ask, is there some way we could make it more likely we will >> >> >> improve over time. >> >> >> >> >> >> [snip] >> >> >> >> >> >> > As the current debate shows it's possible to have a public >> discussion >> >> >> > about >> >> >> > the direction of the project without having to delegate providing >> a >> >> >> > vision >> >> >> > to a president. >> >> >> >> >> >> The idea of a president that I had in mind, was not someone who >> makes >> >> >> all decisions, but the person who holds themselves responsible for >> the >> >> >> performance of the project. If the project has a coherent vision >> >> >> already, the president has no need to provide one, but it's the >> >> >> president's job to worry about whether we have vision or not, and do >> >> >> what they need to, to make sure we don't lose track of that. If >> you >> >> >> don't know it already, I highly recommend Jim Collins' work on >> 'level >> >> >> 5 leadership' [1] >> >> > >> >> > >> >> > Still doesn't sound like the need for a president to me >> >> > >> >> > " the person who holds themselves responsible for the >> >> > performance of the project" >> >> > >> >> > sounds more like the role of the "core" group (adding plural to >> persons) >> >> > to >> >> > me, and cannot be pushed of to an official president. >> >> >> >> Except that, in the past, having multiple people taking decisions has >> >> led to the situation where no-one feels themselves accountable for the >> >> result, hence this situation tends to lead to stagnation. >> > >> > >> > Is there any evidence for this? >> >> Oh - dear - that's the key point, but I'm obviously not making it >> clearly enough. Yes there is, and that was the evidence I was >> pointing to before. >> >> But anyway - Sebastian is right - this discussion isn't going anywhere >> useful. >> >> So - let's step back. >> >> In thinking about governance, we first need to ask what we want to >> achieve. This includes considering the risks ahead for the project. >> >> So, in the spirit of fruitful discussion, can I ask what y'all >> consider to be the current problems with working on numpy (other than >> the technical ones). What is numpy doing well, and what is it doing >> badly? What risks do we have to plan for in the future? >> > > > Are you trying to prove the point that consensus doesn't work by making it > impossible to reach a consensus on this? ;-) > > > One thing we are doing very badly is leveraging resources outside of > contributions of work and time from individuals. Getting sponsors to > finance work on what is the cornerstone of just about any Python package > that has to add two numbers together shouldn't be too hard, especially > seeing success stories like Jupyter's, who I believe has several paid > developers working full time. That requires formalizing governance, > because apparently sponsors are a little wary of giving money to "people on > the internet". ;-) Fernando P?rez was extremely emphatic about the size of > the opportunity NumPy was letting slip by not formalizing *any* governance > model. And it is a necessary first step so that e.g. we have the money to, > say a year from now, get the right people together for a couple of days to > figure out a better governance model. I'd argue that money would be better > spent financing a talented developer to advance e.g. 
Nathaniel's new dtype > system to end all dtype systems, but that's a different story. > > Largely because of the above, even if Nathaniel's document involved > tossing a coin to resolve disputes, I'd rather have that now than something > much better never. Because there really is no alternative to Nathaniel's > write-up of the status quo, other than the status quo without a write-up: > it has taken him two months to put this draft together, **after** we agreed > over several hours of face to face discussion on what the model should be. > And I'm sure he has hated every minute he has had to put into it. So if we > keep going around this in circles, after a few days we will all grow tired > and go back to fighting over whether indexing should transpose subspaces or > not, and all that other cool stuff we really enjoy. And a year from now we > will be in the same place we are now, only a year older and deeper in > (technical) debt. > > Jaime > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes > de dominaci?n mundial. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Aug 28 04:46:23 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Aug 2015 09:46:23 +0100 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: Hi, On Fri, Aug 28, 2015 at 5:59 AM, Jaime Fern?ndez del R?o wrote: > On Thu, Aug 27, 2015 at 11:06 AM, Matthew Brett > wrote: >> >> Hi, >> >> On Thu, Aug 27, 2015 at 6:23 PM, wrote: >> > >> > >> > On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett >> > >> > wrote: >> >> >> >> Hi >> >> >> >> On Thu, Aug 27, 2015 at 5:11 PM, wrote: >> >> > >> >> > >> >> > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett >> >> > >> >> > wrote: >> >> >> >> >> >> Hi, >> >> >> >> >> >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: >> >> >> [snip] >> >> >> > I don't really see a problem with "codifying" the status quo. >> >> >> >> >> >> That's an excellent point. If we believe that the current >> >> >> situation >> >> >> is the best possible, both now and in the future, then codifying the >> >> >> status quo is an excellent idea. >> >> >> >> >> >> So, we should probably first start by asking ourselves: >> >> >> >> >> >> * what numpy is doing well; >> >> >> * what numpy could do better; >> >> >> >> >> >> and then ask, is there some way we could make it more likely we will >> >> >> improve over time. >> >> >> >> >> >> [snip] >> >> >> >> >> >> > As the current debate shows it's possible to have a public >> >> >> > discussion >> >> >> > about >> >> >> > the direction of the project without having to delegate providing >> >> >> > a >> >> >> > vision >> >> >> > to a president. >> >> >> >> >> >> The idea of a president that I had in mind, was not someone who >> >> >> makes >> >> >> all decisions, but the person who holds themselves responsible for >> >> >> the >> >> >> performance of the project. 
If the project has a coherent vision >> >> >> already, the president has no need to provide one, but it's the >> >> >> president's job to worry about whether we have vision or not, and do >> >> >> what they need to, to make sure we don't lose track of that. If >> >> >> you >> >> >> don't know it already, I highly recommend Jim Collins' work on >> >> >> 'level >> >> >> 5 leadership' [1] >> >> > >> >> > >> >> > Still doesn't sound like the need for a president to me >> >> > >> >> > " the person who holds themselves responsible for the >> >> > performance of the project" >> >> > >> >> > sounds more like the role of the "core" group (adding plural to >> >> > persons) >> >> > to >> >> > me, and cannot be pushed of to an official president. >> >> >> >> Except that, in the past, having multiple people taking decisions has >> >> led to the situation where no-one feels themselves accountable for the >> >> result, hence this situation tends to lead to stagnation. >> > >> > >> > Is there any evidence for this? >> >> Oh - dear - that's the key point, but I'm obviously not making it >> clearly enough. Yes there is, and that was the evidence I was >> pointing to before. >> >> But anyway - Sebastian is right - this discussion isn't going anywhere >> useful. >> >> So - let's step back. >> >> In thinking about governance, we first need to ask what we want to >> achieve. This includes considering the risks ahead for the project. >> >> So, in the spirit of fruitful discussion, can I ask what y'all >> consider to be the current problems with working on numpy (other than >> the technical ones). What is numpy doing well, and what is it doing >> badly? What risks do we have to plan for in the future? > > > Are you trying to prove the point that consensus doesn't work by making it > impossible to reach a consensus on this? ;-) > Forgive me if I use this joke to see if I can get us any further. If this was code, I think this joke would not be funny, because we wouldn't expect to reach consensus without considering all the options, and discussing their pros and cons. Why would that not be useful in the case of forms of governance? One reason might be that the specific form of governance can have no influence on the long-term health of the project. I am convinced that that is wrong - that the form of governance has a large influence on the long-term health of a project. If there is some possibility that this is true, then it seems to me that we would be foolish not to try and come to some reasoned choice about the form of governance. 
Cheers, Matthew From sebastian at sipsolutions.net Fri Aug 28 05:40:05 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 28 Aug 2015 11:40:05 +0200 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> Message-ID: <1440754805.11529.109.camel@sipsolutions.net> On Fr, 2015-08-28 at 09:46 +0100, Matthew Brett wrote: > Hi, > > On Fri, Aug 28, 2015 at 5:59 AM, Jaime Fern?ndez del R?o > wrote: > > On Thu, Aug 27, 2015 at 11:06 AM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> On Thu, Aug 27, 2015 at 6:23 PM, wrote: > >> > > >> > > >> > On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett > >> > > >> > wrote: > >> >> > >> >> Hi > >> >> > >> >> On Thu, Aug 27, 2015 at 5:11 PM, wrote: > >> >> > > >> >> > > >> >> > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett > >> >> > > >> >> > wrote: > >> >> >> > >> >> >> Hi, > >> >> >> > >> >> >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: > >> >> >> [snip] > >> >> >> > I don't really see a problem with "codifying" the status quo. > >> >> >> > >> >> >> That's an excellent point. If we believe that the current > >> >> >> situation > >> >> >> is the best possible, both now and in the future, then codifying the > >> >> >> status quo is an excellent idea. > >> >> >> > >> >> >> So, we should probably first start by asking ourselves: > >> >> >> > >> >> >> * what numpy is doing well; > >> >> >> * what numpy could do better; > >> >> >> > >> >> >> and then ask, is there some way we could make it more likely we will > >> >> >> improve over time. > >> >> >> > >> >> >> [snip] > >> >> >> > >> >> >> > As the current debate shows it's possible to have a public > >> >> >> > discussion > >> >> >> > about > >> >> >> > the direction of the project without having to delegate providing > >> >> >> > a > >> >> >> > vision > >> >> >> > to a president. > >> >> >> > >> >> >> The idea of a president that I had in mind, was not someone who > >> >> >> makes > >> >> >> all decisions, but the person who holds themselves responsible for > >> >> >> the > >> >> >> performance of the project. If the project has a coherent vision > >> >> >> already, the president has no need to provide one, but it's the > >> >> >> president's job to worry about whether we have vision or not, and do > >> >> >> what they need to, to make sure we don't lose track of that. If > >> >> >> you > >> >> >> don't know it already, I highly recommend Jim Collins' work on > >> >> >> 'level > >> >> >> 5 leadership' [1] > >> >> > > >> >> > > >> >> > Still doesn't sound like the need for a president to me > >> >> > > >> >> > " the person who holds themselves responsible for the > >> >> > performance of the project" > >> >> > > >> >> > sounds more like the role of the "core" group (adding plural to > >> >> > persons) > >> >> > to > >> >> > me, and cannot be pushed of to an official president. > >> >> > >> >> Except that, in the past, having multiple people taking decisions has > >> >> led to the situation where no-one feels themselves accountable for the > >> >> result, hence this situation tends to lead to stagnation. > >> > > >> > > >> > Is there any evidence for this? > >> > >> Oh - dear - that's the key point, but I'm obviously not making it > >> clearly enough. Yes there is, and that was the evidence I was > >> pointing to before. > >> > >> But anyway - Sebastian is right - this discussion isn't going anywhere > >> useful. > >> > >> So - let's step back. 
> >> > >> In thinking about governance, we first need to ask what we want to > >> achieve. This includes considering the risks ahead for the project. > >> > >> So, in the spirit of fruitful discussion, can I ask what y'all > >> consider to be the current problems with working on numpy (other than > >> the technical ones). What is numpy doing well, and what is it doing > >> badly? What risks do we have to plan for in the future? > > > > > > Are you trying to prove the point that consensus doesn't work by making it > > impossible to reach a consensus on this? ;-) > > > > Forgive me if I use this joke to see if I can get us any further. > > If this was code, I think this joke would not be funny, because we > wouldn't expect to reach consensus without considering all the > options, and discussing their pros and cons. > > Why would that not be useful in the case of forms of governance? > Oh, it is true. I think we (those in the room in Austin) just have thought about it a bit already, so now we have to be a bit patient with everyone who just saw the plans the first time. But I hope we can agree that we should decide on some form of governance in the next few weeks, even if it may not be perfect. My personal problem with your ideas is not that I do not care for the warnings, but having already spend some time trying to put together this (and this is nothing weird, this is very common practice in open source), I personally do not want to spend time inventing something completely new. We must discuss improvements to the document, and even whole different approaches. But for me at least, I need something a little more specific. Maybe I am daft, but I hear "this is a bad idea" without also providing another approach (that seems doable). And I do not buy that it is *that* bad, it is a very common governance structure for open source. The presidency suggestions may be another approach and certainly something we can pick up ideas from, but to me it is so vague that I cannot even start comprehending what it would mean for the actual governance structure specifically for numpy (considering the size of the project, etc.). But by all means, I like proposals/learning from your ideas (i.e. maybe you can propose changes to the NEP sections), I personally would just like to see a bit more clearly where it goes. - Sebastian > One reason might be that the specific form of governance can have no > influence on the long-term health of the project. > > I am convinced that that is wrong - that the form of governance has a > large influence on the long-term health of a project. > > If there is some possibility that this is true, then it seems to me > that we would be foolish not to try and come to some reasoned choice > about the form of governance. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From jdmc80 at hotmail.com Fri Aug 28 14:02:53 2015 From: jdmc80 at hotmail.com (Joseph Codadeen) Date: Fri, 28 Aug 2015 18:02:53 +0000 Subject: [Numpy-discussion] Numpty FFT.FFT slow with certain samples In-Reply-To: References: Message-ID: Hi, I am a numpy newbie. I have two wav files, one that numpy takes a long time to process the FFT. 
They were created within Audacity using white noise and silence for gaps.

1. my_1_minute_noise_with_gaps.wav
2. my_1_minute_noise_with_gaps_truncated.wav

The files are very similar in the following way:

1. is white noise with silence gaps on every 15 second interval.
2. is 1. but slightly shorter, i.e. I trimmed some ms off the end but it still has the last gap at 60s.

The code I am using processes the file like this:

framerate, data = scipy.io.wavfile.read(filepath)
right = data[:, 0]
# Align it to be efficient.
if len(right) % 2 != 0:
    right = right[range(len(right) - 1)]
noframes = len(right)
fftout = np.fft.fft(right) / noframes  # <<< I am timing this cmd

Using timeit...

my_1_minute_noise_with_gaps_truncated took 30.75620985s to process.
my_1_minute_noise_with_gaps took 22307.13917s to process.

Could someone tell me why this behaviour is happening please? Sorry I can't attach the files as this email gets bounced but you could easily create the files yourself. E.g. my last gap width is 59.9995 - 1:00.0005, I repeat this every 15 seconds. My truncated file is 1:00.0015s long, non-truncated is 1:00.0833s long. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hodge at stsci.edu Fri Aug 28 14:28:49 2015 From: hodge at stsci.edu (Phil Hodge) Date: Fri, 28 Aug 2015 14:28:49 -0400 Subject: Re: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: References: Message-ID: <55E0A861.6020001@stsci.edu> On 08/28/2015 02:02 PM, Joseph Codadeen wrote: > > * my_1_minute_noise_with_gaps_truncated took *30.75620985s* to process. > * my_1_minute_noise_with_gaps took *22307.13917s* to process. > You didn't say how long those arrays were, but I can make a good guess that the truncated one had a length that could be factored into small prime numbers, while the non-truncated one had a length that was either prime or could only be factored into large primes. Phil From jaime.frio at gmail.com Fri Aug 28 14:46:36 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 28 Aug 2015 11:46:36 -0700 Subject: [Numpy-discussion] Numpty FFT.FFT slow with certain samples In-Reply-To: References: Message-ID: On Fri, Aug 28, 2015 at 11:02 AM, Joseph Codadeen wrote: > Hi, > > I am a numpy newbie. > > I have two wav files, one that numpy takes a long time to process the FFT. > They were created within Audacity using white noise and silence for gaps. > > > 1. my_1_minute_noise_with_gaps.wav > 2. my_1_minute_noise_with_gaps_truncated.wav > > > The files are very similar in the following way; > > > - 1. is white noise with silence gaps on every 15 second interval. > - 2. is 1. but slightly shorter, i.e. I trimmed some ms off the end > but it still has the last gap at 60s. > > > The code I am using processes the file like this; > > framerate, data = scipy.io.wavfile.read(filepath) > right = data[:, 0] > # Align it to be efficient. > if len(right) % 2 != 0: > right = right[range(len(right) - 1)] > noframes = len(right) > fftout = np.fft.fft(right) / noframes # <<< I am timing this cmd > > Using timeit... > > > - my_1_minute_noise_with_gaps_truncated took *30.75620985s* to process. > - my_1_minute_noise_with_gaps took *22307.13917s* to process. > > > Could someone tell me why this behaviour is happening please? > > Sorry I can't attach the files as this email gets bounced but you could > easily create the files yourself. > E.g my last gap width is 59.9995 - 1:00.0005, I repeat this every 15 > seconds.
> My truncated file is 1:00.0015s long, non-truncated is 1:00.0833s long > It is almost certainly caused by the number of samples in your signals, i.e. look at what noframes is in one case and the other. You will get best performance when noframes is a power of two, or has a factorization that includes many small integers (2, 3, 5, perhaps also 7), and the worst if the size is a prime number. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdmc80 at hotmail.com Fri Aug 28 14:51:47 2015 From: jdmc80 at hotmail.com (Joseph Codadeen) Date: Fri, 28 Aug 2015 18:51:47 +0000 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: <55E0A861.6020001@stsci.edu> References: , , <55E0A861.6020001@stsci.edu> Message-ID: my_1_minute_noise_with_gaps_truncated - Array len is 2646070my_1_minute_noise_with_gaps - Array len is 2649674 > Date: Fri, 28 Aug 2015 14:28:49 -0400 > From: hodge at stsci.edu > To: numpy-discussion at scipy.org > Subject: Re: [Numpy-discussion] Numpy FFT.FFT slow with certain samples > > On 08/28/2015 02:02 PM, Joseph Codadeen wrote: > > > > * my_1_minute_noise_with_gaps_truncated took***30.75620985s* to process. > > * my_1_minute_noise_with_gaps took *22307.13917s*to process. > > > > You didn't say how long those arrays were, but I can make a good guess > that the truncated one had a length that could be factored into small, > prime numbers, while the non-truncated one had a length that was either > prime or could only be factored into large primes. > > Phil > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Fri Aug 28 15:03:52 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Fri, 28 Aug 2015 12:03:52 -0700 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: References: <55E0A861.6020001@stsci.edu> Message-ID: <87y4gvi6av.fsf@berkeley.edu> On 2015-08-28 11:51:47, Joseph Codadeen wrote: > my_1_minute_noise_with_gaps_truncated - Array len is > 2646070my_1_minute_noise_with_gaps - Array len is 2649674 In [6]: from sympy import factorint In [7]: max(factorint(2646070)) Out[7]: 367 In [8]: max(factorint(2649674)) Out[8]: 1324837 Those numbers give you some indication of how long the FFT will take to compute. St?fan From jdmc80 at hotmail.com Fri Aug 28 15:13:55 2015 From: jdmc80 at hotmail.com (Joseph Codadeen) Date: Fri, 28 Aug 2015 19:13:55 +0000 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: <87y4gvi6av.fsf@berkeley.edu> References: , <55E0A861.6020001@stsci.edu>, , <87y4gvi6av.fsf@berkeley.edu> Message-ID: Great, thanks Stefan and everyone. 
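To make the factorization advice above concrete, here is a minimal sketch (illustrative only, not part of the original thread; the helper name and the example length are assumptions) of zero-padding a signal to the next power-of-two length before calling np.fft.fft, which avoids the slow large-prime-factor case:

import numpy as np

def next_power_of_two(n):
    # Smallest power of two >= n; highly composite lengths (powers of two
    # being the classic choice) are the fast case for FFT implementations.
    return 1 << (n - 1).bit_length()

# Hypothetical signal whose length has a large prime factor (the slow case
# discussed above): 10007 is prime, so 2 * 10007 = 20014 factors badly.
signal = np.random.randn(2 * 10007)
n = len(signal)

nfast = next_power_of_two(n)                    # 32768 for n = 20014
padded = np.concatenate([signal, np.zeros(nfast - n)])

spectrum = np.fft.fft(padded) / n               # transform at the fast length

# Caveat: zero-padding interpolates the spectrum onto a finer frequency grid,
# so its bins do not coincide with those of the unpadded length-n transform.
print(n, nfast, spectrum.shape)

Trimming a few samples down to a nearby fast length, which is effectively what the truncated file does, is the other common workaround when the exact bin spacing matters.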
> From: stefanv at berkeley.edu > To: numpy-discussion at scipy.org > Date: Fri, 28 Aug 2015 12:03:52 -0700 > Subject: Re: [Numpy-discussion] Numpy FFT.FFT slow with certain samples > > > On 2015-08-28 11:51:47, Joseph Codadeen > wrote: > > my_1_minute_noise_with_gaps_truncated - Array len is > > 2646070my_1_minute_noise_with_gaps - Array len is 2649674 > > In [6]: from sympy import factorint In [7]: > max(factorint(2646070)) Out[7]: 367 In [8]: > max(factorint(2649674)) Out[8]: 1324837 > > Those numbers give you some indication of how long the FFT will > take to compute. > > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Aug 28 16:36:36 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 28 Aug 2015 22:36:36 +0200 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: References: , <55E0A861.6020001@stsci.edu>, , <87y4gvi6av.fsf@berkeley.edu> Message-ID: <1440794196.22052.4.camel@sipsolutions.net> If you don't mind the extra dependency or licensing and this is an issue for you, you can try pyfftw (there are likely other similar projects) which wraps fftw and does not have this problem as far as I know. It exposes a numpy-like interface. - sebastian On Fr, 2015-08-28 at 19:13 +0000, Joseph Codadeen wrote: > Great, thanks Stefan and everyone. > > > From: stefanv at berkeley.edu > > To: numpy-discussion at scipy.org > > Date: Fri, 28 Aug 2015 12:03:52 -0700 > > Subject: Re: [Numpy-discussion] Numpy FFT.FFT slow with certain > samples > > > > > > On 2015-08-28 11:51:47, Joseph Codadeen > > wrote: > > > my_1_minute_noise_with_gaps_truncated - Array len is > > > 2646070my_1_minute_noise_with_gaps - Array len is 2649674 > > > > In [6]: from sympy import factorint In [7]: > > max(factorint(2646070)) Out[7]: 367 In [8]: > > max(factorint(2649674)) Out[8]: 1324837 > > > > Those numbers give you some indication of how long the FFT will > > take to compute. > > > > St?fan > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From efiring at hawaii.edu Fri Aug 28 17:31:10 2015 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 28 Aug 2015 11:31:10 -1000 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: <1440794196.22052.4.camel@sipsolutions.net> References: <55E0A861.6020001@stsci.edu> <87y4gvi6av.fsf@berkeley.edu> <1440794196.22052.4.camel@sipsolutions.net> Message-ID: <55E0D31E.2020201@hawaii.edu> On 2015/08/28 10:36 AM, Sebastian Berg wrote: > If you don't mind the extra dependency or licensing and this is an issue > for you, you can try pyfftw (there are likely other similar projects) > which wraps fftw and does not have this problem as far as I know. It > exposes a numpy-like interface. Sort of; that interface returns a function, not the result. 
fftw is still an fft algorithm, so it is still subject to a huge difference in run time depending on how the input array can be factored. Furthermore, it gets its speed by figuring out how to optimize a calculation for a given size of input array. That initial optimization can be very slow. The overall speed gain is realized only when one saves the result of that optimization, and applies it to many calculations on arrays of the same size. Eric > > - sebastian > > > On Fr, 2015-08-28 at 19:13 +0000, Joseph Codadeen wrote: >> Great, thanks Stefan and everyone. >> >>> From: stefanv at berkeley.edu >>> To: numpy-discussion at scipy.org >>> Date: Fri, 28 Aug 2015 12:03:52 -0700 >>> Subject: Re: [Numpy-discussion] Numpy FFT.FFT slow with certain >> samples >>> >>> >>> On 2015-08-28 11:51:47, Joseph Codadeen >>> wrote: >>>> my_1_minute_noise_with_gaps_truncated - Array len is >>>> 2646070my_1_minute_noise_with_gaps - Array len is 2649674 >>> >>> In [6]: from sympy import factorint In [7]: >>> max(factorint(2646070)) Out[7]: 367 In [8]: >>> max(factorint(2649674)) Out[8]: 1324837 >>> >>> Those numbers give you some indication of how long the FFT will >>> take to compute. >>> >>> St?fan >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jaime.frio at gmail.com Fri Aug 28 17:36:28 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 28 Aug 2015 14:36:28 -0700 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: <1440754805.11529.109.camel@sipsolutions.net> References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> <1440754805.11529.109.camel@sipsolutions.net> Message-ID: On Fri, Aug 28, 2015 at 2:40 AM, Sebastian Berg wrote: > On Fr, 2015-08-28 at 09:46 +0100, Matthew Brett wrote: > > Hi, > > > > On Fri, Aug 28, 2015 at 5:59 AM, Jaime Fern?ndez del R?o > > wrote: > > > On Thu, Aug 27, 2015 at 11:06 AM, Matthew Brett < > matthew.brett at gmail.com> > > > wrote: > > >> > > >> Hi, > > >> > > >> On Thu, Aug 27, 2015 at 6:23 PM, wrote: > > >> > > > >> > > > >> > On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett > > >> > > > >> > wrote: > > >> >> > > >> >> Hi > > >> >> > > >> >> On Thu, Aug 27, 2015 at 5:11 PM, wrote: > > >> >> > > > >> >> > > > >> >> > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett > > >> >> > > > >> >> > wrote: > > >> >> >> > > >> >> >> Hi, > > >> >> >> > > >> >> >> On Thu, Aug 27, 2015 at 3:34 PM, wrote: > > >> >> >> [snip] > > >> >> >> > I don't really see a problem with "codifying" the status quo. > > >> >> >> > > >> >> >> That's an excellent point. If we believe that the current > > >> >> >> situation > > >> >> >> is the best possible, both now and in the future, then > codifying the > > >> >> >> status quo is an excellent idea. 
> > >> >> >> > > >> >> >> So, we should probably first start by asking ourselves: > > >> >> >> > > >> >> >> * what numpy is doing well; > > >> >> >> * what numpy could do better; > > >> >> >> > > >> >> >> and then ask, is there some way we could make it more likely we > will > > >> >> >> improve over time. > > >> >> >> > > >> >> >> [snip] > > >> >> >> > > >> >> >> > As the current debate shows it's possible to have a public > > >> >> >> > discussion > > >> >> >> > about > > >> >> >> > the direction of the project without having to delegate > providing > > >> >> >> > a > > >> >> >> > vision > > >> >> >> > to a president. > > >> >> >> > > >> >> >> The idea of a president that I had in mind, was not someone who > > >> >> >> makes > > >> >> >> all decisions, but the person who holds themselves responsible > for > > >> >> >> the > > >> >> >> performance of the project. If the project has a coherent > vision > > >> >> >> already, the president has no need to provide one, but it's the > > >> >> >> president's job to worry about whether we have vision or not, > and do > > >> >> >> what they need to, to make sure we don't lose track of that. > If > > >> >> >> you > > >> >> >> don't know it already, I highly recommend Jim Collins' work on > > >> >> >> 'level > > >> >> >> 5 leadership' [1] > > >> >> > > > >> >> > > > >> >> > Still doesn't sound like the need for a president to me > > >> >> > > > >> >> > " the person who holds themselves responsible for the > > >> >> > performance of the project" > > >> >> > > > >> >> > sounds more like the role of the "core" group (adding plural to > > >> >> > persons) > > >> >> > to > > >> >> > me, and cannot be pushed of to an official president. > > >> >> > > >> >> Except that, in the past, having multiple people taking decisions > has > > >> >> led to the situation where no-one feels themselves accountable for > the > > >> >> result, hence this situation tends to lead to stagnation. > > >> > > > >> > > > >> > Is there any evidence for this? > > >> > > >> Oh - dear - that's the key point, but I'm obviously not making it > > >> clearly enough. Yes there is, and that was the evidence I was > > >> pointing to before. > > >> > > >> But anyway - Sebastian is right - this discussion isn't going anywhere > > >> useful. > > >> > > >> So - let's step back. > > >> > > >> In thinking about governance, we first need to ask what we want to > > >> achieve. This includes considering the risks ahead for the project. > > >> > > >> So, in the spirit of fruitful discussion, can I ask what y'all > > >> consider to be the current problems with working on numpy (other than > > >> the technical ones). What is numpy doing well, and what is it doing > > >> badly? What risks do we have to plan for in the future? > > > > > > > > > Are you trying to prove the point that consensus doesn't work by > making it > > > impossible to reach a consensus on this? ;-) > > > > > > > Forgive me if I use this joke to see if I can get us any further. > > > > If this was code, I think this joke would not be funny, because we > > wouldn't expect to reach consensus without considering all the > > options, and discussing their pros and cons. > > > > Why would that not be useful in the case of forms of governance? > > > > Oh, it is true. I think we (those in the room in Austin) just have > thought about it a bit already, so now we have to be a bit patient with > everyone who just saw the plans the first time. 
But I hope we can agree > that we should decide on some form of governance in the next few weeks, > even if it may not be perfect. > > My personal problem with your ideas is not that I do not care for the > warnings, but having already spend some time trying to put together this > (and this is nothing weird, this is very common practice in open > source), I personally do not want to spend time inventing something > completely new. > > We must discuss improvements to the document, and even whole different > approaches. But for me at least, I need something a little more > specific. Maybe I am daft, but I hear "this is a bad idea" without also > providing another approach (that seems doable). > And I do not buy that it is *that* bad, it is a very common governance > structure for open source. The presidency suggestions may be another > approach and certainly something we can pick up ideas from, but to me it > is so vague that I cannot even start comprehending what it would mean > for the actual governance structure specifically for numpy (considering > the size of the project, etc.). > > But by all means, I like proposals/learning from your ideas (i.e. maybe > you can propose changes to the NEP sections), I personally would just > like to see a bit more clearly where it goes. > Perhaps we could add a paragraph to the document, stating that we understand the risks and will keep an eye open for the dilution of responsibility and lack of direction and ownership that may come from consensus based decision making. And make it part of our governance model that we will review the model yearly, to identify and correct issues. That wouldn't require any substantial change right now, but wouldn't crystallize a potentially harmful organization either. Jaime P.S. At some point during the discussion in Austin, the idea going around was that the NUMFocus committee, which at the time was going to have three members only, would also be vested with ultimate decision power. Just imagine, we could have had a proper triumvirate: Chuck, Nathaniel and Ralf, wearing togas and feasting around a triclinium while they decided the fate of NumPy! -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From noel.pierre.andre at gmail.com Fri Aug 28 19:20:33 2015 From: noel.pierre.andre at gmail.com (Pierre-Andre Noel) Date: Fri, 28 Aug 2015 16:20:33 -0700 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: <87y4gvi6av.fsf@berkeley.edu> References: <55E0A861.6020001@stsci.edu> <87y4gvi6av.fsf@berkeley.edu> Message-ID: <55E0ECC1.8010705@gmail.com> If your sequence is not meant to be periodic (i.e., if after one minute there is no reason why the signal should start back at the beginning right away), then you should do zero-padding. And while you zero-pad, you can zero-pad to a sequence that is a power of two, thus preventing awkward factorizations. from numpy.fft import fft from numpy.random import rand from math import log, ceil seq_A = rand(2649674) seq_B = rand(2646070) fft_A = fft(seq_A) #Long fft_B = fft(seq_B) zeropadded_fft_A = fft(seq_A, n=2**(ceil(log(len(seq_A),2))+1)) zeropadded_fft_B = fft(seq_B, n=2**(ceil(log(len(seq_B),2))+1)) You could remove the "+1" above to get faster results, but then that may lead to unwanted frequencies (coming from the fact that fft assumes periodic signals, read online about zero-padding). Have a nice day, Pierre-Andr? 
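P.S. One small caveat with the snippet above: on Python 2, math.ceil returns a float, so the n passed to fft ends up as a float, and some NumPy versions will refuse a non-integer n. Casting is safer, e.g. (sketch):

n_A = 2 ** (int(ceil(log(len(seq_A), 2))) + 1)
zeropadded_fft_A = fft(seq_A, n=n_A)

or, staying in integer arithmetic, n_A = 2 ** ((len(seq_A) - 1).bit_length() + 1).
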
On 08/28/2015 12:03 PM, Stefan van der Walt wrote: > > On 2015-08-28 11:51:47, Joseph Codadeen wrote: >> my_1_minute_noise_with_gaps_truncated - Array len is >> 2646070my_1_minute_noise_with_gaps - Array len is 2649674 > > In [6]: from sympy import factorint In [7]: max(factorint(2646070)) > Out[7]: 367 In [8]: max(factorint(2649674)) Out[8]: 1324837 > Those numbers give you some indication of how long the FFT will take > to compute. > > St?fan > From stefanv at berkeley.edu Fri Aug 28 19:42:51 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Fri, 28 Aug 2015 16:42:51 -0700 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: <55E0ECC1.8010705@gmail.com> References: <55E0A861.6020001@stsci.edu> <87y4gvi6av.fsf@berkeley.edu> <55E0ECC1.8010705@gmail.com> Message-ID: <87r3mnhtdw.fsf@berkeley.edu> On 2015-08-28 16:20:33, Pierre-Andre Noel wrote: > If your sequence is not meant to be periodic (i.e., if after one > minute there is no reason why the signal should start back at > the beginning right away), then you should do zero-padding. And > while you zero-pad, you can zero-pad to a sequence that is a > power of two, thus preventing awkward factorizations. Zero-padding won't help with the non-periodicity, will it? For that you may want to window instead. St?fan From charlesr.harris at gmail.com Fri Aug 28 20:09:37 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 Aug 2015 18:09:37 -0600 Subject: [Numpy-discussion] Comments on governance proposal (was: Notes from the numpy dev meeting at scipy 2015) In-Reply-To: References: <87y4gxllb4.fsf@berkeley.edu> <1440673873.1694.33.camel@sipsolutions.net> <1440754805.11529.109.camel@sipsolutions.net> Message-ID: On Fri, Aug 28, 2015 at 3:36 PM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Fri, Aug 28, 2015 at 2:40 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> On Fr, 2015-08-28 at 09:46 +0100, Matthew Brett wrote: >> > Hi, >> > >> > On Fri, Aug 28, 2015 at 5:59 AM, Jaime Fern?ndez del R?o >> > wrote: >> > > On Thu, Aug 27, 2015 at 11:06 AM, Matthew Brett < >> matthew.brett at gmail.com> >> > > wrote: >> > >> >> > >> Hi, >> > >> >> > >> On Thu, Aug 27, 2015 at 6:23 PM, wrote: >> > >> > >> > >> > >> > >> > On Thu, Aug 27, 2015 at 12:22 PM, Matthew Brett >> > >> > >> > >> > wrote: >> > >> >> >> > >> >> Hi >> > >> >> >> > >> >> On Thu, Aug 27, 2015 at 5:11 PM, wrote: >> > >> >> > >> > >> >> > >> > >> >> > On Thu, Aug 27, 2015 at 11:04 AM, Matthew Brett >> > >> >> > >> > >> >> > wrote: >> > >> >> >> >> > >> >> >> Hi, >> > >> >> >> >> > >> >> >> On Thu, Aug 27, 2015 at 3:34 PM, >> wrote: >> > >> >> >> [snip] >> > >> >> >> > I don't really see a problem with "codifying" the status quo. >> > >> >> >> >> > >> >> >> That's an excellent point. If we believe that the current >> > >> >> >> situation >> > >> >> >> is the best possible, both now and in the future, then >> codifying the >> > >> >> >> status quo is an excellent idea. >> > >> >> >> >> > >> >> >> So, we should probably first start by asking ourselves: >> > >> >> >> >> > >> >> >> * what numpy is doing well; >> > >> >> >> * what numpy could do better; >> > >> >> >> >> > >> >> >> and then ask, is there some way we could make it more likely >> we will >> > >> >> >> improve over time. 
>> > >> >> >> >> > >> >> >> [snip] >> > >> >> >> >> > >> >> >> > As the current debate shows it's possible to have a public >> > >> >> >> > discussion >> > >> >> >> > about >> > >> >> >> > the direction of the project without having to delegate >> providing >> > >> >> >> > a >> > >> >> >> > vision >> > >> >> >> > to a president. >> > >> >> >> >> > >> >> >> The idea of a president that I had in mind, was not someone who >> > >> >> >> makes >> > >> >> >> all decisions, but the person who holds themselves responsible >> for >> > >> >> >> the >> > >> >> >> performance of the project. If the project has a coherent >> vision >> > >> >> >> already, the president has no need to provide one, but it's the >> > >> >> >> president's job to worry about whether we have vision or not, >> and do >> > >> >> >> what they need to, to make sure we don't lose track of that. >> If >> > >> >> >> you >> > >> >> >> don't know it already, I highly recommend Jim Collins' work on >> > >> >> >> 'level >> > >> >> >> 5 leadership' [1] >> > >> >> > >> > >> >> > >> > >> >> > Still doesn't sound like the need for a president to me >> > >> >> > >> > >> >> > " the person who holds themselves responsible for the >> > >> >> > performance of the project" >> > >> >> > >> > >> >> > sounds more like the role of the "core" group (adding plural to >> > >> >> > persons) >> > >> >> > to >> > >> >> > me, and cannot be pushed of to an official president. >> > >> >> >> > >> >> Except that, in the past, having multiple people taking decisions >> has >> > >> >> led to the situation where no-one feels themselves accountable >> for the >> > >> >> result, hence this situation tends to lead to stagnation. >> > >> > >> > >> > >> > >> > Is there any evidence for this? >> > >> >> > >> Oh - dear - that's the key point, but I'm obviously not making it >> > >> clearly enough. Yes there is, and that was the evidence I was >> > >> pointing to before. >> > >> >> > >> But anyway - Sebastian is right - this discussion isn't going >> anywhere >> > >> useful. >> > >> >> > >> So - let's step back. >> > >> >> > >> In thinking about governance, we first need to ask what we want to >> > >> achieve. This includes considering the risks ahead for the project. >> > >> >> > >> So, in the spirit of fruitful discussion, can I ask what y'all >> > >> consider to be the current problems with working on numpy (other than >> > >> the technical ones). What is numpy doing well, and what is it doing >> > >> badly? What risks do we have to plan for in the future? >> > > >> > > >> > > Are you trying to prove the point that consensus doesn't work by >> making it >> > > impossible to reach a consensus on this? ;-) >> > > >> > >> > Forgive me if I use this joke to see if I can get us any further. >> > >> > If this was code, I think this joke would not be funny, because we >> > wouldn't expect to reach consensus without considering all the >> > options, and discussing their pros and cons. >> > >> > Why would that not be useful in the case of forms of governance? >> > >> >> Oh, it is true. I think we (those in the room in Austin) just have >> thought about it a bit already, so now we have to be a bit patient with >> everyone who just saw the plans the first time. But I hope we can agree >> that we should decide on some form of governance in the next few weeks, >> even if it may not be perfect. 
>> >> My personal problem with your ideas is not that I do not care for the >> warnings, but having already spend some time trying to put together this >> (and this is nothing weird, this is very common practice in open >> source), I personally do not want to spend time inventing something >> completely new. >> >> We must discuss improvements to the document, and even whole different >> approaches. But for me at least, I need something a little more >> specific. Maybe I am daft, but I hear "this is a bad idea" without also >> providing another approach (that seems doable). >> And I do not buy that it is *that* bad, it is a very common governance >> structure for open source. The presidency suggestions may be another >> approach and certainly something we can pick up ideas from, but to me it >> is so vague that I cannot even start comprehending what it would mean >> for the actual governance structure specifically for numpy (considering >> the size of the project, etc.). >> >> But by all means, I like proposals/learning from your ideas (i.e. maybe >> you can propose changes to the NEP sections), I personally would just >> like to see a bit more clearly where it goes. >> > > Perhaps we could add a paragraph to the document, stating that we > understand the risks and will keep an eye open for the dilution of > responsibility and lack of direction and ownership that may come from > consensus based decision making. And make it part of our governance model > that we will review the model yearly, to identify and correct issues. That > wouldn't require any substantial change right now, but wouldn't crystallize > a potentially harmful organization either. > > Jaime > > P.S. At some point during the discussion in Austin, the idea going around > was that the NUMFocus committee, which at the time was going to have three > members only, would also be vested with ultimate decision power. Just > imagine, we could have had a proper triumvirate: Chuck, Nathaniel and Ralf, > wearing togas and feasting around a triclinium while they decided the fate > of NumPy! > The idea is appealing, but I don't think anyone should have to see me in a toga. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From noel.pierre.andre at gmail.com Fri Aug 28 20:12:25 2015 From: noel.pierre.andre at gmail.com (Pierre-Andre Noel) Date: Fri, 28 Aug 2015 17:12:25 -0700 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: <55E0ECC1.8010705@gmail.com> References: <55E0A861.6020001@stsci.edu> <87y4gvi6av.fsf@berkeley.edu> <55E0ECC1.8010705@gmail.com> Message-ID: <55E0F8E9.4020607@gmail.com> > Zero-padding won't help with the non-periodicity, will it? For > that you may want to window instead. Umh, it depends what you use the FFT for. You are right St?fan when saying that Joseph should probably also use a window to get rid of the high frequencies that will come from the sharp steps at the beginning and end of his signal. I had in mind the use of FFT to do convolutions ( https://en.wikipedia.org/wiki/Convolution_theorem ). If you do not zero-pad properly, then the end of the signal may "bleed" on the beginning, and vice versa. 
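A tiny sketch of what I mean, with made-up sizes (the padded length here is just len(a) + len(b) - 1; in practice you would round it up to a fast size and slice the result afterwards):

import numpy as np

a = np.random.rand(1000)
b = np.random.rand(200)

# no padding: this is a *circular* convolution, the tail wraps around
circular = np.fft.ifft(np.fft.fft(a, 1000) * np.fft.fft(b, 1000)).real

# padded to len(a) + len(b) - 1: matches ordinary linear convolution
n = len(a) + len(b) - 1
linear = np.fft.ifft(np.fft.fft(a, n) * np.fft.fft(b, n)).real
print(np.allclose(linear, np.convolve(a, b)))   # True
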
From stefanv at berkeley.edu Fri Aug 28 20:26:32 2015 From: stefanv at berkeley.edu (=?UTF-8?Q?St=C3=A9fan_van_der_Walt?=) Date: Fri, 28 Aug 2015 17:26:32 -0700 Subject: [Numpy-discussion] Numpy FFT.FFT slow with certain samples In-Reply-To: <55E0F8E9.4020607@gmail.com> References: <55E0A861.6020001@stsci.edu> <87y4gvi6av.fsf@berkeley.edu> <55E0ECC1.8010705@gmail.com> <55E0F8E9.4020607@gmail.com> Message-ID: On Aug 28, 2015 5:17 PM, "Pierre-Andre Noel" wrote: > > I had in mind the use of FFT to do convolutions ( > https://en.wikipedia.org/wiki/Convolution_theorem ). If you do not > zero-pad properly, then the end of the signal may "bleed" on the > beginning, and vice versa. Ah, gotcha! All these things should also be handled nicely in scipy.signal.fftconvolve. St?fan -------------- next part -------------- An HTML attachment was scrubbed... URL: From pelson.pub at gmail.com Sat Aug 29 03:55:41 2015 From: pelson.pub at gmail.com (Phil Elson) Date: Sat, 29 Aug 2015 08:55:41 +0100 Subject: [Numpy-discussion] Numpy helper function for __getitem__? In-Reply-To: References: <1440353282711.d9fa3274@Nodemailer> <1440404602.2051.14.camel@sipsolutions.net> Message-ID: Biggus also has such a function: https://github.com/SciTools/biggus/blob/master/biggus/__init__.py#L2878 It handles newaxis outside of that function in: https://github.com/SciTools/biggus/blob/master/biggus/__init__.py#L537. Again, it only aims to deal with orthogonal array indexing, not numpy fancy indexing. I'd be surprised if Dask.array didn't have a similar function too. HTH On 26 August 2015 at 18:59, Stephan Hoyer wrote: > Indeed, the helper function I wrote for xray was not designed to handle > None/np.newaxis or non-1d Boolean indexers, because those are not valid > indexers for xray objects. I think it could be straightforwardly extended > to handle None simply by not counting them towards the total number of > dimensions. > > On Tue, Aug 25, 2015 at 8:41 AM, Fabien wrote: > >> I think that Stephan's function for xray is very useful. A possible >> improvement (probably at a certain performance cost) would be to be able >> to provide a shape instead of a number of dimensions. The output would >> then be slices with valid start and ends. >> >> Current behavior: >> In[9]: expanded_indexer(slice(None), 2) >> Out[9]: (slice(None, None, None), slice(None, None, None)) >> >> With shape: >> In[9]: expanded_indexer(slice(None), (3, 4)) >> Out[9]: (slice(0, 4, 1), slice(0, 5, 1)) >> >> But if nobody needed something like this before me, I think that I might >> have a design problem in my code (still quite new to python). >> > > Glad you found it helpful! > > Python's slice object has the indices method which implements this logic, > e.g., > > In [15]: s = slice(None, 10) > > In [16]: s.indices(100) > Out[16]: (0, 10, 1) > > Cheers, > Stephan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Aug 30 17:44:39 2015 From: cournape at gmail.com (David Cournapeau) Date: Sun, 30 Aug 2015 22:44:39 +0100 Subject: [Numpy-discussion] Cythonizing some of NumPy Message-ID: Hi there, Reading Nathaniel summary from the numpy dev meeting, it looks like there is a consensus on using cython in numpy for the Python-C interfaces. 
This has been on my radar for a long time: that was one of my rationale for splitting multiarray into multiple "independent" .c files half a decade ago. I took the opportunity of EuroScipy sprints to look back into this, but before looking more into it, I'd like to make sure I am not going astray: 1. The transition has to be gradual 2. The obvious way I can think of allowing cython in multiarray is modifying multiarray such as cython "owns" the PyMODINIT_FUNC and the module PyModuleDef table. 3. We start using cython for the parts that are mostly menial refcount work. Things like functions in calculation.c are obvious candidates. Step 2 should not be disruptive, and does not look like a lot of work: there are < 60 methods in the table, and most of them should be fairly straightforward to cythonize. At worse, we could just keep them as is outside cython and just "export" them in cython. Does that sound like an acceptable plan ? If so, I will start working on a PR to work on 2. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Mon Aug 31 00:09:20 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 30 Aug 2015 21:09:20 -0700 Subject: [Numpy-discussion] np.sign and object comparisons Message-ID: There's been some work going on recently on Py2 vs Py3 object comparisons. If you want all the background, see gh-6265 and follow the links there. There is a half baked PR in the works, gh-6269 , that tries to unify behavior and fix some bugs along the way, by replacing all 2.x uses of PyObject_Compare with several calls to PyObject_RichCompareBool, which is available on 2.6, the oldest Python version we support. The poster child for this example is computing np.sign on an object array that has an np.nan entry. 2.x will just make up an answer for us: >>> cmp(np.nan, 0) -1 even though none of the relevant compares succeeds: >>> np.nan < 0 False >>> np.nan > 0 False >>> np.nan == 0 False The current 3.x is buggy, so the fact that it produces the same made up result as in 2.x is accidental: >>> np.sign(np.array([np.nan], 'O')) array([-1], dtype=object) Looking at the code, it seems that the original intention was for the answer to be `0`, which is equally made up but perhaps makes a little more sense. There are three ways of fixing this that I see: 1. Arbitrarily choose a value to set the return to. This is equivalent to choosing a default return for `cmp` for comparisons. This preserves behavior, but feels wrong. 2. Similarly to how np.sign of a floating point array with nans returns nan for those values, return e,g, None for these cases. This is my preferred option. 3. Raise an error, along the lines of the TypeError: unorderable types that 3.x produces for some comparisons. Thoughts anyone? Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Mon Aug 31 00:12:46 2015 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Mon, 31 Aug 2015 00:12:46 -0400 Subject: [Numpy-discussion] Notes from the numpy dev meeting at scipy 2015 In-Reply-To: References: Message-ID: Hi Nathaniel, others, I read the discussion of plans with interest. 
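(On the np.sign question above: in pure Python terms, option 2 would look roughly like the sketch below -- only rich comparisons, and None when none of them succeeds, as happens for nan. This is just an illustration, not the actual ufunc inner loop, and objects that raise on comparison, like unorderable types on 3.x, would still raise, which is option 3.)

def object_sign(x):
    # mirrors PyObject_RichCompareBool(x, 0, Py_LT / Py_GT / Py_EQ)
    if x < 0:
        return -1
    if x > 0:
        return 1
    if x == 0:
        return 0
    return None            # e.g. float('nan'): all three comparisons are False

print(object_sign(-3))            # -1
print(object_sign(float('nan')))  # None
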
One item that struck me is that while there are great plans to have a proper extensible and presumably subclassable dtype, it is discouraged to subclass ndarray itself (rather, it is encouraged to use a broader array interface). From my experience with astropy in both Quantity (an ndarray subclass), Time (a separate class containing high precision times using two ndarray float64), and Table (initially holding structured arrays, but now sets of Columns, which themselves are ndarray subclasses), I'm not convinced the broader, new containers approach is that much preferable. Rather, it leads to a lot of boiler-plate code to reimplement things ndarray does already (since one is effectively just calling the methods on the underlying arrays). I also think the idea that a dtype becomes something that also contains a unit is a bit odd. Shouldn't dtype just be about how data is stored? Why include meta-data such as units? Instead, I think a quantity is most logically seen as numbers with a unit, just like masked arrays are numbers with masks, and variables numbers with uncertainties. Each of these cases adds extra information in different forms, and all are quite easily thought of as subclasses of ndarray where all operations do the normal operation, plus some extra work to keep the extra information up to date. Anyway, my suggestion would be to *encourage* rather than discourage ndarray subclassing, and help this by making ndarray (even) better. All the best, Marten On Thu, Aug 27, 2015 at 11:03 AM, wrote: > > > On Wed, Aug 26, 2015 at 10:06 AM, Travis Oliphant > wrote: > >> >> >> On Wed, Aug 26, 2015 at 1:41 AM, Nathaniel Smith wrote: >> >>> Hi Travis, >>> >>> Thanks for taking the time to write up your thoughts! >>> >>> I have many thoughts in return, but I will try to restrict myself to two >>> main ones :-). >>> >>> 1) On the question of whether work should be directed towards improving >>> NumPy-as-it-is or instead towards a compatibility-breaking replacement: >>> There's plenty of room for debate about whether it's better engineering >>> practice to try and evolve an existing system in place versus starting >>> over, and I guess we have some fundamental disagreements there, but I >>> actually think this debate is a distraction -- we can agree to disagree, >>> because in fact we have to try both. >>> >> >> Yes, on this we agree. I think NumPy can improve *and* we can have new >> innovative array objects. I don't disagree about that. >> >> >>> >>> At a practical level: NumPy *is* going to continue to evolve, because it >>> has users and people interested in evolving it; similarly, dynd and other >>> alternatives libraries will also continue to evolve, because they also have >>> people interested in doing it. And at a normative level, this is a good >>> thing! If NumPy and dynd both get better, than that's awesome: the worst >>> case is that NumPy adds the new features that we talked about at the >>> meeting, and dynd simultaneously becomes so awesome that everyone wants to >>> switch to it, and the result of this would be... that those NumPy features >>> are exactly the ones that will make the transition to dynd easier. Or if >>> some part of that plan goes wrong, then well, NumPy will still be there as >>> a fallback, and in the mean time we've actually fixed the major pain points >>> our users are begging us to fix. 
>>> >>> You seem to be urging us all to make a double-or-nothing wager that your >>> extremely ambitious plans will all work out, with the entire numerical >>> Python ecosystem as the stakes. I think this ambition is awesome, but maybe >>> it'd be wise to hedge our bets a bit? >>> >> >> You are mis-characterizing my view. I think NumPy can evolve (though I >> would personally rather see a bigger change to the underlying system like I >> outlined before). But, I don't believe it can even evolve easily in the >> direction needed without breaking ABI and that insisting on not breaking it >> or even putting too much effort into not breaking it will continue to >> create less-optimal solutions that are harder to maintain and do not take >> advantage of knowledge this community now has. >> >> I'm also very concerned that 'evolving' NumPy will create a situation >> where there are regular semantic and subtle API changes that will cause >> NumPy to be less stable for it's user-base. I've watched this happen. >> This at a time that people are already looking around for new and different >> approaches anyway. >> >> >>> >>> 2) You really emphasize this idea of an ABI-breaking (but not >>> API-breaking) release, and I think this must indicate some basic gap in how >>> we're looking at things. Where I'm getting stuck here is that... I actually >>> can't think of anything important that we can't do now, but could if we >>> were allowed to break ABI compatibility. The kinds of things that break ABI >>> but keep API are like... rearranging what order the fields in a struct fall >>> in, or changing the numeric value of opaque constants like >>> NPY_ARRAY_WRITEABLE. The biggest win I can think of is that we could save a >>> few bytes per array by arranging the fields inside the ndarray struct more >>> optimally, but that's hardly a feature to hang a 2.0 on. You seem to have a >>> vision of this ABI-breaking release as being something very different from >>> that, and I'm not clear on what this vision is. >>> >>> >> We already broke the ABI with date-time changes --- it's still broken for >> a certain percentage of users last I checked. So, part of my >> disagreement is that we've tried this and it didn't work --- even though >> smart people thought it would. I've had to deal with this personally and >> I'm not enthusiastic about having to deal with this for the next 5 years >> because of even more attempts to make changes while not breaking the ABI. >> I think the group is more careful now --- but I still think the API is >> broad enough and uses of NumPy deep enough that the effort involved in >> trying not to break the ABI is just not worth the effort (because it's a >> non-feature today). Adding new dtypes without breaking the ABI is tricky >> (and to do it without breaking the ABI is ugly). I also continue to >> believe that putting out a new ABI-breaking NumPy will allow re-compiling >> *once* (with some porting changes needed) and not subtle breakages >> requiring code-changes every time a release is made. If subtle changes >> aren't made, then the new features won't come. Right now, I'd rather have >> stability from NumPy than new features. New features can come from other >> libraries. >> >> One specific change that could easily be made in NumPy 2.0 (the current >> code but with an ABI change) is that Dtypes should become true type objects >> and array-scalars (which are the current type-objects) should become >> instances of those dtypes. 
That is the biggest clean-up needed, I think on >> the array-front. There should not be *both* array-scalars and dtype >> objects. They are the same thing fundamentally. It was a mistake to >> have both of them. I don't see how to make that change without breaking >> the ABI. Perhaps it could be done in a creative way --- but why put the >> effort into that and end up with an even more hacky code-base. >> >> NumPy's ABI was influenced by and evolved from Numeric and Numarray. It >> was not "designed" to last 30 years. >> >> I think the dtype "types" should potentially have different >> member-structures. The ufunc sub-system needs an overhaul --- it's >> member structures need upgrades. With generalized ufuncs and the >> iteration protocols of Mark Wiebe we know a whole lot more about ufuncs >> now. Ufuncs are the same 1995 structure that Jim Hugunin wrote. I >> suppose you *could* just tack new functions on the end of structure and >> keep growing the list (while leaving old, unused structures as unused or >> deprecated) --- or you can take the opportunity to tidy up a bit. The >> longer you leave everything the same, the harder you make the code-base and >> the more costly maintenance becomes. I just don't see the value there >> --- and I see a lot of pain. >> >> Regarding the ufunc subsystem. We've argued before about the lack of >> mulit-methods in NumPy. Continuing to add dunder-methods to try and get >> around it will continue to make the system harder to maintain and more >> brittle. >> >> You mention making NumPy an interface to multiple things along with many >> other ideas. I don't believe you can get there without real changes that >> break things (at the very least semantic changes). I'm not excited about >> those changes causing instability (which they will cause ---- to me the >> burden of proof that they won't is on you who wants to make the change and >> not on me to say how they will). I also think it will take much >> longer to get there incrementally (if at all) than just creating something >> on top of newer ideas. >> >> >> >>> The main reason I personally am against having a big ABI-breaking >>> release is not that I hate ABI breakage a priori, it's that all the big >>> features that I care about and the are users are asking for seem to be ones >>> that... don't actually require doing that. At most they seem to get a mild >>> benefit from breaking some obscure corner cases. So the cost/benefits don't >>> make any sense to me. >>> >>> So: can you give a concrete example of a change you have in mind where >>> breaking ABI would be the key enabler? >>> >>> (I guess you might also be thinking of a separate issue that you sort of >>> allude to: Perhaps we will try to make changes which we think don't involve >>> breaking the ABI, but discover too late that we have failed to fully >>> understand the implications and have broken it by mistake. IIUC this is >>> what happened in the 1.4 timeframe when datetime64 was merged and >>> accidentally renumbered some of the NPY_* constants. >>> >> >> Yes, this is what I'm mainly worried about. But, more than that, I'm >> concerned about general *semantic* and API changes at a rapid pace for a >> community that is just looking for stability and bug-fixes from NumPy >> itself --- with innovation happening elsewhere. 
>> >> >>> Partially I am less worried about this because I have a fair amount of >>> confidence that our review and QA process has improved these days to the >>> point that we would not let a change like that slip through by accident -- >>> we have a lot more active reviewers, people are sensitized to the issues, >>> we've successfully landed intrusive changes like Sebastian's indexing >>> rewrite, ... though this is very much second-hand impressions on my part, >>> and I'd welcome input from folks like Chuck who have a clearer view on how >>> things have changed from then to now. >>> >>> But more importantly, even if this is true, then I can't see how your >>> proposal helps. If we aren't good enough at our jobs to predict when we'll >>> break ABI, then by assumption it makes no sense to pick one release and >>> decide that this is the one time that we'll break ABI.) >>> >> >> I don't understand your point. Picking a release to break the ABI >> allows you to actually do things like change macros to functions and move >> structures around to be more consistent with a new design that is easier to >> maintain and allows more growth. It has nothing to do with "whether you >> are good at your job". Everyone has strengths and weaknesses. >> >> This kind of clean-up may be needed regularly --- every 3 years would not >> be a crazy pattern, but it could also be every 5 years if you wanted more >> discipline. I already knew we needed to break the ABI "soonish" when I >> released NumPy 1.0. The fact that we haven't officially done it yet (but >> have done it unofficially) is a great injustice to "what could be" and has >> slowed development of NumPy tremendously. >> >> We've gone back and forth on this. I'm fine if we disagree, but I just >> hope the disagreement doesn't lead to lack of cooperation as we both have >> the same ultimate interests in seeing array-computing in Python improve. >> I just don't support *major* changes without breaking the ABI without a >> whole lot of proof that it is possible (without hackiness). You have >> mentioned on your roadmap a lot of what I would consider *major* changes. >> Some of it you describe how to get there. The most important change >> (improving the dtype system) you don't. >> >> Part of my point is that we now *know* how to improve the dtype system. >> Let's do it. Let's not try "yet again" to do it differently inside an old >> system designed by a scientist who didn't understand type-theory or type >> systems (that was me by the way). Look at data-shape in the blaze >> project. Take that and build a Python type-system that also outputs >> struct-string syntax for memory-views. That's the data-description system >> that NumPy should be using --- not trying to hack on a mixed array-scalar, >> dtype-object system that may never support everything we now know is >> needed. >> >> Trying to incrementing from where we are now will only lead to a >> sub-optimal outcome and unfortunate instability when we already know what >> to do differently. I doubt I will convince you --- certainly not via >> email. I apologize in advance that I likely won't be able to respond in >> depth to any more questions that are really just "prove to me that I can't" >> kind of questions. Of course I can't prove that. All I'm saying is that >> to me the evidence and my experience leads me to not be able to support >> major changes like you have proposed without also intentionally breaking >> the ABI (and thus calling it NumPy 2.0). 
>> >> If I find time to write, I will try to use it to outline more >> specifically what I think is a better approach to array- and >> table-computing in Python that keeps the stability of NumPy and adds new >> features using different approaches. >> >> -Travis >> >> > > From my perspective the incremental evolutionary approach in numpy (and > scipy) in the last few years has worked quite well, and I'm optimistic that > it will work in future if the developers can pull it off. > > The main changes that I remember that needed adjustment in scipy (as > observer) or statsmodels (as maintainer) came from becoming more strict in > several cases. This mainly affects corner cases or cases where the > downstream code wasn't "clean". Some API breaking (with deprecation) and > some semantic changes are still needed independent of any big changes that > may or may not be arriving anytime soon. > > This way we get improvements in a core library with the requirement that > every once in a while we need to adjust our code. (And with the occasional > unintended side effect where test coverage is not enough.) > The advantage is that we are getting the improvements with the regular > release cycles, and they keep numpy alive and competitive for another 10 > years or more. In the meantime, other packages like pandas can cater and > expand to other use cases, or other packages can develop generic arrays and > out of core and distributed arrays. > > I'm partially following some of the Julia mailing lists. Starting > something from scratch is a lot of work, and my guess is that similar > approaches in python will take some time to become mainstream. In the > meantime we can build something on an improving numpy. > > --- > The only thing I'm not so happy about in the last years is the > proliferation of object arrays, both in numpy code and in pandas. And I > hope that the (dtype) proposals help to get rid of some of those object > arrays. > > > Josef > > >> >> >> >> >>> >>> On Tue, Aug 25, 2015 at 12:00 PM, Travis Oliphant >>> wrote: >>> >>>> Thanks for the write-up Nathaniel. There is a lot of great detail and >>>> interesting ideas here. >>>> >>>> I've am very eager to understand how to help NumPy and the wider >>>> community move forward however I can (my passions on this have not changed >>>> since 1999, though what I myself spend time on has changed). >>>> >>>> There are a lot of ways to think about approaching this, though. It's >>>> hard to get all the ideas on the table, and it was unfortunate we couldn't >>>> get everybody wyho are core NumPy devs together in person to have this >>>> discussion as there are still a lot of questions unanswered and a lot of >>>> thought that has gone into other approaches that was not brought up or >>>> represented in the meeting (how does Numba fit into this, what about >>>> data-shape, dynd, memory-views and Python type system, etc.). If NumPy >>>> becomes just an interface-specification, then why don't we just do that >>>> *outside* NumPy itself in a way that doesn't jeopardize the stability of >>>> NumPy today. These are some of the real questions I have. I will try >>>> to write up my thoughts in more depth soon, but I won't be able to respond >>>> in-depth right now. I just wanted to comment because Nathaniel said I >>>> disagree which is only partly true. 
>>>> >>>> The three most important things for me are 1) let's make sure we have >>>> representation from as wide of the community as possible (this is really >>>> hard), 2) let's look around at the broader community and the prior art that >>>> is happening in this space right now and 3) let's not pretend we are going >>>> to be able to make all this happen without breaking ABI compatibility. >>>> Let's just break ABI compatibility with NumPy 2.0 *and* have as much >>>> fidelity with the API and semantics of current NumPy as possible (though >>>> there will be some changes necessary long-term). >>>> >>>> I don't think we should intentionally break ABI if we can avoid it, but >>>> I also don't think we should spend in-ordinate amounts of time trying to >>>> pretend that we won't break ABI (for at least some people), and most >>>> importantly we should not pretend *not* to break the ABI when we actually >>>> do. We did this once before with the roll-out of date-time, and it was >>>> really un-necessary. When I released NumPy 1.0, there were several >>>> things that I knew should be fixed very soon (NumPy was never designed to >>>> not break ABI). Those problems are still there. Now, that we have >>>> quite a bit better understanding of what NumPy *should* be (there have been >>>> tremendous strides in understanding and community size over the past 10 >>>> years), let's actually make the infrastructure we think will last for the >>>> next 20 years (instead of trying to shoe-horn new ideas into a 20-year old >>>> code-base that wasn't designed for it). >>>> >>>> NumPy is a hard code-base. It has been since Numeric days in 1995. >>>> I could be wrong, but my guess is that we will be passed by as a community >>>> if we don't seize the opportunity to build something better than we can >>>> build if we are forced to use a 20 year old code-base. >>>> >>>> It is more important to not break people's code and to be clear when a >>>> re-compile is necessary for dependencies. Those to me are the most >>>> important constraints. There are a lot of great ideas that we all have >>>> about what we want NumPy to be able to do. Some of this are pretty >>>> transformational (and the more exciting they are, the harder I think they >>>> are going to be to implement without breaking at least the ABI). There >>>> is probably some CAP-like theorem around >>>> Stability-Features-Speed-of-Development (pick 2) when it comes to Open >>>> Source Software development and making feature-progress with NumPy *is >>>> going* to create in-stability which concerns me. >>>> >>>> I would like to see a little-bit-of-pain one time with a NumPy 2.0, >>>> rather than a constant pain because of constant churn over many years >>>> approach that Nathaniel seems to advocate. To me NumPy 2.0 is an >>>> ABI-breaking release that is as API-compatible as possible and whose >>>> semantics are not dramatically different. >>>> >>>> There are at least 3 areas of compatibility (ABI, API, and semantic). >>>> ABI-compatibility is a non-feature in today's world. There are so many >>>> distributions of the NumPy stack (and conda makes it trivial for anyone to >>>> build their own or for you to build one yourself). Making less-optimal >>>> software-engineering choices because of fear of breaking the ABI is not >>>> something I'm supportive of at all. We should not break ABI every >>>> release, but a release every 3 years that breaks ABI is not a problem. 
>>>> >>>> API compatibility should be much more sacrosanct, but it is also >>>> something that can also be managed. Any NumPy 2.0 should definitely >>>> support the full NumPy API (though there could be deprecated swaths). I >>>> think the community has done well in using deprecation and limiting the >>>> public API to make this more manageable and I would love to see a NumPy 2.0 >>>> that solidifies a future-oriented API along with a back-ward compatible API >>>> that is also available. >>>> >>>> Semantic compatibility is the hardest. We have already broken this on >>>> multiple occasions throughout the 1.x NumPy releases. Every time you >>>> change the code, this can change. This is what I fear causing deep >>>> instability over the course of many years. These are things like the >>>> casting rule details, the effect of indexing changes, any change to the >>>> calculations approaches. It is and has been the most at risk during any >>>> code-changes. My view is that a NumPy 2.0 (with a new low-level >>>> architecture) minimizes these changes to a single release rather than >>>> unavoidably spreading them out over many, many releases. >>>> >>>> I think that summarizes my main concerns. I will write-up more forward >>>> thinking ideas for what else is possible in the coming weeks. In the mean >>>> time, thanks for keeping the discussion going. It is extremely exciting to >>>> see the help people have continued to provide to maintain and improve >>>> NumPy. It will be exciting to see what the next few years bring as well. >>>> >>>> >>>> Best, >>>> >>>> -Travis >>>> >>>> >>>> >>>> >>>> >>>> >>>> On Tue, Aug 25, 2015 at 5:03 AM, Nathaniel Smith wrote: >>>> >>>>> Hi all, >>>>> >>>>> These are the notes from the NumPy dev meeting held July 7, 2015, at >>>>> the SciPy conference in Austin, presented here so the list can keep up >>>>> with what happens, and so you can give feedback. Please do give >>>>> feedback, none of this is final! >>>>> >>>>> (Also, if anyone who was there notices anything I left out or >>>>> mischaracterized, please speak up -- these are a lot of notes I'm >>>>> trying to gather together, so I could easily have missed something!) >>>>> >>>>> Thanks to Jill Cowan and the rest of the SciPy organizers for donating >>>>> space and organizing logistics for us, and to the Berkeley Institute >>>>> for Data Science for funding travel for Jaime, Nathaniel, and >>>>> Sebastian. >>>>> >>>>> >>>>> Attendees >>>>> ========= >>>>> >>>>> Present in the room for all or part: Daniel Allan, Chris Barker, >>>>> Sebastian Berg, Thomas Caswell, Jeff Reback, Jaime Fern?ndez del >>>>> R?o, Chuck Harris, Nathaniel Smith, St?fan van der Walt. (Note: I'm >>>>> pretty sure this list is incomplete) >>>>> >>>>> Joining remotely for all or part: Stephan Hoyer, Julian Taylor. >>>>> >>>>> >>>>> Formalizing our governance/decision making >>>>> ========================================== >>>>> >>>>> This was a major focus of discussion. At a high level, the consensus >>>>> was to steal IPython's governance document ("IPEP 29") and modify it >>>>> to remove its use of a BDFL as a "backstop" to normal community >>>>> consensus-based decision, and replace it with a new "backstop" based >>>>> on Apache-project-style consensus voting amongst the core team. >>>>> >>>>> I'll send out a proper draft of this shortly for further discussion. 
>>>>> >>>>> >>>>> Development roadmap >>>>> =================== >>>>> >>>>> General consensus: >>>>> >>>>> Let's assume NumPy is going to remain important indefinitely, and >>>>> try to make it better, instead of waiting for something better to >>>>> come along. (This is unlikely to be wasted effort even if something >>>>> better does come along, and it's hardly a sure thing that that will >>>>> happen anyway.) >>>>> >>>>> Let's focus on evolving numpy as far as we can without major >>>>> break-the-world changes (no "numpy 2.0", at least in the foreseeable >>>>> future). >>>>> >>>>> And, as a target for that evolution, let's change our focus from >>>>> numpy as "NumPy is the library that gives you the np.ndarray object >>>>> (plus some attached infrastructure)", to "NumPy provides the >>>>> standard framework for working with arrays and array-like objects in >>>>> Python" >>>>> >>>>> This means, creating defined interfaces between array-like objects / >>>>> ufunc objects / dtype objects, so that it becomes possible for third >>>>> parties to add their own and mix-and-match. Right now ufuncs are >>>>> pretty good at this, but if you want a new array class or dtype then >>>>> in most cases you pretty much have to modify numpy itself. >>>>> >>>>> Vision: instead of everyone who wants a new container type having to >>>>> reimplement all of numpy, Alice can implement an array class using >>>>> (sparse / distributed / compressed / tiled / gpu / out-of-core / >>>>> delayed / ...) storage, pass it to code that was written using >>>>> direct calls to np.* functions, and it just works. (Instead of >>>>> np.sin being "the way you calculate the sine of an ndarray", it's >>>>> "the way you calculate the sine of any array-like container >>>>> object".) >>>>> >>>>> Vision: Darryl can implement a new dtype for (categorical data / >>>>> astronomical dates / integers-with-missing-values / ...) without >>>>> having to touch the numpy core. >>>>> >>>>> Vision: Chandni can then come along and combine them by doing >>>>> >>>>> a = alice_array([...], dtype=darryl_dtype) >>>>> >>>>> and it just works. >>>>> >>>>> Vision: no-one is tempted to subclass ndarray, because anything you >>>>> can do with an ndarray subclass you can also easily do by defining >>>>> your own new class that implements the "array protocol". >>>>> >>>>> >>>>> Supporting third-party array types >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>> >>>>> Sub-goals: >>>>> - Get __numpy_ufunc__ done, which will cover a good chunk of numpy's >>>>> API right there. >>>>> - Go through the rest of the stuff in numpy, and figure out some >>>>> story for how to let it handle third-party array classes: >>>>> - ufunc ALL the things: Some things can be converted directly into >>>>> (g)ufuncs and then use __numpy_ufunc__ (e.g., np.std); some >>>>> things could be converted into (g)ufuncs if we extended the >>>>> (g)ufunc interface a bit (e.g. np.sort, np.matmul). >>>>> - Some things probably need their own __numpy_ufunc__-like >>>>> extensions (__numpy_concatenate__?) >>>>> - Provide tools to make it easier to implement the more complicated >>>>> parts of an array object (e.g. the bazillion different methods, >>>>> many of which are ufuncs in disguise, or indexing) >>>>> - Longer-run interesting research project: __numpy_ufunc__ requires >>>>> that one or the other object have explicit knowledge of how to >>>>> handle the other, so to handle binary ufuncs with N array types >>>>> you need something like N**2 __numpy_ufunc__ code paths. 
As an >>>>> alternative, if there were some interface that an object could >>>>> export that provided the operations nditer needs to efficiently >>>>> iterate over (chunks of) it, then you would only need N >>>>> implementations of this interface to handle all N**2 operations. >>>>> >>>>> This would solve a lot of problems for projects like: >>>>> - blosc >>>>> - dask >>>>> - distarray >>>>> - numpy.ma >>>>> - pandas >>>>> - scipy.sparse >>>>> - xray >>>>> >>>>> >>>>> Supporting third-party dtypes >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>> >>>>> We already have something like a C level "dtype >>>>> protocol". Conceptually, the way you define a new dtype is by >>>>> defining a new class whose instances have data attributes defining >>>>> the parameters of the dtype (what fields are in *this* record dtype, >>>>> how many characters are in *this* string dtype, what units are used >>>>> for *this* datetime64, etc.), and you define a bunch of methods to >>>>> do things like convert an object from a Python object to your dtype >>>>> or vice-versa, to copy an array of your dtype from one place to >>>>> another, to cast to and from your new dtype, etc. This part is >>>>> great. >>>>> >>>>> The problem is, in the current implementation, we don't actually use >>>>> the Python object system to define these classes / attributes / >>>>> methods. Instead, all possible dtypes are jammed into a single >>>>> Python-level class, whose struct has fields for the union of all >>>>> possible dtype's attributes, and instead of Python-style method >>>>> slots there's just a big table of function pointers attached to each >>>>> object. >>>>> >>>>> So the main proposal is that we keep the basic design, but switch it >>>>> so that the float64 dtype, the int64 dtype, etc. actually literally >>>>> are subclasses of np.dtype, each implementing their own fields and >>>>> Python-style methods. >>>>> >>>>> Some of the pieces involved in doing this: >>>>> >>>>> - The current dtype methods should be cleaned up -- e.g. 'dot' and >>>>> 'less_than' are both dtype methods, when conceptually they're much >>>>> more like ufuncs. >>>>> >>>>> - The ufunc inner-loop interface currently does not get a reference >>>>> to the dtype object, so they can't see its attributes and this is >>>>> a big obstacle to many interesting dtypes (e.g., it's hard to >>>>> implement np.equal for categoricals if you don't know what >>>>> categories each has). So we need to add new arguments to the core >>>>> ufunc loop signature. (Fortunately this can be done in a >>>>> backwards-compatible way.) >>>>> >>>>> - We need to figure out what exactly the dtype methods should be, >>>>> and add them to the dtype class (possibly with backwards >>>>> compatibility shims for anyone who is accessing PyArray_ArrFuncs >>>>> directly). >>>>> >>>>> - Casting will be possibly the trickiest thing to work out, though >>>>> the basic idea of using dunder-dispatch-like __cast__ and >>>>> __rcast__ methods seems workable. (Encouragingly, this is also >>>>> exactly what dynd also does, though unfortunately dynd does not >>>>> yet support user-defined dtypes even to the extent that numpy >>>>> does, so there isn't much else we can steal from them.) >>>>> - We may also want to rethink the casting rules while we're at it, >>>>> since they have some very weird corners right now (e.g. 
see >>>>> [https://github.com/numpy/numpy/issues/6240]) >>>>> >>>>> - We need to migrate the current dtypes over to the new system, >>>>> which can be done in stages: >>>>> >>>>> - First stick them all in a single "legacy dtype" class whose >>>>> methods just dispatch to the PyArray_ArrFuncs per-object "method >>>>> table" >>>>> >>>>> - Then move each of them into their own classes >>>>> >>>>> - We should provide a Python-level wrapper for the protocol, so that >>>>> you can call dtype methods from Python >>>>> >>>>> - And vice-versa, it should be possible to subclass dtype at the >>>>> Python level >>>>> >>>>> - etc. >>>>> >>>>> Fortunately, AFAICT pretty much all of this can be done while >>>>> maintaining backwards compatibility (though we may want to break >>>>> some obscure cases to avoid expending *too* much effort with weird >>>>> backcompat contortions that will only help a vanishingly small >>>>> proportion of the userbase), and a lot of the above changes can be >>>>> done as semi-independent mini-projects, so there's no need for some >>>>> branch to go off and spend a year rewriting the world. >>>>> >>>>> Obviously there are still a lot of details to work out, though. But >>>>> overall, there was widespread agreement that this is one of the #1 >>>>> pain points for our users (e.g. it's the single main request from >>>>> pandas), and fixing it is very high priority. >>>>> >>>>> Some features that would become straightforward to implement >>>>> (e.g. even in third-party libraries) if this were fixed: >>>>> - missing value support >>>>> - physical unit tracking (meters / seconds -> array of velocity; >>>>> meters + seconds -> error) >>>>> - better and more diverse datetime representations (e.g. datetimes >>>>> with attached timezones, or using funky geophysical or >>>>> astronomical calendars) >>>>> - categorical data >>>>> - variable length strings >>>>> - strings-with-encodings (e.g. latin1) >>>>> - forward mode automatic differentiation (write a function that >>>>> computes f(x) where x is an array of float64; pass that function >>>>> an array with a special dtype and get out both f(x) and f'(x)) >>>>> - probably others I'm forgetting right now >>>>> >>>>> I should also note that there was one substantial objection to this >>>>> plan, from Travis Oliphant (in discussions later in the >>>>> conference). I'm not confident I understand his objections well >>>>> enough to reproduce them here, though -- perhaps he'll elaborate. >>>>> >>>>> >>>>> Money >>>>> ===== >>>>> >>>>> There was an extensive discussion on the topic of: "if we had money, >>>>> what would we do with it?" >>>>> >>>>> This is partially motivated by the realization that there are a >>>>> number of sources that we could probably get money from, if we had a >>>>> good story for what we wanted to do, so it's not just an idle >>>>> question. >>>>> >>>>> Points of general agreement: >>>>> >>>>> - Doing the in-person meeting was a good thing. We should plan do >>>>> that again, at least once a year. So one thing to spend money on >>>>> is travel subsidies to make sure that happens and is productive. >>>>> >>>>> - While it's tempting to imagine hiring junior people for the more >>>>> frustrating/boring work like maintaining buildbots, release >>>>> infrastructure, updating docs, etc., this seems difficult to do >>>>> realistically with our current resources -- how do we hire for >>>>> this, who would manage them, etc.? 
>>>>> >>>>> - On the other hand, the general feeling was that if we found the >>>>> money to hire a few more senior people who could take care of >>>>> themselves more, then that would be good and we could >>>>> realistically absorb that extra work without totally unbalancing >>>>> the project. >>>>> >>>>> - A major open question is how we would recruit someone for a >>>>> position like this, since apparently all the obvious candidates >>>>> who are already active on the NumPy team already have other >>>>> things going on. [For calibration on how hard this can be: NYU >>>>> has apparently had an open position for a year with the job >>>>> description of "come work at NYU full-time with a >>>>> private-industry-competitive-salary on whatever your personal >>>>> open-source scientific project is" (!) and still is having an >>>>> extremely difficult time filling it: >>>>> [http://cds.nyu.edu/research-engineer/]] >>>>> >>>>> - General consensus though was that there isn't much to be done >>>>> about this, except try it and see. >>>>> >>>>> - (By the way, if you're someone who's reading this and >>>>> potentially interested in like a postdoc or better working on >>>>> numpy, then let's talk...) >>>>> >>>>> >>>>> More specific changes to numpy that had general consensus, but don't >>>>> really fit into a high-level roadmap >>>>> >>>>> ========================================================================================================= >>>>> >>>>> - Resolved: we should merge multiarray.so and umath.so into a single >>>>> extension module, so that they can share utility code without the >>>>> current awkward contortions. >>>>> >>>>> - Resolved: we should start hiding new fields in the ufunc and dtype >>>>> structs as soon as possible going forward. (I.e. they would not be >>>>> present in the version of the structs that are exposed through the >>>>> C API, but internally we would use a more detailed struct.) >>>>> - Mayyyyyybe we should even go ahead and hide the subset of the >>>>> existing fields that are really internal details that no-one >>>>> should be using. If we did this without changing anything else >>>>> then it would preserve ABI (the fields would still be where >>>>> existing compiled extensions expect them to be, if any such >>>>> extensions exist) while breaking API (trying to compile such >>>>> extensions would give a clear error), so would be a smoother >>>>> ramp if we think we need to eventually break those fields for >>>>> real. (As discussed above, there are a bunch of fields in the >>>>> dtype base class that only make sense for specific dtype >>>>> subclasses, e.g. only record dtypes need a list of field names, >>>>> but right now all dtypes have one anyway. So it would be nice to >>>>> remove these from the base class entirely, but that is >>>>> potentially ABI-breaking.) >>>>> >>>>> - Resolved: np.array should never return an object array unless >>>>> explicitly requested (e.g. with dtype=object); it just causes too >>>>> many surprising problems. >>>>> - First step: add a deprecation warning >>>>> - Eventually: make it an error. >>>>> >>>>> - The matrix class >>>>> - Resolved: We won't add warnings yet, but we will prominently >>>>> document that it is deprecated and should be avoided wherever >>>>> possible. >>>>> - Stéfan van der Walt volunteers to do this.
>>>>> - We'd all like to deprecate it properly, but the feeling was that >>>>> the precondition for this is for scipy.sparse to provide sparse >>>>> "arrays" that don't return np.matrix objects on ordinary >>>>> operations. Until that happens we can't reasonably tell people >>>>> that using np.matrix is a bug. >>>>> >>>>> - Resolved: we should add a similar prominent note to the >>>>> "subclassing ndarray" documentation, warning people that this is >>>>> painful and barely works and please don't do it if you have any >>>>> alternatives. >>>>> >>>>> - Resolved: we want more, smaller releases -- every 6 months at >>>>> least, aiming to go even faster (every 4 months?) >>>>> >>>>> - On the question of using Cython inside numpy core: >>>>> - Everyone agrees that there are places where this would be an >>>>> improvement (e.g., Python<->C interfaces, and places "when you >>>>> want to do computer science", e.g. complicated algorithmic stuff >>>>> like graph traversals) >>>>> - Chuck wanted it to be clear though that he doesn't think it >>>>> would be a good goal to try and rewrite all of numpy in Cython >>>>> -- there also exist places where Cython ends up being "an uglier >>>>> version of C". No-one disagreed. >>>>> >>>>> - Our text reader is apparently not very functional on Python 3, and >>>>> generally slow and hard to work with. >>>>> - Resolved: We should extract Pandas's awesome text reader/parser >>>>> and convert it into its own package, that could then become a >>>>> new backend for both pandas and numpy.loadtxt. >>>>> - Jeff thinks this is a great idea >>>>> - Thomas Caswell volunteers to do the extraction. >>>>> >>>>> - We should work on improving our tools for evolving the ABI, so >>>>> that we will eventually be less constrained by decisions made >>>>> decades ago. >>>>> - One idea that had a lot of support was to switch from our >>>>> current append-only C-API to a "sliding window" API based on >>>>> explicit versions. So a downstream package might say >>>>> >>>>> #define NUMPY_API_VERSION 4 >>>>> >>>>> and they'd get the functions and behaviour provided in "version >>>>> 4" of the numpy C api. If they wanted to get access to new stuff >>>>> that was added in version 5, then they'd need to switch that >>>>> #define, and at the same time clean up any usage of stuff that >>>>> was removed or changed in version 5. And to provide a smooth >>>>> migration path, one version of numpy would support multiple >>>>> versions at once, gradually deprecating and dropping old >>>>> versions. >>>>> >>>>> - If anyone wants to help bring pip up to scratch WRT tracking ABI >>>>> dependencies (e.g., 'pip install numpy==' >>>>> -> triggers rebuild of scipy against the new ABI), then that >>>>> would be an extremely useful thing. >>>>> >>>>> >>>>> Policies that should be documented >>>>> ================================== >>>>> >>>>> ...together with some notes about what the contents of the document >>>>> should be: >>>>> >>>>> >>>>> How we manage bugs in the bug tracker. >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>> >>>>> - Github "milestones" should *only* be assigned to release-blocker >>>>> bugs (which mostly means "regression from the last release"). >>>>> >>>>> In particular, if you're tempted to push a bug forward to the next >>>>> release... then it's clearly not a blocker, so don't set it to the >>>>> next release's milestone, just remove the milestone entirely.
>>>>> >>>>> (Obvious exception to this: deprecation followup bugs where we >>>>> decide that we want to keep the deprecation around a bit longer >>>>> are a case where a bug actually does switch from being a blocker >>>>> for release 1.x to being a blocker for release 1.(x+1).) >>>>> >>>>> - Don't hesitate to close an issue if there's no way forward -- >>>>> e.g. a PR where the author has disappeared. Just post a link to >>>>> this policy and close, with a polite note that we need to keep our >>>>> tracker useful as a todo list, but they're welcome to re-open if >>>>> things change. >>>>> >>>>> >>>>> Deprecations and breakage policy: >>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>>>> >>>>> - How long do we need to keep DeprecationWarnings around before we >>>>> break things? This is tricky because on the one hand an aggressive >>>>> (short) deprecation period lets us deliver new features and >>>>> important cleanups more quickly, but on the other hand a >>>>> too-aggressive deprecation period is difficult for our more >>>>> conservative downstream users. >>>>> >>>>> - Idea that had the most support: pick a somewhat-aggressive >>>>> warning period as our default, and make a rule that if someone >>>>> asks for an extension during the beta cycle for the release that >>>>> removes it, then we put it back for another release or two worth >>>>> of grace period. (While also possibly upgrading the warning to >>>>> be more visible during the grace period.) This gives us >>>>> deprecation periods that are more adaptive on a case-by-case >>>>> basis. >>>>> >>>>> - Lament: it would be really nice if we could get more people to >>>>> test our beta releases, because in practice right now 1.x.0 ends >>>>> up being where we actually discover all the bugs, and 1.x.1 is >>>>> where it actually becomes usable. Which sucks, and makes it >>>>> difficult to have a solid policy about what counts as a >>>>> regression, etc. Is there anything we can do about this? >>>>> >>>>> - ABI breakage: we distinguish between an ABI break that breaks >>>>> everything (e.g., "import scipy" segfaults), versus an ABI break >>>>> that breaks an occasional rare case (e.g., only apps that poke >>>>> around in some obscure corner of some struct are affected). >>>>> >>>>> - The "break-the-world" type remains off-limits for now: the pain >>>>> is still too large (conda helps, but there are lots of people >>>>> who don't use conda!), and there aren't really any compelling >>>>> improvements that this would enable anyway. >>>>> >>>>> - For the "break-0.1%-of-users" type, it is *not* ruled out by >>>>> fiat, though we remain conservative: we should treat it like >>>>> other API breaks in principle, and do a careful case-by-case >>>>> analysis of the details of the situation, taking into account >>>>> what kind of code would be broken, how common these cases are, >>>>> how important the benefits are, whether there are any specific >>>>> mitigation strategies we can use, etc. -- with this process of >>>>> course taking into account that a segfault is nastier than a >>>>> Python exception. >>>>> >>>>> >>>>> Other points that were discussed >>>>> ================================ >>>>> >>>>> - There was inconclusive discussion of what we should do with dot() >>>>> in the places where it disagrees with the PEP 465 matmul semantics >>>>> (specifically this is when both arguments have ndim >= 3, or one >>>>> argument has ndim == 0).
>>>>> - The concern is that the current behavior is not very useful, and >>>>> as far as we can tell no-one is using it; but, as people get >>>>> used to the more-useful PEP 465 behavior, they will increasingly >>>>> try to use it on the assumption that np.dot will work the same >>>>> way, and this will create pain for lots of people. So Nathaniel >>>>> argued that we should start at least issuing a visible warning >>>>> when people invoke the corner-case behavior. >>>>> - But OTOH, np.dot is such a core piece of infrastructure, and >>>>> there's such a large landscape of code out there using numpy >>>>> that we can't see, that others were reasonably wary of making >>>>> any change. >>>>> - For now: document prominently, but no change in behavior. >>>>> >>>>> >>>>> Links to raw notes >>>>> ================== >>>>> >>>>> Main page: >>>>> [https://github.com/numpy/numpy/wiki/SciPy-2015-developer-meeting] >>>>> >>>>> Notes from the meeting proper: >>>>> [ >>>>> https://docs.google.com/document/d/1IJcYdsHtk8MVAM4AZqFDBSf_nVG-mrB4Tv2bh9u1g4Y/edit?usp=sharing >>>>> ] >>>>> >>>>> Slides from the followup BoF: >>>>> [ >>>>> https://gist.github.com/njsmith/eb42762054c88e810786/raw/b74f978ce10a972831c582485c80fb5b8e68183b/future-of-numpy-bof.odp >>>>> ] >>>>> >>>>> Notes from the followup BoF: >>>>> [ >>>>> https://docs.google.com/document/d/11AuTPms5dIPo04JaBOWEoebXfk-tUzEZ-CvFnLIt33w/edit >>>>> ] >>>>> >>>>> -n >>>>> >>>>> -- >>>>> Nathaniel J. Smith -- http://vorpus.org >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Mon Aug 31 04:23:15 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 31 Aug 2015 10:23:15 +0200 Subject: [Numpy-discussion] np.sign and object comparisons In-Reply-To: References: Message-ID: <1441009395.1716.14.camel@sipsolutions.net> On So, 2015-08-30 at 21:09 -0700, Jaime Fernández del Río wrote: > There's been some work going on recently on Py2 vs Py3 object > comparisons. If you want all the background, see gh-6265 and follow > the links there.
> > > There is a half-baked PR in the works, gh-6269, that tries to unify > behavior and fix some bugs along the way, by replacing all 2.x uses of > PyObject_Compare with several calls to PyObject_RichCompareBool, which > is available on 2.6, the oldest Python version we support. > > > The poster child for this example is computing np.sign on an object > array that has an np.nan entry. 2.x will just make up an answer for > us: > > > >>> cmp(np.nan, 0) > -1 > > > even though none of the relevant compares succeeds: > > > >>> np.nan < 0 > False > >>> np.nan > 0 > > False > >>> np.nan == 0 > > False > > > The current 3.x is buggy, so the fact that it produces the same made > up result as in 2.x is accidental: > > > >>> np.sign(np.array([np.nan], 'O')) > array([-1], dtype=object) > > > Looking at the code, it seems that the original intention was for the > answer to be `0`, which is equally made up but perhaps makes a little > more sense. > > > There are three ways of fixing this that I see: > 1. Arbitrarily choose a value to set the return to. This is > equivalent to choosing a default return for `cmp` for > comparisons. This preserves behavior, but feels wrong. > 2. Similarly to how np.sign of a floating point array with nans > returns nan for those values, return e.g. None for these > cases. This is my preferred option. That would be my gut feeling as well. Returning `NaN` could also make sense, but I guess we run into problems since we do not know the input type. So `None` seems like the only option here I can think of right now. - Sebastian > 3. Raise an error, along the lines of the TypeError: unorderable > types that 3.x produces for some comparisons. > Thoughts anyone? > > > Jaime > -- > > (\__/) > ( O.o) > ( > <) This is Conejo. Copy Conejo into your signature and help him in > his plans for world domination. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From shoyer at gmail.com Mon Aug 31 13:23:10 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 31 Aug 2015 10:23:10 -0700 Subject: [Numpy-discussion] np.sign and object comparisons In-Reply-To: <1441009395.1716.14.camel@sipsolutions.net> References: <1441009395.1716.14.camel@sipsolutions.net> Message-ID: On Mon, Aug 31, 2015 at 1:23 AM, Sebastian Berg wrote: > That would be my gut feeling as well. Returning `NaN` could also make > sense, but I guess we run into problems since we do not know the input > type. So `None` seems like the only option here I can think of right > now. > My inclination is that returning NaN would be the appropriate choice. It's certainly consistent with the behavior for float dtypes -- my expectation for object dtype behavior is that it works exactly like applying the np.sign ufunc to each element of the array individually. On the other hand, I suppose there are other ways in which an object can fail all those comparisons (e.g., NaT?), so I suppose we could return None. But it would still be a weird outcome for the most common case. Ideally, I suppose, np.sign would return an array with int-NA dtype, but that's a whole different can of worms... Stephan
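To make the trade-off concrete, here is a minimal pure-Python sketch of option 2 above -- use only rich comparisons and fall back to None when all of them fail -- where the helper name object_sign and the driver loop are illustrative only, not the proposed C inner loop:

    import numpy as np

    def object_sign(x):
        # Mirror the PyObject_RichCompareBool route: only <, > and == are
        # consulted, never cmp(). Anything that fails all three checks
        # (e.g. float('nan')) falls through to None.
        if x > 0:
            return 1
        if x < 0:
            return -1
        if x == 0:
            return 0
        return None

    arr = np.array([-2.0, 0.0, 3.5, np.nan], dtype=object)
    print([object_sign(v) for v in arr])   # [-1, 0, 1, None]

Under the same sketch, option 3 would replace the final "return None" with raising a TypeError, and option 1 would replace it with some fixed made-up value.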
From solipsis at pitrou.net Mon Aug 31 13:31:06 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 31 Aug 2015 19:31:06 +0200 Subject: [Numpy-discussion] np.sign and object comparisons References: <1441009395.1716.14.camel@sipsolutions.net> Message-ID: <20150831193106.6504744f@fsol> On Mon, 31 Aug 2015 10:23:10 -0700 Stephan Hoyer wrote: > > My inclination is that returning NaN would be the appropriate choice. It's > certainly consistent with the behavior for float dtypes -- my expectation > for object dtype behavior is that it works exactly like applying the > np.sign ufunc to each element of the array individually. > > On the other hand, I suppose there are other ways in which an object can > fail all those comparisons (e.g., NaT?), so I suppose we could return None. Currently: >>> np.sign(np.timedelta64('nat')) numpy.timedelta64(-1) ... probably because NaT is -2**63 under the hood. But in this case returning NaT would sound better. Regards Antoine. From sebastian at sipsolutions.net Mon Aug 31 14:06:12 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 31 Aug 2015 20:06:12 +0200 Subject: [Numpy-discussion] np.sign and object comparisons In-Reply-To: References: <1441009395.1716.14.camel@sipsolutions.net> Message-ID: <1441044372.1716.23.camel@sipsolutions.net> On Mo, 2015-08-31 at 10:23 -0700, Stephan Hoyer wrote: > On Mon, Aug 31, 2015 at 1:23 AM, Sebastian Berg > wrote: > That would be my gut feeling as well. Returning `NaN` could > also make > > sense, but I guess we run into problems since we do not know > the input > type. So `None` seems like the only option here I can think of > right > now. > > > My inclination is that returning NaN would be the appropriate choice. > It's certainly consistent with the behavior for float dtypes -- my > expectation for object dtype behavior is that it works exactly like > applying the np.sign ufunc to each element of the array individually. > I was wondering a bit if returning the original object could make sense. It would work for NaN (and also decimal versions of NaN, etc.). But I am not sure in general. - Sebastian > > On the other hand, I suppose there are other ways in which an object > can fail all those comparisons (e.g., NaT?), so I suppose we could > return None. But it would still be a weird outcome for the most common > case. Ideally, I suppose, np.sign would return an array with int-NA > dtype, but that's a whole different can of worms... > > > Stephan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion
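Sebastian's pass-through idea can be sketched in the same style; the name object_sign_passthrough is again just for illustration, and, as Antoine's example suggests, NaT would probably still need separate handling for as long as it compares like -2**63:

    def object_sign_passthrough(x):
        # Same rich-comparison logic as before, but an object that fails
        # all three comparisons is returned unchanged instead of being
        # mapped to None, so NaN-like inputs propagate through np.sign.
        if x > 0:
            return 1
        if x < 0:
            return -1
        if x == 0:
            return 0
        return x

    print(object_sign_passthrough(float('nan')))   # nan
    print(object_sign_passthrough(-3))             # -1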