From friedrichromstedt at gmail.com Mon Feb 1 02:57:40 2021 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Mon, 1 Feb 2021 08:57:40 +0100 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: Hi, On Tue, Jan 26, 2021 at 09:48, Friedrich Romstedt wrote: > > [...] The following Python > code crashes:: > > image = <... Image production ...> > ar = numpy.asarray(image) > > However, when I say:: > > image = <... Image production ...> > print("---") > ar = numpy.asarray(image) > > the entire program is executing properly with correct data in the > numpy ndarray produced using the buffer interface. > > [...] Does anyone have an idea about this? By the way, I noticed that this mailing list turned pretty quiet, am I missing something? For completeness, the abovementioned "crash" shows up as just a premature exit of the program. There is no error message whatsoever. The buffer view producing function raises Exceptions properly when something goes wrong; also notice that this code completes without error when the ``print("---")`` statement is in action. So I presume the culprit lies somewhere on the C level. I can only guess that it might be some side-effect unknown to me. Best, Friedrich From hameerabbasi at yahoo.com Mon Feb 1 03:03:33 2021 From: hameerabbasi at yahoo.com (Hameer Abbasi) Date: Mon, 1 Feb 2021 09:03:33 +0100 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: Hey Friedrich, If you can produce an MVCE that would be really helpful, along with your hardware and environment. Without that, it isn't possible to be of much help. https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports Best Regards, Hameer Abbasi -- Sent from Canary (https://canarymail.io) > On Tuesday, Jan. 
26, 2021 at 9:49 AM, Friedrich Romstedt wrote: > Hi, > > This is with Python 3.8.2 64-bit and numpy 1.19.2 on Windows 10. I'd > like to be able to convert some C++ extension type to a numpy array by > using ``numpy.asarray``. The extension type implements the Python > buffer interface to support this. > > The extension type, called "Image" here, holds some chunk of > ``double``, C order, contiguous, 2 dimensions. It "owns" the buffer; > the buffer is not shared with other objects. The following Python > code crashes:: > > image = <... Image production ...> > ar = numpy.asarray(image) > > However, when I say:: > > image = <... Image production ...> > print("---") > ar = numpy.asarray(image) > > the entire program is executing properly with correct data in the > numpy ndarray produced using the buffer interface. > > The extension type permits reading the pixel values by a method; > copying them over by a Python loop works fine. I am ``Py_INCREF``-ing > the producer in the C++ buffer view creation function properly. The > shapes and strides of the buffer view are ``delete[]``-ed upon > releasing the buffer; avoiding this does not prevent the crash. I am > catching ``std::exception`` in the view creation function; no such > exception occurs. The shapes and strides are allocated by ``new > Py_ssize_t[2]``, so they will survive the view creation function. > > I spent some hours trying to figure out what I am doing wrong. Maybe > someone has an idea about this? I double-checked each line of code > related to this problem and couldn't find any mistake. Probably I am > not looking at the right aspect. > > Best, > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matti.picus at gmail.com Mon Feb 1 03:46:37 2021 From: matti.picus at gmail.com (Matti Picus) Date: Mon, 1 Feb 2021 10:46:37 +0200 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: On 2/1/21 9:57 AM, Friedrich Romstedt wrote: > Hi, > > On Tue, Jan 26, 2021 at 09:48, Friedrich Romstedt > wrote: >> [...] The following Python >> code crashes:: >> >> image = <... Image production ...> >> ar = numpy.asarray(image) >> >> However, when I say:: >> >> image = <... Image production ...> >> print("---") >> ar = numpy.asarray(image) >> >> the entire program is executing properly with correct data in the >> numpy ndarray produced using the buffer interface. >> >> [...] > Does anyone have an idea about this? By the way, I noticed that this > mailing list turned pretty quiet, am I missing something? > > For completeness, the abovementioned "crash" shows up as just a > premature exit of the program. There is no error message whatsoever. > The buffer view producing function raises Exceptions properly when > something goes wrong; also notice that this code completes without > error when the ``print("---")`` statement is in action. So I presume > the culprit lies somewhere on the C level. I can only guess that it > might be some side-effect unknown to me. > > Best, > Friedrich It is very hard to help you from this description. It may be a refcount problem, it may be a buffer protocol problem, it may be something else. Typically, one would create a complete example and then point to the code (as repo or pastebin, not as an attachment to a mail here). A few things you might want to check: - Make sure you give instructions how to build your project for Linux, since most of the people on this list do not use windows. - There are tools out there to analyze refcount problems. Python has some built-in tools for switching allocation strategies. 
- numpy.asarray has a number of strategies to convert instances, which one is it using? Matti From sebastian at sipsolutions.net Tue Feb 2 19:02:26 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 02 Feb 2021 18:02:26 -0600 Subject: [Numpy-discussion] NumPy Community Meeting Wednesday Message-ID: <080bd5452f7b3f5af6188bd0fde7e95c48a88234.camel@sipsolutions.net> Hi all, There will be a NumPy Community meeting Wednesday February 3rd at 12pm Pacific Time (20:00 UTC). Everyone is invited and encouraged to join in and edit the work-in-progress meeting topics and notes at: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Best wishes Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From calhoun137 at gmail.com Wed Feb 3 18:18:49 2021 From: calhoun137 at gmail.com (Matt Calhoun) Date: Wed, 3 Feb 2021 18:18:49 -0500 Subject: [Numpy-discussion] Math Inspector Beta Message-ID: Hi Everyone! I have been using numpy for an extremely long time, but this is the first time emailing the list. I recently released the beta version of my free open source math app called math inspector, and so far the response has been really amazing, it was on the front page of hacker news all day sunday and went from 15 stars to 348 on GitHub since then. I wanted to reach out to the community to find out if people like this project, have any feedback/suggestions/feature requests, or would possibly be interested in placing a link to the website (mathinspector.com) on the numpy homepage. Math inspector is a python interpreter which contains a frozen version of python and numpy, this makes it very easy for non-technical people to get started, it also creates a block coding environment which represents the memory of the running program. 
This block coding environment is at such a high level of generality that it's capable of working for all of python. It also has an interactive graphing system made in pygame which updates and modernizes all of the functionality in matplotlib. This graphing system is its own stand-alone module by the way. Math inspector also has a documentation browser which creates a beautiful interactive experience for exploring the documentation. Everything in math inspector has been designed specifically for numpy, even though it works for all of python. I started it 2 years ago when I got really confused after searching through the numpy website, and I wanted to build a system where I could dig into the modules in a directory file type structure that was highly organized. From there everything just took off. The main goal of this project is to support the mathematics education community on youtube, by providing a free tool that everyone can use to share code samples for their videos, but I believe it has a wide range of additional applications for scientific computing as well. I have been working really hard on this project, and I really hope everyone likes it! You can find the full source code on the GitHub page: https://github.com/MathInspector/MathInspector Cheers! - Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From mansourmoufid at gmail.com Wed Feb 3 22:03:26 2021 From: mansourmoufid at gmail.com (Mansour Moufid) Date: Wed, 3 Feb 2021 22:03:26 -0500 Subject: [Numpy-discussion] Math Inspector Beta In-Reply-To: References: Message-ID: Very cool! But the Mac disk image (mathinspector_0.9.1.dmg) isn't opening ("corrupt image"). It's 145279488 bytes and the shasum ends with f1ed9231. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From friedrichromstedt at gmail.com Thu Feb 4 03:07:59 2021 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Thu, 4 Feb 2021 09:07:59 +0100 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: Hello Matti, On Mon, Feb 1, 2021 at 09:46, Matti Picus wrote: > > [...] > > It is very hard to help you from this description. It may be a refcount > problem, it may be a buffer protocol problem, it may be something else. Yes, indeed! > Typically, one would create a complete example and then pointing to the > code (as repo or pastebin, not as an attachment to a mail here). https://github.com/friedrichromstedt/bughunting-01 I boiled it down considerably, compared to the program where I stumbled upon the problem. In the abovementioned repo, you find a Python test script in the `test/` folder. Therein, a single `print` statement can be used to trigger or to avoid the error. On Linux, I get a somewhat more precise description than just from the premature exit on Windows: It is a segfault. Certainly it is still asking quite a lot to skim through my source code; however, I hope that I have trimmed it down sufficiently. > - Make sure you give instructions how to build your project for Linux, > since most of the people on this list do not use windows. The code reproducing the segfault can be compiled by `$ python3 setup.py install`, both on Windows as well as on Linux. > - There are tools out there to analyze refcount problems. Python has > some built-in tools for switching allocation strategies. Can you give me some pointers about this? > - numpy.asarray has a number of strategies to convert instances, which > one is it using? I've tried to read about this, but couldn't find anything. What are these different strategies? 
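For what it's worth, the Python-level side of this can be exercised without the extension type at all: ``memoryview`` goes through the same buffer-protocol slots that ``numpy.asarray`` consumes, so it is a quick way to check what an exporter publishes. A minimal sketch, using a plain ``bytearray`` as a stand-in for the C++ ``Image`` exporter (which is not available here):

```python
import struct

import numpy as np

# Stand-in exporter: 16 C doubles in one contiguous block. Any object
# implementing the buffer protocol behaves the same way from Python.
raw = bytearray(struct.pack("16d", *range(16)))

# Consuming the buffer via memoryview exercises the same getbuffer slot
# that numpy.asarray uses, so a broken view tends to misbehave here too.
view = memoryview(raw).cast("d", shape=(4, 4))
print(view.format, view.shape, view.strides)  # d (4, 4) (32, 8)

# numpy.asarray then wraps the same buffer without copying.
ar = np.asarray(view)
print(ar.shape, ar.dtype)  # (4, 4) float64
```

If the memoryview already crashes or reports the wrong format, shape, or strides, the problem is in the getbuffer slot rather than in NumPy.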
Many thanks in advance, Friedrich From ralf.gommers at gmail.com Thu Feb 4 04:36:58 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 4 Feb 2021 10:36:58 +0100 Subject: [Numpy-discussion] Math Inspector Beta In-Reply-To: References: Message-ID: Hi Matt, Very cool, thanks for sharing! On Thu, Feb 4, 2021 at 12:19 AM Matt Calhoun wrote: > Hi Everyone! I have been using numpy for an extremely long time, but this > is the first time emailing the list. I recently released the beta version > of my free open source math app called math inspector, and so far the > response has been really amazing, it was on the front page of hacker news > all day sunday and went from 15 stars to 348 on GitHub since then. I > wanted to reach out to the community to find out if people like this > project, have any feedback/suggestions/feature requests, or would possibly > be interested in placing a link to the website (mathinspector.com) on the > numpy homepage. > We have an Ecosystem section on numpy.org, we can add it there. There's an Interactive Computing section where it kind of fits (although a place labeled education would be better). There's some discussion on the numpy.org issue tracker ( https://github.com/numpy/numpy.org/issues/313#issuecomment-751466980) about moving that to its own tab instead of having it as an entry under "Scientific computing", but for now we could add it there under Jupyter/IPython/Binder. > Math inspector is a python interpreter which contains a frozen version of > python and numpy, this makes it very easy for non-technical people to get > started, it also creates a block coding environment which represents the > memory of the running program. This block coding environment is at such a > high level of generality that it's capable of working for all of python. > It also has an interactive graphing system made in pygame which updates and > modernizes all of the functionality in matplotlib. 
This graphing system is > it's own stand alone module by the way. Math inspector also has a > documentation browser which creates a beautiful interactive experience for > exploring the documentation. > > Everything in math inspector has been designed specifically for > numpy, even though it works for all of python. I started it 2 years ago > when I got really confused after searching through the numpy website, and I > wanted to build a system where I could dig into the modules in a directory > file type structure that was highly organized. From there everything just > took off. > One thing I realized when browsing through the video on your front page is that the public module layout we have is very unhelpful for this kind of education - it'd be good if we had a way to hide things like core, emath, matrixlib, etc. that we don't want people to import and use directly. Essentially we'd want to teach people mostly about the main namespace, and fft, linalg, and random. If you have other thoughts on what would help you to make NumPy more approachable, in Math Inspector or in general, those would be great to hear. Cheers, Ralf > The main goal of this project is to support the mathematics education > community on youtube, by providing a free tool that everyone can use to > share code samples for their videos, but I believe it has a wide range of > additional applications for scientific computing as well. > > I have been working really hard on this project, and I really hope > everyone likes it! > > You can find the full source code on the GitHub page: > https://github.com/MathInspector/MathInspector > > Cheers! > - Matt > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From melissawm at gmail.com Thu Feb 4 08:55:50 2021 From: melissawm at gmail.com (Melissa Mendonça) Date: Thu, 4 Feb 2021 10:55:50 -0300 Subject: [Numpy-discussion] Math Inspector Beta In-Reply-To: References: Message-ID: Hi Matt! This is great timing - we actually talked about mathinspector in our Documentation Team meeting on Monday (you can see the meeting notes here: https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg). If you are interested, you are welcome to join our slack space and/or our docs meetings, we would love to chat in more detail. Cheers, Melissa On Thu, Feb 4, 2021 at 6:38 AM Ralf Gommers wrote: > Hi Matt, > > Very cool, thanks for sharing! > > > On Thu, Feb 4, 2021 at 12:19 AM Matt Calhoun wrote: > >> Hi Everyone! I have been using numpy for an extremely long time, but >> this is the first time emailing the list. I recently released the beta >> version of my free open source math app called math inspector, and so far >> the response has been really amazing, it was on the front page of hacker >> news all day sunday and went from 15 stars to 348 on GitHub since then. I >> wanted to reach out to the community to find out if people like this >> project, have any feedback/suggestions/feature requests, or would possibly >> be interested in placing a link to the website (mathinspector.com) on >> the numpy homepage. >> > > We have an Ecosystem section on numpy.org, we can add it there. There's > an Interactive Computing section where it kind of fits (although a place > labeled education would be better). There's some discussion on the > numpy.org issue tracker ( > https://github.com/numpy/numpy.org/issues/313#issuecomment-751466980) > about moving that to its own tab instead of having it as an entry under > "Scientific computing", but for now we could add it there under > Jupyter/IPython/Binder. 
> > >> Math inspector is a python interpreter which contains a frozen version of >> python and numpy, this makes it very easy for non-technical people to get >> started, it also creates a block coding environment which represents the >> memory of the running program. This block coding environment is at such a >> high level of generality that it's capable of working for all of python. >> It also has an interactive graphing system made in pygame which updates and >> modernizes all of the functionality in matplotlib. This graphing system is >> it's own stand alone module by the way. Math inspector also has a >> documentation browser which creates a beautiful interactive experience for >> exploring the documentation. >> >> Everything in math inspector has been designed specifically for >> numpy, even though it works for all of python. I started it 2 years ago >> when I got really confused after searching through the numpy website, and I >> wanted to build a system where I could dig into the modules in a directory >> file type structure that was highly organized. From there everything just >> took off. >> > > One thing I realized when browsing through the video on your front page is > that the public module layout we have is very unhelpful for this kind of > education - it'd be good if we had a way to hide things like core, emath, > matrixlib, etc. that we don't want people to import and use directly. > Essentially we'd to teach people mostly about the main namespace, and fft, > linalg, and random. > > If you have other thoughts on what would help you to make NumPy more > approachable, in Math Inspector or in general, those would be great to hear. > > Cheers, > Ralf > > > >> The main goal of this project is to support the mathematics education >> community on youtube, by providing a free tool that everyone can use to >> share code samples for their videos, but I believe it has a wide range of >> additional applications for scientific computing as well. 
>> >> I have been working really hard on this project, and I really hope >> everyone likes it! >> >> You can find the full source code on the GitHub page: >> https://github.com/MathInspector/MathInspector >> >> Cheers! >> - Matt >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From calhoun137 at gmail.com Thu Feb 4 09:09:01 2021 From: calhoun137 at gmail.com (Matt Calhoun) Date: Thu, 4 Feb 2021 09:09:01 -0500 Subject: [Numpy-discussion] Math Inspector Beta Message-ID: @ Mansour Moufid > Very cool! > But the Mac disk image (mathinspector_0.9.1.dmg) isn't opening ("corrupt image"). > It's 145279488 bytes and the shasum ends with f1ed9231. Oh no, whoops! The .dmg file has been code signed with my apple developer id, notarized with apple, and passes all verification checks on my machine when I download it from the website. Ever since sunday I have been scrambling to support every platform and os version out there basically, and this is the first time I saw this one. For the sake of avoiding using the mailing list to debug, would you be willing to open an issue on the Math Inspector GitHub page? Thanks! (btw I checked the file on my machine and it's the same filesize with the same shasum, so my guess is there is a pyinstaller issue related to an os version conflict, or a code signing issue, not sure though, I built it on BigSur 11.1) @ Ralf Gommers > Very cool, thanks for sharing! Thank you!!! 
> We have an Ecosystem section on numpy.org, we can add it there It's really important to me to make math inspector a part of the numpy ecosystem, and since this is the first time I am reaching out to the mailing list, I'd like to emphasize that I am more than willing to work with the community to improve the product, respond to bug reports & feature requests, and in general I strongly value constructive criticism. > One thing I realized when browsing through the video on your front page is > that the public module layout we have is very unhelpful for this kind of > education...If you have other thoughts on what would help you to make NumPy more > approachable, in Math Inspector or in general, those would be great to hear. I completely agree with your observation here. It hadn't occurred to me to change numpy to make it better for math inspector, but I think you are hitting the nail on the head when you suggest re-organizing the file structure of the core package. The main suggestion I have is to update the documentation in a way that leverages the power of math inspector. The math inspector doc browser is a powerful tool with lots of extra functionality that is not available from the website or in the normal python help() function. This extra functionality could be used to make numpy more approachable. For example, replace references to matplotlib in the docs with mathinspector.plot(), and substitute mathinspector for IPython as the recommended tool. Thanks for this fantastic feedback! -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Feb 4 12:20:39 2021 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 4 Feb 2021 10:20:39 -0700 Subject: [Numpy-discussion] Math Inspector Beta In-Reply-To: References: Message-ID: On Wed, Feb 3, 2021 at 4:19 PM Matt Calhoun wrote: > Hi Everyone! I have been using numpy for an extremely long time, but this > is the first time emailing the list. 
I recently released the beta version > of my free open source math app called math inspector, and so far the > response has been really amazing, it was on the front page of hacker news > all day sunday and went from 15 stars to 348 on GitHub since then. I > wanted to reach out to the community to find out if people like this > project, have any feedback/suggestions/feature requests, or would possibly > be interested in placing a link to the website (mathinspector.com) on the > numpy homepage. > > Math inspector is a python interpreter which contains a frozen version of > python and numpy, this makes it very easy for non-technical people to get > started, it also creates a block coding environment which represents the > memory of the running program. This block coding environment is at such a > high level of generality that it's capable of working for all of python. > It also has an interactive graphing system made in pygame which updates and > modernizes all of the functionality in matplotlib. This graphing system is > it's own stand alone module by the way. Math inspector also has a > documentation browser which creates a beautiful interactive experience for > exploring the documentation. > > Everything in math inspector has been designed specifically for > numpy, even though it works for all of python. I started it 2 years ago > when I got really confused after searching through the numpy website, and I > wanted to build a system where I could dig into the modules in a directory > file type structure that was highly organized. From there everything just > took off. > > The main goal of this project is to support the mathematics education > community on youtube, by providing a free tool that everyone can use to > share code samples for their videos, but I believe it has a wide range of > additional applications for scientific computing as well. > > I have been working really hard on this project, and I really hope > everyone likes it! 
> > You can find the full source code on the GitHub page: > https://github.com/MathInspector/MathInspector > > Cheers! > - Matt > Somewhat off topic, but this brought to mind Model Based Design. MBD is a different subject, but I suspect the same underlying tools used for MathInspector might be useful. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From camel-cdr at protonmail.com Sat Feb 6 04:31:40 2021 From: camel-cdr at protonmail.com (camel-cdr at protonmail.com) Date: Sat, 06 Feb 2021 09:31:40 +0000 Subject: [Numpy-discussion] Question about optimizing random_standard_normal Message-ID: I tried a different implementation of the ziggurat method for generating standard normal distributions that is about twice as fast as the old one and uses 2/3 of the memory. I tested the implementation separately and am very confident it's correct, but it does fail 28 tests in coverage testing. Checking the testing code I found out that all the failed tests are inside TestRandomDist which has the goal of "Make[ing] sure the random distribution returns the correct value for a given seed". Why would this be needed? The only explanation I can come up with is that standard_normal is, in regards to seeding, required to be backwards compatible. If that's the case, how could one even implement a new algorithm? -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom at swirly.com Sat Feb 6 06:22:04 2021 From: tom at swirly.com (Tom Swirly) Date: Sat, 6 Feb 2021 12:22:04 +0100 Subject: [Numpy-discussion] Question about optimizing random_standard_normal In-Reply-To: References: Message-ID: Well, I can tell you why it needs to be backward compatible! I use random numbers fairly frequently, and to unit test them I set a specific seed and then make sure I get the same answers. If your change went in (and I were using numpy normal distributions, which I am not) then my tests would break. 
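That testing pattern can be sketched in a few lines (the seed and sample size here are arbitrary choices, but ``default_rng`` is NumPy's documented seeding API):

```python
import numpy as np

# Two identically seeded generators must produce identical draws.
# A unit test written this way pins the exact output stream, which is
# why any change to the underlying sampling algorithm makes it fail.
a = np.random.default_rng(12345).standard_normal(1000)
b = np.random.default_rng(12345).standard_normal(1000)
assert np.array_equal(a, b)
print("streams match")
```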
Particularly, you'd have the unfixable problem that it would be impossible to write tests for your code that worked regardless of the version of numpy that was installed. Yes, I agree that in your use case, this is powerfully unfortunate, and prevents you from making a change that would otherwise benefit everyone. The three ways to do this would be the following: - Add a new parameter to the function call, say, faster=False, which you set True to get the new behavior - Add a global flag somewhere you set to get the new behavior everywhere - Create a new function called normal_faster or some such All of these are ugly for obvious reasons. On Sat, Feb 6, 2021 at 10:33 AM wrote: > I tried to implement a different implementation of the ziggurat method for > generating standard normal distributions that is about twice as fast and > uses 2/3 of the memory than the old one. > I tested the implementation separately and am very confident it's correct, > but it does fail 28 test in coverage testing. > Checking the testing code I found out that all the failed tests are inside > TestRandomDist which has the goal of "Make[ing] sure the random > distribution returns the correct value for a given seed". Why would this be > needed? > The only explanation I can come up with is that it's standard_normal is, > in regards to seeding, required to be backwards compatible. If that's the > case how would, could one even implement a new algorithm? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -- /t PGP Key: https://flowcrypt.com/pub/tom.ritchford at gmail.com *https://tom.ritchford.com * *https://tom.swirly.com * -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kevin.k.sheppard at gmail.com Sat Feb 6 06:54:32 2021 From: kevin.k.sheppard at gmail.com (Kevin Sheppard) Date: Sat, 6 Feb 2021 11:54:32 +0000 Subject: [Numpy-discussion] Question about optimizing random_standard_normal In-Reply-To: References: Message-ID: Have you benchmarked it using the generator interface? The structure of this as a non-monolithic generator makes it a good deal slower than generating in straight C (with everything inline). I'm not sure a factor of 2 is enough to justify a change (for me 10x is, 1.2x is not, but I don't know where the cutoff is). Can you post benchmarks from using it through Generator? Also, those tests would be replaced with new values if the patch was accepted, so don't worry about them. Kevin On Sat, Feb 6, 2021, 09:32 wrote: > I tried to implement a different implementation of the ziggurat method for > generating standard normal distributions that is about twice as fast and > uses 2/3 of the memory than the old one. > I tested the implementation separately and am very confident it's correct, > but it does fail 28 test in coverage testing. > Checking the testing code I found out that all the failed tests are inside > TestRandomDist which has the goal of "Make[ing] sure the random > distribution returns the correct value for a given seed". Why would this be > needed? > The only explanation I can come up with is that it's standard_normal is, > in regards to seeding, required to be backwards compatible. If that's the > case how would, could one even implement a new algorithm? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From camel-cdr at protonmail.com Sat Feb 6 07:25:49 2021 From: camel-cdr at protonmail.com (camel-cdr at protonmail.com) Date: Sat, 06 Feb 2021 12:25:49 +0000 Subject: [Numpy-discussion] Question about optimizing random_standard_normal In-Reply-To: References: Message-ID: > Well, I can tell you why it needs to be backward compatible! I use random numbers fairly frequently, and to unit test them I set a specific seed and then make sure I get the same answers. Hmm, I guess that makes sense. I tried to adjust my algorithms to do the same thing with the same bits as the original one, but I couldn't get it to work. > Have you benchmarked it using the generator interface? The structure of this as a no monolithic generator makes it a good deal slower than generating in straight C (with everything inline). While I'm not sure a factor of 2 is enough to justify a change (for me 10x, 1.2x is not but I don't know where the cutoff is). I originally benchmarked my implementation against a bunch of other ones in C (because I was developing a C library https://github.com/camel-cdr/cauldron/blob/main/cauldron/random.h#L1928). But I did run the built-in benchmark: ./runtests.py --bench bench_random.RNG.time_normal_zig and the results are: new old PCG64 589±3µs 1.06±0.03ms MT19937 985±4µs 1.44±0.01ms Philox 981±30µs 1.39±0.01ms SFC64 508±4µs 900±4µs numpy 2.99±0.06ms 2.98±0.01ms # no change for /dev/urandom I'm not yet 100% certain about the implementations, but I attached a diff of my current progress. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ziggurat.diff
Type: text/x-diff
Size: 59720 bytes
Desc: not available
URL: 

From charlesr.harris at gmail.com  Sat Feb  6 09:19:33 2021
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sat, 6 Feb 2021 07:19:33 -0700
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: 
Message-ID: 

On Sat, Feb 6, 2021 at 5:27 AM wrote:

> Well, I can tell you why it needs to be backward compatible! I use random
> numbers fairly frequently, and to unit test them I set a specific seed and
> then make sure I get the same answers.
>
> Hmm, I guess that makes sense. I tried to adjust my algorithms to do the
> same thing with the same bits as the original one, but I couldn't get it
> to work.
>
> Have you benchmarked it using the generator interface? The structure of
> this as a non-monolithic generator makes it a good deal slower than
> generating in straight C (with everything inlined). I'm not sure a factor
> of 2 is enough to justify a change (for me 10x would be, 1.2x would not,
> but I don't know where the cutoff is).
>
> I originally benchmarked my implementation against a bunch of other ones
> in C (because I was developing a C library:
> https://github.com/camel-cdr/cauldron/blob/main/cauldron/random.h#L1928).
> But I did run the built-in benchmark ./runtests.py --bench
> bench_random.RNG.time_normal_zig and the results are:
>
>             new          old
> PCG64       589±3µs      1.06±0.03ms
> MT19937     985±4µs      1.44±0.01ms
> Philox      981±30µs     1.39±0.01ms
> SFC64       508±4µs      900±4µs
> numpy       2.99±0.06ms  2.98±0.01ms  # no change for /dev/urandom
>
> I'm not yet 100% certain about the implementations, but I attached a diff
> of my current progress.
>
You can actually get rid of the loop entirely and implement the
exponential function directly by using an exponential bound on the bottom
ziggurat block ends. It just requires a slight change in the block
boundaries.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Sat Feb  6 09:29:46 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 6 Feb 2021 09:29:46 -0500
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: 
Message-ID: 

On Sat, Feb 6, 2021 at 7:27 AM wrote:

> Well, I can tell you why it needs to be backward compatible! I use random
> numbers fairly frequently, and to unit test them I set a specific seed and
> then make sure I get the same answers.
>
> Hmm, I guess that makes sense. I tried to adjust my algorithms to do the
> same thing with the same bits as the original one, but I couldn't get it
> to work.
>
To be clear, this is not our backwards compatibility policy for the
methods that you have modified. Our policy is spelled out here:

https://numpy.org/neps/nep-0019-rng-policy.html

The TestRandomDist suite of tests was adapted from the older RandomState
(which is indeed frozen and not allowed to change algorithms). It's a mix
of correctness tests that are valid regardless of the precise algorithm
(does this method reject invalid arguments? do degenerate arguments yield
the correct constant value?) and actual "has this algorithm changed
unexpectedly?" tests. The former are the most valuable, but the latter are
useful for testing in cross-platform contexts. Compilers and different
CPUs can do naughty things sometimes, and we want the cross-platform
differences to be minimal. When you do change an algorithm implementation
for Generator, as you have done, you are expected to do thorough tests
(offline, not in the unit tests) that it is correctly sampling from the
target probability distribution, then once satisfied, change the
hard-coded values in TestRandomDist to match whatever you are generating.

-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
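A minimal sketch of the kind of offline goodness-of-fit check described
above, using only numpy and the standard library (the seed and threshold
are illustrative; scipy.stats.kstest(samples, "norm") performs the same
Kolmogorov-Smirnov test in one call and also reports a p-value):

```python
import math
import numpy as np

# Kolmogorov-Smirnov statistic of a sample against N(0, 1).
def ks_statistic_std_normal(samples):
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    # Standard normal CDF via the error function.
    cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in x])
    upper = np.arange(1, n + 1) / n - cdf   # D+ terms
    lower = cdf - np.arange(0, n) / n       # D- terms
    return max(upper.max(), lower.max())

rng = np.random.default_rng(2021)  # arbitrary seed for this sketch
d = ks_statistic_std_normal(rng.standard_normal(100_000))

# For a correct sampler at n = 100_000 the statistic should be tiny;
# 0.01 sits far above the ~0.0043 critical value at the 5% level.
assert d < 0.01
```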
URL: 

From camel-cdr at protonmail.com  Sat Feb  6 09:49:07 2021
From: camel-cdr at protonmail.com (camel-cdr at protonmail.com)
Date: Sat, 06 Feb 2021 14:49:07 +0000
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: 
Message-ID: 

------- Original Message -------
On Saturday, February 6, 2021 3:29 PM, Robert Kern wrote:

> On Sat, Feb 6, 2021 at 7:27 AM wrote:
>
>>> Well, I can tell you why it needs to be backward compatible! I use
>>> random numbers fairly frequently, and to unit test them I set a
>>> specific seed and then make sure I get the same answers.
>>
>> Hmm, I guess that makes sense. I tried to adjust my algorithms to do the
>> same thing with the same bits as the original one, but I couldn't get it
>> to work.
>
> To be clear, this is not our backwards compatibility policy for the
> methods that you have modified. Our policy is spelled out here:
>
> https://numpy.org/neps/nep-0019-rng-policy.html
>
> The TestRandomDist suite of tests was adapted from the older RandomState
> (which is indeed frozen and not allowed to change algorithms). It's a mix
> of correctness tests that are valid regardless of the precise algorithm
> (does this method reject invalid arguments? do degenerate arguments yield
> the correct constant value?) and actual "has this algorithm changed
> unexpectedly?" tests. The former are the most valuable, but the latter
> are useful for testing in cross-platform contexts. Compilers and
> different CPUs can do naughty things sometimes, and we want the
> cross-platform differences to be minimal. When you do change an algorithm
> implementation for Generator, as you have done, you are expected to do
> thorough tests (offline, not in the unit tests) that it is correctly
> sampling from the target probability distribution, then once satisfied,
> change the hard-coded values in TestRandomDist to match whatever you are
> generating.
>
> --
> Robert Kern

Ok, cool, that basically explains a lot.
> When you do change an algorithm implementation for Generator, as you
> have done, you are expected to do thorough tests (offline, not in the
> unit tests) that it is correctly sampling from the target probability
> distribution, then once satisfied, change the hard-coded values in
> TestRandomDist to match whatever you are generating.

I'm probably not versed enough in statistics to do thorough testing.
I used the tests from https://www.seehuhn.de/pages/ziggurat and plotted
histograms to verify correctness; that probably won't be sufficient.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Sun Feb  7 13:12:21 2021
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 7 Feb 2021 11:12:21 -0700
Subject: [Numpy-discussion] Pearu Peterson has joined the NumPy developers team.
Message-ID: 

Hi All,

Pearu Peterson has joined the NumPy developers team. Pearu was responsible
for contributing f2py and much of distutils in the early days of NumPy.
Welcome back Pearu.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefanv at berkeley.edu  Sun Feb  7 15:08:41 2021
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Sun, 07 Feb 2021 12:08:41 -0800
Subject: [Numpy-discussion] Pearu Peterson has joined the NumPy developers team.
In-Reply-To: 
References: 
Message-ID: <6fe3b84f-6ae0-4b01-934c-5cd8605a23bc@www.fastmail.com>

On Sun, Feb 7, 2021, at 10:12, Charles R Harris wrote:
> Pearu Peterson has joined the NumPy developers team. Pearu was
> responsible for contributing f2py and much of distutils in the early
> days of NumPy. Welcome back Pearu.

Welcome back, it's good to see you around more, Pearu!

Best regards,
Stéfan
-------------- next part --------------
An HTML attachment was scrubbed...
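Returning to the statistical-testing question in the message above: a
quick moment check can complement histogram plots, though it is likewise
no substitute for proper goodness-of-fit tests. A sketch (the seed is
arbitrary; the tolerances are loose multiples of the standard errors at
this sample size):

```python
import numpy as np

# Draw a large sample and compare its first four moments to N(0, 1).
rng = np.random.default_rng(42)
x = rng.standard_normal(1_000_000)

mean = x.mean()
std = x.std()
skew = ((x - mean) ** 3).mean() / std ** 3
excess_kurtosis = ((x - mean) ** 4).mean() / std ** 4 - 3.0

# N(0, 1) has mean 0, std 1, skewness 0, excess kurtosis 0.
assert abs(mean) < 0.01
assert abs(std - 1.0) < 0.01
assert abs(skew) < 0.05
assert abs(excess_kurtosis) < 0.1
```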
URL: 

From charlesr.harris at gmail.com  Sun Feb  7 16:23:04 2021
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 7 Feb 2021 14:23:04 -0700
Subject: [Numpy-discussion] NumPy 1.20.1 released.
Message-ID: 

Hi All,

On behalf of the NumPy team I am pleased to announce the release of NumPy
1.20.1. NumPy 1.20.1 is a rapid bugfix release fixing several bugs and
regressions reported after the 1.20.0 release. The Python versions
supported for this release are 3.7-3.9. Wheels can be downloaded from
PyPI; source archives, release notes, and wheel hashes are available on
GitHub. Linux users will need pip >= 19.3 in order to install
manylinux2010 and manylinux2014 wheels.

*Highlights*

- The distutils bug that caused problems with downstream projects is fixed.
- The ``random.shuffle`` regression is fixed.

*Contributors*

A total of 8 people contributed to this release. People with a "+" by
their names contributed a patch for the first time.

- Bas van Beek
- Charles Harris
- Nicholas McKibben +
- Pearu Peterson
- Ralf Gommers
- Sebastian Berg
- Tyler Reddy
- @Aerysv +

*Pull requests merged*

A total of 15 pull requests were merged for this release.

- gh-18306: MAINT: Add missing placeholder annotations
- gh-18310: BUG: Fix typo in ``numpy.__init__.py``
- gh-18326: BUG: don't mutate list of fake libraries while iterating over...
- gh-18327: MAINT: gracefully shuffle memoryviews
- gh-18328: BUG: Use C linkage for random distributions
- gh-18336: CI: fix when GitHub Actions builds trigger, and allow ci skips
- gh-18337: BUG: Allow unmodified use of isclose, allclose, etc. with timedelta
- gh-18345: BUG: Allow pickling all relevant DType types/classes
- gh-18351: BUG: Fix missing signed_char dependency. Closes #18335.
- gh-18352: DOC: Change license date 2020 -> 2021
- gh-18353: CI: CircleCI seems to occasionally time out, increase the limit
- gh-18354: BUG: Fix f2py bugs when wrapping F90 subroutines.
- gh-18356: MAINT: crackfortran regex simplify
- gh-18357: BUG: threads.h existence test requires GLIBC > 2.12.
- gh-18359: REL: Prepare for the NumPy 1.20.1 release.

Cheers,

Charles Harris
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kevin.k.sheppard at gmail.com  Mon Feb  8 03:03:34 2021
From: kevin.k.sheppard at gmail.com (Kevin Sheppard)
Date: Mon, 8 Feb 2021 08:03:34 +0000
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: 
Message-ID: 

If I understand correctly, there is no gain when applying this patch to
Generator. I'm not that surprised that this is the case, since the
compiler is much more limited in what it can do in Generator than when
filling a large array directly, with functions available for inlining and
unrolling. Again, if I understand correctly, I think it will be difficult
to justify breaking the stream for a negligible gain in performance.

Kevin

On Sat, Feb 6, 2021 at 12:27 PM wrote:

> Well, I can tell you why it needs to be backward compatible! I use random
> numbers fairly frequently, and to unit test them I set a specific seed and
> then make sure I get the same answers.
>
> Hmm, I guess that makes sense. I tried to adjust my algorithms to do the
> same thing with the same bits as the original one, but I couldn't get it
> to work.
>
> Have you benchmarked it using the generator interface? The structure of
> this as a non-monolithic generator makes it a good deal slower than
> generating in straight C (with everything inlined). I'm not sure a factor
> of 2 is enough to justify a change (for me 10x would be, 1.2x would not,
> but I don't know where the cutoff is).
>
> I originally benchmarked my implementation against a bunch of other ones
> in C (because I was developing a C library:
> https://github.com/camel-cdr/cauldron/blob/main/cauldron/random.h#L1928).
> But I did run the built-in benchmark ./runtests.py --bench
> bench_random.RNG.time_normal_zig and the results are:
>
>             new          old
> PCG64       589±3µs      1.06±0.03ms
> MT19937     985±4µs      1.44±0.01ms
> Philox      981±30µs     1.39±0.01ms
> SFC64       508±4µs      900±4µs
> numpy       2.99±0.06ms  2.98±0.01ms  # no change for /dev/urandom
>
> I'm not yet 100% certain about the implementations, but I attached a diff
> of my current progress.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ilhanpolat at gmail.com  Mon Feb  8 04:51:13 2021
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Mon, 8 Feb 2021 10:51:13 +0100
Subject: [Numpy-discussion] Pearu Peterson has joined the NumPy developers team.
In-Reply-To: <6fe3b84f-6ae0-4b01-934c-5cd8605a23bc@www.fastmail.com>
References: <6fe3b84f-6ae0-4b01-934c-5cd8605a23bc@www.fastmail.com>
Message-ID: 

This is very comforting news :) Welcome back

On Sun, Feb 7, 2021 at 9:10 PM Stefan van der Walt wrote:

> On Sun, Feb 7, 2021, at 10:12, Charles R Harris wrote:
>
> Pearu Peterson has joined the NumPy developers team. Pearu was responsible
> for contributing f2py and much of distutils in the early days of NumPy.
> Welcome back Pearu.
>
> Welcome back, it's good to see you around more, Pearu!
>
> Best regards,
> Stéfan
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Mon Feb  8 10:40:40 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 8 Feb 2021 10:40:40 -0500
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: 
Message-ID: 

On Mon, Feb 8, 2021 at 3:05 AM Kevin Sheppard wrote:

> If I understand correctly, there is no gain when applying this patch to
> Generator. I'm not that surprised that this is the case, since the
> compiler is much more limited in what it can do in Generator than when
> filling a large array directly, with functions available for inlining and
> unrolling. Again, if I understand correctly, I think it will be difficult
> to justify breaking the stream for a negligible gain in performance.
>
Can you explain your understanding of the benchmark results? To me, it
looks like nearly a 2x improvement with the faster BitGenerators (our
default PCG64 and SFC64). That may or may not be worth breaking the
stream, but it's far from negligible.

>> But I did run the built-in benchmark ./runtests.py --bench
>> bench_random.RNG.time_normal_zig and the results are:
>>
>>             new          old
>> PCG64       589±3µs      1.06±0.03ms
>> MT19937     985±4µs      1.44±0.01ms
>> Philox      981±30µs     1.39±0.01ms
>> SFC64       508±4µs      900±4µs
>> numpy       2.99±0.06ms  2.98±0.01ms  # no change for /dev/urandom
>>
-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kevin.k.sheppard at gmail.com  Mon Feb  8 10:52:05 2021
From: kevin.k.sheppard at gmail.com (Kevin Sheppard)
Date: Mon, 8 Feb 2021 15:52:05 +0000
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: 
Message-ID: <11BB0603-25E4-4B02-9B55-F783E54B51DA@hxcore.ol>

An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Mon Feb  8 11:05:27 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 8 Feb 2021 11:05:27 -0500
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: <11BB0603-25E4-4B02-9B55-F783E54B51DA@hxcore.ol>
References: <11BB0603-25E4-4B02-9B55-F783E54B51DA@hxcore.ol>
Message-ID: 

On Mon, Feb 8, 2021 at 10:53 AM Kevin Sheppard wrote:

> My reading is that the first 4 are pure C, presumably using the standard
> practice of inlining so as to make the tightest loop possible, and to
> allow the compiler to make other optimizations. The final line is what
> happens when you replace the existing ziggurat in NumPy with the new one.
> I read it this way since it has both "new" and "old" with numpy. If it
> isn't this, then I'm unsure what "new" and "old" could mean in the
> context of this thread.
>
No, these are our benchmarks of `Generator`. `numpy` is testing
`RandomState`, which wasn't touched by their contribution.

https://github.com/numpy/numpy/blob/master/benchmarks/benchmarks/bench_random.py#L93-L97
https://github.com/numpy/numpy/blob/master/benchmarks/benchmarks/bench_random.py#L123-L124

> I suppose camel-cdr can clarify what was actually done.
>
> But I did run the built-in benchmark: ./runtests.py --bench
> bench_random.RNG.time_normal_zig and the results are:
>
-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From kevin.k.sheppard at gmail.com  Mon Feb  8 11:36:51 2021
From: kevin.k.sheppard at gmail.com (Kevin Sheppard)
Date: Mon, 8 Feb 2021 16:36:51 +0000
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: <11BB0603-25E4-4B02-9B55-F783E54B51DA@hxcore.ol>
Message-ID: 

An HTML attachment was scrubbed...
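For readers following along, the distinction drawn above between the two
interfaces can be sketched as follows (the seed values are arbitrary):

```python
import numpy as np

# RandomState: the legacy, stream-frozen API (always MT19937-backed).
legacy = np.random.RandomState(1234)
# Generator: the newer API whose algorithms may change under NEP 19.
modern = np.random.Generator(np.random.PCG64(1234))

a = legacy.standard_normal(3)   # algorithm guaranteed never to change
b = modern.standard_normal(3)   # algorithm may be improved in a release

assert a.shape == b.shape == (3,)
# Different bit generators and sampling algorithms => different streams,
# even when constructed from the same seed value.
assert not np.allclose(a, b)
```

This is why the `numpy` row in the benchmark table (which times
`RandomState`) is unchanged by the patch, while the `Generator` rows are
not.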
URL: 

From robert.kern at gmail.com  Mon Feb  8 12:05:28 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 8 Feb 2021 12:05:28 -0500
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: <11BB0603-25E4-4B02-9B55-F783E54B51DA@hxcore.ol>
Message-ID: 

On Mon, Feb 8, 2021 at 11:38 AM Kevin Sheppard wrote:

> That is good news indeed. Seems like a good upgrade, especially given the
> breadth of application of normals and the multiple appearances within
> distributions.c (e.g., Cauchy). Is there a deprecation for a change like
> this? Or is it just a note and new random numbers in the next major? I
> think this is the first time a substantially new algo has replaced an
> existing one.
>
Per NEP 19, a change like this is a new feature that can be included in a
feature release, like any other feature. I would like to see some more
testing of the quality of the sequences beyond what has already been
quoted, using Kolmogorov-Smirnov and/or Anderson-Darling tests as well,
which should be more thorough.

https://github.com/scipy/scipy/blob/master/scipy/stats/tests/test_continuous_basic.py#L604-L620

There are also some subtle issues involved in ziggurat method
implementations that go beyond just the marginal distributions. I'm not
sure, even, that our current implementation deals with the issues raised
in the following paper, but I'd like to do no worse.

https://www.doornik.com/research/ziggurat.pdf

-- 
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Mon Feb  8 12:04:23 2021
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Mon, 08 Feb 2021 11:04:23 -0600
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: <11BB0603-25E4-4B02-9B55-F783E54B51DA@hxcore.ol>
Message-ID: 

On Mon, 2021-02-08 at 16:36 +0000, Kevin Sheppard wrote:
> That is good news indeed.
> Seems like a good upgrade, especially given the breadth of application
> of normals and the multiple appearances within distributions.c (e.g.,
> Cauchy). Is there a deprecation for a change like this? Or is it just a
> note and new random numbers in the next major? I think this is the first
> time a substantially new algo has replaced an existing one.
>

I don't think we can deprecate or even warn about it; that would just
result in noise that cannot be silenced. If we really think warnings are
necessary, it sounds like you would need an opt-in
`numpy.random.set_warn_if_streams_will_change()`. That sounds like
diminishing returns on first sight.

It may be good that this happens now, rather than in a few years when
adoption of the new API is probably still on the low side.

This type of change should be in the release notes undoubtedly and likely
a `.. versionchanged::` directive in the docstring.

Maybe the best thing would be to create a single, prominent but brief,
changelog listing all (or almost all) stream changes? (Possibly instead
of documenting it in the individual function as `.. versionchanged::`)
I am thinking just a table with:

* version changed
* short description
* affected functions
* how the stream changed (if that is ever relevant)

Cheers,

Sebastian

> Kevin
>
> From: Robert Kern
> Sent: Monday, February 8, 2021 4:06 PM
> To: Discussion of Numerical Python
> Subject: Re: [Numpy-discussion] Question about optimizing
> random_standard_normal
>
> On Mon, Feb 8, 2021 at 10:53 AM Kevin Sheppard <
> kevin.k.sheppard at gmail.com> wrote:
> > My reading is that the first 4 are pure C, presumably using the
> > standard practice of inlining so as to make the tightest loop
> > possible, and to allow the compiler to make other optimizations.
> > The final line is what happens when you replace the existing
> > ziggurat in NumPy with the new one. I read it this way since it has
> > both "new" and "old" with numpy.
> > If it isn't this, then I'm unsure what "new" and "old" could mean in
> > the context of this thread.
>
> No, these are our benchmarks of `Generator`. `numpy` is testing
> `RandomState`, which wasn't touched by their contribution.
>
> https://github.com/numpy/numpy/blob/master/benchmarks/benchmarks/bench_random.py#L93-L97
> https://github.com/numpy/numpy/blob/master/benchmarks/benchmarks/bench_random.py#L123-L124
>
> > I suppose camel-cdr can clarify what was actually done.
>
> > > But I did run the built-in benchmark ./runtests.py --bench
> > > bench_random.RNG.time_normal_zig and the results are:
>
> --
> Robert Kern
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From charlesr.harris at gmail.com  Mon Feb  8 17:37:11 2021
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 8 Feb 2021 15:37:11 -0700
Subject: [Numpy-discussion] Question about optimizing random_standard_normal
In-Reply-To: 
References: 
Message-ID: 

On Sat, Feb 6, 2021 at 2:32 AM wrote:

> I tried to implement a different version of the ziggurat method for
> generating standard normal distributions that is about twice as fast and
> uses 2/3 of the memory of the old one.
> I tested the implementation separately and am very confident it's correct,
> but it does fail 28 tests in coverage testing.
> Checking the testing code I found out that all the failed tests are inside
> TestRandomDist, which has the goal of "Mak[ing] sure the random
> distribution returns the correct value for a given seed". Why would this be
> The only explanation I can come up with is that it's standard_normal is, > in regards to seeding, required to be backwards compatible. If that's the > case how would, could one even implement a new algorithm? > Just for fun, I've attached the (C++) implementation that uses the exponentially extended base block. Note that the constructor produces the table. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: RandomNormal.hpp Type: text/x-c++hdr Size: 1913 bytes Desc: not available URL: From robert.kern at gmail.com Mon Feb 8 18:00:11 2021 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 8 Feb 2021 18:00:11 -0500 Subject: [Numpy-discussion] Question about optimizing random_standard_normal In-Reply-To: References: <11BB0603-25E4-4B02-9B55-F783E54B51DA@hxcore.ol> Message-ID: On Mon, Feb 8, 2021 at 12:10 PM Sebastian Berg wrote: > > This type of change should be in the release notes undoubtedly and > likely a `.. versionchanged::` directive in the docstring. > > Maybe the best thing would be to create a single, prominent but brief, > changelog listing all (or almost all) stream changes? (Possibly instead > of documenting it in the individual function as `.. versionchanged::`) > > I am thinking just a table with: > * version changed > * short description > * affected functions > * how the stream changed (if that is ever relevant) > Both are probably useful. Good ideas. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Feb 10 00:25:20 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 09 Feb 2021 23:25:20 -0600 Subject: [Numpy-discussion] NumPy Development Meeting Wednesday - Triage Focus Message-ID: Hi all, Our bi-weekly triage-focused NumPy development meeting is Wednesday, Feb 10th at 11 am Pacific Time (19:00 UTC). 
Everyone is invited to join in and edit the work-in-progress meeting
topics and notes: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg

I encourage everyone to notify us of issues or PRs that you feel should
be prioritized, discussed, or reviewed.

Best regards

Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From jfoxrabinovitz at gmail.com  Wed Feb 10 17:31:35 2021
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Wed, 10 Feb 2021 17:31:35 -0500
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
Message-ID: 

I've created PR#18386 to add a function called atleast_nd to numpy and
numpy.ma. This would generalize the existing atleast_1d, atleast_2d, and
atleast_3d functions.

I proposed a similar idea about four and a half years ago:
https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html,
PR#7804. The reception was ambivalent, but a couple of folks have asked me
about this, so I'm bringing it back.

Some pros:

- This closes issue #12336
- There are a couple of Stack Overflow questions that would benefit
- Been asked about this a couple of times
- Implementation of the three existing atleast_*d functions gets easier
- Looks nicer than the equivalent broadcasting and reshaping

Some cons:

- Cluttering up the API
- Maintenance burden (but not a big one)
- This is just a utility function, which can be achieved through
  broadcasting and reshaping

If this meets with approval, there are a couple of interface issues that
probably need to be hashed out:

- The consensus was that this function should accept a single array,
  rather than a tuple or multiple arrays as the other atleast_*d
  functions do. Does that need to be revisited?
- Right now, a `pos` argument specifies where to place new axes, if any.
  That can be specified in different ways.
Another way might be to specify the offset of the existing dimensions,
or something entirely different.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Wed Feb 10 17:48:30 2021
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 10 Feb 2021 16:48:30 -0600
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: 
Message-ID: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>

On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
> I've created PR#18386 to add a function called atleast_nd to numpy and
> numpy.ma. This would generalize the existing atleast_1d, atleast_2d,
> and atleast_3d functions.
>
> I proposed a similar idea about four and a half years ago:
> https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html,
> PR#7804. The reception was ambivalent, but a couple of folks have
> asked me about this, so I'm bringing it back.
>
> Some pros:
>
> - This closes issue #12336
> - There are a couple of Stack Overflow questions that would benefit
> - Been asked about this a couple of times
> - Implementation of the three existing atleast_*d functions gets easier
> - Looks nicer than the equivalent broadcasting and reshaping
>
> Some cons:
>
> - Cluttering up the API
> - Maintenance burden (but not a big one)
> - This is just a utility function, which can be achieved through
>   broadcasting and reshaping
>

My main concern would be the namespace cluttering. I can't say I use
even the `atleast_2d` etc. functions personally, so I would tend to be
slightly against the addition. But if others land on the "useful" side
here (and it seemed a bit at least on github), I am also not opposed.
It is a clean name that lines up with existing ones, so it doesn't seem
like a big "mental load" with respect to namespace cluttering.

Bike shedding the API is probably a good idea in any case.
I have pasted the current PR documentation (as html) below for quick
reference. I wonder a bit about the reasoning for having `pos` specify a
value rather than just a side?


numpy.atleast_nd(ary, ndim, pos=0)

    View input as array with at least ndim dimensions. New unit
    dimensions are inserted at the index given by pos if necessary.

    Parameters
    ----------
    ary : array_like
        The input array. Non-array inputs are converted to arrays.
        Arrays that already have ndim or more dimensions are preserved.
    ndim : int
        The minimum number of dimensions required.
    pos : int, optional
        The index to insert the new dimensions. May range from
        -ary.ndim - 1 to +ary.ndim (inclusive). Non-negative indices
        indicate locations before the corresponding axis: pos=0 means
        to insert at the very beginning. Negative indices indicate
        locations after the corresponding axis: pos=-1 means to insert
        at the very end. 0 and -1 are always guaranteed to work. Any
        other number will depend on the dimensions of the existing
        array. Default is 0.

    Returns
    -------
    res : ndarray
        An array with res.ndim >= ndim. A view is returned for array
        inputs. Dimensions are prepended if pos is 0, so for example,
        a 1-D array of shape (N,) with ndim=4 becomes a view of shape
        (1, 1, 1, N). Dimensions are appended if pos is -1, so for
        example a 2-D array of shape (M, N) becomes a view of shape
        (M, N, 1, 1) when ndim=4.

    See also
    --------
    atleast_1d, atleast_2d, atleast_3d

    Notes
    -----
    This function does not follow the convention of the other
    atleast_*d functions in numpy in that it only accepts a single
    array argument. To process multiple arrays, use a comprehension or
    loop around the function call. See examples below.

    Setting pos=0 is equivalent to how the array would be interpreted
    by numpy's broadcasting rules. There is no need to call this
    function for simple broadcasting. This is also roughly (but not
    exactly) equivalent to np.array(ary, copy=False, subok=True,
    ndmin=ndim).
    It is easy to create functions for specific dimensions similar to
    the other atleast_*d functions using Python's functools.partial
    function. An example is shown below.

    Examples
    --------
    >>> np.atleast_nd(3.0, 4)
    array([[[[ 3.]]]])

    >>> x = np.arange(3.0)
    >>> np.atleast_nd(x, 2).shape
    (1, 3)

    >>> x = np.arange(12.0).reshape(4, 3)
    >>> np.atleast_nd(x, 5).shape
    (1, 1, 1, 4, 3)
    >>> np.atleast_nd(x, 5).base is x.base
    True

    >>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
    [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]

    >>> np.atleast_nd((1, 2), 5, pos=0).shape
    (1, 1, 1, 1, 2)
    >>> np.atleast_nd((1, 2), 5, pos=-1).shape
    (2, 1, 1, 1, 1)

    >>> from functools import partial
    >>> atleast_4d = partial(np.atleast_nd, ndim=4)
    >>> atleast_4d([1, 2, 3])
    array([[[[1, 2, 3]]]])
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From jni at fastmail.com  Thu Feb 11 00:46:50 2021
From: jni at fastmail.com (Juan Nunez-Iglesias)
Date: Thu, 11 Feb 2021 16:46:50 +1100
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

I totally agree with the namespace clutter concern, but honestly, I would
use `atleast_nd` with its `pos` argument (I might rename it to `position`,
`axis`, or `axis_position`) any day over `atleast_{1,2,3}d`, for which I
had no idea where the new axes would end up.

So, I'm in favour of including it, and optionally deprecating
`atleast_{1,2,3}d`.

Juan.

> On 11 Feb 2021, at 9:48 am, Sebastian Berg wrote:
>
> On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
>> I've created PR#18386 to add a function called atleast_nd to numpy and
>> numpy.ma.
This would generalize the existing atleast_1d, atleast_2d, and
>> atleast_3d functions.
>>
>> I proposed a similar idea about four and a half years ago:
>> https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html ,
>> PR#7804. The reception was ambivalent, but a couple of folks have asked me
>> about this, so I'm bringing it back.
>>
>> Some pros:
>>
>> - This closes issue #12336
>> - There are a couple of Stack Overflow questions that would benefit
>> - Been asked about this a couple of times
>> - Implementation of three existing atleast_*d functions gets easier
>> - Looks nicer than the equivalent broadcasting and reshaping
>>
>> Some cons:
>>
>> - Cluttering up the API
>> - Maintenance burden (but not a big one)
>> - This is just a utility function, which can be achieved through
>> broadcasting and reshaping
>
> My main concern would be the namespace cluttering. I can't say I use even
> the `atleast_2d` etc. functions personally, so I would tend to be slightly
> against the addition. But if others land on the "useful" side here (and it
> seemed a bit at least on github), I am also not opposed. It is a clean
> name that lines up with existing ones, so it doesn't seem like a big
> "mental load" with respect to namespace cluttering.
>
> Bike shedding the API is probably a good idea in any case.
>
> I have pasted the current PR documentation (as html) below for quick
> reference. I wonder a bit about the reasoning for having `pos` specify a
> value rather than just a side?
>
> [...]
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From shoyer at gmail.com Thu Feb 11 01:55:30 2021
From: shoyer at gmail.com (Stephan Hoyer)
Date: Wed, 10 Feb 2021 22:55:30 -0800
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias wrote:

> I totally agree with the namespace clutter concern, but honestly, I would
> use `atleast_nd` with its `pos` argument (I might rename it to `position`,
> `axis`, or `axis_position`) any day over `atleast_{1,2,3}d`, for which I
> had no idea where the new axes would end up.
>
> So, I'm in favour of including it, and optionally deprecating
> `atleast_{1,2,3}d`.

I appreciate that `atleast_nd` feels more sensible than `atleast_{1,2,3}d`, but I don't think "better" than a pattern we would not recommend is a good enough reason for inclusion in NumPy. It needs to stand on its own.

What would be the recommended use-cases for this new function? Have any libraries building on top of NumPy implemented a version of this?

> Juan.
> On 11 Feb 2021, at 9:48 am, Sebastian Berg wrote:
>
> [...]

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ben.v.root at gmail.com Thu Feb 11 12:40:16 2021
From: ben.v.root at gmail.com (Benjamin Root)
Date: Thu, 11 Feb 2021 12:40:16 -0500
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

For me, I find that the atleast_{1,2,3}d functions are useful for sanitizing inputs. Having an atleast_nd() function can be viewed as a step towards cleaning up the API, not cluttering it (although the deprecation period for the existing functions should probably be long, given how long they have existed).
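[As a minimal sketch of the sanitizing pattern in question; the function name and data are made up for illustration, not taken from any particular library:]

```python
import numpy as np

def row_sums(data):
    """Accept a single row or a stack of rows alike."""
    # atleast_2d promotes a scalar or 1-D input by prepending axes,
    # so the loop below can always assume a 2-D array.
    data = np.atleast_2d(data)
    return [int(row.sum()) for row in data]

row_sums([1, 2, 3])          # one row -> [6]
row_sums([[1, 2], [3, 4]])   # two rows -> [3, 7]
```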
On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer wrote:

> [...]
>
> What would be the recommended use-cases for this new function?
> Have any libraries building on top of NumPy implemented a version of this?
>
> [...]
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

-------------- next part
--------------
An HTML attachment was scrubbed...
URL: 

From jfoxrabinovitz at gmail.com Thu Feb 11 12:48:40 2021
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Thu, 11 Feb 2021 12:48:40 -0500
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

The original functions appear to have been written for things like *stack, which goes a long way toward explaining the inconsistent argument list.

- Joe

On Thu, Feb 11, 2021, 12:41 Benjamin Root wrote:

> [...]
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From wieser.eric+numpy at gmail.com Thu Feb 11 13:12:51 2021
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Thu, 11 Feb 2021 18:12:51 +0000
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

> I find that the atleast_{1,2,3}d functions are useful for sanitizing inputs

IMO, this type of "sanitization" goes against "In the face of ambiguity, refuse the temptation to guess".
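[As a small illustration of the ambiguity: whether a 1-D input was meant as a row or as a column is a guess. The sketch below uses plain NumPy; the helper name is made up:]

```python
import numpy as np

x = np.arange(3.0)                  # shape (3,)

# atleast_2d silently decides that x is a row vector...
print(np.atleast_2d(x).shape)       # (1, 3)
# ...but the caller may just as well have meant a column:
print(x.reshape(-1, 1).shape)       # (3, 1)

# Validating instead of coercing surfaces the mismatch immediately:
def require_2d(a):
    a = np.asarray(a)
    if a.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {a.ndim}-D")
    return a

try:
    require_2d(x)
except ValueError as e:
    print(e)                        # expected a 2-D array, got 1-D
```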
Instead of using `at_least{n}d`, it could be argued that `if np.ndim(x) != n: raise ValueError` is a safer bet, which forces the user to think about what's actually going on and saves them from silent headaches.

Of course, this is just an argument for discouraging users from using these functions, and for the fact that we perhaps should not have had them in the first place. Given we already have some of them, adding `atleast_nd` probably isn't going to make things any worse. In principle, it could actually make things better: we could put a "Notes" section in the new function's docs that describes the XY problem that makes atleast_nd look like a better solution than it is, and that presents better alternatives; the other three functions' docs could link there.

Eric

On Thu, 11 Feb 2021 at 17:41, Benjamin Root wrote:

> [...]
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From shoyer at gmail.com Thu Feb 11 13:13:20 2021
From: shoyer at gmail.com (Stephan Hoyer)
Date: Thu, 11 Feb 2021 10:13:20 -0800
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root wrote:

> For me, I find that the atleast_{1,2,3}d functions are useful for
> sanitizing inputs. Having an atleast_nd() function can be viewed as a step
> towards cleaning up the API, not cluttering it (although the deprecation
> period for the existing functions should probably be long, given how long
> they have existed).

I would love to see examples of this -- perhaps in matplotlib? My thinking is that in most cases it's probably a better idea to keep the interface simpler, and raise an error for lower-dimensional arrays. Automatic conversion is convenient (and endemic within the SciPy ecosystem), but is also a common source of bugs.

> On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer wrote:
>
>> On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias wrote:
>>
>>> I totally agree with the namespace clutter concern, but honestly, I
>>> would use `atleast_nd` with its `pos` argument (I might rename it to
>>> `position`, `axis`, or `axis_position`) any day over `atleast_{1,2,3}d`,
>>> for which I had no idea where the new axes would end up.
>>>
>>> So, I'm in favour of including it, and optionally deprecating
>>> `atleast_{1,2,3}d`.
>>>
>>> >>> On 11 Feb 2021, at 9:48 am, Sebastian Berg >>> wrote: >>> >>> On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote: >>> >>> I've created PR#18386 to add a function called atleast_nd to numpy and >>> numpy.ma. This would generalize the existing atleast_1d, atleast_2d, and >>> atleast_3d functions. >>> >>> I proposed a similar idea about four and a half years ago: >>> https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html >>> , >>> PR#7804. The reception was ambivalent, but a couple of folks have asked >>> me >>> about this, so I'm bringing it back. >>> >>> Some pros: >>> >>> - This closes issue #12336 >>> - There are a couple of Stack Overflow questions that would benefit >>> - Been asked about this a couple of times >>> - Implementation of three existing atleast_*d functions gets easier >>> - Looks nicer that the equivalent broadcasting and reshaping >>> >>> Some cons: >>> >>> - Cluttering up the API >>> - Maintenance burden (but not a big one) >>> - This is just a utility function, which can be achieved through >>> broadcasting and reshaping >>> >>> >>> My main concern would be the namespace cluttering. I can't say I use >>> even the `atleast_2d` etc. functions personally, so I would tend to be >>> slightly against the addition. But if others land on the "useful" side here >>> (and it seemed a bit at least on github), I am also not opposed. It is a >>> clean name that lines up with existing ones, so it doesn't seem like a big >>> "mental load" with respect to namespace cluttering. >>> >>> Bike shedding the API is probably a good idea in any case. >>> >>> I have pasted the current PR documentation (as html) below for quick >>> reference. I wonder a bit about the reasoning for having `pos` specify a >>> value rather than just a side? >>> >>> >>> >>> numpy.atleast_nd(*ary*, *ndim*, *pos=0*) >>> View input as array with at least ndim dimensions. >>> New unit dimensions are inserted at the index given by *pos* if >>> necessary. 
>>> Parameters
>>> *ary* : array_like
>>> The input array. Non-array inputs are converted to arrays. Arrays that
>>> already have ndim or more dimensions are preserved.
>>> *ndim* : int
>>> The minimum number of dimensions required.
>>> *pos* : int, optional
>>> The index to insert the new dimensions. May range from -ary.ndim - 1 to
>>> +ary.ndim (inclusive). Non-negative indices indicate locations before
>>> the corresponding axis: pos=0 means to insert at the very beginning.
>>> Negative indices indicate locations after the corresponding axis: pos=-1
>>> means to insert at the very end. 0 and -1 are always guaranteed to
>>> work. Any other number will depend on the dimensions of the existing array.
>>> Default is 0.
>>> Returns
>>> *res* : ndarray
>>> An array with res.ndim >= ndim. A view is returned for array inputs.
>>> Dimensions are prepended if *pos* is 0, so for example, a 1-D array of
>>> shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions
>>> are appended if *pos* is -1, so for example a 2-D array of shape (M, N)
>>> becomes a view of shape (M, N, 1, 1) when ndim=4.
>>> *See also*
>>> atleast_1d, atleast_2d, atleast_3d
>>> *Notes*
>>> This function does not follow the convention of the other atleast_*d
>>> functions in numpy in that it only accepts a single array argument. To
>>> process multiple arrays, use a comprehension or loop around the function
>>> call. See examples below.
>>> Setting pos=0 is equivalent to how the array would be interpreted by
>>> numpy's broadcasting rules. There is no need to call this function for
>>> simple broadcasting. This is also roughly (but not exactly) equivalent to
>>> np.array(ary, copy=False, subok=True, ndmin=ndim).
>>> It is easy to create functions for specific dimensions similar to the
>>> other atleast_*d functions using Python's functools.partial function.
>>> An example is shown below.
>>> *Examples*
>>>
>>> >>> np.atleast_nd(3.0, 4)
>>> array([[[[ 3.]]]])
>>>
>>> >>> x = np.arange(3.0)
>>> >>> np.atleast_nd(x, 2).shape
>>> (1, 3)
>>>
>>> >>> x = np.arange(12.0).reshape(4, 3)
>>> >>> np.atleast_nd(x, 5).shape
>>> (1, 1, 1, 4, 3)
>>> >>> np.atleast_nd(x, 5).base is x.base
>>> True
>>>
>>> >>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
>>> [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
>>>
>>> >>> np.atleast_nd((1, 2), 5, pos=0).shape
>>> (1, 1, 1, 1, 2)
>>> >>> np.atleast_nd((1, 2), 5, pos=-1).shape
>>> (2, 1, 1, 1, 1)
>>>
>>> >>> from functools import partial
>>> >>> atleast_4d = partial(np.atleast_nd, ndim=4)
>>> >>> atleast_4d([1, 2, 3])
>>> array([[[[1, 2, 3]]]])
>>>
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ben.v.root at gmail.com  Thu Feb 11 13:26:29 2021
From: ben.v.root at gmail.com (Benjamin Root)
Date: Thu, 11 Feb 2021 13:26:29 -0500
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To:
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID:

My original use case for these was dealing with output data from Matlab
where those users would use `squeeze()` quite liberally.
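To make the discussion concrete: the pos/ndim semantics described in the quoted PR documentation can be sketched in a few lines of plain NumPy. This is only a minimal illustration of the documented rules, not the implementation from the PR, and `atleast_nd_sketch` is a hypothetical name:

```python
import numpy as np

def atleast_nd_sketch(ary, ndim, pos=0):
    """Sketch of the documented behaviour: view `ary` with at least
    `ndim` dimensions, inserting new length-1 axes at index `pos`."""
    ary = np.asanyarray(ary)
    extra = ndim - ary.ndim
    if extra <= 0:
        return ary  # already has enough dimensions; preserved as-is
    # Per the quoted docs: non-negative pos inserts *before* that axis,
    # negative pos inserts *after* it (pos=-1 appends at the end).
    if pos < 0:
        pos += ary.ndim + 1
    new_shape = ary.shape[:pos] + (1,) * extra + ary.shape[pos:]
    return ary.reshape(new_shape)  # a view for array inputs

print(atleast_nd_sketch(3.0, 4).shape)              # (1, 1, 1, 1)
print(atleast_nd_sketch(np.ones((4, 3)), 5).shape)  # (1, 1, 1, 4, 3)
print(atleast_nd_sketch((1, 2), 5, pos=-1).shape)   # (2, 1, 1, 1, 1)
```

Note how the default pos=0 prepends axes, matching NumPy's own broadcasting convention of adding leading dimensions.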
In addition, there was the problem of the implicit squeeze() in the numpy's loadtxt() for which I added the ndmin kwarg for in case an input CSV file had just one row or no rows. np.atleast_1d() is used in matplotlib in a bunch of places where inputs are allowed to be scalar or lists. On Thu, Feb 11, 2021 at 1:15 PM Stephan Hoyer wrote: > On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root > wrote: > >> for me, I find that the at_least{1,2,3}d functions are useful for >> sanitizing inputs. Having an at_leastnd() function can be viewed as a step >> towards cleaning up the API, not cluttering it (although, deprecations of >> the existing functions probably should be long given how long they have >> existed). >> > > I would love to see examples of this -- perhaps in matplotlib? > > My thinking is that in most cases it's probably a better idea to keep the > interface simpler, and raise an error for lower-dimensional arrays. > Automatic conversion is convenient (and endemic within the SciPy > ecosystem), but is also a common source of bugs. > > On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer wrote: >> >>> On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias >>> wrote: >>> >>>> I totally agree with the namespace clutter concern, but honestly, I >>>> would use `atleast_nd` with its `pos` argument (I might rename it to >>>> `position`, `axis`, or `axis_position`) any day over `at_least{1,2,3}d`, >>>> for which I had no idea where the new axes would end up. >>>> >>>> So, I?m in favour of including it, and optionally deprecating >>>> `atleast_{1,2,3}d`. >>>> >>>> >>> I appreciate that `atleast_nd` feels more sensible than >>> `at_least{1,2,3}d`, but I don't think "better" than a pattern we would not >>> recommend is a good enough reason for inclusion in NumPy. It needs to stand >>> on its own. >>> >>> What would be the recommended use-cases for this new function? >>> Have any libraries building on top of NumPy implemented a version of >>> this? >>> >>> >>>> Juan. 
>>>> >>>> On 11 Feb 2021, at 9:48 am, Sebastian Berg >>>> wrote: >>>> >>>> On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote: >>>> >>>> I've created PR#18386 to add a function called atleast_nd to numpy and >>>> numpy.ma. This would generalize the existing atleast_1d, atleast_2d, >>>> and >>>> atleast_3d functions. >>>> >>>> I proposed a similar idea about four and a half years ago: >>>> https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html >>>> , >>>> PR#7804. The reception was ambivalent, but a couple of folks have asked >>>> me >>>> about this, so I'm bringing it back. >>>> >>>> Some pros: >>>> >>>> - This closes issue #12336 >>>> - There are a couple of Stack Overflow questions that would benefit >>>> - Been asked about this a couple of times >>>> - Implementation of three existing atleast_*d functions gets easier >>>> - Looks nicer that the equivalent broadcasting and reshaping >>>> >>>> Some cons: >>>> >>>> - Cluttering up the API >>>> - Maintenance burden (but not a big one) >>>> - This is just a utility function, which can be achieved through >>>> broadcasting and reshaping >>>> >>>> >>>> My main concern would be the namespace cluttering. I can't say I use >>>> even the `atleast_2d` etc. functions personally, so I would tend to be >>>> slightly against the addition. But if others land on the "useful" side here >>>> (and it seemed a bit at least on github), I am also not opposed. It is a >>>> clean name that lines up with existing ones, so it doesn't seem like a big >>>> "mental load" with respect to namespace cluttering. >>>> >>>> Bike shedding the API is probably a good idea in any case. >>>> >>>> I have pasted the current PR documentation (as html) below for quick >>>> reference. I wonder a bit about the reasoning for having `pos` specify a >>>> value rather than just a side? >>>> >>>> >>>> >>>> numpy.atleast_nd(*ary*, *ndim*, *pos=0*) >>>> View input as array with at least ndim dimensions. 
>>>> New unit dimensions are inserted at the index given by *pos* if
>>>> necessary.
>>>> Parameters
>>>> *ary* : array_like
>>>> The input array. Non-array inputs are converted to arrays. Arrays that
>>>> already have ndim or more dimensions are preserved.
>>>> *ndim* : int
>>>> The minimum number of dimensions required.
>>>> *pos* : int, optional
>>>> The index to insert the new dimensions. May range from -ary.ndim - 1 to
>>>> +ary.ndim (inclusive). Non-negative indices indicate locations before
>>>> the corresponding axis: pos=0 means to insert at the very beginning.
>>>> Negative indices indicate locations after the corresponding axis:
>>>> pos=-1 means to insert at the very end. 0 and -1 are always guaranteed
>>>> to work. Any other number will depend on the dimensions of the existing
>>>> array. Default is 0.
>>>> Returns
>>>> *res* : ndarray
>>>> An array with res.ndim >= ndim. A view is returned for array inputs.
>>>> Dimensions are prepended if *pos* is 0, so for example, a 1-D array of
>>>> shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions
>>>> are appended if *pos* is -1, so for example a 2-D array of shape (M, N)
>>>> becomes a view of shape (M, N, 1, 1) when ndim=4.
>>>> *See also*
>>>> atleast_1d, atleast_2d, atleast_3d
>>>> *Notes*
>>>> This function does not follow the convention of the other atleast_*d
>>>> functions in numpy in that it only accepts a single array argument. To
>>>> process multiple arrays, use a comprehension or loop around the
>>>> function call. See examples below.
>>>> Setting pos=0 is equivalent to how the array would be interpreted by
>>>> numpy's broadcasting rules. There is no need to call this function for
>>>> simple broadcasting. This is also roughly (but not exactly) equivalent
>>>> to np.array(ary, copy=False, subok=True, ndmin=ndim).
>>>> It is easy to create functions for specific dimensions similar to the
>>>> other atleast_*d functions using Python's functools.partial function.
>>>> An example is shown below.
>>>> *Examples*
>>>>
>>>> >>> np.atleast_nd(3.0, 4)
>>>> array([[[[ 3.]]]])
>>>>
>>>> >>> x = np.arange(3.0)
>>>> >>> np.atleast_nd(x, 2).shape
>>>> (1, 3)
>>>>
>>>> >>> x = np.arange(12.0).reshape(4, 3)
>>>> >>> np.atleast_nd(x, 5).shape
>>>> (1, 1, 1, 4, 3)
>>>> >>> np.atleast_nd(x, 5).base is x.base
>>>> True
>>>>
>>>> >>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
>>>> [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
>>>>
>>>> >>> np.atleast_nd((1, 2), 5, pos=0).shape
>>>> (1, 1, 1, 1, 2)
>>>> >>> np.atleast_nd((1, 2), 5, pos=-1).shape
>>>> (2, 1, 1, 1, 1)
>>>>
>>>> >>> from functools import partial
>>>> >>> atleast_4d = partial(np.atleast_nd, ndim=4)
>>>> >>> atleast_4d([1, 2, 3])
>>>> array([[[[1, 2, 3]]]])
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From wieser.eric+numpy at gmail.com Thu Feb 11 13:32:42 2021 From: wieser.eric+numpy at gmail.com (Eric Wieser) Date: Thu, 11 Feb 2021 18:32:42 +0000 Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function In-Reply-To: References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net> Message-ID: I did a quick search of matplotlib, and found a few uses of all three functions: * https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446 This one isn't really numpy at all, and is really just a shorthand for normalizing an argument `x=n` to `x=[n, n]` * https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888 This one is the classic "either multivariate or single-variable data" thing endemic to the SciPy ecosystem. * https://github.com/matplotlib/matplotlib/blob/1eef019109b64ee4085732544cb5e310e69451ab/lib/matplotlib/cbook/__init__.py#L1325-L1326 Matplotlib has their own `_check_1d` function for input sanitization, although github says it's only used to parse the arguments to `plot`, which at this point are fairly established as being flexible. * https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631 This just looks like "defensive programming", and if the argument isn't already 3d then something is probably wrong. This isn't an exhaustive list, just a handful of different situations the functions were used. Eric On Thu, 11 Feb 2021 at 18:15, Stephan Hoyer wrote: > On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root > wrote: > >> for me, I find that the at_least{1,2,3}d functions are useful for >> sanitizing inputs. Having an at_leastnd() function can be viewed as a step >> towards cleaning up the API, not cluttering it (although, deprecations of >> the existing functions probably should be long given how long they have >> existed). 
>> > > I would love to see examples of this -- perhaps in matplotlib? > > My thinking is that in most cases it's probably a better idea to keep the > interface simpler, and raise an error for lower-dimensional arrays. > Automatic conversion is convenient (and endemic within the SciPy > ecosystem), but is also a common source of bugs. > > On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer wrote: >> >>> On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias >>> wrote: >>> >>>> I totally agree with the namespace clutter concern, but honestly, I >>>> would use `atleast_nd` with its `pos` argument (I might rename it to >>>> `position`, `axis`, or `axis_position`) any day over `at_least{1,2,3}d`, >>>> for which I had no idea where the new axes would end up. >>>> >>>> So, I?m in favour of including it, and optionally deprecating >>>> `atleast_{1,2,3}d`. >>>> >>>> >>> I appreciate that `atleast_nd` feels more sensible than >>> `at_least{1,2,3}d`, but I don't think "better" than a pattern we would not >>> recommend is a good enough reason for inclusion in NumPy. It needs to stand >>> on its own. >>> >>> What would be the recommended use-cases for this new function? >>> Have any libraries building on top of NumPy implemented a version of >>> this? >>> >>> >>>> Juan. >>>> >>>> On 11 Feb 2021, at 9:48 am, Sebastian Berg >>>> wrote: >>>> >>>> On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote: >>>> >>>> I've created PR#18386 to add a function called atleast_nd to numpy and >>>> numpy.ma. This would generalize the existing atleast_1d, atleast_2d, >>>> and >>>> atleast_3d functions. >>>> >>>> I proposed a similar idea about four and a half years ago: >>>> https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html >>>> , >>>> PR#7804. The reception was ambivalent, but a couple of folks have asked >>>> me >>>> about this, so I'm bringing it back. 
>>>> >>>> Some pros: >>>> >>>> - This closes issue #12336 >>>> - There are a couple of Stack Overflow questions that would benefit >>>> - Been asked about this a couple of times >>>> - Implementation of three existing atleast_*d functions gets easier >>>> - Looks nicer that the equivalent broadcasting and reshaping >>>> >>>> Some cons: >>>> >>>> - Cluttering up the API >>>> - Maintenance burden (but not a big one) >>>> - This is just a utility function, which can be achieved through >>>> broadcasting and reshaping >>>> >>>> >>>> My main concern would be the namespace cluttering. I can't say I use >>>> even the `atleast_2d` etc. functions personally, so I would tend to be >>>> slightly against the addition. But if others land on the "useful" side here >>>> (and it seemed a bit at least on github), I am also not opposed. It is a >>>> clean name that lines up with existing ones, so it doesn't seem like a big >>>> "mental load" with respect to namespace cluttering. >>>> >>>> Bike shedding the API is probably a good idea in any case. >>>> >>>> I have pasted the current PR documentation (as html) below for quick >>>> reference. I wonder a bit about the reasoning for having `pos` specify a >>>> value rather than just a side? >>>> >>>> >>>> >>>> numpy.atleast_nd(*ary*, *ndim*, *pos=0*) >>>> View input as array with at least ndim dimensions. >>>> New unit dimensions are inserted at the index given by *pos* if >>>> necessary. >>>> Parameters*ary *array_like >>>> The input array. Non-array inputs are converted to arrays. Arrays that >>>> already have ndim or more dimensions are preserved. >>>> *ndim *int >>>> The minimum number of dimensions required. >>>> *pos *int, optional >>>> The index to insert the new dimensions. May range from -ary.ndim - 1 to >>>> +ary.ndim (inclusive). Non-negative indices indicate locations before >>>> the corresponding axis: pos=0 means to insert at the very beginning. 
>>>> Negative indices indicate locations after the corresponding axis:
>>>> pos=-1 means to insert at the very end. 0 and -1 are always guaranteed
>>>> to work. Any other number will depend on the dimensions of the existing
>>>> array. Default is 0.
>>>> Returns
>>>> *res* : ndarray
>>>> An array with res.ndim >= ndim. A view is returned for array inputs.
>>>> Dimensions are prepended if *pos* is 0, so for example, a 1-D array of
>>>> shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions
>>>> are appended if *pos* is -1, so for example a 2-D array of shape (M, N)
>>>> becomes a view of shape (M, N, 1, 1) when ndim=4.
>>>> *See also*
>>>> atleast_1d, atleast_2d, atleast_3d
>>>> *Notes*
>>>> This function does not follow the convention of the other atleast_*d
>>>> functions in numpy in that it only accepts a single array argument. To
>>>> process multiple arrays, use a comprehension or loop around the
>>>> function call. See examples below.
>>>> Setting pos=0 is equivalent to how the array would be interpreted by
>>>> numpy's broadcasting rules. There is no need to call this function for
>>>> simple broadcasting. This is also roughly (but not exactly) equivalent
>>>> to np.array(ary, copy=False, subok=True, ndmin=ndim).
>>>> It is easy to create functions for specific dimensions similar to the
>>>> other atleast_*d functions using Python's functools.partial function.
>>>> An example is shown below.
>>>> *Examples*
>>>>
>>>> >>> np.atleast_nd(3.0, 4)
>>>> array([[[[ 3.]]]])
>>>>
>>>> >>> x = np.arange(3.0)
>>>> >>> np.atleast_nd(x, 2).shape
>>>> (1, 3)
>>>>
>>>> >>> x = np.arange(12.0).reshape(4, 3)
>>>> >>> np.atleast_nd(x, 5).shape
>>>> (1, 1, 1, 4, 3)
>>>> >>> np.atleast_nd(x, 5).base is x.base
>>>> True
>>>>
>>>> >>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
>>>> [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
>>>>
>>>> >>> np.atleast_nd((1, 2), 5, pos=0).shape
>>>> (1, 1, 1, 1, 2)
>>>> >>> np.atleast_nd((1, 2), 5, pos=-1).shape
>>>> (2, 1, 1, 1, 1)
>>>>
>>>> >>> from functools import partial
>>>> >>> atleast_4d = partial(np.atleast_nd, ndim=4)
>>>> >>> atleast_4d([1, 2, 3])
>>>> array([[[[1, 2, 3]]]])
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jni at fastmail.com  Thu Feb 11 21:31:56 2021
From: jni at fastmail.com (Juan Nunez-Iglesias)
Date: Fri, 12 Feb 2021 13:31:56 +1100
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To:
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID:

both napari and scikit-image use atleast_ a few times. I don't have many
examples of where I used nd because it didn't exist. But I have the very
distinct impression of needing it repeatedly. In some places, I've used
`np.broadcast_to` to signal the same intention, where `atleast_nd` would
have been the more readable solution.

I don't buy the argument that it's just a way to mask errors. NumPy
broadcasting also has that same potential but I hope no one would seriously
consider deprecating it. Indeed, even if we accept that we (library
authors) should force users to provide an array of the right
dimensionality, that still argues for making it convenient for users to do
that!

I don't feel super strongly about this. But I think atleast_nd is a move
in a positive direction and I'd prefer it to what's there now:

In [1]: import numpy as np

In [2]: np.atleast_3d(np.ones(4)).shape
Out[2]: (1, 4, 1)

There might be some linear algebraic reason why those axis positions make
sense, but I'm not aware of it...

Juan.

> On 12 Feb 2021, at 5:32 am, Eric Wieser wrote:
>
> I did a quick search of matplotlib, and found a few uses of all three functions:
>
> * https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446
> This one isn't really numpy at all, and is really just a shorthand for normalizing an argument `x=n` to `x=[n, n]`
> * https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888
> This one is the classic "either multivariate or single-variable data" thing endemic to the SciPy ecosystem.
> * https://github.com/matplotlib/matplotlib/blob/1eef019109b64ee4085732544cb5e310e69451ab/lib/matplotlib/cbook/__init__.py#L1325-L1326 > Matplotlib has their own `_check_1d` function for input sanitization, although github says it's only used to parse the arguments to `plot`, which at this point are fairly established as being flexible. > * https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631 > This just looks like "defensive programming", and if the argument isn't already 3d then something is probably wrong. > > This isn't an exhaustive list, just a handful of different situations the functions were used. > > Eric > > > > On Thu, 11 Feb 2021 at 18:15, Stephan Hoyer > wrote: > On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root > wrote: > for me, I find that the at_least{1,2,3}d functions are useful for sanitizing inputs. Having an at_leastnd() function can be viewed as a step towards cleaning up the API, not cluttering it (although, deprecations of the existing functions probably should be long given how long they have existed). > > I would love to see examples of this -- perhaps in matplotlib? > > My thinking is that in most cases it's probably a better idea to keep the interface simpler, and raise an error for lower-dimensional arrays. Automatic conversion is convenient (and endemic within the SciPy ecosystem), but is also a common source of bugs. > > On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer > wrote: > On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias > wrote: > I totally agree with the namespace clutter concern, but honestly, I would use `atleast_nd` with its `pos` argument (I might rename it to `position`, `axis`, or `axis_position`) any day over `at_least{1,2,3}d`, for which I had no idea where the new axes would end up. > > So, I?m in favour of including it, and optionally deprecating `atleast_{1,2,3}d`. 
> > > I appreciate that `atleast_nd` feels more sensible than `at_least{1,2,3}d`, but I don't think "better" than a pattern we would not recommend is a good enough reason for inclusion in NumPy. It needs to stand on its own. > > What would be the recommended use-cases for this new function? > Have any libraries building on top of NumPy implemented a version of this? > > Juan. > >> On 11 Feb 2021, at 9:48 am, Sebastian Berg > wrote: >> >> On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote: >>> I've created PR#18386 to add a function called atleast_nd to numpy and >>> numpy.ma . This would generalize the existing atleast_1d, atleast_2d, and >>> atleast_3d functions. >>> >>> I proposed a similar idea about four and a half years ago: >>> https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html , >>> PR#7804. The reception was ambivalent, but a couple of folks have asked me >>> about this, so I'm bringing it back. >>> >>> Some pros: >>> >>> - This closes issue #12336 >>> - There are a couple of Stack Overflow questions that would benefit >>> - Been asked about this a couple of times >>> - Implementation of three existing atleast_*d functions gets easier >>> - Looks nicer that the equivalent broadcasting and reshaping >>> >>> Some cons: >>> >>> - Cluttering up the API >>> - Maintenance burden (but not a big one) >>> - This is just a utility function, which can be achieved through >>> broadcasting and reshaping >>> >> >> My main concern would be the namespace cluttering. I can't say I use even the `atleast_2d` etc. functions personally, so I would tend to be slightly against the addition. But if others land on the "useful" side here (and it seemed a bit at least on github), I am also not opposed. It is a clean name that lines up with existing ones, so it doesn't seem like a big "mental load" with respect to namespace cluttering. >> >> Bike shedding the API is probably a good idea in any case. 
>>
>> I have pasted the current PR documentation (as html) below for quick reference. I wonder a bit about the reasoning for having `pos` specify a value rather than just a side?
>>
>>
>> numpy.atleast_nd(ary, ndim, pos=0)
>> View input as array with at least ndim dimensions.
>> New unit dimensions are inserted at the index given by pos if necessary.
>> Parameters
>> ary array_like
>> The input array. Non-array inputs are converted to arrays. Arrays that already have ndim or more dimensions are preserved.
>> ndim int
>> The minimum number of dimensions required.
>> pos int, optional
>> The index to insert the new dimensions. May range from -ary.ndim - 1 to +ary.ndim (inclusive). Non-negative indices indicate locations before the corresponding axis: pos=0 means to insert at the very beginning. Negative indices indicate locations after the corresponding axis: pos=-1 means to insert at the very end. 0 and -1 are always guaranteed to work. Any other number will depend on the dimensions of the existing array. Default is 0.
>> Returns
>> res ndarray
>> An array with res.ndim >= ndim. A view is returned for array inputs. Dimensions are prepended if pos is 0, so for example, a 1-D array of shape (N,) with ndim=4 becomes a view of shape (1, 1, 1, N). Dimensions are appended if pos is -1, so for example a 2-D array of shape (M, N) becomes a view of shape (M, N, 1, 1) when ndim=4.
>> See also
>> atleast_1d, atleast_2d, atleast_3d
>> Notes
>> This function does not follow the convention of the other atleast_*d functions in numpy in that it only accepts a single array argument. To process multiple arrays, use a comprehension or loop around the function call. See examples below.
>> Setting pos=0 is equivalent to how the array would be interpreted by numpy's broadcasting rules. There is no need to call this function for simple broadcasting. This is also roughly (but not exactly) equivalent to np.array(ary, copy=False, subok=True, ndmin=ndim).
>> It is easy to create functions for specific dimensions similar to the other atleast_*d functions using Python's functools.partial function. An example is shown below.
>> Examples
>> >>> np.atleast_nd(3.0, 4)
>> array([[[[ 3.]]]])
>>
>> >>> x = np.arange(3.0)
>> >>> np.atleast_nd(x, 2).shape
>> (1, 3)
>>
>> >>> x = np.arange(12.0).reshape(4, 3)
>> >>> np.atleast_nd(x, 5).shape
>> (1, 1, 1, 4, 3)
>> >>> np.atleast_nd(x, 5).base is x.base
>> True
>>
>> >>> [np.atleast_nd(x, 2) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
>> [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
>>
>> >>> np.atleast_nd((1, 2), 5, pos=0).shape
>> (1, 1, 1, 1, 2)
>>
>> >>> np.atleast_nd((1, 2), 5, pos=-1).shape
>> (2, 1, 1, 1, 1)
>>
>> >>> from functools import partial
>> >>> atleast_4d = partial(np.atleast_nd, ndim=4)
>> >>> atleast_4d([1, 2, 3])
>> array([[[[1, 2, 3]]]])
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com  Fri Feb 12 05:13:18 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 12 Feb 2021 11:13:18 +0100
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

On Fri, Feb 12, 2021 at 3:32 AM Juan Nunez-Iglesias wrote:

> both napari and scikit-image use atleast_ a few times. I don't have many
> examples of where I used nd because it didn't exist. But I have the very
> distinct impression of needing it repeatedly. In some places, I've used
> `np.broadcast_to` to signal the same intention, where `atleast_nd` would
> have been the more readable solution.
>
> I don't buy the argument that it's just a way to mask errors. NumPy
> broadcasting also has that same potential, but I hope no one would
> seriously consider deprecating it. Indeed, even if we accept that we
> (library authors) should force users to provide an array of the right
> dimensionality, that still argues for making it convenient for users to
> do that!
>
> I don't feel super strongly about this. But I think atleast_nd is a move
> in a positive direction and I'd prefer it to what's there now:
>
> In [1]: import numpy as np
> In [2]: np.atleast_3d(np.ones(4)).shape
> Out[2]: (1, 4, 1)
>
> There might be some linear algebraic reason why those axis positions
> make sense, but I'm not aware of it...
>

Yes, that's pretty weird. I'm also not sure there's a reason.

If atleast_nd is not going to replicate this behavior, it would be good
to deprecate atleast_3d (perhaps a release or two after the introduction
of atleast_nd). Not having `atleast_3d(x) == atleast_nd(x, pos=3)` is
unnecessarily confusing.

Ralf

> Juan.
> On 12 Feb 2021, at 5:32 am, Eric Wieser wrote:
>
> I did a quick search of matplotlib, and found a few uses of all three
> functions:
>
> *
> https://github.com/matplotlib/matplotlib/blob/fed55c63a314351cd39a12783f385009782c06e1/lib/matplotlib/_layoutgrid.py#L441-L446
> This one isn't really numpy at all, and is really just a shorthand for
> normalizing an argument `x=n` to `x=[n, n]`
> *
> https://github.com/matplotlib/matplotlib/blob/dd249744270f6abe3f540f81b7a77c0cb728ddbb/lib/matplotlib/mlab.py#L888
> This one is the classic "either multivariate or single-variable data"
> thing endemic to the SciPy ecosystem.
> *
> https://github.com/matplotlib/matplotlib/blob/1eef019109b64ee4085732544cb5e310e69451ab/lib/matplotlib/cbook/__init__.py#L1325-L1326
> Matplotlib has their own `_check_1d` function for input sanitization,
> although github says it's only used to parse the arguments to `plot`,
> which at this point are fairly established as being flexible.
> *
> https://github.com/matplotlib/matplotlib/blob/f72adc49092fe0233a8cd21aa0f317918dafb18d/lib/matplotlib/transforms.py#L631
> This just looks like "defensive programming", and if the argument isn't
> already 3d then something is probably wrong.
>
> This isn't an exhaustive list, just a handful of different situations in
> which the functions were used.
>
> Eric
>
> On Thu, 11 Feb 2021 at 18:15, Stephan Hoyer wrote:
>
>> On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root wrote:
>>
>>> for me, I find that the at_least{1,2,3}d functions are useful for
>>> sanitizing inputs. Having an at_leastnd() function can be viewed as a
>>> step towards cleaning up the API, not cluttering it (although
>>> deprecations of the existing functions probably should be long, given
>>> how long they have existed).
>>>
>>
>> I would love to see examples of this -- perhaps in matplotlib?
>>
>> My thinking is that in most cases it's probably a better idea to keep
>> the interface simpler, and raise an error for lower-dimensional arrays.
>> Automatic conversion is convenient (and endemic within the SciPy
>> ecosystem), but is also a common source of bugs.
>>
>> On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer wrote:
>>>
>>>> On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias
>>>> wrote:
>>>>
>>>>> I totally agree with the namespace clutter concern, but honestly, I
>>>>> would use `atleast_nd` with its `pos` argument (I might rename it to
>>>>> `position`, `axis`, or `axis_position`) any day over
>>>>> `at_least{1,2,3}d`, for which I had no idea where the new axes would
>>>>> end up.
>>>>>
>>>>> So, I'm in favour of including it, and optionally deprecating
>>>>> `atleast_{1,2,3}d`.
>>>>>
>>>> I appreciate that `atleast_nd` feels more sensible than
>>>> `at_least{1,2,3}d`, but I don't think "better" than a pattern we
>>>> would not recommend is a good enough reason for inclusion in NumPy.
>>>> It needs to stand on its own.
>>>>
>>>> What would be the recommended use-cases for this new function?
>>>> Have any libraries building on top of NumPy implemented a version of
>>>> this?
>>>>
>>>>> Juan.
>>>>>
>>>>> On 11 Feb 2021, at 9:48 am, Sebastian Berg
>>>>> wrote:
>>>>>
>>>>> On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz wrote:
>>>>>
>>>>> I've created PR#18386 to add a function called atleast_nd to numpy
>>>>> and numpy.ma. This would generalize the existing atleast_1d,
>>>>> atleast_2d, and atleast_3d functions.
>>>>>
>>>>> I proposed a similar idea about four and a half years ago:
>>>>> https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html,
>>>>> PR#7804. The reception was ambivalent, but a couple of folks have
>>>>> asked me about this, so I'm bringing it back.
>>>>>
>>>>> Some pros:
>>>>>
>>>>> - This closes issue #12336
>>>>> - There are a couple of Stack Overflow questions that would benefit
>>>>> - Been asked about this a couple of times
>>>>> - Implementation of the three existing atleast_*d functions gets
>>>>>   easier
>>>>> - Looks nicer than the equivalent broadcasting and reshaping
>>>>>
>>>>> Some cons:
>>>>>
>>>>> - Cluttering up the API
>>>>> - Maintenance burden (but not a big one)
>>>>> - This is just a utility function, which can be achieved through
>>>>>   broadcasting and reshaping
>>>>>
>>>>> My main concern would be the namespace cluttering. I can't say I
>>>>> use even the `atleast_2d` etc. functions personally, so I would
>>>>> tend to be slightly against the addition. But if others land on the
>>>>> "useful" side here (and it seemed a bit at least on github), I am
>>>>> also not opposed. It is a clean name that lines up with existing
>>>>> ones, so it doesn't seem like a big "mental load" with respect to
>>>>> namespace cluttering.
>>>>>
>>>>> Bike shedding the API is probably a good idea in any case.
>>>>>
>>>>> I have pasted the current PR documentation (as html) below for
>>>>> quick reference. I wonder a bit about the reasoning for having
>>>>> `pos` specify a value rather than just a side?
>>>>>
>>>>> [...]
>>>>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
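Juan's observation about `atleast_3d` is easy to check side by side with NumPy's broadcasting-style promotion (which is the behaviour `pos=0` would give under the proposal); this snippet uses only existing NumPy functions:

```python
import numpy as np

x = np.ones(4)

# atleast_3d places the original data on the *middle* axis...
print(np.atleast_3d(x).shape)      # (1, 4, 1)

# ...while broadcasting-style promotion, as in np.array(..., ndmin=3),
# prepends the new axes instead:
print(np.array(x, ndmin=3).shape)  # (1, 1, 4)
```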
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From wieser.eric+numpy at gmail.com  Fri Feb 12 05:14:23 2021
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Fri, 12 Feb 2021 10:14:23 +0000
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

> There might be some linear algebraic reason why those axis positions
> make sense, but I'm not aware of it...

My guess is that the historical motivation was to allow grayscale `(H, W)`
images to be converted into `(H, W, 1)` images so that they can be
broadcast against `(H, W, 3)` RGB images.

Eric

On Fri, 12 Feb 2021 at 02:32, Juan Nunez-Iglesias wrote:

> both napari and scikit-image use atleast_ a few times. [...]
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
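Eric's guess about the motivation can be illustrated directly: the `(H, W, 1)` view produced by `atleast_3d` broadcasts cleanly against an `(H, W, 3)` RGB array:

```python
import numpy as np

gray = np.arange(6.0).reshape(2, 3)  # a (H, W) grayscale "image"
rgb = np.ones((2, 3, 3))             # a (H, W, 3) RGB "image"

# (2, 3) -> (2, 3, 1), which then broadcasts against (2, 3, 3):
blended = np.atleast_3d(gray) * rgb
print(blended.shape)  # (2, 3, 3)
```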
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Fri Feb 12 09:29:28 2021
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Fri, 12 Feb 2021 08:29:28 -0600
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: <5f9919794e073c5474c3d19bb8ccbf3542e4ad09.camel@sipsolutions.net>

On Fri, 2021-02-12 at 11:13 +0100, Ralf Gommers wrote:
> [...]
>
> Yes, that's pretty weird. I'm also not sure there's a reason.
>
> If atleast_nd is not going to replicate this behavior, it would be
> good to deprecate atleast_3d (perhaps a release or two after the
> introduction of atleast_nd).

Planning to replace `atleast_3d` (not right now, but soon) sounds like a
good way forward. "1, 2, nd" is pretty good. `atleast_3d` seems not used
all that much and is an odd one out. Having the `nd` version should make
a future deprecation painless, so long term we will be better off.

- Sebastian

> Not having `atleast_3d(x) == atleast_nd(x, pos=3)` is unnecessarily
> confusing.
>
> Ralf
> > > > Eric > > > > > > > > On Thu, 11 Feb 2021 at 18:15, Stephan Hoyer > > wrote: > > > > > On Thu, Feb 11, 2021 at 9:42 AM Benjamin Root < > > > ben.v.root at gmail.com> > > > wrote: > > > > > > > for me, I find that the at_least{1,2,3}d functions are useful > > > > for > > > > sanitizing inputs. Having an at_leastnd() function can be > > > > viewed as a step > > > > towards cleaning up the API, not cluttering it (although, > > > > deprecations of > > > > the existing functions probably should be long given how long > > > > they have > > > > existed). > > > > > > > > > > I would love to see examples of this -- perhaps in matplotlib? > > > > > > My thinking is that in most cases it's probably a better idea to > > > keep the > > > interface simpler, and raise an error for lower-dimensional > > > arrays. > > > Automatic conversion is convenient (and endemic within the SciPy > > > ecosystem), but is also a common source of bugs. > > > > > > On Thu, Feb 11, 2021 at 1:56 AM Stephan Hoyer > > > wrote: > > > > > > > > > On Wed, Feb 10, 2021 at 9:48 PM Juan Nunez-Iglesias < > > > > > jni at fastmail.com> > > > > > wrote: > > > > > > > > > > > I totally agree with the namespace clutter concern, but > > > > > > honestly, I > > > > > > would use `atleast_nd` with its `pos` argument (I might > > > > > > rename it to > > > > > > `position`, `axis`, or `axis_position`) any day over > > > > > > `at_least{1,2,3}d`, > > > > > > for which I had no idea where the new axes would end up. > > > > > > > > > > > > So, I?m in favour of including it, and optionally > > > > > > deprecating > > > > > > `atleast_{1,2,3}d`. > > > > > > > > > > > > > > > > > I appreciate that `atleast_nd` feels more sensible than > > > > > `at_least{1,2,3}d`, but I don't think "better" than a pattern > > > > > we would not > > > > > recommend is a good enough reason for inclusion in NumPy. It > > > > > needs to stand > > > > > on its own. 
> > > > > > > > > > What would be the recommended use-cases for this new > > > > > function? > > > > > Have any libraries building on top of NumPy implemented a > > > > > version of > > > > > this? > > > > > > > > > > > > > > > > Juan. > > > > > > > > > > > > On 11 Feb 2021, at 9:48 am, Sebastian Berg < > > > > > > sebastian at sipsolutions.net> > > > > > > wrote: > > > > > > > > > > > > On Wed, 2021-02-10 at 17:31 -0500, Joseph Fox-Rabinovitz > > > > > > wrote: > > > > > > > > > > > > I've created PR#18386 to add a function called atleast_nd > > > > > > to numpy and > > > > > > numpy.ma. This would generalize the existing atleast_1d, > > > > > > atleast_2d, > > > > > > and > > > > > > atleast_3d functions. > > > > > > > > > > > > I proposed a similar idea about four and a half years ago: > > > > > > > > > > > > https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html > > > > > > , > > > > > > PR#7804. The reception was ambivalent, but a couple of > > > > > > folks have > > > > > > asked me > > > > > > about this, so I'm bringing it back. > > > > > > > > > > > > Some pros: > > > > > > > > > > > > - This closes issue #12336 > > > > > > - There are a couple of Stack Overflow questions that would > > > > > > benefit > > > > > > - Been asked about this a couple of times > > > > > > - Implementation of three existing atleast_*d functions > > > > > > gets easier > > > > > > - Looks nicer that the equivalent broadcasting and > > > > > > reshaping > > > > > > > > > > > > Some cons: > > > > > > > > > > > > - Cluttering up the API > > > > > > - Maintenance burden (but not a big one) > > > > > > - This is just a utility function, which can be achieved > > > > > > through > > > > > > broadcasting and reshaping > > > > > > > > > > > > > > > > > > My main concern would be the namespace cluttering. I can't > > > > > > say I use > > > > > > even the `atleast_2d` etc. functions personally, so I would > > > > > > tend to be > > > > > > slightly against the addition. 
But if others land on the > > > > > > "useful" side here > > > > > > (and it seemed a bit at least on github), I am also not > > > > > > opposed.? It is a > > > > > > clean name that lines up with existing ones, so it doesn't > > > > > > seem like a big > > > > > > "mental load" with respect to namespace cluttering. > > > > > > > > > > > > Bike shedding the API is probably a good idea in any case. > > > > > > > > > > > > I have pasted the current PR documentation (as html) below > > > > > > for quick > > > > > > reference. I wonder a bit about the reasoning for having > > > > > > `pos` specify a > > > > > > value rather than just a side? > > > > > > > > > > > > > > > > > > > > > > > > numpy.atleast_nd(*ary*, *ndim*, *pos=0*) > > > > > > View input as array with at least ndim dimensions. > > > > > > New unit dimensions are inserted at the index given by > > > > > > *pos* if > > > > > > necessary. > > > > > > Parameters*ary? *array_like > > > > > > The input array. Non-array inputs are converted to arrays. > > > > > > Arrays that > > > > > > already have ndim or more dimensions are preserved. > > > > > > *ndim? *int > > > > > > The minimum number of dimensions required. > > > > > > *pos? *int, optional > > > > > > The index to insert the new dimensions. May range from - > > > > > > ary.ndim - 1 > > > > > > to +ary.ndim (inclusive). Non-negative indices indicate > > > > > > locations > > > > > > before the corresponding axis: pos=0 means to insert at the > > > > > > very > > > > > > beginning. Negative indices indicate locations after the > > > > > > corresponding axis: > > > > > > ?pos=-1 means to insert at the very end. 0 and -1 are > > > > > > always > > > > > > guaranteed to work. Any other number will depend on the > > > > > > dimensions of the > > > > > > existing array. Default is 0. > > > > > > Returns*res? *ndarray > > > > > > An array with res.ndim >= ndim. A view is returned for > > > > > > array inputs. 
> > > > > > Dimensions are prepended if *pos* is 0, so for example, a > > > > > > 1-D array > > > > > > of shape (N,) with ndim=4becomes a view of shape (1, 1, 1, > > > > > > N). > > > > > > Dimensions are appended if *pos* is -1, so for example a 2- > > > > > > D array of > > > > > > shape (M, N) becomes a view of shape (M, N, 1, 1)when > > > > > > ndim=4. > > > > > > *See also* > > > > > > atleast_1d > > > > > > < > > > > > > https://18298-908607-gh.circle-artifacts.com/0/doc/build/html/reference/generated/numpy.atleast_1d.html#numpy.atleast_1d > > > > > > > > > > > > > , atleast_2d > > > > > > < > > > > > > https://18298-908607-gh.circle-artifacts.com/0/doc/build/html/reference/generated/numpy.atleast_2d.html#numpy.atleast_2d > > > > > > > > > > > > > , atleast_3d > > > > > > < > > > > > > https://18298-908607-gh.circle-artifacts.com/0/doc/build/html/reference/generated/numpy.atleast_3d.html#numpy.atleast_3d > > > > > > > > > > > > > *Notes* > > > > > > This function does not follow the convention of the other > > > > > > atleast_*d functions > > > > > > in numpy in that it only accepts a single array argument. > > > > > > To process > > > > > > multiple arrays, use a comprehension or loop around the > > > > > > function call. See > > > > > > examples below. > > > > > > Setting pos=0 is equivalent to how the array would be > > > > > > interpreted by > > > > > > numpy?s broadcasting rules. There is no need to call this > > > > > > function for > > > > > > simple broadcasting. This is also roughly (but not exactly) > > > > > > equivalent to > > > > > > ?np.array(ary, copy=False, subok=True, ndmin=ndim). > > > > > > It is easy to create functions for specific dimensions > > > > > > similar to the > > > > > > other atleast_*d functions using Python?s functools.partial > > > > > > < > > > > > > https://docs.python.org/dev/library/functools.html#functools.partial > > > > > > > > > > > > > ?function. An example is shown below. 
> > > > > >     Examples
> > > > > >     >>> np.atleast_nd(3.0, 4)
> > > > > >     array([[[[ 3.]]]])
> > > > > >
> > > > > >     >>> x = np.arange(3.0)
> > > > > >     >>> np.atleast_nd(x, 2).shape
> > > > > >     (1, 3)
> > > > > >
> > > > > >     >>> x = np.arange(12.0).reshape(4, 3)
> > > > > >     >>> np.atleast_nd(x, 5).shape
> > > > > >     (1, 1, 1, 4, 3)
> > > > > >     >>> np.atleast_nd(x, 5).base is x.base
> > > > > >     True
> > > > > >
> > > > > >     >>> [np.atleast_nd(x) for x in ((1, 2), [[1, 2]], [[[1, 2]]])]
> > > > > >     [array([[1, 2]]), array([[1, 2]]), array([[[1, 2]]])]
> > > > > >
> > > > > >     >>> np.atleast_nd((1, 2), 5, pos=0).shape
> > > > > >     (1, 1, 1, 1, 2)
> > > > > >     >>> np.atleast_nd((1, 2), 5, pos=-1).shape
> > > > > >     (2, 1, 1, 1, 1)
> > > > > >
> > > > > >     >>> from functools import partial
> > > > > >     >>> atleast_4d = partial(np.atleast_nd, ndim=4)
> > > > > >     >>> atleast_4d([1, 2, 3])
> > > > > >     [[[[1, 2, 3]]]]
> > > > > >
> > > > > > _______________________________________________
> > > > > > NumPy-Discussion mailing list
> > > > > > NumPy-Discussion at python.org
> > > > > > https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From robert.kern at gmail.com  Fri Feb 12 09:30:07 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 12 Feb 2021 09:30:07 -0500
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser wrote:

> > There might be some linear algebraic reason why those axis positions
> make sense, but I'm not aware of it...
>
> My guess is that the historical motivation was to allow grayscale `(H,
> W)` images to be converted into `(H, W, 1)` images so that they can be
> broadcast against `(H, W, 3)` RGB images.
>

Correct. If you do introduce atleast_nd(), I'm not sure why you'd
deprecate and remove the one existing function that *isn't* made redundant
thereby.

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
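To make the grayscale/RGB broadcasting use case described above concrete, here is a small sketch (the shapes are illustrative):

```python
import numpy as np

gray = np.random.rand(4, 5)        # (H, W) grayscale image
rgb = np.random.rand(4, 5, 3)      # (H, W, 3) RGB image

# (H, W) -> (H, W, 1): for 2-D input, atleast_3d appends the new axis
gray3 = np.atleast_3d(gray)
print(gray3.shape)                 # (4, 5, 1)

# (H, W, 1) broadcasts against (H, W, 3), applying the grayscale
# values uniformly across the three channels:
blended = 0.5 * gray3 + 0.5 * rgb
print(blended.shape)               # (4, 5, 3)
```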
URL: 

From jfoxrabinovitz at gmail.com  Fri Feb 12 09:44:31 2021
From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz)
Date: Fri, 12 Feb 2021 09:44:31 -0500
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

On Fri, Feb 12, 2021, 09:32 Robert Kern wrote:

> On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser wrote:
>
>> > There might be some linear algebraic reason why those axis positions
>> make sense, but I'm not aware of it...
>>
>> My guess is that the historical motivation was to allow grayscale `(H,
>> W)` images to be converted into `(H, W, 1)` images so that they can be
>> broadcast against `(H, W, 3)` RGB images.
>>
>
> Correct. If you do introduce atleast_nd(), I'm not sure why you'd
> deprecate and remove the one existing function that *isn't* made redundant
> thereby.
>

`atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
argument lets you tell it where to put the new axes. What's unintuitive to
me is that the 1D case gets promoted from shape `(x,)` to shape `(1, x,
1)`. It takes two calls to `atleast_nd` to replicate that behavior.

One modification to `atleast_nd` I've thought about is making `pos` refer
to the position of the existing axes in the new array rather than the
position of the new axes, but that's likely not a useful way to go about
it.

- Joe

> --
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
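The 1-D behavior discussed above can be seen directly: `atleast_3d` pads a 1-D input on both sides, which a single generic "insert axes at one position" call cannot reproduce.

```python
import numpy as np

x = np.arange(3.0)                      # shape (3,)

# atleast_3d promotes 1-D input to (1, N, 1), padding both ends:
print(np.atleast_3d(x).shape)           # (1, 3, 1)

# Replicating that needs two separate axis insertions,
# one at the front and one at the back:
y = np.expand_dims(np.expand_dims(x, 0), -1)
print(y.shape)                          # (1, 3, 1)

# For 2-D input, by contrast, atleast_3d only appends:
print(np.atleast_3d(np.ones((4, 5))).shape)  # (4, 5, 1)
```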
URL: 

From robert.kern at gmail.com  Fri Feb 12 10:08:45 2021
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 12 Feb 2021 10:08:45 -0500
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: 

On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz <
jfoxrabinovitz at gmail.com> wrote:

> On Fri, Feb 12, 2021, 09:32 Robert Kern wrote:
>
>> On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser wrote:
>>
>>> > There might be some linear algebraic reason why those axis positions
>>> make sense, but I'm not aware of it...
>>>
>>> My guess is that the historical motivation was to allow grayscale `(H,
>>> W)` images to be converted into `(H, W, 1)` images so that they can be
>>> broadcast against `(H, W, 3)` RGB images.
>>>
>>
>> Correct. If you do introduce atleast_nd(), I'm not sure why you'd
>> deprecate and remove the one existing function that *isn't* made
>> redundant thereby.
>>
>
> `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
> argument lets you tell it where to put the new axes. What's unintuitive to
> me is that the 1D case gets promoted from shape `(x,)` to shape `(1, x,
> 1)`. It takes two calls to `atleast_nd` to replicate that behavior.
>

When thinking about channeled images, the channel axis is not of the same
kind as the H and W axes. Really, you tend to want to think about an RGB
image as an (H, W) array of colors rather than an (H, W, 3) ndarray of
intensity values. As much as possible, you want to treat RGB images
similarly to (H, W)-shaped grayscale images. Let's say I want to make a
separable filter to convolve with my image, that is, we have a 1D filter
for each of the H and W axes, and they are repeated for each channel, if
RGB. Setting up a separable filter for (H, W) grayscale is straightforward
with broadcasting semantics. I can use an (ntaps,)-shaped vector for the W
axis and an (ntaps, 1)-shaped filter for the H axis. Now, when I go to the
RGB case, I want the same thing. atleast_3d() adapts those correctly for
the (H, W, nchannels) case.

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Fri Feb 12 13:25:31 2021
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Fri, 12 Feb 2021 12:25:31 -0600
Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function
In-Reply-To: 
References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net>
Message-ID: <417beafed3212391571b55dcd10c0e6e4311034e.camel@sipsolutions.net>

On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote:
> On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz <
> jfoxrabinovitz at gmail.com> wrote:
>
> > On Fri, Feb 12, 2021, 09:32 Robert Kern wrote:
> >
> > > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser <
> > > wieser.eric+numpy at gmail.com> wrote:
> > >
> > > > > There might be some linear algebraic reason why those axis
> > > > > positions make sense, but I'm not aware of it...
> > > >
> > > > My guess is that the historical motivation was to allow
> > > > grayscale `(H, W)` images to be converted into `(H, W, 1)`
> > > > images so that they can be broadcast against `(H, W, 3)` RGB
> > > > images.
> > >
> > > Correct. If you do introduce atleast_nd(), I'm not sure why you'd
> > > deprecate and remove the one existing function that *isn't* made
> > > redundant thereby.
> >
> > `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos`
> > argument lets you tell it where to put the new axes. What's
> > unintuitive to me is that the 1D case gets promoted from shape
> > `(x,)` to shape `(1, x, 1)`. It takes two calls to `atleast_nd` to
> > replicate that behavior.
>
> When thinking about channeled images, the channel axis is not of the
> same kind as the H and W axes. Really, you tend to want to think about
> an RGB image as an (H, W) array of colors rather than an (H, W, 3)
> ndarray of intensity values. As much as possible, you want to treat RGB
> images similarly to (H, W)-shaped grayscale images. Let's say I want to
> make a separable filter to convolve with my image, that is, we have a
> 1D filter for each of the H and W axes, and they are repeated for each
> channel, if RGB. Setting up a separable filter for (H, W) grayscale is
> straightforward with broadcasting semantics. I can use an
> (ntaps,)-shaped vector for the W axis and an (ntaps, 1)-shaped filter
> for the H axis. Now, when I go to the RGB case, I want the same thing.
> atleast_3d() adapts those correctly for the (H, W, nchannels) case.

Right, my initial feeling is that without such context `atleast_3d` is
pretty surprising. So I wonder if we can design `atleast_nd` in a way
that it is explicit about this context.

The `pos` argument is the current solution to this, but maybe there is a
better way [2]? Meshgrid for example defaults to `indexing='xy'` and
has `indexing='ij'` for a similar purpose [1].

Of course, if `atleast_3d` is common enough, I guess that argument
could also swing to adding a keyword-only argument to `atleast_3d`
(that way we can/will never change the default).

- Sebastian


[1] Not sure the purposes are comparable, but in both cases, they
provide information about the "context" in which meshgrid/atleast_3d
are used.

[2] It feels a bit like you may have to think about what `pos=3` will
actually do (in the sense that we will all just end up doing trial and
error :)). At which point I am not sure there is too much gained over
the surprise of `atleast_3d`.
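For readers following along, the `pos=0`/`pos=-1` semantics under discussion can be imitated with existing NumPy primitives. This is only an illustrative sketch, not the PR's actual implementation; the helper name `atleast_nd_sketch` is made up here.

```python
import numpy as np

def atleast_nd_sketch(ary, ndim, pos=0):
    """Illustrative stand-in for the proposed atleast_nd.

    New length-1 axes are inserted at `pos` until the array has at
    least `ndim` dimensions (only pos=0 and pos=-1 are exercised here).
    """
    ary = np.asanyarray(ary)
    missing = max(ndim - ary.ndim, 0)
    if missing == 0:
        return ary
    if pos >= 0:
        # insert before the axis at `pos` (pos=0: prepend)
        axes = tuple(range(pos, pos + missing))
    else:
        # insert after the axis counted from the end (pos=-1: append)
        end = ary.ndim + missing + pos + 1
        axes = tuple(range(end - missing, end))
    return np.expand_dims(ary, axes)

x = np.ones((4, 3))
print(atleast_nd_sketch(x, 5, pos=0).shape)   # (1, 1, 1, 4, 3)
print(atleast_nd_sketch(x, 5, pos=-1).shape)  # (4, 3, 1, 1, 1)
```

The two printed shapes match the prepend/append examples in the PR documentation quoted earlier in the thread.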
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From ralf.gommers at gmail.com Fri Feb 12 13:46:02 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 12 Feb 2021 19:46:02 +0100 Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function In-Reply-To: <417beafed3212391571b55dcd10c0e6e4311034e.camel@sipsolutions.net> References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net> <417beafed3212391571b55dcd10c0e6e4311034e.camel@sipsolutions.net> Message-ID: On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg wrote: > On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote: > > On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz < > > jfoxrabinovitz at gmail.com> wrote: > > > > > > > > > > > On Fri, Feb 12, 2021, 09:32 Robert Kern > > > wrote: > > > > > > > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser < > > > > wieser.eric+numpy at gmail.com> > > > > wrote: > > > > > > > > > > There might be some linear algebraic reason why those axis > > > > > > positions > > > > > make sense, but I?m not aware of it... > > > > > > > > > > My guess is that the historical motivation was to allow > > > > > grayscale `(H, > > > > > W)` images to be converted into `(H, W, 1)` images so that they > > > > > can be > > > > > broadcast against `(H, W, 3)` RGB images. > > > > > > > > > > > > > Correct. If you do introduce atleast_nd(), I'm not sure why you'd > > > > deprecate and remove the one existing function that *isn't* made > > > > redundant > > > > thereby. > > > > > > > > > > `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos` > > > argument lets you tell it where to put the new axes. 
What's > > > unintuitive to > > > my is that the 1D case gets promoted to from shape `(x,)` to shape > > > `(1, x, > > > 1)`. It takes two calls to `atleast_nd` to replicate that behavior. > > > > > > > When thinking about channeled images, the channel axis is not of the > > same > > kind as the H and W axes. Really, you tend to want to think about an > > RGB > > image as a (H, W) array of colors rather than an (H, W, 3) ndarray of > > intensity values. As much as possible, you want to treat RGB images > > similar > > to (H, W)-shaped grayscale images. Let's say I want to make a > > separable > > filter to convolve with my image, that is, we have a 1D filter for > > each of > > the H and W axes, and they are repeated for each channel, if RGB. > > Setting > > up a separable filter for (H, W) grayscale is straightforward with > > broadcasting semantics. I can use (ntaps,)-shaped vector for the W > > axis and > > (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB > > case, I > > want the same thing. atleast_3d() adapts those correctly for the (H, > > W, > > nchannels) case. > > Right, my initial feeling it that without such context `atleast_3d` is > pretty surprising. So I wonder if we can design `atleast_nd` in a way > that it is explicit about this context. > Agreed. I think such a use case is probably too specific to design a single function for, at least in such a hardcoded way. There's also "channels first" and "channels last" versions of RGB images as 3-D arrays, and "channels first" is the default in most deep learning frameworks - so the choice atleast_3d makes is a little outdated by now. Cheers, Ralf > The `pos` argument is the current solution to this, but maybe is a > better way [2]? Meshgrid for example defaults to `indexing='xy'` and > has `indexing='ij'` for a similar purpose [1]. 
> > Of course, if `atleast_3d` is common enough, I guess that argument > could also swing to adding a keyword-only argument to `atleast_3d` > (that way we can/will never change the default). > > - Sebastian > > > [1] Not sure the purposes are comparable, but in both cases, they > provide information about the "context" in which meshgrid/atleast_3d > are used. > > [2] It feels a bit like you may have to think about what `pos=3` will > actually do (in the sense, that we will all just end up doing trial and > error :)). At which point I am not sure there is too much gained over > the surprise of `atleast_3d`. > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
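For concreteness, the `meshgrid` precedent mentioned in the quoted text above: its `indexing` keyword selects between two axis-order conventions, in the same spirit as a context-style argument to `atleast_nd` might.

```python
import numpy as np

x = np.arange(3)   # length 3
y = np.arange(2)   # length 2

# Default 'xy' (Cartesian/plotting order): output shape (len(y), len(x))
X, Y = np.meshgrid(x, y, indexing='xy')
print(X.shape)     # (2, 3)

# 'ij' (matrix/indexing order): output shape (len(x), len(y))
X, Y = np.meshgrid(x, y, indexing='ij')
print(X.shape)     # (3, 2)
```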
URL: From robert.kern at gmail.com Fri Feb 12 15:20:21 2021 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 12 Feb 2021 15:20:21 -0500 Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function In-Reply-To: References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net> <417beafed3212391571b55dcd10c0e6e4311034e.camel@sipsolutions.net> Message-ID: On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers wrote: > > On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg > wrote: > >> On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote: >> > On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz < >> > jfoxrabinovitz at gmail.com> wrote: >> > >> > > >> > > >> > > On Fri, Feb 12, 2021, 09:32 Robert Kern >> > > wrote: >> > > >> > > > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser < >> > > > wieser.eric+numpy at gmail.com> >> > > > wrote: >> > > > >> > > > > > There might be some linear algebraic reason why those axis >> > > > > > positions >> > > > > make sense, but I?m not aware of it... >> > > > > >> > > > > My guess is that the historical motivation was to allow >> > > > > grayscale `(H, >> > > > > W)` images to be converted into `(H, W, 1)` images so that they >> > > > > can be >> > > > > broadcast against `(H, W, 3)` RGB images. >> > > > > >> > > > >> > > > Correct. If you do introduce atleast_nd(), I'm not sure why you'd >> > > > deprecate and remove the one existing function that *isn't* made >> > > > redundant >> > > > thereby. >> > > > >> > > >> > > `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos` >> > > argument lets you tell it where to put the new axes. What's >> > > unintuitive to >> > > my is that the 1D case gets promoted to from shape `(x,)` to shape >> > > `(1, x, >> > > 1)`. It takes two calls to `atleast_nd` to replicate that behavior. >> > > >> > >> > When thinking about channeled images, the channel axis is not of the >> > same >> > kind as the H and W axes. 
Really, you tend to want to think about an >> > RGB >> > image as a (H, W) array of colors rather than an (H, W, 3) ndarray of >> > intensity values. As much as possible, you want to treat RGB images >> > similar >> > to (H, W)-shaped grayscale images. Let's say I want to make a >> > separable >> > filter to convolve with my image, that is, we have a 1D filter for >> > each of >> > the H and W axes, and they are repeated for each channel, if RGB. >> > Setting >> > up a separable filter for (H, W) grayscale is straightforward with >> > broadcasting semantics. I can use (ntaps,)-shaped vector for the W >> > axis and >> > (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB >> > case, I >> > want the same thing. atleast_3d() adapts those correctly for the (H, >> > W, >> > nchannels) case. >> >> Right, my initial feeling it that without such context `atleast_3d` is >> pretty surprising. So I wonder if we can design `atleast_nd` in a way >> that it is explicit about this context. >> > > Agreed. I think such a use case is probably too specific to design a > single function for, at least in such a hardcoded way. > That might be an argument for not designing a new one (or at least not giving it such a name). Not sure it's a good argument for removing a long-standing one. Broadcasting is a very powerful convention that makes coding with arrays tolerable. It makes some choices (namely, prepending 1s to the shape) to make some common operations with mixed-dimension arrays work "by default". But it doesn't cover all of the desired operations conveniently. atleast_3d() bridges the gap to an important convention for a major use-case of arrays. There's also "channels first" and "channels last" versions of RGB images as > 3-D arrays, and "channels first" is the default in most deep learning > frameworks - so the choice atleast_3d makes is a little outdated by now. 
>

DL frameworks do not constitute the majority of image processing code,
which has a very strong channels-last contingent. But nonetheless, the very
popular Tensorflow defaults to channels-last.

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From melissawm at gmail.com  Fri Feb 12 15:36:50 2021
From: melissawm at gmail.com (Melissa Mendonça)
Date: Fri, 12 Feb 2021 17:36:50 -0300
Subject: [Numpy-discussion] Documentation Team meeting - Monday February 15
In-Reply-To: 
References: 
Message-ID: 

Hi all!

Our next Documentation Team meeting will be on *Monday, February 15* at
***4PM UTC***. All are welcome - you don't need to already be a
contributor to join. If you have questions or are curious about what
we're doing, we'll be happy to meet you!

If you wish to join on Zoom, use this link:
https://zoom.us/j/96219574921?pwd=VTRNeGwwOUlrYVNYSENpVVBRRjlkZz09#success

Here's the permanent hackmd document with the meeting notes (still being
updated in the next few days!):
https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg

Hope to see you around!

** You can click this link to get the correct time at your timezone:
https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20210215T16&p1=1440&ah=1

- Melissa
-------------- next part --------------
An HTML attachment was scrubbed...
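As an aside on the channels-last versus channels-first layouts discussed above: converting a minibatch between the two conventions is a single axis move (the shapes here are illustrative).

```python
import numpy as np

# Channels-last minibatch, the BHWC convention: (batch, H, W, channels)
batch_hwc = np.zeros((8, 32, 32, 3))

# Move the channel axis to position 1 to get BCHW (channels-first)
batch_chw = np.moveaxis(batch_hwc, -1, 1)
print(batch_chw.shape)  # (8, 3, 32, 32)
```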
URL: From ralf.gommers at gmail.com Fri Feb 12 15:41:26 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 12 Feb 2021 21:41:26 +0100 Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function In-Reply-To: References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net> <417beafed3212391571b55dcd10c0e6e4311034e.camel@sipsolutions.net> Message-ID: On Fri, Feb 12, 2021 at 9:21 PM Robert Kern wrote: > On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers > wrote: > >> >> On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg < >> sebastian at sipsolutions.net> wrote: >> >>> On Fri, 2021-02-12 at 10:08 -0500, Robert Kern wrote: >>> > On Fri, Feb 12, 2021 at 9:45 AM Joseph Fox-Rabinovitz < >>> > jfoxrabinovitz at gmail.com> wrote: >>> > >>> > > >>> > > >>> > > On Fri, Feb 12, 2021, 09:32 Robert Kern >>> > > wrote: >>> > > >>> > > > On Fri, Feb 12, 2021 at 5:15 AM Eric Wieser < >>> > > > wieser.eric+numpy at gmail.com> >>> > > > wrote: >>> > > > >>> > > > > > There might be some linear algebraic reason why those axis >>> > > > > > positions >>> > > > > make sense, but I?m not aware of it... >>> > > > > >>> > > > > My guess is that the historical motivation was to allow >>> > > > > grayscale `(H, >>> > > > > W)` images to be converted into `(H, W, 1)` images so that they >>> > > > > can be >>> > > > > broadcast against `(H, W, 3)` RGB images. >>> > > > > >>> > > > >>> > > > Correct. If you do introduce atleast_nd(), I'm not sure why you'd >>> > > > deprecate and remove the one existing function that *isn't* made >>> > > > redundant >>> > > > thereby. >>> > > > >>> > > >>> > > `atleast_nd` handles the promotion of 2D to 3D correctly. The `pos` >>> > > argument lets you tell it where to put the new axes. What's >>> > > unintuitive to >>> > > my is that the 1D case gets promoted to from shape `(x,)` to shape >>> > > `(1, x, >>> > > 1)`. It takes two calls to `atleast_nd` to replicate that behavior. 
>>> > When thinking about channeled images, the channel axis is not of the
>>> > same kind as the H and W axes. Really, you tend to want to think
>>> > about an RGB image as an (H, W) array of colors rather than an
>>> > (H, W, 3) ndarray of intensity values. As much as possible, you want
>>> > to treat RGB images similarly to (H, W)-shaped grayscale images.
>>> > Let's say I want to make a separable filter to convolve with my
>>> > image, that is, we have a 1D filter for each of the H and W axes, and
>>> > they are repeated for each channel, if RGB. Setting up a separable
>>> > filter for (H, W) grayscale is straightforward with broadcasting
>>> > semantics. I can use an (ntaps,)-shaped vector for the W axis and an
>>> > (ntaps, 1)-shaped filter for the H axis. Now, when I go to the RGB
>>> > case, I want the same thing. atleast_3d() adapts those correctly for
>>> > the (H, W, nchannels) case.
>>>
>>> Right, my initial feeling is that without such context `atleast_3d` is
>>> pretty surprising. So I wonder if we can design `atleast_nd` in a way
>>> that it is explicit about this context.
>>>
>>
>> Agreed. I think such a use case is probably too specific to design a
>> single function for, at least in such a hardcoded way.
>>
>
> That might be an argument for not designing a new one (or at least not
> giving it such a name). Not sure it's a good argument for removing a
> long-standing one.
>

I agree. I'm not sure deprecating is best. But introducing new
functionality where `nd(pos=3) != 3d` is also not great.

At the very least, atleast_3d should be better documented. It also is
telling that Juan (a long-time scikit-image dev) doesn't like atleast_3d
and there's very little usage of it in scikit-image.

Cheers,
Ralf

> Broadcasting is a very powerful convention that makes coding with arrays
> tolerable.
It makes some choices (namely, prepending 1s to the shape) to > make some common operations with mixed-dimension arrays work "by default". > But it doesn't cover all of the desired operations conveniently. > atleast_3d() bridges the gap to an important convention for a major > use-case of arrays. > > There's also "channels first" and "channels last" versions of RGB images >> as 3-D arrays, and "channels first" is the default in most deep learning >> frameworks - so the choice atleast_3d makes is a little outdated by now. >> > > DL frameworks do not constitute the majority of image processing code, > which has a very strong channels-last contingent. But nonetheless, the very > popular Tensorflow defaults to channels-last. > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Feb 12 16:04:49 2021 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 12 Feb 2021 16:04:49 -0500 Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function In-Reply-To: References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net> <417beafed3212391571b55dcd10c0e6e4311034e.camel@sipsolutions.net> Message-ID: On Fri, Feb 12, 2021 at 3:42 PM Ralf Gommers wrote: > > On Fri, Feb 12, 2021 at 9:21 PM Robert Kern wrote: > >> On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers >> wrote: >> >>> >>> On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg < >>> sebastian at sipsolutions.net> wrote: >>> >>>> >>>> Right, my initial feeling it that without such context `atleast_3d` is >>>> pretty surprising. So I wonder if we can design `atleast_nd` in a way >>>> that it is explicit about this context. >>>> >>> >>> Agreed. I think such a use case is probably too specific to design a >>> single function for, at least in such a hardcoded way. 
>>>
>> That might be an argument for not designing a new one (or at least not
>> giving it such a name). Not sure it's a good argument for removing a
>> long-standing one.
>>
>
> I agree. I'm not sure deprecating is best. But introducing new
> functionality where `nd(pos=3) != 3d` is also not great.
>
> At the very least, atleast_3d should be better documented. It also is
> telling that Juan (a long-time scikit-image dev) doesn't like atleast_3d
> and there's very little usage of it in scikit-image.
>

I'm fairly neutral on atleast_nd(). I think that for n=1 and n=2, you can
derive The One Way to Do It from broadcasting semantics, but for n>=3, I'm
not sure there's much value in trying to systematize it to a single
convention. I think that once you get up to those dimensions, you start to
want domain-specific semantics. I do agree that, in retrospect,
atleast_3d() probably should have been named more specifically. It was of
a piece with other conveniences like dstack() that did special things to
support channels-last images (and implicitly treat 3D arrays as such). For
DL frameworks that assemble channeled images into minibatches (with
different conventions like BHWC and BCHW), you'd want the n=4 behavior to
do different things. I _think_ you'd just want to do those with different
functions rather than with a complicated set of arguments to one function.

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From friedrichromstedt at gmail.com  Mon Feb 15 04:12:56 2021
From: friedrichromstedt at gmail.com (Friedrich Romstedt)
Date: Mon, 15 Feb 2021 10:12:56 +0100
Subject: [Numpy-discussion] Unreliable crash when converting using
 numpy.asarray via C buffer interface
In-Reply-To: 
References: 
Message-ID: 

Hi,

On Thu, Feb 4, 2021 at 09:07, Friedrich Romstedt wrote:
> On Mon, Feb 1, 2021 at 09:46, Matti Picus wrote:
> > Typically, one would create a complete example and then point to the
> > code (as repo or pastebin, not as an attachment to a mail here).
>
> https://github.com/friedrichromstedt/bughunting-01

Last week I updated my example code to be slimmer. There now exists
a single-file extension module:
https://github.com/friedrichromstedt/bughunting-01/blob/master/lib/bughuntingfrmod/bughuntingfrmod.cpp.
The corresponding test program
https://github.com/friedrichromstedt/bughunting-01/blob/master/test/2021-02-11_0909.py
crashes "properly" both on Windows 10 (Python 3.8.2, numpy 1.19.2) and
on Arch Linux (Python 3.9.1, numpy 1.20.0) when the ``print``
statement contained in the test file is commented out.

My hope to be able to fix my error myself by reducing the code to
reproduce the problem has not been fulfilled. I feel that the
abovementioned test code is short enough to ask for help with it here.
Any hint on how I could solve my problem would be appreciated very
much.

There are some points which were not clarified yet; I am citing them
below.

So far,
Friedrich

> > - There are tools out there to analyze refcount problems. Python has
> > some built-in tools for switching allocation strategies.
>
> Can you give me some pointer about this?
>
> > - numpy.asarray has a number of strategies to convert instances, which
> > one is it using?
>
> I've tried to read about this, but couldn't find anything. What are
> these different strategies?

From Pietro.Fontana at synopsys.com  Mon Feb 15 10:38:09 2021
From: Pietro.Fontana at synopsys.com (Pietro Fontana)
Date: Mon, 15 Feb 2021 15:38:09 +0000
Subject: [Numpy-discussion] Compile NumPy with ifort, MSVC and MKL - DLL
 load failed
Message-ID: 

Hi all,

I've been trying to compile NumPy from source on Windows 10, with the
MSVC compiler and Intel MKL. Whenever I link to MKL, it fails to load
DLLs.
I am running Windows 10.0.18363 with Microsoft Visual Studio 2019 (16.8.5)
and Intel MKL 2017.8.275. I managed to reproduce the issue with a minimal
setup, using the latest Python and NumPy:

1. Download the latest Python (3.9.1) and latest NumPy (1.20.1) source.
2. Open a VS command prompt, unpack the Python source, build with
   PCbuild\build.bat
3. Run mklvars.bat intel64 to get the right environment variables set.
4. Add the Intel compilers (needed for ifort) to PATH:
   set PATH=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017\windows\bin\intel64;%PATH%
5. Create a virtual env, copy a few files from the Python build and
   activate the virtual env:
   copy Python\PCbuild\amd64\python39.dll venv\Scripts
   copy Python\PC\pyconfig.h venv\Include
6. Build NumPy from source and install: pip install . -v
7. Try to import NumPy: python -c "import numpy"

The error message appears as follows:

Traceback (most recent call last):
  File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\core\__init__.py", line 22, in <module>
    from . import multiarray
  File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\core\multiarray.py", line 12, in <module>
    from . import overrides
  File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\core\overrides.py", line 7, in <module>
    from numpy.core._multiarray_umath import (
ImportError: DLL load failed while importing _multiarray_umath: The
specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\__init__.py", line 145, in <module>
    from . import core
  File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\core\__init__.py", line 48, in <module>
    raise ImportError(msg)
ImportError: [... useful suggestions that however did not lead to a
solution...]

Original error was: DLL load failed while importing _multiarray_umath: The
specified module could not be found.
The MKL libraries are picked up during compilation since it returns:

FOUND:
    libraries = ['mkl_rt']
    library_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries\\windows\\mkl\\lib\\intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries\\windows\\mkl\\lib']

I tried to analyze the DLL resolution on _multiarray_umath.pyd with Dependencies (the newer version of Dependency Walker) but it seems that the MKL DLL loads fine. There are some DLLs that appear as not correctly loaded, but as far as I understand it is caused by the inspection software limit with Windows API sets (api-ms-win-core-*, ext-ms-onecore-*, ext-ms-win-*, and similar), not by actual problems with these DLLs, so I think the system is correctly set up.

If I skip the initialization of MKL environment variables, then the MKL libraries are not picked up and NumPy is compiled to a functional state.

In the past this setup used to work with Python 3.6, VS2015 and a similar version of Intel MKL. I was able to reproduce the issue with NumPy 1.16.2, 1.17 and 1.20.1; with Python 3.8.6 and Python 3.9.1; with Intel MKL 2017 and oneAPI 2020.

Am I missing any obvious step to succeed in this adventure?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From sebastian at sipsolutions.net Mon Feb 15 10:54:19 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 15 Feb 2021 09:54:19 -0600 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: On Mon, 2021-02-15 at 10:12 +0100, Friedrich Romstedt wrote: > Hi, > > Am Do., 4. Feb. 2021 um 09:07 Uhr schrieb Friedrich Romstedt > : > > Am Mo., 1. Feb.
2021 um 09:46 Uhr schrieb Matti Picus < > > matti.picus at gmail.com>: > > > Typically, one would create a complete example and then pointing > > > to the > > > code (as repo or pastebin, not as an attachment to a mail here). > > > > https://github.com/friedrichromstedt/bughunting-01 > > Last week I updated my example code to be more slim. There now > exists > a single-file extension module: > https://github.com/friedrichromstedt/bughunting-01/blob/master/lib/bughuntingfrmod/bughuntingfrmod.cpp > . > The corresponding test program > https://github.com/friedrichromstedt/bughunting-01/blob/master/test/2021-02-11_0909.py > crashes "properly" both on Windows 10 (Python 3.8.2, numpy 1.19.2) as > well as on Arch Linux (Python 3.9.1, numpy 1.20.0), when the > ``print`` > statement contained in the test file is commented out. > > My hope to be able to fix my error myself by reducing the code to > reproduce the problem has not been fulfilled. I feel that the > abovementioned test code is short enough to ask for help with it > here. > Any hint on how I could solve my problem would be appreciated very > much. I have tried it out, and can confirm that using debugging tools (namely valgrind) will allow you to track down the issue (valgrind reports it from within python, running a python without debug symbols may obfuscate the actual problem; if that is limiting you, I can post my valgrind output). Since you are running a linux system, I am confident that you can run it in valgrind to find it yourself. (There may be other ways.) Just remember to run valgrind with `PYTHONMALLOC=malloc valgrind` and ignore some errors e.g. when importing NumPy. Cheers, Sebastian > > There are some points which were not clarified yet; I am citing them > below. > > So far, > Friedrich > > > > - There are tools out there to analyze refcount problems. Python > > > has > > > some built-in tools for switching allocation strategies. > > > > Can you give me some pointer about this?
> > > > > - numpy.asarray has a number of strategies to convert instances, > > > which > > > one is it using? > > > > I've tried to read about this, but couldn't find anything.? What > > are > > these different strategies? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From cgohlke at uci.edu Mon Feb 15 11:34:36 2021 From: cgohlke at uci.edu (Christoph Gohlke) Date: Mon, 15 Feb 2021 08:34:36 -0800 Subject: [Numpy-discussion] Compile NumPy with ifort, MSVC and MKL - DLL load failed In-Reply-To: References: Message-ID: <2297ae37-cf0d-7929-400e-8033206bdb5d@uci.edu> Hello, On 2/15/2021 7:38 AM, Pietro Fontana wrote: > Hi all, > > I've been trying to compile NumPy from source on Windows 10, with MSVC > compiler and Intel MKL. Whenever I link to MKL it fails at loading DLLs. > I am running Windows 10.0.18363 with Microsoft Visual Studio 2019 > (16.8.5) and Intel MKL 2017.8.275. > > I managed to reproduce the issue with a minimal setup, using latest > Python and NumPy. > > 1. Download latest Python (3.9.1) and latest NumPy (1.20.1) source. > 2. Open a VS command prompt, unpack Python source, build with > PCbuild\build.bat > 3. Run mklvars.bat intel64 to get the right environment variables set. > 4. Add the Intel compilers (needed for ifort) to PATH: > > set PATH=C:\Program Files > (x86)\IntelSWTools\compilers_and_libraries_2017\windows\bin\intel64;%PATH% > > 5. Create a virtual env, copy a few files from the Python build and > activate the virtual env: > > copy Python\PCbuild\amd64\python39.dll venv\Scripts > copy Python\PC\pyconfig.h venv\Include > > 6. Build NumPy from source and install: pip install . -v > 7. 
Try to import NumPy: python -c "import numpy"
>
> The error message appears as follows:
>
> Traceback (most recent call last):
>   File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\core\__init__.py", line 22, in <module>
>     from . import multiarray
>   File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\core\multiarray.py", line 12, in <module>
>     from . import overrides
>   File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\core\overrides.py", line 7, in <module>
>     from numpy.core._multiarray_umath import (
> ImportError: DLL load failed while importing _multiarray_umath: The specified module could not be found.
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\__init__.py", line 145, in <module>
>     from . import core
>   File "C:\path\numpy_clean_env\venv\lib\site-packages\numpy\core\__init__.py", line 48, in <module>
>     raise ImportError(msg)
> ImportError: [... useful suggestions that however did not lead to a solution...]
>
> Original error was: DLL load failed while importing _multiarray_umath: The specified module could not be found.
>
> The MKL libraries are picked up during compilation since it returns:
>
> FOUND:
>     libraries = ['mkl_rt']
>     library_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries\\windows\\mkl\\lib\\intel64']
>     define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
>     include_dirs = ['C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries\\windows\\mkl', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries\\windows\\mkl\\include', 'C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries\\windows\\mkl\\lib']
>
> I tried to analyze the DLL resolution on _multiarray_umath.pyd with
> Dependencies (the newer version of Dependency Walker) but it seems that
> the MKL DLL loads fine. There are some DLLs that appear as not correctly
> loaded, but as far as I understand it is caused by the inspection
> software limit with Windows API sets (api-ms-win-core-*,
> ext-ms-onecore-*, ext-ms-win-*, and similar), not by actual problems
> with these DLLs, so I think the system is correctly set up.
>
> If I skip the initialization of MKL environment variables, then the MKL
> libraries are not picked up and NumPy is compiled to a functional state.
>
> In the past this setup used to work with Python 3.6, VS2015 and a
> similar version of Intel MKL.
> I was able to reproduce the issue with NumPy 1.16.2, 1.17 and 1.20.1;
> with Python 3.8.6 and Python 3.9.1; with Intel MKL 2017 and oneAPI 2020.
>
> Am I missing any obvious step to succeed in this adventure?

Python >= 3.8 will no longer use PATH for resolving dependencies of extension modules. Use os.add_dll_directory(mkl_bin_path) in all your scripts before importing numpy or add the call to a _distributor_init.py file in the numpy package directory.
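For example (the MKL path below is only an illustration, substitute the directory that actually contains mkl_rt.dll on your system):

```python
import os

def register_dll_dir(path):
    """Register *path* for extension-module DLL resolution.

    On Python >= 3.8 under Windows, PATH is no longer searched for the
    dependencies of .pyd files, so directories such as the MKL runtime
    folder must be added explicitly.  Returns True if the directory was
    registered, False otherwise (non-Windows, or missing directory).
    """
    if hasattr(os, "add_dll_directory") and os.path.isdir(path):
        os.add_dll_directory(path)
        return True
    return False

# Illustrative path only; use the folder holding mkl_rt.dll on your machine.
mkl_bin_path = r"C:\Program Files (x86)\IntelSWTools\compilers_and_libraries\windows\redist\intel64\mkl"
register_dll_dir(mkl_bin_path)

# ... after this point, "import numpy" can resolve the MKL DLLs.
```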
Christoph From lev.maximov at gmail.com Mon Feb 15 12:49:34 2021 From: lev.maximov at gmail.com (Lev Maximov) Date: Tue, 16 Feb 2021 00:49:34 +0700 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: Hi Friedrich, Try adding view->suboffsets = NULL; view->internal = NULL; to Image_getbuffer Best regards, Lev On Mon, Feb 15, 2021 at 10:57 PM Sebastian Berg wrote: > On Mon, 2021-02-15 at 10:12 +0100, Friedrich Romstedt wrote: > > Hi, > > > > Am Do., 4. Feb. 2021 um 09:07 Uhr schrieb Friedrich Romstedt > > : > > > Am Mo., 1. Feb. 2021 um 09:46 Uhr schrieb Matti Picus < > > > matti.picus at gmail.com>: > > > > Typically, one would create a complete example and then pointing > > > > to the > > > > code (as repo or pastebin, not as an attachment to a mail here). > > > > > > https://github.com/friedrichromstedt/bughunting-01 > > > > Last week I updated my example code to be more slim. There now > > exists > > a single-file extension module: > > > https://github.com/friedrichromstedt/bughunting-01/blob/master/lib/bughuntingfrmod/bughuntingfrmod.cpp > > . > > The corresponding test program > > > https://github.com/friedrichromstedt/bughunting-01/blob/master/test/2021-02-11_0909.py > > crashes "properly" both on Windows 10 (Python 3.8.2, numpy 1.19.2) as > > well as on Arch Linux (Python 3.9.1, numpy 1.20.0), when the > > ``print`` > > statement contained in the test file is commented out. > > > > My hope to be able to fix my error myself by reducing the code to > > reproduce the problem has not been fulfillled. I feel that the > > abovementioned test code is short enough to ask for help with it > > here. > > Any hint on how I could solve my problem would be appreciated very > > much. 
> > I have tried it out, and can confirm that using debugging tools (namely > valgrind), will allow you track down the issue (valgrind reports it > from within python, running a python without debug symbols may > obfuscate the actual problem; if that is the limiting you, I can post > my valgrind output). > Since you are running a linux system, I am confident that you can run > it in valgrind to find it yourself. (There may be other ways.) > > Just remember to run valgrind with `PYTHONMALLOC=malloc valgrind` and > ignore some errors e.g. when importing NumPy. > > Cheers, > > Sebastian > > > > > > There are some points which were not clarified yet; I am citing them > > below. > > > > So far, > > Friedrich > > > > > > - There are tools out there to analyze refcount problems. Python > > > > has > > > > some built-in tools for switching allocation strategies. > > > > > > Can you give me some pointer about this? > > > > > > > - numpy.asarray has a number of strategies to convert instances, > > > > which > > > > one is it using? > > > > > > I've tried to read about this, but couldn't find anything. What > > > are > > > these different strategies? > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Pietro.Fontana at synopsys.com Mon Feb 15 13:12:10 2021 From: Pietro.Fontana at synopsys.com (Pietro Fontana) Date: Mon, 15 Feb 2021 18:12:10 +0000 Subject: [Numpy-discussion] Compile NumPy with ifort, MSVC and MKL - DLL load failed In-Reply-To: <2297ae37-cf0d-7929-400e-8033206bdb5d@uci.edu> References: <2297ae37-cf0d-7929-400e-8033206bdb5d@uci.edu> Message-ID: Hi, thank you very much for pointing me at this. I managed not to find this bit of information despite spending quite some time on the issue. Cheers, Pietro From mansourmoufid at gmail.com Mon Feb 15 19:35:35 2021 From: mansourmoufid at gmail.com (Mansour Moufid) Date: Mon, 15 Feb 2021 19:35:35 -0500 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: On Tue, Jan 26, 2021 at 3:50 AM Friedrich Romstedt wrote: > > Hi, > > This is with Python 3.8.2 64-bit and numpy 1.19.2 on Windows 10. I'd > like to be able to convert some C++ extension type to a numpy array by > using ``numpy.asarray``. The extension type implements the Python > buffer interface to support this. > > The extension type, called "Image" here, holds some chunk of > ``double``, C order, contiguous, 2 dimensions. It "owns" the buffer; > the buffer is not shared with other objects. The following Python > code crashes:: > > image = <... Image production ...> > ar = numpy.asarray(image) > > However, when I say:: > > image = <... Image production ...> > print("---") > ar = numpy.asarray(image) > > the entire program is executing properly with correct data in the > numpy ndarray produced using the buffer interface. Maybe a dereference bug. Try setting pointers to NULL after freeing, something like this: delete[] view->shape; view->shape = NULL; delete[] view->strides; view->strides = NULL; ... 
delete[] self->data; self->data = NULL; From mansourmoufid at gmail.com Mon Feb 15 19:47:42 2021 From: mansourmoufid at gmail.com (Mansour Moufid) Date: Mon, 15 Feb 2021 19:47:42 -0500 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: On Mon, Feb 15, 2021 at 7:35 PM Mansour Moufid wrote: > > On Tue, Jan 26, 2021 at 3:50 AM Friedrich Romstedt > wrote: > > > > Hi, > > > > This is with Python 3.8.2 64-bit and numpy 1.19.2 on Windows 10. I'd > > like to be able to convert some C++ extension type to a numpy array by > > using ``numpy.asarray``. The extension type implements the Python > > buffer interface to support this. > > > > The extension type, called "Image" here, holds some chunk of > > ``double``, C order, contiguous, 2 dimensions. It "owns" the buffer; > > the buffer is not shared with other objects. The following Python > > code crashes:: > > > > image = <... Image production ...> > > ar = numpy.asarray(image) > > > > However, when I say:: > > > > image = <... Image production ...> > > print("---") > > ar = numpy.asarray(image) > > > > the entire program is executing properly with correct data in the > > numpy ndarray produced using the buffer interface. > > Maybe a dereference bug. > > Try setting pointers to NULL after freeing, something like this: > > delete[] view->shape; > view->shape = NULL; > delete[] view->strides; > view->strides = NULL; > > ... > > delete[] self->data; > self->data = NULL; Sorry for two messages in a row, I just noticed: I don't see the type's tp_free member defined? 
You can set it to PyObject_Free in Init_ImageType: ImageType.tp_free = PyObject_Free; See here: https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_free From pierre.augier at univ-grenoble-alpes.fr Tue Feb 16 04:14:18 2021 From: pierre.augier at univ-grenoble-alpes.fr (PIERRE AUGIER) Date: Tue, 16 Feb 2021 10:14:18 +0100 (CET) Subject: [Numpy-discussion] Type annotation for Numpy arrays, accelerators and numpy.typing Message-ID: <308708215.6518305.1613466858105.JavaMail.zimbra@univ-grenoble-alpes.fr> Hi, When Numpy 1.20 was released, I discovered numpy.typing and its documentation https://numpy.org/doc/stable/reference/typing.html I know that it is very new but I'm a bit lost. A good API to describe Array type would be useful not only for type checkers but also for Python accelerators using ndarrays (in particular Pythran, Numba, Cython, Transonic). For Transonic, I'd like to be able to use internally numpy.typing to have a better implementation of what we need in transonic.typing (in particular compatible with type checkers like MyPy). However, it seems that I can't do anything with what I see today in numpy.typing. For Python-Numpy accelerators, we need to be able to define precise array types to limit the compilation time and give useful hints for optimizations (ndim, partial or full shape). We also need fused types. What can be done with Transonic is described in these pages: https://transonic.readthedocs.io/en/latest/examples/type_hints.html and https://transonic.readthedocs.io/en/latest/generated/transonic.typing.html I think it would be good to be able to do things like that with numpy.typing. It may be already possible but I can't find how in the doc. I can give few examples here. 
First very simple: from transonic import Array Af3d = Array[float, "3d"] # Note that this can also be written without Array just as Af3d = "float[:,:,:]" # same thing but only contiguous C ordered Af3d = Array[float, "3d", "C"] Note: being able to limit the compilation just for C-aligned arrays is very important since it can drastically decrease the compilation time/memory and that some numerical kernels are anyway written to be efficient only with C (or Fortran) ordered arrays. # 2d color image A_im = Array[np.int16, "[:,:,3]"] Now, fused types. This example is taken from a real life case (https://foss.heptapod.net/fluiddyn/fluidsim/-/blob/branch/default/fluidsim/base/time_stepping/pseudo_spect.py) so it's really useful in practice. from transonic import Type, NDim, Array, Union N = NDim(2, 3, 4) A = Array[np.complex128, N, "C"] Am1 = Array[np.complex128, N - 1, "C"] N123 = NDim(1, 2, 3) A123c = Array[np.complex128, N123, "C"] A123f = Array[np.float64, N123, "C"] T = Type(np.float64, np.complex128) A1 = Array[T, N, "C"] A2 = Array[T, N - 1, "C"] ArrayDiss = Union[A1, A2] To summarize, type annotations are and will also be used for Python-Numpy accelerators. It would be good to also consider this application when designing numpy.typing. Cheers, Pierre From friedrichromstedt at gmail.com Tue Feb 16 05:00:34 2021 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 16 Feb 2021 11:00:34 +0100 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: Hello again, Am Mo., 15. Feb. 2021 um 16:57 Uhr schrieb Sebastian Berg : > > On Mon, 2021-02-15 at 10:12 +0100, Friedrich Romstedt wrote: > > Last week I updated my example code to be more slim. There now > > exists > > a single-file extension module: > > https://github.com/friedrichromstedt/bughunting-01/blob/master/lib/bughuntingfrmod/bughuntingfrmod.cpp > > . 
> > The corresponding test program
> > https://github.com/friedrichromstedt/bughunting-01/blob/master/test/2021-02-11_0909.py
> > crashes "properly" both on Windows 10 (Python 3.8.2, numpy 1.19.2) as
> > well as on Arch Linux (Python 3.9.1, numpy 1.20.0), when the ``print``
> > statement contained in the test file is commented out.
>
> I have tried it out, and can confirm that using debugging tools (namely
> valgrind) will allow you to track down the issue (valgrind reports it
> from within python, running a python without debug symbols may
> obfuscate the actual problem; if that is limiting you, I can post
> my valgrind output).
> Since you are running a linux system, I am confident that you can run
> it in valgrind to find it yourself. (There may be other ways.)
>
> Just remember to run valgrind with `PYTHONMALLOC=malloc valgrind` and
> ignore some errors e.g. when importing NumPy.

From running ``PYTHONMALLOC=malloc valgrind python3 2021-02-11_0909.py`` (with the preceding call of ``print`` in :file:`2021-02-11_0909.py` commented out) I found a few things:

- The call might or might not succeed. It doesn't always lead to a segfault.
- "at 0x4A64A73: ??? (in /usr/lib/libpython3.9.so.1.0), called by 0x4A64914: PyMemoryView_FromObject (in /usr/lib/libpython3.9.so.1.0)", a "Conditional jump or move depends on uninitialised value(s)". After one more block of valgrind output ("Use of uninitialised value of size 8 at 0x48EEA1B: ??? (in /usr/lib/libpython3.9.so.1.0)"), it finally leads either to "Invalid read of size 8 at 0x48EEA1B: ??? (in /usr/lib/libpython3.9.so.1.0) [...] Address 0x1 is not stack'd, malloc'd or (recently) free'd", resulting in a segfault, or just to another "Use of uninitialised value of size 8 at 0x48EEA15: ??? (in /usr/lib/libpython3.9.so.1.0)", after which the program completes successfully.
- All this happens within "PyMemoryView_FromObject".
So I can only guess that the "uninitialised value" is compared to 0x0, and when it is different (e.g. 0x1), it leads via "Address 0x1 is not stack'd, malloc'd or (recently) free'd" to the segfault observed. I suppose I need to compile Python and numpy myself to see the debug symbols instead of the "???" marks? Maybe even with ``-O0``? Furthermore, the shared object belonging to my code isn't involved directly in any way, so the segfault possibly has to do with some data I am leaving "uninitialised" at the moment. Thanks for the other replies as well; for the moment I feel that going the valgrind way might teach me how to debug errors of this kind myself. So far, Friedrich From lev.maximov at gmail.com Tue Feb 16 05:48:31 2021 From: lev.maximov at gmail.com (Lev Maximov) Date: Tue, 16 Feb 2021 17:48:31 +0700 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: I've reproduced the error you've described and got rid of it without valgrind. Those two lines are enough to avoid the segfault. But feel free to find it yourself :) Best regards, Lev On Tue, Feb 16, 2021 at 5:02 PM Friedrich Romstedt < friedrichromstedt at gmail.com> wrote: > Hello again, > > Am Mo., 15. Feb. 2021 um 16:57 Uhr schrieb Sebastian Berg > : > > > > On Mon, 2021-02-15 at 10:12 +0100, Friedrich Romstedt wrote: > > > Last week I updated my example code to be more slim. There now > > > exists > > > a single-file extension module: > > > > https://github.com/friedrichromstedt/bughunting-01/blob/master/lib/bughuntingfrmod/bughuntingfrmod.cpp > > > . > > > The corresponding test program > > > > https://github.com/friedrichromstedt/bughunting-01/blob/master/test/2021-02-11_0909.py > > > crashes "properly" both on Windows 10 (Python 3.8.2, numpy 1.19.2) as > > > well as on Arch Linux (Python 3.9.1, numpy 1.20.0), when the > > > ``print`` > > > statement contained in the test file is commented out. 
> > > > I have tried it out, and can confirm that using debugging tools (namely > > valgrind), will allow you track down the issue (valgrind reports it > > from within python, running a python without debug symbols may > > obfuscate the actual problem; if that is the limiting you, I can post > > my valgrind output). > > Since you are running a linux system, I am confident that you can run > > it in valgrind to find it yourself. (There may be other ways.) > > > > Just remember to run valgrind with `PYTHONMALLOC=malloc valgrind` and > > ignore some errors e.g. when importing NumPy. > > From running ``PYTHONMALLOC=malloc valgrind python3 > 2021-01-11_0909.py`` (with the preceding call of ``print`` in > :file:`2021-01-11_0909.py` commented out) I found a few things: > > - The call might or might not succeed. It doesn't always lead to a > segfault. > - "at 0x4A64A73: ??? (in /usr/lib/libpython3.9.so.1.0), called by > 0x4A64914: PyMemoryView_FromObject (in /usr/lib/libpython3.9.so.1.0)", > a "Conditional jump or move depends on uninitialised value(s)". After > one more block of valgrind output ("Use of uninitialised value of size > 8 at 0x48EEA1B: ??? (in /usr/lib/libpython3.9.so.1.0)"), it finally > leads either to "Invalid read of size 8 at 0x48EEA1B: ??? (in > /usr/lib/libpython3.9.so.1.0) [...] Address 0x1 is not stack'd, > malloc'd or (recently) free'd", resulting in a segfault, or just to > another "Use of uninitialised value of size 8 at 0x48EEA15: ??? (in > /usr/lib/libpython3.9.so.1.0)", after which the program completes > successfully. > - All this happens within "PyMemoryView_FromObject". > > So I can only guess that the "uninitialised value" is compared to 0x0, > and when it is different (e.g. 0x1), it leads via "Address 0x1 is not > stack'd, malloc'd or (recently) free'd" to the segfault observed. > > I suppose I need to compile Python and numpy myself to see the debug > symbols instead of the "???" marks? Maybe even with ``-O0``? 
> > Furthermore, the shared object belonging to my code isn't involved > directly in any way, so the segfault possibly has to do with some data > I am leaving "uninitialised" at the moment. > > Thanks for the other replies as well; for the moment I feel that going > the valgrind way might teach me how to debug errors of this kind > myself. > > So far, > Friedrich > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Feb 16 05:50:48 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 16 Feb 2021 11:50:48 +0100 Subject: [Numpy-discussion] Type annotation for Numpy arrays, accelerators and numpy.typing In-Reply-To: <308708215.6518305.1613466858105.JavaMail.zimbra@univ-grenoble-alpes.fr> References: <308708215.6518305.1613466858105.JavaMail.zimbra@univ-grenoble-alpes.fr> Message-ID: On Tue, Feb 16, 2021 at 10:20 AM PIERRE AUGIER < pierre.augier at univ-grenoble-alpes.fr> wrote: > Hi, > > When Numpy 1.20 was released, I discovered numpy.typing and its > documentation https://numpy.org/doc/stable/reference/typing.html > > I know that it is very new but I'm a bit lost. A good API to describe > Array type would be useful not only for type checkers but also for Python > accelerators using ndarrays (in particular Pythran, Numba, Cython, > Transonic). > > For Transonic, I'd like to be able to use internally numpy.typing to have > a better implementation of what we need in transonic.typing (in particular > compatible with type checkers like MyPy). > > However, it seems that I can't do anything with what I see today in > numpy.typing. > > For Python-Numpy accelerators, we need to be able to define precise array > types to limit the compilation time and give useful hints for optimizations > (ndim, partial or full shape). 
We also need fused types. > Hi Pierre, I think what you are getting at is that ArrayLike isn't useful for accelerators, right? ArrayLike is needed to add annotations to functions that use np.asarray to coerce their inputs, which may be scalars, lists, etc. That's indeed never what you want for an accelerator, and it'd be great if people stopped writing that kind of code - but we're stuck with a lot of it in SciPy and many other downstream libraries. For your purposes, I think you want one of two things: 1. functions that only take `ndarray`, or maybe at most `Union[float, ndarray]` 2. perhaps in the future, a well-defined array Protocol, to support multiple array types (this is hinted at in https://data-apis.github.io/array-api/latest/design_topics/static_typing.html ) You don't need numpy.typing for (1), you can directly annotate with `x : np.ndarray` > What can be done with Transonic is described in these pages: > https://transonic.readthedocs.io/en/latest/examples/type_hints.html and > https://transonic.readthedocs.io/en/latest/generated/transonic.typing.html > > I think it would be good to be able to do things like that with > numpy.typing. It may be already possible but I can't find how in the doc. > Two things that are still work-in-progress are annotating arrays with dtypes and with shapes. Your examples already have that, so that's useful input. For C/F-contiguity, I believe that's useful but normally shouldn't show up in user-facing APIs (only in internal helper routines) so probably less urgent. For dtype annotations, a lot of work is being done at the moment by Bas van Beek. Example: https://github.com/numpy/numpy/pull/18128. That all turns out to be quite complex, because there's so many valid ways of specifying a dtype. It's the same kind of flexibility problem as with `asarray` - the complexity is needed to correctly type current code in NumPy, SciPy et al., but it's not what you want for an accelerator. 
For that you'd want to accept only one way of spelling this, `dtype=`. > I can give few examples here. First very simple: > > from transonic import Array > > Af3d = Array[float, "3d"] > > # Note that this can also be written without Array just as > Af3d = "float[:,:,:]" > > # same thing but only contiguous C ordered > Af3d = Array[float, "3d", "C"] > > Note: being able to limit the compilation just for C-aligned arrays is > very important since it can drastically decrease the compilation > time/memory and that some numerical kernels are anyway written to be > efficient only with C (or Fortran) ordered arrays. > > # 2d color image > A_im = Array[np.int16, "[:,:,3]"] > > Now, fused types. This example is taken from a real life case ( > https://foss.heptapod.net/fluiddyn/fluidsim/-/blob/branch/default/fluidsim/base/time_stepping/pseudo_spect.py) > so it's really useful in practice. > Yes definitely useful, there's also a lot of Cython code in downstream libraries that shows this. Annotations for fused types, when dtypes are just type literals, should hopefully work out of the box with TypeVar without us having to do anything special in numpy. Cheers, Ralf > from transonic import Type, NDim, Array, Union > > N = NDim(2, 3, 4) > A = Array[np.complex128, N, "C"] > Am1 = Array[np.complex128, N - 1, "C"] > > N123 = NDim(1, 2, 3) > A123c = Array[np.complex128, N123, "C"] > A123f = Array[np.float64, N123, "C"] > > T = Type(np.float64, np.complex128) > A1 = Array[T, N, "C"] > A2 = Array[T, N - 1, "C"] > ArrayDiss = Union[A1, A2] > > To summarize, type annotations are and will also be used for Python-Numpy > accelerators. It would be good to also consider this application when > designing numpy.typing. > > Cheers, > Pierre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From friedrichromstedt at gmail.com Tue Feb 16 06:40:56 2021 From: friedrichromstedt at gmail.com (Friedrich Romstedt) Date: Tue, 16 Feb 2021 12:40:56 +0100 Subject: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface In-Reply-To: References: Message-ID: Hi Lev, Am Di., 16. Feb. 2021 um 11:50 Uhr schrieb Lev Maximov : > > I've reproduced the error you've described and got rid of it without valgrind. > Those two lines are enough to avoid the segfault. Okay, good to know, I'll try it! Thanks for looking into it. > But feel free to find it yourself :) Yes :-D Best wishes, Friedrich From jfoxrabinovitz at gmail.com Tue Feb 16 10:49:23 2021 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Tue, 16 Feb 2021 10:49:23 -0500 Subject: [Numpy-discussion] ENH: Proposal to add atleast_nd function In-Reply-To: References: <1001ae35de9d51204170cfb5742a6ffab6e89990.camel@sipsolutions.net> <417beafed3212391571b55dcd10c0e6e4311034e.camel@sipsolutions.net> Message-ID: I'm getting a generally lukewarm not negative response. Should we put it to a vote? - Joe On Fri, Feb 12, 2021, 16:06 Robert Kern wrote: > On Fri, Feb 12, 2021 at 3:42 PM Ralf Gommers > wrote: > >> >> On Fri, Feb 12, 2021 at 9:21 PM Robert Kern >> wrote: >> >>> On Fri, Feb 12, 2021 at 1:47 PM Ralf Gommers >>> wrote: >>> >>>> >>>> On Fri, Feb 12, 2021 at 7:25 PM Sebastian Berg < >>>> sebastian at sipsolutions.net> wrote: >>>> >>>>> >>>>> Right, my initial feeling it that without such context `atleast_3d` is >>>>> pretty surprising. So I wonder if we can design `atleast_nd` in a way >>>>> that it is explicit about this context. >>>>> >>>> >>>> Agreed. I think such a use case is probably too specific to design a >>>> single function for, at least in such a hardcoded way. >>>> >>> >>> That might be an argument for not designing a new one (or at least not >>> giving it such a name). Not sure it's a good argument for removing a >>> long-standing one. 
>>>
>>
>> I agree. I'm not sure deprecating is best. But introducing new
>> functionality where `nd(pos=3) != 3d` is also not great.
>>
>> At the very least, atleast_3d should be better documented. It is also
>> telling that Juan (a long-time scikit-image dev) doesn't like
>> atleast_3d, and there's very little usage of it in scikit-image.
>>
>
> I'm fairly neutral on atleast_nd(). I think that for n=1 and n=2, you
> can derive The One Way to Do It from broadcasting semantics, but for
> n>=3, I'm not sure there's much value in trying to systematize it to a
> single convention. I think that once you get up to those dimensions,
> you start to want domain-specific semantics. I do agree that, in
> retrospect, atleast_3d() probably should have been named more
> specifically. It was of a piece with other conveniences like dstack()
> that did special things to support channel-last images (and implicitly
> treated 3D arrays as such). For example, for DL frameworks that
> assemble channeled images into minibatches (with different conventions
> like BHWC and BCHW), you'd want the n=4 behavior to do different
> things. I _think_ you'd just want to do those with different functions
> rather than with a complicated set of arguments to one function.
>
> --
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sebastian at sipsolutions.net  Tue Feb 16 11:00:12 2021
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 16 Feb 2021 10:00:12 -0600
Subject: [Numpy-discussion] Unreliable crash when converting using
 numpy.asarray via C buffer interface
In-Reply-To: References: Message-ID: <1c350429f25ade94f7b2b5a97aee7e7666c24bd7.camel@sipsolutions.net>

On Tue, 2021-02-16 at 12:40 +0100, Friedrich Romstedt wrote:
> Hi Lev,
>
> Am Di., 16. Feb.
2021 um 11:50 Uhr schrieb Lev Maximov <
> lev.maximov at gmail.com>:
> >
> > I've reproduced the error you've described and got rid of it
> > without valgrind.
> > Those two lines are enough to avoid the segfault.
>
> Okay, good to know, I'll try it! Thanks for looking into it.

Yeah, sorry if I was too fuzzy. Your error was random, and checking
valgrind in that case is often helpful and typically quick (it runs
slowly, but not much preparation is needed). Especially because you
reported it succeeding sometimes, which is where valgrind's
"uninitialized" checks might help, although I guess a `gdb` backtrace
in the crash case might have been just as clear.

With debugging symbols in Python (a full debug build makes sense), it
mentioned "suboffsets" in a function name for me (maybe when a crash
happened). A debug Python will also default to a debug malloc:

https://docs.python.org/3/using/cmdline.html#envvar-PYTHONMALLOC

That would not have been very useful here, but it could be if you
access a Python object after it was freed, for example.

Uninitialized + "suboffsets" seemed fairly clear, but I may have
underestimated that a lot because I recognize "suboffsets" for buffers
immediately.

Cheers,

Sebastian

> > But feel free to find it yourself :)
>
> Yes :-D
>
> Best wishes,
> Friedrich
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL:

From sebastian at sipsolutions.net  Tue Feb 16 18:08:48 2021
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 16 Feb 2021 17:08:48 -0600
Subject: [Numpy-discussion] What to do about structured string dtype and
 string regression?
Message-ID:

Hi all,

In https://github.com/numpy/numpy/issues/18407 it was reported that
there is a regression for `np.array()` and friends in NumPy 1.20 for
code such as:

np.array(["1234"], dtype=("U1", 4))
# NumPy 1.20: array(['1', '1', '1', '1'], dtype='<U1') [...]

>>> np.array(["1234"], dtype="(4)U1,i")
array([(['1', '1', '1', '1'], 1234)],
      dtype=[('f0', '<U1', (4,)), ('f1', '<i4')]) [...]

>>> np.array("1234", dtype="(4)U1,")
# NumPy 1.20: array(['1', '1', '1', '1'], dtype='<U1') [...]

>>> np.array(["12"], dtype=("(2,2)U1,"))
array([[['1', '1'], ['2', '2']]], dtype='<U1') [...]
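The debugging workflow Sebastian outlines earlier in this digest (valgrind for uninitialized reads, a gdb backtrace, and the debug allocator) can be summarized as shell commands. This is a hedged sketch: `crash.py` is a placeholder for the reproducing script, and the suppressions file path assumes a checkout of CPython's source tree.

```shell
# 1. valgrind flags reads of uninitialized memory and use-after-free;
#    it runs slowly, but needs essentially no preparation:
valgrind --tool=memcheck --suppressions=Misc/valgrind-python.supp \
    python3 crash.py

# 2. A gdb backtrace taken at the segfault is often just as telling,
#    especially with a debug build of Python for full symbols:
gdb -batch -ex run -ex bt --args python3 crash.py

# 3. The debug allocator (the PYTHONMALLOC mechanism linked above)
#    also works on a regular, non-debug build:
PYTHONMALLOC=debug python3 crash.py
```

The debug allocator catches writes past buffer ends and access to freed Python objects, which complements valgrind's uninitialized-memory checks for crashes like the buffer-interface one discussed in this thread.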