From ilhanpolat at gmail.com Tue Feb 2 17:59:46 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Tue, 2 Feb 2021 23:59:46 +0100 Subject: [SciPy-Dev] Yes, we don't want any scipy modules BUT Message-ID: Hi everyone, This is an oddball of a subject, so bear with me for a paragraph. Currently we have lots of control-related functions in scipy.signal of varying production grade: some are there almost just as placeholders, some are pretty good. However, many things don't come out of the box, such as MIMO support, internal delay representations, time and Bode plotting (properly spaced and reasonably dense), and so on. Now of course we have the python-control and (shameless plug) harold packages, which can do some of these and fail at others. Frankly, in my particular case scipy is eating all my OSS time, and python-control has its own roadmap. I provide lots of MIMO stuff but lack the academic catalogue functions like root locus and other academic torture tools, while python-control is mostly lacking MIMO support and a bit short on advanced stuff. In the meantime, there is a very nice Fortran library, SLICOT, which also powers some MATLAB functions in production; however, it was not open source. But they moved to GitHub recently and released version 5.7 under BSD3. Previously 5.0 was released under GPL, and that was the one python-control vendored, but 5.7 is already pretty capable and is actually what caused me to write this up. This library is quite diverse and written by very high-caliber researchers. The reason I always avoided it was obviously the GPL, but apparently they changed their mind, which is personally fantastic news for me. So, coming back to the meat of this discussion: I have looked at the LTI parts very closely, and I don't see any way to overhaul them without extremely painful deprecation cycles and breakage. But I sincerely believe that, just as with PocketFFT, scipy can offer better-quality LTI tools.
In its current state it's a bit academic-ish and not production-ready. So this brings us to three concrete options: 1- Status quo: I don't like touching that many funcs and waking the sleeping dogs. 2- Whatever we do, we do it on the current functions: it doesn't matter if it takes 4 years; we don't want any adventures. 3- Make a new module and lighten up the signal module, which was probably not exactly the right place. Please be as blunt as possible, no hard feelings, but I think this discussion has to happen at least once, and maybe once and for all. A tiny bit of it already happened last year in https://github.com/scipy/scipy/pull/4515 but it barely scratched the surface. Cheers, ilhan Current catalogue https://docs.scipy.org/doc/scipy/reference/signal.html python-control vendored version https://github.com/python-control/Slycot New BSD3 version https://github.com/SLICOT/SLICOT-Reference -------------- next part -------------- An HTML attachment was scrubbed... URL: From bussonniermatthias at gmail.com Thu Feb 4 14:40:41 2021 From: bussonniermatthias at gmail.com (Matthias Bussonnier) Date: Thu, 4 Feb 2021 11:40:41 -0800 Subject: [SciPy-Dev] Messages stuck in moderation ? Message-ID: Hello everyone, I am working with a student (Arthur Volant) on his first contribution to SciPy. It looks like his emails are not reaching the mailing list (they do not currently appear in the archive[0]). Does anybody have access to the moderation panel to see if they are stuck in moderation, or if something else is rejecting them? Or is there a problem on mailman which does not display them, but subscribers to the mailing list still received them? Below is attached a copy of the messages sent. Thanks, -- Matthias from: arthurvolant at gmail.com Hello Scipy-dev, I have been working for a couple of days now on the following PR[1]. The origin of this PR is this issue[2], asking to add the Barnard and Boschloo tests, which are two exact statistical tests.
While working on it, I found that Fisher's exact test was already implemented. Barnard and Boschloo are two tests more powerful than Fisher's. My `barnard_exact` implementation is so far working well. It is a little bit slower than Fisher's exact test, but not by much, with an average execution time of 1.12 ms. I was wondering, though, where to put my code. It seems that there are two possible files: either `scipy/stats/_hypotests.py` or `scipy/stats/contingency.py`, which already contains the `chi2_contingency` function. What would you advise me to do? I thank you for your time and answers, Arthur 0: https://mail.python.org/pipermail/scipy-dev/2021-February/date.html#start 1: https://github.com/scipy/scipy/pull/13441 2: https://github.com/scipy/scipy/issues/11014 From stefanv at berkeley.edu Thu Feb 4 19:23:11 2021 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Thu, 04 Feb 2021 16:23:11 -0800 Subject: [SciPy-Dev] ENH: improve RBF interpolation In-Reply-To: References: Message-ID: <274ab45e-a9cc-4710-a40e-65b6f96c0b4a@www.fastmail.com> Hi Trever, On Fri, Jan 29, 2021, at 05:13, Trever Hines wrote: > I would like to contribute code to scipy to address some issues regarding RBF interpolation. The code can be found on my branch _here_ . My contribution consists of two new classes for scattered N-D interpolation: This is fantastic; thank you so much for sharing your expertise on RBFs! > 1. `RBFInterpolator`: This is intended to be a replacement for `Rbf` that addresses issues mentioned in 9904 and 4790 . Namely, the major differences with `Rbf` are 1) the usage is similar to `NearestNDInterpolator` and `LinearNDInterpolator`, making it easier to swap out different interpolation methods, 2) the sign of the smoothing parameter is correct (see page 10 of these lecture notes ), and 3) the interpolant includes polynomial terms. > For some RBF choices (values of 'linear', 'thin_plate', 'cubic', 'quintic', or 'multiquadric'
for `function` in `Rbf`), the additional polynomial terms are needed to ensure that the interpolation problem is well-posed (see theorem 3.2.7 in this document ). Without the additional polynomial terms for these RBFs, I have noticed that some values for the smoothing parameter (with the corrected sign) result in an obviously erroneous interpolant. Even when the chosen RBF does not require additional polynomial terms, they can still improve the quality of the interpolant. In particular, the polynomial terms are able to accommodate shifts or linear trends in the data, which the RBFs tend to struggle with by themselves. Is there any advantage to keeping the old interface, or should this eventually replace Rbf entirely? > 1. `KNearestRBFInterpolator`: This class performs RBF interpolation using only the k nearest data points to each interpolation point (which was suggested in 5180 ). This class is useful when there are too many observations for `RBFInterpolator` (on the order of tens of thousands) and you want an interpolant that *looks* smoother than what you get with `NearestNDInterpolator` or `LinearNDInterpolator`. > My concern with interpolation using the k nearest neighbors is that it is a bit of an ad hoc strategy to work around computational limitations. That being said, I have seen a similar strategy used in the Kriging world (Kriging is a form of RBF interpolation). Superb! I've been trying to do RBF interpolation with N>1000, and the N^2 memory requirement gets you pretty quickly. This makes the algorithm much more pragmatic to apply to, e.g., images. We can always add other picking strategies later on. > I would appreciate your feedback on whether you think these would be valuable contributions to scipy. If so, I will make the pull request after adding benchmarks, unit tests, and more docs. I'd say 100% yes; thank you again. Best regards, Stéfan -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf.gommers at gmail.com Fri Feb 5 07:05:51 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 5 Feb 2021 13:05:51 +0100 Subject: [SciPy-Dev] Messages stuck in moderation ? In-Reply-To: References: Message-ID: On Thu, Feb 4, 2021 at 8:41 PM Matthias Bussonnier < bussonniermatthias at gmail.com> wrote: > Hello everyone, > > I am working with a student (Arthur Volant) on his first contribution to > SciPy, > It looks like his mails are not reaching the mailing list (they do not > currently appear in archive[0]). > > Does anybody have access to the moderation panel to see if they are > stuck in moderation or if something else is rejecting them ? Or is > there a problem on mailman which does not display them, but > subscribers to the mailing list still did receive them ? > Hey Matthias, no one on this list has admin access. Best would be to submit an issue through the email address at the bottom of https://mail.python.org/mailman/listinfo/scipy-dev. > Below is attached a copy of the messages sent. > > Thanks, > -- > Matthias > > > from: arthurvolant at gmail.com > Hello Scipy-dev, > > I have been working for a couple of days now on the following PR[1]. > The origin of this PR is this issue[2], asking to add Barnard and > Boschloo test, which are two exact statistical tests. > While working on it, I found that Fisher's exact test was already > implemented. > Barnard and Boschloo are two tests more powerful than Fisher's one. > My `barnard_exact` implementation is so far working well. It is a > little bit slower than Fisher exact test, but not that much, with an > average execution time of 1.12 ms > Both of these tests seem like a good idea to add. Feedback on the issue was positive, and the papers have ~350 and ~150 citations, respectively. So that should be good enough. > I was wondering though where to put my codes.
It seems that there are > two possible files : > either in `scipy/stats/_hypotests.py` or either in > `scipy/stats/contingency.py` which contains already `chi2_contingency` > function. What would you advise me to do? > The PR as it is, adding it to _hypotests.py, seems fine to me. Thanks for working on this Arthur! Cheers, Ralf > I thank you for your time and answers, > Arthur > > 0: > https://mail.python.org/pipermail/scipy-dev/2021-February/date.html#start > 1: https://github.com/scipy/scipy/pull/13441 > 2: https://github.com/scipy/scipy/issues/11014 > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Feb 6 12:12:53 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Feb 2021 18:12:53 +0100 Subject: [SciPy-Dev] SLICOT, LTI functionality and scipy.signal (WAS: Yes, we don't want any scipy modules BUT) In-Reply-To: References: Message-ID: On Wed, Feb 3, 2021 at 12:00 AM Ilhan Polat wrote: > Hi everyone, > > This is an odd ball of an subject so bear with me for a paragraph. > Currently we have lots of control related functions on scipy.signal with > varying production grade some are there almost just as a placeholder some > are pretty good. However, many things don't come with the box such as MIMO > support, internal delay representations, time and bode plotting (properly > spaced and considerably dense) and so on. Now of course we have > python-control and (shameless plug) harold packages that can do some and > fail to do others. Frankly in my particular case scipy is eating all my OSS > time. And python-control has their own roadmap. 
I provide lots of MIMO > stuff but lacking the academic catalogue functions like root-locus and > other academic torture tools and python-control is mostly lacking MIMO > support and a bit short of advanced stuff. > > In the mean time, there is a very nice Fortran library SLICOT which also > powers some matlab functions in production however it is not open source. > But they moved to GitHub recently and released its earlier version 5.7 > under BSD3. Previously 5.0 was released under GPL and that was the one > python-control vendored but 5.7 is already pretty capable and actually > caused me to write this up. This library is quite diverse and written by > very very high caliber researchers. The reason why I always avoided was > obviously GPL but apparently they changed their mind which is personally > fantastic news for me. > That's very interesting. python-control itself is BSD-3, but the recommended optional dependency slycot is indeed GPL v2. So coming back to the meat of this discussion: I have looked at the LTI > parts and very very closely and I don't see any way to overhaul them > without extremely painful deprecation cycles and breakage. But I sincerely > believe that together with PocketFFT scipy can serve a better quality LTI > tools. In its current state it's a bit academic-ish and not production > ready. So this brings us to three concrete options > I agree. The scipy.signal module is of varying quality, and the LTI parts are indeed not great. > 1- Status quo : I don't like touching that many funcs and waking the > sleeping dogs > 2- Whatever we do we do it on the current functions: It doesn't matter if > it takes 4 years, we don't want any adventures > 3- Make a new module and lighten up the signal module which was probably > not exactly the right place. > > Please make it as blunt as possible, no hard feelings but I think this > discussion has to be done at least once and maybe for all. 
A tiny bit of it > has already happened last year in https://github.com/scipy/scipy/pull/4515 > but it barely grazed. > On reflection, creating a new scipy.control module worries me a little. What Eric Quintero said on gh-4515 is probably true: *"I'm a user and a big fan of python-control, but I don't think I'm quite on board with merging it into scipy. The scope of capabilities that users may expect from a controls package is a little bigger than what I imagine for scipy submodules. I think there is an advantage to be had to being a standalone module that can set its own schedule, deprecation policy, etc."* Control theory is a little specialized/niche compared to most of the topics covered by other SciPy submodules, and the combination of that domain-specific knowledge plus a large amount of Fortran code is not very appealing. I think my ideal outcome here would be that python-control and harold merge, and we recommend that one well-designed/maintained package to users over and above the LTI and filter design functionality in scipy.signal. We can't deprecate scipy.signal, because it's too widely used. But it could have a similar relation to the python-control-harold package as scipy.cluster has to scikit-learn: we offer the basics, and for higher-performance or more state-of-the-art stuff, go elsewhere. Cheers, Ralf > Cheers, > ilhan > > Current catalogue > https://docs.scipy.org/doc/scipy/reference/signal.html > > python-control vendored version > https://github.com/python-control/Slycot > > New BSD3 version > https://github.com/SLICOT/SLICOT-Reference > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ilhanpolat at gmail.com Sat Feb 6 12:42:26 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Sat, 6 Feb 2021 18:42:26 +0100 Subject: [SciPy-Dev] SLICOT, LTI functionality and scipy.signal (WAS: Yes, we don't want any scipy modules BUT) In-Reply-To: References: Message-ID: I think python-control doesn't need harold at all and is in pretty good shape by itself. So we can start pointing to them anyway. However, I don't know if it is just bad luck, but very recent issues like https://github.com/scipy/scipy/issues/13496 https://github.com/scipy/scipy/issues/13498 and many more in the backlog are a bit worrisome to me. But then again, regardless of the issues, I don't mind any of these outcomes from this discussion. On Sat, Feb 6, 2021 at 6:13 PM Ralf Gommers wrote: > > > On Wed, Feb 3, 2021 at 12:00 AM Ilhan Polat wrote: > >> Hi everyone, >> >> This is an odd ball of an subject so bear with me for a paragraph. >> Currently we have lots of control related functions on scipy.signal with >> varying production grade some are there almost just as a placeholder some >> are pretty good. However, many things don't come with the box such as MIMO >> support, internal delay representations, time and bode plotting (properly >> spaced and considerably dense) and so on. Now of course we have >> python-control and (shameless plug) harold packages that can do some and >> fail to do others. Frankly in my particular case scipy is eating all my OSS >> time. And python-control has their own roadmap. I provide lots of MIMO >> stuff but lacking the academic catalogue functions like root-locus and >> other academic torture tools and python-control is mostly lacking MIMO >> support and a bit short of advanced stuff. >> >> In the mean time, there is a very nice Fortran library SLICOT which also >> powers some matlab functions in production however it is not open source. >> But they moved to GitHub recently and released its earlier version 5.7 >> under BSD3.
Previously 5.0 was released under GPL and that was the one >> python-control vendored but 5.7 is already pretty capable and actually >> caused me to write this up. This library is quite diverse and written by >> very very high caliber researchers. The reason why I always avoided was >> obviously GPL but apparently they changed their mind which is personally >> fantastic news for me. >> > > That's very interesting. python-control itself is BSD-3, but the > recommended optional dependency slycot is indeed GPL v2. > > So coming back to the meat of this discussion: I have looked at the LTI >> parts and very very closely and I don't see any way to overhaul them >> without extremely painful deprecation cycles and breakage. But I sincerely >> believe that together with PocketFFT scipy can serve a better quality LTI >> tools. In its current state it's a bit academic-ish and not production >> ready. So this brings us to three concrete options >> > > I agree. The scipy.signal module is of varying quality, and the LTI parts > are indeed not great. > > >> 1- Status quo : I don't like touching that many funcs and waking the >> sleeping dogs >> 2- Whatever we do we do it on the current functions: It doesn't matter if >> it takes 4 years, we don't want any adventures >> 3- Make a new module and lighten up the signal module which was probably >> not exactly the right place. >> >> Please make it as blunt as possible, no hard feelings but I think this >> discussion has to be done at least once and maybe for all. A tiny bit of it >> has already happened last year in >> https://github.com/scipy/scipy/pull/4515 but it barely grazed. >> > > On reflection, creating a new scipy.control module worries me a little. > What Eric Quintero said on gh-4515 is probably true: > > *"I'm a user and a big fan of python-control, but I don't think I'm quite > on board with merging it into scipy. 
The scope of capabilities that users > may expect from a controls package is a little bigger than what I imagine > for scipy submodules. I think there is an advantage to be had to being a > standalone module that can set its own schedule, deprecation policy, etc."* > > Control theory is a little specialized/niche compared to most of the > topics covered by other SciPy submodules, and the combination of that > domain-specific knowledge plus a large amount of Fortran code is not very > appealing. > > I think my ideal outcome here would be that python-control and harold > merge, and we recommend that one well-designed/maintained package to users > over and above the LTI and filter design functionality in scipy.signal. We > can't deprecate scipy.signal, because it's too widely used. But it could > have a similar relation to the python-control-harold package as > scipy.cluster has to scikit-learn: we offer the basics, and for > higher-performance or more state-of-the-art stuff, go elsewhere. > > Cheers, > Ralf > > > >> Cheers, >> ilhan >> >> Current catalogue >> https://docs.scipy.org/doc/scipy/reference/signal.html >> >> python-control vendored version >> https://github.com/python-control/Slycot >> >> New BSD3 version >> https://github.com/SLICOT/SLICOT-Reference >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Feb 7 16:23:04 2021 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 7 Feb 2021 14:23:04 -0700 Subject: [SciPy-Dev] NumPy 1.20.1 released. Message-ID: Hi All, On behalf of the NumPy team I am pleased to announce the release of NumPy 1.20.1. 
NumPy 1.20.1 is a rapid bugfix release fixing several bugs and regressions reported after the 1.20.0 release. The Python versions supported for this release are 3.7-3.9. Wheels can be downloaded from PyPI; source archives, release notes, and wheel hashes are available on GitHub. Linux users will need pip >= 19.3 in order to install manylinux2010 and manylinux2014 wheels. *Highlights* - The distutils bug that caused problems with downstream projects is fixed. - The ``random.shuffle`` regression is fixed. *Contributors* A total of 8 people contributed to this release. People with a "+" by their names contributed a patch for the first time. - Bas van Beek - Charles Harris - Nicholas McKibben + - Pearu Peterson - Ralf Gommers - Sebastian Berg - Tyler Reddy - @Aerysv + *Pull requests merged* A total of 15 pull requests were merged for this release. - gh-18306: MAINT: Add missing placeholder annotations - gh-18310: BUG: Fix typo in ``numpy.__init__.py`` - gh-18326: BUG: don't mutate list of fake libraries while iterating over... - gh-18327: MAINT: gracefully shuffle memoryviews - gh-18328: BUG: Use C linkage for random distributions - gh-18336: CI: fix when GitHub Actions builds trigger, and allow ci skips - gh-18337: BUG: Allow unmodified use of isclose, allclose, etc. with timedelta - gh-18345: BUG: Allow pickling all relevant DType types/classes - gh-18351: BUG: Fix missing signed_char dependency. Closes #18335. - gh-18352: DOC: Change license date 2020 -> 2021 - gh-18353: CI: CircleCI seems to occasionally time out, increase the limit - gh-18354: BUG: Fix f2py bugs when wrapping F90 subroutines. - gh-18356: MAINT: crackfortran regex simplify - gh-18357: BUG: threads.h existence test requires GLIBC > 2.12. - gh-18359: REL: Prepare for the NumPy 1.20.1 release. Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed...
URL: From nicholas.bgp at gmail.com Fri Feb 12 16:50:22 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Fri, 12 Feb 2021 14:50:22 -0700 Subject: [SciPy-Dev] Boost for stats Message-ID: Hi all, Many stats distributions in SciPy have outstanding issues with difficult solutions in legacy code. We've been working on replacing existing statistical distributions with those found in Boost.Math. The initial implementation resolves almost a dozen issues for scipy.stats, with the potential to resolve several more in scipy.stats and scipy.special in future PRs. Initial PR: https://github.com/scipy/scipy/pull/1332 This PR includes the ability to easily add Boost functionality through generated ufuncs. Boost is a large library and would incur the cost of one of the following: - an additional dependency (e.g., boostinator https://github.com/mckib2/boostinator) that outsources the packaging of the Boost libraries - the inclusion of Boost within SciPy, either as a "clone and own" or a submodule The initial PR includes the zipped Boost headers only (~24MB zipped), but adding Boost as a submodule might be a more maintainable approach if changes to Boost need to be made in the future. Inclusion of the entire Boost library is a virtual necessity for the Boost.Math module. Manual attempts to strip away unnecessary files and bcp (Boost's utility to provide stripped-down installations) fail to create smaller sizes. The increase in size would be similar to the following: SciPy master repo: ~177 MB; Boost branch: ~221 MB; built: ~939 MB; built with Boost: ~1090 MB. Wheel size should not be significantly impacted because Boost is used as a header-only library. I have no relationship with the Boost libraries other than as a user and bug reporter. I find them to be impressive and well-maintained, with tremendous support from both industry and open-source developers. SciPy would benefit from the efficient, well-tested and maintained implementations of stats and special algorithms.
Thanks, Nicholas -------------- next part -------------- An HTML attachment was scrubbed... URL: From hans.dembinski at gmail.com Sat Feb 13 08:06:17 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Sat, 13 Feb 2021 14:06:17 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Hi Nicholas, as a Boost developer (I wrote Boost.Histogram and contributed to several other Boost libs), I think it would be great to build SciPy on Boost.Math; it is a win-win. > On 12. Feb 2021, at 22:50, Nicholas McKibben wrote: > > The initial PR includes the zipped Boost headers only (~24MB zipped), but adding Boost as a submodule might be a more maintainable approach if changes to Boost need to be made in the future. Including it as a submodule seems like a good approach. > Inclusion of the entire Boost library is a virtual necessity for the Boost.Math module. Manual attempts to strip away unnecessary files and bcp (Boost's utility to provide stripped down installations) fail to create smaller sizes. I was a bit shocked to hear this, but you are right: https://pdimov.github.io/boostdep-report/master/math.html Math depends on everything. We have a long-term goal to reduce the coupling between Boost libs, but this also incurs costs. Library maintainers then have to copy the relevant bits from other Boost libraries to not depend on them, which is actually a terrible idea: you lose the synergies offered by a rich shared code base. In my view, the coupling is not a bug, it is a feature. It is impressive to see how you use generators to create the binding code in Cython. I had a lot of trouble with Cython, as it does not support all C++ features. The best way to wrap (modern) C++ is pybind11, which is a painless experience. It does the code generation at compile time with TMP.
Best regards, Hans From andrea.cortis at gmail.com Sat Feb 13 11:45:01 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 10:45:01 -0600 Subject: [SciPy-Dev] Hyper-dual numbers Message-ID: Hello, first time here. I was wondering if there are plans for the definition of a 'hyper-dual' type in numpy. I think that would be most useful for neural nets training, and optimization in general. Andrea Sent from my iPad From evgeny.burovskiy at gmail.com Sat Feb 13 13:18:59 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 13 Feb 2021 21:18:59 +0300 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: Hi Andrea, welcome! Since you're asking about numpy, you likely want the numpy-discussion mailing list. (the overlap is non-zero, but nevertheless). Out of curiosity, what's a hyper-dual type? Cheers, Evgeni On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis wrote: > > Hello, first time here. I was wondering if there are plans for the definition of a 'hyper-dual' type in numpy. I think that would be most useful for neural nets training, and optimization in general. > > Andrea > > Sent from my iPad > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev From jfoxrabinovitz at gmail.com Sat Feb 13 13:46:15 2021 From: jfoxrabinovitz at gmail.com (Joseph Fox-Rabinovitz) Date: Sat, 13 Feb 2021 13:46:15 -0500 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: I directed Andrea here from Stack Overflow ( https://stackoverflow.com/q/66179855/2988730). Based on the Wikipedia article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like scipy is a much more likely place to look than numpy. - Joe On Sat, Feb 13, 2021, 13:19 Evgeni Burovski wrote: > Hi Andrea, welcome! > > Since you're asking about numpy, you likely want the numpy-discussion > mailing list.
(the overlap is non-zero, but nevertheless). > > Out of curiosity, what's a hyper-dual type? > > Cheers, > > Evgeni > > On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis > wrote: > > > > Hello, first time here. I was wondering if there are plans for the > definition of a 'hyper-dual' type in numpy. I think that would be most > useful for neural nets training, and optimization in general. > > > > Andrea > > > > Sent from my iPad > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Sat Feb 13 14:20:34 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 13 Feb 2021 22:20:34 +0300 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: Ah, these. Suspected it, but wanted to make sure. IMO, these are best implemented as a numpy dtype. I'm biased though --- here's a branch which makes a start, based on Mike Boyle's version of quaternion dtype (Mike and other authors, if you're reading this --- thanks a ton!) https://github.com/ev-br/quaternion/tree/dual Now, I don't think scipy should carry around additional numpy dtypes. Cannot speak for the numpy project, but I strongly suspect this is best implemented as a separate repository/project. Cheers, Evgeni On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz wrote: > > I directed Andrea here from Stack Overflow (https://stackoverflow.com/q/66179855/2988730). Based on the Wikipedia article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like scipy is a much more likely place to look than numpy. > > - Joe > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski wrote: >> >> Hi Andrea, welcome!
>> >> Since you're asking about numpy, you likely want the numpy-discussion >> mailing list. (the overlap is non-zero, but nevertheless). >> >> Out of curiosity, what's a hyper-dual type? >> >> Cheers, >> >> Evgeni >> >> On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis wrote: >> > >> > Hello, first time here. I was wondering if there are plans for the definition of a 'hyper-dual' type in numpy. I think that would be most useful for neural nets training, and optimization in general. >> > >> > Andrea >> > >> > Sent from my iPad >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at python.org >> > https://mail.python.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev From evgeny.burovskiy at gmail.com Sat Feb 13 14:32:03 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 13 Feb 2021 22:32:03 +0300 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Hi, Borrowing from Boost.Math sounds great indeed. (Great if the Boost devs find it advantageous, too.) There is really no reason to keep using parts of e.g. cdflib which are superseded by Boost.Math. However, playing devil's advocate somewhat: - does the scipy PR need the whole Boost.Math? If it only needs a select subset (e.g., do we need root-finding etc.?), then maybe the size can be reduced. - do we need the whole thing? e.g. ufunc loops only need a select subset of types. - if we do go this route of taking parts / applying scipy-specific patches, what is easier to do or better maintenance-wise: vendor original code + patches, or do the work once by porting relevant parts to a standalone C or C++ subset?
Obviously, all these should be weighed against the other implications of adding a dependency. The immediate concerns are distribution size and build times. Cheers, Evgeni On Sat, Feb 13, 2021 at 4:06 PM Hans Dembinski wrote: > > Hi Nicholas, > > as a Boost developer (I wrote Boost.Histogram and contributed to several other Boost libs), I think it would be great to build SciPy on Boost.Math, it is a win-win. > > > On 12. Feb 2021, at 22:50, Nicholas McKibben wrote: > > > > The initial PR includes the zipped Boost headers only (~24MB zipped), but adding Boost as a submodule might be a more maintainable approach if changes to Boost need to be made in the future. > > Including it as a submodule seems like a good approach. > > > Inclusion of the entire Boost library is a virtual necessity for the Boost.Math module. Manual attempts to strip away unnecessary files and bcp (Boost's utility to provide stripped down installations) fail to create smaller sizes. > > I was a bit shocked to hear this, but you are right: > https://pdimov.github.io/boostdep-report/master/math.html > Math depends on everything. > > We have a long-term goal to reduce the coupling between Boost libs, but this also incurs costs. Library maintainers then have to copy the relevant bits from other Boost libraries to not depend on them, which is actually a terrible idea: you lose the synergies offered by a rich shared code base. In my view, the coupling is not a bug, it is a feature. > > It is impressive to see how you use generators to create the binding code in Cython. I had a lot of trouble with Cython as it does not support all C++ features. The best way to wrap (modern) C++ is pybind11, which is a painless experience. It does the code generation at compile-time with TMP.
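For a sense of what the Boost.Math/cdflib kernels under discussion actually compute: much of this territory reduces to a few scalar special functions, such as the regularized incomplete beta function, which underlies the beta, binomial, Student's t and F distribution CDFs. Below is an illustrative pure-Python sketch using the classic continued-fraction evaluation (a textbook version, not the Boost or SciPy implementation):

```python
from math import exp, lgamma, log


def _betacf(a, b, x, max_iter=200, tol=3e-14):
    """Continued fraction for the incomplete beta (modified Lentz method)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < 1e-30:
        d = 1e-30
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered and odd-numbered continued-fraction coefficients.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            if abs(d) < 1e-30:
                d = 1e-30
            c = 1.0 + aa / c
            if abs(c) < 1e-30:
                c = 1e-30
            d = 1.0 / d
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < tol:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # Prefactor x**a * (1-x)**b / (a * B(a, b)), computed via log-gamma.
    ln_front = (lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    front = exp(ln_front)
    if x < (a + 1.0) / (a + b + 2.0):  # use the fast-converging side
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b
```

For a = b = 1 this reduces to the uniform CDF, so `betainc(1.0, 1.0, 0.3)` is 0.3; production implementations such as Boost.Math add the careful argument-range handling and error policies that make these kernels robust over the whole parameter space, which is exactly what one would be vendoring.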
> > Best regards, > Hans > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev From andrea.cortis at gmail.com Sat Feb 13 14:35:44 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 13:35:44 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: I am no mathematician and cannot comment on the equivalence of quaternions vs hyper-dual numbers, even though they seem quite different to me at a first glance. For sure, I know that the algebra of dual-numbers is different from the algebra of complex numbers. It seems to me therefore that having a numpy `dtype=dual` would be extremely advantageous when looking to construct *exact* values of first (and n-th) derivatives. Say that one would want to replace automatic differentiation for backpropagation neural networks like in this blogpost https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ then we would have a plethora of different implementations, say for pytorch, tensorflow, etc. and no unifying framework. Best, Andre On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski wrote: > Ah, these. Suspected it, but wanted to make sure. > > IMO, these are best implemented as a numpy dtype. I'm biased though > --- here's a branch which makes a start, based on Mike Boyle's version > of quaternion dtype (Mike and other authors, if you're reading this > --- thanks a ton!) > > https://github.com/ev-br/quaternion/tree/dual > > Now, I don't think scipy should carry around additional numpy dtypes. > Cannot speak for the numpy project, but I strongly suspect this is > best implemented as a separate repository/project. > > Cheers, > > Evgeni > > On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz > wrote: > > > > I directed Andrea here from Stack Overflow ( > https://stackoverflow.com/q/66179855/2988730). 
Based on the Wikipedia > article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like > scipy is a much more likely place to look than numpy. > > > > - Joe > > > > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski > wrote: > >> > >> Hi Andrea, welcome! > >> > >> Since you're asking about numpy, you likely want the numpy-discussion > >> mailing list. (the overlap is non-zero, but nevertheless). > >> > >> Out of curiosity, what's a hyper-dual type? > >> > >> Cheers, > >> > >> Evgeni > >> > >> On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis > wrote: > >> > > >> > Hello, first time here. I was wondering if there are plans for the > definition of a ?hyper-dual? type in numpy. I think that would be most > useful for neural nets training, and optimization in general. > >> > > >> > Andrea > >> > > >> > Sent from my iPad > >> > _______________________________________________ > >> > SciPy-Dev mailing list > >> > SciPy-Dev at python.org > >> > https://mail.python.org/mailman/listinfo/scipy-dev > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at python.org > >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.cortis at gmail.com Sat Feb 13 14:46:12 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 13:46:12 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: Anyways, should I then move this conversation to the numpy-dev mailing list? 
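The exact-derivative property of dual numbers discussed above is easy to demonstrate with a toy, pure-Python sketch (the `Dual` class below is hypothetical and purely illustrative; it is not the numpy dtype under discussion):

```python
class Dual:
    """Toy dual number a + b*eps, with the defining rule eps**2 == 0."""

    def __init__(self, re, eps=0.0):
        self.re, self.eps = float(re), float(eps)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps) * (c + d*eps) = a*c + (a*d + b*c)*eps, since eps**2 == 0
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__


def derivative(f, x):
    # Evaluating f at x + 1*eps carries f'(x) in the eps coefficient.
    return f(Dual(x, 1.0)).eps


# f(x) = x**3 + 2*x has f'(2) = 3*4 + 2 = 14, recovered exactly:
assert derivative(lambda x: x * x * x + 2 * x, 2.0) == 14.0
```

Unlike finite differences there is no step size and hence no truncation error; the eps coefficient is f'(x) exactly, up to ordinary floating-point rounding.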
On Sat, Feb 13, 2021 at 1:35 PM Andrea Cortis wrote: > I am no mathematician and cannot comment on the equivalence of quaternions > vs hyper-dual numbers, even though they seem quite different to me at a > first glance. > > For sure, I know that the algebra of dual-numbers is different from the > algebra of complex numbers. > > It seems to me therefore that having a numpy `dtype=dual` would be > extremely advantageous when looking to construct *exact* values of first > (and n-th) derivatives. > Say that one would want to replace automatic differentiation for > backpropagation neural networks like in this blogpost > > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > then we would have a plethora of different implementations, say for > pytorch, tensorflow, etc. and no unifying framework. > > Best, > > Andre > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski < > evgeny.burovskiy at gmail.com> wrote: > >> Ah, these. Suspected it, but wanted to make sure. >> >> IMO, these are best implemented as a numpy dtype. I'm biased though >> --- here's a branch which makes a start, based on Mike Boyle's version >> of quaternion dtype (Mike and other authors, if you're reading this >> --- thanks a ton!) >> >> https://github.com/ev-br/quaternion/tree/dual >> >> Now, I don't think scipy should carry around additional numpy dtypes. >> Cannot speak for the numpy project, but I strongly suspect this is >> best implemented as a separate repository/project. >> >> Cheers, >> >> Evgeni >> >> On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz >> wrote: >> > >> > I directed Andrea here from Stack Overflow ( >> https://stackoverflow.com/q/66179855/2988730). Based on the Wikipedia >> article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like >> scipy is a much more likely place to look than numpy. 
>> > >> > - Joe >> > >> > >> > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski >> wrote: >> >> >> >> Hi Andrea, welcome! >> >> >> >> Since you're asking about numpy, you likely want the numpy-discussion >> >> mailing list. (the overlap is non-zero, but nevertheless). >> >> >> >> Out of curiosity, what's a hyper-dual type? >> >> >> >> Cheers, >> >> >> >> Evgeni >> >> >> >> On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis >> wrote: >> >> > >> >> > Hello, first time here. I was wondering if there are plans for the >> definition of a ?hyper-dual? type in numpy. I think that would be most >> useful for neural nets training, and optimization in general. >> >> > >> >> > Andrea >> >> > >> >> > Sent from my iPad >> >> > _______________________________________________ >> >> > SciPy-Dev mailing list >> >> > SciPy-Dev at python.org >> >> > https://mail.python.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> >> SciPy-Dev mailing list >> >> SciPy-Dev at python.org >> >> https://mail.python.org/mailman/listinfo/scipy-dev >> > >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at python.org >> > https://mail.python.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Sat Feb 13 14:47:34 2021 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 13 Feb 2021 22:47:34 +0300 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: Sure, these are very different. The use of the quaternions package here is only that it implements the machinery. All I did was to change the multiplication table from the quaternion one to the dual number one. 
And it sort of works --- the branch above lets you define arrays with `dtype=dual_number` and do basic arithmetic. Making it usable is reasonably straightforward, but I did not manage to find time since last November. Help's welcome --- even checking out the branch and trying corner cases in arithmetic (there certainly are bugs) would be helpful. On Sat, Feb 13, 2021 at 10:36 PM Andrea Cortis wrote: > > I am no mathematician and cannot comment on the equivalence of quaternions vs hyper-dual numbers, even though they seem quite different to me at a first glance. > > For sure, I know that the algebra of dual-numbers is different from the algebra of complex numbers. > > It seems to me therefore that having a numpy `dtype=dual` would be extremely advantageous when looking to construct *exact* values of first (and n-th) derivatives. > Say that one would want to replace automatic differentiation for backpropagation neural networks like in this blogpost > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > then we would have a plethora of different implementations, say for pytorch, tensorflow, etc. and no unifying framework. > > Best, > > Andre > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski wrote: >> >> Ah, these. Suspected it, but wanted to make sure. >> >> IMO, these are best implemented as a numpy dtype. I'm biased though >> --- here's a branch which makes a start, based on Mike Boyle's version >> of quaternion dtype (Mike and other authors, if you're reading this >> --- thanks a ton!) >> >> https://github.com/ev-br/quaternion/tree/dual >> >> Now, I don't think scipy should carry around additional numpy dtypes. >> Cannot speak for the numpy project, but I strongly suspect this is >> best implemented as a separate repository/project.
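To make the "multiplication table" remark concrete: dual numbers multiply as (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps because eps**2 = 0, and hyper-dual numbers add a second nilpotent unit so that exact second derivatives come out as well. A minimal pure-Python sketch (the `HyperDual` class is hypothetical and for illustration only; the branch above implements the dual-number case as a C-level dtype):

```python
class HyperDual:
    """Toy hyper-dual number f + e1*eps1 + e2*eps2 + e12*eps1*eps2,
    with eps1**2 == eps2**2 == 0 but eps1*eps2 != 0."""

    def __init__(self, f, e1=0.0, e2=0.0, e12=0.0):
        self.f, self.e1, self.e2, self.e12 = (float(f), float(e1),
                                              float(e2), float(e12))

    def __add__(self, other):
        other = other if isinstance(other, HyperDual) else HyperDual(other)
        return HyperDual(self.f + other.f, self.e1 + other.e1,
                         self.e2 + other.e2, self.e12 + other.e12)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, HyperDual) else HyperDual(other)
        # The "multiplication table": all terms with eps1**2 or eps2**2 vanish.
        return HyperDual(
            self.f * other.f,
            self.f * other.e1 + self.e1 * other.f,
            self.f * other.e2 + self.e2 * other.f,
            self.f * other.e12 + self.e1 * other.e2
            + self.e2 * other.e1 + self.e12 * other.f)

    __rmul__ = __mul__


def first_and_second_derivative(f, x):
    # f(x + eps1 + eps2) = f + f'*eps1 + f'*eps2 + f''*eps1*eps2, exactly.
    h = f(HyperDual(x, 1.0, 1.0, 0.0))
    return h.e1, h.e12


# f(x) = x**3 has f'(2) = 12 and f''(2) = 12, both recovered exactly:
assert first_and_second_derivative(lambda x: x * x * x, 2.0) == (12.0, 12.0)
```

Swapping in a different multiplication rule is literally all that separates this from the quaternion case, which is why reusing the quaternion dtype's machinery is such a natural starting point.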
>> >> Cheers, >> >> Evgeni >> >> On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz >> wrote: >> > >> > I directed Andrea here from Stack Overflow (https://stackoverflow.com/q/66179855/2988730). Based on the Wikipedia article (https://en.m.wikipedia.org/wiki/Dual_number), it seems like scipy is a much more likely place to look than numpy. >> > >> > - Joe >> > >> > >> > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski wrote: >> >> >> >> Hi Andrea, welcome! >> >> >> >> Since you're asking about numpy, you likely want the numpy-discussion >> >> mailing list. (the overlap is non-zero, but nevertheless). >> >> >> >> Out of curiosity, what's a hyper-dual type? >> >> >> >> Cheers, >> >> >> >> Evgeni >> >> >> >> On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis wrote: >> >> > >> >> > Hello, first time here. I was wondering if there are plans for the definition of a ?hyper-dual? type in numpy. I think that would be most useful for neural nets training, and optimization in general. >> >> > >> >> > Andrea >> >> > >> >> > Sent from my iPad >> >> > _______________________________________________ >> >> > SciPy-Dev mailing list >> >> > SciPy-Dev at python.org >> >> > https://mail.python.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> >> SciPy-Dev mailing list >> >> SciPy-Dev at python.org >> >> https://mail.python.org/mailman/listinfo/scipy-dev >> > >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at python.org >> > https://mail.python.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev From sebastian at sipsolutions.net Sat Feb 13 15:09:13 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 13 
Feb 2021 14:09:13 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: Message-ID: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> On Sat, 2021-02-13 at 22:47 +0300, Evgeni Burovski wrote: > Sure, these are very different. The use of the quaternions package > here is only that it implements the machinery. All I did was to > change > the multiplication table from the quaternion one to the dual number It looks like someone has already done it: https://github.com/anandpratap/PyHD As to NumPy, I doubt it should be part of NumPy; it seems far from "basic". Should this be in SciPy? My opinion is: maybe, but quite certainly not right now. The long story is that I am doing a pretty big revamp of how NumPy does DTypes [1]. That will remove a lot of quirks and limitations. But those quirks should not be prohibitive for a hyper-dual number dtype (and probably are not, I suspect that project above just works fairly well!). Should SciPy be in the business of providing new dtypes? Honestly, I hope that the answer may be "yes" at least after NumPy has this new API available and it has been proven/smoothed out. If a dtype is useful enough in general SciPy terms of course. At this time, I think it is probably best done in stand-alone packages for a while longer and hope that is not actually a limitation at all. Cheers, Sebastian [1] https://numpy.org/neps/nep-0041-improved-dtype-support.html > one. And it sort of works --- the branch above lets you define arrays > with `dype=dual_number` and do basic arithmetics. Making it usable is > reasonably straightforward, but I did not manage to find time since > last November. > > Help's welcome --- even checking out the branch and trying corner > cases in arithmetics (there certainly are bugs) would be helpful.
> > On Sat, Feb 13, 2021 at 10:36 PM Andrea Cortis < > andrea.cortis at gmail.com> wrote: > > > > I am no mathematician and cannot comment on the equivalence of > > quaternions vs hyper-dual numbers, even though they seem quite > > different to me at a first glance. > > > > For sure, I know that the algebra of dual-numbers is different from > > the algebra of complex numbers. > > > > It seems to me therefore that having a numpy `dtype=dual` would be > > extremely advantageous when looking to construct *exact* values of > > first (and n-th) derivatives. > > Say that one would want to replace automatic differentiation for > > backpropagation neural networks like in this blogpost > > > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > > > then we would have a plethora of different implementations, say for > > pytorch, tensorflow, etc. and no unifying framework. > > > > Best, > > > > Andre > > > > > > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski < > > evgeny.burovskiy at gmail.com> wrote: > > > > > > Ah, these. Suspected it, but wanted to make sure. > > > > > > IMO, these are best implemented as a numpy dtype. I'm biased > > > though > > > --- here's a branch which makes a start, based on Mike Boyle's > > > version > > > of quaternion dtype (Mike and other authors, if you're reading > > > this > > > --- thanks a ton!) > > > > > > https://github.com/ev-br/quaternion/tree/dual > > > > > > Now, I don't think scipy should carry around additional numpy > > > dtypes. > > > Cannot speak for the numpy project, but I strongly suspect this > > > is > > > best implemented as a separate repository/project. > > > > > > Cheers, > > > > > > Evgeni > > > > > > On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz > > > wrote: > > > > > > > > I directed Andrea here from Stack Overflow ( > > > > https://stackoverflow.com/q/66179855/2988730). 
Based on the > > > > Wikipedia article > > > > (https://en.m.wikipedia.org/wiki/Dual_number), it seems like > > > > scipy is a much more likely place to look than numpy. > > > > > > > > - Joe > > > > > > > > > > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski < > > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > > > Hi Andrea, welcome! > > > > > > > > > > Since you're asking about numpy, you likely want the numpy- > > > > > discussion > > > > > mailing list. (the overlap is non-zero, but nevertheless). > > > > > > > > > > Out of curiosity, what's a hyper-dual type? > > > > > > > > > > Cheers, > > > > > > > > > > Evgeni > > > > > > > > > > On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis < > > > > > andrea.cortis at gmail.com> wrote: > > > > > > > > > > > > Hello, first time here. I was wondering if there are plans > > > > > > for the definition of a ?hyper-dual? type in numpy. I think > > > > > > that would be most useful for neural nets training, and? > > > > > > optimization in general. 
> > > > > > > > > > > > > > Andrea > > > > > > > > > > > > > > Sent from my iPad -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From andrea.cortis at gmail.com Sat Feb 13 17:24:58 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 16:24:58 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> References: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> Message-ID: I did try to port PyHD to python3 with 2to3. There are 38 warnings upon compiling (macOS Catalina), but when I try to run I get ----> 1 from hyperdual import numpy_hyperdual ValueError: Failed to register dtype for <class 'hyperdual.hyperdual'>: Legacy user dtypes using `NPY_ITEM_IS_POINTER` or `NPY_ITEM_REFCOUNT` are unsupported.
It is possible to create such a dtype only if it is a structured dtype with names and fields hardcoded at registration time. Please contact the NumPy developers if this used to work but now fails. On Sat, Feb 13, 2021 at 2:09 PM Sebastian Berg wrote: > On Sat, 2021-02-13 at 22:47 +0300, Evgeni Burovski wrote: > > Sure, these are very different. The use of the quaternions package > > here is only that it implements the machinery. All I did was to > > change > > the multiplication table from the quaternion one to the dual number > > It looks like someone has already done it: > https://github.com/anandpratap/PyHD > > As to NumPy, I doubt it should be part of NumPy it seems far from > "basic". Should this be in SciPy? My opinion is: maybe, but quite > certainly not right now. > > The long story is, that I am doing a pretty big revamp of how NumPy > does DTypes [1]. That will remove a lot of quirks and limitations. But > those quirks should not be forbidding for a hyper-dual number dtype > (and probably are not, I suspect that project above just works fairly > well!). > > Should SciPy be in the business of providing new dtypes? Honestly, I > hope that the answer may be "yes" at least after NumPy has this new API > available and it has been proven/smoothened out. If a dtype is useful > enough in general SciPy terms of course. > At this time, I think it is probably best to do in stand-alone packages > for a while longer and hope that is not actually a limitation at all. > > Cheers, > > Sebastian > > > > [1] https://numpy.org/neps/nep-0041-improved-dtype-support.html > > > > > one. And it sort of works --- the branch above lets you define arrays > > with `dype=dual_number` and do basic arithmetics. Making it usable is > > reasonably straightforward, but I did not manage to find time since > > last November. > > > > Help's welcome --- even checking out the branch and trying corner > > cases in arithmetics (there certainly are bugs) would be helpful. 
> > > > On Sat, Feb 13, 2021 at 10:36 PM Andrea Cortis < > > andrea.cortis at gmail.com> wrote: > > > > > > I am no mathematician and cannot comment on the equivalence of > > > quaternions vs hyper-dual numbers, even though they seem quite > > > different to me at a first glance. > > > > > > For sure, I know that the algebra of dual-numbers is different from > > > the algebra of complex numbers. > > > > > > It seems to me therefore that having a numpy `dtype=dual` would be > > > extremely advantageous when looking to construct *exact* values of > > > first (and n-th) derivatives. > > > Say that one would want to replace automatic differentiation for > > > backpropagation neural networks like in this blogpost > > > > > > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > > > > > then we would have a plethora of different implementations, say for > > > pytorch, tensorflow, etc. and no unifying framework. > > > > > > Best, > > > > > > Andre > > > > > > > > > > > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski < > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > Ah, these. Suspected it, but wanted to make sure. > > > > > > > > IMO, these are best implemented as a numpy dtype. I'm biased > > > > though > > > > --- here's a branch which makes a start, based on Mike Boyle's > > > > version > > > > of quaternion dtype (Mike and other authors, if you're reading > > > > this > > > > --- thanks a ton!) > > > > > > > > https://github.com/ev-br/quaternion/tree/dual > > > > > > > > Now, I don't think scipy should carry around additional numpy > > > > dtypes. > > > > Cannot speak for the numpy project, but I strongly suspect this > > > > is > > > > best implemented as a separate repository/project. 
> > > > > > > > Cheers, > > > > > > > > Evgeni > > > > > > > > On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz > > > > wrote: > > > > > > > > > > I directed Andrea here from Stack Overflow ( > > > > > https://stackoverflow.com/q/66179855/2988730). Based on the > > > > > Wikipedia article > > > > > (https://en.m.wikipedia.org/wiki/Dual_number), it seems like > > > > > scipy is a much more likely place to look than numpy. > > > > > > > > > > - Joe > > > > > > > > > > > > > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski < > > > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > > > > > Hi Andrea, welcome! > > > > > > > > > > > > Since you're asking about numpy, you likely want the numpy- > > > > > > discussion > > > > > > mailing list. (the overlap is non-zero, but nevertheless). > > > > > > > > > > > > Out of curiosity, what's a hyper-dual type? > > > > > > > > > > > > Cheers, > > > > > > > > > > > > Evgeni > > > > > > > > > > > > On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis < > > > > > > andrea.cortis at gmail.com> wrote: > > > > > > > > > > > > > > Hello, first time here. I was wondering if there are plans > > > > > > > for the definition of a ?hyper-dual? type in numpy. I think > > > > > > > that would be most useful for neural nets training, and > > > > > > > optimization in general. 
> > > > > > > > > > > > > > Andrea > > > > > > > > > > > > > > Sent from my iPad > > > > > > > _______________________________________________ > > > > > > > SciPy-Dev mailing list > > > > > > > SciPy-Dev at python.org > > > > > > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > _______________________________________________ > > > > > > SciPy-Dev mailing list > > > > > > SciPy-Dev at python.org > > > > > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > > > > > _______________________________________________ > > > > > SciPy-Dev mailing list > > > > > SciPy-Dev at python.org > > > > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > > > > SciPy-Dev mailing list > > > > SciPy-Dev at python.org > > > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > _______________________________________________ > > > SciPy-Dev mailing list > > > SciPy-Dev at python.org > > > https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian at sipsolutions.net Sat Feb 13 17:37:13 2021 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 13 Feb 2021 16:37:13 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> Message-ID: On Sat, 2021-02-13 at 16:24 -0600, Andrea Cortis wrote: > I did try to port PyHD to python3 with 2to3 > There are 38 warnings upon compiling (macos Catalina), but then when > I try > to run I get > > ----> 1 from hyperdual import numpy_hyperdual > > ValueError: Failed to register dtype for <class 'hyperdual.hyperdual'>: > Legacy user dtypes using `NPY_ITEM_IS_POINTER` or `NPY_ITEM_REFCOUNT` > are unsupported. It is possible to create such a dtype only if it is > a > structured dtype with names and fields hardcoded at registration > time. > Please contact the NumPy developers if this used to work but now > fails. > Sounds like a simple bug: the code fails to properly initialize the descriptor flags, which likely just means zeroing them out. (NumPy was really designed around being passed a static version of the struct, although nothing wrong with the design here; it just will never be cleaned up). - Sebastian > > On Sat, Feb 13, 2021 at 2:09 PM Sebastian Berg < > sebastian at sipsolutions.net> > wrote: > > > On Sat, 2021-02-13 at 22:47 +0300, Evgeni Burovski wrote: > > > Sure, these are very different. The use of the quaternions > > > package > > > here is only that it implements the machinery. All I did was to > > > change > > > the multiplication table from the quaternion one to the dual > > > number > > > > It looks like someone has already done it: > > https://github.com/anandpratap/PyHD > > > > As to NumPy, I doubt it should be part of NumPy it seems far from > > "basic". Should this be in SciPy? My opinion is: maybe, but quite > > certainly not right now. > > > > The long story is, that I am doing a pretty big revamp of how NumPy > > does DTypes [1].
That will remove a lot of quirks and limitations. > > But > > those quirks should not be forbidding for a hyper-dual number dtype > > (and probably are not, I suspect that project above just works > > fairly > > well!). > > > > Should SciPy be in the business of providing new dtypes? Honestly, > > I > > hope that the answer may be "yes" at least after NumPy has this new > > API > > available and it has been proven/smoothened out.? If a dtype is > > useful > > enough in general SciPy terms of course. > > At this time, I think it is probably best to do in stand-alone > > packages > > for a while longer and hope that is not actually a limitation at > > all. > > > > Cheers, > > > > Sebastian > > > > > > > > [1] https://numpy.org/neps/nep-0041-improved-dtype-support.html > > > > > > > > > one. And it sort of works --- the branch above lets you define > > > arrays > > > with `dype=dual_number` and do basic arithmetics. Making it > > > usable is > > > reasonably straightforward, but I did not manage to find time > > > since > > > last November. > > > > > > Help's welcome --- even checking out the branch and trying corner > > > cases in arithmetics (there certainly are bugs) would be helpful. > > > > > > On Sat, Feb 13, 2021 at 10:36 PM Andrea Cortis < > > > andrea.cortis at gmail.com> wrote: > > > > > > > > I am no mathematician and cannot comment on the equivalence of > > > > quaternions vs hyper-dual numbers, even though they seem quite > > > > different to me at a first glance. > > > > > > > > For sure, I know that the algebra of dual-numbers is different > > > > from > > > > the algebra of complex numbers. > > > > > > > > It seems to me therefore that having a numpy `dtype=dual` would > > > > be > > > > extremely advantageous when looking to construct *exact* values > > > > of > > > > first (and n-th) derivatives. 
> > > > Say that one would want to replace automatic differentiation > > > > for > > > > backpropagation neural networks like in this blogpost > > > > > > > > > > https://blog.demofox.org/2017/03/13/neural-network-gradients-backpropagation-dual-numbers-finite-differences/ > > > > > > > > then we would have a plethora of different implementations, say > > > > for > > > > pytorch, tensorflow, etc. and no unifying framework. > > > > > > > > Best, > > > > > > > > Andre > > > > > > > > > > > > > > > > > > > > On Sat, Feb 13, 2021 at 1:21 PM Evgeni Burovski < > > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > > > Ah, these. Suspected it, but wanted to make sure. > > > > > > > > > > IMO, these are best implemented as a numpy dtype. I'm biased > > > > > though > > > > > --- here's a branch which makes a start, based on Mike > > > > > Boyle's > > > > > version > > > > > of quaternion dtype (Mike and other authors, if you're > > > > > reading > > > > > this > > > > > --- thanks a ton!) > > > > > > > > > > https://github.com/ev-br/quaternion/tree/dual > > > > > > > > > > Now, I don't think scipy should carry around additional numpy > > > > > dtypes. > > > > > Cannot speak for the numpy project, but I strongly suspect > > > > > this > > > > > is > > > > > best implemented as a separate repository/project. > > > > > > > > > > Cheers, > > > > > > > > > > Evgeni > > > > > > > > > > On Sat, Feb 13, 2021 at 9:46 PM Joseph Fox-Rabinovitz > > > > > wrote: > > > > > > > > > > > > I directed Andrea here from Stack Overflow ( > > > > > > https://stackoverflow.com/q/66179855/2988730). Based on the > > > > > > Wikipedia article > > > > > > (https://en.m.wikipedia.org/wiki/Dual_number), it seems > > > > > > like > > > > > > scipy is a much more likely place to look than numpy. 
> > > > > > > > > > > > - Joe > > > > > > > > > > > > > > > > > > On Sat, Feb 13, 2021, 13:19 Evgeni Burovski < > > > > > > evgeny.burovskiy at gmail.com> wrote: > > > > > > > > > > > > > > Hi Andrea, welcome! > > > > > > > > > > > > > > Since you're asking about numpy, you likely want the > > > > > > > numpy- > > > > > > > discussion > > > > > > > mailing list. (the overlap is non-zero, but > > > > > > > nevertheless). > > > > > > > > > > > > > > Out of curiosity, what's a hyper-dual type? > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > Evgeni > > > > > > > > > > > > > > On Sat, Feb 13, 2021 at 7:45 PM Andrea Cortis < > > > > > > > andrea.cortis at gmail.com> wrote: > > > > > > > > > > > > > > > > Hello, first time here. I was wondering if there are > > > > > > > > plans > > > > > > > > for the definition of a "hyper-dual" type in numpy. I > > > > > > > > think > > > > > > > > that would be most useful for neural nets training, and > > > > > > > > optimization in general. 
> > > > > > > > > Andrea > > > > > > > > > Sent from my iPad > > > > > > > > > _______________________________________________ > > > > > > > > > SciPy-Dev mailing list > > > > > > > > > SciPy-Dev at python.org > > > > > > > > > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From andrea.cortis at gmail.com Sat Feb 13 17:52:07 2021 From: andrea.cortis at gmail.com (Andrea Cortis) Date: Sat, 13 Feb 2021 16:52:07 -0600 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> Message-ID: I will report it as such to the author On Sat, Feb 13, 2021 at 4:37 PM Sebastian Berg wrote: > On Sat, 2021-02-13 at 16:24 -0600, Andrea Cortis wrote: > > I did try to port PyHD to python3 with 2to3 > > There are 38 warnings upon compiling (macos Catalina), but then when > > I try > > to run I get > > > > ----> 1 from hyperdual import numpy_hyperdual > > > > ValueError: Failed to register dtype for <class 'hyperdual.hyperdual'>: > > Legacy user dtypes using `NPY_ITEM_IS_POINTER` or `NPY_ITEM_REFCOUNT` > > are unsupported. It is possible to create such a dtype only if it is > > a > > structured dtype with names and fields hardcoded at registration > > time. > > Please contact the NumPy developers if this used to work but now > > fails. > > > > Sounds like a simple bug, the code fails to properly initialize the > descriptor flags, which likely just means zero'ing them out. > (NumPy was really designed with being passed a static version of the > struct, although nothing wrong with the design here -- it just will > never be cleaned up). > > - Sebastian > > > > > > On Sat, Feb 13, 2021 at 2:09 PM Sebastian Berg < > > sebastian at sipsolutions.net> > > wrote: > > > > > On Sat, 2021-02-13 at 22:47 +0300, Evgeni Burovski wrote: > > > > Sure, these are very different. The use of the quaternions > > > > package > > > > here is only that it implements the machinery. 
All I did was to change the multiplication table from the quaternion one to the dual number one. > > > > [...] > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From andyfaff at gmail.com Sat Feb 13 20:17:05 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Sun, 14 Feb 2021 12:17:05 +1100 Subject: [SciPy-Dev] Hyper-dual numbers In-Reply-To: References: <999035c87362532e946b4288aaef7639d7441316.camel@sipsolutions.net> Message-ID: IIUC The Jax package uses dual numbers to a great extent, particularly for autodifferentiation. Autodifferentiation is great for providing gradients and Jacobians in optimisation. On Sun, 14 Feb 2021, 09:53 Andrea Cortis, wrote: > I will report it as such to the author > > [...] > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > 
https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tyler.je.reddy at gmail.com Sun Feb 14 09:17:19 2021 From: tyler.je.reddy at gmail.com (Tyler Reddy) Date: Sun, 14 Feb 2021 07:17:19 -0700 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Probably good to make sure that aarch64 build times remain relatively stable since I think we're already doing quite a few gymnastics on that platform for wheel builds (and we're quite limited on iterations with Travis CI resources there). On Sat, 13 Feb 2021 at 12:32, Evgeni Burovski wrote: > Hi, > > Borrowing from Boost.Math sounds great indeed. (Great if it seems > advantageous by boost devs, too). > There is really no reason to keep using parts of e.g. cdflib which are > superseded by Boost.Math. > > However, playing devil's advocate somewhat: > > - does the scipy PR need the whole Boost.Math? If it only needs a > select subset (e.g., do we need root-finding etc?), then maybe the > size can be reduced. > - do we need the whole thing? e.g. ufunc loops only need a select > subset of types. > - if we do go this route of taking parts / applying scipy specific > patches, what is easier to do or better maintenance-wise: vendor > original code + patches, or do the work once by porting relevant parts > to standalone C or C++ subset? > > Obviously, all these should be weighted with other implications of > adding a dependency. The immediate concerns are distribution size and > build times. > > Cheers, > > Evgeni > > On Sat, Feb 13, 2021 at 4:06 PM Hans Dembinski > wrote: > > > > Hi Nicholas, > > > > as a Boost developer (I wrote Boost.Histogram and contributed to several > other Boost libs), I think it would be great to build SciPy on Boost.Math, > it is a win-win. > > > > > On 12. 
Feb 2021, at 22:50, Nicholas McKibben > wrote: > > > > The initial PR includes the zipped Boost headers only (~24MB zipped), > but adding Boost as a submodule might be a more maintainable approach if > changes to Boost need to be made in the future. > > Including it as a submodule seems like a good approach. > > > Inclusion of the entire Boost library is a virtual necessity for the > Boost.Math module. Manual attempts to strip away unnecessary files and bcp > (Boost's utility to provide stripped down installations) fail to create > smaller sizes. > > I was a bit shocked to hear this, but you are right: > > https://pdimov.github.io/boostdep-report/master/math.html > > Math depends on everything. > > We have a long-term goal to reduce the coupling between Boost libs, but > this also incurs costs. Library maintainers then have to copy the relevant > bits from other Boost libraries to not depend on them, which is actually a > terrible idea: you lose the synergies offered by a rich shared code base. > In my view, the coupling is not a bug, it is a feature. > > It is impressive to see how you use generators to create the binding > code in Cython. I had a lot of trouble with Cython as it does not support > all C++ features. The best way to wrap (modern) C++ is pybind11, which is a > painless experience. It does the code generation at compile-time with TMP. > > Best regards, > > Hans > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Sun Feb 14 18:01:27 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 15 Feb 2021 00:01:27 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy Message-ID: Hi all, It's GSoC time soon! I'd like to participate again under the PSF umbrella this year. The GSoC project durations have been halved, which means they're also more manageable to mentor. They're 175 hours now. For more details, see https://summerofcode.withgoogle.com/. Also worth mentioning is that NumPy decided not to participate, but a few people on the NumPy team are interested in helping mentor on a SciPy project. We've also been discussing an initiative around mentoring of people new to open source from under-represented backgrounds with the NumPy team, and came to the conclusion that it'd be better to use SciPy project ideas and help mentor them. So if you have ideas that don't exactly fit the size or timeline for GSoC, please still propose them since they may be suitable for that initiative. We have a few more weeks till the deadline to sign up, so it'd be great to get some ideas and potential mentors onto this ideas page: https://github.com/scipy/scipy/wiki/GSoC-2021-project-ideas. Any thoughts, good project ideas, or volunteers to mentor? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Sun Feb 14 19:47:14 2021 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sun, 14 Feb 2021 19:47:14 -0500 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Fri, Feb 12, 2021 at 4:50 PM Nicholas McKibben wrote: > Hi all, > > Many stats distributions in SciPy have outstanding issues with difficult > solutions in legacy code. We've been working on replacing existing > statistical distributions with those found in Boost.Math. 
The initial > implementation resolves almost a dozen issues for scipy.stats with > potential for resolving several more in scipy.stats and scipy.special in > future PRs. > > Initial PR: https://github.com/scipy/scipy/pull/1332 > > This PR includes the ability to easily add Boost functionality through > generated ufuncs. > > Boost is a large library and would incur the cost of one of the following: > - an additional dependency (e.g. boostinator > https://github.com/mckib2/boostinator) that outsources the packaging of > the Boost libraries > - the inclusion of Boost within SciPy either as a "clone and own" or > submodule > > The initial PR includes the zipped Boost headers only (~24MB zipped), but > adding Boost as a submodule might be a more maintainable approach if > changes to Boost need to be made in the future. > > Inclusion of the entire Boost library is a virtual necessity for the > Boost.Math module. Manual attempts to strip away unnecessary files and bcp > (Boost's utility to provide stripped down installations) fail to create > smaller sizes. The increase in size would be similar to the following: > > SciPy master repo ~177 MB > Boost branch: ~221 MB > > Built: ~939 MB > Built With Boost: ~1090 MB > > Wheel size should not be significantly impacted because Boost is used as a > header-only library. > > I have no relationship with the Boost libraries other than as a user and > bug reporter. I find them to be impressive and well-maintained with > tremendous support from both industry and open source developers. SciPy > would benefit from the efficient, well-tested and maintained > implementations of stats and special algorithms. > > Thanks, > Nicholas > Thanks Nick, this is a long overdue enhancement for SciPy. For years, we've been fixing bugs in `stats` and `special` for functions that have high quality, thoroughly tested and license-compatible implementations in Boost. Indeed, we often check our results against Boost. 
I think it will be worth the effort of working out the interface issues and the packaging issues to allow SciPy to take advantage of the excellent code in Boost. The pull request focuses on using Boost functions to improve the implementations of some of the probability distributions in `stats`, but in the long term, we should take advantage of Boost as much as we can. The next obvious module is `special`, where it looks like we can close a bunch of issues by replacing the existing implementation of a function with the Boost version (e.g. `erfinv` [gh-12758], `betaincinv` [gh-12796], `jn_zeros` [gh-4690]). There are many other parts of Boost that would be great to have available in SciPy. Here are a few (based on a browse through the Boost docs; I don't know if wrapping any of these would have insurmountable technical obstacles): * ODE solvers; in particular, Boost has symplectic solvers, which are not currently available in SciPy (gh-12690). * Interpolators: it looks like Boost has a few interpolators that are not currently in SciPy. * The Boost histogram library might provide some benefits over the existing NumPy and SciPy options. (Hans Dembinski, the author of the histogram library, has already commented in this email thread.) Warren _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Mon Feb 15 02:26:16 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Mon, 15 Feb 2021 18:26:16 +1100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Mon, 15 Feb 2021 at 11:48, Warren Weckesser wrote: > > Thanks Nick, this is a long overdue enhancement for SciPy. > It seems that these libraries make maintenance a lot simpler and close a lot of issues. My questions would be: - how portable is the boost code in general? 
- how easy is it to install the library. In concept it doesn't seem all that dissimilar to the effort that needs to be done with installing a BLAS library when building from source. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hans.dembinski at gmail.com Mon Feb 15 07:14:45 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Mon, 15 Feb 2021 13:14:45 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: > On 15. Feb 2021, at 08:26, Andrew Nelson wrote: > > My questions would be: > > - how portable is the boost code in general? It is very portable. The core goal of Boost is to offer implementations with quality and portability on par with the C++ standard library implementations. Non-portable extensions are sometimes used to speed up things, but there is always a standard compliant vanilla version. In practice, maintainers test portability with CI on Windows, OSX, Linux, using various versions of gcc, clang, msvc, intel, see e.g. https://github.com/boostorg/math/blob/develop/.github/workflows/ci.yml and the Boost build farm from the days before free CI for OSS was easily available, https://www.boost.org/development/tests/master/developer/move.html Not all compilers/platforms are fully compliant, of course. Boost uses workarounds to combat that and submits bug reports on the compiler bug trackers. > - how easy is it to install the library. As Nicholas mentioned, Boost.Math (and Boost.Histogram) is header-only, so it is sufficient to include the headers. Best regards, Hans From ndbecker2 at gmail.com Mon Feb 15 07:23:38 2021 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 15 Feb 2021 07:23:38 -0500 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: I have been using pybind11 (and its predecessor, boost::python) to package c++ code for python use for many years, including some of boost libraries. 
pybind11 is easy to use and is much better than e.g., cython for packaging c++ code. pybind11 is also header-only. I would also like to call attention for anyone interested in scientific software and c++ to a wonderful library (header-only), xtensor https://xtensor.readthedocs.io/en/latest/ On Mon, Feb 15, 2021 at 7:15 AM Hans Dembinski wrote: > > > > On 15. Feb 2021, at 08:26, Andrew Nelson wrote: > > > > My questions would be: > > > > - how portable is the boost code in general? > > It is very portable. The core goal of Boost is to offer implementations with quality and portability on par with the C++ standard library implementations. Non-portable extensions are sometimes used to speed up things, but there is always a standard compliant vanilla version. In practice, maintainers test portability with CI on Windows, OSX, Linux, using various versions of gcc, clang, msvc, intel, see e.g. > > https://github.com/boostorg/math/blob/develop/.github/workflows/ci.yml > > and the Boost build farm from the days before free CI for OSS was easily available, > > https://www.boost.org/development/tests/master/developer/move.html > > Not all compilers/platforms are fully compliant, of course. Boost uses workarounds to combat that and submits bug reports on the compiler bug trackers. > > > - how easy is it to install the library. > > As Nicholas mentioned, Boost.Math (and Boost.Histogram) is header-only, so it is sufficient to include the headers. > > Best regards, > Hans > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -- Those who don't understand recursion are doomed to repeat it From hans.dembinski at gmail.com Mon Feb 15 07:35:25 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Mon, 15 Feb 2021 13:35:25 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> > On 15. 
Feb 2021, at 01:47, Warren Weckesser wrote: > > * The Boost histogram library might provide some benefits over the > existing NumPy and SciPy options. (Hans Dembinski, the author > of the histrogram library, has already commented in this email > thread.) I would happily support this. We currently offer a Python front-end to Boost.Histogram https://github.com/scikit-hep/boost-histogram which includes a numpy.histogram compatible interface. Switching to Boost.Histogram may offer performance benefits, see https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html Compared to np.histogram we saw a 1.7 times increase - single threaded, more if multiple threads are used. Compared to np.histogram2d we saw a 11 times increase. These numbers should probably be checked more carefully before decisions are made. Boost.Histogram offers generalised histograms with arbitrary accumulators per cell, so it could also replace the implementations of https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html and friends. Best regards, Hans From ralf.gommers at gmail.com Mon Feb 15 07:41:51 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 15 Feb 2021 13:41:51 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> References: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> Message-ID: On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski wrote: > > > On 15. Feb 2021, at 01:47, Warren Weckesser > wrote: > > > > * The Boost histogram library might provide some benefits over the > > existing NumPy and SciPy options. (Hans Dembinski, the author > > of the histrogram library, has already commented in this email > > thread.) > > I would happily support this. We currently offer a Python front-end to > Boost.Histogram > https://github.com/scikit-hep/boost-histogram > which includes a numpy.histogram compatible interface. 
> > Switching to Boost.Histogram may offer performance benefits, see > > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > > Compared to np.histogram we saw a 1.7 times increase - single threaded, > more if multiple threads are used. Compared to np.histogram2d we saw a 11 > times increase. These numbers should probably be checked more carefully > before decisions are made. > > Boost.Histogram offers generalised histograms with arbitrary accumulators > per cell, so it could also replace the implementations of > https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html > and friends. > That would be really nice. binned_statistic is currently pure Python, and can be a performance hotspot (I've seen multiple cases of that in dealing with image and geospatial data). Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Feb 15 07:47:48 2021 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 15 Feb 2021 07:47:48 -0500 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> Message-ID: One thing I've missed with the current scipy histogram is the ability to do 'online' or 'incremental' collection of the histogram data. For this reason I have written my own histogram code. I am often collecting data from monte-carlo simulations and want to accumulate stats from data that arrives in batches. I don't know if boost-histogram supports this but if so I would find this very welcome. On Mon, Feb 15, 2021 at 7:42 AM Ralf Gommers wrote: > > > > On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski wrote: >> >> >> > On 15. Feb 2021, at 01:47, Warren Weckesser wrote: >> > >> > * The Boost histogram library might provide some benefits over the >> > existing NumPy and SciPy options. (Hans Dembinski, the author >> > of the histrogram library, has already commented in this email >> > thread.) 
>> >> I would happily support this. We currently offer a Python front-end to Boost.Histogram >> https://github.com/scikit-hep/boost-histogram >> which includes a numpy.histogram compatible interface. >> >> Switching to Boost.Histogram may offer performance benefits, see >> https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html >> >> Compared to np.histogram we saw a 1.7 times increase - single threaded, more if multiple threads are used. Compared to np.histogram2d we saw a 11 times increase. These numbers should probably be checked more carefully before decisions are made. >> >> Boost.Histogram offers generalised histograms with arbitrary accumulators per cell, so it could also replace the implementations of https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html and friends. > > > That would be really nice. binned_statistic is currently pure Python, and can be a performance hotspot (I've seen multiple cases of that in dealing with image and geospatial data). > > Cheers, > Ralf > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -- Those who don't understand recursion are doomed to repeat it From hans.dembinski at gmail.com Mon Feb 15 08:03:54 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Mon, 15 Feb 2021 14:03:54 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1@gmail.com> Message-ID: <2290861A-F5A6-46A1-8397-213EA0AD5F4E@gmail.com> > On 15. Feb 2021, at 13:47, Neal Becker wrote: > > One thing I've missed with the current scipy histogram is the ability > to do 'online' or 'incremental' collection of the histogram data. For > this reason I have written my own histogram code. I am often > collecting data from monte-carlo simulations and want to accumulate > stats from data that arrives in batches. 
> I don't know if boost-histogram supports this but if so I would find > this very welcome. I think the answer is yes, if I understood you correctly. Boost.Histogram has an object-oriented design: the histogram is an object that one can fill incrementally with input arrays. I personally like the functional paradigm behind np.histogram and friends, but it is not as efficient for incremental collection. When numpy.histogram is used, one has to generate a temporary array with the intermediate results, which are then added to the main array. The object-oriented approach avoids this. In my field (high energy and astroparticle physics), incremental filling is also the default. We typically have large amounts of data that we want to convert into histograms, so the codes typically fill some histograms incrementally. The performance issue of numpy.histogram could also be fixed by adding an "out" keyword to numpy.histogram, to allow the user to pass the array which is filled. Best regards, Hans From roy.pamphile at gmail.com Mon Feb 15 12:24:24 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Mon, 15 Feb 2021 18:24:24 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy Message-ID: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Hi, Thank you for putting this together! I would have some ideas for the idea pool :) scipy.optimize: Would it be wanted to have a possibility to have workers to evaluate the function during an optimization? In most industrial context, the function is not trivial and might require minutes if not hours or even days to compute. Having a simple way to first parallelise the runs would help. We have machines with easily ten cores now and it would be great to leverage this here. Going that direction, having a more general infrastructure to handle external workers would be great. Sure there are external packages to do this, but then it's not so trivial if you want to use SciPy's optimizers.
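The `workers` idea raised here can be sketched in a few lines: the keyword accepts either a worker count or any map-like callable, so the caller stays in charge of how evaluations are dispatched. This is a hypothetical stdlib-only illustration — the name `evaluate_population` and the thread-pool backend are assumptions, not SciPy's actual implementation:

```python
from multiprocessing.pool import ThreadPool

def evaluate_population(func, population, workers=1):
    """Evaluate ``func`` over a batch of candidate points.

    ``workers`` may be 1 (serial), an int (size of a built-in thread
    pool, with -1 meaning "use all cores"), or any map-like callable.
    """
    if callable(workers):
        # User-supplied map-like, e.g. a cluster scheduler's map.
        return list(workers(func, population))
    if workers == 1:
        # Serial fallback.
        return [func(x) for x in population]
    # Simple built-in parallelism; None lets the pool pick cpu_count().
    with ThreadPool(None if workers == -1 else workers) as pool:
        return pool.map(func, population)

def expensive(x):
    # Stand-in for a costly model run.
    return (x - 3.0) ** 2

print(evaluate_population(expensive, [1.0, 2.0, 3.0], workers=2))
# prints [4.0, 1.0, 0.0]
```

A `multiprocessing.Pool(...).map` or a cluster scheduler's map function plugs in through the same `workers` argument, with no change on the caller's side.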
scipy.optimize: What about another optimization method such as EGO? This would require a Gaussian process regressor. scipy.stats: there is an ANOVA section in the roadmap. But is sensitivity analysis in general something that would be of interest? I am thinking about Sobol' indices (not related to the Sobol' sequence but from the same author), moment-based indices, Shapley values, cusunoro, etc. scipy.metamodel: last but not least, a metamodel/response surface module. This is linked to the optimization or sensitivity analysis of expensive functions. It would be sufficient to have Gaussian process and polynomial chaos expansion. Could also include more general things like linear regression or other things in scipy.interpolate. Cheers, Pamphile @tupui -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Feb 15 14:01:44 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 15 Feb 2021 20:01:44 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: On Mon, Feb 15, 2021 at 6:24 PM Pamphile Roy wrote: > Hi, > > Thank you for putting this together! > > I would have some ideas for the idea pool :) > Thanks Pamphile! > *scipy.optimize:* Would it be wanted to have a possibility to have > workers to evaluate the function during an optimization? > In most industrial context, the function is not trivial and might require > minutes if not hours or even days to compute. > Having a simple way to first parallelise the runs would help. We have > machines with easily ten cores now and it would be great to leverage this > here. > Definitely - see the mention of workers under http://scipy.github.io/devdocs/roadmap.html#performance-improvements. Going that direction, having a more general infrastructure to handle > external workers would be great.
> I'm assuming you mean something like standard multiprocessing, or using a custom Pool object, for code that's trivially parallelizable. Both are covered by the `workers` pattern. If you're thinking about something else, can you elaborate? Sure there are external packages to do this, but then it's not so trivial > if you want to use SciPy's optimizers. > > *scipy.optimize:* What about another optimization method such as EGO? > This would require a Gaussian process regressor. > In general we'd like to continue adding high-quality optimization methods if they bring something extra - see https://mail.python.org/pipermail/scipy-dev/2021-January/024489.html. Not sure about EGO in particular (I'm not familiar with it), Gaussian processes sound a little out of scope - that's scikit-learn territory probably. > *scipy.stats:* there is an ANOVA section in the roadmap. But is > sensitivity analysis in general something that would be of interest? > I am thinking about Sobol' indices (not related to the Sobol' sequence but > from the same author), moment-based indices, Shapley values, cusunoro, etc. > I'm not 100% sure, let's see if someone more familiar with this topic has an opinion. In general for new stats functionality we try to figure out if it fits better in scipy.stats or in statsmodels. The latter doesn't have much either right now, only: https://www.statsmodels.org/stable/generated/statsmodels.genmod.generalized_estimating_equations.GEEResults.sensitivity_params.html > *scipy.metamodel:* last but not least, a metamodel/response surface > module. This is linked to the optimization or sensitivity analysis of > expensive > functions. It would be sufficient to have Gaussian process and polynomial > chaos expansion. Could also include more general things like linear > regression or > other things in scipy.interpolate. > That is out of scope I'd say, too specific for a new submodule - at the very least it should start as a separate package first.
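For the "something else" case — evaluations handed off to external workers rather than an in-process pool — a plain stdlib queue is often enough, with any scheduler sitting on the other side. A hypothetical stdlib-only sketch, not an actual SciPy interface (threads stand in for remote workers, and the quadratic stands in for an expensive model):

```python
import queue
import threading

def worker(jobs, results):
    # Drain jobs until a None sentinel tells this worker to shut down.
    while True:
        item = jobs.get()
        if item is None:
            break
        i, x = item
        # Stand-in for an expensive model evaluation.
        results.put((i, (x - 3.0) ** 2))

jobs, results = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=worker, args=(jobs, results))
           for _ in range(2)]
for t in threads:
    t.start()
for i, x in enumerate([1.0, 2.0, 3.0]):
    jobs.put((i, x))        # an optimizer would enqueue candidate points
for _ in threads:
    jobs.put(None)          # one sentinel per worker
for t in threads:
    t.join()

values = [None] * 3
while not results.empty():  # reassemble results in submission order
    i, y = results.get()
    values[i] = y
print(values)               # prints [4.0, 1.0, 0.0]
```

The same queue could be drained by processes on another machine instead of threads, which is why nothing heavier than the standard library is needed on the SciPy side.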
Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Mon Feb 15 15:51:37 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Mon, 15 Feb 2021 21:51:37 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy Message-ID: <91E68BFC-5DEA-4BBC-9BE3-A9CC44D00907@gmail.com> > I'm assuming you mean something like standard multiprocessing, or using a > custom Pool object, for code that's trivially parallelizable. Both are > covered by the `workers` pattern. If you're thinking about something else, > can you elaborate? Yes the first step would be to do some simple multiprocessing. But IMO we should still try to have something flexible enough so that we (or the user) could plug in something else, like a job scheduler. I would not see any scheduling per se in SciPy (totally out of scope and too many options: bash, slurm, kubernetes world, etc.), but being able to access a queue would be nice. This way you could wrap anything to go take a job from a queue and return a result. This would need some experiments, but I think that we could achieve something like this just with some simple queue from the std lib. Because I am not sure we would want to introduce, even optionally, a dependency on something like RabbitMQ. (Or maybe?). > Not sure about EGO in particular (I'm not familiar with it), Gaussian > processes sound a little out of scope - that's scikit-learn territory > probably. True. I am using scikit-optimize for that currently. It just seemed EGO was getting more and more traction, and so I thought that maybe having this in SciPy could be justified. But definitely, other famous algorithms from the list you posted would be good candidates. > I'm not 100% sure, let's see if someone more familiar with this topic has > an opinion. In general for new stats functionality we try to figure out if > it fits better in scipy.stats or in statsmodels.
The latter doesn't have > much either right now, only: > https://www.statsmodels.org/stable/generated/statsmodels.genmod.generalized_estimating_equations.GEEResults.sensitivity_params.html I posted the suggestion to the statsmodels mailing list several times but so far did not get a clear answer. Hopefully we will have more luck here :) Agreed for the response surface part. It would be a big undertaking. Cheers, Pamphile @tupui -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Feb 15 16:19:08 2021 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 15 Feb 2021 16:19:08 -0500 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: On Mon, Feb 15, 2021 at 2:02 PM Ralf Gommers wrote: > > On Mon, Feb 15, 2021 at 6:24 PM Pamphile Roy > wrote: > >> *scipy.optimize:* Would it be wanted to have a possibility to have >> workers to evaluate the function during an optimization? >> In most industrial context, the function is not trivial and might require >> minutes if not hours or even days to compute. >> Having a simple way to first parallelise the runs would help. We have >> machines with easily ten cores now and it would be great to leverage this >> here. >> > > Definitely - see the mention of workers under > http://scipy.github.io/devdocs/roadmap.html#performance-improvements. > > Going that direction, having a more general infrastructure to handle >> external workers would be great. >> > > I'm assuming you mean something like standard multiprocessing, or using a > custom Pool object, for code that's trivially parallelizable. Both are > covered by the `workers` pattern. If you're thinking about something else, > can you elaborate? > A standard approach for this is to organize the implementation of the optimization algorithms in what's usually called an "ask-tell" interface.
The minimize()-style interface is easy to implement from an ask-tell interface, but not vice-versa. Basically, you have the optimizer object expose two methods, ask(), which returns a next point to evaluate, and tell(), where you feed back the point and its evaluated function value. You're in charge of evaluating that function. This gives you a lot of flexibility in how to dispatch that function evaluation, and importantly, we don't have to commit to any dependencies! That's the user's job! scikit-optimize implements their optimizers in this style, for example. It's pretty common for optimizers that are geared towards expensive evaluations. https://scikit-optimize.github.io/stable/auto_examples/ask-and-tell.html I think it might be a well-scoped GSoC project to start re-implementing a chosen handful of the algorithms in scipy.optimize in such an interface. It could even be a trial run as an external package (even in scikit-optimize, if they're amenable). Then we can evaluate whether we want to adopt that framework inside scipy.optimize and make a roadmap for re-implementing all of the algorithms in that style. It will be a technical challenge to adapt the FORTRAN-implemented algorithms to such an interface. I will not be available to mentor such a project, but that's the general approach that I would recommend. I think it would be a valuable addition. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcasero at gmail.com Mon Feb 15 17:01:53 2021 From: rcasero at gmail.com (=?UTF-8?B?UmFtw7NuIENhc2VybyBDYcOxYXM=?=) Date: Mon, 15 Feb 2021 22:01:53 +0000 Subject: [SciPy-Dev] Pull request test cancelled Message-ID: Hi, i submitted a pull request to speed up hdquantile_sd ( https://github.com/scipy/scipy/pull/13566). I think all tests passed, except one that was cancelled, but I don't think that has to do with my changes? azure-pipelines/ scipy.scipy Build log #L650 The operation was canceled. Kind regards, Ramon. 
-- Dr Ram?n Casero, CSci, Post-Doctoral Researcher MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit Harwell Campus, Oxfordshire, OX11 0RD, UK. Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Mon Feb 15 17:20:51 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Mon, 15 Feb 2021 23:20:51 +0100 Subject: [SciPy-Dev] Pull request test cancelled In-Reply-To: References: Message-ID: After 60 minutes, the job times out. I've restarted it. On Mon, Feb 15, 2021 at 11:02 PM Ram?n Casero Ca?as wrote: > Hi, > > i submitted a pull request to speed up hdquantile_sd ( > https://github.com/scipy/scipy/pull/13566). > > I think all tests passed, except one that was cancelled, but I don't think > that has to do with my changes? > > azure-pipelines/ scipy.scipy > > Build log #L650 > > The operation was canceled. > > > Kind regards, > > Ramon. > > -- > Dr Ram?n Casero, CSci, Post-Doctoral Researcher > > > MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit > Harwell Campus, Oxfordshire, OX11 0RD, UK. > Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From grlee77 at gmail.com Mon Feb 15 17:24:24 2021 From: grlee77 at gmail.com (Gregory Lee) Date: Mon, 15 Feb 2021 17:24:24 -0500 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Sun, Feb 14, 2021 at 7:47 PM Warren Weckesser wrote: > > > On Fri, Feb 12, 2021 at 4:50 PM Nicholas McKibben > wrote: > >> Hi all, >> >> Many stats distributions in SciPy have outstanding issues with difficult >> solutions in legacy code. We've been working on replacing existing >> statistical distributions with those found in Boost.Math. 
The initial >> implementation resolves almost a dozen issues for scipy.stats with >> potential for resolving several more in scipy.stats and scipy.special in >> future PRs. >> >> Initial PR: https://github.com/scipy/scipy/pull/1332 >> >> This PR includes the ability to easily add Boost functionality through >> generated ufuncs. >> >> Boost is a large library and would incur the cost of one of the following: >> - an additional dependency (e.g. boostinator >> https://github.com/mckib2/boostinator) that outsources the packaging of >> the Boost libraries >> - the inclusion of Boost within SciPy either as a "clone and own" or >> submodule >> >> The initial PR includes the zipped Boost headers only (~24MB zipped), but >> adding Boost as a submodule might be a more maintainable approach if >> changes to Boost need to be made in the future. >> >> Inclusion of the entire Boost library is a virtual necessity for the >> Boost.Math module. Manual attempts to strip away unnecessary files and bcp >> (Boost's utility to provide stripped-down installations) fail to create >> smaller sizes. The increase in size would be similar to the following: >> >> SciPy master repo ~177 MB >> Boost branch: ~221 MB >> >> Built: ~939 MB >> Built With Boost: ~1090 MB >> >> Wheel size should not be significantly impacted because Boost is used as >> a header-only library. >> >> I have no relationship with the Boost libraries other than as a user and >> bug reporter. I find them to be impressive and well-maintained with >> tremendous support from both industry and open source developers. SciPy >> would benefit from the efficient, well-tested and maintained >> implementations of stats and special algorithms. >> >> Thanks, >> Nicholas >> > > > Thanks Nick, this is a long overdue enhancement for SciPy. > > For years, we've been fixing bugs in `stats` and `special` for > functions that have high quality, thoroughly tested and license-compatible implementations in Boost.
Indeed, we often check our > results against Boost. I think it will be worth the effort of > working out the interface issues and the packaging issues to allow > SciPy to take advantage of the excellent code in Boost. > > The pull request focuses on using Boost functions to improve the > implementations of some of probability distributions in `stats`, > but in the long term, we should take advantage of Boost as much as > we can. The next obvious module is `special`, where it looks like we > can close a bunch of issues by replacing the existing implementation > of a function with the Boost version (e.g. `erfinv` [gh-12758], > `betaincinv` [gh-12796], `jn_zeros` [gh-4690]). > > There are many other parts of Boost that would be great to have > available in SciPy. Here are few (based on a browse through the > Boost docs; I don't know if wrapping any of these would have > insurmountable technical obstacles): > > * ODE solvers; in particular, Boost has symplectic solvers, which > are not currently available in SciPy (gh-12690). > * Interpolators: it looks like Boost has a few interpolators > that are not currently in SciPy. > * The Boost histogram library might provide some benefits over the > existing NumPy and SciPy options. (Hans Dembinski, the author > of the histrogram library, has already commented in this email > thread.) > > Warren > > I think the Boost Graph Library is also of substantial interest and could be used in SciPy. We would likely use max-flow algorithms downstream in scikit-image for things like image segmentation and phase unwrapping if they were made available. Discussion in https://github.com/scikit-image/scikit-image/issues/4832 seems to indicate that the scipy.sparse.csgraph.maximum_flow algorithm is fairly slow, and a faster replacement would be welcome. 
- Greg > > _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcasero at gmail.com Mon Feb 15 18:38:19 2021 From: rcasero at gmail.com (=?UTF-8?B?UmFtw7NuIENhc2VybyBDYcOxYXM=?=) Date: Mon, 15 Feb 2021 23:38:19 +0000 Subject: [SciPy-Dev] Pull request test cancelled In-Reply-To: References: Message-ID: Thanks, checks have passed now. Ramon. On Mon, 15 Feb 2021 at 22:21, Ilhan Polat wrote: > After 60 minutes, the job times out. I've restarted it. > > On Mon, Feb 15, 2021 at 11:02 PM Ram?n Casero Ca?as > wrote: > >> Hi, >> >> i submitted a pull request to speed up hdquantile_sd ( >> https://github.com/scipy/scipy/pull/13566). >> >> I think all tests passed, except one that was cancelled, but I don't >> think that has to do with my changes? >> >> azure-pipelines/ scipy.scipy >> >> Build log #L650 >> >> The operation was canceled. >> >> >> Kind regards, >> >> Ramon. >> >> -- >> Dr Ram?n Casero, CSci, Post-Doctoral Researcher >> >> >> MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit >> Harwell Campus, Oxfordshire, OX11 0RD, UK. >> Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -- Dr Ram?n Casero, CSci, Post-Doctoral Researcher MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit Harwell Campus, Oxfordshire, OX11 0RD, UK. 
Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Feb 16 09:02:06 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 16 Feb 2021 15:02:06 +0100 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: On Mon, Feb 15, 2021 at 10:19 PM Robert Kern wrote: > On Mon, Feb 15, 2021 at 2:02 PM Ralf Gommers > wrote: > >> >> On Mon, Feb 15, 2021 at 6:24 PM Pamphile Roy >> wrote: >> >>> *scipy.optimize:* Would it be wanted to have a possibility to have >>> workers to evaluate the function during an optimization? >>> In most industrial context, the function is not trivial and might >>> require minutes if not hours or even days to compute. >>> Having a simple way to first parallelise the runs would help. We have >>> machines with easily ten cores now and it would be great to leverage this >>> here. >>> >> >> Definitely - see the mention of workers under >> http://scipy.github.io/devdocs/roadmap.html#performance-improvements. >> >> Going that direction, having a more general infrastructure to handle >>> external workers would be great. >>> >> >> I'm assuming you mean something like standard multiprocessing, or using a >> custom Pool object, for code that's trivially parallelizable. Both are >> covered by the `workers` pattern. If you're thinking about something else, >> can you elaborate? >> > > A standard approach for this is to organize the implementation of the > optimization algorithms in what's usually called an "ask-tell" interface. > The minimize()-style interface is easy to implement from an ask-tell > interface, but not vice-versa. Basically, you have the optimizer object > expose two methods, ask(), which returns a next point to evaluate, and > tell(), where you feed back the point and its evaluated function value. > You're in charge of evaluating that function. 
This gives you a lot of > flexibility in how to dispatch that function evaluation, and importantly, > we don't have to commit to any dependencies! That's the user's job! > > scikit-optimize implements their optimizers in this style, for example. > It's pretty common for optimizers that are geared towards expensive > evaluations. > > https://scikit-optimize.github.io/stable/auto_examples/ask-and-tell.html > > I think it might be a well-scoped GSoC project to start re-implementing a > chosen handful of the algorithms in scipy.optimize in such an interface. It > could even be a trial run as an external package (even in scikit-optimize, > if they're amenable). Then we can evaluate whether we want to adopt that > framework inside scipy.optimize and make a roadmap for re-implementing all > of the algorithms in that style. It will be a technical challenge to adapt > the FORTRAN-implemented algorithms to such an interface. > > I will not be available to mentor such a project, but that's the general > approach that I would recommend. I think it would be a valuable addition. > Thanks Robert, that seems like an interesting exercise. This reminds me of the "class based optimizers" proposal. That didn't mention ask-tell, but the "reverse-communication" may be the same idea: https://github.com/scipy/scipy/pull/8552 https://mail.python.org/pipermail/scipy-dev/2018-February/022449.html Your comments and the scikit-optimize link are imho a better justification for doing this exercise than we had before. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From warren.weckesser at gmail.com Tue Feb 16 12:01:57 2021 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Tue, 16 Feb 2021 12:01:57 -0500 Subject: [SciPy-Dev] New probability distributions in stats: the noncentral hypergeometric distributions Message-ID: Hey all, Matt Haberland's pull request with implementations of the Fisher and Wallenius noncentral hypergeometric distributions (built on Nicholas McKibben's wrappers of Agner Fog's biasedurn code) looks ready to merge: https://github.com/scipy/scipy/pull/13330. It would be nice to get some more eyes on it. The last issue that we've been discussing is the names of the distributions. The current names, `fnch` and `wnch`, are quite concise, and some longer alternatives are being considered in the PR. Comments on that and on the PR in general would be helpful. Thanks! Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcasero at gmail.com Tue Feb 16 12:48:51 2021 From: rcasero at gmail.com (=?UTF-8?B?UmFtw7NuIENhc2VybyBDYcOxYXM=?=) Date: Tue, 16 Feb 2021 17:48:51 +0000 Subject: [SciPy-Dev] Pull request test cancelled In-Reply-To: References: Message-ID: Good evening, The scipy.scipy test timed out again after 60 min for https://github.com/scipy/scipy/pull/13566. Could somebody restart it, please? Kind regards, Ramon. On Mon, 15 Feb 2021 at 22:00, Ram?n Casero Ca?as wrote: > Hi, > > i submitted a pull request to speed up hdquantile_sd ( > https://github.com/scipy/scipy/pull/13566). > > I think all tests passed, except one that was cancelled, but I don't think > that has to do with my changes? > > azure-pipelines/ scipy.scipy > > Build log #L650 > > The operation was canceled. > > > Kind regards, > > Ramon. > > -- > Dr Ram?n Casero, CSci, Post-Doctoral Researcher > > > MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit > Harwell Campus, Oxfordshire, OX11 0RD, UK. 
> Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk > -- Dr Ramón Casero, CSci, Post-Doctoral Researcher MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit Harwell Campus, Oxfordshire, OX11 0RD, UK. Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From rcasero at gmail.com Tue Feb 16 13:55:20 2021 From: rcasero at gmail.com (=?UTF-8?B?UmFtw7NuIENhc2VybyBDYcOxYXM=?=) Date: Tue, 16 Feb 2021 18:55:20 +0000 Subject: [SciPy-Dev] Pull request test cancelled In-Reply-To: References: Message-ID: Thanks to whoever did it. Now, all tests have passed. Kind regards, Ramon. On Tue, 16 Feb 2021 at 17:48, Ramón Casero Cañas wrote: > Good evening, > > The scipy.scipy test timed out again after 60 min for > https://github.com/scipy/scipy/pull/13566. Could somebody restart it, > please? > > Kind regards, > > Ramon. > > > On Mon, 15 Feb 2021 at 22:00, Ramón Casero Cañas > wrote: > >> Hi, >> >> I submitted a pull request to speed up hdquantile_sd ( >> https://github.com/scipy/scipy/pull/13566). >> >> I think all tests passed, except one that was cancelled, but I don't >> think that has to do with my changes. >> >> azure-pipelines/ scipy.scipy >> >> Build log #L650 >> >> The operation was canceled. >> >> >> Kind regards, >> >> Ramon. >> >> -- >> Dr Ramón Casero, CSci, Post-Doctoral Researcher >> >> >> MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit >> Harwell Campus, Oxfordshire, OX11 0RD, UK. >> Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk >> > > > -- > Dr Ramón Casero, CSci, Post-Doctoral Researcher > > > MRC Harwell Institute, Biocomputing, Mammalian Genetics Unit > Harwell Campus, Oxfordshire, OX11 0RD, UK.
Tel: +44 (0)1235 841237 | www.har.mrc.ac.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Wed Feb 17 13:58:50 2021 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 17 Feb 2021 13:58:50 -0500 Subject: [SciPy-Dev] Implementation of the Alexander-Govern test Message-ID: Hey all, The pull request https://github.com/scipy/scipy/pull/12873 to add the Alexander-Govern test (https://www.jstor.org/stable/1165140), a test currently in the detailed roadmap for the stats module, looks ready to merge. I'd like to merge on Friday, so be sure to comment in the PR if you have any issues with the implementation. Thanks! Warren From samwallan at icloud.com Wed Feb 17 21:09:22 2021 From: samwallan at icloud.com (Sam Wallan) Date: Wed, 17 Feb 2021 18:09:22 -0800 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Hello, I've been working on a spreadsheet that compares Boost and SciPy. I looked at statistical distributions, special functions, and ODE solvers. Here's the google sheets link: https://docs.google.com/spreadsheets/d/1zVaau6k1_0yQNW107D81RVCirWEN8sXwcYaWj2g8UNY/edit?usp=sharing I've left it on suggestion mode with that sharing link, so if anyone has any thoughts please feel free to leave a comment. It looks like Boost may have a lot to add! Regards, Sam > On Feb 15, 2021, at 4:48 AM, scipy-dev-request at python.org wrote: > > Send SciPy-Dev mailing list submissions to > scipy-dev at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/scipy-dev > or, via email, send a message with subject or body 'help' to > scipy-dev-request at python.org > > You can reach the person managing the list at > scipy-dev-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of SciPy-Dev digest..." > > > Today's Topics: > > 1. Re: Boost for stats (Neal Becker) > 2.
Re: Boost for stats (Hans Dembinski) > 3. Re: Boost for stats (Ralf Gommers) > 4. Re: Boost for stats (Neal Becker) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 15 Feb 2021 07:23:38 -0500 > From: Neal Becker > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Boost for stats > Message-ID: > > Content-Type: text/plain; charset="UTF-8" > > I have been using (and its predecessor before it, > boost::python) to package c++ code for python use for many years, > including some of boost libraries. > pybind11 is easy to use and is much better than e.g., cython for > packaging c++ code. pybind11 is also header-only. > > I would also like to call attention for anyone interested in > scientific software and c++ to a wonderful library (header-only), > xtensor > https://xtensor.readthedocs.io/en/latest/ > > On Mon, Feb 15, 2021 at 7:15 AM Hans Dembinski wrote: >> >> >>> On 15. Feb 2021, at 08:26, Andrew Nelson wrote: >>> >>> My questions would be: >>> >>> - how portable is the boost code in general? >> >> It is very portable. The core goal of Boost is to offer implementations with quality and portability on par with the C++ standard library implementations. Non-portable extensions are sometimes used to speed up things, but there is always a standard compliant vanilla version. In practice, maintainers test portability with CI on Windows, OSX, Linux, using various versions of gcc, clang, msvc, intel, see e.g. >> >> https://github.com/boostorg/math/blob/develop/.github/workflows/ci.yml >> >> and the Boost build farm from the days before free CI for OSS was easily available, >> >> https://www.boost.org/development/tests/master/developer/move.html >> >> Not all compilers/platforms are fully compliant, of course. Boost uses workarounds to combat that and submits bug reports on the compiler bug trackers. >> >>> - how easy is it to install the library.
>> >> As Nicholas mentioned, Boost.Math (and Boost.Histogram) is header-only, so it is sufficient to include the headers. >> >> Best regards, >> Hans >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > -- > Those who don't understand recursion are doomed to repeat it > > > ------------------------------ > > Message: 2 > Date: Mon, 15 Feb 2021 13:35:25 +0100 > From: Hans Dembinski > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Boost for stats > Message-ID: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1 at gmail.com> > Content-Type: text/plain; charset=us-ascii > > >> On 15. Feb 2021, at 01:47, Warren Weckesser wrote: >> >> * The Boost histogram library might provide some benefits over the >> existing NumPy and SciPy options. (Hans Dembinski, the author >> of the histogram library, has already commented in this email >> thread.) > > I would happily support this. We currently offer a Python front-end to Boost.Histogram > https://github.com/scikit-hep/boost-histogram > which includes a numpy.histogram compatible interface. > > Switching to Boost.Histogram may offer performance benefits, see > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > > Compared to np.histogram we saw a 1.7 times increase - single threaded, more if multiple threads are used. Compared to np.histogram2d we saw an 11 times increase. These numbers should probably be checked more carefully before decisions are made. > > Boost.Histogram offers generalised histograms with arbitrary accumulators per cell, so it could also replace the implementations of https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html and friends.
> > Best regards, > Hans > > ------------------------------ > > Message: 3 > Date: Mon, 15 Feb 2021 13:41:51 +0100 > From: Ralf Gommers > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Boost for stats > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski > wrote: > >> >>> On 15. Feb 2021, at 01:47, Warren Weckesser >> wrote: >>> >>> * The Boost histogram library might provide some benefits over the >>> existing NumPy and SciPy options. (Hans Dembinski, the author >>> of the histrogram library, has already commented in this email >>> thread.) >> >> I would happily support this. We currently offer a Python front-end to >> Boost.Histogram >> https://github.com/scikit-hep/boost-histogram >> which includes a numpy.histogram compatible interface. >> >> Switching to Boost.Histogram may offer performance benefits, see >> >> https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html >> >> Compared to np.histogram we saw a 1.7 times increase - single threaded, >> more if multiple threads are used. Compared to np.histogram2d we saw a 11 >> times increase. These numbers should probably be checked more carefully >> before decisions are made. >> >> Boost.Histogram offers generalised histograms with arbitrary accumulators >> per cell, so it could also replace the implementations of >> https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html >> and friends. >> > > That would be really nice. binned_statistic is currently pure Python, and > can be a performance hotspot (I've seen multiple cases of that in dealing > with image and geospatial data). > > Cheers, > Ralf > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: > > ------------------------------ > > Message: 4 > Date: Mon, 15 Feb 2021 07:47:48 -0500 > From: Neal Becker > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Boost for stats > Message-ID: > > Content-Type: text/plain; charset="UTF-8" > > One thing I've missed with the current scipy histogram is the ability > to do 'online' or 'incremental' collection of the histogram data. For > this reason I have written my own histogram code. I am often > collecting data from monte-carlo simulations and want to accumulate > stats from data that arrives in batches. > I don't know if boost-histogram supports this but if so I would find > this very welcome. > > On Mon, Feb 15, 2021 at 7:42 AM Ralf Gommers wrote: >> >> >> >> On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski wrote: >>> >>> >>>> On 15. Feb 2021, at 01:47, Warren Weckesser wrote: >>>> >>>> * The Boost histogram library might provide some benefits over the >>>> existing NumPy and SciPy options. (Hans Dembinski, the author >>>> of the histrogram library, has already commented in this email >>>> thread.) >>> >>> I would happily support this. We currently offer a Python front-end to Boost.Histogram >>> https://github.com/scikit-hep/boost-histogram >>> which includes a numpy.histogram compatible interface. >>> >>> Switching to Boost.Histogram may offer performance benefits, see >>> https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html >>> >>> Compared to np.histogram we saw a 1.7 times increase - single threaded, more if multiple threads are used. Compared to np.histogram2d we saw a 11 times increase. These numbers should probably be checked more carefully before decisions are made. >>> >>> Boost.Histogram offers generalised histograms with arbitrary accumulators per cell, so it could also replace the implementations of https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html and friends. >> >> >> That would be really nice. 
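On Neal's "online"/"incremental" accumulation point above: with fixed bin edges this can already be emulated in plain NumPy by summing per-batch counts (and boost-histogram's fill(), to my understanding, can be called repeatedly to the same effect). A minimal sketch:

```python
import numpy as np

# Incremental ("online") histogramming: fix the bin edges up front,
# then accumulate counts batch by batch as data arrives.
edges = np.linspace(0.0, 1.0, 11)               # 10 bins on [0, 1]
counts = np.zeros(len(edges) - 1, dtype=np.int64)

rng = np.random.default_rng(42)
batches = [rng.random(1000) for _ in range(5)]  # e.g. Monte Carlo batches
for batch in batches:
    c, _ = np.histogram(batch, bins=edges)
    counts += c

# The accumulated result matches histogramming all the data in one shot.
one_shot, _ = np.histogram(np.concatenate(batches), bins=edges)
assert np.array_equal(counts, one_shot)
```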
binned_statistic is currently pure Python, and can be a performance hotspot (I've seen multiple cases of that in dealing with image and geospatial data). >> >> Cheers, >> Ralf >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > -- > Those who don't understand recursion are doomed to repeat it > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > > > ------------------------------ > > End of SciPy-Dev Digest, Vol 208, Issue 13 > ****************************************** From robert.kern at gmail.com Wed Feb 17 21:39:00 2021 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Feb 2021 21:39:00 -0500 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: On Tue, Feb 16, 2021 at 9:03 AM Ralf Gommers wrote: > > On Mon, Feb 15, 2021 at 10:19 PM Robert Kern > wrote: > >> On Mon, Feb 15, 2021 at 2:02 PM Ralf Gommers >> wrote: >> >>> >>> On Mon, Feb 15, 2021 at 6:24 PM Pamphile Roy >>> wrote: >>> >>>> *scipy.optimize:* Would it be wanted to have a possibility to have >>>> workers to evaluate the function during an optimization? >>>> In most industrial context, the function is not trivial and might >>>> require minutes if not hours or even days to compute. >>>> Having a simple way to first parallelise the runs would help. We have >>>> machines with easily ten cores now and it would be great to leverage this >>>> here. >>>> >>> >>> Definitely - see the mention of workers under >>> http://scipy.github.io/devdocs/roadmap.html#performance-improvements. >>> >>> Going that direction, having a more general infrastructure to handle >>>> external workers would be great. 
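For the parallel-evaluation request above, a rough sketch of a `workers`-style hook (the helper name here is made up for illustration): the caller passes either a map-like callable or -1, and the optimizer only ever calls that hook to evaluate batches of points. Some scipy.optimize functions, such as differential_evolution, already accept a `workers` keyword along these lines.

```python
from multiprocessing.dummy import Pool  # stdlib thread pool

def evaluate_population(func, points, workers=map):
    # `workers` is either a map-like callable (e.g. some Pool's .map)
    # or -1, meaning "use a default pool".
    if workers == -1:
        with Pool() as pool:
            return list(pool.map(func, points))
    return list(workers(func, points))

def cost(x):
    return (x - 3) ** 2

print(evaluate_population(cost, [1, 2, 3, 4], workers=-1))  # -> [4, 1, 0, 1]
```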
>>>> >>> >>> I'm assuming you mean something like standard multiprocessing, or using >>> a custom Pool object, for code that's trivially parallelizable. Both are >>> covered by the `workers` pattern. If you're thinking about something else, >>> can you elaborate? >>> >> >> A standard approach for this is to organize the implementation of the >> optimization algorithms in what's usually called an "ask-tell" interface. >> The minimize()-style interface is easy to implement from an ask-tell >> interface, but not vice-versa. Basically, you have the optimizer object >> expose two methods, ask(), which returns a next point to evaluate, and >> tell(), where you feed back the point and its evaluated function value. >> You're in charge of evaluating that function. This gives you a lot of >> flexibility in how to dispatch that function evaluation, and importantly, >> we don't have to commit to any dependencies! That's the user's job! >> >> scikit-optimize implements their optimizers in this style, for example. >> It's pretty common for optimizers that are geared towards expensive >> evaluations. >> >> >> https://scikit-optimize.github.io/stable/auto_examples/ask-and-tell.html >> >> I think it might be a well-scoped GSoC project to start re-implementing a >> chosen handful of the algorithms in scipy.optimize in such an interface. It >> could even be a trial run as an external package (even in scikit-optimize, >> if they're amenable). Then we can evaluate whether we want to adopt that >> framework inside scipy.optimize and make a roadmap for re-implementing all >> of the algorithms in that style. It will be a technical challenge to adapt >> the FORTRAN-implemented algorithms to such an interface. >> >> I will not be available to mentor such a project, but that's the general >> approach that I would recommend. I think it would be a valuable addition. >> > > Thanks Robert, that seems like an interesting exercise. This reminds me of > the "class based optimizers" proposal. 
That didn't mention ask-tell, but > the "reverse-communication" may be the same idea: > https://github.com/scipy/scipy/pull/8552 > https://mail.python.org/pipermail/scipy-dev/2018-February/022449.html > > Your comments and the scikit-optimize link are imho a better justification > for doing this exercise than we had before. > Yes, "reverse communication" is a FORTRAN-era term for the same general idea. In FORTRAN reverse communication APIs, you would generally call the one optimizer subroutine over and over again, passing in the current state and function evaluation and reading the next point to evaluate (and other state) from "intent out" variables. "ask-tell" is a somewhat more specific instance of that idea, just in an OO context for which it's just a fairly obvious design pattern once you have chosen to go OO and free yourself from the constraints of a FORTRAN subroutine. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholas.bgp at gmail.com Wed Feb 17 21:44:53 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Wed, 17 Feb 2021 19:44:53 -0700 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Hi all, Responding to some comments I've seen fly by on this thread in no particular order: > However, playing devil's advocate somewhat: > - does the scipy PR need the whole Boost.Math? If it only needs a select subset (e.g., do we need root-finding etc?), then maybe the size can be reduced. As Hans mentioned, the Boost.Math depends on the whole of Boost, so not without a lot of pain of detangling code and losing the ability to easily bring in upstream updates. > - do we need the whole thing? e.g. ufunc loops only need a select subset of types. Virtually all Boost functions (and certainly the ones we're dealing with in the stats distributions) are templated. 
The ufunc generators I've written specialize the templates to create all the types we need for the ufuncs (single, double, and long double precision, specifically). float16 could be done in principle by unpacking to floats in the ufunc loop function, but no other distribution considers float16 so I didn't either. > - if we do go this route of taking parts / applying scipy specific patches, what is easier to do or better maintenance-wise: vendor original code + patches, or do the work once by porting relevant parts to standalone C or C++ subset? It sounds like the preferred option (taking the discussion here and in the PR) is to include Boost as a submodule (which precludes SciPy specific patches, incidentally) and track specific tagged commits or commits with bug fixes as necessary. The problem with porting to C is that we lose the typing extensibility and the easy pulling of upstream bug fixes. Existing C ports of Boost functions could (should?) be moved to use Boost-proper to reduce maintenance burden. > pybind11 The only reason I didn't consider pybind11 is that I've never used it before and could get it done with Cython. The only troublesome C++ features I ran into were non-type template parameters, but there are easy workarounds for this. If anyone would like to patch my PR to use pybind11, please do! > Probably good to make sure that aarch64 build times remain relatively stable Good point! Can this be checked via a PR to scipy-wheels? Best, Nicholas On Wed, Feb 17, 2021 at 7:09 PM Sam Wallan wrote: > Hello, > > I've been working on a spreadsheet that compares Boost and SciPy. I looked > at statistical distributions, special functions, and ODE solvers. Here's > the google sheets link: > > > https://docs.google.com/spreadsheets/d/1zVaau6k1_0yQNW107D81RVCirWEN8sXwcYaWj2g8UNY/edit?usp=sharing > > I've left it on suggestion mode with that sharing link, so if anyone has > any thoughts please feel free to leave a comment.
It looks like Boost may > have a lot to add! > > Regards, > > Sam > > > > > On Feb 15, 2021, at 4:48 AM, scipy-dev-request at python.org wrote: > > > > Send SciPy-Dev mailing list submissions to > > scipy-dev at python.org > > > > To subscribe or unsubscribe via the World Wide Web, visit > > https://mail.python.org/mailman/listinfo/scipy-dev > > or, via email, send a message with subject or body 'help' to > > scipy-dev-request at python.org > > > > You can reach the person managing the list at > > scipy-dev-owner at python.org > > > > When replying, please edit your Subject line so it is more specific > > than "Re: Contents of SciPy-Dev digest..." > > > > > > Today's Topics: > > > > 1. Re: Boost for stats (Neal Becker) > > 2. Re: Boost for stats (Hans Dembinski) > > 3. Re: Boost for stats (Ralf Gommers) > > 4. Re: Boost for stats (Neal Becker) > > > > > > ---------------------------------------------------------------------- > > > > Message: 1 > > Date: Mon, 15 Feb 2021 07:23:38 -0500 > > From: Neal Becker > > To: SciPy Developers List > > Subject: Re: [SciPy-Dev] Boost for stats > > Message-ID: > > Esvi1unvYO0+DNnv8RxGKryuTS+jBUg at mail.gmail.com> > > Content-Type: text/plain; charset="UTF-8" > > > > I have been using (and it's predecessor before it, > > boost::python) to package c++ code for python use for many years, > > including some of boost libraries. > > pybind11 is easy to use and is much better than e.g., cython for > > packaging c++ code. pybind11 is also header-only. > > > > I would also like to call attention for anyone interested in > > scientific software and c++ to a wonderful library (header-only), > > xtensor > > https://xtensor.readthedocs.io/en/latest/ > > > > On Mon, Feb 15, 2021 at 7:15 AM Hans Dembinski > wrote: > >> > >> > >>> On 15. Feb 2021, at 08:26, Andrew Nelson wrote: > >>> > >>> My questions would be: > >>> > >>> - how portable is the boost code in general? > >> > >> It is very portable. 
The core goal of Boost is to offer implementations > with quality and portability on par with the C++ standard library > implementations. Non-portable extensions are sometimes used to speed up > things, but there is always a standard compliant vanilla version. In > practice, maintainers test portability with CI on Windows, OSX, Linux, > using various versions of gcc, clang, msvc, intel, see e.g. > >> > >> https://github.com/boostorg/math/blob/develop/.github/workflows/ci.yml > >> > >> and the Boost build farm from the days before free CI for OSS was > easily available, > >> > >> https://www.boost.org/development/tests/master/developer/move.html > >> > >> Not all compilers/platforms are fully compliant, of course. Boost uses > workarounds to combat that and submits bug reports on the compiler bug > trackers. > >> > >>> - how easy is it to install the library. > >> > >> As Nicholas mentioned, Boost.Math (and Boost.Histogram) is header-only, > so it is sufficient to include the headers. > >> > >> Best regards, > >> Hans > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at python.org > >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > > > -- > > Those who don't understand recursion are doomed to repeat it > > > > > > ------------------------------ > > > > Message: 2 > > Date: Mon, 15 Feb 2021 13:35:25 +0100 > > From: Hans Dembinski > > To: SciPy Developers List > > Subject: Re: [SciPy-Dev] Boost for stats > > Message-ID: <13411649-82A4-4EC1-A58C-FAA3DDFF11D1 at gmail.com> > > Content-Type: text/plain; charset=us-ascii > > > > > >> On 15. Feb 2021, at 01:47, Warren Weckesser > wrote: > >> > >> * The Boost histogram library might provide some benefits over the > >> existing NumPy and SciPy options. (Hans Dembinski, the author > >> of the histrogram library, has already commented in this email > >> thread.) > > > > I would happily support this. 
We currently offer a Python front-end to > Boost.Histogram > > https://github.com/scikit-hep/boost-histogram > > which includes a numpy.histogram compatible interface. > > > > Switching to Boost.Histogram may offer performance benefits, see > > > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > > > > Compared to np.histogram we saw a 1.7 times increase - single threaded, > more if multiple threads are used. Compared to np.histogram2d we saw a 11 > times increase. These numbers should probably be checked more carefully > before decisions are made. > > > > Boost.Histogram offers generalised histograms with arbitrary > accumulators per cell, so it could also replace the implementations of > https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html > and friends. > > > > Best regards, > > Hans > > > > ------------------------------ > > > > Message: 3 > > Date: Mon, 15 Feb 2021 13:41:51 +0100 > > From: Ralf Gommers > > To: SciPy Developers List > > Subject: Re: [SciPy-Dev] Boost for stats > > Message-ID: > > < > CABL7CQjYZh0CyA6Kx5FULw2KaYMmdrLbm0Jecztc5+4z+r8OJg at mail.gmail.com> > > Content-Type: text/plain; charset="utf-8" > > > > On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski > > > wrote: > > > >> > >>> On 15. Feb 2021, at 01:47, Warren Weckesser < > warren.weckesser at gmail.com> > >> wrote: > >>> > >>> * The Boost histogram library might provide some benefits over the > >>> existing NumPy and SciPy options. (Hans Dembinski, the author > >>> of the histrogram library, has already commented in this email > >>> thread.) > >> > >> I would happily support this. We currently offer a Python front-end to > >> Boost.Histogram > >> https://github.com/scikit-hep/boost-histogram > >> which includes a numpy.histogram compatible interface. 
> >> > >> Switching to Boost.Histogram may offer performance benefits, see > >> > >> > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > >> > >> Compared to np.histogram we saw a 1.7 times increase - single threaded, > >> more if multiple threads are used. Compared to np.histogram2d we saw a > 11 > >> times increase. These numbers should probably be checked more carefully > >> before decisions are made. > >> > >> Boost.Histogram offers generalised histograms with arbitrary > accumulators > >> per cell, so it could also replace the implementations of > >> > https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html > >> and friends. > >> > > > > That would be really nice. binned_statistic is currently pure Python, and > > can be a performance hotspot (I've seen multiple cases of that in dealing > > with image and geospatial data). > > > > Cheers, > > Ralf > > -------------- next part -------------- > > An HTML attachment was scrubbed... > > URL: < > https://mail.python.org/pipermail/scipy-dev/attachments/20210215/5a972b62/attachment-0001.html > > > > > > ------------------------------ > > > > Message: 4 > > Date: Mon, 15 Feb 2021 07:47:48 -0500 > > From: Neal Becker > > To: SciPy Developers List > > Subject: Re: [SciPy-Dev] Boost for stats > > Message-ID: > > C-Tvo6bnPKWNH_w at mail.gmail.com> > > Content-Type: text/plain; charset="UTF-8" > > > > One thing I've missed with the current scipy histogram is the ability > > to do 'online' or 'incremental' collection of the histogram data. For > > this reason I have written my own histogram code. I am often > > collecting data from monte-carlo simulations and want to accumulate > > stats from data that arrives in batches. > > I don't know if boost-histogram supports this but if so I would find > > this very welcome. 
> > > > On Mon, Feb 15, 2021 at 7:42 AM Ralf Gommers > wrote: > >> > >> > >> > >> On Mon, Feb 15, 2021 at 1:35 PM Hans Dembinski < > hans.dembinski at gmail.com> wrote: > >>> > >>> > >>>> On 15. Feb 2021, at 01:47, Warren Weckesser < > warren.weckesser at gmail.com> wrote: > >>>> > >>>> * The Boost histogram library might provide some benefits over the > >>>> existing NumPy and SciPy options. (Hans Dembinski, the author > >>>> of the histrogram library, has already commented in this email > >>>> thread.) > >>> > >>> I would happily support this. We currently offer a Python front-end to > Boost.Histogram > >>> https://github.com/scikit-hep/boost-histogram > >>> which includes a numpy.histogram compatible interface. > >>> > >>> Switching to Boost.Histogram may offer performance benefits, see > >>> > https://boost-histogram.readthedocs.io/en/latest/notebooks/PerformanceComparison.html > >>> > >>> Compared to np.histogram we saw a 1.7 times increase - single > threaded, more if multiple threads are used. Compared to np.histogram2d we > saw a 11 times increase. These numbers should probably be checked more > carefully before decisions are made. > >>> > >>> Boost.Histogram offers generalised histograms with arbitrary > accumulators per cell, so it could also replace the implementations of > https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binned_statistic.html > and friends. > >> > >> > >> That would be really nice. binned_statistic is currently pure Python, > and can be a performance hotspot (I've seen multiple cases of that in > dealing with image and geospatial data). 
> >> > >> Cheers, > >> Ralf > >> > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at python.org > >> https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > -- > > Those who don't understand recursion are doomed to repeat it > > > > > > ------------------------------ > > > > Subject: Digest Footer > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at python.org > > https://mail.python.org/mailman/listinfo/scipy-dev > > > > > > ------------------------------ > > > > End of SciPy-Dev Digest, Vol 208, Issue 13 > > ****************************************** > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tyler.je.reddy at gmail.com Wed Feb 17 22:27:40 2021 From: tyler.je.reddy at gmail.com (Tyler Reddy) Date: Wed, 17 Feb 2021 20:27:40 -0700 Subject: [SciPy-Dev] ANN: SciPy 1.6.1 Message-ID: Hi all, On behalf of the SciPy development team I'm pleased to announce the release of SciPy 1.6.1, which is a bug fix release. Sources and binary wheels can be found at: https://pypi.org/project/scipy/ and at: https://github.com/scipy/scipy/releases/tag/v1.6.1 One of a few ways to install this release with pip: pip install scipy==1.6.1 ===================== SciPy 1.6.1 Release Notes ===================== SciPy 1.6.1 is a bug-fix release with no new features compared to 1.6.0. Please note that for SciPy wheels to correctly install with Pip on macOS 11, Pip >= 20.3.3 is needed. Authors ====== * Peter Bell * Evgeni Burovski * CJ Carey * Ralf Gommers * Peter Mahler Larsen * Cheng H. Lee + * Cong Ma * Nicholas McKibben * Nikola Forró * Tyler Reddy * Warren Weckesser A total of 11 people contributed to this release. People with a "+" by their names contributed a patch for the first time.
This list of names is automatically generated, and may not be fully complete. Issues closed for 1.6.1 ------------------------------- * `#13072 `__: BLD: Quadpack undefined references * `#13241 `__: Not enough values to unpack when passing tuple to \`blocksize\`... * `#13329 `__: Large sparse matrices of big integers lose information * `#13342 `__: fftn crashes if shape arguments are supplied as ndarrays * `#13356 `__: LSQBivariateSpline segmentation fault when quitting the Python... * `#13358 `__: scipy.spatial.transform.Rotation object can not be deepcopied... * `#13408 `__: Type of \`has_sorted_indices\` property * `#13412 `__: Sorting spherical Voronoi vertices leads to crash in area calculation * `#13421 `__: linear_sum_assignment - support for matrices with more than 2^31... * `#13428 `__: \`stats.exponnorm.cdf\` returns \`nan\` for small values of \`K\`... * `#13465 `__: KDTree.count_neighbors : 0xC0000005 error for tuple of different... * `#13468 `__: directed_hausdorff issue with shuffle * `#13472 `__: Failures on FutureWarnings with numpy 1.20.0 for lfilter, sosfilt... * `#13565 `__: BUG: 32-bit wheels repo test failure in optimize Pull requests for 1.6.1 ----------------------------- * `#13318 `__: REL: prepare for SciPy 1.6.1 * `#13344 `__: BUG: fftpack doesn't work with ndarray shape argument * `#13345 `__: MAINT: Replace scipy.take with numpy.take in FFT function docstrings. 
* `#13354 `__: BUG: optimize: rename private functions to include leading underscore * `#13387 `__: BUG: Support big-endian platforms and big-endian WAVs * `#13394 `__: BUG: Fix Python crash by allocating larger array in LSQBivariateSpline * `#13400 `__: BUG: sparse: Better validation for BSR ctor * `#13403 `__: BUG: sparse: Propagate dtype through CSR/CSC constructors * `#13414 `__: BUG: maintain dtype of SphericalVoronoi regions * `#13422 `__: FIX: optimize: use npy_intp to store array dims for lsap * `#13425 `__: BUG: spatial: make Rotation picklable * `#13426 `__: BUG: \`has_sorted_indices\` and \`has_canonical_format\` should... * `#13430 `__: BUG: stats: Fix exponnorm.cdf and exponnorm.sf for small K * `#13470 `__: MAINT: silence warning generated by \`spatial.directed_hausdorff\` * `#13473 `__: TST: fix failures due to new FutureWarnings in NumPy 1.21.dev0 * `#13479 `__: MAINT: update directed_hausdorff Cython code * `#13485 `__: BUG: KDTree weighted count_neighbors doesn't work between two... * `#13503 `__: TST: fix \`test_fortranfile_read_mixed_record\` on big-endian... 
* `#13518 `__: DOC: document that pip >= 20.3.3 is needed for macOS 11 * `#13520 `__: BLD: update reqs based on oldest-supported-numpy in pyproject.toml * `#13567 `__: TST, BUG: adjust tol on test_equivalence Checksums ========= MD5 ~~~ 6312f6644420a0ad11f9dfb80aaa0560 scipy-1.6.1-cp37-cp37m-macosx_10_9_x86_64.whl 0018622e5d32ca0cc690db152a371889 scipy-1.6.1-cp37-cp37m-manylinux1_i686.whl 7612dc5ebc5928d606b6f486e0edabad scipy-1.6.1-cp37-cp37m-manylinux1_x86_64.whl bcbc57efab027e9c74fe4be8ac1b6470 scipy-1.6.1-cp37-cp37m-manylinux2014_aarch64.whl 49d9b5824b22c87d184214497fec1079 scipy-1.6.1-cp37-cp37m-win32.whl 929834c270b3056997717bbcee58809c scipy-1.6.1-cp37-cp37m-win_amd64.whl 4a862104bb2add633ead9a28496356ae scipy-1.6.1-cp38-cp38-macosx_10_9_x86_64.whl c0dc4f798d0acc015c5fb36d3d97f4ed scipy-1.6.1-cp38-cp38-manylinux1_i686.whl 8f0dce3503871db857f44a3ffb5800f6 scipy-1.6.1-cp38-cp38-manylinux1_x86_64.whl e4ee2176f25684d1cd3d21f0db5906ed scipy-1.6.1-cp38-cp38-manylinux2014_aarch64.whl 8589661ea9a320746aef8299cd16f32f scipy-1.6.1-cp38-cp38-win32.whl 819424a909991eec489441880709a97c scipy-1.6.1-cp38-cp38-win_amd64.whl e7ea30f4dc26b79a3a2b9446afd4c572 scipy-1.6.1-cp39-cp39-macosx_10_9_x86_64.whl d8f7678b426174aba4a6184803d90c5a scipy-1.6.1-cp39-cp39-manylinux1_i686.whl d8f5ec24b15fef9786a6233c28003753 scipy-1.6.1-cp39-cp39-manylinux1_x86_64.whl 4a832944f71c5f7b019f6539475647a2 scipy-1.6.1-cp39-cp39-manylinux2014_aarch64.whl 5fff9d3f673e4ae73e76f02ea8544aa3 scipy-1.6.1-cp39-cp39-win32.whl b03f9713b7b9be7fa019ab3c94c72254 scipy-1.6.1-cp39-cp39-win_amd64.whl 98a860ce2d6296cace333d0a07501f13 scipy-1.6.1.tar.gz 5cd15c4b4abf2e24ed05dbde9e7b90c8 scipy-1.6.1.tar.xz a3c4bf7491ea0ab49bc8b149334f50f7 scipy-1.6.1.zip SHA256 ~~~~~~ a15a1f3fc0abff33e792d6049161b7795909b40b97c6cc2934ed54384017ab76 scipy-1.6.1-cp37-cp37m-macosx_10_9_x86_64.whl e79570979ccdc3d165456dd62041d9556fb9733b86b4b6d818af7a0afc15f092 scipy-1.6.1-cp37-cp37m-manylinux1_i686.whl 
a423533c55fec61456dedee7b6ee7dce0bb6bfa395424ea374d25afa262be261 scipy-1.6.1-cp37-cp37m-manylinux1_x86_64.whl 33d6b7df40d197bdd3049d64e8e680227151673465e5d85723b3b8f6b15a6ced scipy-1.6.1-cp37-cp37m-manylinux2014_aarch64.whl 6725e3fbb47da428794f243864f2297462e9ee448297c93ed1dcbc44335feb78 scipy-1.6.1-cp37-cp37m-win32.whl 5fa9c6530b1661f1370bcd332a1e62ca7881785cc0f80c0d559b636567fab63c scipy-1.6.1-cp37-cp37m-win_amd64.whl bd50daf727f7c195e26f27467c85ce653d41df4358a25b32434a50d8870fc519 scipy-1.6.1-cp38-cp38-macosx_10_9_x86_64.whl f46dd15335e8a320b0fb4685f58b7471702234cba8bb3442b69a3e1dc329c345 scipy-1.6.1-cp38-cp38-manylinux1_i686.whl 0e5b0ccf63155d90da576edd2768b66fb276446c371b73841e3503be1d63fb5d scipy-1.6.1-cp38-cp38-manylinux1_x86_64.whl 2481efbb3740977e3c831edfd0bd9867be26387cacf24eb5e366a6a374d3d00d scipy-1.6.1-cp38-cp38-manylinux2014_aarch64.whl 68cb4c424112cd4be886b4d979c5497fba190714085f46b8ae67a5e4416c32b4 scipy-1.6.1-cp38-cp38-win32.whl 5f331eeed0297232d2e6eea51b54e8278ed8bb10b099f69c44e2558c090d06bf scipy-1.6.1-cp38-cp38-win_amd64.whl 0c8a51d33556bf70367452d4d601d1742c0e806cd0194785914daf19775f0e67 scipy-1.6.1-cp39-cp39-macosx_10_9_x86_64.whl 83bf7c16245c15bc58ee76c5418e46ea1811edcc2e2b03041b804e46084ab627 scipy-1.6.1-cp39-cp39-manylinux1_i686.whl 794e768cc5f779736593046c9714e0f3a5940bc6dcc1dba885ad64cbfb28e9f0 scipy-1.6.1-cp39-cp39-manylinux1_x86_64.whl 5da5471aed911fe7e52b86bf9ea32fb55ae93e2f0fac66c32e58897cfb02fa07 scipy-1.6.1-cp39-cp39-manylinux2014_aarch64.whl 8e403a337749ed40af60e537cc4d4c03febddcc56cd26e774c9b1b600a70d3e4 scipy-1.6.1-cp39-cp39-win32.whl a5193a098ae9f29af283dcf0041f762601faf2e595c0db1da929875b7570353f scipy-1.6.1-cp39-cp39-win_amd64.whl c4fceb864890b6168e79b0e714c585dbe2fd4222768ee90bc1aa0f8218691b11 scipy-1.6.1.tar.gz 2800f47a5040cbab05b3ce58f1dfb670c70232b0f56d30380c6fd4ef4e787df5 scipy-1.6.1.tar.xz 18601effa06aba0e9f34475b6d34b3d7454feabe8b0f96bcc483b3fd38b0afc2 scipy-1.6.1.zip -------------- next part -------------- An HTML 
attachment was scrubbed... URL: From andyfaff at gmail.com Wed Feb 17 22:34:16 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Thu, 18 Feb 2021 14:34:16 +1100 Subject: [SciPy-Dev] GSoC'21 participation SciPy In-Reply-To: References: <6EB6E0CE-53E3-4ECC-806A-4B728728579A@gmail.com> Message-ID: It's good to hear about the ask-tell interface; it's not something I'd heard about before. The class-based Optimizer that was proposed wasn't going to work in quite that way. The main concept was to create an (e.g.) LBFGSB class (inheriting a Minimizer superclass). All Minimizer objects would be iterators, having a __next__ method that would perform one step of a minimisation loop. An iterator-based design syncs quite well with the loop-based design of most of the existing minimisation algorithms. The __next__ method would be responsible for calling the user-supplied functions. If the user-supplied functions could be marked as vectorisable, the __next__ method could despatch a whole series of `x` locations for the user function (one or all of func/jac/hess) to evaluate; the user function could do whatever parallelisation it wanted. Vectorisable function evaluations also offer benefits for numerical differentiation. The return value of __next__ would be something along the lines of an intermediate OptimizeResult. I don't know how the ask-tell approach works in finer detail. For example, each minimisation step typically requires multiple function evaluations to proceed, e.g. at least once for func evaluation, and many times more for grad/jac and hess evaluation (not to mention constraint function evaluations). Therefore there wouldn't be a 1:1 correspondence between a single ask-tell exchange and a complete step of the minimizer. I reckon the development of this would be way more than a single GSoC could provide, at least to get a mature design into scipy. It's vital to get the architecture correct (esp.
the base class), when considering all the minimizers that scipy offers, and their different vagaries. Implementing for one or two minimizers wouldn't be sufficient; otherwise one forgets that they e.g. all have different approaches to halting, and you find yourself bolting other things on to make things work. In addition, it's not just the minimizer methods that are involved, it's considering how this all ties in with how constraints/numerical differentiation/`LowLevelCallable`/etc could be improved/used in such a design. At least for the methods involved in `minimize`, such an opportunity is the time to consider a total redesign of how things work. Smart/vectorisable numerical differentiation would be more than a whole GSoC in itself. As Robert says, implementation in a separate package would probably be the best way to work; once the bugs have been ironed out it could be merged into scipy-proper. Any redesign could take into account the existing APIs/functionality to make things a less jarring change. It'd be great to get the original class-based Optimizer off the ground, or something similar. However, it's worth noting that the original proposal only received lukewarm support. A. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Feb 18 05:01:12 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 18 Feb 2021 11:01:12 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Thu, Feb 18, 2021 at 3:45 AM Nicholas McKibben wrote: > > > Probably good to make sure that aarch64 build times remain relatively > stable > > Good point! Can this be checked via a PR to scipy-wheels? > Don't worry about this one; if the compile-time increase on other platforms is minor, it'll be fine for aarch64 too. We have limited TravisCI credits (actual status of that is a little unclear), so no need to burn them for this.
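For readers unfamiliar with the iterator idea in Andrew's message above, here is a toy sketch. Every name is invented for illustration; nothing like this class exists in scipy.optimize, and a real design would return intermediate OptimizeResult objects and handle jac/hess, constraints, and per-method halting logic.

```python
# Toy sketch of the iterator-based Minimizer idea -- all names are
# hypothetical; this is not a proposal for the actual scipy API.
class GradientDescent:
    """Each call to __next__ performs one step of the minimisation loop."""

    def __init__(self, func, grad, x0, step=0.1, gtol=1e-8):
        self.func, self.grad = func, grad
        self.x = x0
        self.step, self.gtol = step, gtol
        self.nit = 0

    def __iter__(self):
        return self

    def __next__(self):
        g = self.grad(self.x)       # user callbacks are invoked inside the step
        if abs(g) < self.gtol:      # a real class would have richer halting logic
            raise StopIteration     # converged
        self.x -= self.step * g
        self.nit += 1
        # a full design would return an intermediate OptimizeResult here
        return self.x, self.func(self.x)


# The caller drives the loop -- something `minimize` does not offer today.
opt = GradientDescent(lambda x: (x - 3.0) ** 2, lambda x: 2.0 * (x - 3.0), x0=0.0)
for x, fx in opt:
    pass                            # could log, checkpoint, or parallelise here
print(round(opt.x, 6))              # -> 3.0 (the minimum)
```

Because the caller owns the loop, logging, checkpointing, or dispatching vectorised evaluations between steps becomes trivial.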
We anyway should be moving CI providers for aarch64 at some point, probably to https://www.drone.io/ Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Thu Feb 18 06:35:10 2021 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 18 Feb 2021 13:35:10 +0200 Subject: [SciPy-Dev] Speed up large NumPy arrays with PNumPy Message-ID: <16bdbae7-3607-8ec4-93be-5a110a3acef9@gmail.com> An HTML attachment was scrubbed... URL: From touqir at ualberta.ca Thu Feb 18 08:12:00 2021 From: touqir at ualberta.ca (Touqir Sajed) Date: Thu, 18 Feb 2021 19:12:00 +0600 Subject: [SciPy-Dev] Faster maximum flow algorithm for scipy.sparse.csgraph Message-ID: Dear Scipy developers, This is a continuation of https://github.com/scipy/scipy/issues/13402. The current implementation in scipy.sparse.csgraph uses the Edmonds-Karp algorithm, whose theoretical time complexity is not great; despite this, the implementation is optimized enough to outperform several other algorithms with superior theoretical complexity, as shown here: https://github.com/scipy/scipy/pull/10566#issuecomment-552615594. Later I carried out benchmarks (https://github.com/scipy/scipy/issues/13402#issuecomment-767909167) showing that scipy's Edmonds-Karp implementation can indeed be significantly beaten by optimized implementations. My original concern was Edmonds-Karp's theoretical complexity, which limits its performance in some cases (highly dense graphs). So, having another algorithm in scipy with better theoretical complexity along with proven superior empirical performance makes sense. Only the algorithms here, https://github.com/touqir14/MaxFlow, have been shown to significantly outperform scipy's Edmonds-Karp. I think it would be good to port one or several of these implementations into scipy. Cython-only ports will probably be easier to maintain.
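For context, the Edmonds-Karp algorithm under discussion repeatedly augments flow along the shortest source-to-sink path found by BFS in the residual graph. A compact pure-Python sketch of that idea (scipy's actual implementation is optimized Cython operating on CSR data, not this):

```python
# Minimal Edmonds-Karp max flow: BFS for shortest augmenting paths.
from collections import deque


def edmonds_karp(capacity, source, sink):
    """Max flow on a dense capacity matrix (list of lists of ints)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]  # copy so input is untouched
    max_flow = 0
    while True:
        # BFS for the shortest augmenting path in the residual graph
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:        # no augmenting path left: done
            return max_flow
        # find the bottleneck capacity along the path, then push flow
        bottleneck = float("inf")
        v = sink
        while v != source:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck  # reverse edge allows cancellation
            v = u
        max_flow += bottleneck


# 0 -> 3 with the minimum cut {1->3, 2->3}, giving value 5
cap = [[0, 3, 2, 0],
       [0, 0, 1, 2],
       [0, 0, 0, 3],
       [0, 0, 0, 0]]
print(edmonds_karp(cap, 0, 3))  # -> 5
```

The BFS (shortest-path) choice is what bounds the number of augmentations at O(VE), for O(VE^2) overall; the algorithms Touqir benchmarks improve on that bound.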
One thing to ponder here is how complex an implementation we should allow if we decide to add new max flow algorithms to scipy. Let me know your thoughts. Cheers, Touqir -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhaberla at calpoly.edu Sat Feb 20 00:43:13 2021 From: mhaberla at calpoly.edu (Matt Haberland) Date: Fri, 19 Feb 2021 21:43:13 -0800 Subject: [SciPy-Dev] SciPy 2021 Conference Seeking Submissions and Reviewers Message-ID: Dear SciPy Developers, Besides an incredible library, "SciPy" is also the name of a conference about scientific computing with Python and the scientific Python ecosystem as a whole. This year's conference, SciPy 2021, will be held virtually from July 14 - July 16, with two days before for tutorials and two days after for developer sprints. Registration is now open! There's nothing more motivating than a deadline, right? Well, there are just three days left to submit your talk, poster, or tutorial: submissions are due at 11:59 p.m. on Monday, February 22. We also seek volunteers to review submissions over the next few weeks. Please indicate your interest here. Feel free to contact me if you have any questions about the conference, and I hope to "see" you there! Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholas.bgp at gmail.com Sun Feb 21 23:36:24 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Sun, 21 Feb 2021 21:36:24 -0700 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: Local testing of inclusion of Boost as a submodule has revealed some undesirable side effects: - all sources, documentation, etc., regardless of relevance to SciPy, must
This extra overhead may initially cause some timeouts. Another option that will alleviate some of these pains is to create a header only repo similar to this one: https://github.com/povilasb/boost-header-only. It could live in the SciPy github account and would be easy to update -- simply download the Boost tarball release and copy over the include directory only (or build a specific commit locally and do the same thing). It is more maintenance than simply checking out the latest tagged release of Boost and updating the submodules (adds an extra step of updating the header only repo), but it minimizes space and bandwidth usage. Thoughts? On Thu, Feb 18, 2021 at 3:01 AM Ralf Gommers wrote: > > > On Thu, Feb 18, 2021 at 3:45 AM Nicholas McKibben > wrote: > >> >> > Probably good to make sure that aarch64 build times remain relatively >> stable >> >> Good point! Can this be checked via a PR to scipy-wheels? >> > > Don't worry about this one, if compile time increase on other platforms is > minor, it'll be fine for aarch64 too. We have limited TravisCI credits > (actual status of that is a little unclear), so no need to burn them for > this. We anyway should be moving CI providers for aarch64 at some point, > probably to https://www.drone.io/ > > Cheers, > Ralf > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Feb 22 02:56:10 2021 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 22 Feb 2021 07:56:10 +0000 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: The header-only repository sounds like the better option to me. That level of git churn for the submodule would be a noticeable burden. 
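As an aside for readers: the header-only mirror described above boils down to copying just Boost's `boost/` include tree into a slim repository, leaving sources and docs behind. A toy Python sketch of that refresh step (all paths are invented, and a stand-in tree is created so the snippet runs anywhere; in reality `src` would be an unpacked Boost release tarball):

```python
# Hypothetical refresh step for a header-only Boost mirror.
import pathlib
import shutil
import tempfile

tmp = pathlib.Path(tempfile.mkdtemp())
src = tmp / "boost_1_75_0"            # stand-in for an unpacked release
(src / "boost" / "math").mkdir(parents=True)
(src / "libs").mkdir()                # sources/docs we do NOT want to mirror
(src / "boost" / "math" / "special_functions.hpp").touch()

mirror = tmp / "boost-headers"        # the slim repo to publish
mirror.mkdir()
# the actual mirroring step: copy only the headers, skip everything else
shutil.copytree(src / "boost", mirror / "boost")

print(sorted(p.name for p in mirror.iterdir()))  # -> ['boost']
```

The trade-off is exactly as described: one extra copy step per Boost release, in exchange for no submodule fetches or git churn.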
From ralf.gommers at gmail.com Mon Feb 22 04:35:36 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 22 Feb 2021 10:35:36 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: On Mon, Feb 22, 2021 at 5:36 AM Nicholas McKibben wrote: > Local testing of inclusion of Boost as a submodule has revealed some > undesirable side effects: > - all sources, documentation, etc. regardless of relevance to SciPy must > be fetched > - recursive submodule initialization can take quite a while (~10 minutes > on my machine and internet connection) > - lots of churn when running commands like `git status` > Ouch, that's a lot slower than I expected. I'm not sure I understand it though, there should be no `git status` churn at all (unless the build process messes with files in-place?) and it's faster than cloning our own repo: $ time git clone git at github.com:boostorg/boost.git Cloning into 'boost'... remote: Enumerating objects: 15, done. remote: Counting objects: 100% (15/15), done. remote: Compressing objects: 100% (11/11), done. remote: Total 254626 (delta 8), reused 11 (delta 4), pack-reused 254611 Receiving objects: 100% (254626/254626), 62.02 MiB | 7.47 MiB/s, done. Resolving deltas: 100% (163071/163071), done. real 0m12.221s user 0m5.959s sys 0m2.725s $ time git clone git at github.com:scipy/scipy.git Cloning into 'scipy'... remote: Enumerating objects: 178585, done. remote: Total 178585 (delta 0), reused 0 (delta 0), pack-reused 178585 Receiving objects: 100% (178585/178585), 104.61 MiB | 6.56 MiB/s, done. Resolving deltas: 100% (137836/137836), done. real 0m21.492s user 0m9.620s sys 0m3.231s What should I test to reproduce the problem? Cheers, Ralf > Of course we will also need to see how this impacts the CI pipelines. > This extra overhead may initially cause some timeouts. Another option that > will alleviate some of these pains is to create a header only repo similar > to this one: https://github.com/povilasb/boost-header-only. 
It could > live in the SciPy github account and would be easy to update -- simply > download the Boost tarball release and copy over the include directory only > (or build a specific commit locally and do the same thing). It is more > maintenance than simply checking out the latest tagged release of Boost and > updating the submodules (adds an extra step of updating the header only > repo), but it minimizes space and bandwidth usage. Thoughts? > > On Thu, Feb 18, 2021 at 3:01 AM Ralf Gommers > wrote: > >> >> >> On Thu, Feb 18, 2021 at 3:45 AM Nicholas McKibben >> wrote: >> >>> >>> > Probably good to make sure that aarch64 build times remain relatively >>> stable >>> >>> Good point! Can this be checked via a PR to scipy-wheels? >>> >> >> Don't worry about this one, if compile time increase on other platforms >> is minor, it'll be fine for aarch64 too. We have limited TravisCI credits >> (actual status of that is a little unclear), so no need to burn them for >> this. We anyway should be moving CI providers for aarch64 at some point, >> probably to https://www.drone.io/ >> >> Cheers, >> Ralf >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hans.dembinski at gmail.com Mon Feb 22 06:22:53 2021 From: hans.dembinski at gmail.com (Hans Dembinski) Date: Mon, 22 Feb 2021 12:22:53 +0100 Subject: [SciPy-Dev] Boost for stats In-Reply-To: References: Message-ID: <8F1CE745-50F2-48C7-81E2-01E6B762E3F9@gmail.com> Hi, > On 22. Feb 2021, at 10:35, Ralf Gommers wrote: > > Ouch, that's a lot slower than I expected. 
I'm not sure I understand it though, there should be no `git status` churn at all (unless the build process messes with files in-place?) and it's faster than cloning our own repo: > > $ time git clone git at github.com:boostorg/boost.git > Cloning into 'boost'... > remote: Enumerating objects: 15, done. > remote: Counting objects: 100% (15/15), done. > remote: Compressing objects: 100% (11/11), done. > remote: Total 254626 (delta 8), reused 11 (delta 4), pack-reused 254611 > Receiving objects: 100% (254626/254626), 62.02 MiB | 7.47 MiB/s, done. > Resolving deltas: 100% (163071/163071), done. > > real 0m12.221s > user 0m5.959s > sys 0m2.725s Because of a long-term goal to make Boost more modular, cloning boostorg/boost like this only clones the so-called superproject of Boost, which is indeed very small. It itself consists of many submodules with the individual Boost libraries, like Boost.Math etc., which live in separate repositories. If you do git clone --recurse-submodules git at github.com:boostorg/boost.git instead, you will see the long delay. Fetching all the submodules indeed takes a lot of time, unfortunately. The main Boost repo includes 157 submodules. Best regards, Hans From treverhines at gmail.com Mon Feb 22 09:43:33 2021 From: treverhines at gmail.com (Trever Hines) Date: Mon, 22 Feb 2021 09:43:33 -0500 Subject: [SciPy-Dev] ENH: improve RBF interpolation In-Reply-To: <274ab45e-a9cc-4710-a40e-65b6f96c0b4a@www.fastmail.com> References: <274ab45e-a9cc-4710-a40e-65b6f96c0b4a@www.fastmail.com> Message-ID: Hello all, I have made a pull request here https://github.com/scipy/scipy/pull/13595, and I would appreciate any feedback. On Thu, Feb 4, 2021 at 7:23 PM Stefan van der Walt wrote: > > Is there any advantage to keeping the old interface, or should this > eventually replace Rbf entirely? > > My intention is for `RBFInterpolator` to replace `Rbf` entirely.
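To make concrete what any RBF interpolant does under the hood -- whether the old `Rbf` or the `RBFInterpolator` proposed in gh-13595 -- here is a minimal NumPy sketch of the linear-algebra core (illustrative only; it is not the API or kernel set of either class):

```python
# Bare-bones RBF interpolation: 1-D data, Gaussian kernel.
import numpy as np


def rbf_weights(x, y, eps=1.0):
    """Solve for weights w such that sum_j w_j * phi(|x_i - x_j|) == y_i."""
    r = np.abs(x[:, None] - x[None, :])   # pairwise distances between nodes
    A = np.exp(-(eps * r) ** 2)           # Gaussian kernel matrix
    return np.linalg.solve(A, y)


def rbf_eval(xq, x, w, eps=1.0):
    r = np.abs(np.asarray(xq)[:, None] - x[None, :])
    return np.exp(-(eps * r) ** 2) @ w


x = np.linspace(0.0, 5.0, 8)
y = np.sin(x)
w = rbf_weights(x, y)
# an interpolant reproduces the data exactly at the nodes
print(np.allclose(rbf_eval(x, x, w), y))  # -> True
```

The well-posedness warnings Trever mentions concern exactly this step: for some kernel/degree combinations the system above can be singular or ill-conditioned.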
It should be possible to replicate the functionality of `Rbf` with `RBFInterpolator` (albeit with warnings when the interpolant may not be well-posed). `Rbf` is not currently deprecated in my PR, but I can make that change if you think it is appropriate. -Trever -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholas.bgp at gmail.com Mon Feb 22 09:54:44 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Mon, 22 Feb 2021 07:54:44 -0700 Subject: [SciPy-Dev] Boost for stats In-Reply-To: <8F1CE745-50F2-48C7-81E2-01E6B762E3F9@gmail.com> References: <8F1CE745-50F2-48C7-81E2-01E6B762E3F9@gmail.com> Message-ID: Adding the following options to the .gitmodules file was also useful for speeding up routine git commands: active = false ignore = true shallow = true I am not sure if all of them are necessary - I don't profess to be a git wizard. I still had commands such as 'git add -u' hang. Indeed --recurse-submodules is necessary (and might be why the CI is currently failing for the PR). Thanks, Nicholas On Mon, Feb 22, 2021, 04:23 Hans Dembinski wrote: > Hi, > > > On 22. Feb 2021, at 10:35, Ralf Gommers wrote: > > > > Ouch, that's a lot slower than I expected. I'm not sure I understand it > though, there should be no `git status` churn at all (unless the build > process messes with files in-place?) and it's faster than cloning our own > repo: > > > > $ time git clone git at github.com:boostorg/boost.git > > Cloning into 'boost'... > > remote: Enumerating objects: 15, done. > > remote: Counting objects: 100% (15/15), done. > > remote: Compressing objects: 100% (11/11), done. > > remote: Total 254626 (delta 8), reused 11 (delta 4), pack-reused 254611 > > Receiving objects: 100% (254626/254626), 62.02 MiB | 7.47 MiB/s, done. > > Resolving deltas: 100% (163071/163071), done.
> > > real 0m12.221s > user 0m5.959s > sys 0m2.725s > > because of the a long-term goal to make boost more modular, cloning > boostorg/boost like this only clones the so called superproject of Boost, > which indeed very small. It consists itself of many submodules with the > individual Boost libraries, like Boost.Math etc, which live in separate > repositories. If you do > > git clone --recurse-submodules git at github.com:boostorg/boost.git > > instead, you will see the long delay. Fetching all the submodules indeed > takes a lot of time, unfortunately. The main Boost repo includes 157 > submodules. > > Best regards, > Hans > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Tue Feb 23 04:17:53 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Tue, 23 Feb 2021 10:17:53 +0100 Subject: [SciPy-Dev] merged scipy.stats.qmc with quasi-Monte Carlo functionality Message-ID: <7FBC7377-4A05-4FBC-83B8-F31B5C4C752B@gmail.com> Hi everyone, First of all, thank you to everyone who helped with adding QMC (scipy.stats.qmc) :). I wanted to give an overview of the current work in progress and some perspectives. Waiting for reviews: * Using QMC in scipy.optimize: https://github.com/scipy/scipy/pull/13469 (has had a few reviews, could be shipped fast, and is blocking Stefan Endres from working on improvements to shgo). * New QMC method (LHS optimized): https://github.com/scipy/scipy/pull/13471 (code was originally in the QMC PR and received initial reviews thanks to Matt Haberland, but since then nothing more). * General tutorial: https://github.com/scipy/scipy/pull/13487 (finished, but Art Owen might drop in and add things; this could also wait for another PR).
* Small Cython refactoring of an internal function of Sobol': https://github.com/scipy/scipy/pull/13514 (waiting for a maintainer to merge it). Work in progress: * Port to Cython (Pythran is being looked at too) of discrepancy functions (thanks Arthur Volant): https://github.com/scipy/scipy/pull/13576. Other Cythonizations have not been looked at yet; see this issue: https://github.com/scipy/scipy/issues/13474. Perspectives: * Other scrambling methods (for Halton and Sobol', but could also have more generic things like randomization of any QMC). * Use QMC with any distribution from scipy.stats. There is an issue with some initial discussions; I would need more opinions to continue and propose a PR: https://github.com/scipy/scipy/issues/13368. * Add other QMC methods: lattice rules, other low-discrepancy sequences, adaptive LHS, etc. * Add other uniformity criteria: L-inf, minimum spanning tree, discrepancy over sub-spaces, etc. * General method to construct an optimal design based on a metric. (Similar to what I propose with LHS optimized). * Your ideas. Thanks in advance for your help. Cheers, Pamphile -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Thu Feb 25 06:55:12 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Thu, 25 Feb 2021 12:55:12 +0100 Subject: [SciPy-Dev] Using more functionalities from GitHub? Message-ID: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> Hi everyone, I would like to propose to use more functionalities of GitHub. I could not find any reference showing this was discussed before, so I apologize if the following is not relevant. Teams Currently there is just the triage team, but we could create other teams like one per submodule or for given skill sets. This could allow easier pinging of people on PRs. Project We have the roadmap and some issues are acting as meta-issues. But we could use, instead/on top, the integrated project management dashboard.
It would be more convenient to keep track of what can be done, who is working on what area, etc. Also, this would clearly be in favor of openness and transparency with the management of the library. Cheers, Pamphile -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhaberla at calpoly.edu Thu Feb 25 14:21:42 2021 From: mhaberla at calpoly.edu (Matt Haberland) Date: Thu, 25 Feb 2021 11:21:42 -0800 Subject: [SciPy-Dev] Using more functionalities from GitHub? In-Reply-To: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> References: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> Message-ID: I haven't used these features before, but they sound useful to me. On Thu, Feb 25, 2021 at 3:55 AM Pamphile Roy wrote: > Hi everyone, > > I would like to propose to use more functionalities of GitHub. I could not > find any reference showing this was discussed before, > so I apologize if the following would not be relevant. > > *Teams* > > Currently there is just the *triage* team, but we could create other > teams like one per submodule or for given skill sets. > This could allow easier pinging of people on PR. > > *Project* > > We have the roadmap and some issues are acting as meta-issues. But we > could use, instead/on top, the integrated project management dashboard. > It would be more convenient to keep track of what can be done, who is > working on what area, etc. Also, this would clearly be in favor of openness > and > transparency with the management of the library. > > > Cheers, > Pamphile > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -- Matt Haberland Assistant Professor BioResource and Agricultural Engineering 08A-3K, Cal Poly -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tirthasheshpatel at gmail.com Fri Feb 26 04:13:10 2021 From: tirthasheshpatel at gmail.com (Tirth Patel) Date: Fri, 26 Feb 2021 14:43:10 +0530 Subject: [SciPy-Dev] mypy 0.770 is broken on Python 3.9.0a5 Message-ID: I recently proposed gh-13613 (https://github.com/scipy/scipy/pull/13613) which adds CI for type checking and noticed this failure: ``` scipy/_lib/_uarray/_backend.py:96: error: syntax error in type comment [syntax] Found 1 error in 1 file (checked 662 source files) ``` I don't see any errors with Python 3.8.0 and mypy 0.770, mypy 0.780, and mypy 0.812 (latest release). So, this probably has something to do with Python 3.9.0a5 (which is currently being installed on ubuntu-latest). I checked with mypy 0.780, mypy 0.800, and mypy 0.812 on Python 3.9.0a5. This error disappears in all those versions but dozens of others arise. I think this error is related to python/mypy#8614 (https://github.com/python/mypy/issues/8614) which got fixed in mypy>=0.780. So, if I am not wrong, we have two options here: - Stick with Python 3.8 and mypy 0.770. - Use mypy latest (0.812 currently) which works for both Python 3.8 and Python 3.9. I think the latter would be a better choice as NumPy does it and mypy is changing fast so it will help keep up with new changes. I have a fix for all the errors occurring with mypy latest (mypy 0.812) and would add the changes to the PR if there is a consensus to go that way! 
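For anyone wondering what a "type comment" is: it is the PEP 484 comment-based spelling of annotations, which some mypy/Python combinations parsed with stricter rules. A hypothetical illustration (this is NOT the actual code at `scipy/_lib/_uarray/_backend.py:96`):

```python
# Two equivalent spellings of the same signature, both understood by mypy.

def scale_old(values, factor):
    # type: (list, float) -> list
    # ^ PEP 484 "type comment": ignored at runtime, parsed by type checkers.
    return [v * factor for v in values]


def scale_new(values: list, factor: float) -> list:
    # Inline annotations: the modern spelling, unaffected by type-comment parsing.
    return [v * factor for v in values]


print(scale_old([1, 2], 2.0) == scale_new([1, 2], 2.0))  # -> True
```

Since type comments are plain comments at runtime, only the type checker ever notices the difference between the two styles.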
-- Kind Regards, Tirth Patel From ralf.gommers at gmail.com Fri Feb 26 04:21:18 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 26 Feb 2021 10:21:18 +0100 Subject: [SciPy-Dev] mypy 0.770 is broken on Python 3.9.0a5 In-Reply-To: References: Message-ID: On Fri, Feb 26, 2021 at 10:13 AM Tirth Patel wrote: > I recently proposed gh-13613 > (https://github.com/scipy/scipy/pull/13613) which adds CI for type > checking and noticed this failure: > > ``` > scipy/_lib/_uarray/_backend.py:96: error: syntax error in type comment > [syntax] > Found 1 error in 1 file (checked 662 source files) > ``` > > I don't see any errors with Python 3.8.0 and mypy 0.770, mypy 0.780, > and mypy 0.812 (latest release). So, this probably has something to do > with Python 3.9.0a5 (which is currently being installed on > ubuntu-latest). I checked with mypy 0.780, mypy 0.800, and mypy 0.812 > on Python 3.9.0a5. This error disappears in all those versions but > dozens of others arise. > > I think this error is related to python/mypy#8614 > (https://github.com/python/mypy/issues/8614) which got fixed in > mypy>=0.780. > > So, if I am not wrong, we have two options here: > - Stick with Python 3.8 and mypy 0.770. > - Use mypy latest (0.812 currently) which works for both Python 3.8 > and Python 3.9. > > I think the latter would be a better choice as NumPy does it and mypy > is changing fast so it will help keep up with new changes. I have a > fix for all the errors occurring with mypy latest (mypy 0.812) and > would add the changes to the PR if there is a consensus to go that > way! > Yes, we can bump mypy versions, no problem to update to the most recent version if needed. It's not a runtime dependency, so there's not much of a downside to updating other than a few maintainers and contributors needing to update their local version. Thanks for working on this Tirth!
Cheers, Ralf > -- > Kind Regards, > Tirth Patel > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Feb 27 11:25:13 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 27 Feb 2021 17:25:13 +0100 Subject: [SciPy-Dev] Using more functionalities from GitHub? In-Reply-To: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> References: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> Message-ID: On Thu, Feb 25, 2021 at 12:55 PM Pamphile Roy wrote: > Hi everyone, > > I would like to propose to use more functionalities of GitHub. I could not > find any reference showing this was discussed before, > so I apologize if the following would not be relevant. > Thanks for asking Pamphile. We indeed haven't discussed this before, at least not in the last few years. > *Teams* > > Currently there is just the *triage* team, but we could create other > teams like one per submodule or for given skill sets. > This could allow easier pinging of people on PR. > I agree it would be good to make some improvements here. The team has grown a lot, and many people don't know who are the experts/maintainers for some module. A long time ago we tried to keep a list manually in the repo (a MAINTAINERS.rst file), but that just went out of date and most people didn't know to look there anyway. A related problem: I have also stopped watching the repo, because the amount of notifications I was getting was starting to get a little overwhelming. Instead, I check in and browse new issues/PRs every few days to a week - but that means I may miss relevant stuff. There's currently not a good middle ground here. PyTorch has a useful system where you can subscribe to a label, and then once the label gets added a bot comes along and Cc's you on the issue. 
It does require running a bot, which would be another piece of machinery to maintain. The most related GitHub feature is CODEOWNERS: https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/about-code-owners. It can be used to automatically request PR reviews from individuals or from a team. So there are at least three approaches:
1. Teams per submodule and other areas
2. A bot to subscribe to labels
3. Using CODEOWNERS
The trouble with (1) is that it's a lot of overhead managing teams in the GitHub UI, and only people with owner/maintainer status can do it. My preference is (3) I think: it solves both problems to some extent, it's the most granular (you can get notifications for individual files as well as glob patterns), and it's a plain file in the repo that everyone can propose changes to via a PR. For pinging people outside of PRs, we can use the same file as documentation (just look at it, find the submodule/file of interest, and see who is subscribed to it to @-mention them). Should we try that? > *Project* > > We have the roadmap and some issues are acting as meta-issues. But we > could use, instead/on top, the integrated project management dashboard. > It would be more convenient to keep track of what can be done, who is > working on what area, etc. > My experience with GitHub Projects isn't great, it's more work than tracking issues to keep up to date, and is less integrated with the rest of the GitHub workflow. I'd be happy to give interested people the permissions to create new project boards for specific topics if that's how they like to work, but I'd like to keep it completely optional. > Also, this would clearly be in favor of openness and transparency with the > management of the library. > There's actually very little management going on.
There's zero hidden repositories or other content; the only thing is a very low-traffic private maintainer mailing list that is meant only for topics that aren't always good to discuss in public (mostly just deciding on giving someone more permissions). Maybe it's time to give regular community calls a go again - it's working quite well for NumPy. We tried it briefly a couple of years ago and it was useful, but I dropped the ball on organizing at some point because I was too busy. Should we try that again? Maybe regular once-a-month Zoom calls open to anyone who wants to attend? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From christoph.baumgarten at gmail.com Sat Feb 27 15:34:04 2021 From: christoph.baumgarten at gmail.com (Christoph Baumgarten) Date: Sat, 27 Feb 2021 21:34:04 +0100 Subject: [SciPy-Dev] (no subject) Message-ID: Hi, I implemented the Cramér-von Mises test for two samples in PR 13263. The proposed name is cramervonmises_2samp. The one-sample version (cramervonmises) was already released in version 1.6.0. Since a few names of tests in scipy.stats were recently discussed on the mailing list (though cramervonmises was not), I just wanted to mention the PR here in case there are concerns about the name. One additional remark: for the KS test, there are three functions: kstest, ks_1samp and ks_2samp; kstest can be used both for the one- and two-sample tests. In my view, this makes the definition of kstest quite complicated since the meaning of the parameters depends on the version of the test and one needs a helper function _parse_kstest_args(data1, data2, args, N) in stats/stats.py. So maybe cramervonmises_1samp and cramervonmises_2samp would have been a good choice, though I hope the names cramervonmises and cramervonmises_2samp also guide the user when to use which function.
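For reference, the statistic behind the one-sample function can be sketched in a few lines of plain Python; this is the textbook definition, not SciPy's actual implementation (which also computes a p-value):

```python
import math

def cvm_1samp_statistic(sample, cdf):
    # Textbook one-sample Cramer-von Mises statistic over the sorted sample:
    # T = 1/(12 n) + sum_{i=1..n} ((2 i - 1) / (2 n) - F(x_(i)))**2
    x = sorted(sample)
    n = len(x)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - cdf(xi)) ** 2
        for i, xi in enumerate(x, start=1)
    )

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# A sample sitting exactly on the uniform quantiles attains the minimum 1/(12 n):
print(cvm_1samp_statistic([0.1, 0.3, 0.5, 0.7, 0.9], lambda u: u))  # 1/60
```

Small values of T indicate a good fit to the hypothesized distribution; the two-sample version replaces the reference CDF with the second sample's empirical CDF.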
(While writing this message, I noted that the documentation of cramervonmises should state more clearly that it is about the one-sample test, e.g. "Perform the Cramér-von Mises test for goodness of fit." --> "Perform the one-sample ...") Any views? Thanks for your feedback. Christoph -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Feb 28 11:14:58 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 28 Feb 2021 17:14:58 +0100 Subject: [SciPy-Dev] (no subject) In-Reply-To: References: Message-ID: Hi Christoph, On Sat, Feb 27, 2021 at 9:34 PM Christoph Baumgarten < christoph.baumgarten at gmail.com> wrote: > Hi, > > I implemented the Cramer-von-Mises test for two samples in PR 13263 > , The proposed name is > cramervonmises_2samp. The one-sample version (cramervonmises) was already > released in version 1.6.0. Since a few names of tests in scipy.stats were > recently discussed on the mailing list (though cramervonmises was not), I > just wanted to mention the PR here in case there are concerns about the > name. > This seems like a nice function to add. > One additional remark: for the KS test, there are three functions: kstest, > ks_1samp, ks_2samp and kstest can be used both for the one- and two-sample > tests. In my view, this makes the definition of kstest quite complicated > since the meaning of the parameters depends on the version of the test and > one needs a helper function _parse_kstest_args(data1, data2, args, N) in > stats/stats.py > > So maybe cramervonmises_1samp and cramervonmises_2samp would have been a > good choice, though I hope the names cramervonmises and > cramervonmises_2samp also guide the user when to use which function. (While > writing this message, I noted that the documentation of cramervonmises > should state more clearly that it is about the one-sample test, e.g. "Perform > the Cramér-von Mises test for goodness of fit.'
--> 'Perform the one-sample > ...') > I agree with your assessment - `kstest` doing both is not great; keeping things separate like for cramervonmises(_2samp) is nicer. Cheers, Ralf > Any views? Thanks for your feedback > > Christoph > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Sun Feb 28 14:48:24 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Sun, 28 Feb 2021 20:48:24 +0100 Subject: [SciPy-Dev] Using more functionalities from GitHub? In-Reply-To: References: <0689A9B7-5561-4782-A83B-AF8C6EA80641@gmail.com> Message-ID: <65EA20D3-70D3-4B9E-A010-3C9FA78B6A5A@gmail.com> > On 27.02.2021, at 17:25, Ralf Gommers wrote: > > > The most related GitHub feature is CODEOWNERS: https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/about-code-owners . It can be used to automatically request PR reviews from individuals or from a team. > > So there's at least three approaches: > 1. Teams per submodule and other area > 2. A bot to subscribe to labels > 3. Using CODEOWNERS > > The trouble with (1) is that it's a lot of overhead managing teams in the GitHub UI, and only people with owner/maintainer status can do it. > > My preference is (3) I think: it solves both problems to some extent, it's the most granular (you can get notifications for individual files as well as glob patterns), and it's a plain file in the repo that everyone can propose changes to via a PR. For pinging people outside of PRs, we can use the same file as documentation (just look at it, find the submodule/file of interest, and see who is subscribed to it to @-mention them). Should we try that? Thanks for pointing out CODEOWNERS, I didn't know about this! I agree with you, this looks like a good idea and would bring more value than just grouping members by tags.
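For anyone else who had not seen the format: CODEOWNERS lives at the repository root or under `.github/`, and maps gitignore-style patterns to default reviewers, with later matches taking precedence. A minimal sketch, with placeholder handles and teams rather than real SciPy maintainers:

```
# A trailing slash covers everything under the directory.
scipy/signal/              @some-signal-maintainer
scipy/stats/               @scipy/hypothetical-stats-team
# Globs can target individual files too.
scipy/interpolate/*rbf*    @some-interpolate-maintainer
```

GitHub then automatically requests a review from the matching owners whenever a pull request touches those paths.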
This would also not prevent us from also having, if needed, some grouping for convenience, like a maintainer team, review team, build team, etc. So, I vote in favor of this solution. > > > Project > > We have the roadmap and some issues are acting as meta-issues. But we could use, instead/on top, the integrated project management dashboard. > It would be more convenient to keep track of what can be done, who is working on what area, etc. > > My experience with GitHub Projects isn't great, it's more work than tracking issues to keep up to date, and is less integrated with the rest of the GitHub workflow. I'd be happy to give interested people the permissions to create new project boards for specific topics if that's how they like to work, but I'd like to keep it completely optional. > > Also, this would clearly be in favor of openness and transparency with the management of the library. > > There's actually very little management going on. There's zero hidden repositories or other content; the only thing is a very low-traffic private maintainer mailing list that is meant only for topics that aren't always good to discuss in public (mostly just deciding on giving someone more permissions). Maybe it's time to give regular community calls a go again - it's working quite well for NumPy. We tried it briefly a couple of years ago and it was useful but I dropped the ball on organizing at some point because I was too busy. > > Should we try that again? Maybe regular once a month Zoom calls open to anyone who wants to attend? Thanks for the clarification. Currently, what I am personally missing is a way to easily understand the project directions. Why we do some things, where the project is going, why the roadmap is like this, etc. So something a bit more detailed than the roadmap. Meeting minutes of such Zoom calls could be relevant. But that's some work to do. That's why I suggested something like Projects, as it's fairly quick to update and follow.
But I am a new player here, so I am just suggesting, as you (the maintainers) will have to do most of the work here. Cheers, Pamphile -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Feb 28 17:13:13 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 28 Feb 2021 23:13:13 +0100 Subject: [SciPy-Dev] ENH: improve RBF interpolation In-Reply-To: References: <274ab45e-a9cc-4710-a40e-65b6f96c0b4a@www.fastmail.com> Message-ID: On Mon, Feb 22, 2021 at 3:43 PM Trever Hines wrote: > Hello all, > > I have made a pull request here https://github.com/scipy/scipy/pull/13595, > and I would appreciate any feedback. > Your PR looks really good, thanks for working on this Trever! > On Thu, Feb 4, 2021 at 7:23 PM Stefan van der Walt > wrote: > >> >> Is there any advantage to keeping the old interface, or should this >> eventually replace Rbf entirely? >> >> > My intention is for `RBFInterpolator` to replace `Rbf` entirely. It > should be possible to replicate the functionality of `Rbf` with > `RBFInterpolator` (albeit with warnings when the interpolant may not be > well-posed). `Rbf` is not currently deprecated in my PR, but I can make > that change if you think it is appropriate. > It looks to me like deprecating Rbf is a good idea. However, it's best not to do it too quickly after introducing the replacement, because then you force users that want to be compatible with multiple versions of scipy to do:

    if scipy.__version__ >= '1.7.0':
        RBFInterpolator(...)
    else:
        Rbf(...)

So I would suggest that in your open PR you add a note to the `Rbf` docstring saying something like "`Rbf` is legacy code, for new usage please use `RBFInterpolator` instead". Cheers, Ralf > > -Trever > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:
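A side note on the version check Ralf sketches: comparing `scipy.__version__` as a string is unreliable ("1.10.0" sorts before "1.7.0" lexically), so a robust gate should compare numeric tuples. A minimal sketch, using a stand-in version string instead of importing scipy (in real code, `packaging.version.parse` is the usual tool):

```python
def version_tuple(version):
    # Parse the leading numeric part of each dotted component, so that
    # e.g. "1.10.0" -> (1, 10, 0) and "1.7.0rc1" -> (1, 7, 0).
    parts = []
    for piece in version.split(".")[:3]:
        num = ""
        for ch in piece:
            if not ch.isdigit():
                break
            num += ch
        parts.append(int(num) if num else 0)
    return tuple(parts)

scipy_version = "1.7.0"  # stand-in for scipy.__version__

# Numeric comparison orders versions correctly, unlike string comparison:
if version_tuple(scipy_version) >= (1, 7, 0):
    chosen = "RBFInterpolator"  # the new class proposed in gh-13595
else:
    chosen = "Rbf"              # the legacy interface
print(chosen)  # RBFInterpolator
```

This keeps user code working across the deprecation window without tying it to a particular release's string format.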