From roy.pamphile at gmail.com Thu Jul 1 08:14:20 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Thu, 1 Jul 2021 14:14:20 +0200
Subject: [SciPy-Dev] Code formatting: black
Message-ID:

Hi everyone,

I have recently been working on using black on our code base.

I have created this PR: https://github.com/scipy/scipy/pull/14330. There are
lots of points, hence I invite you to come over and comment directly instead
of replying to this email.

Thanks in advance for having a look, I hope we can make this work together!

Cheers,
Pamphile

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com Thu Jul 1 08:52:54 2021
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 1 Jul 2021 08:52:54 -0400
Subject: [SciPy-Dev] Code formatting: black
In-Reply-To:
References:
Message-ID:

On Thu, Jul 1, 2021 at 8:14 AM Pamphile Roy wrote:

> Hi everyone,
>
> I have recently been working on using black on our code base.
>
> I have created this PR: https://github.com/scipy/scipy/pull/14330. There
> are lots of points, hence I invite you to come over and comment directly
> instead of replying to this email.
>
> Thanks in advance for having a look, I hope we can make this work
> together!

The main thing I don't like at all is the dedented closing parentheses and
brackets. They don't look like Python code blocks at all to me.

When I looked at black a while ago, I gave up when I saw those.

Josef

> Cheers,
> Pamphile
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From roy.pamphile at gmail.com Thu Jul 1 09:00:23 2021
From: roy.pamphile at gmail.com (Pamphile Roy)
Date: Thu, 1 Jul 2021 15:00:23 +0200
Subject: [SciPy-Dev] Code formatting: black
In-Reply-To:
References:
Message-ID:

> On 01.07.2021, at 14:52, josef.pktd at gmail.com wrote:
>
> The main thing I don't like at all is the dedented closing parentheses and
> brackets. They don't look like Python code blocks at all to me.

Seems strange, but less so after reading this:
https://lukasz.langa.pl/1d1a43c4-9c8a-4c5f-a366-7f22ce6a49fc/

TL;DR: it minimizes diffs, among other advantages.

Feel free to comment on the PR itself.

Cheers,
Pamphile

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Fri Jul 2 03:19:00 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 2 Jul 2021 09:19:00 +0200
Subject: [SciPy-Dev] Static Typing
In-Reply-To: <20210630203017.GA15579@sguelton.remote.csb>
References: <6fdc3926-8be7-4ff7-bf63-e614e3d16a65@www.fastmail.com>
 <20210630203017.GA15579@sguelton.remote.csb>
Message-ID:

On Wed, Jun 30, 2021 at 10:37 PM Serge Guelton
<serge.guelton at telecom-bretagne.eu> wrote:

> On Wed, Jun 30, 2021 at 08:14:55PM +0200, Ralf Gommers wrote:
> > On Wed, Jun 30, 2021 at 8:07 PM Stefan van der Walt
> > <stefanv at berkeley.edu> wrote:
> > > On Wed, Jun 30, 2021, at 09:03, Evgeni Burovski wrote:
> > > > ISTM it's important that annotations are optional in the sense that
> > > > we do not explicitly require that new code is typed. If someone is
> > > > willing to add them, great (and if someone is willing to review a
> > > > typing PR, even better :-)). But this should be possible to do in a
> > > > follow-up PR, not as a requirement for an enhancement PR.
> > >
> > > I agree, especially given that the typing notation is still changing.
> > > For example, they're currently working out a shorthand for typing
> > > function definitions (and I'm sure other simplifications are in the
> > > pipeline too).
>
> I think it's worth noting that some numpy interfaces are inherently
> incompatible with fine-grained static typing. One simple example would be:
>
>     def foo(x: ndarray[int, :, :], strict: bool):
>         return np.mean(x, keepdims=strict)
>
> What should be the return type of `foo`? We can't tell precisely, because
> it depends on the runtime value of `strict`. We're left with something
> along the lines of "this returns an array of the same dimension or a
> scalar of the same dtype".

The "boolean keyword to control return type or shape" is (unfortunately) so
common that there's a specific way to deal with this, using @overload:

@overload
def foo(x: ndarray[int, :, :], strict: Literal[True]) -> ndarray[int, :, :]:
    ...
@overload
def foo(x: ndarray[int, :, :], strict: Literal[False]) -> ndarray[int, :]:
    ...
# The fallback: if a user passes `strict='this-is-true'` then we have to
# guess (unless we raise an exception)
def foo(x: ndarray[int, :, :], strict: bool) -> ndarray[int, :]:
    ...

See https://mypy.readthedocs.io/en/stable/literal_types.html. So the idea
is to treat `True` and `False` as distinct types. And if you build, e.g., a
compiler for Python code, then do the same. This is fairly painful and
ugly, but doable.

There are other behaviors and functions in numpy code that are harder to
deal with: things like value-based casting, output shapes that depend on
(array) input data, and returning scalars instead of 0-D arrays. I totally
agree that boolean keywords are best avoided, but at least there is a
solution if they do happen.

> I don't know how much this dynamicity leaks into the scipy interface, but
> it does look like a difficult problem to solve.

SciPy is just as bad as NumPy in this respect.
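As an aside, the `@overload` + `Literal` pattern sketched above can be tried end to end with a small runnable example. The `reduce_mean` function below is hypothetical (it is not a NumPy API), it simply mirrors the `keepdims` situation being discussed:

```python
from typing import Literal, overload
import numpy as np

# Overload stubs: they tell the type checker which return type goes with
# which literal value of `keepdims`. They are never executed at runtime.
@overload
def reduce_mean(x: np.ndarray, keepdims: Literal[True]) -> np.ndarray: ...
@overload
def reduce_mean(x: np.ndarray, keepdims: Literal[False]) -> np.floating: ...

def reduce_mean(x, keepdims=False):
    # A single runtime implementation backs both overloads.
    return np.mean(x, keepdims=keepdims)

arr = np.ones((2, 3))
print(reduce_mean(arr, keepdims=True))   # an ndarray of shape (1, 1)
print(reduce_mean(arr, keepdims=False))  # a NumPy scalar
```

Running mypy on such a file should infer the two different return types from the two literal arguments, while calls with a plain `bool` fall through to the last signature.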
For example, scipy.stats does this a lot:

    if y.ndim == 0:
        y = y[()]  # return a float rather than an array here
    return y

Type checkers will complain loudly about this kind of thing, so having a
type checker in CI warns you about this being a bad pattern. On the other
hand, to add correct type annotations to old code that's already like that,
you have to jump through a lot of hoops.

Cheers,
Ralf

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com Fri Jul 2 12:00:54 2021
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 2 Jul 2021 12:00:54 -0400
Subject: [SciPy-Dev] Static Typing
In-Reply-To:
References: <6fdc3926-8be7-4ff7-bf63-e614e3d16a65@www.fastmail.com>
 <20210630203017.GA15579@sguelton.remote.csb>
Message-ID:

How would you annotate the scipy distributions' inputs?

args, kwargs, a flexible number of parameters, parameters that can be
passed as kwargs or args.

And how would this affect subclasses?

The distribution classes have a lot of input validation, and it took me
some time recently to get a subclass to fit their design.

Josef

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tirthasheshpatel at gmail.com Sun Jul 4 03:24:05 2021
From: tirthasheshpatel at gmail.com (Tirth Patel)
Date: Sun, 4 Jul 2021 12:54:05 +0530
Subject: [SciPy-Dev] GSoC: Integrating library UNU.RAN in SciPy
In-Reply-To:
References:
Message-ID:

On Sat, Jun 12, 2021 at 4:45 PM Ralf Gommers wrote:

> On Wed, Jun 9, 2021 at 12:16 PM Tirth Patel wrote:
>
>> Hi Ralf,
>>
>> On Tue, Jun 8, 2021 at 11:14 PM Ralf Gommers wrote:
>>
>>> On Tue, Jun 8, 2021 at 6:53 PM Tirth Patel wrote:
>>>
>>>> Other Notes
>>>> -----------
>>>>
>>>> I think NumPy 1.19 introduced a stable interface for its Generator
>>>> API. Before that, the RandomState API was stable. But RandomState
>>>> doesn't have a Cython interface, and its Python interface is too slow
>>>> to be practically useful.
>>>> For instance, it takes around 7ms to sample 100,000 samples from the
>>>> normal distribution using the TDR method on NumPy >= 1.19, while it
>>>> takes around 911ms on NumPy == 1.18, which is around 130 times slower!
>>>> Is there a plan to drop 1.16 soon, and can we use the unstable
>>>> Generator API from 1.17 onwards, or would it be too unsafe? Maybe this
>>>> discussion isn't suited here, but I just thought to put it out as a
>>>> note.
>>>
>>> We can drop NumPy 1.16 right now. I'm not sure if the 1.17 C API for
>>> numpy.random is usable - it was either missing some features or not
>>> present at all.
>>
>> Nice to hear that we don't need to support 1.16 now! With that, I think
>> there is a possibility of using the Cython API of the NumPy BitGenerator
>> to speed things up. I checked out a few releases of NumPy and found that
>> BitGenerator was added in 1.17 with a Cython API. All we need are the
>> `next_double` and `state` members of the `bitgen_t` object, which are
>> present from 1.17 onwards. The only difference is that 1.17 contains
>> `bitgen_t` in the `numpy.random.common` module, while it was moved to
>> `numpy.random` from 1.18 onwards. I don't know if there are any known
>> bugs in 1.17 and 1.18 before it became stable in 1.19. If not, we might
>> be able to use the unstable NumPy Generator in 1.17 and 1.18 for our
>> purpose. What do you think?
>
> I think this is fine - if it helps that much, let's just try it. If it
> passes all tests with the latest bugfix releases of 1.17.x and 1.18.x,
> then it should be okay.

While refactoring the UNU.RAN wrapper to address the recent memory leak
issue, I tried to do this, but it turns out 1.17 doesn't ship the C/Cython
API. So I have kept things as-is for now. Sorry for the noise here!

> Cheers,
> Ralf
>
>>> What the recently added biasedurn does is a conditional compile - only
>>> use that C API for NumPy >= 1.19. If the performance on <1.19 isn't
>>> completely unusable, that may be a good option?
>> Yes, that's what I do right now. But I am just worried that the
>> performance on <1.19 is too slow to practically rely on. Anyway, thanks
>> for looking into this!
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Sun Jul 4 04:04:36 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 4 Jul 2021 10:04:36 +0200
Subject: [SciPy-Dev] Static Typing
In-Reply-To:
References: <6fdc3926-8be7-4ff7-bf63-e614e3d16a65@www.fastmail.com>
 <20210630203017.GA15579@sguelton.remote.csb>
Message-ID:

On Fri, Jul 2, 2021 at 6:01 PM wrote:

> How would you annotate the scipy distributions' inputs?
>
> args, kwargs, a flexible number of parameters, parameters that can be
> passed as kwargs or args.
>
> And how would this affect subclasses?
>
> The distribution classes have a lot of input validation, and it took me
> some time recently to get a subclass to fit their design.

In their current form, they're basically impossible to type (and we
shouldn't try). If we'd rewrite the framework, it would not look anything
like the current design though - everything `def some_method(x, *args,
**kwargs)` is madness, and all the shape-parameter broadcasting is
madness^2 - we're still finding new bugs there after 10+ years.

It's a complex topic, so a new design would require a lot of thought, but I
think the public methods would look something like:

    def pdf(x: ndarray, a: float, loc: float | None,
            scale: float | None) -> ndarray:
        ...

    def rvs(a: float, size: int, loc: float | None, scale: float | None,
            random_state: np.random.Generator | int | None) -> ndarray:
        ...

It would not have `rv_xxx` base classes, but factory functions to generate
classes. No third-party subclassing, just provide the tools to define your
own classes with the factory functions.
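To make the factory-function idea above concrete, here is a toy sketch. All names (`make_distribution`, `Expon`, and the keyword signatures) are hypothetical illustrations, not a proposed SciPy API; the point is only that classes can be generated from plain functions instead of being obtained by subclassing an `rv_continuous`-style base:

```python
import numpy as np

def make_distribution(name, pdf, rvs):
    """Toy factory: build a distribution class from two plain functions,
    one for the density and one for sampling."""
    def dist_pdf(self, x, a, loc=0.0, scale=1.0):
        # Standard loc/scale transformation applied around the core pdf.
        return pdf((np.asarray(x) - loc) / scale, a) / scale

    def dist_rvs(self, a, size, loc=0.0, scale=1.0, random_state=None):
        rng = np.random.default_rng(random_state)
        return loc + scale * rvs(rng, a, size)

    return type(name, (), {"pdf": dist_pdf, "rvs": dist_rvs})

# Example: an exponential distribution with shape parameter `a` as the rate.
Expon = make_distribution(
    "Expon",
    pdf=lambda x, a: np.where(x >= 0, a * np.exp(-a * x), 0.0),
    rvs=lambda rng, a, size: rng.exponential(1.0 / a, size=size),
)

d = Expon()
print(d.pdf(0.0, a=2.0))           # 2.0
print(d.rvs(a=2.0, size=3).shape)  # (3,)
```

One appeal of this style is that the factory, not the user, owns the method signatures and input validation, so the typing problem from earlier in this thread largely disappears.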
Cheers,
Ralf

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Sun Jul 4 17:46:25 2021
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 4 Jul 2021 23:46:25 +0200
Subject: [SciPy-Dev] proposal to use SciPy project funds for build and CI improvements
Message-ID:

Hi all,

I'd like to make a proposal to use some project funds to make faster
progress on moving to a new build system and improving our CI timeout
issues.

Because this is the first such proposal, let me first deal with the meta
issue of _how_ to deal with this. At the start of the year I co-wrote NEP
48 (https://numpy.org/neps/nep-0048-spending-project-funds.html) to have a
framework for NumPy for spending project funds. I'd like to follow that NEP
also for SciPy. So I'll make the proposal here, in accordance with that
NEP, and then the other thing is to figure out where to put it up as a PR
(or wiki page, or ...). For that my proposal is: create a new repo called
something like `project-mgmt`, where we can put such proposals as well as
meeting minutes, presentations, and other content. And have a single
summary page in Markdown format with a list of all proposals made, linking
out to the full proposals.

On to the actual proposal, keeping in mind
https://numpy.org/neps/nep-0048-spending-project-funds.html#defining-fundable-activities-and-projects.

Title: Accelerating the move to Meson as the SciPy build system
Cost: $12,000
Time duration: 3 months
Team members: Matthew Brett (funded), Ralf Gommers (unfunded), Gagandeep
Singh (unfunded), Smit Lunagariya (funded)

Context:

1. Multiple people recently told me that our current build system and CI
   issues are a pain point for working on SciPy. We have CI timeouts on our
   main repo, scipy-wheels is a pain, and conda-forge is also timing out on
   aarch64 and ppc64le builds.
   And I've found myself spending a whole day or weekend several times in
   the past months to fix broken CI jobs due to build/packaging issues.

2. About 4 months ago, I wrote an RFC to move to Meson as a build system,
   see https://github.com/scipy/scipy/issues/13615. Since then I have spent
   a lot of time on this, and also used Quansight Labs funding to get one
   of my colleagues, Gagandeep Singh, to help me move this forward faster
   (and he has been super helpful). Getting a Linux + OpenBLAS dev build is
   close to done (16 of 20 modules complete, all that's left is `misc`,
   `io`, `integrate` and `signal` - probably those can be completed in the
   coming week).

3. Doing everything else, like adding support for
   Windows/aarch64/MKL/ILP64/etc., is still a lot of work. There's a
   tracking issue at https://github.com/rgommers/scipy/issues/22. Some of
   this stuff is pretty specialized, e.g., building wheels, and
   Fortran-on-Windows. It will take a long time to complete if it's just
   me spending time on the weekends.

4. Matthew Brett is available to help soon. He has written multibuild and
   dealt with build/packaging topics for a long time, so I think he would
   be able to make progress quickly. He wants to work on this topic as an
   independent contractor (I won't add rates here, that's for the core
   team to look at - but they're very reasonable).

The overall goals of this effort are:

1. Get a (mostly) feature-complete Meson-based build system in place by
   the end of October, so we can ship it in the 1.8.0 release. If all goes
   well, get rid of the `numpy.distutils`-based build one release later.

2. Resolve all our CI timeout issues. If the build system alone isn't
   enough (it should be though), analyze the remaining issues and figure
   out how to tackle whatever is still problematic.

The deliverables are, in order of priority:

1. Add capability to build release artifacts (wheels, sdist).
   I did some initial tests at the start with mesonpep517, but I have no
   idea if it works for SciPy and what bugs are lurking there.

2. macOS support

3. Add support for things we need to a dev.py interface (basically, like
   runtests.py in that it can do everything, like building docs, running
   benchmarks, etc.)

4. Contribute a few things we need, or that would make the build setup
   better, to Meson. For example, I think we'd want to be able to use
   `py3.install_sources` on generated targets. Of course I'm still new-ish
   to Meson, so it could well be that after making a proposal on the Meson
   issue tracker, another solution will be preferred.

5. Windows support

6. Improve build dependency detection and configuration - in particular
   BLAS/LAPACK flavors.

7. Clean up as many build warnings as possible, and silence the rest. The
   goal is to have a CI job that is completely silent (passes with
   `-Werror`).

8. Optimize build time further, including on CI (e.g., reuse build cache
   between jobs).

I'd like to use 75% of the funds to pay Matthew, and 25% of the funds to
take on a talented intern, Smit Lunagariya. He is comfortable working in
Python, C and C++, and will be able to help with deliverables 7 and 8, as
well as with testing and other tasks.

Overall I'm not certain that this amount of funds is enough to complete
every last deliverable on this list; it's hard to estimate. But we should
certainly get close, and be able to ship a Meson build in 1.8.0.

For a quick sketch of the benefits:

- On my machine the build time is currently 13 minutes with
  numpy.distutils and 90 seconds with Meson.
- The build log goes from >10,000 lines (mostly noise) to ~100 lines
  (mostly configuration info).

Why should this be a funded activity? For two reasons: (1) build and CI
issues are probably the number one maintenance issue we have, and (2) most
people find working on build/CI boring and painful, and hence it doesn't
happen with volunteer-only activities.
Even regular maintenance of our CI is hard - recently jobs have been
broken for 1-2 weeks at a time a couple of times. And the scipy-wheels
repo is in even worse shape.

My proposal is to designate this topic as high priority, let the core team
approve the compensation levels for the funded people, and get the work
started. For reporting on progress and to have some accountability about
how we spent the funds, I propose a short report (say 1 page, giving a
status update and linking the main PRs made or issues closed) when half
the funds are spent, and another such report at the end.

In addition I'll plan to write a small development grant for this when the
next call for that opens up, because I think we want Matthew to continue
working on this after this $12,000 has been spent. I'd like to get
everything as good as it can be, so other projects can easily adopt Meson
as well - to also achieve my not-so-secret goal of not having to integrate
numpy.distutils capabilities into setuptools once distutils is removed
from Python.

Thoughts?

Cheers,
Ralf

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev at python.org
https://mail.python.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tyler.je.reddy at gmail.com Sun Jul 4 18:00:38 2021
From: tyler.je.reddy at gmail.com (Tyler Reddy)
Date: Sun, 4 Jul 2021 16:00:38 -0600
Subject: [SciPy-Dev] proposal to use SciPy project funds for build and CI improvements
In-Reply-To:
References:
Message-ID:

+1

On Sun, 4 Jul 2021 at 15:47, Ralf Gommers wrote:

> Hi all,
>
> I'd like to make a proposal to use some project funds to make faster
> progress on moving to a new build system and improving our CI timeout
> issues.
>
> Because this is the first such proposal, let me first deal with the meta
> issue of _how_ to deal with this. At the start of the year I co-wrote NEP
> 48 (https://numpy.org/neps/nep-0048-spending-project-funds.html) to have
> a framework for NumPy for spending project funds. I'd like to follow that
> NEP also for SciPy.
I'd like to get > everything as good as it can be, so other projects can easily adopt Meson > as well - to also achieve my not-so-secret goal of not having to integrate > numpy.distutils capabilities into setuptools once distutils is removed from > Python. > > Thoughts? > > Cheers, > Ralf > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Mon Jul 5 01:53:47 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Mon, 5 Jul 2021 07:53:47 +0200 Subject: [SciPy-Dev] proposal to use SciPy project funds for build and CI improvements Message-ID: <4102ECDC-361D-4352-9F25-580AD6581ED6@gmail.com> +1 for both the project and the new repository. (Glad to read about the log as well!) Thanks Ralf. Cheers, Pamphile > On 5 Jul 2021, at 00:01, Tyler Reddy wrote: > > ? > +1 > >> On Sun, 4 Jul 2021 at 15:47, Ralf Gommers wrote: >> Hi all, >> >> I'd like to make a proposal to use some project funds to make faster progress on moving to a new build system and improving our CI timeout issues. >> >> Because this is the first such proposal, let me first deal with the meta issue of _how_ to deal with this. At the start of the year I co-wrote NEP 48 (https://numpy.org/neps/nep-0048-spending-project-funds.html) to have a framework for NumPy for spending project funds. I'd like to follow that NEP also for SciPy. So I'll make the proposal here, in accordance with that NEP, and then the other thing is to figure out where to put it up as a PR (or wiki page, or ...). For that my proposal is: create a new repo called something like `project-mgmt`, where we can put such proposals as well as meeting minutes, presentations, and other content. And have a single summary page in Markdown format with a list of all proposals made, linking out to the full proposals. 
>> >> Cheers, >> Ralf >> >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From nil.goyette at imeka.ca Mon Jul 5 10:11:51 2021 From: nil.goyette at imeka.ca (Nil Goyette) Date: Mon, 5 Jul 2021 10:11:51 -0400 Subject: [SciPy-Dev] qhull FFI Message-ID: Hi all, I've been trying to use qhull in a Rust project so of course I wondered how SciPy managed to do it before I tried to code my own FFI. When my version didn't work, I looked deeper in SciPy and there's something I don't understand. We can find several definitions in qhull_src/src/libqhull_r.h, like facetT ctypedef struct facetT: coordT offset coordT *center coordT *normal facetT *previous; facetT *next; ... a total of 16 fields And the "corresponding" definitions in qhull.pyx struct facetT { coordT offset; coordT *normal; union { 6 fields } coordT *center; facetT *next facetT *previous ... A total of 37 fields Now I'm puzzled. There are several missing fields and they are wrongly ordered! I'm not an FFI expert at all but, as a programmer, I would have thought that the definitions needed to match perfectly, at least for the order and the number of bytes (total and per field). Can someone please explain to me why only some fields match and why this is not a problem? Thank you. Nil Goyette -- During this time of social distancing, we offer free webinars on subjects that matter. CONFIDENTIALITY NOTICE: This message, and any attachments, is intended only for the use of the addressee or his authorized representative. It may contain information that is privileged, confidential and exempt from disclosure under applicable law. 
If the reader of this message is not the intended recipient, or his authorized representative, you are hereby notified that any dissemination, distribution or copying of this message and any attachments is strictly prohibited. The integrity of this message cannot be guaranteed on the Internet, IMEKA shall not be liable for its content if altered, changed or falsified. If you have received this message in error, please contact immediately the sender and delete this message and any attachments from your system. AVIS DE CONFIDENTIALITÉ : Ce message, ainsi que tout fichier qui y est joint, est destiné exclusivement aux personnes à qui il est adressé. Il peut contenir des informations de nature confidentielle qui ne doivent être divulguées en vertu des lois applicables. Si vous n'êtes pas le destinataire de ce message ou un mandataire autorisé de celui-ci, vous êtes avisé par la présente que toute impression, diffusion, distribution ou reproduction de ce message et de tout fichier qui y est joint est strictement interdite. L'intégrité de ce message n'étant pas assurée sur Internet, IMEKA ne peut être tenue responsable de son contenu s'il a été altéré, déformé ou falsifié. Si ce message vous a été transmis par erreur, veuillez en aviser sans délai l'expéditeur et l'effacer ainsi que tout fichier joint sans en conserver de copie. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jul 5 10:18:37 2021 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Jul 2021 10:18:37 -0400 Subject: [SciPy-Dev] qhull FFI In-Reply-To: References: Message-ID: On Mon, Jul 5, 2021 at 10:12 AM Nil Goyette wrote: > Hi all, > > I've been trying to use qhull in a Rust project so of course I wondered > how SciPy managed to do it before I tried to code my own FFI. When my > version didn't work, I looked deeper in SciPy and there's something I don't > understand. 
We can find several definitions In qhull_src/src/libqhull_r.h, > like facetT > > ctypedef struct facetT: > coordT offset > coordT *center > coordT *normal > facetT *previous; > facetT *next; > ... a total of 16 fields > > And the "corresponding" definitions in qhull.pyx > > struct facetT { > coordT offset; > coordT *normal; > union { 6 fields } > coordT *center; > facetT *next > facetT *previoust > ... A total of 37 fields > > Now I'm puzzled. There are several missing fields and they are wrongly > ordered! I'm not a FFI expert at all but, as a programmer, I would have > thought that the definitions needed to match perfectly, at least for the > order and the number of bytes (total and per field). > > Can someone please explain to me why only some fields match and why this > is not a problem? Thank you. > Cython is not an FFI per se. It is a language that is transpiled to C and is exposed to Python using Python's standard C extension module mechanism. When describing the `ctypedef struct`, it only needs to describe the struct members that you are going to refer to in the Cython code so that it can deal with their types properly. Because it transpiles source-to-source, it doesn't need to know the full details of the binary API. The C compiler is doing that job, and it sees the original `.h` file from QHull. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From nil.goyette at imeka.ca Mon Jul 5 11:24:50 2021 From: nil.goyette at imeka.ca (Nil Goyette) Date: Mon, 5 Jul 2021 11:24:50 -0400 Subject: [SciPy-Dev] qhull FFI In-Reply-To: References: Message-ID: Hi Robert, Thank you, this is an excellent answer. You only define what you need because it becomes a part of SciPy code, like any other pyx file is. This is not FFI at all. Copying SciPy/qhull structures and types wasn't my best idea after all :D I'll continue my work using only the qhull source then. Nil Goyette Le lun. 5 juill. 2021, ? 
10 h 19, Robert Kern a ?crit : > On Mon, Jul 5, 2021 at 10:12 AM Nil Goyette wrote: > >> Hi all, >> >> I've been trying to use qhull in a Rust project so of course I wondered >> how SciPy managed to do it before I tried to code my own FFI. When my >> version didn't work, I looked deeper in SciPy and there's something I don't >> understand. We can find several definitions In qhull_src/src/libqhull_r.h, >> like facetT >> >> ctypedef struct facetT: >> coordT offset >> coordT *center >> coordT *normal >> facetT *previous; >> facetT *next; >> ... a total of 16 fields >> >> And the "corresponding" definitions in qhull.pyx >> >> struct facetT { >> coordT offset; >> coordT *normal; >> union { 6 fields } >> coordT *center; >> facetT *next >> facetT *previoust >> ... A total of 37 fields >> >> Now I'm puzzled. There are several missing fields and they are wrongly >> ordered! I'm not a FFI expert at all but, as a programmer, I would have >> thought that the definitions needed to match perfectly, at least for the >> order and the number of bytes (total and per field). >> >> Can someone please explain to me why only some fields match and why this >> is not a problem? Thank you. >> > > Cython is not an FFI per se. It is a language that is transpiled to C and > is exposed to Python using Python's standard C extension module mechanism. > When describing the `ctypedef struct`, it only needs to describe the struct > members that you are going to refer to in the Cython code so that it can > deal with their types properly. Because it transpiles source-to-source, it > doesn't need to know the full details of the binary API. The C compiler is > doing that job, and it sees the original `.h` file from QHull. 
> > -- > Robert Kern > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -- Nil Goyette Développeur principal www.imeka.ca -------------- next part -------------- An HTML attachment was scrubbed... 
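Robert's contrast with a "true" FFI can be illustrated in Python itself with `ctypes`, where the declared struct layout really is the binary contract. The following toy sketch (field names loosely inspired by qhull's `facetT`, not its real layout) shows why a reordered or truncated declaration breaks bindings like Nil's Rust ones, while Cython does not care:

```python
import ctypes

# In a true FFI such as ctypes, the declaration determines the byte
# offsets used at runtime, so field order and completeness matter.
class FacetA(ctypes.Structure):
    _fields_ = [
        ("offset", ctypes.c_double),
        ("center", ctypes.c_void_p),
        ("normal", ctypes.c_void_p),
    ]

class FacetB(ctypes.Structure):
    # Same field names, different order: "center" now sits at a
    # different byte offset, so reading it through this declaration
    # would fetch the bytes of FacetA's "normal" instead.
    _fields_ = [
        ("offset", ctypes.c_double),
        ("normal", ctypes.c_void_p),
        ("center", ctypes.c_void_p),
    ]

print(FacetA.center.offset, FacetB.center.offset)  # e.g. 8 16 on a 64-bit build
```

Cython sidesteps this entirely because the C code it generates is compiled against qhull's own header, which carries the authoritative layout.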
URL: From stefanv at berkeley.edu Mon Jul 5 14:39:17 2021 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Mon, 05 Jul 2021 11:39:17 -0700 Subject: [SciPy-Dev] proposal to use SciPy project funds for build and CI improvements In-Reply-To: References: Message-ID: Hi Ralf, On Sun, Jul 4, 2021, at 14:46, Ralf Gommers wrote: > I'd like to make a proposal to use some project funds to make faster progress on moving to a new build system and improving our CI timeout issues. Thank you for writing this proposal. I think this will be of great benefit, not only to SciPy but also to NumPy and many other projects in the ecosystem. It is wise to prepare in advance for the distutils deprecation. > Because this is the first such proposal, let me first deal with the meta issue of _how_ to deal with this. At the start of the year I co-wrote NEP 48 (https://numpy.org/neps/nep-0048-spending-project-funds.html) to have a framework for NumPy for spending project funds. I'd like to follow that NEP also for SciPy. So I'll make the proposal here, in accordance with that NEP, and then the other thing is to figure out where to put it up as a PR (or wiki page, or ...). For that my proposal is: create a new repo called something like `project-mgmt`, where we can put such proposals as well as meeting minutes, presentations, and other content. And have a single summary page in Markdown format with a list of all proposals made, linking out to the full proposals. It would be great to have all these in one location. We can figure out the exact mechanism later, but it would be helpful to have the documents link to relevant discussion (a GitHub Discussion/PR/Issue would be better than wiki page, e.g.; but a mailing list link would suffice too). > I'd like to use 75% of the funds to pay Matthew, and 25% of the funds to take on a talented intern, Smit Lunagariya. 
He is comfortable working in Python, C and C++, and will be able to help with deliverables 7 and 8, as well as with testing and other tasks. +1 St?fan -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Mon Jul 5 15:26:38 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Mon, 5 Jul 2021 21:26:38 +0200 Subject: [SciPy-Dev] Code formatting: black In-Reply-To: References: Message-ID: <174F2E93-5AEB-4FB2-984A-6B41347A257D@gmail.com> Hi everyone, The PR highlighted the need for a proper mathematical style guide. Tools like Black can only work if we are talking the same language. Hence, I have created the following issue https://github.com/scipy/scipy/issues/14354 Thank you everyone for the discussions. I hope we can craft something to solve this issue. It would be great for the Python Scientific community in general. Cheers, Pamphile > On 01.07.2021, at 14:14, Pamphile Roy wrote: > > Hi everyone, > > I have been recently working on using black on our code base. > > I have created this PR https://github.com/scipy/scipy/pull/14330 . There are lots of points, hence I invite you to come over and comment directly instead of replying to this email. > > Thanks in advance for having a look, I hope we can make this work together! > > Cheers, > Pamphile -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Tue Jul 6 11:59:28 2021 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Tue, 6 Jul 2021 17:59:28 +0200 Subject: [SciPy-Dev] proposal to use SciPy project funds for build and CI improvements In-Reply-To: References: Message-ID: A +1 from me too. Even OpenBLAS for Windows build scripts in our docs are stolen from his stuff (by yours truly) so making it a bit more supportive and less sacrificial in terms of time and effort would make sense. 
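For readers of the code-formatting thread who have not run Black themselves, here is a toy illustration (not SciPy code) of the "dedented closing parenthesis" style being debated:

```python
def weighted_sum(a, b, c, d):
    """Toy stand-in for a call that is too long for one line."""
    return a + 2 * b + 3 * c + 4 * d

# When a call exceeds the line length, Black explodes it one argument
# per line, adds a trailing ("magic") comma, and dedents the closing
# parenthesis back to the indentation of the line that opened it.
# Adding or removing an argument then touches exactly one line, which
# is the diff-minimizing property Pamphile's link argues for.
result = weighted_sum(
    1,
    2,
    3,
    4,
)
print(result)  # 30
```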
On Mon, Jul 5, 2021 at 8:40 PM Stefan van der Walt wrote: > Hi Ralf, > > On Sun, Jul 4, 2021, at 14:46, Ralf Gommers wrote: > > I'd like to make a proposal to use some project funds to make faster > progress on moving to a new build system and improving our CI timeout > issues. > > > Thank you for writing this proposal. I think this will be of great > benefit, not only to SciPy but also to NumPy and many other projects in the > ecosystem. It is wise to prepare in advance for the distutils deprecation. > > Because this is the first such proposal, let me first deal with the meta > issue of _how_ to deal with this. At the start of the year I co-wrote NEP > 48 (https://numpy.org/neps/nep-0048-spending-project-funds.html) to have > a framework for NumPy for spending project funds. I'd like to follow that > NEP also for SciPy. So I'll make the proposal here, in accordance with that > NEP, and then the other thing is to figure out where to put it up as a PR > (or wiki page, or ...). For that my proposal is: create a new repo called > something like `project-mgmt`, where we can put such proposals as well as > meeting minutes, presentations, and other content. And have a single > summary page in Markdown format with a list of all proposals made, linking > out to the full proposals. > > > It would be great to have all these in one location. We can figure out > the exact mechanism later, but it would be helpful to have the documents > link to relevant discussion (a GitHub Discussion/PR/Issue would be better > than wiki page, e.g.; but a mailing list link would suffice too). > > I'd like to use 75% of the funds to pay Matthew, and 25% of the funds to > take on a talented intern, Smit Lunagariya. He is comfortable working in > Python, C and C++, and will be able to help with deliverables 7 and 8, as > well as with testing and other tasks. 
> > > +1 > > St?fan > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlucas7 at vt.edu Wed Jul 7 08:54:31 2021 From: rlucas7 at vt.edu (rlucas7 at vt.edu) Date: Wed, 7 Jul 2021 08:54:31 -0400 Subject: [SciPy-Dev] proposal to use SciPy project funds for build and CI improvements In-Reply-To: References: Message-ID: <8E93A717-491F-4653-B505-01720FFA5170@vt.edu> +1 -Lucas Roberts > On Jul 6, 2021, at 12:00 PM, Ilhan Polat wrote: > > ? > A +1 from me too. Even OpenBLAS for Windows build scripts in our docs are stolen from his stuff (by yours truly) so making it a bit more supportive and less sacrificial in terms of time and effort would make sense. > >> On Mon, Jul 5, 2021 at 8:40 PM Stefan van der Walt wrote: >> Hi Ralf, >> >>> On Sun, Jul 4, 2021, at 14:46, Ralf Gommers wrote: >>> I'd like to make a proposal to use some project funds to make faster progress on moving to a new build system and improving our CI timeout issues. >> >> Thank you for writing this proposal. I think this will be of great benefit, not only to SciPy but also to NumPy and many other projects in the ecosystem. It is wise to prepare in advance for the distutils deprecation. >> >>> Because this is the first such proposal, let me first deal with the meta issue of _how_ to deal with this. At the start of the year I co-wrote NEP 48 (https://numpy.org/neps/nep-0048-spending-project-funds.html) to have a framework for NumPy for spending project funds. I'd like to follow that NEP also for SciPy. So I'll make the proposal here, in accordance with that NEP, and then the other thing is to figure out where to put it up as a PR (or wiki page, or ...). 
For that my proposal is: create a new repo called something like `project-mgmt`, where we can put such proposals as well as meeting minutes, presentations, and other content. And have a single summary page in Markdown format with a list of all proposals made, linking out to the full proposals. >> >> It would be great to have all these in one location. We can figure out the exact mechanism later, but it would be helpful to have the documents link to relevant discussion (a GitHub Discussion/PR/Issue would be better than wiki page, e.g.; but a mailing list link would suffice too). >> >>> I'd like to use 75% of the funds to pay Matthew, and 25% of the funds to take on a talented intern, Smit Lunagariya. He is comfortable working in Python, C and C++, and will be able to help with deliverables 7 and 8, as well as with testing and other tasks. >> >> +1 >> >> St?fan >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Jul 10 10:42:12 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 10 Jul 2021 16:42:12 +0200 Subject: [SciPy-Dev] proposal to use SciPy project funds for build and CI improvements In-Reply-To: References: Message-ID: Thanks for the feedback everyone. Looks like we are good to get started. On Mon, Jul 5, 2021 at 8:39 PM Stefan van der Walt wrote: > Hi Ralf, > > On Sun, Jul 4, 2021, at 14:46, Ralf Gommers wrote: > > I'd like to make a proposal to use some project funds to make faster > progress on moving to a new build system and improving our CI timeout > issues. > > > Thank you for writing this proposal. 
I think this will be of great > benefit, not only to SciPy but also to NumPy and many other projects in the > ecosystem. It is wise to prepare in advance for the distutils deprecation. > > Because this is the first such proposal, let me first deal with the meta > issue of _how_ to deal with this. At the start of the year I co-wrote NEP > 48 (https://numpy.org/neps/nep-0048-spending-project-funds.html) to have > a framework for NumPy for spending project funds. I'd like to follow that > NEP also for SciPy. So I'll make the proposal here, in accordance with that > NEP, and then the other thing is to figure out where to put it up as a PR > (or wiki page, or ...). For that my proposal is: create a new repo called > something like `project-mgmt`, where we can put such proposals as well as > meeting minutes, presentations, and other content. And have a single > summary page in Markdown format with a list of all proposals made, linking > out to the full proposals. > > > It would be great to have all these in one location. We can figure out > the exact mechanism later, but it would be helpful to have the documents > link to relevant discussion (a GitHub Discussion/PR/Issue would be better > than wiki page, e.g.; but a mailing list link would suffice too). > Good point. This is what we do in NEPs as well, I'll add it to the proposal metadata. Once I have a repo set up for tracking this stuff I'll update here. Cheers, Ralf > I'd like to use 75% of the funds to pay Matthew, and 25% of the funds to > take on a talented intern, Smit Lunagariya. He is comfortable working in > Python, C and C++, and will be able to help with deliverables 7 and 8, as > well as with testing and other tasks. > > > +1 > > St?fan > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From danielschmitzsiegen at googlemail.com Sun Jul 11 04:39:08 2021 From: danielschmitzsiegen at googlemail.com (Daniel Schmitz) Date: Sun, 11 Jul 2021 10:39:08 +0200 Subject: [SciPy-Dev] Global Optimization Benchmarks In-Reply-To: References: Message-ID: Thanks to this discussion there is now already a PR by Gagandeep Singh to add DIRECT (via the original Fortran implementation) to scipy: https://github.com/scipy/scipy/pull/14300 As I thought we might want to exploit the momentum built up here, I also pinged the developer of the Python wrapper for biteopt (MIT license) if he would be interested to help to add it to scipy: https://github.com/leonidk/biteopt/pull/1 Another solver which gained some popularity (for example part of lmfit) and outperformed scipy's solvers for certain types of problems in Andrea's benchmark is AMPGO. If there was a license for it, I would also volunteer to add it to scipy (realistic timeframe: until end of the year). Cheers, Daniel On Tue, 25 May 2021 at 11:11, Ralf Gommers wrote: > > > On Mon, May 24, 2021 at 9:36 AM Andrea Gavana > wrote: > >> Hi Daniel, >> >> On Mon, 24 May 2021 at 09.23, Daniel Schmitz < >> danielschmitzsiegen at googlemail.com> wrote: >> >>> Hey Ralf, >>> >>> discussion for DIRECT was opened here: >>> https://github.com/scipy/scipy/issues/14121 and the scipydirect >>> maintainer was pinged: https://github.com/andim/scipydirect/issues/9 . >>> >> > Thanks! I commented on the issue about having checked the license info and > what I think needs to be done. > > >>> About the general discussion regarding global optimization in SciPy: I >>> am working on a SciPy style API for NLopt in my free time (mostly missing >>> documentation at the moment) and would like to benchmark its MLSL and StoGO >>> algorithms using the SciPy benchmarks. Currently, this seems very >>> complicated as any new algorithm would have to be included into SciPy >>> first. Is there a way to circumvent this? 
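To illustrate what "mimicking the scipy API" means for the libraries Daniel lists: they adopt `scipy.optimize`'s calling convention, taking the objective function and bounds and returning a result object with attributes like `.x`, `.fun` and `.nfev`. A toy pure-Python random-search solver in that style (an illustration only, not NLopt's or SciPy's actual code):

```python
import random
from types import SimpleNamespace

def minimize_random_search(fun, bounds, maxiter=1000, seed=None):
    """Toy global optimizer exposing a SciPy-like interface.

    Samples `maxiter` uniform random points inside `bounds` and
    keeps the best one; returns an object shaped like
    scipy.optimize.OptimizeResult (.x, .fun, .nfev, .success).
    """
    rng = random.Random(seed)
    best_x = None
    best_f = float("inf")
    nfev = 0
    for _ in range(maxiter):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        f = fun(x)
        nfev += 1
        if f < best_f:
            best_x, best_f = x, f
    return SimpleNamespace(x=best_x, fun=best_f, nfev=nfev, success=True)

# Sphere function: global minimum 0 at the origin.
res = minimize_random_search(lambda x: sum(v * v for v in x),
                             bounds=[(-5, 5)] * 2, seed=0)
print(res.success, res.nfev)  # True 1000
```

Because the interface matches, swapping such a solver for, say, `scipy.optimize.differential_evolution` in a simple script really is just a change of import, which is the convenience Daniel describes.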
Of course, Andrea's benchmark >>> suite would be the mother of all benchmarks if available ;). >>> >> >> I think it sounds like a very good idea to design a SciPy style API for >> NLOpt. >> > > This seems valuable for LGPL'd algorithms that we cannot include directly. > Having an optional dependency also has maintenance and CI costs, so for > BSD/MIT-licensed algorithms I'd say we prefer to vendor them rather than > have a scikit-umfpack like optional dependency. Unless there's really a ton > of code - but if it's one or two algorithms, then vendoring seems like the > way to go. > > Cheers, > Ralf > > > >> In my personal experience the two algorithms you mention (MLSL and StoGo) >> are generally less performant compared to the other ones available in >> NLOpt. >> >> That said, assuming enough interest was there and some spare time, I >> could easily rerun the whole benchmarks with those two methods as well. >> >> Andrea. >> >> >> >> >>> Cheers, >>> Daniel >>> >>> On Sun, 25 Apr 2021 at 22:32, Ralf Gommers >>> wrote: >>> >>>> Hi Daniel, >>>> >>>> Sorry for the extremely delayed reply. >>>> >>>> >>>> On Thu, Mar 11, 2021 at 10:19 AM Daniel Schmitz < >>>> danielschmitzsiegen at googlemail.com> wrote: >>>> >>>>> Hey all, >>>>> >>>>> after Andrea's suggestions, I did an extensive github search and found >>>>> several global optimization python libraries which mimic the scipy >>>>> API, so that a user only has to change the import statements. Could it >>>>> be useful to add a page in the documentation of these? >>>>> >>>> >>>> This does sound useful. Probably not a whole page, more like a note in >>>> the `minimize` docstring with references I'd think. 
>>>> >>>> >>>>> Non exhaustive list: >>>>> >>>>> DIRECT: https://github.com/andim/scipydirect >>>>> DE/PSO/CMA-ES: https://github.com/keurfonluu/stochopy >>>>> PSO: https://github.com/jerrytheo/psopy >>>>> Powell's derivative free optimizers: https://www.pdfo.net/index.html >>>>> >>>>> As DIRECT was very competitive on some of Andrea's benchmarks, it >>>>> could be useful to mimic the scipydirect repo for inclusion into scipy >>>>> (MIT license). The code is unfortunately a f2py port of the original >>>>> Fortran implementation which has hard coded bounds on the number of >>>>> function evaluations (90.000) and iterations (6.000). Any opinions on >>>>> this? >>>>> >>>> >>>> This sounds like a good idea. Would you mind opening a GitHub issue >>>> with the feature request, so we keep track of this? Contacting the original >>>> author about this would also be useful; if the author would like to >>>> upstream their code, that'd be a good outcome. >>>> >>>> Re hard coded bounds, I assume those can be removed again without too >>>> much trouble. >>>> >>>> >>>>> >>>>> I personally am very impressed by biteopt's performance, and although >>>>> it ranked very high in other global optimization benchmarks there is >>>>> no formal paper on it yet. I understand the scipy guidelines in a way >>>>> that such a paper is a requisite for inclusion into scipy. >>>>> >>>> >>>> Well, we do have the very extensive benchmarks - if it does indeed >>>> perform better than what we have, then of course we'd be happy to add a new >>>> algorithm even if it doesn't have a paper. We use papers and the citations >>>> it has as an indication of usefulness only; anything that outperforms our >>>> existing code is clearly useful. >>>> >>>> Cheers, >>>> Ralf >>>> >>>> >>>> >>>>> >>>>> Best, >>>>> >>>>> Daniel >>>>> >>>>> Daniel >>>>> >>>>> >>>>> On Sun, 17 Jan 2021 at 14:33, Andrea Gavana >>>>> wrote: >>>>> > >>>>> > Hi Stefan, >>>>> > >>>>> > You?re most welcome :-) . 
I?m happy the experts in the community >>>>> are commenting and suggesting things, and constructive criticism is also >>>>> always welcome. >>>>> > >>>>> > On Sun, 17 Jan 2021 at 12.11, Stefan Endres < >>>>> stefan.c.endres at gmail.com> wrote: >>>>> >> >>>>> >> Dear Andrea, >>>>> >> >>>>> >> Thank you very much for this detailed analysis. I don't think I've >>>>> seen such a large collection of benchmark test suites or collection of DFO >>>>> algorithms since the publication by Rios and Sahinidis in 2013. Some >>>>> questions: >>>>> >> >>>>> >> Many of the commercial algorithms offer free licenses for >>>>> benchmarking problems of less than 10 dimensions. Would you be willing to >>>>> include some of these in your benchmarks at some point? It would be a great >>>>> reference to use. >>>>> > >>>>> > >>>>> > I?m definitely willing to include those commercial algorithms. The >>>>> test suite per se is almost completely automated, so it?s not that >>>>> complicated to add one or more solvers. I?m generally more inclined in >>>>> testing open source algorithms but there?s nothing stopping the inclusion >>>>> of commercial ones. >>>>> > >>>>> > I welcome any suggestions related to commercial solvers, as long as >>>>> they can run on Python 2 / Python 3 and on Windows (I might be able to >>>>> setup a Linux virtual machine if absolutely needed but that would defy part >>>>> of the purpose of the exercise - SHGO, Dual Annealing and the other SciPy >>>>> solvers run on all platforms that support SciPy). >>>>> > >>>>> >> The collection of test suites you've garnered could be immensely >>>>> useful for further algorithm development. Is there a possibility of >>>>> releasing the code publicly (presumably after you've published the results >>>>> in a journal)? >>>>> >> >>>>> >> In this case I would also like to volunteer to run some of the >>>>> commercial solvers on the benchmark suite. 
>>>>> >> It would also help to have a central repository for fixing bugs and >>>>> adding lower global minima when they are found (of which there are quite >>>>> few ). >>>>> > >>>>> > >>>>> > >>>>> > I?m still sorting out all the implications related to a potential >>>>> paper with my employer, but as far as I can see there shouldn?t be any >>>>> problem with that: assuming everything goes as it should, I will definitely >>>>> push for making the code open source. >>>>> > >>>>> > >>>>> >> >>>>> >> Comments on shgo: >>>>> >> >>>>> >> High RAM use in higher dimensions: >>>>> >> >>>>> >> In the higher dimensions the new simplicial sampling can be used >>>>> (not pushed to scipy yet; I still need to update some documentation before >>>>> the PR). This alleviates, but does not eliminate the memory leak issue. As >>>>> you've said SHGO is best suited to problems below 10 dimensions as any >>>>> higher leaves the realm of DFO problems and starts to enter the domain of >>>>> NLP problems. My personal preference in this case is to use the stochastic >>>>> algorithms (basinhopping and differential evolution) on problems where it >>>>> is known that a gradient based solver won't work. >>>>> >> >>>>> >> An exception to this "rule" is when special grey box information >>>>> such as symmetry of the objective function (something that can be supplied >>>>> to shgo to push the applicability of the algorithm up to ~100 variables) or >>>>> pre-computed bounds on the Lipschitz constants is known. >>>>> >> >>>>> >> In the symmetry case SHGO can solve these by supplying the >>>>> `symmetry` option (which was used in the previous benchmarks done by me for >>>>> the JOGO publication, although I did not specifically check if performance >>>>> was actually improved on those problems, but shgo did converge on all >>>>> benchmark problems in the scipy test suite). >>>>> >> >>>>> >> I have had a few reports of memory leaks from various users. 
I have >>>>> spoken to a few collaborators about the possibility of finding a Masters >>>>> student to cythonize some of the code or otherwise improve it. Hopefully, >>>>> this will happen in the summer semester of 2021. >>>>> > >>>>> > >>>>> > To be honest I wouldn't be so concerned in general: SHGO is an >>>>> excellent global optimization algorithm and it consistently ranks at the >>>>> top, no matter what problems you throw at it. Together with Dual Annealing, >>>>> SciPy has gained two phenomenal nonlinear solvers and I'm very happy to see >>>>> that SciPy is now at the cutting edge of the open source global >>>>> optimization universe. >>>>> > >>>>> > Andrea. >>>>> > >>>>> >> Thank you again for compiling this large set of benchmark results. >>>>> >> >>>>> >> Best regards, >>>>> >> Stefan >>>>> >> On Fri, Jan 8, 2021 at 10:21 AM Andrea Gavana < >>>>> andrea.gavana at gmail.com> wrote: >>>>> >>> >>>>> >>> Dear SciPy Developers & Users, >>>>> >>> >>>>> >>> long time no see :-) . I thought to start 2021 with a bit of a >>>>> bang, to try and forget how bad 2020 has been... So I am happy to present >>>>> you with a revamped version of the Global Optimization Benchmarks from my >>>>> previous exercise in 2013. >>>>> >>> >>>>> >>> This new set of benchmarks pretty much supersedes - and greatly >>>>> expands - the previous analysis that you can find at this location: >>>>> http://infinity77.net/global_optimization/ . >>>>> >>> >>>>> >>> The approach I have taken this time is to select as many benchmark >>>>> test suites as possible: most of them are characterized by test function >>>>> generators, from which we can actually create almost an unlimited number of >>>>> unique test problems. Biggest news are: >>>>> >>> >>>>> >>> This whole exercise is made up of 6,825 test problems divided >>>>> across 16 different test suites: most of these problems are of low >>>>> dimensionality (2 to 6 variables) with a few benchmarks extending to 9+ >>>>> variables. 
With all the sensitivities performed during this exercise on >>>>> those benchmarks, the overall grand total number of functions evaluations >>>>> stands at 3,859,786,025 - close to 4 billion. Not bad. >>>>> >>> A couple of "new" optimization algorithms I have ported to Python: >>>>> >>> >>>>> >>> MCS: Multilevel Coordinate Search, it?s my translation to Python >>>>> of the original Matlab code from A. Neumaier and W. Huyer (giving then for >>>>> free also GLS and MINQ) I have added a few, minor improvements compared to >>>>> the original implementation. >>>>> >>> BiteOpt: BITmask Evolution OPTimization , I have converted the C++ >>>>> code into Python and added a few, minor modifications. >>>>> >>> >>>>> >>> >>>>> >>> Enough chatting for now. The 13 tested algorithms are described >>>>> here: >>>>> >>> >>>>> >>> http://infinity77.net/go_2021/ >>>>> >>> >>>>> >>> High level description & results of the 16 benchmarks: >>>>> >>> >>>>> >>> http://infinity77.net/go_2021/thebenchmarks.html >>>>> >>> >>>>> >>> Each benchmark test suite has its own dedicated page, with more >>>>> detailed results and sensitivities. >>>>> >>> >>>>> >>> List of tested algorithms: >>>>> >>> >>>>> >>> AMPGO: Adaptive Memory Programming for Global Optimization: this >>>>> is my Python implementation of the algorithm described here: >>>>> >>> >>>>> >>> >>>>> http://leeds-faculty.colorado.edu/glover/fred%20pubs/416%20-%20AMP%20(TS)%20for%20Constrained%20Global%20Opt%20w%20Lasdon%20et%20al%20.pdf >>>>> >>> >>>>> >>> I have added a few improvements here and there based on my Master >>>>> Thesis work on the standard Tunnelling Algorithm of Levy, Montalvo and >>>>> Gomez. After AMPGO was integrated in lmfit, I have improved it even more - >>>>> in my opinion. >>>>> >>> >>>>> >>> BasinHopping: Basin hopping is a random algorithm which attempts >>>>> to find the global minimum of a smooth scalar function of one or more >>>>> variables. 
The algorithm was originally described by David Wales: >>>>> >>> >>>>> >>> http://www-wales.ch.cam.ac.uk/ >>>>> >>> >>>>> >>> BasinHopping is now part of the standard SciPy distribution. >>>>> >>> >>>>> >>> BiteOpt: BITmask Evolution OPTimization, based on the algorithm >>>>> presented in this GitHub link: >>>>> >>> >>>>> >>> https://github.com/avaneev/biteopt >>>>> >>> >>>>> >>> I have converted the C++ code into Python and added a few, minor >>>>> modifications. >>>>> >>> >>>>> >>> CMA-ES: Covariance Matrix Adaptation Evolution Strategy, based on >>>>> the following algorithm: >>>>> >>> >>>>> >>> http://www.lri.fr/~hansen/cmaesintro.html >>>>> >>> >>>>> >>> http://www.lri.fr/~hansen/cmaes_inmatlab.html#python (Python code >>>>> for the algorithm) >>>>> >>> >>>>> >>> CRS2: Controlled Random Search with Local Mutation, as implemented >>>>> in the NLOpt package: >>>>> >>> >>>>> >>> >>>>> http://ab-initio.mit.edu/wiki/index.php/NLopt_Algorithms#Controlled_Random_Search_.28CRS.29_with_local_mutation >>>>> >>> >>>>> >>> DE: Differential Evolution, described in the following page: >>>>> >>> >>>>> >>> http://www1.icsi.berkeley.edu/~storn/code.html >>>>> >>> >>>>> >>> DE is now part of the standard SciPy distribution, and I have >>>>> taken the implementation as it stands in SciPy: >>>>> >>> >>>>> >>> >>>>> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html#scipy.optimize.differential_evolution >>>>> >>> >>>>> >>> DIRECT: the DIviding RECTangles procedure, described in: >>>>> >>> >>>>> >>> >>>>> https://www.tol-project.org/export/2776/tolp/OfficialTolArchiveNetwork/NonLinGloOpt/doc/DIRECT_Lipschitzian%20optimization%20without%20the%20lipschitz%20constant.pdf >>>>> >>> >>>>> >>> >>>>> http://ab-initio.mit.edu/wiki/index.php/NLopt_Algorithms#DIRECT_and_DIRECT-L >>>>> (Python code for the algorithm) >>>>> >>> >>>>> >>> DualAnnealing: the Dual Annealing algorithm, taken directly from >>>>> the SciPy implementation: >>>>> >>> >>>>> 
>>> >>>>> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.dual_annealing.html#scipy.optimize.dual_annealing >>>>> >>> >>>>> >>> LeapFrog: the Leap Frog procedure, which I have been recommended >>>>> for use, taken from: >>>>> >>> >>>>> >>> https://github.com/flythereddflagg/lpfgopt >>>>> >>> >>>>> >>> MCS: Multilevel Coordinate Search, it?s my translation to Python >>>>> of the original Matlab code from A. Neumaier and W. Huyer (giving then for >>>>> free also GLS and MINQ): >>>>> >>> >>>>> >>> https://www.mat.univie.ac.at/~neum/software/mcs/ >>>>> >>> >>>>> >>> I have added a few, minor improvements compared to the original >>>>> implementation. See the MCS section for a quick and dirty comparison >>>>> between the Matlab code and my Python conversion. >>>>> >>> >>>>> >>> PSWARM: Particle Swarm optimization algorithm, it has been >>>>> described in many online papers. I have used a compiled version of the C >>>>> source code from: >>>>> >>> >>>>> >>> http://www.norg.uminho.pt/aivaz/pswarm/ >>>>> >>> >>>>> >>> SCE: Shuffled Complex Evolution, described in: >>>>> >>> >>>>> >>> Duan, Q., S. Sorooshian, and V. Gupta, Effective and efficient >>>>> global optimization for conceptual rainfall-runoff models, Water Resour. >>>>> Res., 28, 1015-1031, 1992. >>>>> >>> >>>>> >>> The version I used was graciously made available by Matthias Cuntz >>>>> via a personal e-mail. >>>>> >>> >>>>> >>> SHGO: Simplicial Homology Global Optimization, taken directly from >>>>> the SciPy implementation: >>>>> >>> >>>>> >>> >>>>> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.shgo.html#scipy.optimize.shgo >>>>> >>> >>>>> >>> >>>>> >>> List of benchmark test suites: >>>>> >>> >>>>> >>> SciPy Extended: 235 multivariate problems (where the number of >>>>> independent variables ranges from 2 to 17), again with multiple >>>>> local/global minima. 
>>>>> >>> >>>>> >>> I have added about 40 new functions to the standard SciPy >>>>> benchmarks and fixed a few bugs in the existing benchmark models in the >>>>> SciPy repository. >>>>> >>> >>>>> >>> GKLS: 1,500 test functions, with dimensionality varying from 2 to >>>>> 6, generated with the super famous GKLS Test Functions Generator. I have >>>>> taken the original C code (available at http://netlib.org/toms/) and >>>>> converted it to Python. >>>>> >>> >>>>> >>> GlobOpt: 288 tough problems, with dimensionality varying from 2 to >>>>> 5, created with another test function generator which I arbitrarily named >>>>> ?GlobOpt?: >>>>> https://www.researchgate.net/publication/225566516_A_new_class_of_test_functions_for_global_optimization >>>>> . The original code is in C++ and I have bridged it to Python using Cython. >>>>> >>> >>>>> >>> Many thanks go to Professor Marco Locatelli for providing an >>>>> updated copy of the C++ source code. >>>>> >>> >>>>> >>> MMTFG: sort-of an acronym for ?Multi-Modal Test Function with >>>>> multiple Global minima?, this test suite implements the work of Jani >>>>> Ronkkonen: >>>>> https://www.researchgate.net/publication/220265526_A_Generator_for_Multimodal_Test_Functions_with_Multiple_Global_Optima >>>>> . It contains 981 test problems with dimensionality varying from 2 to 4. >>>>> The original code is in C and I have bridge it to Python using Cython. >>>>> >>> >>>>> >>> GOTPY: a generator of benchmark functions using the >>>>> Bocharov-Feldbaum ?Method-Min?, containing 400 test problems with >>>>> dimensionality varying from 2 to 5. I have taken the Python implementation >>>>> from https://github.com/redb0/gotpy and improved it in terms of >>>>> runtime. >>>>> >>> >>>>> >>> Original paper from >>>>> http://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=at&paperid=11985&option_lang=eng >>>>> . >>>>> >>> >>>>> >>> Huygens: this benchmark suite is very different from the rest, as >>>>> it uses a ?fractal? 
approach to generate test functions. It is based on the >>>>> work of Cara MacNish on Fractal Functions. The original code is in Java, >>>>> and at the beginning I just converted it to Python: given it was slow as a >>>>> turtle, I have re-implemented it in Fortran and wrapped it using f2py, then >>>>> generating 600 2-dimensional test problems out of it. >>>>> >>> >>>>> >>> LGMVG: not sure about the meaning of the acronym, but the >>>>> implementation follows the ?Max-Set of Gaussians Landscape Generator? >>>>> described in http://boyuan.global-optimization.com/LGMVG/index.htm . >>>>> Source code is given in Matlab, but it?s fairly easy to convert it to >>>>> Python. This test suite contains 304 problems with dimensionality varying >>>>> from 2 to 5. >>>>> >>> >>>>> >>> NgLi: Stemming from the work of Chi-Kong Ng and Duan Li, this is a >>>>> test problem generator for unconstrained optimization, but it?s fairly easy >>>>> to assign bound constraints to it. The methodology is described in >>>>> https://www.sciencedirect.com/science/article/pii/S0305054814001774 , >>>>> while the Matlab source code can be found in >>>>> http://www1.se.cuhk.edu.hk/~ckng/generator/ . I have used the Matlab >>>>> script to generate 240 problems with dimensionality varying from 2 to 5 by >>>>> outputting the generator parameters in text files, then used Python to >>>>> create the objective functions based on those parameters and the benchmark >>>>> methodology. >>>>> >>> >>>>> >>> MPM2: Implementing the ?Multiple Peaks Model 2?, there is a Python >>>>> implementation at >>>>> https://github.com/jakobbossek/smoof/blob/master/inst/mpm2.py . This >>>>> is a test problem generator also used in the smoof library, I have taken >>>>> the code almost as is and generated 480 benchmark functions with >>>>> dimensionality varying from 2 to 5. 
>>>>> >>> >>>>> >>> RandomFields: as described in >>>>> https://www.researchgate.net/publication/301940420_Global_optimization_test_problems_based_on_random_field_composition >>>>> , it generates benchmark functions by ?smoothing? one or more >>>>> multidimensional discrete random fields and composing them. No source code >>>>> is given, but the implementation is fairly straightforward from the article >>>>> itself. >>>>> >>> >>>>> >>> NIST: not exactly the realm of Global Optimization solvers, but >>>>> the NIST StRD dataset can be used to generate a single objective function >>>>> as ?sum of squares?. I have used the NIST dataset as implemented in lmfit, >>>>> thus creating 27 test problems with dimensionality ranging from 2 to 9. >>>>> >>> >>>>> >>> GlobalLib: Arnold Neumaier maintains a suite of test problems >>>>> termed ?COCONUT Benchmark? and Sahinidis has converted the GlobalLib and >>>>> PricentonLib AMPL/GAMS dataset into C/Fortran code ( >>>>> http://archimedes.cheme.cmu.edu/?q=dfocomp ). I have used a simple C >>>>> parser to convert the benchmarks from C to Python. >>>>> >>> >>>>> >>> The global minima are taken from Sahinidis or from Neumaier or >>>>> refined using the NEOS server when the accuracy of the reported minima is >>>>> too low. The suite contains 181 test functions with dimensionality varying >>>>> between 2 and 9. >>>>> >>> >>>>> >>> CVMG: another ?landscape generator?, I had to dig it out using the >>>>> Wayback Machine at >>>>> http://web.archive.org/web/20100612044104/https://www.cs.uwyo.edu/~wspears/multi.kennedy.html >>>>> , the acronym stands for ?Continuous Valued Multimodality Generator?. >>>>> Source code is in C++ but it?s fairly easy to port it to Python. In >>>>> addition to the original implementation (that uses the Sigmoid as a >>>>> softmax/transformation function) I have added a few others to create varied >>>>> landscapes. 360 test problems have been generated, with dimensionality >>>>> ranging from 2 to 5. 
>>>>> >>> >>>>> >>> NLSE: again, not really the realm of Global optimization solvers, >>>>> but Nonlinear Systems of Equations can be transformed to single objective >>>>> functions to optimize. I have drawn from many different sources >>>>> (Publications, ALIAS/COPRIN and many others) to create 44 systems of >>>>> nonlinear equations with dimensionality ranging from 2 to 8. >>>>> >>> >>>>> >>> Schoen: based on the early work of Fabio Schoen and his short note >>>>> on a simple but interesting idea on a test function generator, I have taken >>>>> the C code in the note and converted it into Python, thus creating 285 >>>>> benchmark functions with dimensionality ranging from 2 to 6. >>>>> >>> >>>>> >>> Many thanks go to Professor Fabio Schoen for providing an updated >>>>> copy of the source code and for the email communications. >>>>> >>> >>>>> >>> Robust: the last benchmark test suite for this exercise, it is >>>>> actually composed of 5 different kind-of analytical test function >>>>> generators, containing deceptive, multimodal, flat functions depending on >>>>> the settings. Matlab source code is available at >>>>> http://www.alimirjalili.com/RO.html , I simply converted it to Python >>>>> and created 420 benchmark functions with dimensionality ranging from 2 to 6. >>>>> >>> >>>>> >>> >>>>> >>> Enjoy, and Happy 2021 :-) . >>>>> >>> >>>>> >>> >>>>> >>> Andrea. 
>>>>> >>> >>>>> >>> _______________________________________________ >>>>> >>> >>>>> >>> >>>>> >>> SciPy-Dev mailing list >>>>> >>> SciPy-Dev at python.org >>>>> >>> https://mail.python.org/mailman/listinfo/scipy-dev >>>>> >> >>>>> >> >>>>> >> >>>>> >> -- >>>>> >> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering) >>>>> >> >>>>> >> Wissenchaftlicher Mitarbeiter: Leibniz Institute for Materials >>>>> Engineering IWT, Badgasteiner Stra?e 3, 28359 Bremen, Germany >>>>> >>>>> >> Work phone (DE): +49 (0) 421 218 51238 >>>>> >> Cellphone (DE): +49 (0) 160 949 86417 >>>>> >> Cellphone (ZA): +27 (0) 82 972 42 89 >>>>> >> E-mail (work): s.endres at iwt.uni-bremen.de >>>>> >> Website: https://stefan-endres.github.io/ >>>>> >> _______________________________________________ >>>>> >> SciPy-Dev mailing list >>>>> >> SciPy-Dev at python.org >>>>> >> https://mail.python.org/mailman/listinfo/scipy-dev >>>>> > >>>>> > _______________________________________________ >>>>> > SciPy-Dev mailing list >>>>> > SciPy-Dev at python.org >>>>> > https://mail.python.org/mailman/listinfo/scipy-dev >>>>> _______________________________________________ >>>>> SciPy-Dev mailing list >>>>> SciPy-Dev at python.org >>>>> https://mail.python.org/mailman/listinfo/scipy-dev >>>>> >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at python.org >>>> https://mail.python.org/mailman/listinfo/scipy-dev >>>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at python.org >>> https://mail.python.org/mailman/listinfo/scipy-dev >>> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was 
scrubbed... URL: From ralf.gommers at gmail.com Tue Jul 13 09:29:11 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 13 Jul 2021 15:29:11 +0200 Subject: [SciPy-Dev] better distinguishing public and private APIs in SciPy Message-ID: Hi all, Last week I filed https://github.com/scipy/scipy/issues/14360. The summary of that is: I think it's time to make private APIs really private, and have a better-defined public API. People don't read docs, even maintainers of other libraries like Dask and CuPy. While we have very prominently documented what our public API is, on the front page of the reference guide (http://scipy.github.io/devdocs/reference/index.html#api-definition), this is not enough. Now that more and more other libraries (Dask, CuPy, JAX, Pytorch, RAPIDS, etc.) are starting to reimplement SciPy APIs, this is becoming more of an issue. GitHub has gotten better (I think) at following renames, so `git blame` still works as expected after renaming a file - which was my main concern. My proposed approach is to add a test that ensures we won't add any new namespaces with missing underscores (we can steal this test from NumPy), add underscores to file names that are missing them, and deprecate accessing `scipy.submodule.some_private_namespace` with a deprecation that only expires in SciPy 2.0. There may be a few places where we find that some semi-private API is used quite a lot in the wild. In those cases, I suggest to still do the rename but don't add the deprecation (just use a code comment instead to explain), to avoid too much downstream churn. Please look at https://github.com/scipy/scipy/issues/14360 for more details on this, and comment if you see a potential issue with this. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
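The guard described above (a test that flags any newly added namespace whose file name is missing a leading underscore) can be sketched roughly as follows. This is an illustrative stand-in, not the actual NumPy/SciPy test: the real test walks the installed package, and the allow-list here is a made-up fragment of the documented public API.

```python
# Hypothetical fragment of the documented public API; the real list would
# be derived from the reference guide's API definition.
PUBLIC_ALLOWLIST = {
    "scipy",
    "scipy.optimize",
    "scipy.stats",
}

def find_accidental_public_namespaces(module_names):
    """Return module paths that look public (no component starts with an
    underscore) but are not on the documented allow-list."""
    offenders = []
    for name in module_names:
        private = any(part.startswith("_") for part in name.split("."))
        if not private and name not in PUBLIC_ALLOWLIST:
            offenders.append(name)
    return sorted(offenders)

# 'scipy.optimize.linesearch' has no underscore, so it is flagged as an
# accidental public namespace; 'scipy.optimize._numdiff' is private by
# naming convention and passes.
found = find_accidental_public_namespaces([
    "scipy.optimize",
    "scipy.optimize.linesearch",
    "scipy.optimize._numdiff",
])
```

In a real test suite this check would run over `pkgutil.walk_packages` output and fail CI whenever `found` is non-empty, which is what prevents new un-underscored namespaces from slipping in.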
URL: From serge.guelton at telecom-bretagne.eu Tue Jul 13 17:16:59 2021 From: serge.guelton at telecom-bretagne.eu (Serge Guelton) Date: Tue, 13 Jul 2021 23:16:59 +0200 Subject: [SciPy-Dev] pythran 0.9.12 - heskenn Message-ID: <20210713211659.GA7897@sguelton.remote.csb> Hi folks, I just shipped a new release of Pythran, a compiler for scientific kernels written in Python. It means it is available on PyPI and GitHub, but not yet on conda, fedora, gentoo etc. This one fixes a bunch of issues for SciPy (again) and restores compatibility with Cython, but the biggest change is definitely the new dependency tree. Indeed, Pythran no longer depends on networkx, six nor decorator. The dependency on gast and beniget received a version bump for compatibility with python 3.10 changes. Big thanks to the contributors, every commit counts!

$ git shortlog -s 0.9.11..0.9.12
     1 Ashwin Vishnu
     1 Christian Clauss
     1 Jochen Schröder
     1 Miro Hrončok
     2 Ralf Gommers
    35 serge-sans-paille
     1 ???

A summarized changelog

$ git diff 0.9.11..0.9.12 -- Changelog | cut -c 2-
(...)
2021-07-06 Serge Guelton

* Remove six, networkx and decorator dependency
* Bump gast and Beniget requirements to support python 3.10
* Bump xsimd to 7.5.0
* Minimal default support for non-linux, non-osx, non-windows platform
* Numpy improvements for np.bincount, np.transpose, np.searchsorted
* Restore (and test) cython compatibility
* Expose pythran.get_include for toolchain integration
* Improve error message on invalid spec
* Handle static dispatching based on keyword signature
* Raise Memory Error upon (too) large numpy alloc
* Support scalar case of scipy.special.binom
* Trim the number of warnings in pythonic codebase
(...)
From tirthasheshpatel at gmail.com Fri Jul 16 15:48:24 2021 From: tirthasheshpatel at gmail.com (Tirth Patel) Date: Sat, 17 Jul 2021 01:18:24 +0530 Subject: [SciPy-Dev] Integrating UNU.RAN in scipy.stats Message-ID: Hi all, Christoph, Nicholas, and I have been working on gh-14215 [1] to integrate the UNU.RAN library in SciPy. We'd appreciate your thoughts on it! We have designed an object-oriented API for sampling from any continuous or discrete distribution using universal generators in UNU.RAN. For now, only the `TransformedDensityRejection` (for continuous distributions) and `DiscreteAliasUrn` (for discrete distributions) methods have been added. These methods take a distribution object with required methods like PDF, dPDF, CDF, etc. as input and set up a generator which can then be used to sample from that distribution:

>>> from scipy.stats import TransformedDensityRejection
>>> import numpy as np
>>>
>>> class StdNorm:
...     def pdf(self, x: float) -> float:
...         # notice that the normalization constant is not required
...         # and the pdf accepts and returns scalars.
...         return np.exp(-0.5 * x*x)
...     def dpdf(self, x: float) -> float:
...         return -x * self.pdf(x)
...
>>> dist = StdNorm()
>>> rng = TransformedDensityRejection(dist, seed=123)
>>> rng.rvs()
0.474548717355228

One of the tricky parts about this PR is handling errors occurring in Python callbacks and the UNU.RAN C library. We could use some reviews and ideas to build a maintainable infrastructure. The use of non-local returns causes memory leaks, making the code for error handling a lot less trivial and possibly much more complex. I would really appreciate it if someone could take a look at the Cython and C code in the PR and help verify the approach or suggest an alternative approach, if any. There are some discussions about this in [2] and [3], and you can also look at the review comments on the main PR. Although the API design is not final yet, please feel free to comment on it as well. Thanks!
[1]: https://github.com/scipy/scipy/pull/14215
[2]: https://github.com/tirthasheshpatel/scipy/pull/9
[3]: https://github.com/tirthasheshpatel/unuran/pull/1

--
Kind Regards,
Tirth Patel

-------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Jul 18 16:13:08 2021 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 18 Jul 2021 14:13:08 -0600 Subject: [SciPy-Dev] (no subject) Message-ID: Hi All, On behalf of the NumPy team I am pleased to announce the release of NumPy 1.21.1. NumPy 1.21.1 is a maintenance release that fixes bugs discovered after the 1.21.0 release. OpenBLAS has also been updated to v0.3.17 to deal with arm64 problems. The Python versions supported for this release are 3.7-3.9. The 1.21.x series is compatible with development Python 3.10, and Python 3.10 will be officially supported after it is released. Wheels can be downloaded from PyPI; source archives, release notes, and wheel hashes are available on GitHub. Linux users will need pip >= 0.19.3 in order to install manylinux2010 and manylinux2014 wheels.

*Contributors*

A total of 11 people contributed to this release. People with a "+" by their names contributed a patch for the first time.

- Bas van Beek
- Charles Harris
- Ganesh Kathiresan
- Gregory R. Lee
- Hugo Defois +
- Kevin Sheppard
- Matti Picus
- Ralf Gommers
- Sayed Adel
- Sebastian Berg
- Thomas J. Fan

*Pull requests merged*

A total of 26 pull requests were merged for this release.

- #19311: REV,BUG: Replace `NotImplemented` with `typing.Any`
- #19324: MAINT: Fixed the return-dtype of `ndarray.real` and `imag`
- #19330: MAINT: Replace `"dtype[Any]"` with `dtype` in the definiton of...
- #19342: DOC: Fix some docstrings that crash pdf generation.
- #19343: MAINT: bump scipy-mathjax
- #19347: BUG: Fix arr.flat.index for large arrays and big-endian machines
- #19348: ENH: add `numpy.f2py.get_include` function
- #19349: BUG: Fix reference count leak in ufunc dtype handling
- #19350: MAINT: Annotate missing attributes of `np.number` subclasses
- #19351: BUG: Fix cast safety and comparisons for zero sized voids
- #19352: BUG: Correct Cython declaration in random
- #19353: BUG: protect against accessing base attribute of a NULL subarray
- #19365: BUG, SIMD: Fix detecting AVX512 features on Darwin
- #19366: MAINT: remove `print()`'s in distutils template handling
- #19390: ENH: SIMD architectures to show_config
- #19391: BUG: Do not raise deprecation warning for all nans in unique...
- #19392: BUG: Fix NULL special case in object-to-any cast code
- #19430: MAINT: Use arm64-graviton2 for testing on travis
- #19495: BUILD: update OpenBLAS to v0.3.17
- #19496: MAINT: Avoid unicode characters in division SIMD code comments
- #19499: BUG, SIMD: Fix infinite loop during count non-zero on GCC-11
- #19500: BUG: fix a numpy.npiter leak in npyiter_multi_index_set
- #19501: TST: Fix a `GenericAlias` test failure for python 3.9.0
- #19502: MAINT: Start testing with Python 3.10.0b3.
- #19503: MAINT: Add missing dtype overloads for object- and ctypes-based...
- #19510: REL: Prepare for NumPy 1.21.1 release.

Cheers,

Charles Harris

-------------- next part -------------- An HTML attachment was scrubbed... URL: From nicholas.bgp at gmail.com Sun Jul 18 18:41:25 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Sun, 18 Jul 2021 15:41:25 -0700 Subject: [SciPy-Dev] PROPACK Sparse SVD Integration Message-ID: Hi all, Matt Haberland and I are happy to open PR gh-14433 to close the long-standing issue gh-857 through PROPACK integration into scipy.sparse.linalg.svds, based on the previous work done by @jakevdp in pypropack.
This PR includes: - PROPACK wrappers exposing a new 'propack' solver backend for scipy.sparse.linalg.svds - an overhaul and expansion of the scipy.sparse.linalg.svds test suite, documentation, and benchmarks - inclusion of a new PROPACK submodule under the SciPy organization which includes patches for known bugs, Fortran compiler support, and OpenMP issues Thanks, Nicholas -------------- next part -------------- An HTML attachment was scrubbed... URL: From roy.pamphile at gmail.com Mon Jul 19 03:10:54 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Mon, 19 Jul 2021 09:10:54 +0200 Subject: [SciPy-Dev] PROPACK Sparse SVD Integration In-Reply-To: References: Message-ID: <2844EAFD-5C1C-4A16-9C41-C9DB6D1EAD03@gmail.com> This is great! Thanks Nicholas and Matt for this great job! Quick question about these modules. Recently we have been adding quite a few. Do we have a plan to keep them aligned (if need be) with their upstream version? Cheers, Pamphile > On 19.07.2021, at 00:41, Nicholas McKibben wrote: > > Hi all, > > Matt Haberland and I are happy to open PR gh-14433 to close the long-standing issue gh-857 through PROPACK integration into scipy.sparse.linalg.svds based on the previous work done by @jakevdp in pypropack . > > This PR includes: > - PROPACK wrappers exposing a new 'propack' solver backend for scipy.sparse.linalg.svds > - an overhaul and expansion of the scipy.sparse.linalg.svds test suite, documentation, and benchmarks > - inclusion of a new PROPACK submodule under the SciPy organization which includes patches for known bugs, Fortran compiler support, and OpenMP issues > > Thanks, > Nicholas > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nicholas.bgp at gmail.com Mon Jul 19 10:02:48 2021 From: nicholas.bgp at gmail.com (Nicholas McKibben) Date: Mon, 19 Jul 2021 07:02:48 -0700 Subject: [SciPy-Dev] PROPACK Sparse SVD Integration In-Reply-To: <2844EAFD-5C1C-4A16-9C41-C9DB6D1EAD03@gmail.com> References: <2844EAFD-5C1C-4A16-9C41-C9DB6D1EAD03@gmail.com> Message-ID: Hi Pamphile, PROPACK is an old library that hasn't received an update in 16 years, so I don't anticipate any issues with staying up to date with any upstream. The only changes will be our patches to fix known/discovered bugs. Thanks, Nicholas On Mon, Jul 19, 2021 at 12:11 AM Pamphile Roy wrote: > This is great! Thanks Nicholas and Matt for this great job! > > Quick question about these modules. Recently we have been adding quite a > few. Do we have a plan to keep them aligned (if need be) with their > upstream version? > > Cheers, > Pamphile > > On 19.07.2021, at 00:41, Nicholas McKibben wrote: > > Hi all, > > Matt Haberland and I are happy to open PR gh-14433 > to close the long-standing > issue gh-857 through PROPACK > integration into scipy.sparse.linalg.svds based on the previous work done > by @jakevdp in pypropack . > > This PR includes: > - PROPACK wrappers exposing a new 'propack' solver backend for > scipy.sparse.linalg.svds > - an overhaul and expansion of the scipy.sparse.linalg.svds test suite, > documentation, and benchmarks > - inclusion of a new PROPACK submodule under the SciPy organization which > includes patches for known bugs, Fortran compiler support, and OpenMP issues > > Thanks, > Nicholas > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From roy.pamphile at gmail.com Mon Jul 19 12:27:12 2021 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Mon, 19 Jul 2021 18:27:12 +0200 Subject: [SciPy-Dev] PROPACK Sparse SVD Integration In-Reply-To: References: <2844EAFD-5C1C-4A16-9C41-C9DB6D1EAD03@gmail.com> Message-ID: Haha ok I see... well then it should be fine as you said. Cheers, Pamphile > On 19.07.2021, at 16:02, Nicholas McKibben wrote: > > Hi Pamphile, > > PROPACK is an old library that hasn't received an update in 16 years, so I don't anticipate any issues with staying up to date with any upstream. The only changes will be our patches to fix known/discovered bugs. > > Thanks, > Nicholas > > On Mon, Jul 19, 2021 at 12:11 AM Pamphile Roy > wrote: > This is great! Thanks Nicholas and Matt for this great job! > > Quick question about these modules. Recently we have been adding quite a few. Do we have a plan to keep them aligned (if need be) with their upstream version? > > Cheers, > Pamphile > >> On 19.07.2021, at 00:41, Nicholas McKibben > wrote: >> >> Hi all, >> >> Matt Haberland and I are happy to open PR gh-14433 to close the long-standing issue gh-857 through PROPACK integration into scipy.sparse.linalg.svds based on the previous work done by @jakevdp in pypropack .
>> >> This PR includes: >> - PROPACK wrappers exposing a new 'propack' solver backend for scipy.sparse.linalg.svds >> - an overhaul and expansion of the scipy.sparse.linalg.svds test suite, documentation, and benchmarks >> - inclusion of a new PROPACK submodule under the SciPy organization which includes patches for known bugs, Fortran compiler support, and OpenMP issues >> >> Thanks, >> Nicholas >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From sarafridov at gmail.com Tue Jul 27 13:23:50 2021 From: sarafridov at gmail.com (Sara Fridovich-Keil) Date: Tue, 27 Jul 2021 10:23:50 -0700 Subject: [SciPy-Dev] directional derivatives in Wolfe line search Message-ID: <1DE64E20-BBDA-4094-B477-9BAEFD739228@gmail.com> Hi SciPy-Dev, I have been using scipy.optimize.fmin_bfgs on some derivative-free benchmark problems, and noticed that whenever the Wolfe line search requires a directional derivative, the current implementation estimates the entire gradient via finite differencing, and then computes the directional derivative by taking the inner product of the gradient and the search direction. In my experiments, replacing this full gradient estimation with a single extra function evaluation to estimate the directional derivative directly, is faster. 
The key places this happens in the code are:

https://github.com/scipy/scipy/blob/701ffcc8a6f04509d115aac5e5681c538b5265a2/scipy/optimize/linesearch.py#L86
https://github.com/scipy/scipy/blob/701ffcc8a6f04509d115aac5e5681c538b5265a2/scipy/optimize/linesearch.py#L301

Both of the Wolfe line search implementations take an input function fprime that computes/estimates the full gradient, even though only derphi, the directional derivative, is used. What I'd like to do is have an option for fprime to be either provided or not provided to the Wolfe line search. If the objective has a nice/cheap gradient then the current behavior is fine (passing the gradient function as fprime, and computing directional derivatives with an inner product), but if the objective is derivative-free then derphi should be computed with finite differencing along the search direction (just one extra function evaluation) instead of using fprime. Do people agree this would be a good change? If so I can make a pull request. Best, Sara -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Tue Jul 27 22:24:08 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Wed, 28 Jul 2021 12:24:08 +1000 Subject: [SciPy-Dev] directional derivatives in Wolfe line search In-Reply-To: <1DE64E20-BBDA-4094-B477-9BAEFD739228@gmail.com> References: <1DE64E20-BBDA-4094-B477-9BAEFD739228@gmail.com> Message-ID: On Wed, 28 Jul 2021 at 03:24, Sara Fridovich-Keil wrote: > I have been using scipy.optimize.fmin_bfgs on some derivative-free > benchmark problems, and noticed that whenever the Wolfe line search > requires a directional derivative, the current implementation estimates the > entire gradient via finite differencing, and then computes the directional > derivative by taking the inner product of the gradient and the search > direction.
In my experiments, replacing this full gradient estimation with > a single extra function evaluation to estimate the directional derivative > directly, is faster. > > What I?d like to do is have an option for fprime to be either provided or > not provided to the Wolfe line search. If the objective has a nice/cheap > gradient then the current behavior is fine (passing the gradient function > as fprime, and computing directional derivatives with an inner product), > but if the objective is derivative-free then derphi should be computed with > finite differencing along the search direction (just one extra function > evaluation) instead of using fprime. > Estimating gradients with finite differences relies on a good choice of step size. Good step defaults are automatically chosen by `optimize._numdiff.approx_derivative` when fprime is estimated numerically. Estimating derphi with numerical differentiation along the search direction would first of all require a good step size along the search direction, `pk`. Whilst this may be ok if the parameter scaling, pk, and the derivatives are well chosen/well behaved (e.g. all the parameters are the same magnitude, etc), I'm concerned that there will be cases where it won't be as numerically accurate/stable as the existing behaviour. For example, a chosen step along pk may result in individual dx that aren't optimal from a numerical differentiation viewpoint. How would one know if a specific system was exhibiting that behaviour? I'd expect the current code to be more robust than your proposed alternative. Having said that, I'm not an expert in this domain, so I'd be interested to hear what someone who is more expert than me has to say. Can you point to any literature that says that your proposed changes are generally acceptable? Andrew. -------------- next part -------------- An HTML attachment was scrubbed... 
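[Editor's note] Andrew's scaling concern can be illustrated: a single step h along pk perturbs parameter i by h * pk[i], so when the components of pk have very different magnitudes, some per-parameter perturbations end up far from a good forward-difference step (made-up numbers below):

```python
import numpy as np

h = 1.4901161193847656e-08      # a typical absolute forward-difference step
pk = np.array([1.0, 1.0e6])     # badly scaled search direction
dx = h * pk                     # per-parameter perturbation implied by one step along pk
# dx[0] ~ 1.5e-8 is a sensible step for a parameter of order 1, but
# dx[1] ~ 1.5e-2 is enormous, so the directional estimate can lose accuracy
# in situations where per-parameter step selection (as in approx_derivative)
# would not.
```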
URL: From sarafridov at gmail.com Wed Jul 28 01:34:21 2021 From: sarafridov at gmail.com (Sara Fridovich-Keil) Date: Tue, 27 Jul 2021 22:34:21 -0700 Subject: [SciPy-Dev] directional derivatives in Wolfe line search In-Reply-To: References: <1DE64E20-BBDA-4094-B477-9BAEFD739228@gmail.com> Message-ID: <9A16DF8C-1823-4857-8A8B-002B0CB057D9@gmail.com> Hi Andrew (and others), Thanks for the input. I can describe the experiment I've done so far that makes me believe this simpler finite differencing would be an improvement for the DFO case, but I'm not sure if there is a standard benchmark that is used for making these decisions in scipy. My experiment is on the DFO benchmark of Moré and Wild (https://www.mcs.anl.gov/~more/dfo/ ), comparing 3 algorithms. The green curve is just calling the current implementation of fmin_bfgs. The orange curve is my own implementation of fmin_bfgs based on scipy but using a simple forward finite differencing approximation to the full gradient. The blue curve is the same as the orange one, but replacing the forward finite differencing approximation of the full gradient with a forward finite differencing approximation of just the directional derivative (inside the Wolfe line search). The sampling distance for finite differencing is 1.4901161193847656e-08 (for both the orange and blue curves), which is the same default as scipy (https://github.com/scipy/scipy/blob/v1.7.0/scipy/optimize/optimize.py#L1448 , and the value recommended in Nocedal and Wright). The blue curve is the algorithm I'm proposing; I just included the orange one to compare the simple vs sophisticated full-gradient finite differencing methods (and any differences in the outer BFGS, which should be basically the same).
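[Editor's note] That sampling distance is exactly the square root of double-precision machine epsilon, the textbook forward-difference default:

```python
import numpy as np

# sqrt(eps): shrinking the step reduces truncation error but amplifies
# floating-point cancellation; sqrt(machine eps) is the standard compromise.
h = np.sqrt(np.finfo(np.float64).eps)
print(h)  # 1.4901161193847656e-08
```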
The plot is a performance profile, where the y axis is the fraction of benchmark problems solved to a desired precision and the x axis is the performance ratio, which is the ratio of the number of function evaluations used by each algorithm compared to the number of function evaluations used by the most efficient algorithm on each problem. For instance, in this experiment the current implementation is fastest on about 42% of the benchmark problems, and the proposed version is fastest on about 58% of the benchmark problems. The proposed implementation never uses more than 3x the fewest number of function evaluations to converge, whereas on about 9% of the benchmark problems the current implementation fails to converge within 10x the number of function evaluations. As far as literature/theory, I'm not aware of this exact comparison (which is part of why I set up this experiment). My instinct is that the line search subroutine only really cares about the directional derivative in this specific search direction, so I would expect that estimating that directly might even have some benefits compared to taking samples in standard basis directions and then taking an inner product. Certainly there is some sensitivity to the sampling radius, but this is true even in the case of estimating the full gradient (I'm not sure I would expect it to be worse for just estimating the directional derivative). If anyone does have specific references/experience, please chime in! Is there a standard benchmark test suite that scipy uses to make these kinds of algorithm implementation decisions? (It would also make sense to me to keep both options around, have a default, and let users deviate from it if they want.)
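[Editor's note] For readers unfamiliar with Moré-Wild performance profiles, this is roughly how such a curve is computed (a sketch on synthetic numbers, not the actual benchmark data discussed above):

```python
import numpy as np

def performance_profile(nfev, taus):
    # nfev[i, s]: function evaluations solver s needed on problem i
    # (np.inf where it failed to converge).
    best = nfev.min(axis=1, keepdims=True)   # best solver on each problem
    ratios = nfev / best                     # performance ratios, >= 1
    # rho[t, s]: fraction of problems where solver s is within taus[t] of the best
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# two solvers on three problems; solver B fails on problem 1
nfev = np.array([[10.0, 12.0],
                 [30.0, np.inf],
                 [50.0, 25.0]])
rho = performance_profile(nfev, taus=[1.0, 2.0, 10.0])
# rho[0] is each solver's "fastest on this fraction of problems" value at tau = 1
```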
Best, Sara > On Jul 27, 2021, at 7:24 PM, Andrew Nelson wrote: > > > > On Wed, 28 Jul 2021 at 03:24, Sara Fridovich-Keil > wrote: > I have been using scipy.optimize.fmin_bfgs on some derivative-free benchmark problems, and noticed that whenever the Wolfe line search requires a directional derivative, the current implementation estimates the entire gradient via finite differencing, and then computes the directional derivative by taking the inner product of the gradient and the search direction. In my experiments, replacing this full gradient estimation with a single extra function evaluation to estimate the directional derivative directly, is faster. > > What I'd like to do is have an option for fprime to be either provided or not provided to the Wolfe line search. If the objective has a nice/cheap gradient then the current behavior is fine (passing the gradient function as fprime, and computing directional derivatives with an inner product), but if the objective is derivative-free then derphi should be computed with finite differencing along the search direction (just one extra function evaluation) instead of using fprime. > > Estimating gradients with finite differences relies on a good choice of step size. Good step defaults are automatically chosen by `optimize._numdiff.approx_derivative` when fprime is estimated numerically. > Estimating derphi with numerical differentiation along the search direction would first of all require a good step size along the search direction, `pk`. Whilst this may be ok if the parameter scaling, pk, and the derivatives are well chosen/well behaved (e.g. all the parameters are the same magnitude, etc), I'm concerned that there will be cases where it won't be as numerically accurate/stable as the existing behaviour. For example, a chosen step along pk may result in individual dx that aren't optimal from a numerical differentiation viewpoint. How would one know if a specific system was exhibiting that behaviour?
> I'd expect the current code to be more robust than your proposed alternative. Having said that, I'm not an expert in this domain, so I'd be interested to hear what someone who is more expert than me has to say. Can you point to any literature that says that your proposed changes are generally acceptable? > > Andrew. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dfosmooth.pdf Type: application/pdf Size: 14866 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Wed Jul 28 03:38:29 2021 From: andyfaff at gmail.com (Andrew Nelson) Date: Wed, 28 Jul 2021 17:38:29 +1000 Subject: [SciPy-Dev] directional derivatives in Wolfe line search In-Reply-To: <9A16DF8C-1823-4857-8A8B-002B0CB057D9@gmail.com> References: <1DE64E20-BBDA-4094-B477-9BAEFD739228@gmail.com> <9A16DF8C-1823-4857-8A8B-002B0CB057D9@gmail.com> Message-ID: On Wed, 28 Jul 2021 at 15:35, Sara Fridovich-Keil wrote: > Thanks for the input. I can describe the experiment I've done so far that > makes me believe this simpler finite differencing would be an improvement > for the DFO case, but I'm not sure if there is a standard benchmark that is > used for making these decisions in scipy. > The first hurdle is to ensure that correctness isn't affected. The second hurdle is to show that changes improve things. I'm not sure that we have a comprehensive test suite for these kinds of comparisons for scipy. There are functions in benchmarks/test_functions.py, but we don't have something more comprehensive like CUTEST. It would be good to have that! > My experiment is on the DFO benchmark of Moré and Wild ( > https://www.mcs.anl.gov/~more/dfo/), comparing 3 algorithms.
The green > curve is just calling the current implementation of fmin_bfgs. The orange > curve is my own implementation of fmin_bfgs based on scipy but using a > simple forward finite differencing approximation to the full gradient. > If I understand correctly according to that graph your implementation in orange is better than the existing scipy implementation? The blue curve is the same as the orange one, but replacing the forward > finite differencing approximation of the full gradient with a forward > finite differencing approximation of just the directional derivative > (inside the Wolfe line search). The sampling distance for finite > differencing is 1.4901161193847656e-08 (for both the orange and blue > curves), which is the same default as scipy ( > https://github.com/scipy/scipy/blob/v1.7.0/scipy/optimize/optimize.py#L1448, > and the value recommended in Nocedal and Wright). The blue curve is the > algorithm I'm proposing; I just included the orange one to compare the > simple vs sophisticated full-gradient finite differencing methods (and any > differences in the outer BFGS, which should be basically the same). > Note that the finite differencing for most of the optimizers is done by approx_derivative ( https://github.com/scipy/scipy/blob/master/scipy/optimize/_numdiff.py#L257), which chooses the step size automatically. If `jac=None` for `_minimize_bfgs` then the default is for absolute steps, but you can use relative steps by specifying `jac='2-point'`, etc. I'm guessing that the modification would be: 1. If user provides a callable(jac), then use that in derphi. 2. If jac is estimated via a FD method then estimate derphi by finite difference along the search direction, as you suggest. Estimate grad by FD of `fun`. Both grad and derphi are calculated in line_search_wolfe1, line_search_wolfe2. -------------- next part -------------- An HTML attachment was scrubbed...
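[Editor's note] Andrew's two cases could be dispatched roughly like this (a sketch only; the helper name is invented, and the real change would live inside line_search_wolfe1/line_search_wolfe2):

```python
import numpy as np

def make_derphi(fun, jac, xk, pk, eps=1.4901161193847656e-08):
    # phi(alpha) = fun(xk + alpha * pk); derphi(alpha) is its derivative.
    if callable(jac):
        # Case 1: analytic gradient supplied -- project it onto pk.
        def derphi(alpha):
            return np.dot(jac(xk + alpha * pk), pk)
    else:
        # Case 2: derivative-free -- one forward difference along pk.
        def derphi(alpha):
            x = xk + alpha * pk
            return (fun(x + eps * pk) - fun(x)) / eps
    return derphi

# Here phi(alpha) = (1 - alpha)^2, so phi'(0) = -2 for both variants.
fun = lambda x: x @ x
xk, pk = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
derphi_exact = make_derphi(fun, lambda x: 2 * x, xk, pk)
derphi_fd = make_derphi(fun, None, xk, pk)
```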
URL: From sarafridov at gmail.com Wed Jul 28 12:50:20 2021 From: sarafridov at gmail.com (Sara Fridovich-Keil) Date: Wed, 28 Jul 2021 09:50:20 -0700 Subject: [SciPy-Dev] directional derivatives in Wolfe line search In-Reply-To: References: <1DE64E20-BBDA-4094-B477-9BAEFD739228@gmail.com> <9A16DF8C-1823-4857-8A8B-002B0CB057D9@gmail.com> Message-ID: <75EFAE77-F073-4061-9AFC-074428F1BFEC@gmail.com> > The first hurdle is to ensure that correctness isn't affected. The second hurdle is to show that changes improve things. I'm not sure that we have a comprehensive test suite for these kinds of comparisons for scipy. There are functions in benchmarks/test_functions.py, but we don't have something more comprehensive like CUTEST. It would be good to have that! I ended up translating the Moré and Wild CUTEST benchmark into python (from matlab) for my own experiment anyway, so I'd be happy to include that as a benchmark for scipy. > If I understand correctly according to that graph your implementation in orange is better than the existing scipy implementation? It's a bit nuanced; it looks like the current scipy implementation is faster than my orange implementation for some problems, but there are other problems where my orange implementation converges and the current scipy gets stuck. I suspect the former is probably because of the fancier finite differencing in scipy, and the latter might be because I added a little check in the BFGS update (since the benchmark problems are nonconvex): if BFGS ever tries to take an ascent direction, I replace it with steepest descent for that step, and reset the BFGS matrix to identity. > I'm guessing that the modification would be: > > 1. If user provides a callable(jac), then use that in derphi. > 2. If jac is estimated via a FD method then estimate derphi by finite difference along the search direction, as you suggest. Estimate grad by FD of `fun`. Both grad and derphi are calculated in line_search_wolfe1, line_search_wolfe2.
Yes, I think this would be good. Best, Sara > On Jul 28, 2021, at 12:38 AM, Andrew Nelson wrote: > > On Wed, 28 Jul 2021 at 15:35, Sara Fridovich-Keil > wrote: > Thanks for the input. I can describe the experiment I've done so far that makes me believe this simpler finite differencing would be an improvement for the DFO case, but I'm not sure if there is a standard benchmark that is used for making these decisions in scipy. > > The first hurdle is to ensure that correctness isn't affected. The second hurdle is to show that changes improve things. I'm not sure that we have a comprehensive test suite for these kinds of comparisons for scipy. There are functions in benchmarks/test_functions.py, but we don't have something more comprehensive like CUTEST. It would be good to have that! > > My experiment is on the DFO benchmark of Moré and Wild (https://www.mcs.anl.gov/~more/dfo/ ), comparing 3 algorithms. The green curve is just calling the current implementation of fmin_bfgs. The orange curve is my own implementation of fmin_bfgs based on scipy but using a simple forward finite differencing approximation to the full gradient. > > If I understand correctly according to that graph your implementation in orange is better than the existing scipy implementation? > > The blue curve is the same as the orange one, but replacing the forward finite differencing approximation of the full gradient with a forward finite differencing approximation of just the directional derivative (inside the Wolfe line search). The sampling distance for finite differencing is 1.4901161193847656e-08 (for both the orange and blue curves), which is the same default as scipy (https://github.com/scipy/scipy/blob/v1.7.0/scipy/optimize/optimize.py#L1448 , and the value recommended in Nocedal and Wright).
The blue curve is the algorithm I'm proposing; I just included the orange one to compare the simple vs sophisticated full-gradient finite differencing methods (and any differences in the outer BFGS, which should be basically the same). > > Note that the finite differencing for most of the optimizers is done by approx_derivative (https://github.com/scipy/scipy/blob/master/scipy/optimize/_numdiff.py#L257 ), which chooses the step size automatically. If `jac=None` for `_minimize_bfgs` then the default is for absolute steps, but you can use relative steps by specifying `jac='2-point'`, etc. > > I'm guessing that the modification would be: > > 1. If user provides a callable(jac), then use that in derphi. > 2. If jac is estimated via a FD method then estimate derphi by finite difference along the search direction, as you suggest. Estimate grad by FD of `fun`. Both grad and derphi are calculated in line_search_wolfe1, line_search_wolfe2. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From yacine936 at gmail.com Wed Jul 28 21:49:13 2021 From: yacine936 at gmail.com (Yacine Thabet) Date: Wed, 28 Jul 2021 21:49:13 -0400 Subject: [SciPy-Dev] Tests failing #14371 Message-ID: Hi guys, I have picked up some small issues to get into the scipy code base and get used to it. I have worked on the PR https://github.com/scipy/scipy/pull/14371, but I have an issue: all tests are passing in my local environment but are failing when I push to GitHub. I tried to pull/merge and rebase but nothing works and I created a mess on my PR. Could you help me please, because I don't understand the problem here. Cheers. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf.gommers at gmail.com Thu Jul 29 04:27:01 2021 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 29 Jul 2021 10:27:01 +0200 Subject: [SciPy-Dev] Tests failing #14371 In-Reply-To: References: Message-ID: On Thu, Jul 29, 2021 at 3:49 AM Yacine Thabet wrote: > Hi guys, > > I have picked up some small issues to get into the scipy code base and get > used to it. > > I have worked on the PR https://github.com/scipy/scipy/pull/14371, but I > have an issue: all tests are passing in my local environment but are failing > when I push to GitHub. > > I tried to pull/merge and rebase but nothing works and I created a mess > on my PR. > > Could you help me please, because I don't understand the problem here. > No worries. I replied in the PR. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoppert at baileywick.plus.com Sat Jul 31 09:18:39 2021 From: shoppert at baileywick.plus.com (Todd Bailey) Date: Sat, 31 Jul 2021 14:18:39 +0100 Subject: [SciPy-Dev] Refactorize levy_stable _fitstart Message-ID: <7714F9F7-26BD-4557-A008-AE6BE1A1CE59@baileywick.plus.com> Hi all. I have analyses that need parts of the calculations in stats.levy_stable._fitstart(). I am willing to put them into a formal PR if people think it would be useful to refactorize _fitstart() to re-use these calculations. Current processing: _fitstart() takes an array of data samples, and returns parameter estimates for a stable distribution fit to those samples. Stage 1 calculates quintiles from the samples (percentiles 5, 25, 50, 75, 95). Stage 2 estimates shape parameters from the quintiles. Stage 3 estimates location and scale parameters from the quintiles and the estimated shape parameters. Use cases: I have some data that is already aggregated to percentiles. For this use case, I need Stage 2 and Stage 3. I also have some related data that is aggregated to quintiles (percentiles 10, 25, 50, 75, 90).
These quintiles do not go far enough into the tails to inform an estimate of alpha. However, if I assume that the shape parameters are the same throughout, then I need a slightly-adapted Stage 3 to estimate the location and scale for each quintile distribution (using fixed alpha and beta that have been determined from other data). Refactorization Would it be useful to make some form of Stages 2 and 3 available separately within scipy.stats.levy_stable? If so, should it be via public API? Thanks in advance for your thoughts and any advice on how best to proceed. Todd
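[Editor's note] The decomposition Todd describes might look roughly like this (names invented for illustration; the actual Stage 2/3 quantile-based estimators live inside `_fitstart` and are not reproduced here — the dummy stage functions in the test only demonstrate the plumbing):

```python
import numpy as np

def stage1_quantiles(samples):
    # Stage 1: sample quantiles at percentiles 5, 25, 50, 75, 95
    return np.percentile(samples, [5, 25, 50, 75, 95])

def fitstart_split(samples, shape_from_quantiles, loc_scale_from_quantiles):
    # The existing pipeline expressed as three stages, so that callers who
    # already have percentile-aggregated data (or fixed alpha/beta estimated
    # from other data) could enter at Stage 2 or Stage 3 directly.
    q = stage1_quantiles(samples)                          # Stage 1
    alpha, beta = shape_from_quantiles(q)                  # Stage 2
    loc, scale = loc_scale_from_quantiles(q, alpha, beta)  # Stage 3
    return alpha, beta, loc, scale
```

Exposing the stage functions (publicly or privately) would let the percentile-aggregated use case call Stage 2/3 without raw samples.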