From shahab.sanjari at gmail.com  Wed Jul  1 08:19:35 2020
From: shahab.sanjari at gmail.com (Shahab Sanjari)
Date: Wed, 1 Jul 2020 14:19:35 +0200
Subject: [SciPy-Dev] multitaper
Message-ID: <99d4b040-d88c-578f-9a74-dd1fb5adf638@gmail.com>

Hi everyone.

There is the Slepian function in scipy.signal.windows.dpss, which can be
used to do multitaper analysis, but the actual multiplication routine is
missing. It is like the "pmtm" function in MATLAB, also available in
"spectrum.mtm" from PyPI.

Why not include it in SciPy as well? It is just 5 lines! Here is the code:

https://github.com/xaratustrah/multitaper/blob/f3b9340606e4bc4357f382c00127e9c0706af57a/multitaper/multitaper.py#L8

I use this code regularly and it is pretty good.

cheers,

Shahab

From rlucas7 at vt.edu  Fri Jul  3 14:52:55 2020
From: rlucas7 at vt.edu (rlucas7 at vt.edu)
Date: Fri, 3 Jul 2020 14:52:55 -0400
Subject: [SciPy-Dev] F strings in codebase?
Message-ID:

Hi SciPy-dev,

On reviewing this PR:

https://github.com/scipy/scipy/pull/12376

cool-RR was asking about whether f-strings are OK in SciPy now. I wasn't
sure and couldn't find any, nor clear documentation on a preferred format.

I know different (parts of) different sub-packages may use '%' for
interpolation and others use .format().

I poked around trying to find examples in the codebase, but searching
GitHub for f' didn't give anything, nor did I find anything in the hacking
file or related links.

Do we have a preferred string interpolation method for SciPy?

Note the discussion of the string interpolation is about a potential
follow-up PR that cool-RR may do and not the linked PR itself.

Sincerely,

-Lucas Roberts
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From warren.weckesser at gmail.com  Fri Jul  3 15:40:26 2020
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Fri, 3 Jul 2020 15:40:26 -0400
Subject: [SciPy-Dev] F strings in codebase?
In-Reply-To:
References:
Message-ID:

On 7/3/20, rlucas7 at vt.edu <rlucas7 at vt.edu> wrote:
> Hi SciPy-dev,
>
> On reviewing this PR:
>
> https://github.com/scipy/scipy/pull/12376
>
> cool-RR was asking about whether f-strings are OK in SciPy now. I wasn't
> sure and couldn't find any, nor clear documentation on a preferred format.


Yes, f-strings are acceptable now.

But I don't think we want to go back and convert all uses of .format()
and '%' to f-strings. The benefit is probably not worth the code
churn. So for now, it is probably best to use f-strings in new code,
and when updating old code for some other reason. E.g. if some
additional information is added to the message of an exception, and the
old code uses '%', it would be fine to change it to use an f-string.

Other devs may have different opinions about that, so let's see what
others say before that is considered official policy.

Warren

>
> I know different (parts of) different sub-packages may use '%' for
> interpolation and others use .format().
>
> I poked around trying to find examples in the codebase, but searching
> GitHub for f' didn't give anything, nor did I find anything in the
> hacking file or related links.
>
> Do we have a preferred string interpolation method for SciPy?
>
> Note the discussion of the string interpolation is about a potential
> follow-up PR that cool-RR may do and not the linked PR itself.
>
> Sincerely,
>
> -Lucas Roberts
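To make Warren's suggestion concrete: a hypothetical '%'-style error message,
converted to an f-string only because the line is already being edited to add
information (an illustrative sketch, not code from the SciPy repository):

```python
import numpy as np

def check_ndim(arr, max_ndim=2):
    # Old style, about to be touched anyway:
    #     raise ValueError("too many dimensions: %d" % arr.ndim)
    # Updated while adding detail, so switching to an f-string is fine:
    if arr.ndim > max_ndim:
        raise ValueError(
            f"too many dimensions: {arr.ndim}, expected <= {max_ndim}")

check_ndim(np.zeros((2, 3)))   # passes silently
```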
From terry.y.davis+scipy at gmail.com  Fri Jul  3 16:09:42 2020
From: terry.y.davis+scipy at gmail.com (Terry)
Date: Fri, 3 Jul 2020 13:09:42 -0700
Subject: [SciPy-Dev] F strings in codebase?
In-Reply-To:
References:
Message-ID:

We could start using git's blame.ignoreRevsFile, which would hide the code
churn*.

*only on the client side until GitHub supports this feature. If you want to
push GitHub to add this feature, reference this GitLab issue:
https://gitlab.com/gitlab-org/gitlab/-/issues/31423

see: https://github.com/ipython/ipython/pull/12091/files

-Terry

On Fri, Jul 3, 2020 at 12:40 PM Warren Weckesser <warren.weckesser at gmail.com> wrote:

> On 7/3/20, rlucas7 at vt.edu <rlucas7 at vt.edu> wrote:
> > Hi SciPy-dev,
> >
> > On reviewing this PR:
> >
> > https://github.com/scipy/scipy/pull/12376
> >
> > cool-RR was asking about whether f-strings are OK in SciPy now. I wasn't
> > sure and couldn't find any, nor clear documentation on a preferred
> format.
>
>
> Yes, f-strings are acceptable now.
>
> But I don't think we want to go back and convert all uses of .format()
> and '%' to f-strings. The benefit is probably not worth the code
> churn. So for now, it is probably best to use f-strings in new code,
> and when updating old code for some other reason. E.g. if some
> additional information is added to the message of an exception, and the
> old code uses '%', it would be fine to change it to use an f-string.
>
> Other devs may have different opinions about that, so let's see what
> others say before that is considered official policy.
>
> Warren
>
> > I know different (parts of) different sub-packages may use '%' for
> > interpolation and others use .format().
> >
> > I poked around trying to find examples in the codebase, but GitHub
> > searching for f' didn't give anything, nor did I find anything in the
> > hacking file or related links.
> >
> > Do we have a preferred string interpolation method for SciPy?
> >
> > Note the discussion of the string interpolation is about a potential
> > follow-up PR that cool-RR may do and not the linked PR itself.
> >
> > Sincerely,
> >
> > -Lucas Roberts
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com  Sat Jul  4 17:23:55 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 4 Jul 2020 23:23:55 +0200
Subject: [SciPy-Dev] F strings in codebase?
In-Reply-To:
References:
Message-ID:

On Fri, Jul 3, 2020 at 9:40 PM Warren Weckesser <warren.weckesser at gmail.com> wrote:

> On 7/3/20, rlucas7 at vt.edu <rlucas7 at vt.edu> wrote:
> > Hi SciPy-dev,
> >
> > On reviewing this PR:
> >
> > https://github.com/scipy/scipy/pull/12376
> >
> > cool-RR was asking about whether f-strings are OK in SciPy now. I wasn't
> > sure and couldn't find any, nor clear documentation on a preferred
> format.
>
>
> Yes, f-strings are acceptable now.
>
> But I don't think we want to go back and convert all uses of .format()
> and '%' to f-strings. The benefit is probably not worth the code
> churn. So for now, it is probably best to use f-strings in new code,
> and when updating old code for some other reason. E.g. if some
> additional information is added to the message of an exception, and the
> old code uses '%', it would be fine to change it to use an f-string.
>
> Other devs may have different opinions about that, so let's see what
> others say before that is considered official policy.
>

+1 completely agree

Ralf


> Warren
>
> > I know different (parts of) different sub-packages may use '%' for
> > interpolation and others use .format().
> >
> > I poked around trying to find examples in the codebase, but GitHub
> > searching for f'
> > didn't give anything, nor did I find anything in the hacking file or
> > related links.
> >
> > Do we have a preferred string interpolation method for SciPy?
> >
> > Note the discussion of the string interpolation is about a potential
> > follow-up PR that cool-RR may do and not the linked PR itself.
> >
> > Sincerely,
> >
> > -Lucas Roberts
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tyler.je.reddy at gmail.com  Sat Jul  4 19:47:24 2020
From: tyler.je.reddy at gmail.com (Tyler Reddy)
Date: Sat, 4 Jul 2020 17:47:24 -0600
Subject: [SciPy-Dev] ANN: SciPy 1.5.1
Message-ID:

Hi all,

On behalf of the SciPy development team I'm pleased to announce the
release of SciPy 1.5.1, which is a bug fix release. Sources and binary
wheels can be found at: https://pypi.org/project/scipy/ and at:
https://github.com/scipy/scipy/releases/tag/v1.5.1

One of a few ways to install this release with pip:

pip install scipy==1.5.1

==========================
SciPy 1.5.1 Release Notes
==========================

SciPy 1.5.1 is a bug-fix release with no new features compared to 1.5.0.
In particular, an issue where DLL loading can fail for SciPy wheels on
Windows with Python 3.6 has been fixed.

Authors
=======

* Peter Bell
* Loïc Estève
* Philipp Thölke +
* Tyler Reddy
* Paul van Mulbregt
* Pauli Virtanen
* Warren Weckesser

A total of 7 people contributed to this release.
People with a "+" by their names contributed a patch for the first time.
This list of names is automatically generated, and may not be fully
complete.

Issues closed for 1.5.1
--------------------------------

* `#9108 `__: documentation: scipy.spatial.KDTree vs. scipy.spatial.cKDTree
* `#12218 `__: Type error in stats.ks_2samp when alternative != 'two-sided'
* `#12406 `__: DOC: Docstring in stats.anderson function not properly formatted
* `#12418 `__: Regression in hierarchy.dendogram

Pull requests for 1.5.1
--------------------------------

* `#12280 `__: BUG: Fixes gh-12218, TypeError converting int to float inside...
* `#12336 `__: BUG: KDTree should reject complex input points
* `#12344 `__: MAINT: Don't use numpy's aliases of Python builtin objects.
* `#12407 `__: DOC: Fix docstring for dist param in anderson function
* `#12410 `__: CI: Run the Azure Windows Python36 32bit tests with mode 'fast'
* `#12421 `__: Fix regression in scipy 1.5.0 in dendogram when labels is a numpy...
* `#12462 `__: MAINT: move distributor_init import after __config__ import Checksums ========= MD5 ~~~ b71e8115d61c604cc65e5ecc556131f6 scipy-1.5.1-cp36-cp36m-macosx_10_9_x86_64.whl 0190c11f75ed28a7e56050182ca95a18 scipy-1.5.1-cp36-cp36m-manylinux1_i686.whl c4dd717a3a0c3fe64380039e4fda663f scipy-1.5.1-cp36-cp36m-manylinux1_x86_64.whl baad02c954e85e7fd3d4a9fd49fc6359 scipy-1.5.1-cp36-cp36m-win32.whl 9edc3a9aedf6bffccb17101c905126d0 scipy-1.5.1-cp36-cp36m-win_amd64.whl 83479a6de66a6bc2da0990fa71cf3cec scipy-1.5.1-cp37-cp37m-macosx_10_9_x86_64.whl f2d5c8713b087545c5ec19cc8e46212c scipy-1.5.1-cp37-cp37m-manylinux1_i686.whl 6a18a9636342574ae55d3a80136c550c scipy-1.5.1-cp37-cp37m-manylinux1_x86_64.whl 5da68faf5b32c539d1cb5390df010cc8 scipy-1.5.1-cp37-cp37m-win32.whl 2ca8c59a6712e91ac78b8540ab694b53 scipy-1.5.1-cp37-cp37m-win_amd64.whl cceb059d0cf6a70e62452deb5571ba00 scipy-1.5.1-cp38-cp38-macosx_10_9_x86_64.whl 8a65b30ccd72409704d3300922da2b7f scipy-1.5.1-cp38-cp38-manylinux1_i686.whl 00181f52a7917d1c3d50e42a76a6df96 scipy-1.5.1-cp38-cp38-manylinux1_x86_64.whl 2aa8b6ddceaebe7b33d71dbad0e208cc scipy-1.5.1-cp38-cp38-win32.whl a626585d08b0991c8f2df0caacdf9997 scipy-1.5.1-cp38-cp38-win_amd64.whl f6986798b7d22ffc5f80b749d7ec27ca scipy-1.5.1.tar.gz e126a1a0ff954b924a8273faa7437fe3 scipy-1.5.1.tar.xz 3bce82b23d45d1a96ee270f23176746a scipy-1.5.1.zip SHA256 ~~~~~~ 058e84930407927f71963a4ad8c1dc96c4d2d075636a68578195648c81f78810 scipy-1.5.1-cp36-cp36m-macosx_10_9_x86_64.whl 7908c85854c5b5b6d3ce7fefafac1ca3e23ff9ac41edabc2d46ae5dc9fa070ac scipy-1.5.1-cp36-cp36m-manylinux1_i686.whl 8302d69fb1528ea7c7f2a1ea640d354c981b6eb8192d1c175349874209397604 scipy-1.5.1-cp36-cp36m-manylinux1_x86_64.whl 35d042d6499caf1a5d171baed0ebf01eb665b7af2ad98a8ff1b0e6e783654540 scipy-1.5.1-cp36-cp36m-win32.whl 5e0bb43ff581811ab7f27425f6b96c1ddf7591ccad2e486c9af0b910c18f7185 scipy-1.5.1-cp36-cp36m-win_amd64.whl b4858ccbd88f4b53950fb9fc0069c1d9fea83d7cff2382e1d8b023d3f4883014 scipy-1.5.1-cp37-cp37m-macosx_10_9_x86_64.whl eb46d8b5947ca27b0bc972cecfba8130f088a83ab3d08c1a6033d9070b3046b3 scipy-1.5.1-cp37-cp37m-manylinux1_i686.whl fff15df01bef1243468be60c55178ed7576270b200aab08a7ffd5b8e0bbc340c scipy-1.5.1-cp37-cp37m-manylinux1_x86_64.whl 81859ed3aad620752dd2c07c32b5d3a80a0d47c5e3813904621954a78a0ae899 scipy-1.5.1-cp37-cp37m-win32.whl c05c6fe76228cc13c5214e9faf5f2a871a1da54473bc417ab9da310d0e5fff8b scipy-1.5.1-cp37-cp37m-win_amd64.whl 71742889393a724dfce755b6b61228677873d269a4234e51ddaf08b998433c91 scipy-1.5.1-cp38-cp38-macosx_10_9_x86_64.whl 9323d268775991b79690f7b9a28a4e8b8c4f2b160ed9f8a90123127314e2d3c1 scipy-1.5.1-cp38-cp38-manylinux1_i686.whl 06b19a650471781056c1a2172eeeeb777b8b516e9434005dd392a4559e0938b9 scipy-1.5.1-cp38-cp38-manylinux1_x86_64.whl 57a0f2be3063dbe1e3daf31ec9005576e8fd1022a28159d0db71d14566899d16 scipy-1.5.1-cp38-cp38-win32.whl c06e731aa46c0dfc563cc636155758178ebc019ef78b9b0f4370effe2ac0f0e6 scipy-1.5.1-cp38-cp38-win_amd64.whl 039572f0ca9578a466683558c5bf1e65d442860ec6e13307d528749cfe6d07b8 scipy-1.5.1.tar.gz 0728bd66a5251cfeff17a72280ae5a40ec14add217f94868d1415b3c469b610a scipy-1.5.1.tar.xz 6dfa9d1e718588f48731e865674b3270130f7736d6c7dc5ceaeb048f55ed793a scipy-1.5.1.zip -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From asaadel1 at jhu.edu  Fri Jul 10 10:47:13 2020
From: asaadel1 at jhu.edu (Ali Saad-Eldin)
Date: Fri, 10 Jul 2020 14:47:13 +0000
Subject: [SciPy-Dev] ENH: Add an approximate Linear Sum Assignment Solver
In-Reply-To:
References:
Message-ID:

Hi all,

I would like to add an approximate Linear Assignment Problem (aLAP) solver
to scipy.optimize. The specific approximation algorithm I propose adding
can be found here. scipy.optimize currently houses an exact solver for LAP
in linear_sum_assignment(), and I was thinking this could be added either
as a new method within linear_sum_assignment(), or as its own function.

The approximation algorithm has time complexity O(n^2 log(n)), compared to
the algorithm implemented in linear_sum_assignment() with complexity
~O(n^2.3), while guaranteeing a score >= 1/2 optimum (though in practice
the scores appear to be much closer to optimum). I've attached two plots
demonstrating runtime and accuracy, running linear_sum_assignment() and
aLAP on simulated, dense n-by-n cost matrices with entries randomly
selected from a uniform(0, 100) distribution, for n sampled from [0, 3000].
Note that linear_sum_assignment() is essentially C/C++, while the aLAP
implementation is native Python, so a Cython implementation could make aLAP
even faster. Additionally, an advantage to this algorithm is that it is
parallelizable, which could cause a further speed up.

Current version of the implementation can be found here, and proof of
effectiveness here.

Best,
Ali Saad-Eldin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: perdiff_lapvalap.png
Type: image/png
Size: 43428 bytes
Desc: perdiff_lapvalap.png
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: time_lapvalap.png
Type: image/png
Size: 51863 bytes
Desc: time_lapvalap.png
URL:

From dloney at usbr.gov  Mon Jul 13 15:46:19 2020
From: dloney at usbr.gov (Loney, Drew A)
Date: Mon, 13 Jul 2020 19:46:19 +0000
Subject: [SciPy-Dev] Explicit nonzero handling in sparse matrix operations
Message-ID:

Hello everyone,

I'd like to add explicit zero handling into the scipy sparse matrix
operations. It's currently part of pull request #11899.

The functionality allows a user to selectively consider only explicit
nonzero values in the sparse min, max, argmin, and argmax functions. Prior
to this change, the programmer needed to mathematically manipulate the
sparse matrix so that these functions gave the correct result for matrices
that contain zero values. By passing in the explicit keyword, these
functions are now able to consider only nonzero values within the matrix.
The change defaults to the existing functionality, and the new behavior
only takes effect if the user sets the keyword.

This issue exists in several of my existing programs. As a benchmark, this
implementation improves performance between 1.15x and 1.35x compared to
manipulating the sparse matrix to identify nonzero values.

Does anyone have any thoughts on this functionality?

Drew Allan Loney, PhD PE
Water Resources Engineering and Management
Technical Services Center
Bureau of Reclamation
(303)445-2541
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
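For reference, the nonzero-only reduction being proposed can already be
sketched today via the public `.data` attribute (illustrative only; the
keyword API in gh-11899 may differ):

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix(np.array([[0.0, 2.0, 0.0],
                         [5.0, 0.0, -3.0]]))

# A.min() is -3.0 here, but would be 0 for an all-positive sparse matrix,
# because implicit zeros participate in the reduction.  Restricting to
# explicitly stored nonzero entries instead:
nonzero_data = A.data[A.data != 0]   # also drops explicitly stored zeros
print(nonzero_data.min(), nonzero_data.max())   # -3.0 5.0
```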
From andyfaff at gmail.com  Tue Jul 14 19:03:01 2020
From: andyfaff at gmail.com (Andrew Nelson)
Date: Wed, 15 Jul 2020 09:03:01 +1000
Subject: [SciPy-Dev] creation / pickling of stats distributions
Message-ID:

I have some code that uses multiprocessing.Pool for parallelisation. This
requires that an object is pickled. This object has an `rv_frozen`
distribution as an attribute. It turns out that performance is much
improved if the `rv_frozen` distribution is not present --> pickling of
`rv_frozen` objects is expensive. Creation of `rv_frozen` objects is also
expensive.

```
>>> import scipy.stats as stats
>>> import pickle
>>> %timeit stats.norm(scale=1, loc=1)
694 µs ± 123 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> rv = stats.norm(scale=1, loc=1)
>>> %timeit s = pickle.dumps(rv); pickle.loads(s)
1.02 ms ± 24 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

I'd be hoping for an order of magnitude less in time for either of those.
Using line profiling, two of the big culprits for slowness during object
creation are `rv_continuous._construct_doc` (50% of the total time, with a
large part spent in `_lib.doccer.docformat`!!) and
`rv_continuous._construct_argparser`.

My questions are:

1) Is it possible to speed up pickling/unpickling of these objects? (e.g.
__setstate__/__getstate__, custom reduction, copyreg magic, ...)
2) Is there any way to turn off docstring creation (or speeding it up),
besides starting the interpreter with -OO?

_____________________________________
Dr. Andrew Nelson

_____________________________________
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From evgeny.burovskiy at gmail.com  Wed Jul 15 04:29:02 2020
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Wed, 15 Jul 2020 11:29:02 +0300
Subject: [SciPy-Dev] creation / pickling of stats distributions
In-Reply-To:
References:
Message-ID:

While it's not wholly surprising these two are slow, it is surprising
they are *that* slow.

W.r.t. docstrings, I think there's room for adding a "skip_docstring"
kwarg or some such to rv_generic. It'll need to be propagated to
`rv_frozen.dist`. I can send a PR or one, if that helps.
(I don't know about pickling, sadly.)

All that said, maybe it's easier to share the shapes between processes and
use regular distributions if rv_frozen is a bottleneck, will that help?

On Wed, 15 Jul 2020 at 2:03, Andrew Nelson wrote:

> I have some code that uses multiprocessing.Pool for parallelisation. This
> requires that an object is pickled. This object has an `rv_frozen`
> distribution as an attribute. It turns out that performance is much
> improved if the `rv_frozen` distribution is not present --> pickling of
> `rv_frozen` objects is expensive. Creation of `rv_frozen` objects is also
> expensive.
>
> ```
> >>> import scipy.stats as stats
> >>> import pickle
> >>> %timeit stats.norm(scale=1, loc=1)
> 694 µs ± 123 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
> >>> rv = stats.norm(scale=1, loc=1)
> >>> %timeit s = pickle.dumps(rv); pickle.loads(s)
> 1.02 ms ± 24 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
> ```
>
> I'd be hoping for an order of magnitude less in time for either of those.
> Using line profiling, two of the big culprits for slowness during object
> creation are `rv_continuous._construct_doc` (50% of the total time, with a
> large part spent in `_lib.doccer.docformat`!!) and
> `rv_continuous._construct_argparser`.
>
> My questions are:
>
> 1) Is it possible to speed up pickling/unpickling of these objects? (e.g.
> __setstate__/__getstate__, custom reduction, copyreg magic, ...)
> 2) Is there any way to turn off docstring creation (or speeding it up),
> besides starting the interpreter with -OO?
>
> _____________________________________
> Dr. Andrew Nelson
>
> _____________________________________
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From evgeny.burovskiy at gmail.com  Wed Jul 15 04:30:30 2020
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Wed, 15 Jul 2020 11:30:30 +0300
Subject: [SciPy-Dev] creation / pickling of stats distributions
In-Reply-To:
References:
Message-ID:

Sorry, typo:

I can send a PR or *review* one, if that helps.

On Wed, 15 Jul 2020 at 11:29, Evgeni Burovski wrote:

> While it's not wholly surprising these two are slow, it is surprising
> they are *that* slow.
>
> W.r.t. docstrings, I think there's room for adding a "skip_docstring"
> kwarg or some such to rv_generic. It'll need to be propagated to
> `rv_frozen.dist`. I can send a PR or one, if that helps.
> (I don't know about pickling, sadly.)
>
> All that said, maybe it's easier to share the shapes between processes
> and use regular distributions if rv_frozen is a bottleneck, will that
> help?
>
> On Wed, 15 Jul 2020 at 2:03, Andrew Nelson wrote:
>
>> I have some code that uses multiprocessing.Pool for parallelisation. This
>> requires that an object is pickled. This object has an `rv_frozen`
>> distribution as an attribute. It turns out that performance is much
>> improved if the `rv_frozen` distribution is not present --> pickling of
>> `rv_frozen` objects is expensive. Creation of `rv_frozen` objects is also
>> expensive.
>>
>> ```
>> >>> import scipy.stats as stats
>> >>> import pickle
>> >>> %timeit stats.norm(scale=1, loc=1)
>> 694 µs ± 123 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>> >>> rv = stats.norm(scale=1, loc=1)
>> >>> %timeit s = pickle.dumps(rv); pickle.loads(s)
>> 1.02 ms ± 24 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>> ```
>>
>> I'd be hoping for an order of magnitude less in time for either of those.
>> Using line profiling, two of the big culprits for slowness during object
>> creation are `rv_continuous._construct_doc` (50% of the total time, with a
>> large part spent in `_lib.doccer.docformat`!!) and
>> `rv_continuous._construct_argparser`.
>>
>> My questions are:
>>
>> 1) Is it possible to speed up pickling/unpickling of these objects? (e.g.
>> __setstate__/__getstate__, custom reduction, copyreg magic, ...)
>> 2) Is there any way to turn off docstring creation (or speeding it up),
>> besides starting the interpreter with -OO?
>>
>> _____________________________________
>> Dr. Andrew Nelson
>>
>> _____________________________________
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
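As a rough illustration of the "share the shapes, not the frozen object"
workaround Evgeni suggests (a sketch, not a SciPy API): send only the
distribution name and parameters through the pool, and re-freeze on the
worker side.

```python
import pickle
import scipy.stats as stats

# Instead of pickling the rv_frozen object, send a lightweight spec ...
spec = ("norm", (), {"loc": 1, "scale": 1})
payload = pickle.dumps(spec)          # cheap: a tuple of builtins

# ... and rebuild the frozen distribution in the worker process.
name, args, kwds = pickle.loads(payload)
rv = getattr(stats, name)(*args, **kwds)
print(rv.mean())   # 1.0
```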
From lorentzen.ch at gmail.com  Fri Jul 17 06:38:07 2020
From: lorentzen.ch at gmail.com (Christian Lorentzen)
Date: Fri, 17 Jul 2020 12:38:07 +0200
Subject: [SciPy-Dev] Tweedie distributions in scipy.stats
In-Reply-To: <90626C84-DCFE-420C-BE6F-508052C66210@vt.edu>
References: <90626C84-DCFE-420C-BE6F-508052C66210@vt.edu>
Message-ID: <135a2ec6-b43b-a061-6e7a-ae0bf592f7f1@googlemail.com>

Hi

So to summarize: The plan is to put Wright's generalized Bessel function
(for some parameter ranges) as a *private function* into scipy.special.
This way, other libraries can use it under the hood, e.g. to compute
likelihoods or provide distributions (rv, pdf, ...).

Just as info, PR [1] for Wright's generalized Bessel function is now
ready for a round of review, in my view.

Side note: I deliberately chose to implement this special function and did
not consider the various ways of computing the pdf of a Tweedie
distribution as in [2], because I think an additional special function
might have value for other purposes as well, and an independent approach to
Tweedie distributions provides a good double check (I expect advantages and
shortcomings of this approach).

All the best,
Christian

[1] https://github.com/scipy/scipy/pull/11313
[2] https://cran.r-project.org/package=tweedie

On 16.03.20 00:00, rlucas7 at vt.edu wrote:
>
>> On Mar 13, 2020, at 12:46 PM, Robert Kern wrote:
>>
>> On Fri, Mar 13, 2020 at 12:15 PM <josef> wrote:
>>
>>     On Fri, Mar 13, 2020 at 12:04 PM <josef> wrote:
>>
>>         Aside:
>>         compound Poisson is a convolution of distributions and not a
>>         finite mixture.
>>         Allowing an infinite mixing distribution like Poisson creates
>>         numerical problems in the upper tail that are not easy to
>>         solve in general.
>>         In most cases, computations have to be truncated at the upper
>>         tail, but then the problem is to figure out the truncation
>>         threshold for a required precision.
>>         My guess is that this would be a lot of work to get it to
>>         scipy standards.
>>
>>     I was looking at the general case for convolution and compound
>>     poisson a long time ago, mainly using fft to get the pdf and cdf
>>     from the characteristic function; the cf is relatively simple to
>>     compute for convolutions. The references in extreme value and risk
>>     applications that I looked at were emphasizing tail precision and
>>     ways how to work around it, or comparing different methods in how
>>     precise they are.
>>     fft was fast, but I only eyeballed the truncation threshold for my
>>     examples.
>>
> I think I also looked at that at my previous employer, I think the
> reference I had used is this one
> https://eprints.usq.edu.au/3888/1/Dunn_Smyth_Stats_and_Comp_v18n1.pdf
> Hopefully that helps.
>
>>     I thought of representing tweedie for the computation as a
>>     mixture between a mass point/discrete distribution and a
>>     distribution for the continuous part, so we can handle the two
>>     parts separately.
>>
> I came to the same conclusion after thinking about this a bit over the
> last few days.
>
>>     Following this, it might be possible to add a zero-truncated
>>     tweedie distribution as a continuous distribution subclass in scipy.
>>     Then we could just add a simple mixture of the mass point at zero
>>     and the zero-truncated tweedie.
>>
> The difference in zero inflated poisson is that it can be handled
> directly within the rv_discrete framework (I think).
> An rv_continuous with a zero point mass would handle a tweedie with
> 1 artificial. E.g.
> a normal and 0 point mass mixture (used in stochastic search variable
> selection) and a 0 point mass mixture with a moment normal used in Valen
> Johnson's work.
> These seem to be the common applications (0 point mass).
>
>> That could certainly work. It seems like handling that smoothly may
>> be a pain for the user; you'd have to coordinate the effect of the
>> parameters on both the size of the point mass and the continuous
>> part, as well as the mixture.
>>
>> My recommendation is to implement this in its own package, using
>> whatever frameworks you find help you solve your data analysis
>> problems. Then we can figure out where it ought to finally live and
>> how to extend the existing frameworks to handle this case best.
>
> Thanks for suggesting this Robert, this is a wise strategy; this will
> enable us to work out something that would generalize outside of the
> specifics of only the tweedie distribution.
>
>> The code doesn't have to start out in scipy.stats in order to make
>> use of the scipy.stats framework. Please do continue to put the
>> necessary special functions into scipy.special; that framework is a
>> little harder to use outside of scipy.special. If you need my vote of
>> support for that on that PR, you have it.
>
> I found this from another statsmodels developer that may be helpful to
> use as a reference:
>
> https://github.com/thequackdaddy/tweedie/blob/master/tweedie/tweedie_dist.py
>
> Hopefully you find it helpful.
>
>> --
>> Robert Kern
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rlucas7 at vt.edu  Fri Jul 17 16:25:06 2020
From: rlucas7 at vt.edu (rlucas7 at vt.edu)
Date: Fri, 17 Jul 2020 16:25:06 -0400
Subject: [SciPy-Dev] Fwd: [rlucas7/scipy] Run failed: Nightly build - master (c5e0ca5)
References:
Message-ID:

Hi scipy-dev,

I think my branch is failing the Python 3.9 (experimental) build. I don't
see the failure every night but instead maybe once a month.

Has anyone else seen this and know what needs to be changed? Presumably
there is a configuration file in which I need to change an entry to turn
off the build.

Begin forwarded message:

> From: Lucas Roberts
> Date: July 16, 2020 at 8:03:43 PM EDT
> To: rlucas7/scipy
> Cc: Ci activity
> Subject: [rlucas7/scipy] Run failed: Nightly build - master (c5e0ca5)
> Reply-To: rlucas7/scipy
>
> Run failed for master (c5e0ca5)
>
> Repository: rlucas7/scipy
> Workflow: Nightly build
> Duration: 35.0 seconds
> Finished: 2020-07-17 00:03:24 UTC
>
> View results
>
> Jobs:
> test_nightly (3.9) failed (1 annotation)
>
> You are receiving this because this workflow ran on your branch.
> Manage your GitHub Actions notifications here.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From andyfaff at gmail.com  Fri Jul 17 17:10:21 2020
From: andyfaff at gmail.com (Andrew Nelson)
Date: Sat, 18 Jul 2020 07:10:21 +1000
Subject: [SciPy-Dev] Fwd: [rlucas7/scipy] Run failed: Nightly build - master (c5e0ca5)
In-Reply-To:
References:
Message-ID:

On Sat, 18 Jul 2020, 07:03, wrote:

> Hi scipy-dev,
>
> I think my branch is failing the Python 3.9 (experimental) build. I don't
> see the failure every night but instead maybe once a month.
>
> Has anyone else seen this and know what needs to be changed?
>

https://github.community/t/stop-github-actions-running-on-a-fork/17965

does this do what you're asking?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com  Fri Jul 17 18:02:51 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 18 Jul 2020 00:02:51 +0200
Subject: [SciPy-Dev] welcome Seth Troisi to the SciPy core team
Message-ID:

Hi all,

On behalf of the SciPy developers I'd like to welcome Seth Troisi as a
member of the core team. Seth has been contributing since the start of this
year. He has also done a lot of maintenance work all over the code base, as
well as issue triaging and PR review. Here is an overview of his SciPy PRs:
https://github.com/scipy/scipy/pulls/sethtroisi

I'm looking forward to Seth's continued contributions!

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From warren.weckesser at gmail.com  Fri Jul 17 18:06:07 2020
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Fri, 17 Jul 2020 18:06:07 -0400
Subject: [SciPy-Dev] welcome Seth Troisi to the SciPy core team
In-Reply-To:
References:
Message-ID:

On 7/17/20, Ralf Gommers wrote:
> Hi all,
>
> On behalf of the SciPy developers I'd like to welcome Seth Troisi as a
> member of the core team. Seth has been contributing since the start of this
> year. He has also done a lot of maintenance work all over the code base, as
> well as issue triaging and PR review. Here is an overview of his SciPy PRs:
> https://github.com/scipy/scipy/pulls/sethtroisi


Thanks Seth, for all the great work so far. Looking forward to more!

Warren


>
> I'm looking forward to Seth's continued contributions!
>
> Cheers,
> Ralf
>

From ilhanpolat at gmail.com  Fri Jul 17 18:11:48 2020
From: ilhanpolat at gmail.com (Ilhan Polat)
Date: Sat, 18 Jul 2020 00:11:48 +0200
Subject: [SciPy-Dev] welcome Seth Troisi to the SciPy core team
In-Reply-To:
References:
Message-ID:

Welcome on board Seth! Keep 'em coming.

On Sat, Jul 18, 2020 at 12:07 AM Warren Weckesser <warren.weckesser at gmail.com> wrote:

> On 7/17/20, Ralf Gommers wrote:
> > Hi all,
> >
> > On behalf of the SciPy developers I'd like to welcome Seth Troisi as a
> > member of the core team. Seth has been contributing since the start of
> this
> > year. He has also done a lot of maintenance work all over the code
> base, as
> > well as issue triaging and PR review. Here is an overview of his SciPy
> PRs:
> > https://github.com/scipy/scipy/pulls/sethtroisi
>
> Thanks Seth, for all the great work so far. Looking forward to more!
>
> Warren
>
> >
> > I'm looking forward to Seth's continued contributions!
> >
> > Cheers,
> > Ralf
> >
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mhaberla at calpoly.edu  Fri Jul 17 21:13:21 2020
From: mhaberla at calpoly.edu (Matt Haberland)
Date: Fri, 17 Jul 2020 18:13:21 -0700
Subject: [SciPy-Dev] welcome Seth Troisi to the SciPy core team
In-Reply-To:
References:
Message-ID:

Welcome, Seth!

On Fri, Jul 17, 2020 at 3:12 PM Ilhan Polat wrote:

> Welcome on board Seth! Keep 'em coming.
>
> On Sat, Jul 18, 2020 at 12:07 AM Warren Weckesser <warren.weckesser at gmail.com> wrote:
>
>> On 7/17/20, Ralf Gommers wrote:
>> > Hi all,
>> >
>> > On behalf of the SciPy developers I'd like to welcome Seth Troisi as a
>> > member of the core team. Seth has been contributing since the start of
>> this
>> > year. He has also done a lot of maintenance work all over the code
>> base, as
>> > well as issue triaging and PR review. Here is an overview of his SciPy
>> PRs:
>> > https://github.com/scipy/scipy/pulls/sethtroisi
>>
>> Thanks Seth, for all the great work so far. Looking forward to more!
>>
>> Warren
>>
>> >
>> > I'm looking forward to Seth's continued contributions!
>> >
>> > Cheers,
>> > Ralf
>> >
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>

--
Matt Haberland
Assistant Professor
BioResource and Agricultural Engineering
08A-3K, Cal Poly
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rlucas7 at vt.edu  Fri Jul 17 22:51:43 2020
From: rlucas7 at vt.edu (rlucas7 at vt.edu)
Date: Fri, 17 Jul 2020 22:51:43 -0400
Subject: [SciPy-Dev] welcome Seth Troisi to the SciPy core team
In-Reply-To:
References:
Message-ID: <8355CA2A-3174-4A3D-B6C8-8EB415631AEC@vt.edu>

Welcome to the team!

-Lucas Roberts

> On Jul 17, 2020, at 9:14 PM, Matt Haberland wrote:
>
> Welcome, Seth!
>
>> On Fri, Jul 17, 2020 at 3:12 PM Ilhan Polat wrote:
>> Welcome on board Seth! Keep 'em coming.
>>
>>> On Sat, Jul 18, 2020 at 12:07 AM Warren Weckesser wrote:
>>> On 7/17/20, Ralf Gommers wrote:
>>> > Hi all,
>>> >
>>> > On behalf of the SciPy developers I'd like to welcome Seth Troisi as a
>>> > member of the core team. Seth has been contributing since the start of this
>>> > year. He has also done a lot of maintenance work all over the code base, as
>>> > well as issue triaging and PR review. Here is an overview of his SciPy PRs:
>>> > https://github.com/scipy/scipy/pulls/sethtroisi
>>>
>>> Thanks Seth, for all the great work so far. Looking forward to more!
>>>
>>> Warren
>>>
>>> >
>>> > I'm looking forward to Seth's continued contributions!
>>> >
>>> > Cheers,
>>> > Ralf
>>> >
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>
> --
> Matt Haberland
> Assistant Professor
> BioResource and Agricultural Engineering
> 08A-3K, Cal Poly
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From andyfaff at gmail.com  Sat Jul 18 01:08:46 2020
From: andyfaff at gmail.com (Andrew Nelson)
Date: Sat, 18 Jul 2020 15:08:46 +1000
Subject: [SciPy-Dev] welcome Seth Troisi to the SciPy core team
In-Reply-To: <8355CA2A-3174-4A3D-B6C8-8EB415631AEC@vt.edu>
References: <8355CA2A-3174-4A3D-B6C8-8EB415631AEC@vt.edu>
Message-ID:

Welcome Seth.

On Sat, 18 Jul 2020 at 14:02, wrote:

> Welcome to the team!
>
> -Lucas Roberts
>
> On Jul 17, 2020, at 9:14 PM, Matt Haberland wrote:
> > Welcome, Seth!
> >
> >> On Fri, Jul 17, 2020 at 3:12 PM Ilhan Polat wrote:
> >> Welcome on board Seth! Keep 'em coming.
> >>
> >>> On Sat, Jul 18, 2020 at 12:07 AM Warren Weckesser <warren.weckesser at gmail.com> wrote:
> >>> On 7/17/20, Ralf Gommers wrote:
> >>> > Hi all,
> >>> >
> >>> > On behalf of the SciPy developers I'd like to welcome Seth Troisi as a
> >>> > member of the core team. Seth has been contributing since the start of this
> >>> > year. He has also done a lot of maintenance work all over the code base, as
> >>> > well as issue triaging and PR review. Here is an overview of his SciPy PRs:
> >>> > https://github.com/scipy/scipy/pulls/sethtroisi
> >>>
> >>> Thanks Seth, for all the great work so far. Looking forward to more!
> >>>
> >>> Warren
> >>>
> >>> >
> >>> > I'm looking forward to Seth's continued contributions!
> >>> >
> >>> > Cheers,
> >>> > Ralf
> >>> >
> >>> _______________________________________________
> >>> SciPy-Dev mailing list
> >>> SciPy-Dev at python.org
> >>> https://mail.python.org/mailman/listinfo/scipy-dev
> >> _______________________________________________
> >> SciPy-Dev mailing list
> >> SciPy-Dev at python.org
> >> https://mail.python.org/mailman/listinfo/scipy-dev
> >
> > --
> > Matt Haberland
> > Assistant Professor
> > BioResource and Agricultural Engineering
> > 08A-3K, Cal Poly
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev

--
_____________________________________
Dr. Andrew Nelson

_____________________________________
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From evgeny.burovskiy at gmail.com  Sat Jul 18 01:52:21 2020
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Sat, 18 Jul 2020 08:52:21 +0300
Subject: [SciPy-Dev] welcome Seth Troisi to the SciPy core team
In-Reply-To:
References:
Message-ID:

Welcome Seth!

On Sat, 18 Jul 2020 at 1:03, Ralf Gommers wrote:

> Hi all,
>
> On behalf of the SciPy developers I'd like to welcome Seth Troisi as a
> member of the core team. Seth has been contributing since the start of this
> year. He has also done a lot of maintenance work all over the code base, as
> well as issue triaging and PR review. Here is an overview of his SciPy PRs:
> https://github.com/scipy/scipy/pulls/sethtroisi
>
> I'm looking forward to Seth's continued contributions!
>
> Cheers,
> Ralf
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com  Sat Jul 18 13:17:06 2020
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 18 Jul 2020 19:17:06 +0200
Subject: [SciPy-Dev] making Cython code interruptible, proposed new build dependency
Message-ID:

Hi all,

In https://github.com/scipy/scipy/pull/12251 there's a proposal to add
https://github.com/sagemath/cysignals as a build-time (not run-time)
dependency in order to make Cython code interruptible.

Looking at the open issues on the cysignals tracker and the activity, I'm
not sure how well maintained it is. Does anyone have experience with
cysignals? Do you think it's worth adding?
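For those who haven't used it: the core pattern cysignals provides in Cython
code looks roughly like this (an untested sketch; `heavy_c_computation` and
`do_step` are hypothetical stand-ins, not real SciPy functions):

```cython
# Cython sketch of making C-level work interruptible with cysignals.
from cysignals.signals cimport sig_on, sig_off, sig_check

def interruptible_call():
    sig_on()               # from here on, Ctrl-C raises KeyboardInterrupt
    heavy_c_computation()  # long-running C routine (stand-in)
    sig_off()

def interruptible_loop(long n):
    cdef long i
    for i in range(n):
        sig_check()        # cheap periodic check inside tight loops
        do_step(i)         # stand-in
```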
Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.mahler.larsen at gmail.com Mon Jul 20 08:36:40 2020 From: peter.mahler.larsen at gmail.com (Peter Larsen) Date: Mon, 20 Jul 2020 14:36:40 +0200 Subject: [SciPy-Dev] Disjoint set / union find data structure Message-ID: In many projects I find myself needing a disjoint set data structure https://en.wikipedia.org/wiki/Disjoint-set_data_structure. It would be very convenient if we had an implementation in SciPy. The codebase already contains two implementations, albeit not publicly accessible ones: https://github.com/scipy/scipy/blob/1d8b34b47d2b881ea904f4c9bf4c49b3fa36b29a/scipy/cluster/_hierarchy.pyx#L1074 https://github.com/scipy/scipy/blob/8dba340293fe20e62e173bdf2c10ae208286692f/scipy/sparse/linalg/dsolve/SuperLU/SRC/sp_coletree.c#L40 If there are no objections to its inclusion in SciPy I will write a PR. I tentatively propose to put it in scipy.cluster but other suggestions are welcome. Cheers, Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Jul 21 09:42:31 2020 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 21 Jul 2020 15:42:31 +0200 Subject: [SciPy-Dev] Disjoint set / union find data structure In-Reply-To: References: Message-ID: On Mon, Jul 20, 2020 at 2:37 PM Peter Larsen wrote: > In many projects I find myself needing a disjoint set data structure > https://en.wikipedia.org/wiki/Disjoint-set_data_structure. > It would be very convenient if we had an implementation in SciPy. > That seems like a reasonable and useful thing to add. > The codebase already contains two implementations, albeit not publicly > accessible ones: > > https://github.com/scipy/scipy/blob/1d8b34b47d2b881ea904f4c9bf4c49b3fa36b29a/scipy/cluster/_hierarchy.pyx#L1074 > > https://github.com/scipy/scipy/blob/8dba340293fe20e62e173bdf2c10ae208286692f/scipy/sparse/linalg/dsolve/SuperLU/SRC/sp_coletree.c#L40 > > If there are no objections to its inclusion in SciPy I will write a PR. I > tentatively propose to put it in scipy.cluster but other suggestions are > welcome. > Cluster, spatial and sparse all could make sense. If we foresee using this internally in all those modules, I think we'd have to put it in `sparse` or in `_lib`. The reason: no further dependencies between modules. cluster already depends on spatial, and spatial depends on sparse. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From larson.eric.d at gmail.com Tue Jul 21 14:28:47 2020 From: larson.eric.d at gmail.com (Eric Larson) Date: Tue, 21 Jul 2020 14:28:47 -0400 Subject: [SciPy-Dev] multitaper In-Reply-To: <99d4b040-d88c-578f-9a74-dd1fb5adf638@gmail.com> References: <99d4b040-d88c-578f-9a74-dd1fb5adf638@gmail.com> Message-ID: I find multitaper estimates to be useful in scientific work. Given we have Welch, it seems useful to also have multitaper. However, I think we'd need to consider the API a bit. For example, the MATLAB API gives you some idea of the number of possible options: https://www.mathworks.com/help/signal/ref/pmtm.html I don't think we should follow their naming or API necessarily, but some of the options (providing your own tapers, using adaptive criteria, specifying sample rate and time-halfbandwidth, etc.) are indeed useful to expose in general. If you're interested in pursuing this, it might be worth opening a GitHub issue to hash out the API. 
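To ground the discussion, the basic estimate itself is tiny — roughly the
following (an untested sketch; the function and argument names here are
placeholders, not a proposed API):

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_psd(x, NW=4.0, fs=1.0):
    # Average the periodograms of the signal windowed by each DPSS
    # (Slepian) taper; K = 2*NW - 1 tapers is a common default.
    N = len(x)
    K = int(2 * NW) - 1
    tapers = dpss(N, NW, Kmax=K)                       # shape (K, N)
    spectra = np.abs(np.fft.rfft(tapers * x, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)
    return freqs, spectra.mean(axis=0) / fs
```

The interesting API questions are in everything around this core: adaptive
weighting, user-supplied tapers, onesided/twosided output, and so on.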
Eric On Wed, Jul 1, 2020 at 8:20 AM Shahab Sanjari wrote: > Hi everyone. > > there is the Slepian function in scipy.signal.windows.dpss, which can be > used to do the multitaper analysis, but the actual multiplication > routine is missing. It is like the "pmtm" function in matlab, also > available in "spectrum.mtm" from PyPI. > > Why not include it in Scipy as well? it is just 5 lines!! here is the code: > > > https://github.com/xaratustrah/multitaper/blob/f3b9340606e4bc4357f382c00127e9c0706af57a/multitaper/multitaper.py#L8 > > I use this code regularly and it is pretty good. > > > cheers, > > Shahab > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 21 17:53:32 2020 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 21 Jul 2020 15:53:32 -0600 Subject: [SciPy-Dev] NumPy 1.19.1 released. Message-ID: Hi All, On behalf of the NumPy team I am pleased to announce that NumPy 1.19.1 has been released. This release supports Python 3.6-3.8 and may be built with the latest Python 3.9 beta. It fixes several bugs found in the 1.19.0 release, replaces several functions deprecated in the upcoming Python-3.9 release, has improved support for AIX, and has a number of development related updates to keep CI working with recent upstream changes. Downstream developers should use Cython >= 0.29.21 when building for Python 3.9 and Cython >= 0.29.16 when building for Python 3.8. OpenBLAS >= 3.7 is needed to avoid wrong results on the Skylake architecture. The NumPy Wheels for this release can be downloaded from PyPI , source archives, release notes, and wheel hashes are available from Github . Linux users will need pip >= 0.19.3 in order to install manylinux2010 and manylinux2014 wheels. *Contributors* A total of 15 people contributed to this release. People with a "+" by their names contributed a patch for the first time. - Abhinav Reddy + - Anirudh Subramanian - Antonio Larrosa + - Charles Harris - Chunlin Fang - Eric Wieser - Etienne Guesnet + - Kevin Sheppard - Matti Picus - Raghuveer Devulapalli - Roman Yurchak - Ross Barnowski - Sayed Adel - Sebastian Berg - Tyler Reddy *Pull requests merged* A total of 25 pull requests were merged for this release. - #16649: MAINT, CI: disable Shippable cache - #16652: MAINT: Replace PyUString_GET_SIZE with PyUnicode_GetLength. - #16654: REL: Fix outdated docs link - #16656: BUG: raise IEEE exception on AIX - #16672: BUG: Fix bug in AVX complex absolute while processing array of... - #16693: TST: Add extra debugging information to CPU features detection - #16703: BLD: Add CPU entry for Emscripten / WebAssembly - #16705: TST: Disable Python 3.9-dev testing. - #16714: MAINT: Disable use_hugepages in case of ValueError - #16724: BUG: Fix PyArray_SearchSorted signature. 
- #16768: MAINT: Fixes for deprecated functions in scalartypes.c.src
- #16772: MAINT: Remove unneeded call to PyUnicode_READY
- #16776: MAINT: Fix deprecated functions in scalarapi.c
- #16779: BLD, ENH: Add RPATH support for AIX
- #16780: BUG: Fix default fallback in genfromtxt
- #16784: BUG: Added missing return after raising error in methods.c
- #16795: BLD: update cython to 0.29.21
- #16832: MAINT: setuptools 49.2.0 emits a warning, avoid it
- #16872: BUG: Validate output size in bin- and multinomial
- #16875: BLD, MAINT: Pin setuptools
- #16904: DOC: Reconstruct Testing Guideline.
- #16905: TST, BUG: Re-raise MemoryError exception in test_large_zip's...
- #16906: BUG, DOC: Fix bad MPL kwarg.
- #16916: BUG: Fix string/bytes to complex assignment
- #16922: REL: Prepare for NumPy 1.19.1 release

Cheers,

Charles Harris
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Thu Jul 23 11:58:17 2020
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 23 Jul 2020 09:58:17 -0600
Subject: [SciPy-Dev] Disjoint set / union find data structure
In-Reply-To:
References:
Message-ID:

On Mon, Jul 20, 2020 at 6:37 AM Peter Larsen wrote:

> In many projects I find myself needing a disjoint set data structure
> https://en.wikipedia.org/wiki/Disjoint-set_data_structure.
> It would be very convenient if we had an implementation in SciPy.
>
> The codebase already contains two implementations, albeit not publicly
> accessible ones:
>
> https://github.com/scipy/scipy/blob/1d8b34b47d2b881ea904f4c9bf4c49b3fa36b29a/scipy/cluster/_hierarchy.pyx#L1074
>
> https://github.com/scipy/scipy/blob/8dba340293fe20e62e173bdf2c10ae208286692f/scipy/sparse/linalg/dsolve/SuperLU/SRC/sp_coletree.c#L40
>
> If there are no objections to its inclusion in SciPy I will write a PR. I
> tentatively propose to put it in scipy.cluster but other suggestions are
> welcome.
>

A union-find algorithm would be useful and has been proposed before, just
never happened. In my implementations I also add a link in the set elements
so that the disjoint sets can be iterated over, which was useful for my
work but is not part of the usual implementations. IIRC, an implementation
was shown in the discussion that took place when the graph algorithms were
added to scipy.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From asaadel1 at jhu.edu  Thu Jul 23 18:18:24 2020
From: asaadel1 at jhu.edu (Ali Saad-Eldin)
Date: Thu, 23 Jul 2020 22:18:24 +0000
Subject: [SciPy-Dev] ENH: Adding Graph Embedding Functionality to SciPy
In-Reply-To:
References:
Message-ID:

Hi all,

I'm curious whether the development team would be interested in the
addition of graph embedding functionality, specifically adding
Adjacency/Laplacian spectral embedding to scipy.sparse.csgraph (or anywhere
else it would make sense to have it). Both methods are useful in a variety
of graph applications, such as clustering nodes with similar connective
structure; in essence, they both amount to an SVD. Our current
implementations in GraSPy (available at ASE and LSE) input a graph
represented as a dense or sparse matrix, and return the appropriate
embedding.

Please let me know if you'd like to add these functions to scipy.

Best,
Ali Saad-Eldin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
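For concreteness, the core of the adjacency spectral embedding described
above is a thin layer over existing SciPy machinery — a rough sketch based
on the "it amounts to an SVD" description, not the GraSPy source:

```python
import numpy as np
from scipy.sparse.linalg import svds

def adjacency_spectral_embedding(A, d=2):
    # Embed each node as a row of U * sqrt(S), using the top-d singular
    # pairs of the (dense or sparse) adjacency matrix A.
    U, S, _ = svds(A.astype(np.float64), k=d)
    order = np.argsort(S)[::-1]          # svds returns ascending order
    return U[:, order] * np.sqrt(S[order])
```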
From tyler.je.reddy at gmail.com  Thu Jul 23 22:21:07 2020
From: tyler.je.reddy at gmail.com (Tyler Reddy)
Date: Thu, 23 Jul 2020 20:21:07 -0600
Subject: [SciPy-Dev] ANN: SciPy 1.5.2
Message-ID:

Hi all,

On behalf of the SciPy development team I'm pleased to announce the
release of SciPy 1.5.2, which is a bug fix release. Sources and binary
wheels can be found at: https://pypi.org/project/scipy/ and at:
https://github.com/scipy/scipy/releases/tag/v1.5.2

One of a few ways to install this release with pip:

pip install scipy==1.5.2

==========================
SciPy 1.5.2 Release Notes
==========================

SciPy 1.5.2 is a bug-fix release with no new features compared to 1.5.1.

Authors
=======

* Peter Bell
* Tobias Biester +
* Evgeni Burovski
* Thomas A Caswell
* Ralf Gommers
* Sturla Molden
* Andrew Nelson
* ofirr +
* Sambit Panda
* Ilhan Polat
* Tyler Reddy
* Atsushi Sakai
* Pauli Virtanen

A total of 13 people contributed to this release.
People with a "+" by their names contributed a patch for the first time.
This list of names is automatically generated, and may not be fully
complete.

Issues closed for 1.5.2
------------------------------

* `#3847 `__: Crash of interpolate.splprep(task=-1)
* `#7395 `__: splprep segfaults if fixed knots are specified
* `#10761 `__: scipy.signal.convolve2d produces incorrect values for large arrays
* `#11971 `__: DOC: search in devdocs returns wrong link
* `#12155 `__: BUG: Fix permutation of distance matrices in scipy.stats.multiscale_graphcorr
* `#12203 `__: Unable to install on PyPy 7.3.1 (Python 3.6.9)
* `#12316 `__: negative scipy.spatial.distance.correlation
* `#12422 `__: BUG: slsqp: ValueError: failed to initialize intent(inout) array...
* `#12428 `__: stats.truncnorm.rvs() never returns a scalar in 1.5
* `#12441 `__: eigvalsh inconsistent eigvals= subset_by_index=
* `#12445 `__: DOC: scipy.linalg.eigh
* `#12449 `__: Warnings are not filtered in csr_matrix.sum()
* `#12469 `__: SciPy 1.9 exception in LSQSphereBivariateSpline
* `#12487 `__: BUG: optimize: incorrect result from approx_fprime
* `#12493 `__: CI: GitHub Actions for maintenance branches
* `#12533 `__: eigh gives incorrect results
* `#12579 `__: BLD, MAINT: distutils issues in wheels repo

Pull requests for 1.5.2
-------------------------------

* `#12156 `__: BUG: Fix permutation of distance matrices in scipy.stats.multiscale_graphcorr
* `#12238 `__: BUG: Use 64-bit indexing in convolve2d to avoid overflow
* `#12256 `__: BLD: Build lsap as a single extension instead of extension +...
* `#12320 `__: BUG: spatial: avoid returning negative correlation distance
* `#12383 `__: ENH: Make cKDTree.tree more efficient
* `#12392 `__: DOC: update scipy-sphinx-theme
* `#12430 `__: BUG: truncnorm and geninvgauss never return scalars from rvs
* `#12437 `__: BUG: optimize: cast bounds to floats in new_bounds_to_old/old_bounds_to_new
* `#12442 `__: MAINT:linalg: Fix for input args of eigvalsh
* `#12461 `__: MAINT: sparse: write matrix/asmatrix wrappers without warning...
* `#12478 `__: BUG: fix array_like input defects and add tests for all functions...
* `#12488 `__: BUG: fix approx_derivative step size.
Closes #12487 * `#12500 `__: CI: actions branch trigger fix * `#12501 `__: CI: actions branch trigger fix * `#12504 `__: BUG: cKDTreeNode use after free * `#12529 `__: MAINT: allow graceful docs re-upload * `#12538 `__: BUG:linalg: eigh type parameter handling corrected * `#12560 `__: MAINT: truncnorm.rvs compatibility for \`Generator\` * `#12562 `__: redo gh-12188: fix segfaults in splprep with fixed knots * `#12586 `__: BLD: Add -std=c99 to sigtools to compile with C99 * `#12590 `__: CI: Add GCC 4.8 entry to travis build matrix * `#12591 `__: BLD: fix cython error on master-branch cython Checksums ========= MD5 ~~~ 2e046d26cdc4241a6a5b2907d57528df scipy-1.5.2-cp36-cp36m-macosx_10_9_x86_64.whl 902dea66453e2fa0616e9479970986f5 scipy-1.5.2-cp36-cp36m-manylinux1_i686.whl e130db080706d9f4ce22d8493c8e1ce2 scipy-1.5.2-cp36-cp36m-manylinux1_x86_64.whl 721f16bae600731e479a5b4e98ce9a97 scipy-1.5.2-cp36-cp36m-win32.whl a3171cfe38618d51acbfb8d1b39ac612 scipy-1.5.2-cp36-cp36m-win_amd64.whl c9f733d4d2e82c098c08760963dafaf8 scipy-1.5.2-cp37-cp37m-macosx_10_9_x86_64.whl 53ba6c502d09145b38e0e857b2d4a273 scipy-1.5.2-cp37-cp37m-manylinux1_i686.whl b9db33944ac4147936a7f42df8e95ad2 scipy-1.5.2-cp37-cp37m-manylinux1_x86_64.whl be9e8bfdf0e5e0914d1e1605be26d9c0 scipy-1.5.2-cp37-cp37m-win32.whl 848fa7b82a25d0ce36710ccc47ebc2ca scipy-1.5.2-cp37-cp37m-win_amd64.whl 590cd3b70a2dc8664896d6b9e2e5fc6d scipy-1.5.2-cp38-cp38-macosx_10_9_x86_64.whl 7fdbb19c15702b98319ea4ea32df8458 scipy-1.5.2-cp38-cp38-manylinux1_i686.whl 301f3a873e1bfef70d6f594c489fafe8 scipy-1.5.2-cp38-cp38-manylinux1_x86_64.whl 8c08ac0f55810e89e336eb3bf5a7b337 scipy-1.5.2-cp38-cp38-win32.whl 711f5c47c801dc79bead7d40669fd8c9 scipy-1.5.2-cp38-cp38-win_amd64.whl 620fc39f371e04a76af5d0290f8d3753 scipy-1.5.2.tar.gz 5bc188f21054a2ecff74fae40dd298da scipy-1.5.2.tar.xz 17bc80802955d100f6c1335594eda29a scipy-1.5.2.zip SHA256 ~~~~~~ cca9fce15109a36a0a9f9cfc64f870f1c140cb235ddf27fe0328e6afb44dfed0 scipy-1.5.2-cp36-cp36m-macosx_10_9_x86_64.whl 1c7564a4810c1cd77fcdee7fa726d7d39d4e2695ad252d7c86c3ea9d85b7fb8f scipy-1.5.2-cp36-cp36m-manylinux1_i686.whl 07e52b316b40a4f001667d1ad4eb5f2318738de34597bd91537851365b6c61f1 scipy-1.5.2-cp36-cp36m-manylinux1_x86_64.whl d56b10d8ed72ec1be76bf10508446df60954f08a41c2d40778bc29a3a9ad9bce scipy-1.5.2-cp36-cp36m-win32.whl 8e28e74b97fc8d6aa0454989db3b5d36fc27e69cef39a7ee5eaf8174ca1123cb scipy-1.5.2-cp36-cp36m-win_amd64.whl 6e86c873fe1335d88b7a4bfa09d021f27a9e753758fd75f3f92d714aa4093768 scipy-1.5.2-cp37-cp37m-macosx_10_9_x86_64.whl a0afbb967fd2c98efad5f4c24439a640d39463282040a88e8e928db647d8ac3d scipy-1.5.2-cp37-cp37m-manylinux1_i686.whl eecf40fa87eeda53e8e11d265ff2254729d04000cd40bae648e76ff268885d66 scipy-1.5.2-cp37-cp37m-manylinux1_x86_64.whl 315aa2165aca31375f4e26c230188db192ed901761390be908c9b21d8b07df62 scipy-1.5.2-cp37-cp37m-win32.whl ec5fe57e46828d034775b00cd625c4a7b5c7d2e354c3b258d820c6c72212a6ec scipy-1.5.2-cp37-cp37m-win_amd64.whl fc98f3eac993b9bfdd392e675dfe19850cc8c7246a8fd2b42443e506344be7d9 scipy-1.5.2-cp38-cp38-macosx_10_9_x86_64.whl a785409c0fa51764766840185a34f96a0a93527a0ff0230484d33a8ed085c8f8 scipy-1.5.2-cp38-cp38-manylinux1_i686.whl 0a0e9a4e58a4734c2eba917f834b25b7e3b6dc333901ce7784fd31aefbd37b2f scipy-1.5.2-cp38-cp38-manylinux1_x86_64.whl dac09281a0eacd59974e24525a3bc90fa39b4e95177e638a31b14db60d3fa806 scipy-1.5.2-cp38-cp38-win32.whl 92eb04041d371fea828858e4fff182453c25ae3eaa8782d9b6c32b25857d23bc scipy-1.5.2-cp38-cp38-win_amd64.whl 066c513d90eb3fd7567a9e150828d39111ebd88d3e924cdfc9f8ce19ab6f90c9 
scipy-1.5.2.tar.gz 28d5d2e9af6ca5c0352cd83fb64191f2d8e883ab5287a221ba7a175c8cc2ccbe scipy-1.5.2.tar.xz a9054595a370f24d68f7a694037316b69ae80f5837323d567f76cde055189c08 scipy-1.5.2.zip -------------- next part -------------- An HTML attachment was scrubbed... URL: From gulliver.harry at gmail.com Fri Jul 24 05:53:25 2020 From: gulliver.harry at gmail.com (Harry Gulliver) Date: Fri, 24 Jul 2020 10:53:25 +0100 Subject: [SciPy-Dev] Wallenius hypergeometric distribution Message-ID: Hi all, new (potential) contributor here! I'm working on a project using Wallenius' non-central hypergeometric distribution atm and noticed it's not yet available in scipy.stats, so thought I'd volunteer to add it. Could possibly also do Fisher's hypergeom while I'm at it. Anything I ought to know before just writing some code and making a pull request? How can I best help? Thanks! Harry -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhaberla at calpoly.edu Fri Jul 24 10:54:26 2020 From: mhaberla at calpoly.edu (Matt Haberland) Date: Fri, 24 Jul 2020 07:54:26 -0700 Subject: [SciPy-Dev] Wallenius hypergeometric distribution In-Reply-To: References: Message-ID: Funny that you should mention this now! On Tuesday we got permission from Agner Fog to use his C++ implementation (used in the R package BiasedUrn, https://cran.r-project.org/web/packages/BiasedUrn/index.html) under SciPy's license. Would you be interested in wrapping a C++ implementation? On Fri, Jul 24, 2020, 2:53 AM Harry Gulliver wrote: > Hi all, new (potential) contributor here! > > I'm working on a project using Wallenius' non-central hypergeometric > distribution atm and noticed it's not yet available in scipy.stats, so > thought I'd volunteer to add it. Could possibly also do Fisher's hypergeom > while I'm at it. > > Anything I ought to know before just writing some code and making a pull > request? How can I best help? > > Thanks! > Harry > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gulliver.harry at gmail.com Fri Jul 24 12:53:56 2020 From: gulliver.harry at gmail.com (Harry Gulliver) Date: Fri, 24 Jul 2020 17:53:56 +0100 Subject: [SciPy-Dev] Wallenius hypergeometric distribution Message-ID: Serendipitous timing! In principle I'd be happy to do that, but I know almost no C++, so I'm not sure if I'd be able to... Harry Date: Fri, 24 Jul 2020 07:54:26 -0700 From: Matt Haberland To: SciPy Developers List Subject: Re: [SciPy-Dev] Wallenius hypergeometric distribution Message-ID: Content-Type: text/plain; charset="utf-8" Funny that you should mention this now! On Tuesday we got permission from Agner Fog to use his C++ implementation (used in the R package BiasedUrn, https://cran.r-project.org/web/packages/BiasedUrn/index.html) under SciPy's license. Would you be interested in wrapping a C++ implementation? On Fri, Jul 24, 2020, 2:53 AM Harry Gulliver wrote: > Hi all, new (potential) contributor here! > > I'm working on a project using Wallenius' non-central hypergeometric > distribution atm and noticed it's not yet available in scipy.stats, so > thought I'd volunteer to add it. Could possibly also do Fisher's hypergeom > while I'm at it. > > Anything I ought to know before just writing some code and making a pull > request? How can I best help? > > Thanks! 
> Harry -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilhanpolat at gmail.com Fri Jul 24 15:59:51 2020 From: ilhanpolat at gmail.com (Ilhan Polat) Date: Fri, 24 Jul 2020 21:59:51 +0200 Subject: [SciPy-Dev] Wallenius hypergeometric distribution In-Reply-To: References: Message-ID: No worries. We can help with the C++ wrapping parts. On Fri, 24 Jul 2020, 18:55 Harry Gulliver, wrote: > Serendipitous timing! > > In principle I'd be happy to do that, but I know almost no C++, so I'm not > sure if I'd be able to... > > Harry > > > > > > Date: Fri, 24 Jul 2020 07:54:26 -0700 > From: Matt Haberland > To: SciPy Developers List > Subject: Re: [SciPy-Dev] Wallenius hypergeometric distribution > Message-ID: > < > CADuxUizWkhy1nkns+09hAkTXQ7TkyOhaVYwzMhdbp2TyZMOvaQ at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > Funny that you should mention this now! On Tuesday we got permission from > Agner Fog to use his C++ implementation (used in the R package BiasedUrn, > https://cran.r-project.org/web/packages/BiasedUrn/index.html) under > SciPy's > license. Would you be interested in wrapping a C++ implementation? > > On Fri, Jul 24, 2020, 2:53 AM Harry Gulliver > wrote: > > > Hi all, new (potential) contributor here! > > > > I'm working on a project using Wallenius' non-central hypergeometric > > distribution atm and noticed it's not yet available in scipy.stats, so > > thought I'd volunteer to add it. Could possibly also do Fisher's > hypergeom > > while I'm at it. > > > > Anything I ought to know before just writing some code and making a pull > > request? How can I best help? > > > > Thanks! > > Harry > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nowotnym at gmail.com Fri Jul 24 22:00:38 2020 From: nowotnym at gmail.com (Michael Nowotny) Date: Fri, 24 Jul 2020 19:00:38 -0700 Subject: [SciPy-Dev] Entropy Calculations Message-ID: <07B1CA49-F62E-4FF1-AEB1-F470E62C4C79@gmail.com> Dear SciPy developers, I have noticed that the statistical functions for the calculation of entropy and KL divergence currently only support discrete distributions for which the probability mass function is known. I recently needed to compute various information theoretic measures from samples of distributions and created the package `Divergence`. It offers functionality for entropy, cross entropy, relative entropy, Jensen-Shannon divergence, joint entropy, conditional entropy, and mutual information and is available on GitHub at https://github.com/michaelnowotny/divergence . It supports samples from both discrete and continuous distributions. Continuous distributions are implement via numerical integration of kernel density estimates generated from the sample. I would be happy to contribute some or all of its functionality to SciPy. Please let me know if you are interested. Thank you, Michael -------------- next part -------------- An HTML attachment was scrubbed... 
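URL: 

For concreteness, the continuous case Michael describes - fitting a kernel density estimate to a sample and integrating it numerically - can be sketched with standard SciPy tools. The helper below is illustrative only, not code from the Divergence package itself:

import numpy
from scipy.stats import gaussian_kde
from scipy.integrate import quad

def entropy_from_sample(sample):
    # Fit a Gaussian KDE to the sample, then integrate -p(x)*log(p(x))
    # numerically over a range wide enough to cover the bulk of the density.
    kde = gaussian_kde(sample)
    def integrand(x):
        p = kde(x)[0]
        return -p * numpy.log(p) if p > 0.0 else 0.0
    lo = sample.min() - 3.0 * sample.std()
    hi = sample.max() + 3.0 * sample.std()
    h, _ = quad(integrand, lo, hi)
    return h

x = numpy.random.normal(size=5000)
print(entropy_from_sample(x))  # ~0.5*log(2*pi*e) ~= 1.4189 for a standard normal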
From andrea.gavana at gmail.com  Sun Jul 26 10:05:20 2020
From: andrea.gavana at gmail.com (Andrea Gavana)
Date: Sun, 26 Jul 2020 16:05:20 +0200
Subject: [SciPy-Dev] Global Optimization Benchmarks
Message-ID: 

Dear SciPy developers & users,

I have a couple of new derivative-free, global optimization algorithms I've been working on lately - plus some improvements to AMPGO and a few more benchmark functions - and I'd like to rerun the benchmarks as I did back in 2013 (!!!).

In doing so, I'd like to remove some of the least interesting/worst performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace them with the ones currently available in SciPy - differential_evolution, SHGO and dual_annealing.

Everything seems good and dandy, but it appears to me that SHGO does not accept an initial point for the optimization process - which makes the whole "run the optimization from 100 different starting points for each benchmark" approach a bit moot.

I am no expert on SHGO, so maybe there is an alternative way to "simulate" the changing of the starting point for the optimization? Or maybe some other approach to make it consistent across optimizers?

Any suggestion is more than welcome.

Andrea.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefan.c.endres at gmail.com  Sun Jul 26 11:18:33 2020
From: stefan.c.endres at gmail.com (Stefan Endres)
Date: Sun, 26 Jul 2020 17:18:33 +0200
Subject: Re: [SciPy-Dev] Global Optimization Benchmarks
In-Reply-To: 
References: 
Message-ID: 

Dear Andrea,

SHGO does not use an initial starting point, only the bounds (which may also be specified as none or infinite). The benchmarks that I ran for the publication used the global minimum as a stopping criterion (together with performance profiles that demonstrate the final results). For this particular benchmarking framework I would propose simply using a single iteration ((dim)^2 + 1 points) or specifying 100 starting points.

A script to use 100 sampling points in a single iteration with the sobol sampling method:

``` result = shgo(obj_fun, bounds, n=100, sampling_method='sobol') ```

If you would like to add a more stochastic element to this performance comparison, I think the best approach would be to use a different seed for the sampling method (in my experience this does not make much of a difference to the performance in low dimensional problems); otherwise run shgo only once and/or with increasing numbers of iterations. Another possibility is to add a stochastic element to the bounds.

Please let me know if you need any help.

Best regards,
Stefan Endres

On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana wrote:
> Dear SciPy developers & users,
>
> I have a couple of new derivative-free, global optimization
> algorithms I've been working on lately - plus some improvements to AMPGO
> and a few more benchmark functions - and I'd like to rerun the benchmarks
> as I did back in 2013 (!!!).
>
> In doing so, I'd like to remove some of the least interesting/worst
> performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace
> them with the ones currently available in SciPy - differential_evolution,
> SHGO and dual_annealing.
>
> Everything seems good and dandy, but it appears to me that SHGO does not
> accept an initial point for the optimization process - which makes the
> whole "run the optimization from 100 different starting points for each
> benchmark" approach a bit moot.
>
> I am no expert on SHGO, so maybe there is an alternative way to "simulate"
> the changing of the starting point for the optimization? Or maybe some
> other approach to make it consistent across optimizers?
>
> Any suggestion is more than welcome.
>
> Andrea.
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
--
Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)

Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany

Work phone (DE): +49 (0) 421 218 51238
Cellphone (DE): +49 (0) 160 949 86417
Cellphone (ZA): +27 (0) 82 972 42 89
E-mail (work): s.endres at iwt.uni-bremen.de
Website: https://stefan-endres.github.io/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From andrea.gavana at gmail.com  Sun Jul 26 11:48:03 2020
From: andrea.gavana at gmail.com (Andrea Gavana)
Date: Sun, 26 Jul 2020 17:48:03 +0200
Subject: Re: [SciPy-Dev] Global Optimization Benchmarks
In-Reply-To: 
References: 
Message-ID: 

Hi Stefan,

On Sun, 26 Jul 2020 at 17.19, Stefan Endres wrote:
> Dear Andrea,
>
> SHGO does not use an initial starting point, only the bounds (which may
> also be specified as none or infinite). The benchmarks that I ran for
> the publication used the global minimum as a stopping criterion (together
> with performance profiles that demonstrate the final results). For this
> particular benchmarking framework I would propose simply using a single
> iteration ((dim)^2 + 1 points) or specifying 100 starting points.
>
> A script to use 100 sampling points in a single iteration with the sobol
> sampling method:
>
> ``` result = shgo(obj_fun, bounds, n=100, sampling_method='sobol') ```
>
> If you would like to add a more stochastic element to this performance
> comparison, I think the best approach would be to use a different seed for
> the sampling method (in my experience this does not make much of a
> difference to the performance in low dimensional problems); otherwise run
> shgo only once and/or with increasing numbers of iterations. Another
> possibility is to add a stochastic element to the bounds.
>
> Please let me know if you need any help.

Thank you for your answer. The approach I had several years ago - and that I'd like to keep - was to generate 100 random starting points for each benchmark and run all the global optimizers from each of those points: see http://infinity77.net/global_optimization/

Those benchmarks also relied on the global optimum as a stopping criterion plus a maximum number of objective function evaluations (2,000), whichever is reached first. Of course, reaching the maximum number of function evaluations without getting to the global optimum (plus a pre-specified tolerance) means a failure in the context of my original benchmarks.

I have now a few more benchmark functions plus a couple of new algorithms and I'd like to take the same steps. I also have a slightly different approach to bounds/global optima locations, so that algorithms that rely on guessing global optima by running to the center of the domain (or on the bounds) will have a less easy life this time. Bounds-shifting is what I used to do, but you have to be careful as some of the benchmark functions can be undefined outside those bounds (i.e., returning NaNs) or they can have lower global optima outside the bounds.
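For concreteness, the per-run bound shift meant here could look like the following sketch - the helper name and the 10% maximum shift are illustrative, not code from the actual benchmark suite:

import numpy

def shifted_bounds(bounds, max_shift=0.1):
    # Shift each (low, high) pair by a random fraction of its width, so the
    # global optimum no longer sits at a predictable position relative to
    # the search domain.
    out = []
    for low, high in bounds:
        delta = (high - low) * max_shift * numpy.random.uniform(-1.0, 1.0)
        out.append((low + delta, high + delta))
    return out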
Shrinking the bounds is of course always a possibility, but it makes life easier for the algorithms, and it will fail as a strategy if a benchmark has a global optimum exactly at (one or more of) the original bounds.

That said, I didn't know that the sampling process of SHGO relied on random numbers: that is good to know. As an alternative I can do as you suggested and vary the seed 100 times - one of the new algorithms I have also does not use an initial point, so it was already my strategy to change the seed for that one. I can simply do the same for SHGO.

I'm still running with an old Python/Numpy/SciPy combination (for legacy reasons), so I'll have to see if differential_evolution and dual_annealing can be simply copied over locally and run - I tested SHGO and it runs with no problem.

Andrea.

> Best regards,
> Stefan Endres
>
> On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana wrote:
>> Dear SciPy developers & users,
>>
>> I have a couple of new derivative-free, global optimization
>> algorithms I've been working on lately - plus some improvements to AMPGO
>> and a few more benchmark functions - and I'd like to rerun the benchmarks
>> as I did back in 2013 (!!!).
>>
>> In doing so, I'd like to remove some of the least interesting/worst
>> performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace
>> them with the ones currently available in SciPy - differential_evolution,
>> SHGO and dual_annealing.
>>
>> Everything seems good and dandy, but it appears to me that SHGO does not
>> accept an initial point for the optimization process - which makes the
>> whole "run the optimization from 100 different starting points for each
>> benchmark" approach a bit moot.
>>
>> I am no expert on SHGO, so maybe there is an alternative way to
>> "simulate" the changing of the starting point for the optimization? Or
>> maybe some other approach to make it consistent across optimizers?
>>
>> Any suggestion is more than welcome.
>>
>> Andrea.
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>
> --
> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
>
> Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials Engineering
> IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
>
> Work phone (DE): +49 (0) 421 218 51238
> Cellphone (DE): +49 (0) 160 949 86417
> Cellphone (ZA): +27 (0) 82 972 42 89
> E-mail (work): s.endres at iwt.uni-bremen.de
> Website: https://stefan-endres.github.io/
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefan.c.endres at gmail.com  Sun Jul 26 13:47:37 2020
From: stefan.c.endres at gmail.com (Stefan Endres)
Date: Sun, 26 Jul 2020 19:47:37 +0200
Subject: Re: [SciPy-Dev] Global Optimization Benchmarks
In-Reply-To: 
References: 
Message-ID: 

Hi Andrea,

Those benchmarks also relied on the global optimum as a stopping criterion plus a maximum number of objective function evaluations (2,000), whichever is reached first. Of course, reaching the maximum number of function evaluations without getting to the global optimum (plus a pre-specified tolerance) means a failure in the context of my original benchmarks.
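In code, that per-run success test might look like this hypothetical harness, with `benchmark` exposing the `fun`, `_bounds`, `fglob` and `nfev` attributes used elsewhere in this thread, and `optimizer` being any solver with a SciPy-style `(fun, bounds)` interface such as shgo or dual_annealing:

def run_once(optimizer, benchmark, tolfun=1e-2, maxfun=2000):
    # A run succeeds only if the known global optimum is reached within
    # tolfun before the evaluation budget maxfun is spent.
    res = optimizer(benchmark.fun, list(benchmark._bounds))
    solved = (abs(res.fun - benchmark.fglob) <= tolfun
              and benchmark.nfev <= maxfun)
    return solved, benchmark.nfev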
These criteria can be set with the following arguments:

options = {'maxfev': 2000,
           'f_min': f_min,
           'f_tol': f_tol}

result = shgo(obj_fun, bounds, n=30, sampling_method='sobol', options=options)

The algorithm should iterate (1 iteration is: check stopping criteria --> sample `n` points --> triangulate --> find minimisers) until a global minimum within the specified tolerance is found or the number of function evaluations runs out. A lower number of sampling points per iteration tends to show higher performance, but I know that there is a strange bug where the algorithm keeps running within the same attractor even if this is supposed to be added to the triangulation, so a higher `n` will work better on some test suites until this is fixed.

Most people tend to use the non-iterative version of the algorithm, which is the most stable, so there might be a few bugs when running it iteratively in general.

I also have a slightly different approach to bounds/global optima locations so that algorithms that rely on guessing global optima by running to the center of the domain (or on the bounds) will have a less easy life this time.

I would also recommend looking into using GKLS generators which, to my understanding, is standard practice to avoid this kind of bias ( https://dl.acm.org/doi/10.1145/962437.962444 ) in benchmarking GO functions.

Bounds-shifting is what I used to do but you have to be careful as some of the benchmark functions can be undefined outside those bounds (i.e., returning NaNs) or they can have lower global optima outside the bounds.

At least for SHGO the NaN values (and other non-floating point objects) should not be a problem; it was partially developed to deal with discontinuities in the objective function. Returning a lower value might be an issue. SHGO is not supposed to be able to escape the bounds, but if the bounds fed to it are outside the actual bounds I think a strong penalty function would need to be added to the objective function or the algorithm will terminate earlier with the lower optimum.

Another reason why I would recommend adding randomness to the objective functions (using something like GKLS) instead of the hyperparameters/bounds is that the sequences might lose their low-discrepancy properties with different seeds, but I am not an expert on how adjusting those sequences affects performance.

I didn't know that the sampling process of SHGO relied on random numbers

The Sobol sequence is technically quasi-random, but a true random number generator can also be used by specifying a random sampling function to the sampling_method argument, the main difference being that most RNGs are biased towards the centre of the hypercube. On the other hand the default sampling behaviour relies on sub-triangulations of the hyperrectangle (similar to DIRECT), but the performance is almost identical to the Sobol sequence. Sub-triangulations are biased towards boundaries and centres depending on whether the vertices or centroids are used in the actual sampling, so I would not recommend this for your benchmarks.

Best regards,

Stefan Endres

On Sun, Jul 26, 2020 at 5:49 PM Andrea Gavana wrote:
> Hi Stefan,
>
> On Sun, 26 Jul 2020 at 17.19, Stefan Endres
> wrote:
>
>> Dear Andrea,
>>
>> SHGO does not use an initial starting point, only the bounds (which may
>> also be specified as none or infinite). The benchmarks that I ran for
>> the publication used the global minimum as a stopping criterion (together
>> with performance profiles that demonstrate the final results).
For this >> particular benchmarking framework I would propose simply using a single >> iteration ((dim)^2 +1 points) or specifying 100 starting points. >> >> A script to use 100 sampling points in a single iteration with the sobol >> sampling method: >> >> ``` result = shgo(obj_fun, bounds, n=100, sampling_method='sobol') ``` >> >> >> If you would like to add a more stochastic element to this performance I >> think the best approach would be to use a different seed for the sampling >> method (in my experience this does not make much of a difference to the >> performance in low dimensional problems), otherwise run shgo only once >> and/or with increasing numbers of iterations. Another possibility is to add >> a stochastic element to the bounds. >> >> Please let me know if you need any help. >> > > > Thank you for your answer. The approach I had several years ago - and that > I?d like to keep - was to generate 100 random starting points for each > benchmark and run all the global optimizers from that point: see > http://infinity77.net/global_optimization/ > > Those benchmarks also relied on the global optimum as a stopping criterion > plus a maximum number of objective function evaluations (2,000), whichever > is reached first. Of course, reaching the maximum number of function > evaluations without getting to the global optimum (plus a pre-specified > tolerance) means a failure in the context of my original benchmarks. > > I have now a few more benchmark functions plus a couple of new algorithms > and I?d like to take the same steps. I also have a slightly different > approach to bounds/global optima locations so that algorithms that rely on > guessing global optima by running to the center of the domain (or on the > bounds) will have a less easy life this time. Bounds-shifting is what I > used to do but you have to be careful as some of the benchmark functions > can be undefined outside those bounds (I.e., returning NaNs) or they can > have lower global optima outside the bounds. Shrinking the bounds is of > course always a possibility but it makes life easier to the algorithms and > it will fail as a strategy if a benchmark has a global optimum exactly at > (one or more of) the original bounds. > > That said, I didn?t know that the sampling process of SHGO relied on > random numbers: that is good to know, as an alternative I can do as you > suggested and vary the seed 100 times - one of the new algorithms I have > also does not use an initial point so it was already my strategy to change > the seed for that one. I can simply do the same for SHGO. > > I?m running still with an old Python/Numpy/SciPy combination (for legacy > reasons) so I?ll have to see if differential_evolution and dual_annealing > can be simply copied over locally and run - I tested SHGO and it runs with > no problem. > > Andrea. > > >> Best regards, >> Stefan Endres >> >> >> >> >> On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana >> wrote: >> >>> Dear SciPy developers & users, >>> >>> I have a couple of new derivative-free, global optimization >>> algorithms I?ve been working on lately - plus some improvements to AMPGO >>> and a few more benchmark functions - and I?d like to rerun the benchmarks >>> as I did back in 2013 (!!!). >>> >>> In doing so, I?d like to remove some of the least interesting/worst >>> performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace >>> them with the ones currently available in SciPy - differential_evolution, >>> SHGO and dual_annealing. 
>>> >>> Everything seems good and dandy, but it appears to me that SHGO does not >>> accept an initial point for the optimization process - which makes the >>> whole ?run the optimization from 100 different starting points for each >>> benchmark? a bit moot. >>> >>> I am no expert on SHGO, so maybe there is an alternative way to >>> ?simulate? the changing of the starting point for the optimization? Or >>> maybe some other approach to make it consistent across optimizers? >>> >>> Any suggestion is more than welcome. >>> >>> Andrea. >>> >> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at python.org >>> https://mail.python.org/mailman/listinfo/scipy-dev >>> >> >> >> -- >> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering) >> >> Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials >> Engineering IWT, Badgasteiner Stra?e 3, 28359 Bremen, Germany >> >> Work phone (DE): +49 (0) 421 218 51238 >> Cellphone (DE): +49 (0) 160 949 86417 >> Cellphone (ZA): +27 (0) 82 972 42 89 >> E-mail (work): s.endres at iwt.uni-bremen.de >> Website: https://stefan-endres.github.io/ >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -- Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering) Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials Engineering IWT, Badgasteiner Stra?e 3, 28359 Bremen, Germany Work phone (DE): +49 (0) 421 218 51238 Cellphone (DE): +49 (0) 160 949 86417 Cellphone (ZA): +27 (0) 82 972 42 89 E-mail (work): s.endres at iwt.uni-bremen.de Website: https://stefan-endres.github.io/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Mon Jul 27 02:59:34 2020 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 27 Jul 2020 08:59:34 +0200 Subject: [SciPy-Dev] Global Optimization Benchmarks In-Reply-To: References: Message-ID: Hi Stefan, On Sun, 26 Jul 2020 at 19:48, Stefan Endres wrote: > Hi Andrea, > > Those benchmarks also relied on the global optimum as a stopping criterion > plus a maximum number of objective function evaluations (2,000), whichever > is reached first. Of course, reaching the maximum number of function > evaluations without getting to the global optimum (plus a pre-specified > tolerance) means a failure in the context of my original benchmarks. > > These criteria can be set with the following arguments: > > options = {'maxfev': 2000, > 'f_min': f_min, > 'f_tol': f_tol} > > result = shgo(obj_fun, bounds, n=30, sampling_method='sobol', options=options) > > The algorithm should iterate (1 iteration is: check stopping criteria --> > sample `n` points --> triangulate --> find minimisers) until a global > minimum within the specified tolerance is found or the number of function > evaluations run out. A lower number of sampling points per iteration tends > to show higher performance, but I know that there is a strange bug where > the algorithm keeps running within the same attractor even if this is > supposed to be added to the triangulation, so a higher`n` will work better > on some test suites until this is fixed. > > Most people tend to use the non-iterative version of the algorithm which > is the most stable so there might be a few bugs running it iteratively in > general. 
>
> I also have a slightly different approach to bounds/global optima
> locations so that algorithms that rely on guessing global optima by running
> to the center of the domain (or on the bounds) will have a less easy life
> this time.
>
> I would also recommend looking into using GKLS generators which, to my
> understanding, is standard practice to avoid this kind of bias (
> https://dl.acm.org/doi/10.1145/962437.962444 ) in benchmarking GO
> functions.
>
> Bounds-shifting is what I used to do but you have to be careful as some of
> the benchmark functions can be undefined outside those bounds (i.e.,
> returning NaNs) or they can have lower global optima outside the bounds.
>
> At least for SHGO the NaN values (and other non-floating point objects)
> should not be a problem, it was partially developed to deal with
> discontinuities in the objective function. Returning a lower value might be
> an issue. SHGO is not supposed to be able to escape the bounds, but if the
> bounds fed to it are outside the actual bounds I think a strong penalty
> function would need to be added to the objective function or the algorithm
> will terminate earlier with the lower optimum.
>
> Another reason why I would recommend adding randomness to the objective
> functions (using something like GKLS) instead of the hyperparameters/bounds
> is that the sequences might lose their low-discrepancy properties
> with different seeds, but I am not an expert on how adjusting those
> sequences affects performance.
>
> I didn't know that the sampling process of SHGO relied on random numbers
>
> The Sobol sequence is technically quasi-random, but a true random number
> generator can also be used by specifying a random sampling function to the
> sampling_method argument, the main difference being that most RNGs are
> biased towards the centre of the hypercube. On the other hand the default
> sampling behaviour relies on sub-triangulations of the hyperrectangle
> (similar to DIRECT), but the performance is almost identical to the Sobol
> sequence. Sub-triangulations are biased towards boundaries and centres
> depending on whether the vertices or centroids are used in the actual sampling,
> so I would not recommend this for your benchmarks.
>
> Best regards,
>
> Stefan Endres
>

Thank you for the detailed answer. I just ran a very simple test using the Rosenbrock function, and I am sure I am doing something wrong here. Looking at the benchmark results on your webpage, it seems to me that SHGO is an extremely good algorithm. I have created this simple test:

import numpy
from shgo import shgo

# -------------------------------------------------------------------------------- #

class Rosenbrock(object):

    def __init__(self, dimensions=2):

        self.N = dimensions
        self.nfev = 0
        self._bounds = zip([-30.] * self.N, [30.0] * self.N)

        self.global_optimum = [[1.0 for _ in range(self.N)]]
        self.fglob = 0.0

    def fun(self, x, *args):
        self.nfev += 1
        f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0)
        print(self.nfev, f)
        return f

# -------------------------------------------------------------------------------------- #

def Test():

    maxfun = 2000
    tolfun = 1e-2
    numpy.random.seed(123456)

    benchmark = Rosenbrock()
    bounds = benchmark._bounds
    fglob = benchmark.fglob

    options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun}
    res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
    xf, yf = res.x, res.fun

    print(xf, yf, fglob, benchmark.nfev)


if __name__ == '__main__':
    Test()

And I get the following numerical results and a graph of the evolution of the objective function:

array([ 0.99999555,  0.99999111]), 1.9768160305004305e-11, 0.0, 2113

[image: image.png]

I am probably too ignorant about SHGO, but I would have thought the Rosenbrock function to be an easy pick for SHGO - i.e., convergence to the global optimum achieved much faster than 2,000 evaluations. Maybe there are some settings that I should change?

Andrea.

>
> On Sun, Jul 26, 2020 at 5:49 PM Andrea Gavana
> wrote:
>> Hi Stefan,
>> On Sun, 26 Jul 2020 at 17.19, Stefan Endres
>> wrote:
>>> Dear Andrea,
>>> SHGO does not use an initial starting point, only the bounds (which may
>>> also be specified as none or infinite). The benchmarks that I ran for
>>> the publication used the global minimum as a stopping criterion (together
>>> with performance profiles that demonstrate the final results). For this
>>> particular benchmarking framework I would propose simply using a single
>>> iteration ((dim)^2 +1 points) or specifying 100 starting points.
Bounds-shifting is what I >> used to do but you have to be careful as some of the benchmark functions >> can be undefined outside those bounds (I.e., returning NaNs) or they can >> have lower global optima outside the bounds. Shrinking the bounds is of >> course always a possibility but it makes life easier to the algorithms and >> it will fail as a strategy if a benchmark has a global optimum exactly at >> (one or more of) the original bounds. >> >> That said, I didn?t know that the sampling process of SHGO relied on >> random numbers: that is good to know, as an alternative I can do as you >> suggested and vary the seed 100 times - one of the new algorithms I have >> also does not use an initial point so it was already my strategy to change >> the seed for that one. I can simply do the same for SHGO. >> >> I?m running still with an old Python/Numpy/SciPy combination (for legacy >> reasons) so I?ll have to see if differential_evolution and dual_annealing >> can be simply copied over locally and run - I tested SHGO and it runs with >> no problem. >> >> Andrea. >> >> >>> Best regards, >>> Stefan Endres >>> >>> >>> >>> >>> On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana >>> wrote: >>> >>>> Dear SciPy developers & users, >>>> >>>> I have a couple of new derivative-free, global optimization >>>> algorithms I?ve been working on lately - plus some improvements to AMPGO >>>> and a few more benchmark functions - and I?d like to rerun the benchmarks >>>> as I did back in 2013 (!!!). >>>> >>>> In doing so, I?d like to remove some of the least interesting/worst >>>> performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace >>>> them with the ones currently available in SciPy - differential_evolution, >>>> SHGO and dual_annealing. >>>> >>>> Everything seems good and dandy, but it appears to me that SHGO does >>>> not accept an initial point for the optimization process - which makes the >>>> whole ?run the optimization from 100 different starting points for each >>>> benchmark? a bit moot. >>>> >>>> I am no expert on SHGO, so maybe there is an alternative way to >>>> ?simulate? the changing of the starting point for the optimization? Or >>>> maybe some other approach to make it consistent across optimizers? >>>> >>>> Any suggestion is more than welcome. >>>> >>>> Andrea. 
>>>>
>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at python.org
>>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>>
>>>
>>> --
>>> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
>>>
>>> Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials
>>> Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
>>>
>>> Work phone (DE): +49 (0) 421 218 51238
>>> Cellphone (DE): +49 (0) 160 949 86417
>>> Cellphone (ZA): +27 (0) 82 972 42 89
>>> E-mail (work): s.endres at iwt.uni-bremen.de
>>> Website: https://stefan-endres.github.io/
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
--
Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)

Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany

Work phone (DE): +49 (0) 421 218 51238
Cellphone (DE): +49 (0) 160 949 86417
Cellphone (ZA): +27 (0) 82 972 42 89
E-mail (work): s.endres at iwt.uni-bremen.de
Website: https://stefan-endres.github.io/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 73097 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 40312 bytes
Desc: not available
URL: 

From stefan.c.endres at gmail.com  Mon Jul 27 05:47:57 2020
From: stefan.c.endres at gmail.com (Stefan Endres)
Date: Mon, 27 Jul 2020 11:47:57 +0200
Subject: Re: [SciPy-Dev] Global Optimization Benchmarks
In-Reply-To: 
References: 
Message-ID: 

Hi Andrea,

Thank you for your continued patience and detailed response. I apologise: this is really due to my poor documentation, and I'm hoping to have more time to resolve that soon with use cases and examples.

I forgot that these performance benchmarks should ideally use the minimize_every_iter option as well (it defaults to False because iterations are primarily used to track homology groups (mostly used in things like energy surfaces or vector field design, with less thought given to its performance as a GO algorithm)). When this is off the algorithm does not have a local exploration phase, so this result is expected until the iterations finish and do a last refinement.

I made the following adjustments to your script to include this option (I also removed the line using zip to be compatible with Python 3.8):

import numpy
from matplotlib import pyplot as plot
from shgo import shgo

# -------------------------------------------------------------------------------- #

fun_l = []   # Global list for plot
nfev_l = []  # Global list for plot

class Rosenbrock(object):

    def __init__(self, dimensions=2):

        self.N = dimensions
        self.nfev = 0
        # self._bounds = zip([-30.] * self.N, [30.0] * self.N)
        self._bounds = [(-30., 30.0),] * self.N

        self.global_optimum = [[1.0 for _ in range(self.N)]]
        self.fglob = 0.0

    def fun(self, x, *args):
        self.nfev += 1
        f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0)
        fun_l.append(f)  # Append objective function values for plots
        nfev_l.append(self.nfev)  # Append number of function eval values for plots
        return f

# -------------------------------------------------------------------------------------- #

def Test():

    maxfun = 2000
    tolfun = 1e-2
    numpy.random.seed(123456)

    benchmark = Rosenbrock()
    bounds = benchmark._bounds
    fglob = benchmark.fglob

    options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun,
               'minimize_every_iter': True}
    res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
    xf, yf = res.x, res.fun
    print(xf, yf, fglob, benchmark.nfev)
    return

if __name__ == '__main__':
    Test()
    # Plot results:
    plot.plot(nfev_l, fun_l)
    plot.show()

Which produces the following output on Python 3.8 (https://imgur.com/a/hMfbhW1):

[image: image.png]

So SHGO finds the global minimum with the performance from the publication results, *but* it does not terminate when it finds it. I believe this is related to the following warnings:

/home/stefan_endres/.local/lib/python3.8/site-packages/shgo/_shgo.py:986: RuntimeWarning: divide by zero encountered in double_scalars
  if (lres_f_min.fun - self.f_min_true) / abs(
/home/stefan_endres/.local/lib/python3.8/site-packages/shgo/_shgo.py:823: RuntimeWarning: divide by zero encountered in double_scalars
  pe = (self.f_lowest - self.f_min_true) / abs(self.f_min_true)

The `if` statement was the culprit in this bug: it needs to check whether the supplied true minimum is precisely zero before computing the relative error against it. It should be:

if self.f_min_true == 0.0:
    if self.f_lowest == 0.0:

...to avoid the division by zero. However, fixing this produced other errors. Also, when running this in Python 2 I get the following outright division by zero error:

[stefan_endres at primary shgo-gavana]$ python2 run.py
Traceback (most recent call last):
  File "run.py", line 49, in 
    Test()
  File "run.py", line 42, in Test
    res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 423, in shgo
    shc.construct_complex()
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 728, in construct_complex
    self.iterate()
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 876, in iterate
    self.find_minima()  # Process minimiser pool
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 749, in find_minima
    self.minimise_pool(self.local_iter)
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 987, in minimise_pool
    self.f_min_true) <= self.f_tol:
ZeroDivisionError: float division by zero

I apologise for these issues. To be frank, when I merged all the possible stopping criteria to streamline the code I did not re-run the benchmarks again to test if the performance profiles are reproduced. As I mentioned, most people who use SHGO tend to always use a single-iteration version and tune sampling points until they find all solutions robustly in a problem class.
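The corrected check amounts to something like the following guard - an illustration of the logic described above, not the actual shgo source:

def global_minimum_reached(f_lowest, f_min_true, f_tol):
    # A relative error is undefined for a zero target minimum, so fall
    # back to an absolute comparison in that case.
    if f_min_true == 0.0:
        return abs(f_lowest) <= f_tol
    return (f_lowest - f_min_true) / abs(f_min_true) <= f_tol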
The latest version should work with the stopping criteria, but some of the unittests are failing so I have not pushed it to the repository yet. Another possibility would be to use the Bitbucket version of SHGO with the scripts at the `jogo_Sobol` tag which was used for the publication and should also Python 2 compatible (using this together with the scripts to run the benchmarking performance profiles available on the `shgo-dev` repository). More ideally though I would like time to fix the new code, and possibly make it Python 2 compatible again (I dropped Python 2 support around the same time the SciPy project did), depending on your timeline for running these benchmarks. Best regards, Stefan Endres On Mon, Jul 27, 2020 at 9:00 AM Andrea Gavana wrote: > Hi Stefan, > > On Sun, 26 Jul 2020 at 19:48, Stefan Endres > wrote: > >> Hi Andrea, >> >> Those benchmarks also relied on the global optimum as a stopping >> criterion plus a maximum number of objective function evaluations (2,000), >> whichever is reached first. Of course, reaching the maximum number of >> function evaluations without getting to the global optimum (plus a >> pre-specified tolerance) means a failure in the context of my original >> benchmarks. >> >> These criteria can be set with the following arguments: >> >> options = {'maxfev': 2000, >> 'f_min': f_min, >> 'f_tol': f_tol} >> >> result = shgo(obj_fun, bounds, n=30, sampling_method='sobol', options=options) >> >> The algorithm should iterate (1 iteration is: check stopping criteria --> >> sample `n` points --> triangulate --> find minimisers) until a global >> minimum within the specified tolerance is found or the number of function >> evaluations run out. A lower number of sampling points per iteration tends >> to show higher performance, but I know that there is a strange bug where >> the algorithm keeps running within the same attractor even if this is >> supposed to be added to the triangulation, so a higher`n` will work better >> on some test suites until this is fixed. >> >> Most people tend to use the non-iterative version of the algorithm which >> is the most stable so there might be a few bugs running it iteratively in >> general. >> >> I also have a slightly different approach to bounds/global optima >> locations so that algorithms that rely on guessing global optima by running >> to the center of the domain (or on the bounds) will have a less easy life >> this time >> >> I would also recommend looking into using GKLS generators which, to my >> understanding, is standard practice to avoid this kind of bias ( >> https://dl.acm.org/doi/10.1145/962437.962444 ) in benchmarking GO >> functions. >> >> Bounds-shifting is what I used to do but you have to be careful as some >> of the benchmark functions can be undefined outside those bounds (I.e., >> returning NaNs) or they can have lower global optima outside the bounds. >> >> At least for SHGO the NaN values (and other non-floating point objects) >> should not be a problem, it was partially developed to deal with >> discontinuities in the objective function. Returning a lower value might be >> an issue. SHGO is not supposed to be able to escape the bounds, but if the >> bounds fed to it are outside the actual bounds I think a strong penalty >> function would need to be added to the objective function or the algorithm >> will terminate earlier with the lower optimum. 
>> >> Another reason why I would recommend adding randomness to the objective >> functions (using something like GKLS) instead of the hyperparameters/bounds >> is that the sequences might lose their low-discrepancy properties >> with different seeds, but I am not an expert on how adjusting those >> sequences affects performance. >> >> I didn?t know that the sampling process of SHGO relied on random numbers >> >> The Sobol sequence is technically quasi random, but a true random number >> generator can also be used by specifying a random sampling function to the >> sample_method argument, the main difference being that most RNGs are >> biased towards the centre of the hypercube. On the other hand default >> sampling behaviour relies on sub-triangulations of the hyperrectangle >> (similar to DIRECT), but the performance is almost identical to the Sobol >> sequence. Sub-triangulations are biased towards boundaries and centres >> depending on if the vertices or centroids are used in the actual sampling >> so I would not recommend this for your benchmarks. >> >> Best regards, >> >> Stefan Endres >> > > Thank you for the detailed answer. I just run a very simple test using the > Rosenbrock function, and I am sure I am doing something wrong here. Looking > at the benchmark results on your webpage, it seems to me that SHGO is an > extremely good algorithm: I have created this simple test: > > import numpy > from shgo import shgo > > # -------------------------------------------------------------------------------- # > > class Rosenbrock(object): > > def __init__(self, dimensions=2): > > self.N = dimensions > self.nfev = 0 > self._bounds = zip([-30.] * self.N, [30.0] * self.N) > > self.global_optimum = [[1.0 for _ in range(self.N)]] > self.fglob = 0.0 > > def fun(self, x, *args): > self.nfev += 1 > f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0) > print (self.nfev, f) > return f > > # -------------------------------------------------------------------------------------- # > > def Test(): > > maxfun = 2000 > tolfun = 1e-2 > numpy.random.seed(123456) > > benchmark = Rosenbrock() > bounds = benchmark._bounds > fglob = benchmark.fglob > > options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun} > res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial') > xf, yf = res.x, res.fun > > print(xf, yf, fglob, benchmark.nfev) > > > if __name__ == '__main__': > Test() > > > And I get the following numerical results and a graph of the evolution of > the objective function: > > array([ 0.99999555, 0.99999111]), 1.9768160305004305e-11, 0.0, 2113 > > [image: image.png] > > I am probably too ignorant about SHGO, but I would have thought the > Rosenbrock function to be an easy pick for SHGO - i.e., convergence to the > global optimum achieved much faster than 2,000 evaluations. Maybe there are > some settings that I should change? > > Andrea. > > > >> >> On Sun, Jul 26, 2020 at 5:49 PM Andrea Gavana >> wrote: >> >>> Hi Stefan, >>> >>> On Sun, 26 Jul 2020 at 17.19, Stefan Endres >>> wrote: >>> >>>> Dear Andrea, >>>> >>>> SHGO does not use an initial starting point, only the bounds (which may >>>> also be specified as none or infinite). The benchmarks that I ran used for >>>> the publication used the global minimum as a stopping criteria (together >>>> with performance profiles that demonstrate the final results). For this >>>> particular benchmarking framework I would propose simply using a single >>>> iteration ((dim)^2 +1 points) or specifying 100 starting points. 
>>>> >>>> A script to use 100 sampling points in a single iteration with the >>>> sobol sampling method: >>>> >>>> ``` result = shgo(obj_fun, bounds, n=100, sampling_method='sobol') ``` >>>> >>>> >>>> If you would like to add a more stochastic element to this performance >>>> I think the best approach would be to use a different seed for the sampling >>>> method (in my experience this does not make much of a difference to the >>>> performance in low dimensional problems), otherwise run shgo only once >>>> and/or with increasing numbers of iterations. Another possibility is to add >>>> a stochastic element to the bounds. >>>> >>>> Please let me know if you need any help. >>>> >>> >>> >>> Thank you for your answer. The approach I had several years ago - and >>> that I?d like to keep - was to generate 100 random starting points for each >>> benchmark and run all the global optimizers from that point: see >>> http://infinity77.net/global_optimization/ >>> >>> Those benchmarks also relied on the global optimum as a stopping >>> criterion plus a maximum number of objective function evaluations (2,000), >>> whichever is reached first. Of course, reaching the maximum number of >>> function evaluations without getting to the global optimum (plus a >>> pre-specified tolerance) means a failure in the context of my original >>> benchmarks. >>> >>> I have now a few more benchmark functions plus a couple of new >>> algorithms and I?d like to take the same steps. I also have a slightly >>> different approach to bounds/global optima locations so that algorithms >>> that rely on guessing global optima by running to the center of the domain >>> (or on the bounds) will have a less easy life this time. Bounds-shifting is >>> what I used to do but you have to be careful as some of the benchmark >>> functions can be undefined outside those bounds (I.e., returning NaNs) or >>> they can have lower global optima outside the bounds. Shrinking the bounds >>> is of course always a possibility but it makes life easier to the >>> algorithms and it will fail as a strategy if a benchmark has a global >>> optimum exactly at (one or more of) the original bounds. >>> >>> That said, I didn?t know that the sampling process of SHGO relied on >>> random numbers: that is good to know, as an alternative I can do as you >>> suggested and vary the seed 100 times - one of the new algorithms I have >>> also does not use an initial point so it was already my strategy to change >>> the seed for that one. I can simply do the same for SHGO. >>> >>> I?m running still with an old Python/Numpy/SciPy combination (for legacy >>> reasons) so I?ll have to see if differential_evolution and dual_annealing >>> can be simply copied over locally and run - I tested SHGO and it runs with >>> no problem. >>> >>> Andrea. >>> >>> >>>> Best regards, >>>> Stefan Endres >>>> >>>> >>>> >>>> >>>> On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana >>>> wrote: >>>> >>>>> Dear SciPy developers & users, >>>>> >>>>> I have a couple of new derivative-free, global optimization >>>>> algorithms I?ve been working on lately - plus some improvements to AMPGO >>>>> and a few more benchmark functions - and I?d like to rerun the benchmarks >>>>> as I did back in 2013 (!!!). >>>>> >>>>> In doing so, I?d like to remove some of the least interesting/worst >>>>> performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace >>>>> them with the ones currently available in SciPy - differential_evolution, >>>>> SHGO and dual_annealing. 
>>>>> >>>>> Everything seems good and dandy, but it appears to me that SHGO does >>>>> not accept an initial point for the optimization process - which makes the >>>>> whole ?run the optimization from 100 different starting points for each >>>>> benchmark? a bit moot. >>>>> >>>>> I am no expert on SHGO, so maybe there is an alternative way to >>>>> ?simulate? the changing of the starting point for the optimization? Or >>>>> maybe some other approach to make it consistent across optimizers? >>>>> >>>>> Any suggestion is more than welcome. >>>>> >>>>> Andrea. >>>>> >>>> _______________________________________________ >>>>> SciPy-Dev mailing list >>>>> SciPy-Dev at python.org >>>>> https://mail.python.org/mailman/listinfo/scipy-dev >>>>> >>>> >>>> >>>> -- >>>> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering) >>>> >>>> Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials >>>> Engineering IWT, Badgasteiner Stra?e 3, 28359 Bremen, Germany >>>> >>>> Work phone (DE): +49 (0) 421 218 51238 >>>> Cellphone (DE): +49 (0) 160 949 86417 >>>> Cellphone (ZA): +27 (0) 82 972 42 89 >>>> E-mail (work): s.endres at iwt.uni-bremen.de >>>> Website: https://stefan-endres.github.io/ >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at python.org >>>> https://mail.python.org/mailman/listinfo/scipy-dev >>>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at python.org >>> https://mail.python.org/mailman/listinfo/scipy-dev >>> >> >> >> -- >> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering) >> >> Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials >> Engineering IWT, Badgasteiner Stra?e 3, 28359 Bremen, Germany >> Work phone (DE): +49 (0) 421 218 51238 >> Cellphone (DE): +49 (0) 160 949 86417 >> Cellphone (ZA): +27 (0) 82 972 42 89 >> E-mail (work): s.endres at iwt.uni-bremen.de >> Website: https://stefan-endres.github.io/ >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at python.org >> https://mail.python.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at python.org > https://mail.python.org/mailman/listinfo/scipy-dev > -- Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering) Wissenchaftlicher Mitarbetier: Leibniz Institute for Materials Engineering IWT, Badgasteiner Stra?e 3, 28359 Bremen, Germany Work phone (DE): +49 (0) 421 218 51238 Cellphone (DE): +49 (0) 160 949 86417 Cellphone (ZA): +27 (0) 82 972 42 89 E-mail (work): s.endres at iwt.uni-bremen.de Website: https://stefan-endres.github.io/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 73097 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png
Type: image/png
Size: 40312 bytes
Desc: not available
URL: 

From stefan.c.endres at gmail.com  Mon Jul 27 07:15:27 2020
From: stefan.c.endres at gmail.com (Stefan Endres)
Date: Mon, 27 Jul 2020 13:15:27 +0200
Subject: Re: [SciPy-Dev] Global Optimization Benchmarks
In-Reply-To: 
References: 
Message-ID: 

Hello everyone,

A quick lunch-break update on the above comments: I've fixed the above-mentioned bugs in the main SHGO repository: https://github.com/Stefan-Endres/shgo

Running

options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun,
           'minimize_every_iter': True}
# Added a lower `n` argument; in general 20-30 points per iteration
# produces better performance than 100:
res = shgo(benchmark.fun, bounds, n=20, options=options,
           sampling_method='simplicial')

produces the expected output:

[image: image.png]

And the algorithm correctly terminates as expected. I'm still working on making it compatible with Python 2, but unfortunately I ran out of time; I will try again later tonight.

Regards,
Stefan Endres

On Mon, Jul 27, 2020 at 11:47 AM Stefan Endres wrote:
> Hi Andrea,
>
> Thank you for your continued patience and detailed response. I apologise
> this is really due to my poor documentation, I'm hoping to have more time
> to resolve that soon with use cases and examples.
>
> I forgot that for these performance benchmarks ideally use the
> minimize_every_iter option as well (this defaults to False because
> iterations are primarily used to track homology groups (mostly used in
> things like energy surfaces or vector field design with less thought to its
> performance as a GO algorithm)). When this is off the algorithm does not
> have a local exploration phase so this result is expected until the
> iterations finish and do a last refinement.
>
> I made the following adjustments to your script to include this option
> (also removed line using zip to be compatible with Python 3.8):
>
> import numpy
> from matplotlib import pyplot as plot
> from shgo import shgo
> # -------------------------------------------------------------------------------- #
>
> fun_l = []  # Global list for plot
> nfev_l = []  # Global list for plot
> class Rosenbrock(object):
>
>     def __init__(self, dimensions=2):
>
>         self.N = dimensions
>         self.nfev = 0
>         # self._bounds = zip([-30.]
* self.N, [30.0] * self.N) > self._bounds = [(-30., 30.0),] * self.N > self.global_optimum = [[1.0 for _ in range(self.N)]] > self.fglob = 0.0 > > def fun(self, x, *args): > self.nfev += 1 > f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0) > fun_l.append(f) # Append objective function values for plots > nfev_l.append(self.nfev) # Append number of function eval values for plots > return f > # -------------------------------------------------------------------------------------- # > def Test(): > > maxfun = 2000 > tolfun = 1e-2 > numpy.random.seed(123456) > > benchmark = Rosenbrock() > bounds = benchmark._bounds > fglob = benchmark.fglob > > options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun, > 'minimize_every_iter': True} > res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial') > xf, yf = res.x, res.fun > print(xf, yf, fglob, benchmark.nfev) > return > if __name__ == '__main__': > Test() > # Plot results: > plot.plot(nfev_l, fun_l) > plot.show() > > Which produces the following output on Python 3.8 ( > https://imgur.com/a/hMfbhW1): > [image: image.png] > > So SHGO finds the global minimum with the performance in the publication > results, *but* it does not terminate when it finds it, I believe this is > related to the following warning: > > /home/stefan_endres/.local/lib/python3.8/site-packages/shgo/_shgo.py:986: RuntimeWarning: divide by zero encountered in double_scalars > if (lres_f_min.fun - self.f_min_true) / abs( > /home/stefan_endres/.local/lib/python3.8/site-packages/shgo/_shgo.py:823: RuntimeWarning: divide by zero encountered in double_scalars > pe = (self.f_lowest - self.f_min_true) / abs(self.f_min_true) > > The `if` statement, which checks if the objective function is precisely > zero, before checking the actual supplied value, was the culprit in this > bug. It should be: > > if self.f_min_true == 0.0: > if self.f_lowest == 0.0: > > ...to avoid the division by zero. However, fixing this produced other > errors. Also when running this in Python 2 I also get the following > outright division by zero error: > > [stefan_endres at primary shgo-gavana]$ python2 run.py > Traceback (most recent call last): > File "run.py", line 49, in > Test() > File "run.py", line 42, in Test > res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial') > File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 423, in shgo > shc.construct_complex() > File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 728, in construct_complex > self.iterate() > File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 876, in iterate > self.find_minima() # Process minimiser pool > File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 749, in find_minima > self.minimise_pool(self.local_iter) > File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 987, in minimise_pool > self.f_min_true) <= self.f_tol: > ZeroDivisionError: float division by zero > > I apologise for these issues. To be frank, when I merged all the possible > stopping criteria to streamline the code I did not re-run the benchmarks > again to test if the performance profiles are reproduced. As I mentioned, > most people who use SHGO tend to always use a single iteration version and > tune sampling points until they find all solutions robustly in a problem > class. 
> However, fixing this produced other errors. Also, when running this in Python 2 I get the following outright division by zero error:
>
> [stefan_endres at primary shgo-gavana]$ python2 run.py
> Traceback (most recent call last):
>   File "run.py", line 49, in <module>
>     Test()
>   File "run.py", line 42, in Test
>     res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 423, in shgo
>     shc.construct_complex()
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 728, in construct_complex
>     self.iterate()
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 876, in iterate
>     self.find_minima()  # Process minimiser pool
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 749, in find_minima
>     self.minimise_pool(self.local_iter)
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 987, in minimise_pool
>     self.f_min_true) <= self.f_tol:
> ZeroDivisionError: float division by zero
>
> I apologise for these issues. To be frank, when I merged all the possible stopping criteria to streamline the code, I did not re-run the benchmarks to test whether the performance profiles are reproduced. As I mentioned, most people who use SHGO tend to always use the single-iteration version and tune the sampling points until they find all solutions robustly in a problem class, so unfortunately I have been focussed on fixing the bugs that arise in those instances when I have time to work on the code for the algorithm. The latest version should work with the stopping criteria, but some of the unit tests are failing, so I have not pushed it to the repository yet.
>
> Another possibility would be to use the Bitbucket version of SHGO with the scripts at the `jogo_Sobol` tag, which was used for the publication and should also be Python 2 compatible (using this together with the scripts to run the benchmarking performance profiles available in the `shgo-dev` repository).
>
> More ideally, though, I would like time to fix the new code, and possibly make it Python 2 compatible again (I dropped Python 2 support around the same time the SciPy project did), depending on your timeline for running these benchmarks.
>
> Best regards,
>
> Stefan Endres
>
> On Mon, Jul 27, 2020 at 9:00 AM Andrea Gavana wrote:
>
>> Hi Stefan,
>>
>> On Sun, 26 Jul 2020 at 19:48, Stefan Endres wrote:
>>
>>> Hi Andrea,
>>>
>>> Those benchmarks also relied on the global optimum as a stopping criterion plus a maximum number of objective function evaluations (2,000), whichever is reached first. Of course, reaching the maximum number of function evaluations without getting to the global optimum (plus a pre-specified tolerance) means a failure in the context of my original benchmarks.
>>>
>>> These criteria can be set with the following arguments:
>>>
>>> options = {'maxfev': 2000,
>>>            'f_min': f_min,
>>>            'f_tol': f_tol}
>>>
>>> result = shgo(obj_fun, bounds, n=30, sampling_method='sobol', options=options)
>>>
>>> The algorithm should iterate (one iteration is: check stopping criteria --> sample `n` points --> triangulate --> find minimisers) until a global minimum within the specified tolerance is found or the number of function evaluations runs out. A lower number of sampling points per iteration tends to show higher performance, but I know that there is a strange bug where the algorithm keeps running within the same attractor even if this is supposed to be added to the triangulation, so a higher `n` will work better on some test suites until this is fixed.
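[Editor's note: the iteration Stefan describes can be pictured as a small skeleton loop; this is an illustrative paraphrase with made-up callable names, not shgo's actual internals:

    def shgo_like_loop(sample, triangulate, find_minimisers, stop, n=30):
        # One iteration: check the stopping criteria, sample `n` points,
        # triangulate them, then process the pool of minimisers found.
        while not stop():
            points = sample(n)
            simplicial_complex = triangulate(points)
            find_minimisers(simplicial_complex)
]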
>>> Most people tend to use the non-iterative version of the algorithm, which is the most stable, so there might be a few bugs when running it iteratively in general.
>>>
>>> I also have a slightly different approach to bounds/global optima locations so that algorithms that rely on guessing global optima by running to the center of the domain (or on the bounds) will have a less easy life this time
>>>
>>> I would also recommend looking into using GKLS generators which, to my understanding, is standard practice for avoiding this kind of bias (https://dl.acm.org/doi/10.1145/962437.962444) when benchmarking GO functions.
>>>
>>> Bounds-shifting is what I used to do but you have to be careful as some of the benchmark functions can be undefined outside those bounds (i.e., returning NaNs) or they can have lower global optima outside the bounds.
>>>
>>> At least for SHGO the NaN values (and other non-floating-point objects) should not be a problem; it was partially developed to deal with discontinuities in the objective function. Returning a lower value might be an issue. SHGO is not supposed to be able to escape the bounds, but if the bounds fed to it are outside the actual bounds, I think a strong penalty function would need to be added to the objective function, or the algorithm will terminate earlier with the lower optimum.
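[Editor's note: the "strong penalty function" mentioned above could be a simple wrapper along these lines; the helper below is hypothetical and only sketches the idea:

    import numpy as np

    def with_bounds_penalty(fun, true_bounds, penalty=1e12):
        # Wrap an objective so that any point outside the real domain
        # returns a large constant instead of a NaN or a spurious
        # lower optimum lying outside the intended bounds.
        lower = np.array([lo for lo, hi in true_bounds])
        upper = np.array([hi for lo, hi in true_bounds])

        def wrapped(x, *args):
            x = np.asarray(x)
            if np.any(x < lower) or np.any(x > upper):
                return penalty
            return fun(x, *args)

        return wrapped
]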
>>> Another reason why I would recommend adding randomness to the objective functions (using something like GKLS) instead of to the hyperparameters/bounds is that the sequences might lose their low-discrepancy properties with different seeds, but I am not an expert on how adjusting those sequences affects performance.
>>>
>>> I didn't know that the sampling process of SHGO relied on random numbers
>>>
>>> The Sobol sequence is technically quasi-random, but a true random number generator can also be used by specifying a random sampling function in the sampling_method argument; the main difference is that most RNGs are biased towards the centre of the hypercube. On the other hand, the default sampling behaviour relies on sub-triangulations of the hyperrectangle (similar to DIRECT), but the performance is almost identical to the Sobol sequence. Sub-triangulations are biased towards boundaries and centres, depending on whether the vertices or the centroids are used in the actual sampling, so I would not recommend this for your benchmarks.
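[Editor's note: a sketch of plugging a plain pseudo-random sampler into shgo, assuming the custom-sampler interface described in the shgo documentation (a callable taking the number of points and the dimension and returning an (n, dim) array in the unit hypercube); `obj_fun` and `bounds` are placeholders from the example earlier in the thread:

    import numpy as np

    rng = np.random.RandomState(123456)

    def uniform_sample(n, dim):
        # Plain pseudo-random points in [0, 1)^dim, in place of the
        # default Sobol or simplicial sampling.
        return rng.uniform(size=(n, dim))

    # result = shgo(obj_fun, bounds, n=100, sampling_method=uniform_sample)
]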
>>> Best regards,
>>>
>>> Stefan Endres
>>
>> Thank you for the detailed answer. I just ran a very simple test using the Rosenbrock function, and I am sure I am doing something wrong here. Looking at the benchmark results on your webpage, it seems to me that SHGO is an extremely good algorithm. I have created this simple test:
>>
>> import numpy
>> from shgo import shgo
>>
>> # -------------------------------------------------------------------------------- #
>>
>> class Rosenbrock(object):
>>
>>     def __init__(self, dimensions=2):
>>
>>         self.N = dimensions
>>         self.nfev = 0
>>         self._bounds = zip([-30.] * self.N, [30.0] * self.N)
>>
>>         self.global_optimum = [[1.0 for _ in range(self.N)]]
>>         self.fglob = 0.0
>>
>>     def fun(self, x, *args):
>>         self.nfev += 1
>>         f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0)
>>         print(self.nfev, f)
>>         return f
>>
>> # -------------------------------------------------------------------------------------- #
>>
>> def Test():
>>
>>     maxfun = 2000
>>     tolfun = 1e-2
>>     numpy.random.seed(123456)
>>
>>     benchmark = Rosenbrock()
>>     bounds = benchmark._bounds
>>     fglob = benchmark.fglob
>>
>>     options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun}
>>     res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
>>     xf, yf = res.x, res.fun
>>
>>     print(xf, yf, fglob, benchmark.nfev)
>>
>>
>> if __name__ == '__main__':
>>     Test()
>>
>> And I get the following numerical results and a graph of the evolution of the objective function:
>>
>> array([ 0.99999555,  0.99999111]), 1.9768160305004305e-11, 0.0, 2113
>>
>> [image: image.png]
>>
>> I am probably too ignorant about SHGO, but I would have thought the Rosenbrock function to be an easy pick for SHGO - i.e., convergence to the global optimum achieved much faster than 2,000 evaluations. Maybe there are some settings that I should change?
>>
>> Andrea.
>>
>>> On Sun, Jul 26, 2020 at 5:49 PM Andrea Gavana wrote:
>>>
>>>> Hi Stefan,
>>>>
>>>> On Sun, 26 Jul 2020 at 17.19, Stefan Endres wrote:
>>>>
>>>>> Dear Andrea,
>>>>>
>>>>> SHGO does not use an initial starting point, only the bounds (which may also be specified as none or infinite). The benchmarks that I ran for the publication used the global minimum as a stopping criterion (together with performance profiles that demonstrate the final results). For this particular benchmarking framework I would propose simply using a single iteration ((dim)^2 + 1 points) or specifying 100 starting points.
>>>>>
>>>>> A script to use 100 sampling points in a single iteration with the sobol sampling method:
>>>>>
>>>>> result = shgo(obj_fun, bounds, n=100, sampling_method='sobol')
>>>>>
>>>>> If you would like to add a more stochastic element to this performance test, I think the best approach would be to use a different seed for the sampling method (in my experience this does not make much of a difference to the performance in low-dimensional problems); otherwise run shgo only once and/or with increasing numbers of iterations. Another possibility is to add a stochastic element to the bounds.
>>>>>
>>>>> Please let me know if you need any help.
>>>>
>>>> Thank you for your answer. The approach I had several years ago - and that I'd like to keep - was to generate 100 random starting points for each benchmark and run all the global optimizers from that point: see http://infinity77.net/global_optimization/
>>>>
>>>> Those benchmarks also relied on the global optimum as a stopping criterion plus a maximum number of objective function evaluations (2,000), whichever is reached first. Of course, reaching the maximum number of function evaluations without getting to the global optimum (plus a pre-specified tolerance) means a failure in the context of my original benchmarks.
>>>>
>>>> I have now a few more benchmark functions plus a couple of new algorithms and I'd like to take the same steps. I also have a slightly different approach to bounds/global optima locations so that algorithms that rely on guessing global optima by running to the center of the domain (or on the bounds) will have a less easy life this time. Bounds-shifting is what I used to do, but you have to be careful, as some of the benchmark functions can be undefined outside those bounds (i.e., returning NaNs) or they can have lower global optima outside the bounds. Shrinking the bounds is of course always a possibility, but it makes life easier for the algorithms, and it will fail as a strategy if a benchmark has a global optimum exactly at (one or more of) the original bounds.
>>>>
>>>> That said, I didn't know that the sampling process of SHGO relied on random numbers: that is good to know. As an alternative I can do as you suggested and vary the seed 100 times - one of the new algorithms I have also does not use an initial point, so it was already my strategy to change the seed for that one. I can simply do the same for SHGO.
>>>>
>>>> I'm still running with an old Python/Numpy/SciPy combination (for legacy reasons), so I'll have to see if differential_evolution and dual_annealing can be simply copied over locally and run - I tested SHGO and it runs with no problem.
>>>>
>>>> Andrea.
>>>>
>>>>> Best regards,
>>>>> Stefan Endres
>>>>>
>>>>> On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana wrote:
>>>>>
>>>>>> Dear SciPy developers & users,
>>>>>>
>>>>>> I have a couple of new derivative-free, global optimization algorithms I've been working on lately - plus some improvements to AMPGO and a few more benchmark functions - and I'd like to rerun the benchmarks as I did back in 2013 (!!!).
>>>>>>
>>>>>> In doing so, I'd like to remove some of the least interesting/worst performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace them with the ones currently available in SciPy - differential_evolution, SHGO and dual_annealing.
>>>>>>
>>>>>> Everything seems good and dandy, but it appears to me that SHGO does not accept an initial point for the optimization process - which makes the whole "run the optimization from 100 different starting points for each benchmark" approach a bit moot.
>>>>>>
>>>>>> I am no expert on SHGO, so maybe there is an alternative way to "simulate" the changing of the starting point for the optimization? Or maybe some other approach to make it consistent across optimizers?
>>>>>>
>>>>>> Any suggestion is more than welcome.
>>>>>>
>>>>>> Andrea.
>>>>>
>>>>> --
>>>>> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
>>>>> Wissenschaftlicher Mitarbeiter: Leibniz Institute for Materials Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
>>>>> Work phone (DE): +49 (0) 421 218 51238
>>>>> Cellphone (DE): +49 (0) 160 949 86417
>>>>> Cellphone (ZA): +27 (0) 82 972 42 89
>>>>> E-mail (work): s.endres at iwt.uni-bremen.de
>>>>> Website: https://stefan-endres.github.io/
>>>
>>> --
>>> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
>>> Wissenschaftlicher Mitarbeiter: Leibniz Institute for Materials Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
>>> Work phone (DE): +49 (0) 421 218 51238
>>> Cellphone (DE): +49 (0) 160 949 86417
>>> Cellphone (ZA): +27 (0) 82 972 42 89
>>> E-mail (work): s.endres at iwt.uni-bremen.de
>>> Website: https://stefan-endres.github.io/
>
> --
> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
> Wissenschaftlicher Mitarbeiter: Leibniz Institute for Materials Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
> Work phone (DE): +49 (0) 421 218 51238
> Cellphone (DE): +49 (0) 160 949 86417
> Cellphone (ZA): +27 (0) 82 972 42 89
> E-mail (work): s.endres at iwt.uni-bremen.de
> Website: https://stefan-endres.github.io/

--
Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
Wissenschaftlicher Mitarbeiter: Leibniz Institute for Materials Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
Work phone (DE): +49 (0) 421 218 51238
Cellphone (DE): +49 (0) 160 949 86417
Cellphone (ZA): +27 (0) 82 972 42 89
E-mail (work): s.endres at iwt.uni-bremen.de
Website: https://stefan-endres.github.io/
From rlucas7 at vt.edu  Mon Jul 27 19:31:11 2020
From: rlucas7 at vt.edu (Lucas Roberts)
Date: Mon, 27 Jul 2020 19:31:11 -0400
Subject: [SciPy-Dev] Fwd: [rlucas7/scipy] Run failed: Nightly build - master (c5e0ca5)
In-Reply-To:
References:
Message-ID:

On Fri, Jul 17, 2020 at 5:13 PM Andrew Nelson wrote:

> On Sat, 18 Jul 2020, 07:03, wrote:
>
>> Hi scipy-dev,
>>
>> I think my branch is failing the python 3.9 (experimental) build. I don't see the failure every night but instead maybe once a month.
>>
>> Has anyone else seen this and know what needs to be changed?
>
> https://github.community/t/stop-github-actions-running-on-a-fork/17965 does this do what you're asking?

Hmm, that might be it. It's difficult to say, as the failure seems to occur only about once a month (despite the builds being called nightly). Looks like GitHub has "Enable local and third party Actions for this repository" as the default setting for my SciPy fork. I've changed it to the "Disable Actions for this repository" setting based on the linked suggestion; let's see if that works.

Thanks!

--
Sincerely,
-Lucas

From andyfaff at gmail.com  Mon Jul 27 21:02:05 2020
From: andyfaff at gmail.com (Andrew Nelson)
Date: Tue, 28 Jul 2020 11:02:05 +1000
Subject: [SciPy-Dev] Fwd: [rlucas7/scipy] Run failed: Nightly build - master (c5e0ca5)
In-Reply-To:
References:
Message-ID:

> Hmm, that might be it. It's difficult to say, as the failure seems to occur only about once a month (despite the builds being called nightly). Looks like GitHub has "Enable local and third party Actions for this repository" as the default setting for my SciPy fork. I've changed it to the "Disable Actions for this repository" setting based on the linked suggestion; let's see if that works.

It's nightly in as much as it's testing against the current state of Python 3.9 at the time the branch is updated. We can call it something else if desired (a cron job is also possible to truly run it every night).

A.
From mhaberla at calpoly.edu  Wed Jul 29 16:23:06 2020
From: mhaberla at calpoly.edu (Matt Haberland)
Date: Wed, 29 Jul 2020 13:23:06 -0700
Subject: [SciPy-Dev] Entropy Calculations
In-Reply-To: <07B1CA49-F62E-4FF1-AEB1-F470E62C4C79@gmail.com>
References: <07B1CA49-F62E-4FF1-AEB1-F470E62C4C79@gmail.com>
Message-ID:

Thanks for letting us know. Can you send a reference for the algorithms you implemented? I didn't see any in a quick look through the notebook and code. Also, I see that this uses Numba, but we don't have that as a dependency yet. How important is that speedup? How essential are the other dependencies - cocos, cubature, quadpy?

---------- Forwarded message ---------
From: Michael Nowotny
Date: Fri, Jul 24, 2020 at 7:00 PM
Subject: [SciPy-Dev] Entropy Calculations
To:

Dear SciPy developers,

I have noticed that the statistical functions for the calculation of entropy and KL divergence currently only support discrete distributions for which the probability mass function is known. I recently needed to compute various information theoretic measures from samples of distributions and created the package `Divergence`. It offers functionality for entropy, cross entropy, relative entropy, Jensen-Shannon divergence, joint entropy, conditional entropy, and mutual information and is available on GitHub at https://github.com/michaelnowotny/divergence. It supports samples from both discrete and continuous distributions. Continuous distributions are implemented via numerical integration of kernel density estimates generated from the sample.

I would be happy to contribute some or all of its functionality to SciPy. Please let me know if you are interested.

Thank you,

Michael

--
Matt Haberland
Assistant Professor
BioResource and Agricultural Engineering
08A-3K, Cal Poly

From nowotnym at gmail.com  Fri Jul 31 00:07:02 2020
From: nowotnym at gmail.com (Michael Nowotny)
Date: Thu, 30 Jul 2020 21:07:02 -0700
Subject: [SciPy-Dev] Entropy Calculations
In-Reply-To:
References:
Message-ID:

Hi Matt,

I have added a reference section to the readme on GitHub. The formulas were implemented straight from articles on Wikipedia.

The density for continuous distributions is estimated via kernel methods. At the moment I am using KDE objects from the statsmodels package. We could just as well switch to SciPy's gaussian KDE implementation if there is a preference to avoid the dependency on statsmodels.
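[Editor's note: a minimal sketch of the approach described above - differential entropy of a one-dimensional sample via numerical integration of a kernel density estimate, here using SciPy's gaussian_kde and quad rather than statsmodels and cubature; illustrative only, not the divergence package's actual code:

    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import gaussian_kde

    def entropy_from_sample(sample):
        # Fit a Gaussian KDE, then integrate -p(x) * log(p(x)) numerically
        # over a range wide enough to cover essentially all of the mass.
        kde = gaussian_kde(sample)

        def integrand(x):
            p = kde(x)[0]
            return -p * np.log(p) if p > 0.0 else 0.0

        lo = sample.min() - 3.0 * sample.std()
        hi = sample.max() + 3.0 * sample.std()
        h, _ = quad(integrand, lo, hi)
        return h

    sample = np.random.RandomState(0).normal(size=10000)
    # For a standard normal the exact value is 0.5 * log(2 * pi * e) ~= 1.42:
    print(entropy_from_sample(sample))
]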
It turns out that quadpy is not needed anymore. It has been superseded by the cubature package, which is a Python wrapper around this C library: https://github.com/stevengj/cubature. I am using its adaptive Clenshaw-Curtis based rules, which perform better than SciPy's quad for this particular application. This could easily be reverted back to SciPy's quadrature functions - albeit at a performance loss.

For 10 million observations, the conditional entropy calculation is about 3 times faster with Numba than without. Numba makes little difference for the other information theoretic measures. We could probably remove Numba in the first iteration and rewrite the code for conditional entropy for a sample from a discrete distribution in C or Cython at some point, if that 3x performance benefit is deemed important enough.

Cocos is my own NumPy-like multi-GPU computing package for Python, which includes an adaptation of SciPy's gaussian_kde class for GPUs (see https://github.com/michaelnowotny/cocos). Since SciPy itself does not feature GPU support, I would suggest simply not including any Cocos-based GPU accelerated functionality in divergence.

Best,

Michael

> Date: Wed, 29 Jul 2020 13:23:06 -0700
> From: Matt Haberland
> To: SciPy Developers List
> Subject: Re: [SciPy-Dev] Entropy Calculations
> Content-Type: text/plain; charset="utf-8"
>
> Thanks for letting us know. Can you send a reference for the algorithms you implemented? I didn't see any in a quick look through the notebook and code. Also, I see that this uses Numba, but we don't have that as a dependency yet. How important is that speedup? How essential are the other dependencies - cocos, cubature, quadpy?
>
> ---------- Forwarded message ---------
> From: Michael Nowotny
> Date: Fri, Jul 24, 2020 at 7:00 PM
> Subject: [SciPy-Dev] Entropy Calculations
> To:
>
> Dear SciPy developers,
>
> I have noticed that the statistical functions for the calculation of entropy and KL divergence currently only support discrete distributions for which the probability mass function is known. I recently needed to compute various information theoretic measures from samples of distributions and created the package `Divergence`. It offers functionality for entropy, cross entropy, relative entropy, Jensen-Shannon divergence, joint entropy, conditional entropy, and mutual information and is available on GitHub at https://github.com/michaelnowotny/divergence. It supports samples from both discrete and continuous distributions. Continuous distributions are implemented via numerical integration of kernel density estimates generated from the sample.
>
> I would be happy to contribute some or all of its functionality to SciPy. Please let me know if you are interested.
>
> Thank you,
>
> Michael