[SciPy-Dev] Global Optimization Benchmarks

Stefan Endres stefan.c.endres at gmail.com
Mon Jul 27 07:15:27 EDT 2020


Hello everyone,

A quick lunch-break update on the above comments: I’ve fixed the
above-mentioned bugs in the main SHGO repository:

https://github.com/Stefan-Endres/shgo

Running

options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun,
           'minimize_every_iter': True}
# Added a lower `n` argument; in general 20-30 points per iteration
# produce better performance than 100:
res = shgo(benchmark.fun, bounds, n=20, options=options,
           sampling_method='simplicial')
produces the expected output:
[image: image.png]

And the algorithm correctly terminates as expected.

I'm still working on making it compatible with Python 2, but unfortunately I
ran out of time; I will try again later tonight.

Regards,

Stefan Endres

On Mon, Jul 27, 2020 at 11:47 AM Stefan Endres <stefan.c.endres at gmail.com>
wrote:

> Hi Andrea,
>
> Thank you for your continued patience and detailed response. I apologise:
> this is really due to my poor documentation, which I hope to have more time
> to improve soon with use cases and examples.
>
> I forgot that these performance benchmarks should ideally use the
> minimize_every_iter option as well (it defaults to False because the
> iterations are primarily used to track homology groups, mostly in
> applications such as energy surfaces or vector field design, with less
> emphasis on performance as a GO algorithm). When it is off, the algorithm
> has no local exploration phase, so this result is expected until the
> iterations finish and a final refinement is performed.
>
> I made the following adjustments to your script to include this option
> (I also replaced the line using zip with an explicit list of bound tuples
> so that it runs on Python 3.8):
>
> import numpy
> from matplotlib import pyplot as plot
> from shgo import shgo
> # -------------------------------------------------------------------------------- #
>
> fun_l = []  # Global list for plot
> nfev_l = []  # Global list for plot
> class Rosenbrock(object):
>
>     def __init__(self, dimensions=2):
>
>         self.N = dimensions
>         self.nfev = 0
>         # self._bounds = zip([-30.] * self.N, [30.0] * self.N)
>         self._bounds = [(-30., 30.0),] * self.N
>         self.global_optimum = [[1.0 for _ in range(self.N)]]
>         self.fglob = 0.0
>
>     def fun(self, x, *args):
>         self.nfev += 1
>         f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0)
>         fun_l.append(f)  # Append objective function values for plots
>         nfev_l.append(self.nfev)  # Append number of function eval values for plots
>         return f
> # -------------------------------------------------------------------------------------- #
> def Test():
>
>     maxfun = 2000
>     tolfun = 1e-2
>     numpy.random.seed(123456)
>
>     benchmark = Rosenbrock()
>     bounds = benchmark._bounds
>     fglob = benchmark.fglob
>
>     options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun,
>                'minimize_every_iter': True}
>     res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
>     xf, yf = res.x, res.fun
>     print(xf, yf, fglob, benchmark.nfev)
>     return
> if __name__ == '__main__':
>     Test()
>     # Plot results:
>     plot.plot(nfev_l, fun_l)
>     plot.show()
>
> Which produces the following output on Python 3.8 (
> https://imgur.com/a/hMfbhW1):
> [image: image.png]
>
> So SHGO finds the global minimum with the performance reported in the
> publication, *but* it does not terminate when it finds it. I believe this
> is related to the following warnings:
>
> /home/stefan_endres/.local/lib/python3.8/site-packages/shgo/_shgo.py:986: RuntimeWarning: divide by zero encountered in double_scalars
>   if (lres_f_min.fun - self.f_min_true) / abs(
> /home/stefan_endres/.local/lib/python3.8/site-packages/shgo/_shgo.py:823: RuntimeWarning: divide by zero encountered in double_scalars
>   pe = (self.f_lowest - self.f_min_true) / abs(self.f_min_true)
>
> The culprit in this bug is the relative-error check in the `if` statement,
> which divides by the supplied global minimum value before checking whether
> that value is precisely zero. It should first special-case a zero target:
>
> if self.f_min_true == 0.0:
>     if self.f_lowest == 0.0:
>
>  ...to avoid the division by zero (a zero-safe sketch of such a check is
> shown after the traceback below). However, fixing this produced other
> errors. When running this in Python 2 I also get the following outright
> division-by-zero error:
>
> [stefan_endres at primary shgo-gavana]$ python2 run.py
> Traceback (most recent call last):
>   File "run.py", line 49, in <module>
>     Test()
>   File "run.py", line 42, in Test
>     res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 423, in shgo
>     shc.construct_complex()
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 728, in construct_complex
>     self.iterate()
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 876, in iterate
>     self.find_minima()  # Process minimiser pool
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 749, in find_minima
>     self.minimise_pool(self.local_iter)
>   File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py", line 987, in minimise_pool
>     self.f_min_true) <= self.f_tol:
> ZeroDivisionError: float division by zero
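>
> As an illustration of the intended fix, here is a minimal, hypothetical
> sketch of a zero-safe stopping check (a standalone helper for illustration
> only, not the actual code in _shgo.py):
>
> def global_minimum_reached(f_lowest, f_min_true, f_tol):
>     """Return True when f_lowest is within f_tol of the known global
>     minimum, guarding against division by zero when the target is 0.0."""
>     if f_min_true == 0.0:
>         # Target is exactly zero: fall back to an absolute tolerance check.
>         return abs(f_lowest) <= f_tol
>     # Otherwise use the relative error, as before.
>     return (f_lowest - f_min_true) / abs(f_min_true) <= f_tol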
>
> I apologise for these issues. To be frank, when I merged all the possible
> stopping criteria to streamline the code I did not re-run the benchmarks
> to test whether the performance profiles are reproduced. As I mentioned,
> most people who use SHGO tend to use the single-iteration version and tune
> the number of sampling points until they find all solutions robustly in a
> problem class. So, unfortunately, I have been focussed on fixing the bugs
> that arise in those use cases when I have time to work on the code for the
> algorithm. The latest version should work with the stopping criteria, but
> some of the unit tests are failing, so I have not pushed it to the
> repository yet.
>
> Another possibility would be to use the Bitbucket version of SHGO with the
> scripts at the `jogo_Sobol` tag, which was used for the publication and
> should also be Python 2 compatible (using this together with the scripts
> for running the benchmarking performance profiles available in the
> `shgo-dev` repository).
>
> Ideally, though, I would like time to fix the new code, and possibly to
> make it Python 2 compatible again (I dropped Python 2 support around the
> same time the SciPy project did), depending on your timeline for running
> these benchmarks.
>
> Best regards,
>
> Stefan Endres
>
> On Mon, Jul 27, 2020 at 9:00 AM Andrea Gavana <andrea.gavana at gmail.com>
> wrote:
>
>> Hi Stefan,
>>
>> On Sun, 26 Jul 2020 at 19:48, Stefan Endres <stefan.c.endres at gmail.com>
>> wrote:
>>
>>> Hi Andrea,
>>>
>>> Those benchmarks also relied on the global optimum as a stopping
>>> criterion plus a maximum number of objective function evaluations (2,000),
>>> whichever is reached first. Of course, reaching the maximum number of
>>> function evaluations without getting to the global optimum (plus a
>>> pre-specified tolerance) means a failure in the context of my original
>>> benchmarks.
>>>
>>> These criteria can be set with the following arguments:
>>>
>>> options = {'maxfev': 2000,
>>>            'f_min': f_min,
>>>            'f_tol': f_tol}
>>>
>>> result = shgo(obj_fun, bounds, n=30, sampling_method='sobol', options=options)
>>>
>>> The algorithm should iterate (one iteration is: check stopping criteria
>>> --> sample `n` points --> triangulate --> find minimisers) until a global
>>> minimum within the specified tolerance is found or the number of function
>>> evaluations runs out. A lower number of sampling points per iteration tends
>>> to show higher performance, but I know that there is a strange bug where
>>> the algorithm keeps running within the same attractor even if this is
>>> supposed to be added to the triangulation, so a higher `n` will work better
>>> on some test suites until this is fixed (a sketch of an iterative call with
>>> a smaller `n` is shown below).
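>>>
>>> As a rough sketch of an iterative run with fewer points per iteration
>>> (assuming the same stopping-criteria options as above; the `iters` value
>>> and the choice of n=30 are illustrative only):
>>>
>>> options = {'maxfev': 2000, 'f_min': f_min, 'f_tol': f_tol}
>>> # Fewer sampling points per iteration; `iters` caps the number of
>>> # iterations, while the criteria in `options` can terminate earlier.
>>> result = shgo(obj_fun, bounds, n=30, iters=10,
>>>               sampling_method='sobol', options=options)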
>>>
>>> Most people tend to use the non-iterative version of the algorithm, which
>>> is the most stable, so there might be a few bugs when running it
>>> iteratively in general.
>>>
>>> I also have a slightly different approach to bounds/global optima
>>> locations so that algorithms that rely on guessing global optima by running
>>> to the center of the domain (or on the bounds) will have a less easy life
>>> this time.
>>>
>>> I would also recommend looking into GKLS generators, which, to my
>>> understanding, are standard practice for avoiding this kind of bias (
>>> https://dl.acm.org/doi/10.1145/962437.962444 ) when benchmarking GO
>>> functions.
>>>
>>> Bounds-shifting is what I used to do, but you have to be careful as some
>>> of the benchmark functions can be undefined outside those bounds (i.e.,
>>> returning NaNs) or they can have lower global optima outside the bounds.
>>>
>>> At least for SHGO, NaN values (and other non-floating-point objects)
>>> should not be a problem; it was partially developed to deal with
>>> discontinuities in the objective function. Returning a lower value might be
>>> an issue. SHGO is not supposed to be able to escape the bounds, but if the
>>> bounds fed to it extend beyond the actual bounds, I think a strong penalty
>>> would need to be added to the objective function or the algorithm will
>>> terminate early at the lower optimum (a sketch of such a penalised wrapper
>>> is shown below).
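>>>
>>> A minimal sketch of such a penalty wrapper (a hypothetical helper,
>>> illustrative only; the penalty scale and function names are assumptions):
>>>
>>> import numpy as np
>>>
>>> def penalised(fun, true_bounds, penalty=1e10):
>>>     """Wrap `fun` so points outside `true_bounds` are heavily penalised."""
>>>     lower = np.array([b[0] for b in true_bounds])
>>>     upper = np.array([b[1] for b in true_bounds])
>>>
>>>     def wrapped(x, *args):
>>>         x = np.asarray(x)
>>>         # Total violation of the true bounds (zero when x is feasible).
>>>         violation = np.sum(np.maximum(lower - x, 0.0)
>>>                            + np.maximum(x - upper, 0.0))
>>>         if violation > 0.0:
>>>             return fun(x, *args) + penalty * (1.0 + violation)
>>>         return fun(x, *args)
>>>
>>>     return wrapped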
>>>
>>> Another reason why I would recommend adding randomness to the objective
>>> functions (using something like GKLS) instead of the hyperparameters/bounds
>>> is that the sequences might lose their low-discrepancy properties
>>> with different seeds, but I am not an expert on how adjusting those
>>> sequences affects performance.
>>>
>>> I didn’t know that the sampling process of SHGO relied on random numbers
>>>
>>> The Sobol sequence is technically quasi-random, but a true random number
>>> generator can also be used by passing a custom sampling function to the
>>> sampling_method argument, the main difference being that most RNGs are
>>> biased towards the centre of the hypercube (a sketch is shown below). The
>>> default sampling behaviour, on the other hand, relies on sub-triangulations
>>> of the hyperrectangle (similar to DIRECT), but its performance is almost
>>> identical to the Sobol sequence. Sub-triangulations are biased towards the
>>> boundaries and centres, depending on whether the vertices or the centroids
>>> are used in the actual sampling, so I would not recommend this for your
>>> benchmarks.
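>>>
>>> A rough sketch of such a custom sampling function (this assumes, per the
>>> above, that sampling_method accepts a callable that returns n points in
>>> the unit hypercube; the exact signature is an assumption):
>>>
>>> import numpy as np
>>>
>>> def uniform_sampling(n, dim):
>>>     """Draw n pseudo-random points uniformly in the unit hypercube;
>>>     shgo is assumed to scale them to the supplied bounds."""
>>>     return np.random.uniform(size=(n, dim))
>>>
>>> # result = shgo(obj_fun, bounds, n=100, sampling_method=uniform_sampling)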
>>>
>>> Best regards,
>>>
>>> Stefan Endres
>>>
>>
>> Thank you for the detailed answer. I just ran a very simple test using
>> the Rosenbrock function, and I am sure I am doing something wrong here.
>> Looking at the benchmark results on your webpage, it seems to me that SHGO
>> is an extremely good algorithm. I have created this simple test:
>>
>> import numpy
>> from shgo import shgo
>>
>> # -------------------------------------------------------------------------------- #
>>
>> class Rosenbrock(object):
>>
>>     def __init__(self, dimensions=2):
>>
>>         self.N = dimensions
>>         self.nfev = 0
>>         self._bounds = zip([-30.] * self.N, [30.0] * self.N)
>>
>>         self.global_optimum = [[1.0 for _ in range(self.N)]]
>>         self.fglob = 0.0
>>
>>     def fun(self, x, *args):
>>         self.nfev += 1
>>         f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0)
>>         print (self.nfev, f)
>>         return f
>>
>> # -------------------------------------------------------------------------------------- #
>>
>> def Test():
>>
>>     maxfun = 2000
>>     tolfun = 1e-2
>>     numpy.random.seed(123456)
>>
>>     benchmark = Rosenbrock()
>>     bounds = benchmark._bounds
>>     fglob = benchmark.fglob
>>
>>     options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun}
>>     res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
>>     xf, yf = res.x, res.fun
>>
>>     print(xf, yf, fglob, benchmark.nfev)
>>
>>
>> if __name__ == '__main__':
>>     Test()
>>
>>
>> And I get the following numerical results and a graph of the evolution of
>> the objective function:
>>
>> array([ 0.99999555,  0.99999111]), 1.9768160305004305e-11, 0.0, 2113
>>
>> [image: image.png]
>>
>> I am probably too ignorant about SHGO, but I would have thought the
>> Rosenbrock function to be an easy pick for SHGO - i.e., convergence to the
>> global optimum achieved much faster than 2,000 evaluations. Maybe there are
>> some settings that I should change?
>>
>> Andrea.
>>
>>
>>
>>>
>>> On Sun, Jul 26, 2020 at 5:49 PM Andrea Gavana <andrea.gavana at gmail.com>
>>> wrote:
>>>
>>>> Hi Stefan,
>>>>
>>>> On Sun, 26 Jul 2020 at 17.19, Stefan Endres <stefan.c.endres at gmail.com>
>>>> wrote:
>>>>
>>>>> Dear Andrea,
>>>>>
>>>>> SHGO does not use an initial starting point, only the bounds (which
>>>>> may also be specified as none or infinite). The benchmarks I ran for the
>>>>> publication used the global minimum as a stopping criterion
>>>>> (together with performance profiles that demonstrate the final results).
>>>>> For this particular benchmarking framework I would propose simply using a
>>>>> single iteration ((dim)^2 + 1 points) or specifying 100 sampling points.
>>>>>
>>>>> A script to use 100 sampling points in a single iteration with the
>>>>> Sobol sampling method:
>>>>>
>>>>> result = shgo(obj_fun, bounds, n=100, sampling_method='sobol')
>>>>>
>>>>>
>>>>> If you would like to add a more stochastic element to this performance
>>>>> I think the best approach would be to use a different seed for the sampling
>>>>> method (in my experience this does not make much of a difference to the
>>>>> performance in low dimensional problems), otherwise run shgo only once
>>>>> and/or with increasing numbers of iterations. Another possibility is to add
>>>>> a stochastic element to the bounds.
>>>>>
>>>>> Please let me know if you need any help.
>>>>>
>>>>
>>>>
>>>> Thank you for your answer. The approach I had several years ago - and
>>>> that I’d like to keep - was to generate 100 random starting points for each
>>>> benchmark and run all the global optimizers from each of those points: see
>>>> http://infinity77.net/global_optimization/
>>>>
>>>> Those benchmarks also relied on the global optimum as a stopping
>>>> criterion plus a maximum number of objective function evaluations (2,000),
>>>> whichever is reached first. Of course, reaching the maximum number of
>>>> function evaluations without getting to the global optimum (plus a
>>>> pre-specified tolerance) means a failure in the context of my original
>>>> benchmarks.
>>>>
>>>> I have now a few more benchmark functions plus a couple of new
>>>> algorithms and I’d like to take the same steps. I also have a slightly
>>>> different approach to bounds/global optima locations so that algorithms
>>>> that rely on guessing global optima by running to the center of the domain
>>>> (or on the bounds) will have a less easy life this time. Bounds-shifting is
>>>> what I used to do but you have to be careful as some of the benchmark
>>>> functions can be undefined outside those bounds (i.e., returning NaNs) or
>>>> they can have lower global optima outside the bounds. Shrinking the bounds
>>>> is of course always a possibility but it makes life easier to the
>>>> algorithms and it will fail as a strategy if a benchmark has a global
>>>> optimum exactly at (one or more of) the original bounds.
>>>>
>>>> That said, I didn’t know that the sampling process of SHGO relied on
>>>> random numbers: that is good to know, as an alternative I can do as you
>>>> suggested and vary the seed 100 times - one of the new algorithms I have
>>>> also does not use an initial point so it was already my strategy to change
>>>> the seed for that one. I can simply do the same for SHGO.
>>>>
>>>> I’m still running an old Python/NumPy/SciPy combination (for
>>>> legacy reasons), so I’ll have to see whether differential_evolution and
>>>> dual_annealing can simply be copied over locally and run - I tested SHGO
>>>> and it runs with no problem.
>>>>
>>>> Andrea.
>>>>
>>>>
>>>>> Best regards,
>>>>> Stefan Endres
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana <andrea.gavana at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear SciPy developers & users,
>>>>>>
>>>>>>      I have a couple of new derivative-free, global optimization
>>>>>> algorithms I’ve been working on lately - plus some improvements to AMPGO
>>>>>> and a few more benchmark functions - and I’d like to rerun the benchmarks
>>>>>> as I did back in 2013 (!!!).
>>>>>>
>>>>>> In doing so, I’d like to remove some of the least interesting/worst
>>>>>> performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace
>>>>>> them with the ones currently available in SciPy - differential_evolution,
>>>>>> SHGO and dual_annealing.
>>>>>>
>>>>>> Everything seems good and dandy, but it appears to me that SHGO does
>>>>>> not accept an initial point for the optimization process - which makes the
>>>>>> whole “run the optimization from 100 different starting points for each
>>>>>> benchmark” a bit moot.
>>>>>>
>>>>>> I am no expert on SHGO, so maybe there is an alternative way to
>>>>>> “simulate” the changing of the starting point for the optimization? Or
>>>>>> maybe some other approach to make it consistent across optimizers?
>>>>>>
>>>>>> Any suggestion is more than welcome.
>>>>>>
>>>>>> Andrea.
>>>>>>
>>>>> _______________________________________________
>>>>>> SciPy-Dev mailing list
>>>>>> SciPy-Dev at python.org
>>>>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
>>>>>
>>>>> Research Associate: Leibniz Institute for Materials
>>>>> Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
>>>>> Work phone (DE): +49 (0) 421 218 51238
>>>>> Cellphone (DE): +49 (0) 160 949 86417
>>>>> Cellphone (ZA): +27 (0) 82 972 42 89
>>>>> E-mail (work): s.endres at iwt.uni-bremen.de
>>>>> Website: https://stefan-endres.github.io/
>>>>> _______________________________________________
>>>>> SciPy-Dev mailing list
>>>>> SciPy-Dev at python.org
>>>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>>>
>>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at python.org
>>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>>
>>>
>>>
>>> --
>>> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
>>>
>>> Research Associate: Leibniz Institute for Materials
>>> Engineering IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
>>> Work phone (DE): +49 (0) 421 218 51238
>>> Cellphone (DE): +49 (0) 160 949 86417
>>> Cellphone (ZA): +27 (0) 82 972 42 89
>>> E-mail (work): s.endres at iwt.uni-bremen.de
>>> Website: https://stefan-endres.github.io/
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/scipy-dev
>>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>
>
> --
> Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)
>
> Research Associate: Leibniz Institute for Materials Engineering
> IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
> Work phone (DE): +49 (0) 421 218 51238
> Cellphone (DE): +49 (0) 160 949 86417
> Cellphone (ZA): +27 (0) 82 972 42 89
> E-mail (work): s.endres at iwt.uni-bremen.de
> Website: https://stefan-endres.github.io/
>


-- 
Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)

Research Associate: Leibniz Institute for Materials Engineering
IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
Work phone (DE): +49 (0) 421 218 51238
Cellphone (DE): +49 (0) 160 949 86417
Cellphone (ZA): +27 (0) 82 972 42 89
E-mail (work): s.endres at iwt.uni-bremen.de
Website: https://stefan-endres.github.io/