[SciPy-Dev] Global Optimization Benchmarks

Stefan Endres stefan.c.endres at gmail.com
Mon Jul 27 05:47:57 EDT 2020


Hi Andrea,

Thank you for your continued patience and detailed response. I apologise;
this is really due to my poor documentation, which I hope to have more
time to resolve soon with use cases and examples.

I forgot that these performance benchmarks should ideally also use the
minimize_every_iter option (it defaults to False because the iterations
are primarily used to track homology groups, mostly in applications such
as energy surfaces or vector field design, with less thought given to its
performance as a GO algorithm). When this option is off the algorithm has
no local exploration phase, so this result is expected until the
iterations finish and a final refinement is done.

I made the following adjustments to your script to include this option
(I also removed the line using zip so that it is compatible with Python 3.8):

import numpy
from matplotlib import pyplot as plot
from shgo import shgo

# -------------------------------------------------------------------------------- #

fun_l = []  # Global list for plot
nfev_l = []  # Global list for plot
class Rosenbrock(object):

    def __init__(self, dimensions=2):

        self.N = dimensions
        self.nfev = 0
        # self._bounds = zip([-30.] * self.N, [30.0] * self.N)
        self._bounds = [(-30., 30.0),] * self.N
        self.global_optimum = [[1.0 for _ in range(self.N)]]
        self.fglob = 0.0

    def fun(self, x, *args):
        self.nfev += 1
        f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0)
        fun_l.append(f)  # Append objective function values for plots
        nfev_l.append(self.nfev)  # Append number of function evaluations for plots
        return f
# -------------------------------------------------------------------------------------- #
def Test():

    maxfun = 2000
    tolfun = 1e-2
    numpy.random.seed(123456)

    benchmark = Rosenbrock()
    bounds = benchmark._bounds
    fglob = benchmark.fglob

    options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun,
               'minimize_every_iter': True}
    res = shgo(benchmark.fun, bounds, options=options,
               sampling_method='simplicial')
    xf, yf = res.x, res.fun
    print(xf, yf, fglob, benchmark.nfev)
    return
if __name__ == '__main__':
    Test()
    # Plot results:
    plot.plot(nfev_l, fun_l)
    plot.show()

This produces the following output on Python 3.8 (https://imgur.com/a/hMfbhW1):

So SHGO finds the global minimum with the performance reported in the
publication, *but* it does not terminate when it finds it. I believe this
is related to the following warning:

/home/stefan_endres/.local/lib/python3.8/site-packages/shgo/_shgo.py:986:
RuntimeWarning: divide by zero encountered in double_scalars
  if (lres_f_min.fun - self.f_min_true) / abs(
/home/stefan_endres/.local/lib/python3.8/site-packages/shgo/_shgo.py:823:
RuntimeWarning: divide by zero encountered in double_scalars
  pe = (self.f_lowest - self.f_min_true) / abs(self.f_min_true)

The culprit in this bug was the `if` statement: it should check whether
the supplied global minimum is precisely zero before comparing against the
actual objective value. It should be:

if self.f_min_true == 0.0:
    if self.f_lowest == 0.0:

...to avoid the division by zero. However, fixing this produced other
errors. When running this in Python 2 I also get the following outright
division-by-zero error:

[stefan_endres at primary shgo-gavana]$ python2 run.py
Traceback (most recent call last):
  File "run.py", line 49, in <module>
    Test()
  File "run.py", line 42, in Test
    res = shgo(benchmark.fun, bounds, options=options,
sampling_method='simplicial')
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py",
line 423, in shgo
    shc.construct_complex()
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py",
line 728, in construct_complex
    self.iterate()
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py",
line 876, in iterate
    self.find_minima()  # Process minimiser pool
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py",
line 749, in find_minima
    self.minimise_pool(self.local_iter)
  File "/home/stefan_endres/.local/lib/python2.7/site-packages/shgo/_shgo.py",
line 987, in minimise_pool
    self.f_min_true) <= self.f_tol:
ZeroDivisionError: float division by zero
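
For illustration, here is a minimal sketch of the kind of zero-guarded check
I have in mind (a stand-alone function with names borrowed from the warning
output above, not the actual shgo internals):

def stop_on_f_min(f_lowest, f_min_true, f_tol):
    # Hypothetical stand-alone version of the check in minimise_pool;
    # the real code works with self.f_lowest, self.f_min_true and self.f_tol.
    if f_min_true == 0.0:
        # A relative error is undefined when the supplied global minimum
        # is exactly zero, so fall back to an absolute tolerance instead.
        return abs(f_lowest) <= f_tol
    # Otherwise keep the relative-error criterion as before.
    return (f_lowest - f_min_true) / abs(f_min_true) <= f_tol

With this, something like stop_on_f_min(res.fun, 0.0, 1e-2) stays well
defined for problems such as Rosenbrock where fglob is exactly zero.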

I apologise for these issues. To be frank, when I merged all the possible
stopping criteria to streamline the code, I did not re-run the benchmarks
to test whether the performance profiles are reproduced. As I mentioned,
most people who use SHGO tend to always use a single-iteration version and
tune the sampling points until they find all solutions robustly for a
problem class, so unfortunately I have been focussed on fixing bugs that
arise in those instances when I have time to work on the code for the
algorithm. The latest version should work with the stopping criteria, but
some of the unit tests are failing, so I have not pushed it to the
repository yet.

Another possibility would be to use the Bitbucket version of SHGO with the
scripts at the `jogo_Sobol` tag, which was used for the publication and
should also be Python 2 compatible (using it together with the scripts for
running the benchmarking performance profiles available in the `shgo-dev`
repository).

Ideally, though, I would like time to fix the new code, and possibly to
make it Python 2 compatible again (I dropped Python 2 support around the
same time the SciPy project did), depending on your timeline for running
these benchmarks.

Best regards,

Stefan Endres

On Mon, Jul 27, 2020 at 9:00 AM Andrea Gavana <andrea.gavana at gmail.com>
wrote:

> Hi Stefan,
>
> On Sun, 26 Jul 2020 at 19:48, Stefan Endres <stefan.c.endres at gmail.com>
> wrote:
>
>> Hi Andrea,
>>
>> Those benchmarks also relied on the global optimum as a stopping
>> criterion plus a maximum number of objective function evaluations (2,000),
>> whichever is reached first. Of course, reaching the maximum number of
>> function evaluations without getting to the global optimum (plus a
>> pre-specified tolerance) means a failure in the context of my original
>> benchmarks.
>>
>> These criteria can be set with the following arguments:
>>
>> options = {'maxfev': 2000,
>>            'f_min': f_min,
>>            'f_tol': f_tol}
>>
>> result = shgo(obj_fun, bounds, n=30, sampling_method='sobol', options=options)
>>
>> The algorithm should iterate (1 iteration is: check stopping criteria -->
>> sample `n` points --> triangulate --> find minimisers) until a global
>> minimum within the specified tolerance is found or the number of function
>> evaluations run out. A lower number of sampling points per iteration tends
>> to show higher performance, but I know that there is a strange bug where
>> the algorithm keeps running within the same attractor even if this is
>> supposed to be added to the triangulation, so a higher `n` will work better
>> on some test suites until this is fixed.
>>
>> Most people tend to use the non-iterative version of the algorithm which
>> is the most stable so there might be a few bugs running it iteratively in
>> general.
>>
>> I also have a slightly different approach to bounds/global optima
>> locations so that algorithms that rely on guessing global optima by running
>> to the center of the domain (or on the bounds) will have a less easy life
>> this time
>>
>> I would also recommend looking into using GKLS generators which, to my
>> understanding, is standard practice to avoid this kind of bias (
>> https://dl.acm.org/doi/10.1145/962437.962444 ) in benchmarking GO
>> functions.
>>
>> Bounds-shifting is what I used to do but you have to be careful as some
>> of the benchmark functions can be undefined outside those bounds (i.e.,
>> returning NaNs) or they can have lower global optima outside the bounds.
>>
>> At least for SHGO the NaN values (and other non-floating point objects)
>> should not be a problem, it was partially developed to deal with
>> discontinuities in the objective function. Returning a lower value might be
>> an issue. SHGO is not supposed to be able to escape the bounds, but if the
>> bounds fed to it are outside the actual bounds I think a strong penalty
>> function would need to be added to the objective function or the algorithm
>> will terminate earlier with the lower optimum.
>>
>> Another reason why I would recommend adding randomness to the objective
>> functions (using something like GKLS) instead of the hyperparameters/bounds
>> is that the sequences might lose their low-discrepancy properties
>> with different seeds, but I am not an expert on how adjusting those
>> sequences affects performance.
>>
>> I didn’t know that the sampling process of SHGO relied on random numbers
>>
>> The Sobol sequence is technically quasi random, but a true random number
>> generator can also be used by specifying a random sampling function to the
>> sample_method argument, the main difference being that most RNGs are
>> biased towards the centre of the hypercube. On the other hand default
>> sampling behaviour relies on sub-triangulations of the hyperrectangle
>> (similar to DIRECT), but the performance is almost identical to the Sobol
>> sequence. Sub-triangulations are biased towards boundaries and centres
>> depending on if the vertices or centroids are used in the actual sampling
>> so I would not recommend this for your benchmarks.
>>
>> Best regards,
>>
>> Stefan Endres
>>
>
> Thank you for the detailed answer. I just ran a very simple test using the
> Rosenbrock function, and I am sure I am doing something wrong here. Looking
> at the benchmark results on your webpage, it seems to me that SHGO is an
> extremely good algorithm: I have created this simple test:
>
> import numpy
> from shgo import shgo
>
> # -------------------------------------------------------------------------------- #
>
> class Rosenbrock(object):
>
>     def __init__(self, dimensions=2):
>
>         self.N = dimensions
>         self.nfev = 0
>         self._bounds = zip([-30.] * self.N, [30.0] * self.N)
>
>         self.global_optimum = [[1.0 for _ in range(self.N)]]
>         self.fglob = 0.0
>
>     def fun(self, x, *args):
>         self.nfev += 1
>         f = sum(100.0 * (x[1:] - x[:-1] ** 2.0) ** 2.0 + (1 - x[:-1]) ** 2.0)
>         print (self.nfev, f)
>         return f
>
> # -------------------------------------------------------------------------------------- #
>
> def Test():
>
>     maxfun = 2000
>     tolfun = 1e-2
>     numpy.random.seed(123456)
>
>     benchmark = Rosenbrock()
>     bounds = benchmark._bounds
>     fglob = benchmark.fglob
>
>     options = {'maxfev': maxfun, 'f_min': fglob, 'f_tol': tolfun}
>     res = shgo(benchmark.fun, bounds, options=options, sampling_method='simplicial')
>     xf, yf = res.x, res.fun
>
>     print(xf, yf, fglob, benchmark.nfev)
>
>
> if __name__ == '__main__':
>     Test()
>
>
> And I get the following numerical results and a graph of the evolution of
> the objective function:
>
> array([ 0.99999555,  0.99999111]), 1.9768160305004305e-11, 0.0, 2113
>
>
> I am probably too ignorant about SHGO, but I would have thought the
> Rosenbrock function to be an easy pick for SHGO - i.e., convergence to the
> global optimum achieved much faster than 2,000 evaluations. Maybe there are
> some settings that I should change?
>
> Andrea.
>
>
>
>>
>> On Sun, Jul 26, 2020 at 5:49 PM Andrea Gavana <andrea.gavana at gmail.com>
>> wrote:
>>
>>> Hi Stefan,
>>>
>>> On Sun, 26 Jul 2020 at 17.19, Stefan Endres <stefan.c.endres at gmail.com>
>>> wrote:
>>>
>>>> Dear Andrea,
>>>>
>>>> SHGO does not use an initial starting point, only the bounds (which may
>>>> also be specified as none or infinite). The benchmarks that I ran for
>>>> the publication used the global minimum as a stopping criterion (together
>>>> with performance profiles that demonstrate the final results). For this
>>>> particular benchmarking framework I would propose simply using a single
>>>> iteration ((dim)^2 +1 points) or specifying 100 starting points.
>>>>
>>>> A script to use 100 sampling points in a single iteration with the
>>>> sobol sampling method:
>>>>
>>>> ``` result = shgo(obj_fun, bounds, n=100, sampling_method='sobol') ```
>>>>
>>>>
>>>> If you would like to add a more stochastic element to this performance
>>>> I think the best approach would be to use a different seed for the sampling
>>>> method (in my experience this does not make much of a difference to the
>>>> performance in low dimensional problems), otherwise run shgo only once
>>>> and/or with increasing numbers of iterations. Another possibility is to add
>>>> a stochastic element to the bounds.
>>>>
>>>> Please let me know if you need any help.
>>>>
>>>
>>>
>>> Thank you for your answer. The approach I had several years ago - and
>>> that I’d like to keep - was to generate 100 random starting points for each
>>> benchmark and run all the global optimizers from that point: see
>>> http://infinity77.net/global_optimization/
>>>
>>> Those benchmarks also relied on the global optimum as a stopping
>>> criterion plus a maximum number of objective function evaluations (2,000),
>>> whichever is reached first. Of course, reaching the maximum number of
>>> function evaluations without getting to the global optimum (plus a
>>> pre-specified tolerance) means a failure in the context of my original
>>> benchmarks.
>>>
>>> I have now a few more benchmark functions plus a couple of new
>>> algorithms and I’d like to take the same steps. I also have a slightly
>>> different approach to bounds/global optima locations so that algorithms
>>> that rely on guessing global optima by running to the center of the domain
>>> (or on the bounds) will have a less easy life this time. Bounds-shifting is
>>> what I used to do but you have to be careful as some of the benchmark
>>> functions can be undefined outside those bounds (i.e., returning NaNs) or
>>> they can have lower global optima outside the bounds. Shrinking the bounds
>>> is of course always a possibility but it makes life easier to the
>>> algorithms and it will fail as a strategy if a benchmark has a global
>>> optimum exactly at (one or more of) the original bounds.
>>>
>>> That said, I didn’t know that the sampling process of SHGO relied on
>>> random numbers: that is good to know, as an alternative I can do as you
>>> suggested and vary the seed 100 times - one of the new algorithms I have
>>> also does not use an initial point so it was already my strategy to change
>>> the seed for that one. I can simply do the same for SHGO.
>>>
>>> I’m running still with an old Python/Numpy/SciPy combination (for legacy
>>> reasons) so I’ll have to see if differential_evolution and dual_annealing
>>> can be simply copied over locally and run - I tested SHGO and it runs with
>>> no problem.
>>>
>>> Andrea.
>>>
>>>
>>>> Best regards,
>>>> Stefan Endres
>>>>
>>>>
>>>>
>>>>
>>>> On Sun, Jul 26, 2020 at 4:06 PM Andrea Gavana <andrea.gavana at gmail.com>
>>>> wrote:
>>>>
>>>>> Dear SciPy developers & users,
>>>>>
>>>>>      I have a couple of new derivative-free, global optimization
>>>>> algorithms I’ve been working on lately - plus some improvements to AMPGO
>>>>> and a few more benchmark functions - and I’d like to rerun the benchmarks
>>>>> as I did back in 2013 (!!!).
>>>>>
>>>>> In doing so, I’d like to remove some of the least interesting/worst
>>>>> performing algorithms (Firefly, MLSL, Galileo, the original DE) and replace
>>>>> them with the ones currently available in SciPy - differential_evolution,
>>>>> SHGO and dual_annealing.
>>>>>
>>>>> Everything seems good and dandy, but it appears to me that SHGO does
>>>>> not accept an initial point for the optimization process - which makes the
>>>>> whole “run the optimization from 100 different starting points for each
>>>>> benchmark” a bit moot.
>>>>>
>>>>> I am no expert on SHGO, so maybe there is an alternative way to
>>>>> “simulate” the changing of the starting point for the optimization? Or
>>>>> maybe some other approach to make it consistent across optimizers?
>>>>>
>>>>> Any suggestion is more than welcome.
>>>>>
>>>>> Andrea.
>>>>>


-- 
Stefan Endres (MEng, AMIChemE, BEng (Hons) Chemical Engineering)

Wissenschaftlicher Mitarbeiter: Leibniz Institute for Materials Engineering
IWT, Badgasteiner Straße 3, 28359 Bremen, Germany
Work phone (DE): +49 (0) 421 218 51238
Cellphone (DE): +49 (0) 160 949 86417
Cellphone (ZA): +27 (0) 82 972 42 89
E-mail (work): s.endres at iwt.uni-bremen.de
Website: https://stefan-endres.github.io/