[SciPy-Dev] Introducing scipy.optimization.tgo; a global optimization routine.

Brian Scannell brianjscannell at gmail.com
Tue Apr 5 16:35:10 EDT 2016


Regardless of TGO, it seems like Sobol sequences could be a good
contribution. Josef Perktold was asking about low discrepancy sequences a
few years back on this list:
https://mail.scipy.org/pipermail/scipy-user/2013-June/034744.html and wrote
a blog post talking about them:
http://jpktd.blogspot.com/2013/06/quasi-random-numbers-with-halton.html

How do people feel about something like scipy.stats.lowdiscrepancy(samples,
dimensions, method='sobol') or similar approach? I have code for the Halton
and Hammersley sequences written in Cython that I'd like to contribute to
SciPy as well.

Thanks,
Brian


On Mon, Apr 4, 2016 at 1:21 PM Jacob Stevenson <jstevenson131 at gmail.com>
wrote:

> I agree that the algorithm looks promising (though I also don't have
> access to the papers).  If I understand correctly, a high level summary is
>
> 1. sample points from all possible solutions (e.g. sobol points or a
> simple grid)
> 2. discard points that don't look promising
> 3. perform local optimization on the remaining points.
>
> And I guess the graph theoretic / clustering cleverness comes in step 2.
>  (Something along the lines of "keep only points where the K nearest
> neighbors all have larger function values", if i'm not mistaken.)
>
> Here are a few thoughts I have, please chime in if I'm making bad
> assumptions.
>
> It starts with a grid-like search, and as such probably doesn't scale to
> very high dimensions.  But I think that's fine, it would still be
> applicable to many (most?) applications.
>
> It's nice that the bounds implementation seems to be fairly robust.  That
> is currently a weakness of basinhopping.
>
> I'm curious how the automatic selection of hyper parameters works.  I
> guess we'll find out in Andrea's benchmarks.
>
> People would probably find it useful to have a standalone implementation
> of sobal sequences.  Maybe it would fit in scipy.spatial?
>
> In general, I think the approach is nicely complementary with basinhopping
> and differential evolution.  I think it could be a good fit for scipy.
>
> Thanks for the contribution!
> Jacob
>
>
> On Mon, 4 Apr 2016 at 16:45 Andrea Gavana <andrea.gavana at gmail.com> wrote:
>
>> Hi Stefan,
>>
>>     The algorithm seems very interesting. I haven't gone through the
>> papers (mostly because I don't have access to those journals 😊), so I
>> can't really say I understand the theoretical basis of it...
>>
>> However, I believe it would be nice to put TGO through the grinder as I
>> did in the past with many other algorithms:
>>
>> http://infinity77.net/global_optimization/multidimensional.html
>>
>> And take a look to the overall performances. I'm always on the lookout
>> for GO algorithms available in Python (and actually running on Windows as
>> well...), so I'll try to find some time soon enough to put TGO as well on
>> the tables and pictures.
>>
>> Thank you for sharing!
>>
>> Andrea.
>>
>>
>> On Monday, 4 April 2016, Stefan Endres <stefan.c.endres at gmail.com> wrote:
>>
>>> Hello all,
>>>
>>> I’d like to present this implementation of the Topographical Global
>>> Optimisation (TGO) method, a powerfull global optimisation method first
>>> proposed by Törn (1990) [1]. The implementation is written entirely in
>>> python using mostly boolean numpy arrays to implement the core algorithm
>>> and using scipy.optimize.minimize to find local minima.
>>>
>>> https://github.com/Stefan-Endres/tgo
>>>
>>> TGO is a clustering method that uses graph theory to generate good
>>> starting points for local search methods from points distributed uniformly
>>> in the interior of the feasible set. These points are currently generated
>>> using the Sobol (1967) [3] sequence. The semi-empirical correlation by
>>> Hendorson et. al. (2015) [2] for k integer defining the k-t matrix. The
>>> latter publication used an (unpublished as far as I can tell) C/C++
>>> implementation which was comparatively succesful against commercial
>>> software (MIDACO 4.0).
>>>
>>> More details can be found in the docstrings in
>>> https://github.com/Stefan-Endres/tgo
>>>
>>> Some features in brief.
>>>
>>>    - Global optimisation of multivariate functions with
>>>       - Bounded or unbounded variables.
>>>       - Optional non-linear constraints.
>>>    - Returns an ordered list of all local minima with the res.xl
>>>    attribute and corresponding function values res.fun (this is
>>>    something that I’ve found very useful in my own research).
>>>    - Highly customisable to increase performance on difficult
>>>    optimisation problems.
>>>    - The functions were written entirely in python based on algorithms
>>>    available in open literature [2]
>>>
>>> As a quick visual illustration of the algorithm’s capabilities, a few
>>> results for common optimisation problems and their results:
>>> http://imgur.com/a/'ll MS0hG <http://imgur.com/a/MS0hG>
>>>
>>> I’ve also run a comparison of tgo with the other two global optimisation
>>> currently available.
>>>
>>> http://pastebin.com/8K72Xve8
>>>
>>> Note in all cases tgo found the correct answer, details on my system can
>>> be found at the end of the pastebin (it’s a terminal session dump). Also I
>>> apologise I’m not very familiar with the basinhopping algorithm so I didn’t
>>> know how many iterations where optimal for the problems and stuck with the
>>> default.
>>>
>>> Details on the functions used and their references can be found in the
>>> docstrings in the unittest script:
>>> https://github.com/Stefan-Endres/tgo/blob/master/test__tgo.py
>>> Potential issues
>>>
>>> The most important barrier I forsee is the large file size of the
>>> direction numbers used in generating the Sobol sampling points (
>>> new-joe-kuo-6.21201
>>> <https://github.com/Stefan-Endres/tgo/blob/master/new-joe-kuo-6.21201>)
>>> at 1.8MiB. The code to generate the Sobol points itself is not an issue,
>>> it’s a small script that was translated to Python by one of my advisers for
>>> the project (repository: https://bitbucket.org/alchemyst/sobol/src)
>>> from the original C++ code under a scipy compatible BSD style liscence (
>>> http://web.maths.unsw.edu.au/~fkuo/sobol/).
>>>
>>> I suggest the following solutions to this:
>>>
>>>    - Use latin hypercube sampling (LHS) to generate the initial points
>>>    instead. LHS is already used in _differentialevolution.py (I’ve
>>>    tried to do this, but I’m not sure how without initiating the entire
>>>    DifferentialEvolutionSolver object).
>>>    - Trim the data file to only including Sobol sequence sampling for
>>>    problems of up to 1000 dimensions. It’s easy to write something to download
>>>    higher dimensional data when a user specifies problems higher than 1000
>>>    dimensions.
>>>
>>> I think the sobol_points function is more appropriate numpy.random in
>>> the first from where it can be imported to tgo, but I'm not sure if it's
>>> wanted there.
>>>
>>> Everything else was written from scratch so there shouldn’t be any
>>> licensing issues.
>>>
>>> I’ve tried to follow the guidelines given in
>>> http://docs.scipy.org/doc/scipy/reference/hacking.html as close as
>>> possible with this repository:
>>> https://github.com/Stefan-Endres/scipy/branches
>>>
>>> It’s almost ready for a PR, in particular I don’t really know where/how
>>> to do the following checklists.
>>>
>>>    - Is the new functionality tagged with .. versionadded:: X.Y.Z (with
>>>    X.Y.Z the version number of the next release - can be found in setup.py)?
>>>    - Is the new functionality mentioned in the release notes of the
>>>    next release?
>>>    - Is the new functionality added to the reference guide?
>>>
>>> References
>>>
>>>    1. Törn, A (1990) “Topographical global optimization”, Reports on
>>>    Computer Science and Mathematics Ser. A, No 199, 8p. Abo Akademi
>>>    University, Sweden)
>>>    2. Henderson, N et. al (2015) “A new look at the topographical
>>>    global optimization method and its application to the phase stability
>>>    analysis of mixtures”, Chemical Engineering Science, 127, 151-174
>>>    3. Sobol, IM (1967) “The distribution of points in a cube and the
>>>    approximate evaluation of integrals. USSR Comput. Math. Math. Phys. 7,
>>>    86-112
>>>
>>>
>>>
>>>
>>> If you wish to learn more about the method I highly recommend Hendorson
>>> et. al.’s (2015) article which presents the material in an introductory
>>> fashion.
>>>
>>> Please let me know what you guys think! I will keep working on this if
>>> you think it’s something that could be included in scipy.
>>>
>>> Thank you for reading.
>>>
>>> Regards,
>>> -
>>> Stefan Endres
>>> Postgraduate Student: Institute of Applied Materials
>>> Department of Chemical Engineering, University of Pretoria
>>> Cell: +27 (0) 82 972 42 89
>>> E-mail: Stefan.C.Endres at gmail.com
>>> St. Number: 11004968
>>>>>>
>>
>>
>> --
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> https://mail.scipy.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> https://mail.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20160405/2a7f35db/attachment.html>


More information about the SciPy-Dev mailing list