[SciPy-Dev] Introducing scipy.optimization.tgo; a global optimization routine.

Mon Apr 4 10:54:35 EDT 2016

Hello all,

I’d like to present this implementation of the Topographical Global
Optimisation (TGO) method, a powerfull global optimisation method first
proposed by Törn (1990) [1]. The implementation is written entirely in
python using mostly boolean numpy arrays to implement the core algorithm
and using scipy.optimize.minimize to find local minima.

https://github.com/Stefan-Endres/tgo

TGO is a clustering method that uses graph theory to generate good starting
points for local search methods from points distributed uniformly in the
interior of the feasible set. These points are currently generated using
the Sobol (1967) [3] sequence. The semi-empirical correlation by Hendorson
et. al. (2015) [2] for k integer defining the k-t matrix. The latter
publication used an (unpublished as far as I can tell) C/C++ implementation
which was comparatively succesful against commercial software (MIDACO 4.0).

More details can be found in the docstrings in
https://github.com/Stefan-Endres/tgo

Some features in brief.

   - Global optimisation of multivariate functions with
      - Bounded or unbounded variables.
      - Optional non-linear constraints.
   - Returns an ordered list of all local minima with the res.xl attribute
   and corresponding function values res.fun (this is something that I’ve
   found very useful in my own research).
   - Highly customisable to increase performance on difficult optimisation
   problems.
   - The functions were written entirely in python based on algorithms
   available in open literature [2]

As a quick visual illustration of the algorithm’s capabilities, a few
results for common optimisation problems and their results:
http://imgur.com/a/'ll MS0hG <http://imgur.com/a/MS0hG>

I’ve also run a comparison of tgo with the other two global optimisation
currently available.

http://pastebin.com/8K72Xve8

Note in all cases tgo found the correct answer, details on my system can be
found at the end of the pastebin (it’s a terminal session dump). Also I
apologise I’m not very familiar with the basinhopping algorithm so I didn’t
know how many iterations where optimal for the problems and stuck with the
default.

Details on the functions used and their references can be found in the
docstrings in the unittest script:
https://github.com/Stefan-Endres/tgo/blob/master/test__tgo.py
Potential issues

The most important barrier I forsee is the large file size of the direction
numbers used in generating the Sobol sampling points (new-joe-kuo-6.21201
<https://github.com/Stefan-Endres/tgo/blob/master/new-joe-kuo-6.21201>) at
1.8MiB. The code to generate the Sobol points itself is not an issue, it’s
a small script that was translated to Python by one of my advisers for the
project (repository: https://bitbucket.org/alchemyst/sobol/src) from the
original C++ code under a scipy compatible BSD style liscence (
http://web.maths.unsw.edu.au/~fkuo/sobol/).

I suggest the following solutions to this:

   - Use latin hypercube sampling (LHS) to generate the initial points
   instead. LHS is already used in _differentialevolution.py (I’ve tried to
   do this, but I’m not sure how without initiating the entire
   DifferentialEvolutionSolver object).
   - Trim the data file to only including Sobol sequence sampling for
   problems of up to 1000 dimensions. It’s easy to write something to download
   higher dimensional data when a user specifies problems higher than 1000
   dimensions.

I think the sobol_points function is more appropriate numpy.random in the
first from where it can be imported to tgo, but I'm not sure if it's wanted
there.

Everything else was written from scratch so there shouldn’t be any
licensing issues.

I’ve tried to follow the guidelines given in
http://docs.scipy.org/doc/scipy/reference/hacking.html as close as possible
with this repository:
https://github.com/Stefan-Endres/scipy/branches

It’s almost ready for a PR, in particular I don’t really know where/how to
do the following checklists.

   - Is the new functionality tagged with .. versionadded:: X.Y.Z (with
   X.Y.Z the version number of the next release - can be found in setup.py)?
   - Is the new functionality mentioned in the release notes of the next
   release?
   - Is the new functionality added to the reference guide?

References

   1. Törn, A (1990) “Topographical global optimization”, Reports on
   Computer Science and Mathematics Ser. A, No 199, 8p. Abo Akademi
   University, Sweden)
   2. Henderson, N et. al (2015) “A new look at the topographical global
   optimization method and its application to the phase stability analysis of
   mixtures”, Chemical Engineering Science, 127, 151-174
   3. Sobol, IM (1967) “The distribution of points in a cube and the
   approximate evaluation of integrals. USSR Comput. Math. Math. Phys. 7,
   86-112

If you wish to learn more about the method I highly recommend Hendorson et.
al.’s (2015) article which presents the material in an introductory fashion.

Please let me know what you guys think! I will keep working on this if you
think it’s something that could be included in scipy.

Thank you for reading.

Regards,
-
Stefan Endres
Postgraduate Student: Institute of Applied Materials
Department of Chemical Engineering, University of Pretoria
Cell: +27 (0) 82 972 42 89
E-mail: Stefan.C.Endres at gmail.com
St. Number: 11004968

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20160404/928055b6/attachment.html>