[SciPy-Dev] cKDTree

Sylvain Corlay sylvain.corlay at gmail.com
Thu Sep 8 06:02:35 EDT 2016


Hi Ralf,

On Thu, Sep 8, 2016 at 10:18 AM, Ralf Gommers <ralf.gommers at gmail.com>
wrote:

>
> On Wed, Sep 7, 2016 at 6:23 AM, Sylvain Corlay <sylvain.corlay at gmail.com>
> wrote:
>
>> I understand (especially about the scikits), although I was surprised by
>> some recently added features in scipy, such as differential evolution
>>
>> DE is more of a "recipe" that has been studied empirically and for which
>> there is no theoretical convergence rate. Even though it may "work well"
>> for certain problems, this raises issues such as deciding on what basis
>> it should even be "improved", and what the foundation for a change should
>> be. (Recently a result on convergence in probability (with no error rate)
>> on a bounded domain was published, but no result on the speed of
>> convergence exists, AFAIK.)
>>
>
>> Evolutionary optimization is definitely cool and may work strikingly well
>> for certain problems, but I was surprised that it was selected for
>> inclusion in scipy. DE would have been a nice seed for a scikit-evolution.
>>
>> For those interested in stochastic algorithms for global multivariate
>> optimization for which we have proven convergence results and an abundant
>> literature, there are at least the two following categories of methods:
>>
>>  - *MCMC* (Markov Chain Monte Carlo) methods, which comprise simulated
>> annealing and for which there are nice results of convergence in
>> distribution.
>>  - *Robbins-Monro* methods, which comprise stochastic gradient methods
>> and for which we have almost-sure convergence and central-limit-type
>> theorems (see the sketch just below).
>>
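>> To make the second category concrete, here is a minimal Robbins-Monro
>> sketch on a toy problem. The step sizes a_n = a0 / n satisfy the
>> classical conditions (the sum of a_n diverges, the sum of a_n**2
>> converges) that yield almost-sure convergence; the toy problem and
>> constants are made up for illustration:
>>
>>     import numpy as np
>>
>>     def robbins_monro(noisy_grad, x0, n_iter=10000, a0=1.0):
>>         # Stochastic approximation: x_{n+1} = x_n - a_n * H(x_n, xi_n),
>>         # with steps a_n = a0 / n.
>>         x = float(x0)
>>         for n in range(1, n_iter + 1):
>>             x -= (a0 / n) * noisy_grad(x)
>>         return x
>>
>>     # Toy problem: minimize E[(x - xi)**2] / 2 with xi ~ N(1, 1).
>>     # The noisy gradient is x - xi, and the minimizer is x* = E[xi] = 1.
>>     np.random.seed(0)
>>     print(robbins_monro(lambda x: x - np.random.normal(1.0, 1.0), x0=5.0))
>>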
>
> I think it's fair to summarize the reason for the current status and
> inclusion of DE as: theoretical convergence rates and papers are nice, but
> code that actually works on a good set of benchmarks is way more important.
>
> Some years ago we had only bad global optimizers. We did have simulated
> annealing, but it just didn't work for most problems. No one stepped up to
> improve it, so we deprecated and removed it.
>
> DE was benchmarked quite thoroughly (on the benchmark functions from
> benchmarks/benchmarks/test_go_benchmark_functions.py) and came out
> looking good. It solved problems that the only other good optimizer we had
> (basinhopping) did not do well on. So that means there was value in adding
> it.
>
> Cheers,
> Ralf
>
>
It does raise the question of what counts as an "improvement" for a DE
optimizer, since its justification is only empirical. Is an improvement
something that makes it work better on a fixed set of examples? Is it
something that brings it closer to the seminal article describing it, or to
a later, more general description of the approach?
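For concreteness, "works better on a fixed set of examples" means runs like
the following minimal sketch, which exercises scipy.optimize.
differential_evolution on the standard Rastrigin benchmark. The function
definition, bounds, and seed here are my own illustration, not scipy's
benchmark harness:

    import numpy as np
    from scipy.optimize import differential_evolution

    def rastrigin(x):
        # Standard multimodal benchmark; global minimum is 0 at the origin.
        x = np.asarray(x)
        return 10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))

    result = differential_evolution(rastrigin,
                                    bounds=[(-5.12, 5.12)] * 2, seed=1)
    print(result.x, result.fun)  # expected: close to [0, 0] and 0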

DE was added to Scipy at almost the same time as a first implementation of
a simple LP solver. In the case of the LP solver, bug reports can easily be
triaged: "I have this reasonably sized and well-conditioned LP problem; the
solver says it is infeasible, yet this set of values is feasible" ->
clear bug.
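A minimal sketch, with made-up numbers, of how such a report can be checked
mechanically using scipy.optimize.linprog:

    import numpy as np
    from scipy.optimize import linprog

    # Minimize c @ x subject to A_ub @ x <= b_ub and (by default) x >= 0.
    c = np.array([1.0, 2.0])
    A_ub = np.array([[-1.0, -1.0]])  # encodes x0 + x1 >= 1
    b_ub = np.array([-1.0])

    res = linprog(c, A_ub=A_ub, b_ub=b_ub)

    # The point [1, 0] satisfies every constraint, so a report of
    # "infeasible" (res.status == 2) for this problem would be a clear bug.
    x_claimed = np.array([1.0, 0.0])
    assert np.all(A_ub.dot(x_claimed) <= b_ub) and np.all(x_claimed >= 0)
    print(res.status, res.x)  # status 0 means the solver succeeded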

I find it a bit worrying that theoretical bounds or rates were never part
of the discussion when electing a method for inclusion in Scipy. I would
have thought that some scientific foundation would be a requirement.

Just imagine: I have a new uniformly filling sequence, but no proof that it
is pseudo-random, I do not even know the discrepancy of the sequence, only
a bunch of examples for which "it works"... Well, I doubt that anyone would
want to use it for Monte Carlo simulation, cryptography, etc.
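To see why the discrepancy matters, here is a hypothetical sketch comparing
a textbook two-dimensional Halton sequence (a construction with a proven
low discrepancy of order log(n)**2 / n) against plain pseudo-random points
on a toy integral; the test function and sample size are made up for
illustration:

    import numpy as np

    def van_der_corput(n, base):
        # Radical-inverse (van der Corput) sequence in the given base.
        seq = np.zeros(n)
        for i in range(n):
            f, k = 1.0, i + 1
            while k > 0:
                f /= base
                seq[i] += f * (k % base)
                k //= base
        return seq

    def halton_2d(n):
        # Two-dimensional Halton sequence (bases 2 and 3).
        return np.column_stack([van_der_corput(n, 2),
                                van_der_corput(n, 3)])

    # Toy integral: the integral of x * y over [0, 1]^2 is exactly 0.25.
    f = lambda p: p[:, 0] * p[:, 1]
    n = 4096
    np.random.seed(0)
    print(abs(f(halton_2d(n)).mean() - 0.25))          # quasi-MC error
    print(abs(f(np.random.rand(n, 2)).mean() - 0.25))  # pseudo-random error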

In any case, I find this question of the need for a scientific
justification worth answering in general, especially in the context of the
discussions on scope and the 1.0 release.

Sylvain