[SciPy-Dev] cKDTree

Ralf Gommers ralf.gommers at gmail.com
Thu Sep 8 14:52:17 EDT 2016


On Thu, Sep 8, 2016 at 10:02 PM, Sylvain Corlay <sylvain.corlay at gmail.com>
wrote:

> Hi Ralf,
>
>
> On Thu, Sep 8, 2016 at 10:18 AM, Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
>
>>
>> On Wed, Sep 7, 2016 at 6:23 AM, Sylvain Corlay <sylvain.corlay at gmail.com>
>> wrote:
>>
>>> I understand (especially about the scikits), although I was surprised by
>>> some recently added features in scipy, such as differential evolution
>>>
>>> DE is more of a  "recipe" which has been studied empirically for which
>>> there is no theoretical convergence rate. Even though it may "work well"
>>> for certain problems, it causes some issues such as defining on which basis
>>> it should even be "improved", and what should be the foundation for a
>>> change. (Recently a result on convergence in probability (with no error
>>> rate) on a bounded domain has been published, but no result exists on the
>>> speed of convergence afaik...)
>>>
>>
>>> Evolutionary optimization is definitely cool and may work strikingly
>>> well for certain problems, but I was surprised that it got elected to
>>> inclusion in scipy. DE would have been a nice seed for a scikit-evolution.
>>>
>>> For those interested in stochastic algorithms for global multivariate
>>> optimization for which we have proven convergence results and an abundant
>>> literature, we have at least the two following categories of methods
>>>
>>>  - *MCMC* (Markov Chain Monte Carlo) which comprises simulated
>>> annealing, for which there are nice results of convergence in distribution.
>>>  - *Robbins-Monro* methods, which comprise stochastic gradient methods,
>>> for which we have almost-sure convergence and central-limit - type theorems.
>>>
>>
>> I think it's fair to summarize the reason for the current status and
>> inclusion of DE as: theoretical convergence rates and papers are nice, but
>> code that actually works on a good set of benchmarks is way more important.
>>
>> Some years ago we had only bad global optimizers. We did have simulated
>> annealing, but it just didn't work for most problems. No one stepped up to
>> improve it, so we deprecated and removed it.
>>
>> DE was benchmarked quite thoroughly (on the benchmark functions from
>> benchmarks/benchmarks/test_go_benchmark_functions.py) and came out
>> looking good. It solved problems that the only other good optimizer we had
>> (basinhopping) did not do well on. So that means there was value in adding
>> it.
>>
>> Cheers,
>> Ralf
>>
>>
> It does raise the question on what to define as an "improvement" for a DE
> optimizer since it is only empirical. Is an improvement something that
> makes it work better on a fixed set of examples?
>

Unless the set of examples is too small or the improvement is tailored for
those examples (which requires a bit of judgement by the reviewers), yes.


> Is it something that makes it closer to the seminal article describing it
> or a later more general description of this approach?
>

That has to be judged on a case by case basis I'd say.


>
> I find this a bit worrying that it was never question of theoretical
> bounds or rates for a method to be elected as part of Scipy. I would have
> thought that some scientific foundation was a requirement.
>

There is a large body of peer-reviewed work for DE, and Google Scholar
tells me that the original Storn and Price paper has 11652 citations. So
that very very clearly meets any requirement for a scientific foundation we
have.

I'm not sure if your questioning of reasons for including DE has anything
to do with your original question of adding new methods to KDTree, but on
that topic: I would suggest to sketch the picture of pros and cons for
Jake's suggestion of a scikit. How wide would the scope of that scikit
potentially be, is there a smaller subset of methods that would make sense
for scipy. Yu Feng and Sturla Molden have been improving cKDTree pretty
much every release, so it's certainly not code that is defined as
maintenance-only.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20160909/24e41f2a/attachment.html>


More information about the SciPy-Dev mailing list