[SciPy-Dev] curve_fit() should require initial values for parameters

josef.pktd at gmail.com
Thu Jan 24 11:26:09 EST 2019


On Wed, Jan 23, 2019 at 11:12 PM Robert Kern <robert.kern at gmail.com> wrote:

> For what it's worth, I agree that we should remove the default. There's no
> generic value that accidentally works often enough to justify the time
> wasted and novice confusion when it fails.
>
> I also agree that it is going to take a reasonably long deprecation cycle.
>
> On Wed, Jan 23, 2019 at 5:26 PM Matt Newville <newville at cars.uchicago.edu>
> wrote:
>
>> Hi All,
>>
>> First, I apologize in advance if this sounds un-appreciative of the
>> efforts made in scipy and scipy.optimize.  I am a very big fan, and very
>> appreciative of the work done here.  With lmfit we have tried to take the
>> "rough edges" from optimization and curve-fitting with python, but we're
>> very much in favor of building wrappers on top of the core of
>> scipy.optimize.
>>
>> Still, many people use `curve_fit` because it comes built-in with scipy,
>> is well advertised, and is well-suited for simple uses.  It is clearly
>> aimed at novices and tries to hide many of the complexities of using
>> optimization routines for curve fitting.  I try to help people with
>> questions about using `curve_fit` and `scipy.optimize` as well as `lmfit`.
>> In trying to help people using `curve_fit`, I've noticed a disturbing
>> trend.  When a novice or even experienced user asks for help using
>> `curve_fit` because a fit "doesn't work" there is a very good chance that
>> they did not specify `p0` for the initial values and that the default
>> behavior of setting all starting values to 1.0 will cause their fit to fail
>> to converge, or really to even start.
>>
>> This failure is understandable to an experienced user, but apparently not
>> to the novice.  Curve-fitting methods are by nature local solvers and are
>> always sensitive to initial values. For some problems, parameter values of
>> 1.0 are simply inappropriate, and the numerical derivatives for some
>> parameters near values of 1.0 will be 0.  Indeed, there is no value that is
>> always a reasonable starting value for all parameters.  FWIW, in lmfit, we
>> simply refuse to let a user run a curve-fitting problem without initial
>> values.  I believe that most other curve-fitting interfaces also require
>> initial values for all parameters.
>>
>> Unfortunately, faced with no initial parameter values, `curve_fit`
>> silently chooses initial values.  It doesn't try to make an informed
>> decision, it simply chooses '1.0', which can easily be so far off as to
>> prevent a solution from being found. When this happens, `curve_fit` gives
>> no information to the user of what the problem is.  Indeed it allows
>> initial values to not be set, giving the impression that they are not
>> important.  This impression is wrong: initial values are always important.
>> `curve_fit` is mistaken in having a default starting value.
>>
>> I've made a Pull Request at https://github.com/scipy/scipy/pull/9701 to
>> fix this misbehavior, so that `curve_fit()` requires starting values.  It
>> was suggested there that this topic should be discussed here. I'm happy to
>> do so.  It was suggested in the github Issue that forcing the user to give
>> initial values was "annoying".  It was also suggested that a deprecation
>> cycle would be required. I should say that I don't actually use
>> `curve_fit()` myself, I'm just trying to help make this commonly used
>> routine be completely wrong less often.
>>
>
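
To make the failure mode concrete, here is a minimal sketch with made-up
data (a Gaussian peak far from the origin). With the default p0 of all ones
the model is essentially zero over the data range, so the solver sees almost
no gradient for the center parameter; a rough eyeball guess converges fine:

import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, cen, wid):
    return amp * np.exp(-(x - cen)**2 / (2 * wid**2))

np.random.seed(0)
x = np.linspace(0, 200, 201)
y = gaussian(x, 5.0, 120.0, 8.0) + np.random.normal(scale=0.1, size=x.size)

# Default p0 = (1, 1, 1): the peak is assumed near x=1, far from the data,
# so the fit stalls or returns nonsense.
try:
    popt_bad, _ = curve_fit(gaussian, x, y)
    print("default p0:", popt_bad)   # typically far from (5, 120, 8)
except RuntimeError as err:          # "Optimal parameters not found"
    print("default p0 failed:", err)

# A rough guess is all the local solver needs.
popt, _ = curve_fit(gaussian, x, y, p0=(4.0, 100.0, 10.0))
print("explicit p0:", popt)          # close to (5, 120, 8)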

I think making initial values compulsory is too much of a break with
tradition.
IMO, a warning and better documentation would be more appropriate.
The docstring at
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
does not show an example with starting values.
curve_fit could issue a warning whenever p0 is not specified, or only warn
when convergence fails and p0 was not specified.
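
A rough sketch of what such a warning could look like, written here as a
user-side wrapper (the wrapper name is made up, this is not scipy API):

import warnings
from scipy.optimize import curve_fit

def curve_fit_with_warning(f, xdata, ydata, p0=None, **kwargs):
    # Hypothetical wrapper: warn when the caller relies on the implicit
    # all-ones starting values.
    if p0 is None:
        warnings.warn(
            "p0 not given; curve_fit starts every parameter at 1.0, which "
            "may be far from a reasonable value. Pass p0 explicitly.",
            UserWarning, stacklevel=2)
    return curve_fit(f, xdata, ydata, p0=p0, **kwargs)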

I think it should also be possible to improve the default starting values,
e.g. fall back to different values when the function cannot be evaluated at
the defaults, or when bounds are provided.
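
For instance, when bounds are given, the starting values could at least be
placed inside the feasible region instead of pinned at 1.0. A minimal sketch
of such a heuristic (hypothetical helper, not current scipy behaviour):

import numpy as np

def p0_from_bounds(lower, upper):
    # Hypothetical heuristic: midpoint of finite bounds, one unit inside a
    # one-sided bound, and the current default of 1.0 when unbounded.
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    p0 = np.ones_like(lower)
    both = np.isfinite(lower) & np.isfinite(upper)
    p0[both] = 0.5 * (lower[both] + upper[both])
    lo_only = np.isfinite(lower) & ~np.isfinite(upper)
    p0[lo_only] = lower[lo_only] + 1.0
    hi_only = ~np.isfinite(lower) & np.isfinite(upper)
    p0[hi_only] = upper[hi_only] - 1.0
    return p0

print(p0_from_bounds([0, 50, -np.inf], [10, np.inf, np.inf]))  # [ 5. 51.  1.]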

I'm not a user of curve_fit, but I guess there might be a strong selection
bias in the use cases one sees when helping out users who run into problems;
I have no idea what the overall distribution of use cases is. A brief skim of
some GitHub searches shows that many users don't specify the initial values.
(A semi-random search result
https://github.com/jmunroe/phys2820-fall2018/blob/e270b1533130b2b7acd0ec5da3edd9262b792da6/Lecture.13-Data-Analysis-and-Curve-Fitting.ipynb
)

(Asides:
The feedback I get about statsmodels is almost only for cases that "don't
work", e.g. badly conditioned data, bad scaling of the data, corner cases.
Based on this feedback statsmodels' optimization looks pretty bad, but that
does not reflect the fact that it works well in, say, 90% of the cases.
However, unlike curve_fit, statsmodels mostly has models with predefined
nonlinear functions, which makes them easier to fit.
)

>
>> Cheers,
>>
>> --Matt Newville
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at python.org
>> https://mail.python.org/mailman/listinfo/scipy-dev
>>
>
>
> --
> Robert Kern
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>