[SciPy-Dev] curve_fit() should require initial values for parameters

Robert Kern robert.kern at gmail.com
Wed Jan 23 23:12:19 EST 2019


For what it's worth, I agree that we should remove the default. There's no
generic value that accidentally works often enough to justify the time
wasted and novice confusion when it fails.

I also agree that it is going to take a reasonably long deprecation cycle.

On Wed, Jan 23, 2019 at 5:26 PM Matt Newville <newville at cars.uchicago.edu>
wrote:

> Hi All,
>
> First, I apologize in advance if this sounds unappreciative of the
> efforts made in scipy and scipy.optimize.  I am a very big fan, and very
> appreciative of the work done here.  With lmfit we have tried to take the
> "rough edges" from optimization and curve-fitting with python, but we're
> very much in favor of building wrappers on top of the core of
> scipy.optimize.
>
> Still, many people use `curve_fit` because it comes built-in with scipy,
> is well advertised, and is well-suited for simple uses.  It is clearly
> aimed at novices and tries to hide many of the complexities of using
> optimization routines for curve fitting.  I try to help people with
> questions about using `curve_fit` and `scipy.optimize` as well as `lmfit`.
> In trying to help people using `curve_fit`, I've noticed a disturbing
> trend.  When a novice or even experienced user asks for help using
> `curve_fit` because a fit "doesn't work" there is a very good chance that
> they did not specify `p0` for the initial values and that the default
> behavior of setting all starting values to 1.0 will cause their fit to fail
> to converge, or really to even start.
>
> This failure is understandable to an experienced user, but apparently not
> to the novice.  Curve-fitting routines are by nature local solvers and are
> always sensitive to initial values. For some problems, parameter values of
> 1.0 are simply inappropriate, and the numerical derivatives for some
> parameters near values of 1.0 will be 0.  Indeed, there is no value that is
> always a reasonable starting value for all parameters.  FWIW, in lmfit, we
> simply refuse to let a user run a curve-fitting problem without initial
> values.  I believe that most other curve-fitting interfaces also require
> initial values for all parameters.
>
> Unfortunately, faced with no initial parameter values, `curve_fit`
> silently chooses initial values.  It doesn't try to make an informed
> decision; it simply chooses '1.0', which can easily be so far off as to
> prevent a solution from being found. When this happens, `curve_fit` gives
> no information to the user of what the problem is.  Indeed it allows
> initial values to not be set, giving the impression that they are not
> important.  This impression is wrong: initial values are always important.
> `curve_fit` is mistaken in having a default starting value.
>
> I've made a Pull Request at https://github.com/scipy/scipy/pull/9701 to
> fix this misbehavior, so that `curve_fit()` requires starting values.  It
> was suggested there that this topic should be discussed here. I'm happy to
> do so.  It was suggested in the github Issue that forcing the user to give
> initial values was "annoying".  It was also suggested that a deprecation
> cycle would be required. I should say that I don't actually use
> `curve_fit()` myself; I'm just trying to help make this commonly used
> routine be completely wrong less often.
>
> Cheers,
>
> --Matt Newville
>

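A minimal sketch of the failure mode described in the quoted message, for
anyone who wants to reproduce it. The Gaussian model, the synthetic data, and
the explicit starting guess are assumptions made up for illustration; none of
them come from the thread or from the pull request. It relies only on the
behavior discussed above, namely that `curve_fit` starts every parameter at
1.0 when `p0` is omitted.

```python
import warnings

import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amp, cen, wid):
    """Gaussian peak: amp * exp(-(x - cen)**2 / (2 * wid**2))."""
    return amp * np.exp(-(x - cen) ** 2 / (2.0 * wid ** 2))


# Noise-free synthetic data; the "true" values are invented for this sketch.
x = np.linspace(40.0, 60.0, 201)
true_params = (3.0, 50.0, 2.0)
y = gaussian(x, *true_params)

# At the implicit start (all parameters 1.0) the peak sits near x = 1, so
# the model underflows to exactly 0.0 everywhere on the sampled range.
ones = np.ones(3)
model_at_ones = gaussian(x, *ones)

# A small step in any single parameter leaves the model unchanged, so a
# finite-difference Jacobian is all zeros and the solver has no direction.
step = 1e-6
flat = all(
    np.array_equal(gaussian(x, *(ones + step * np.eye(3)[i])), model_at_ones)
    for i in range(3)
)

# Fit without p0. Whether this warns, raises, or quietly returns the start
# values may depend on the SciPy version, so the outcome is only captured.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        popt_default, _ = curve_fit(gaussian, x, y)
    except RuntimeError:
        popt_default = None

# Fit with a rough but sensible starting guess (hypothetical values).
popt_guess, _ = curve_fit(gaussian, x, y, p0=[2.5, 49.0, 3.0])

print("model is flat at p = 1:", flat)
print("default start result:  ", popt_default)
print("explicit start result: ", popt_guess)
```

The starting guess here is deliberately imprecise; the point is only that it
puts the peak inside the data range, which the all-ones start does not.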

-- 
Robert Kern