[SciPy-Dev] curve_fit() should require initial values for parameters

Matt Newville newville at cars.uchicago.edu
Thu Jan 24 13:15:32 EST 2019


On Thu, Jan 24, 2019 at 10:26 AM <josef.pktd at gmail.com> wrote:

>
>
> On Wed, Jan 23, 2019 at 11:12 PM Robert Kern <robert.kern at gmail.com>
> wrote:
>
>> For what it's worth, I agree that we should remove the default. There's
>> no generic value that accidentally works often enough to justify the time
>> wasted and novice confusion when it fails.
>>
>> I also agree that it is going to take a reasonably long deprecation cycle.
>>
>> On Wed, Jan 23, 2019 at 5:26 PM Matt Newville <newville at cars.uchicago.edu>
>> wrote:
>>
>>> Hi All,
>>>
>>> First, I apologize in advance if this sounds un-appreciative of the
>>> efforts made in scipy and scipy.optimize.  I am a very big fan, and very
>>> appreciative of the work done here.  With lmfit we have tried to take the
>>> "rough edges" from optimization and curve-fitting with python, but we're
>>> very much in favor of building wrappers on top of the core of
>>> scipy.optimize.
>>>
>>> Still, many people use `curve_fit` because it comes built-in with scipy,
>>> is well advertised, and is well-suited for simple uses.  It is clearly
>>> aimed at novices and tries to hide many of the complexities of using
>>> optimization routines for curve fitting.  I try to help people with
>>> questions about using `curve_fit` and `scipy.optimize` as well as `lmfit`.
>>> In trying to help people using `curve_fit`, I've noticed a disturbing
>>> trend.  When a novice or even experienced user asks for help using
>>> `curve_fit` because a fit "doesn't work" there is a very good chance that
>>> they did not specify `p0` for the initial values and that the default
>>> behavior of setting all starting values to 1.0 will cause their fit to fail
>>> to converge, or really to even start.
>>>
>>> This failure is understandable to an experienced user, but apparently
>>> not to the novice.  Curve-fitting methods are by nature local solvers and
>>> are always sensitive to initial values. For some problems, parameter values
>>> of 1.0 are simply inappropriate, and the numerical derivatives for some
>>> parameters near values of 1.0 will be 0.  Indeed, there is no value that is
>>> always a reasonable starting value for all parameters.  FWIW, in lmfit, we
>>> simply refuse to let a user run a curve-fitting problem without initial
>>> values.  I believe that most other curve-fitting interfaces also require
>>> initial values for all parameters.
>>>
>>> Unfortunately, faced with no initial parameter values, `curve_fit`
>>> silently chooses initial values.  It doesn't try to make an informed
>>> decision; it simply chooses '1.0', which can easily be so far off as to
>>> prevent a solution from being found. When this happens, `curve_fit` gives
>>> no information to the user of what the problem is.  Indeed it allows
>>> initial values to not be set, giving the impression that they are not
>>> important.  This impression is wrong: initial values are always important.
>>> `curve_fit` is mistaken in having a default starting value.
>>>
>>> I've made a Pull Request at https://github.com/scipy/scipy/pull/9701 to
>>> fix this misbehavior, so that `curve_fit()` requires starting values.  It
>>> was suggested there that this topic should be discussed here. I'm happy to
>>> do so.  It was suggested in the github Issue that forcing the user to give
>>> initial values was "annoying".  It was also suggested that a deprecation
>>> cycle would be required. I should say that I don't actually use
>>> `curve_fit()` myself, I'm just trying to help make this commonly used
>>> routine be completely wrong less often.
>>>
>>
>
> I think making initial values compulsory is too much of a break with
> tradition.
>

Well, it may be a break with the tradition of using
scipy.optimize.curve_fit, but I do not think it is a break with the
tradition of curve fitting.
Indeed, what curve_fit does when a user leaves `p0=None` is *not* to leave
the initial values unspecified -- the underlying optimization routine would
simply not accept that -- but rather to silently select values that are all
'1.0'.   I am not aware of any other curve-fitting code or use of
non-linear optimization that does this.  So, in that sense, the silent
default of 1.0 is itself the break with tradition.  It is also wrong.

> IMO, a warning and better documentation would be more appropriate.
>
> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
> does not show an example with starting values.
> curve_fit could issue a warning if p0 is not specified, or warn if
> convergence fails and p0 was not specified.
>
> I think it should also be possible to improve the default starting values,
> e.g. if the function fails or if bounds are provided.
>

I think trying to guess starting values would require an understanding of
the function calculating the model, and is not generally solvable.
For sure, if one knows the function, initial guesses are possible.  Lmfit
has this capability for a few commonly used model functions, and those
initial guesses are often very good.   But it cannot be done in general.
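
For example, with one of lmfit's built-in models the starting values come
from the data itself, which is possible only because the functional form
is known in advance:

    import numpy as np
    from lmfit.models import GaussianModel

    x = np.linspace(-5, 5, 201)
    y = (3.0 * np.exp(-(x - 1.0)**2 / 0.5)
         + np.random.normal(scale=0.05, size=x.size))

    model = GaussianModel()
    # guess() inspects the data to choose starting values for amplitude,
    # center, and sigma -- there is no analog of this for an arbitrary
    # user-supplied function.
    params = model.guess(y, x=x)
    result = model.fit(y, params, x=x)
    print(result.fit_report())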


> I'm not a user of curve_fit, but I guess there might be a strong selection
> bias in use cases when helping out users that run into problems.
> I have no idea what the overall selection of use cases is. A brief
> skimming of some github searches shows that many users don't specify the
> initial values.
> (A semi-random search result
> https://github.com/jmunroe/phys2820-fall2018/blob/e270b1533130b2b7acd0ec5da3edd9262b792da6/Lecture.13-Data-Analysis-and-Curve-Fitting.ipynb
> )
>
> (Asides:
> The feedback I get about statsmodels is almost only for cases that "don't
> work", e.g. badly conditioned data, bad scaling of the data, corner cases.
> Based on this feedback, statsmodels optimization looks pretty bad, but that
> does not reflect the fact that it works well in, say, 90% of the cases.
> However, unlike curve_fit, statsmodels mostly has models with predefined
> nonlinear functions, which makes them easier to fit.
> )
>
>
I do not know what the usage of `curve_fit` is.  Apparently some users get
tripped up by not specifying initial values. But that is in the nature of
curve fitting -- initial values are necessary.  Claiming that they do not
matter or are an unnecessary burden is just not correct.

It seems like the first step is to change `curve_fit` to issue a warning or
print a message (not sure which is preferred) when `p0` is `None`, but to
continue guessing `1.0`, at least for the time being.  Eventually, this
could be changed to raise an exception if `p0` is `None`.
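
In code, that first step might look something like this (a sketch written
as a wrapper, not the actual patch in the PR; `curve_fit_strict` is a
made-up name):

    import warnings
    from scipy.optimize import curve_fit

    def curve_fit_strict(f, xdata, ydata, p0=None, **kws):
        """Hypothetical wrapper sketching the proposed first step."""
        if p0 is None:
            warnings.warn("curve_fit called without initial values; the "
                          "all-1.0 default is deprecated -- please pass p0",
                          DeprecationWarning, stacklevel=2)
            # eventually: raise ValueError("initial values p0 are required")
        return curve_fit(f, xdata, ydata, p0=p0, **kws)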

Perhaps a middle step would be to change it to guess not `1.0` but a random
number, with the mantissa drawn uniformly from [-1, 1] and the exponent
drawn uniformly from the integers in [-20, 20], as long as any bounds are
respected?
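
Something along these lines, where `random_p0` is a made-up name and
whether this actually beats all-1.0 in practice is an open question:

    import numpy as np

    def random_p0(n, lower=-np.inf, upper=np.inf):
        """Hypothetical default: mantissa uniform on [-1, 1], integer
        exponent uniform on [-20, 20], clipped to any given bounds."""
        mantissa = np.random.uniform(-1.0, 1.0, size=n)
        exponent = np.random.randint(-20, 21, size=n)  # upper is exclusive
        return np.clip(mantissa * 10.0**exponent, lower, upper)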

Cheers,

--Matt Newville