[SciPy-Dev] curve_fit() should require initial values for parameters

Thu Jan 24 13:51:14 EST 2019

On Thu, Jan 24, 2019 at 1:16 PM Matt Newville <newville at cars.uchicago.edu>
wrote:

>
>
> On Thu, Jan 24, 2019 at 10:26 AM <josef.pktd at gmail.com> wrote:
>
>>
>>
>> On Wed, Jan 23, 2019 at 11:12 PM Robert Kern <robert.kern at gmail.com>
>> wrote:
>>
>>> For what it's worth, I agree that we should remove the default. There's
>>> no generic value that accidentally works often enough to justify the time
>>> wasted and novice confusion when it fails.
>>>
>>> I also agree that it is going to take a reasonably long deprecation
>>> cycle.
>>>
>>> On Wed, Jan 23, 2019 at 5:26 PM Matt Newville <
>>> newville at cars.uchicago.edu> wrote:
>>>
>>>> Hi All,
>>>>
>>>> First, I apologize in advance if this sounds unappreciative of the
>>>> efforts made in scipy and scipy.optimize.  I am a very big fan, and very
>>>> appreciative of the work done here.  With lmfit we have tried to take the
>>>> "rough edges" from optimization and curve-fitting with python, but we're
>>>> very much in favor of building wrappers on top of the core of
>>>> scipy.optimize.
>>>>
>>>> Still, many people use `curve_fit` because it comes built-in with
>>>> scipy, is well advertised, and is well-suited for simple uses.  It is
>>>> clearly aimed at novices and tries to hide many of the complexities of
>>>> using optimization routines for curve fitting.  I try to help people with
>>>> questions about using `curve_fit` and `scipy.optimize` as well as `lmfit`.
>>>> In trying to help people using `curve_fit`, I've noticed a disturbing
>>>> trend.  When a novice or even experienced user asks for help using
>>>> `curve_fit` because a fit "doesn't work" there is a very good chance that
>>>> they did not specify `p0` for the initial values and that the default
>>>> behavior of setting all starting values to 1.0 will cause their fit to fail
>>>> to converge, or really to even start.
>>>>
>>>> This failure is understandable to an experienced user, but apparently
>>>> not to the novice.  Curve-fitting routines are by nature local solvers and
>>>> are always sensitive to initial values. For some problems, parameter values
>>>> of 1.0 are simply inappropriate, and the numerical derivatives for some
>>>> parameters near values of 1.0 will be 0.  Indeed, there is no value that is
>>>> always a reasonable starting value for all parameters.  FWIW, in lmfit, we
>>>> simply refuse to let a user run a curve-fitting problem without initial
>>>> values.  I believe that most other curve-fitting interfaces also require
>>>> initial values for all parameters.
>>>>
>>>> Unfortunately, faced with no initial parameter values, `curve_fit`
>>>> silently chooses initial values.  It doesn't try to make an informed
>>>> decision, it simply chooses '1.0', which can easily be so far off as to
>>>> prevent a solution from being found. When this happens, `curve_fit` gives
>>>> no information to the user of what the problem is.  Indeed it allows
>>>> initial values to not be set, giving the impression that they are not
>>>> important.  This impression is wrong: initial values are always important.
>>>> `curve_fit` is mistaken in having a default starting value.
>>>>
>>>> I've made a Pull Request at https://github.com/scipy/scipy/pull/9701
>>>> to fix this misbehavior, so that `curve_fit()` requires starting values.
>>>> It was suggested there that this topic should be discussed here. I'm happy
>>>> to do so.  It was suggested in the github Issue that forcing the user to
>>>> give initial values was "annoying".  It was also suggested that a
>>>> deprecation cycle would be required. I should say that I don't
>>>> actually use `curve_fit()` myself, I'm just trying to help make this commonly
>>>> used routine be completely wrong less often.
>>>>
>>>
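
A minimal sketch of the failure mode described above, assuming a
Gaussian peak centered far from 1.0. The model function `gaussian`, the
helper `try_fit`, and the data values are illustrative and do not come
from the thread. Starting from all ones, the model is numerically zero
where the data has its peak, so the fit has nothing to pull it there:

```python
import warnings

import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amp, cen, wid):
    """Gaussian peak with amplitude amp, center cen and width wid."""
    return amp * np.exp(-((x - cen) ** 2) / (2.0 * wid**2))


# Noise-free synthetic data: a narrow peak far away from x = 1.
x = np.linspace(0.0, 100.0, 201)
true_params = (10.0, 50.0, 2.0)
y = gaussian(x, *true_params)


def try_fit(p0=None):
    """Run curve_fit; return the best-fit values, or None if it gives up."""
    with warnings.catch_warnings():
        # Silence covariance and floating-point warnings for this demo.
        warnings.simplefilter("ignore")
        try:
            popt, _ = curve_fit(gaussian, x, y, p0=p0)
        except RuntimeError:
            return None
    return popt


popt_default = try_fit()                    # p0=None, i.e. start at 1, 1, 1
popt_guess = try_fit(p0=[8.0, 45.0, 3.0])   # rough but sensible guess

print("default start:", popt_default)
print("rough guess:  ", popt_guess)
```

Even a crude guess that overlaps the peak should recover the true
values here, while the default start is not expected to find the
center at all.
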
>>
>> I think making initial values compulsory is too much of a break with
>> tradition.
>>
>
> Well, it may be a break with the tradition of using
> scipy.optimize.curve_fit, but I do not think it is a break with the
> tradition of curve fitting.
> Indeed, what curve_fit does when a user leaves `p0=None` is *not* to leave
> the initial values unspecified -- the underlying optimization routine would
> simply not accept that -- but rather to silently select values that are all
> '1.0'.   I am not aware of any other curve-fitting code or use of
> non-linear optimization that does this.  So, in a sense it is
> "traditional".  It is also wrong.
>

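To make the silent default concrete, here is a small sketch, assuming
no bounds are passed. The `decay` model and the synthetic data are
illustrative only. Leaving `p0` out should give exactly the same
result as passing a vector of ones explicitly:

```python
import numpy as np
from scipy.optimize import curve_fit


def decay(x, a, b, c):
    """Exponential decay with offset."""
    return a * np.exp(-b * x) + c


# Synthetic data with a little noise; parameter values are of order 1,
# so the all-ones start happens to work for this model.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = decay(x, 2.5, 1.3, 0.5) + 0.02 * rng.normal(size=x.size)

popt_none, _ = curve_fit(decay, x, y)                 # p0 left as None
popt_ones, _ = curve_fit(decay, x, y, p0=np.ones(3))  # explicit ones

print(popt_none)
print(popt_ones)
```
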
scipy.stats distribution fit also defaults to ones if not overridden by the
specific distribution.

statsmodels only has a few models with arbitrary user functions, but I
usually try to set a default that works at least in some common cases.

    def fitstart(self):
        # might not make sense for more general functions
        return np.zeros(self.exog.shape[1])

curve_fit was added to scipy as a convenience function, in contrast to the
"serious" optimizers.
For that I think putting in more effort to reduce the work by users is
useful.

(Note, I was never a fan of using `inspect` which is needed to know how
many default starting values to create.)
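
As a rough sketch of what that introspection involves (not SciPy's
actual implementation; `default_p0` is a hypothetical helper), the
number of fit parameters can be read off the signature by treating the
first argument as the independent variable:

```python
import inspect

import numpy as np


def default_p0(func):
    """Return an all-ones start vector sized from func's signature.

    Hypothetical helper: the first argument is assumed to be the
    independent variable and every remaining one a fit parameter.
    """
    params = list(inspect.signature(func).parameters.values())
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    if any(p.kind in variadic for p in params):
        # With *args or **kwargs the count cannot be inferred.
        raise ValueError("cannot determine the number of fit parameters")
    n_fit = len(params) - 1
    if n_fit < 1:
        raise ValueError("model function has no fit parameters")
    return np.ones(n_fit)


def line(x, slope, intercept):
    return slope * x + intercept


print(default_p0(line))
```

A model written as `def f(x, *params)` gives the helper nothing to
count, which is one reason the approach is fragile.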

>
>> IMO, a warning and better documentation would be more appropriate.
>>
>> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
>> does not show an example with starting values.
>> curve_fit could issue a warning if p0 is not specified, or warn if
>> convergence fails and p0 was not specified.
>>
>> I think it should also be possible to improve the default starting
>> values, e.g. if the function fails or if bounds are provided.
>>
>
> I think trying to guess starting values would require an understanding of
> the function calculating the model, and is not generally solvable.
> For sure, if one knows the function, initial guesses are possible.  Lmfit
> has this capability for a few commonly used model functions and those
> initial guesses are often very good.   But it cannot be done in general.
>
>
>> I'm not a user of curve_fit, but I guess there might be a strong
>> selection bias in use cases when helping out users that run into problems.
>> I have no idea what the overall selection of use cases is. A brief
>> skimming of some github searches shows that many users don't specify the
>> initial values.
>> (A semi-random search result
>> https://github.com/jmunroe/phys2820-fall2018/blob/e270b1533130b2b7acd0ec5da3edd9262b792da6/Lecture.13-Data-Analysis-and-Curve-Fitting.ipynb
>> )
>>
>> (Asides:
>> The feedback I get about statsmodels is almost only for cases that
>> "don't work", e.g. badly conditioned data, bad scaling of the data, corner
>> cases. Based on this feedback statsmodels optimization looks pretty bad, but
>> this does not reflect that it works well in, say, 90% of the cases.
>> However, unlike curve_fit, statsmodels has mostly models with predefined
>> nonlinear functions which makes it easier to fit.
>> )
>>
>>
> I do not know what the usage of `curve_fit` is.  Apparently some users get
> tripped up by not specifying initial values. But that is in the nature of
> curve fitting -- initial values are necessary.  Claiming that they do not
> matter or are an unnecessary burden is just not correct.
>
> It seems like the first step is to change `curve_fit` to raise a warning
> or print a message (not sure which is preferred) when `p0` is `None`, but
> continue guessing `1.0`, at least for the time being.  Eventually, this
> could be changed to raise an exception if `p0` is `None`.
>
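
The first step proposed above could look roughly like the sketch
below. `curve_fit_warn` is a hypothetical wrapper, not an existing
SciPy function, and the warning class and message are assumptions. It
warns when `p0` is missing and then defers to `curve_fit` unchanged:

```python
import warnings

import numpy as np
from scipy.optimize import curve_fit


def curve_fit_warn(f, xdata, ydata, p0=None, **kwargs):
    """Hypothetical wrapper: warn when no starting values are supplied."""
    if p0 is None:
        warnings.warn(
            "p0 was not given; starting all parameters at 1.0, which may "
            "prevent the fit from converging",
            UserWarning,
            stacklevel=2,
        )
    # Behavior is otherwise unchanged: curve_fit still guesses ones.
    return curve_fit(f, xdata, ydata, p0=p0, **kwargs)


def line(x, slope, intercept):
    return slope * x + intercept


x = np.linspace(0.0, 10.0, 20)
y = line(x, 3.0, -2.0)

# Record warnings so the demo can show what a user would see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    popt, _ = curve_fit_warn(line, x, y)

print([str(w.message) for w in caught])
print(popt)
```

During a deprecation cycle a `DeprecationWarning` would presumably be
used instead, with the warning later replaced by an exception.
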
> Perhaps a middle step would be to change it to not guess `1.0` but a
> number composed of a uniformly selected random number in [-1, 1] for
> the mantissa and a uniformly selected random integer in [-20, 20] for
> the exponent, as long as bounds are respected?
>
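
For illustration only, the random starting value described above might
be drawn as in this sketch. `random_p0` is a hypothetical helper, and
clipping is just one assumed way to respect bounds:

```python
import numpy as np


def random_p0(n_params, lower=-np.inf, upper=np.inf, seed=None):
    """Hypothetical helper: mantissa * 10**exponent, clipped to the bounds."""
    rng = np.random.default_rng(seed)
    # Mantissa drawn uniformly from [-1, 1).
    mantissa = rng.uniform(-1.0, 1.0, size=n_params)
    # Integer exponent drawn uniformly from -20..20 inclusive.
    exponent = rng.integers(-20, 20, size=n_params, endpoint=True)
    values = mantissa * 10.0**exponent
    return np.clip(values, lower, upper)


print(random_p0(3, seed=0))
print(random_p0(3, lower=0.0, upper=5.0, seed=0))
```

Clipping piles out-of-range draws onto the bounds themselves;
redrawing until the value falls inside would be another option.
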
> Cheers,
>
> --Matt Newville
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>