[SciPy-Dev] curve_fit() should require initial values for parameters

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Jan 30 15:22:06 EST 2019


On Wed, Jan 30, 2019 at 2:27 PM <josef.pktd at gmail.com> wrote:

>
>
> On Wed, Jan 30, 2019 at 1:48 PM Ilhan Polat <ilhanpolat at gmail.com> wrote:
>
>> I am a frequent user of this function. I also occasionally teach control
>> theory and use this function professionally, so I have provided a proper
>> use case as a "user". You already admitted that you don't use it. So
>> excuse me when I say that I have more experience than you with this
>> function's API (please read this twice: the function API, not the
>> underlying theory). How can I even provide evidence on this completely
>> subjective matter? Here are all the issues related to curve_fit:
>>
>> https://github.com/scipy/scipy/search?q=curve_fit&type=Issues
>>
>> As far as I know there have been no complaints so far. That suggests it
>> is really not a big enough deal to warrant such a tone. I don't know what
>> to add beyond what I have already provided. You cannot make educated
>> guesses about initial points in nonconvex searches. As you mentioned, we
>> are as blind as the np.ones(n) choice. That's just being pedantic about
>> the API. Making the function API more annoying than it has to be is, for
>> me, the wrong choice, or even unpythonic if you are among that kind of
>> crowd. It is the kind of wrongness with which MATLAB and similar software
>> continuously annoy users through their clunky UIs. Users are not ours to
>> educate. And if something is None, it means the code will make up some
>> values, not necessarily the correct ones; as clearly documented, just
>> some values. If you have better ones, provide them.
>>
>> Having said that, I am getting a lot of "horrible, utterly wrong,
>> obstinate, disservice", etc., which makes me uncomfortable, and for me it
>> has gone beyond discussion. I am not good at the interwebz, so I'd better
>> stop here. We are talking about a simple function argument being required
>> or optional, one that is essentially a made-up array, so I would like to
>> reserve those words for more important occasions. It's just a Python
>> library I am contributing to, so I don't want to be involved in this
>> particular issue any more. Since I am clearly biased toward one side, I
>> leave it to the other members to decide; I am fine with any outcome.
>>
>> Best,
>>
>
> Except for the fact that I am not a user of curve_fit, I agree completely
> with Ilhan.
>
>
> Actually, I think `ones` is one of the most reasonable default choices.
> Most cases are not in an arbitrary space of functions. The
> parameterization is often chosen to have nice positive values for
> interpretability. For example, I think that all (or almost all) parameters
> in scipy.stats distributions are positive or nonnegative (except for loc
> where any number is possible).
>
> Based on a brief browse of recent stackoverflow questions, it looks like
> there are many possible problems with curve_fit, which is an inherent
> issue with nonlinear optimization in general.
> But I think that specifying starting values when the results don't look
> right should be an obvious solution to users (especially with an improved
> docstring for curve_fit).
> Many other problems on stackoverflow seem to be that users don't use a
> good parameterization of their function.
> Starting values are just one possible source of problems, and a user
> needs to be willing to investigate all of them when the first try doesn't
> work. (*)
>
> In the rest of scipy.optimize (and in related functions in statsmodels)
> there are no default starting values, partly because no information about
> the number of parameters (or the length of the parameter vector) is
> available. curve_fit's use of inspect on the model function's arguments
> was designed to make automatic starting values possible.
>
>
> "If at first you don't succeed, try, try again."
>
>
> (I'm strongly in favor of trying "defaults" first, and if that doesn't
> work, then digging into or debugging likely candidates -- in loose analogy
> to test-driven development instead of up-front design.)
>

This finally reminded me that I do have a similar example with default
starting values, although with fmin and not leastsq.

scipy had around 94 distributions when I started, and maybe around 65
(IIRC) continuous distributions with a fit method.
This was too many to go through every distribution individually.
Essentially the only information available in general is the number of
parameters.

The way I worked initially was to start with some generic defaults and
then work my way through the failing cases.
Nice cases like the normal distribution and similar ones work out of the
box (scale=1 is actually a good default choice; mean=1 is arbitrary but
not a problem).
Later we added fitstart for individual distributions, with reasonable
starting values that can be inferred from the data.
Some distributions don't have a well-behaved log-likelihood function, and
AFAIK, they still don't work "out of the box".
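
For illustration (with made-up data, and not one of the problematic
cases): the fit method accepts, but does not require, explicit starting
values; shape parameters are passed positionally and loc/scale as
keywords:

    from scipy import stats

    data = stats.gamma.rvs(2.5, scale=3.0, size=1000, random_state=0)

    # Generic internal defaults (refined per distribution by fitstart):
    print(stats.gamma.fit(data))

    # Explicit starting values supplied by the user:
    print(stats.gamma.fit(data, 2.0, loc=0.0, scale=1.0))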

Each stage of refinement requires more work for the cases that have failed
all previous stages. However, the number of cases that are left shrinks,
so the heavy work is mostly for the nasty cases.

(It's pretty much the same in statsmodels. We have a large number of
models and very simple default starting parameters. In some cases, or for
some datasets, this works fine. Other cases failed or still fail, and I
have spent several months overall improving starting values and numerical
stability for those, not always with full success. But I don't "waste"
that time on cases that work fine out of the box, i.e. with simple
starting values.)

Josef



>
> Currently no user is prevented from specifying starting values.
> After the change everyone is forced to add this additional step, just
> because some users are surprised that nonlinear optimization doesn't always
> work (out of the box).
>
> Josef
>
>
>
>
>>
>> On Wed, Jan 30, 2019 at 5:05 PM Matt Newville <newville at cars.uchicago.edu>
>> wrote:
>>
>>> Hi Ilhan,
>>>
>>> On Tue, Jan 29, 2019 at 10:54 AM Ilhan Polat <ilhanpolat at gmail.com>
>>> wrote:
>>>
>>>> > The problem I have with this is that there really is not an option to
>>>>  "try automatic first".  There is "try `np.ones(n_variables)` first".
>>>>  This, or any other value, is really not a defensible choice for starting
>>>> values.  Starting values always depend on the function used and the
>>>> data being fit.
>>>>
>>>> Why not? Ones are as good as any other choice.
>>>>
>>>
>>> Well, I agree that `np.ones(n_variables)` is as good as any other
>>> choice.  All default choices are horrible and not defensible.
>>>
>>> Mathematically, algorithmically, and conceptually, initial values ARE
>>> REQUIRED for non-linear least squares optimization.  The codes underlying
>>> `curve_fit()` (including `leastsq` and the mess that is `least_squares`)
>>> do not permit the user to omit initial values.  The algorithms used
>>> simply do not make sense without initial values.
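>>>
>>> As a minimal illustration (with made-up data), the underlying solver
>>> takes the starting point as a required argument; `leastsq(func, x0, ...)`
>>> has no default for `x0`:
>>>
>>>     import numpy as np
>>>     from scipy.optimize import leastsq
>>>
>>>     def residual(p, x, y):
>>>         a, b = p
>>>         return y - a * np.exp(b * x)
>>>
>>>     x = np.linspace(0, 1, 20)
>>>     y = 2.0 * np.exp(1.5 * x)
>>>
>>>     popt, ier = leastsq(residual, x0=[1.0, 1.0], args=(x, y))
>>>     print(popt)  # close to [2.0, 1.5]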
>>>
>>> Programmatically, an optional keyword argument to a function, say with a
>>> default value of `None`, implies that there is a sensible default for that
>>> value.  Thus, default fitting tolerances might be 1.e-7 (or perhaps the
>>> square root of machine precision), or the default value for "method to
>>> calculate the jacobian" might be `None` to mean "calculate by finite
>>> differences".  For these optional inputs, the function (say, `curve_fit()`)
>>> has a sensible default value that works independently of the other inputs.
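>>>
>>> For instance (a sketch with a made-up linear model), `jac=None` is a
>>> sensible default precisely because finite differences work for any model,
>>> whereas no single `p0` can:
>>>
>>>     import numpy as np
>>>     from scipy.optimize import curve_fit
>>>
>>>     def model(x, a, b):
>>>         return a * x + b
>>>
>>>     def jac(x, a, b):
>>>         # Analytic Jacobian of the model with respect to (a, b).
>>>         return np.stack([x, np.ones_like(x)], axis=1)
>>>
>>>     x = np.linspace(0, 5, 30)
>>>     y = 2.0 * x + 1.0
>>>     print(curve_fit(model, x, y)[0])           # jac=None: finite differences
>>>     print(curve_fit(model, x, y, jac=jac)[0])  # analytic Jacobian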
>>>
>>> That notion of an "independent, sensible default value" is never
>>> possible for the initial values `p0` in `curve_fit`.  Sensible initial
>>> values always depend on the data to be fit and the function modeling the
>>> data.   Change the data values dramatically and `p0` must change.  Change
>>> the definition of the function (or even the order of the arguments), and
>>> `p0` must change.   There is not and cannot be a sensible default.
>>>
>>> Telling the user that `p0` is optional and can be `None` (as the current
>>> documentation does clearly state) is utterly and profoundly wrong.   It is
>>> mathematically indefensible.  It is horrible API design.  It harms the
>>> integrity of `scipy.optimize` to tell the user this.
>>>
>>>  I don't know anything about the curve fit I will get in the end. So I
>>>> don't need to pretend that I know a good starting value.
>>>>
>>>
>>> It is not possible to do curve-fitting or non-linear least-squares
>>> minimization when you "don't know anything".  The user MUST provide data
>>> to be modeled and MUST provide a function to model that data.   It is
>>> "pretending" to claim that this is sufficient.  The user also must
>>> provide initial values.
>>>
>>>> Maybe for 3-parameter functions, fine, I can come up with an argument,
>>>> but you surely don't expect me to know the starting point if I am
>>>> fitting a 7-parameter function involving some esoteric structure. At
>>>> that point I am completely ignorant of everything about this function.
>>>> So not knowing where to start is not due to my noviceness with the
>>>> tools but holds by definition. My search might even turn out to be
>>>> convex, so the initial value won't matter.
>>>>
>>>
>>> It is not possible to do curve-fitting or non-linear least-squares
>>> minimization when one is completely ignorant of the function.
>>>
>>>
>>>
>>>> > Currently `curve_fit`  converts `p0=None` to `np.ones(n_variables)`
>>>> without warning or explanation.  Again, I do not use `curve_fit()` myself.
>>>> I find several aspects of it unpleasant.
>>>>
>>>> It is documented in the p0 argument docs. I am using this function
>>>> quite often. That's why I don't like extra required arguments. It's
>>>> annoying to enter some random array just to please the API when I know
>>>> that I am just taking a shot in the dark.
>>>>
>>>
>>> It is not an extra keyword argument.  It is required input for the
>>> problem. `curve_fit()` is converting your feigned (or perhaps obstinate)
>>> ignorance to a set of starting values for you.   But starting values simply
>>> cannot be independent of the input model function or input data.
>>>
>>>
>>>> I am pretty confident that if we force this argument, most of the
>>>> people you want to educate will enter np.zeros(n). Then they will get
>>>> an even weirder error; then they'll try np.ones(n) but misremember n,
>>>> and then they get another error for the function's parameter count,
>>>> which has already tripped them up twice. curve_fit is one of those
>>>> functions that you don't run just once and be done with, but over and
>>>> over again until you give up or are satisfied. Hence defaults matter a
>>>> lot from a UX perspective. "If you have an initial value in mind, fine,
>>>> enter it; otherwise let me do my thing" is much better than "I don't
>>>> care about your quick experiment, give me some values or I will keep
>>>> tripping up".
>>>>
>>>>
>>> I refuse to speculate on what "most users" will do, and I also refuse to
>>> accept your speculation on this without evidence.  There are a great many
>>> applications of curve-fitting for which `np.ones(n_variables)` and
>>> `np.zeros(n_variables)` will completely fail -- the fit will never move
>>> from the starting point.   For the kinds of fits done in the programs I
>>> support, either of these would mean that essentially all fits would never
>>> move from their starting points, as at least one parameter being 0 or 1
>>> would essentially always mean the model function was 0 over the full data
>>> range.
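>>>
>>> A minimal illustration (with made-up data) of a fit that cannot move away
>>> from the default starting point: at amp = cen = wid = 1, the Gaussian
>>> below is essentially zero over the whole data range, so the Jacobian
>>> vanishes and the solver stops where it started:
>>>
>>>     import numpy as np
>>>     from scipy.optimize import curve_fit
>>>
>>>     def gaussian(x, amp, cen, wid):
>>>         return amp * np.exp(-((x - cen) ** 2) / (2 * wid ** 2))
>>>
>>>     x = np.linspace(400, 500, 201)
>>>     y = gaussian(x, 25.0, 450.0, 3.0)
>>>
>>>     # p0 defaults to [1, 1, 1]; the fit stays at (or very near) it,
>>>     # and scipy warns that the covariance could not be estimated.
>>>     print(curve_fit(gaussian, x, y)[0])
>>>
>>>     # A rough, data-informed guess recovers (25, 450, 3).
>>>     print(curve_fit(gaussian, x, y, p0=[20.0, 440.0, 5.0])[0])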
>>>
>>>
>>> But, again, I don't use `curve_fit` but rather other tools built on top
>>> of `scipy.optimize`.  Generally, the model function and data together
>>> imply sensible or at least guessable default values, but these cannot be
>>> independent of model or data.  In lmfit we do not permit the user to omit
>>> starting values -- the default parameter value is `None`, which will
>>> quickly raise a ValueError. I don't recall ever being asked to change
>>> this, because it should be obvious to all users that each parameter
>>> requires an initial value.  Where appropriate and possible, we do provide
>>> methods for model functions to make initial guesses based on the data.
>>> But again, the starting values always depend on the model function and
>>> the data.
>>>
>>> Default arguments *do* matter from a UX perspective when defaults are
>>> sensible.   `curve_fit` has three required positional arguments:  a model
>>> function (that must be annoying to have to provide), "x" data, and "y"
>>> data to be fit (well, I guess I have that).  Why are those all required?
>>> Why not allow `func=None` to be a function that calculates a sine wave?
>>> Why not allow `y=None` to mean `np.ones(1000)`?  Why not allow `x=None` to
>>> mean `np.arange(len(y))`?    Wouldn't that be friendlier to the user?
>>>
>>>
>>> >   But this behavior strikes me as utterly wrong and a disservice to
>>>> the scipy ecosystem.   I do not think that a documentation change is
>>>> sufficient.
>>>>
>>>> Maybe a bit overzealous?
>>>>
>>>>
>>> Nope, just trying to help `curve_fit()` work better.
>>>
>>> --Matt
>>>