[SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit

Fri Feb 22 13:17:47 EST 2013

On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille
<pierre at barbierdereuille.net> wrote:
> I don't know about this result I must say, do you have a reference?
>
> But intuitively, perr shouldn't change when applying the same weight to all
> the values.
>
> --
> Barbier de Reuille Pierre
>
>
> On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] <eric.moore2 at nih.gov>
> wrote:
>>
>> > -----Original Message-----
>> > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu]
>> > Sent: Friday, February 22, 2013 10:42 AM
>> > To: SciPy Users List
>> > Subject: [SciPy-User] Revisit Unexpected covariance matrix from
>> > scipy.optimize.curve_fit
>> >
>> > In Aug 2011 there was a thread [Unexpected covariance matrix from
>> > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy-
>> > user/2011-August/030412.html)
>> > where Christoph Deil reported that "scipy.optimize.curve_fit returns
>> > parameter errors that don't scale with sigma, the standard deviation
>> > of ydata, as I expected."  Today I independently came to the same
>> > conclusion.
>> >
>> > This thread generated some discussion but seemingly no agreement that
>> > the covariance output of `curve_fit` is not what would be expected.  I
>> > think the discussion wasn't as focused as possible because the example
>> > was too complicated.  With that I provide here about the simplest
>> > possible example, which is fitting a constant to a constant dataset,
>> > aka computing the mean and error on the mean.  Since we know the
>> > answers we can compare the output of `curve_fit`.
>> >
>> > To illustrate things more easily I put the examples into an IPython
>> > notebook which is available at:
>> >
>> >   http://nbviewer.ipython.org/5014170/

If my fast reading is correct, then this is a very good example what I
DON'T want in curve_fit.

Your actual standard deviation (in simulation) is 1.

Then you impose a sigma of 100, and your results are completely
inconsistent with the data, huge error margins and confidence
intervals 5 times the range of the actual observations.

In most cases (maybe not in astronomy) we would like to estimate the
parameter uncertainty based on the actual data.
There are some cases where we have another estimate for sigma, a
Bayesian can impose any prior; if we have more information about
measurement errors, then we can use ODR, but in my opinion curve_fit
should just be a boring standard weighted least squares.

Josef

>> >
>> > This was run using scipy 0.11.0 by the way.  Any further discussion on
>> > this topic to come to an understanding of the covariance output from
>> > `curve_fit` would be appreciated.
>> >
>> > Thanks,
>> > Tom
>> > _______________________________________________
>>
>> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2)
>> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1)))
>>
>> Perr is then the actual error in the fit parameter. No?
>>
>> -Eric
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>