[SciPy-User] Revisit Unexpected covariance matrix from scipy.optimize.curve_fit

Tom Aldcroft aldcroft at head.cfa.harvard.edu
Fri Feb 22 13:44:21 EST 2013


On Fri, Feb 22, 2013 at 1:17 PM,  <josef.pktd at gmail.com> wrote:
> On Fri, Feb 22, 2013 at 1:03 PM, Pierre Barbier de Reuille
> <pierre at barbierdereuille.net> wrote:
>> I don't know about this result I must say, do you have a reference?
>>
>> But intuitively, perr shouldn't change when applying the same weight to all
>> the values.
>>
>> --
>> Barbier de Reuille Pierre
>>
>>
>> On 22 February 2013 17:12, Moore, Eric (NIH/NIDDK) [F] <eric.moore2 at nih.gov>
>> wrote:
>>>
>>> > -----Original Message-----
>>> > From: Tom Aldcroft [mailto:aldcroft at head.cfa.harvard.edu]
>>> > Sent: Friday, February 22, 2013 10:42 AM
>>> > To: SciPy Users List
>>> > Subject: [SciPy-User] Revisit Unexpected covariance matrix from
>>> > scipy.optimize.curve_fit
>>> >
>>> > In Aug 2011 there was a thread [Unexpected covariance matrix from
>>> > scipy.optimize.curve_fit](http://mail.scipy.org/pipermail/scipy-
>>> > user/2011-August/030412.html)
>>> > where Christoph Deil reported that "scipy.optimize.curve_fit returns
>>> > parameter errors that don't scale with sigma, the standard deviation
>>> > of ydata, as I expected."  Today I independently came to the same
>>> > conclusion.
>>> >
>>> > This thread generated some discussion but seemingly no agreement that
>>> > the covariance output of `curve_fit` is not what would be expected.  I
>>> > think the discussion wasn't as focused as possible because the example
>>> > was too complicated.  With that I provide here about the simplest
>>> > possible example, which is fitting a constant to a constant dataset,
>>> > aka computing the mean and error on the mean.  Since we know the
>>> > answers we can compare the output of `curve_fit`.
>>> >
>>> > To illustrate things more easily I put the examples into an IPython
>>> > notebook which is available at:
>>> >
>>> >   http://nbviewer.ipython.org/5014170/
>
> If my fast reading is correct, then this is a very good example what I
> DON'T want in curve_fit.
>
> Your actual standard deviation (in simulation) is 1.
>
> Then you impose a sigma of 100, and your results are completely
> inconsistent with the data, huge error margins and confidence
> intervals 5 times the range of the actual observations.
>
> In most cases (maybe not in astronomy) we would like to estimate the
> parameter uncertainty based on the actual data.

It happens all the time in astronomy and physics (and probably any
experimental science) that you either have well-justified priors for
the errors, or need to account for systematic errors that are not
reflected in the scatter of the data.  This is exactly why (at least
in physics/astronomy) one has the option to specify the actual
uncertainty of each data point for parameter estimation algorithms,
not just relative weighting.

> There are some cases where we have another estimate for sigma, a
> Bayesian can impose any prior; if we have more information about
> measurement errors, then we can use ODR, but in my opinion curve_fit
> should just be a boring standard weighted least squares.

But curve_fit is *much* simpler to use and I suspect most astronomers
will go straight there.  The fundamental algorithm underneath
(Levenburg-Marquardt + chi^2 statistics) has no problem handling
absolute sigma values.  I grew up on Numerical Recipes, and "mrqmin",
which gives as its output a covariance matrix which is exactly the
"expected" covariance matrix.  So why limit the usefulness of
curve_fit to the case of empirical scatter estimates?

>
> Josef
>
>
>
>>> >
>>> > This was run using scipy 0.11.0 by the way.  Any further discussion on
>>> > this topic to come to an understanding of the covariance output from
>>> > `curve_fit` would be appreciated.
>>> >
>>> > Thanks,
>>> > Tom
>>> > _______________________________________________
>>>
>>> chi2 = np.sum(((yn-const(x, *popt))/sigma)**2)
>>> perr = np.sqrt(np.diag(pcov)/(chi2/(x.shape[0]-1)))
>>>
>>> Perr is then the actual error in the fit parameter. No?
>>>
>>> -Eric
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>



More information about the SciPy-User mailing list