[SciPy-User] leastsq - When to scale covariance matrix by reduced chi square for confidence interval estimation

Thu May 31 23:50:30 EDT 2012

Hi List,

I'm trying to get my head around, when to scale the covariance matrix by
the reduced chi square of the problem for getting an estimate of the error
of a parameter obtained via fitting. I'm kinda stuck and would appreciate
any pointers to an answer. From the documentation of scipy.optimize.leastsq
and scipy.curve_fit, as well as from some old threads on this mailing list
[1, 2] it seems the procedure in scipy is the following

1) Create estimate of the Hessian matrix based on the Jacobian at the final
value, which is called the covariance matrix cov_x and is the curvature of
the fitting parameters around the minimum

2) Calculate the reduced chi square, red_chi_2 = sum( (w_i *(y_i - f_i))**2
) / dof, where w_i are the weights, y_i the data, f_i the fit and dof the
degrees of freedom (number of knobs)

3) Get parameter estimates by  calculating sqrt(diag(cov_x) * red_chi_2)

4) Scale confidence interval by appropriate value of student t
distribution, e.g. when predicting the standard deviation of a single value
just *= 1

So far, so good. However in the literature [3, 4] I often find that steps 2
and 3 are skipped when the data is weighted by errors in the individual
observation. Obviously for a good fit with red_chi_2 = 1 both ways of
getting an error are the same. [3] and [4] caution that the method they are
using assume among others normality and a reduced chi square of about 1 and
discourage the use of estimating the error in the fit for bad fits. However
it seems that the method currently in scipy somehow is more robust. Take
for example data similiar to the one I am currently working with [5]. The
fit has a reduced chi square of about one, and hence the errors of both the
scipy method and the literature method agree. If I make my reduced chi
square worse by scaling the error bars, the method in the literature gives
either very, very small errors or very, very large ones. The scipy method
however always produces about the same error estimate. Here is the output
of [5]

Errors scaled by:       1
Offset:                 -71.0
Chi 2:                  19.1752
Red. Chi 2:             1.0092
Error literature        2.8426
Error scipy             2.8557

Errors scaled by:       100.0
Offset:                 -71.0077
Chi 2:                  0.0019
Red. Chi 2:             0.0001
Error literature        283.9153
Error scipy             2.8563

Errors scaled by:       0.01
Offset:                 -69.0724
Chi 2:                  9797.5838
Red. Chi 2:             515.6623
Error literature        0.1235
Error scipy             2.8036

Now in the particular problem I am working at, we have a couple of fits
like [5] and some of them have a slightly worse reduced chi square of say
about 1.4 or 0.7. At this point the two methods start to deviate and I am
wondering which would be the correct way of quoting the errors estimated
from the fit. Even a basic reference to some text book that explains the
method used in scipy would be very helpful.

Thanks a lot,

Markus

P.S. On a related side remark: What happened to the benchmark tests against
NIST certified non linear regression data sets mentioned in [1]. Are they
still in the code base? And are there similiar benchmark data sets for non
linear regression with observations that have error bars on them?

[1] http://thread.gmane.org/gmane.comp.python.scientific.user/19482<http://article.gmane.org/gmane.comp.python.scientific.user/19482>
[2]
http://thread.gmane.org/gmane.comp.python.scientific.user/31023/focus=31024
[3] Numerical Recipes in C, 2nd edition, chapter 15.6 , especially p.
695-698
[4] Data Reduction and Error Analysis for the physical sciences, 3rd
edition, chapter 8.5, p. 157
[5] Gist of self-contained example at https://gist.github.com/2848439
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20120601/adc5b13d/attachment.html>