[SciPy-User] confidence interval for leastsq fit

josef.pktd at gmail.com
Thu Apr 28 06:18:45 EDT 2011


On Thu, Apr 28, 2011 at 6:00 AM, Gael Varoquaux
<gael.varoquaux at normalesup.org> wrote:
> On Wed, Apr 27, 2011 at 11:08:54PM -0400, dima osin wrote:
>>    How to calculate confidence interval for scipy.optimize.leastsq fit using
>>    the Student's t distribution and NOT the bootstrapping method?
>
> I suspect that you are the same person I just replied to on Stack
> Overflow. I'll copy my answer here:
>
> """
> I am not sure what you mean by confidence interval.
>
> In general, leastsq doesn't know much about the function that you are
> trying to minimize, so it can't really give a confidence interval.
> However, it does return an estimate of the Hessian, in other words the
> generalization of second derivatives to multidimensional problems.
>
> As hinted in the docstring of the function, you could use that
> information along with the residuals (the difference between your fitted
> solution and the actual data) to compute the covariance of the parameter
> estimates, which is a local guess of the confidence interval.
>
> Note that this is only local information, and I suspect that you can,
> strictly speaking, come to a conclusion only if your objective function
> is strictly convex. I don't have any proofs or references for that
> statement :).
> """
>
> That said, you specify above "using the Student's t distribution
> and NOT the bootstrapping method". I am a bit worried by your use of the
> concept of Student's t distribution. Strictly speaking, a Student's t is
> applicable only under a Gaussian hypothesis, in a linear regression
> setting. In other words, it comes with assumptions that could very well
> be violated by your cost function and your data. If your cost function is
> well behaved (i.e. the optimizer always finds the same minimum, and its
> Hessian is well conditioned near this minimum) and if you have a
> data-fitting problem, I suspect that asymptotic normality will apply,
> and you will be able to use Gaussian test statistics. These are big ifs.


Given that the question is about optimize.leastsq, I assume this applies:

http://en.wikipedia.org/wiki/Non-linear_least_squares#Parameter_errors.2C_confidence_limits.2C_residuals_etc.

- nonlinear model with additive error: y = f(x, params) + err
- the function f is differentiable
- the quadratic loss (sum of squared residuals) is minimized
- the parameter estimate is in the interior of the parameter space if
  there are bounds

Then all the regression results of the linear model apply, based on a
local (derivative-based), asymptotic (normality) argument, with the
Jacobian of f taking the place of the design matrix.
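Continuing the sketch above, the Student's t confidence limits then
come from the diagonal of the scaled covariance pcov, with
dof = N - p degrees of freedom:

from scipy import stats

# 95% intervals via Student's t, continuing from popt, pcov and dof
# computed in the sketch above
alpha = 0.05
tval = stats.t.ppf(1.0 - alpha / 2.0, dof)  # two-sided critical value
se = np.sqrt(np.diag(pcov))                 # standard errors
for i, (p, s) in enumerate(zip(popt, se)):
    print('p%d: %.4f +/- %.4f' % (i, p, tval * s))

For a well-behaved data-fitting problem this is the standard
linear-regression recipe, just with the Jacobian in place of the
design matrix.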

Josef

>
> HTH,
>
> Gael


