[SciPy-user] incorrect variance in optimize.curvefit and leastsq

Fri Feb 13 01:08:03 EST 2009

On Fri, Feb 13, 2009 at 12:24 AM, Travis E. Oliphant
<oliphant at enthought.com> wrote:
> josef.pktd at gmail.com wrote:
>> On Thu, Feb 12, 2009 at 11:09 PM, Travis E. Oliphant
>> <oliphant at enthought.com> wrote:
>>
>>> josef.pktd at gmail.com wrote:
>>>
>>>> I just saw the new optimize.curvefit which provides a wrapper around
>>>> optimize.leastsq
>>>>
>>>>
>>>>
>> the standard deviation of the error can be calculated and the
>> corrected (this is written for the use from outside of curvefit):
>>
>> yhat = func(x,popt[0], popt[1])   # get predicted observations
>> SSE = np.sum((y-yhat)**2)
>> sig2 = SSE/(len(y)-len(popt))
>> ecov = sig2*pcov       # this is the variance-covariance  matrix of
>> the parameter estimates
>>
>>
>> inside curvefit, this work (before the return):
>>
>> err  = func(popt, *args)
>> SSE = np.sum((err)**2)
>> sig2 = SSE / (len(ydata) - len(popt))
>> pcov = sig2 * pcov

I think for the weighted least squares problem the weights should go
into the SSE calculation. I didn't find directly the reference, but I
am somewhat confident that this is correct, from the analogy to the
transformed
model ynew = y*weight where weight_i = 1/sigma_i in the linear case.
But it's too late today to try to check this.

SSE = np.sum((err*weight)**2)

>>
>
> Thanks very much for these...   I appreciate your help in getting
> curve_fit in better shape.   These functions were essentially added to
> curvefit so now curve_fit can be tested against those NIST pages.
> Thank you also for the unit tests.

You're welcome

>
> A good project idea for somebody wanting to get their feet wet with
> SciPy would be to write unit-tests of curve_fit against some more of
> those NIST data-sets --- those look nice.   A combination of loadtxt and
> additional parsing fu and you could automate the whole thing --- nice
> project for a student out there ;-)
>
> -Travis
>

Yes, and they are also useful to test other optimization functions.
Unfortunately, I didn't see any cases with several explanatory
variables or with weights/heteroscedasticity. There are also certified
reference cases for basic statistics and linear regression, that also
would be useful.

 I will try to look during weekend at the weighted case a bit more.

Josef