[SciPy-user] estimating errors with optimize.leastsq

Tue Jul 11 10:13:17 EDT 2006

Christian Kristukat wrote:
> Robert Kern wrote:
>> Christian Kristukat wrote:
>>> Chiara Caronna wrote:
>>>> 2) Is there any way to get the estimated errors on the fitting parameters?
>>>> Maybe optimize.leastsq is not the right choice? Does anyone have any good 
>>>> hints?
>>> Everything you need is in here:
>>> http://www.boulder.nist.gov/mcsd/Staff/JRogers/papers/odr_vcv.dvi
>>> I haven't yet found time/will to dig into it, but I'm definitely interested in a
>>> good error estimation routine.
>> The implementation of the ideas in that paper is in ODRPACK by the same author. 
>> It is wrapped as scipy.sandbox.odr . The docstrings are fairly thorough, I 
>> think, but please let me know if something needs to be clarified.
> 
> Great! I just tried to build from svn with sandbox.odr enabled. Upon
> importing odr I get the following error:
> 
>>>> from scipy.sandbox import odr
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/local/lib/python2.4/site-packages/scipy/sandbox/odr/__init__.py",
> line 55, in ?
>     import odrpack
>   File "/usr/local/lib/python2.4/site-packages/scipy/sandbox/odr/odrpack.py",
> line 113, in ?
>     from scipy.sandbox.odr import __odrpack
> ImportError: cannot import name __odrpack
> 
> Looking at site-packages/scipy/sandbox/odr, it looks like the extension
> module has not been built.
> 
> However, I have been able to build the odr module alone. I had to comment out
> line 48 in setup_odr.py (I have no atlas), and a .todict() call was missing in
> the last line.

D'oh! I could have sworn I had that working. Oh well. It's fixed now.

> After that it was really easy to switch to odr, and the results, especially the
> confidence intervals, look very nice. I noticed that odr is more demanding of a
> good initial guess than leastsq, and sometimes it seems to take a dead-end
> road. But I'll have to play some more with it.
> It just occurred to me that for data which has no noise on the x-values, odr
> might not be advantageous compared to an ordinary least squares fit of the
> y-values. Is that assumption right?

One thing you have to watch out for is that if you don't specify the weights on 
the X-values, then they will be implicitly set to 1 (in whatever units the data 
are in), so you'll be solving an ODR problem whether it's appropriate or not. So 
the question isn't really "is ODR a better or worse technique than OLS?" but 
rather "do I have an ODR problem or an OLS problem?"
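
To make the point about weights concrete, here is a minimal sketch of an ODR 
fit with explicit uncertainties on both X and Y. It assumes the modern import 
path scipy.odr rather than scipy.sandbox.odr, and the data, the noise levels 
and the function name linear are made up for illustration only.

```python
import numpy as np
from scipy import odr  # assumption: modern location of the sandbox.odr wrapper

def linear(beta, x):
    # Hypothetical model for illustration: y = beta[0] * x + beta[1]
    return beta[0] * x + beta[1]

# Synthetic data with noise on both x and y (an actual ODR problem).
rng = np.random.default_rng(0)
x_true = np.linspace(0.0, 10.0, 50)
x = x_true + rng.normal(scale=0.1, size=x_true.size)
y = 2.0 * x_true + 1.0 + rng.normal(scale=0.5, size=x_true.size)

# State the uncertainties explicitly instead of relying on implicit
# unit weights. RealData converts standard deviations into weights.
sx = np.full_like(x, 0.1)
sy = np.full_like(y, 0.5)
data = odr.RealData(x, y, sx=sx, sy=sy)

fit = odr.ODR(data, odr.Model(linear), beta0=[1.0, 0.0]).run()

print("parameters:     ", fit.beta)
print("standard errors:", fit.sd_beta)
```

The sd_beta attribute of the output holds the estimated standard errors of 
the parameters, and cov_beta holds the corresponding covariance matrix.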

Of course, ODRPACK handles OLS problems, too. Just do .set_job(fit_type=2) (no, 
I'm not really happy with that interface, either, but there it is).
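
As a rough illustration of the OLS mode, the sketch below calls 
set_job(fit_type=2) and compares the result with numpy.polyfit on the same 
data. Again, this assumes the modern scipy.odr import path, and the data and 
the function name linear are invented for the example.

```python
import numpy as np
from scipy import odr  # assumption: modern location of the sandbox.odr wrapper

def linear(beta, x):
    # Hypothetical model for illustration: y = beta[0] * x + beta[1]
    return beta[0] * x + beta[1]

# Synthetic data with noise on y only (an OLS problem).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

job = odr.ODR(odr.Data(x, y), odr.Model(linear), beta0=[1.0, 0.0])
job.set_job(fit_type=2)  # 2 = ordinary least squares instead of ODR
ols_fit = job.run()

# Reference: closed-form straight-line least squares fit.
reference = np.polyfit(x, y, 1)  # returns [slope, intercept]

print("ODRPACK in OLS mode:", ols_fit.beta, "+/-", ols_fit.sd_beta)
print("numpy.polyfit:      ", reference)
```

With no noise on the x-values the two sets of parameters should agree 
closely, while ODRPACK additionally reports sd_beta.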

> Anyway, thanks for wrapping odr! Is it ok to include odr in my GPLed package as
> long as odr is not part of the official scipy distribution?

You could include all of scipy in your GPLed package, too. The BSD license is 
GPL-compatible. Well, now that I think about it, that's not entirely true. We've 
been a little lax about wrapping one or two routines where the authors have 
requested citations in publications, which, if enforced, is not a GPL-compatible 
restriction. But there are no such problems with odr; ODRPACK is US government 
public domain code, and my wrappers are given under the scipy license.

The license requirements for the code in the sandbox are no different from the 
"official" scipy distribution.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco