[SciPy-user] Constrained least-squares fitting routine?

Rob Clewley rob.clewley at gmail.com
Fri May 1 21:07:41 EDT 2009


On Fri, May 1, 2009 at 8:19 PM, Adam Ginsburg
<adam.ginsburg at colorado.edu> wrote:
> Hi Scipy group,
>   Is there a constrained least squares fitting routine available, or
> can anyone offer me tips on implementing such a beast?

I don't think so, but I'm not absolutely sure. Anyway, see below.

>  I have been
> using scipy.optimize.leastsq, but I do not know how to constrain
> parameters.  The model I'm looking to emulate is Craig Markwardt's
> mpfit.pro (http://www.physics.wisc.edu/~craigm/idl/down/mpfit.pro), in
> particular the parinfo section that allows max/min and fixed
> parameters.  I've tried simply constraining parameters in my fitting
> function using if statements to set min/max values, but this strategy
> fails, I think because the algorithm pushes into space outside of the
> limits and can't get back.

Well, of course, because the poor algorithm can't see the hard
boundary "coming". leastsq is a *gradient*-based method (it wraps
MINPACK's Levenberg-Marquardt routines), so at the very least the
penalty has to vary smoothly with the parameters. I have had a lot of
success with penalty functions, even though they are a bit of a hack
and don't come with any theoretical guarantees. You can try suitably
rescaled 1/x or log functions, and sometimes other funky things,
provided the algorithm gets some feedback about exactly *how* badly it
is failing when it goes past the boundary (I sometimes scale a large
constant penalty by the square of how far the parameter has passed the
boundary). Preferably, if you know that your solution won't be right
at the boundary, make the penalty kick in before the boundary is
reached, so that the algorithm is pushed back before something bad
happens (in case your system fails catastrophically for values beyond
the boundary).
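
A minimal sketch of the penalty idea, under stated assumptions: the
exponential-decay model, the bounds, and the names model, penalty,
residuals, weight and margin are all hypothetical and chosen only for
illustration. Because leastsq minimizes the sum of squared residuals,
appending weight * (distance past the bound) as extra residuals gives a
cost that grows with the square of that distance. A non-zero margin
makes the penalty start before the bound is reached.

```python
import numpy as np
from scipy.optimize import leastsq


def model(p, x):
    # Hypothetical model for illustration: a * exp(-k * x)
    a, k = p
    return a * np.exp(-k * x)


def penalty(p, lower, upper, weight=1e2, margin=0.0):
    # How far each parameter has passed its (margin-shifted) bound;
    # zero while the parameter is safely inside.
    below = np.maximum((lower + margin) - p, 0.0)
    above = np.maximum(p - (upper - margin), 0.0)
    # leastsq squares these, so the cost is weight**2 * distance**2.
    return weight * (below + above)


def residuals(p, x, y, lower, upper):
    p = np.asarray(p, dtype=float)
    fit = model(p, x) - y
    # Extra residuals tell the optimizer *how* badly a bound is violated.
    return np.concatenate([fit, penalty(p, lower, upper)])


# Noise-free synthetic data with true parameters a = 2.0, k = 0.5
x = np.linspace(0.0, 5.0, 50)
y = model([2.0, 0.5], x)
p0 = [1.0, 0.1]

# No bounds: the penalty residuals are always zero.
no_lo = np.array([-np.inf, -np.inf])
no_hi = np.array([np.inf, np.inf])
p_free, ier_free = leastsq(residuals, p0, args=(x, y, no_lo, no_hi))

# Soft upper bound k <= 0.3, which excludes the true value of k.
lower = np.array([0.0, 0.0])
upper = np.array([np.inf, 0.3])
p_fit, ier_fit = leastsq(residuals, p0, args=(x, y, lower, upper))

print("unconstrained:", p_free)
print("penalized:    ", p_fit)
```

The bound is soft: the penalized fit can end up slightly past it, by
an amount that shrinks as weight grows, while a very large weight makes
the problem stiffer for the optimizer. Later SciPy releases also provide
scipy.optimize.least_squares, which accepts a bounds argument directly.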

In general this is a highly non-trivial problem, and I'm not aware
of good solutions apart from spending a lot more time analyzing your
parameter space in other ways (for instance, parameter sensitivities)
and coming up with better measures of fitness than the naive
"distance" between two curves.

HTH,
Rob


