[SciPy-User] curve_fit errors

josef.pktd at gmail.com
Wed Aug 5 13:53:01 EDT 2009


On Wed, Aug 5, 2009 at 1:31 PM, <josef.pktd at gmail.com> wrote:
> On Wed, Aug 5, 2009 at 11:56 AM, Pim Schellart
> <P.Schellart at student.science.ru.nl> wrote:
>> Hi Everyone,
>>
>> I am using the curve_fit wrapper around optimize.leastsq but I have a
>> few minor problems.
>> Even for a simple line fit the fitting does not produce a solution
>> unless I first shift the curve so that it's flat.
>> For a Gaussian fit a solution is not found unless I shift the curve
>> such that the maximum/minimum of the Gaussian is around zero.
>> Both of these errors occur whether I supply start values for the
>> parameters or not.
>> I get the following error:
>>
>> RuntimeError: Optimal parameters not found: Both actual and predicted
>> relative reductions in the sum of squares are at most 0.000000 and the
>> relative error between two consecutive iterates is at most 0.000000
>>
>> The following code produces this error:
>>
>> import numpy as np
>> import scipy as sp
>> import matplotlib.pyplot as plt
>> from scipy.optimize.minpack import curve_fit
>>
>> def f(x,a,b):
>>    return a*x+b
>>
>> x=np.array([1,10,20,21,30,31,39,40,49,50,59,60,
>>             69,70,4212,4213,4222,4223,4232,4233,4281,4302,4311,4312,
>>             4320,4321,4330,4331,4399,4400,4408,4409,4418,11086,11087,
>>             11095,11096,11105,11106,11115,11116,11117,36400,36401])
>>
>> y=f(x,0.101083824,82352.234)
>> y=y+np.random.randn(len(y))
>>
>> plt.plot(x,y,'r+')
>>
>> # Specify initial guess values for the parameters
>> p0=((y[-1]-y[0])/(x[-1]-x[0]),y[0])
>>
>> # Compute the fit without using the errors
>> popt,pcov=curve_fit(f,x,y,p0)
>>
>> However, if I first shift the curve using the guess values, e.g.
>> y = y - ((y[-1]-y[0])/(x[-1]-x[0]))*x - y[0]
>> then the fit does work.
>> Does anyone have any idea why this is and how to improve the
>> reliability of the curve_fit routine?
>
> It looks like curve_fit is raising an error/exception when it shouldn't:
>
>    if ier != 1:
>        raise RuntimeError, "Optimal parameters not found: " + mesg

optimize.leastsq is less picky about the solution:

    errors = {0:["Improper input parameters.", TypeError],
              1:["Both actual and predicted relative reductions in the "
                 "sum of squares\n  are at most %f" % ftol, None],
              2:["The relative error between two consecutive iterates "
                 "is at most %f" % xtol, None],
              3:["Both actual and predicted relative reductions in the "
                 "sum of squares\n  are at most %f and the relative error "
                 "between two consecutive iterates is at \n  most %f" %
                 (ftol, xtol), None],
              4:["The cosine of the angle between func(x) and any "
                 "column of the\n  Jacobian is at most %f in absolute "
                 "value" % gtol, None],
              5:["Number of calls to function has reached maxfev = %d." %
                 maxfev, ValueError],
              6:["ftol=%f is too small, no further reduction in the "
                 "sum of squares\n  is possible." % ftol, ValueError],
              7:["xtol=%f is too small, no further improvement in the "
                 "approximate\n  solution is possible." % xtol, ValueError],
              8:["gtol=%f is too small, func(x) is orthogonal to the "
                 "columns of\n  the Jacobian to machine precision." % gtol,
                 ValueError],
              'unknown':["Unknown error.", TypeError]}

    info = retval[-1]    # The FORTRAN return value

    if (info not in [1, 2, 3, 4] and not full_output):
        if info in [5, 6, 7, 8]:
            if warning:  print "Warning: " + errors[info][0]
        else:
            try:
                raise errors[info][1], errors[info][0]
            except KeyError:
                raise errors['unknown'][1], errors['unknown'][0]
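
In other words, leastsq treats ier values 1 through 4 as acceptable
convergence and only warns or raises for the remaining codes, while
curve_fit insists on ier == 1.  As a workaround until this is fixed,
you can call optimize.leastsq directly with full_output=1 and apply
the looser check yourself.  A minimal sketch, reusing f, x, y and p0
from the example above:

from scipy import optimize

def residuals(p, x, y):
    # data minus model, the form leastsq expects
    a, b = p
    return y - f(x, a, b)

popt, cov_x, infodict, mesg, ier = optimize.leastsq(
    residuals, p0, args=(x, y), full_output=1)

if ier in [1, 2, 3, 4]:  # the codes leastsq itself accepts
    print "fitted parameters:", popt
else:
    print "fit failed:", mesg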


I think curve_fit should just issue a warning instead of raising a
RuntimeError and throwing away potentially correct and useful results.
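
A sketch of the kind of change I mean (hypothetical, not a finished
patch): accept the same return codes that leastsq itself treats as
convergence, and downgrade the message for ier in 2-4 to a warning:

import warnings

# current check in curve_fit:
#     if ier != 1:
#         raise RuntimeError, "Optimal parameters not found: " + mesg
# proposed, more lenient check:
if ier not in [1, 2, 3, 4]:
    # genuine failure: keep the exception
    raise RuntimeError("Optimal parameters not found: " + mesg)
elif ier != 1:
    # converged, just not on the strictest criterion: warn, keep results
    warnings.warn("leastsq: ier=%d, %s" % (ier, mesg))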


Josef



>
> variations that work, see below:
>
> * for a linear curve, linalg.lstsq is better
> * using optimize.leastsq directly works
> * with larger noise and a constant shift of the data, it works
> * passing optimize.leastsq options didn't get rid of ier != 1, at
>   least not the ones I tried
>
>
> Filing a ticket, preparing a patch (looking at the return codes), and
> adding this as a test case would be very helpful for getting this
> fixed, assuming of course that my diagnosis is correct.
>
> Making non-linear estimation numerically more robust for all kinds of
> functions might be difficult, but I don't know much about that (in
> contrast to the linear case).
>
> Josef
>
>
> X = np.column_stack((np.ones(len(y)),x))
> Y = y[:,np.newaxis]
> print 'ols'
> print np.dot(np.linalg.inv(np.dot(X.T,X)),np.dot(X.T,Y))
>
> print 'lstsq'
> print linalg.lstsq(X, Y)
>
> print 'using leastsq directly'
> def f2((a,b), x, y):
>     return y - f(x, a, b)
> popt,pcov=optimize.leastsq(f2,p0,(x,y))
> print popt[::-1]
>
> print 'larger noise and constant correction'
> y2=f(x,0.101083824,82352.234)
> y2=y2+np.random.randn(len(y2))*100
> popt,pcov=curve_fit(f,x,y2-y2.min(),p0)
> print popt[1]+y2.min(), popt[0]
>
>
>
>>
>> Kind regards,
>>
>> Pim Schellart
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>


