[SciPy-User] SciPy-User Digest, Vol 82, Issue 43

Tue Jun 15 12:15:43 EDT 2010

On Tue, Jun 15, 2010 at 11:52 AM, Timothy Kinney
<timothyjkinney at gmail.com> wrote:
>> From: Charles R Harris <charlesr.harris at gmail.com>
>> Subject: Re: [SciPy-User] Leastsq questions
>>> I am using the scipy leastsq method to fit some cooling data, such
>>> that the temperature is defined by an exponential decay function
>>> (Newton's Cooling Law). However, there are some other factors which
>>> also influence the cooling rate and I am attempting to account for
>>> them in the cooling law. I have some questions about leastsq:
>>>
>>> 1) When I fit the data in Excel I get a different fit than when I fit
>>> the same data in Scipy. Why is this? The fits are not very different,
>>> but they are consistently different.
>>>
>> It is impossible to know without a good deal more information, i.e., what is
>> your model, how is it parameterized, what is the data, when do the
>> iterations stop, and, if the parameters aren't sufficiently independent over
>> the data set, what is the required condition number. I suspect the latter is
>> coming into play here.
>
> I am using Newton's Cooling Law as the model. It states that the
> temperature at time t can be determined as an exponential decay term
> modifying the difference in temperature between the body and the
> environment:
>
> T(t) = Ta + (T0- Ta)Exp(-kt) where Ta is the ambient temperature, T0
> is the temperature at t=0, and k is the cooling constant (determined
> empirically, which is what I am doing by regression).
>
> I have temperature and time data points for my sample which I am
> plotting in Excel 2003 and Python. I then perform an exponential fit
> using the graphical trendline in Excel and using leastsq in Scipy.
>
> The data points (as lists in Python) are:
> time = [356, 1476, 1477, 1478, 1480, 1480, 1481, 1482, 1485, 1489]
> temp = [600, 50, 46, 43, 40, 39, 38, 37, 36, 35]
>
> In Excel, I get the fit: y = 1416.904401 exp(-0.002406x) with R^2 0.9855
>
> In Python, I get y = 1638.2891 * exp(-0.00251719x) and I'm not sure
> what the R^2 is.
>
>>> 2) How do I calculate the goodness of fit (R squared) for the leastsq
>>> algorithm? I think it's just the sum of the squared errors divided by
>>> something, but shouldn't this be easily called from the output?
>>>
>>>
>> If you are using the function correctly, then you already have an error
>> function that returns the residuals. Note that it is also available in the
>> full return.
>
> If I call the residuals function with the plsq[0] (the returned
> information from calling leastsq), I get an array (abbreviating the
> decimals):
>
> [-68.67 10.11 7.21 5.21 3.31 0.42 0.51 -0.49 -1.39 -2.29 -2.19 -2.99
> -3.70 -3.60]
>
> If I square each value and sum over the array I get 4956.00, but I
> need to divide this by something to calculate the R^2. I see no where
> in the documentation for leastsq where this is explained or where this
> calculation is callable.
>
> My full return is pasted below but I don't see an R^2 written in
> there...maybe it is called something else?
>
> (array([  1.63828911e+03,  -2.51719240e-03]), array([[
> 2.11286829e-01,  -1.68448450e-06],
>       [ -1.68448450e-06,   2.41312772e-11]]), {'qtf':
> array([-0.00033384,  0.00037401]), 'nfev': 16, 'fjac': array([[
> 3.05684317e+05,   2.36280141e-01,   2.23786930e-01,
>          2.15345493e-01,   2.06813134e-01,   1.93842627e-01,
>          1.93842627e-01,   1.89472893e-01,   1.85079918e-01,
>          1.80663583e-01,   1.80663583e-01,   1.76223829e-01,
>          1.71760516e-01,   1.71760516e-01],
>       [  2.43706858e+00,   2.17552354e+00,   2.48497525e-01,
>          2.56589204e-01,   2.64756241e-01,   2.77149269e-01,
>          2.77149269e-01,   2.81318568e-01,   2.85507103e-01,
>          2.89714982e-01,   2.89714982e-01,   2.93942237e-01,
>          2.98188985e-01,   2.98188985e-01]]), 'fvec': array([
> -5.80026885,  31.45722273,  21.5073541 ,  14.16136832,
>         7.77830677,  -2.36621221,  -1.36621221,  -5.09979308,
>        -7.84278392, -10.59520846,  -9.59520846, -11.35709047,
>       -12.12845379, -11.12845379]), 'ipvt': array([2, 1])}, 'Both
> actual and predicted relative reductions in the sum of squares\n  are
> at most 0.000000', 1)
>
>> I would like to iterate over a computation where I change one of the
>>> values and see how it effects the goodness of the fit. I'm not sure
>>> how to calculate the r-squared from the plsq that is returned from
>>> leastsq.
>> plsq?
>
> Sorry, that's what I named the returned information from calling
> leastsq. plsq[0] is the array containing the coefficients for the fit.
>
> This is actually a different fit from the one I am confused about
> above. But if I can figure out how to calculate the R^2 from leastsq,
> I think I can figure out how to solve that one. Or if not, I'll post
> more detailed info about it.
>
> I appreciate any help you can offer.


I get the same results as your Excel results for the log transformed
linear model


>>> y=np.array([600., 50, 46, 43, 40, 39, 38, 37, 36, 35])
>>> t = np.array([356, 1476, 1477, 1478, 1480, 1480, 1481, 1482, 1485, 1489])
>>> from scipy import stats
>>> res = stats.linregress(t, np.log(y))
>>> np.exp(res[1]), res[0]
(1416.9044014863491, -0.002405999077350617)
>>> yhat = np.exp(res[1])*np.exp(res[0]*t)
>>> y-yhat
array([-1.6609619 ,  9.35096348,  5.44864747,  2.5460967 , -0.2597068 ,
       -1.2597068 , -2.16295842, -3.06644253, -3.77828427, -4.39729448])


>>> rss=((np.log(y)-np.log(yhat))**2).sum()
>>> rss
0.096919787740057148
>>> lny=np.log(y)
>>> 1-rss/((lny-lny.mean())**2).sum()   #R-squared
0.98551323008032532

Josef

>
> -Tim
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>