[SciPy-User] How to fit data with errorbars
josef.pktd at gmail.com
josef.pktd at gmail.com
Wed Feb 17 12:30:33 EST 2010
On Wed, Feb 17, 2010 at 12:12 PM, Joe Harrington <jh at physics.ucf.edu> wrote:
>> I have some data with some errobars that I need to fit to a line.
>
> Actually, you should fit the line; please don't change the data! ;-)
>
> The routine below does the job. I was surprised not to find this
> routine in numpy or scipy as it's very standard (I'm still convinced I
> must have missed it, so somebody please embarrass me and point it
> out).
scipy.stats.linregress for linear fit in one explanatory variable, but
ols only, there are no weights.
optimize.curve_curvefit uses weights but is intended for non-linear
(in parameters), and scipy.odr is very flexible.
And of course there is scikits.statsmodels for this and much more.
Josef
> The routine has a number of parameters that are standard in other
> numerical packages, such as IDL, for *every* fitting routine. We
> should consider adding these parameters to all our fitting routines,
> as well, since having them encourages people to fit (more) properly
> (i.e., actually assess the goodness of fit, etc.).
>
> --jh--
>
> import numpy as np
> import scipy.special as ss
>
> def linfit(y, x=None, y_unc=None):
> '''
> Fits a line to 2D data, optionally with errors in y.
>
> The method is robust to roundoff error.
>
> Parameters
> ----------
> y: ndarray
> Ordinates, any shape.
> x: ndarray
> Abcissas, same shape. Defaults to np.indices(y.length))
> y_unc: ndarray
> Uncertainties in y. If scalar or 1-element array, applied
> uniformly to all y values. [NOT IMPLEMENTED YET!] Must be
> positive.
>
> Returns
> -------
> a: scalar
> 0 Fitted intercept
> b: scalar
> 1 Fitted slope
> a_unc: scalar
> 2 Uncertainty of fitted intercept
> b_unc: scalar
> 3 Uncertainty of fitted slope
> chisq: scalar
> 4 Chi-squared
> prob: scalar
> 5 Probability of finding worse Chi-squared for this model with
> these uncertainties.
> covar: ndarray
> 6 Covariance matrix: [[a_unc**2, covar_ab],
> [covar_ab, b_unc**2]]
> yfit: ndarray
> 7 Model array calculated for our abcissas
>
> Notes
> -----
>
> If prob > 0.1, you can believe the fit. If prob > 0.001 and the
> errors are not Gaussian, you could believe the fit. Otherwise
> do not believe it.
>
> See Also
> --------
> Press, et al., Numerical Recipes in C, 2nd ed, section 15.2,
> or any standard data analysis text.
>
> Examples
> --------
> >>> import linfit
>
> >>> a = 1.
> >>> b = 2.
> >>> nx = 10
> >>> x = np.arange(10, dtype='float')
> >>> y = a + b * x
> >>> y_unc = numpy.ones(nx)
> >>> y[::2] += 1
> >>> y[1::2] -= 1
> >>> a, b, sa, sb, chisq, prob, covar, yfit = linfit.linfit(y, x, y_unc)
> >>> print(a, b, sa, sb, chisq, prob, covar, yfit)
> (1.272727272727272, 1.9393939393939394, 0.58775381364525869, 0.11009637651263605, 9.6969696969696937, 0.28694204178663996, array([[ 0.34545455, -0.05454545],
> [-0.05454545, 0.01212121]]), array([ 1.27272727, 3.21212121, 5.15151515, 7.09090909,
> 9.03030303, 10.96969697, 12.90909091, 14.84848485,
> 16.78787879, 18.72727273]))
>
> Revisons
> --------
> 2007-09-23 0.1 jh at physics.ucf.edu Initial version
> 2007-09-25 0.2 jh at physics.ucf.edu Fixed bug reported by Kevin Stevenson.
> 2008-10-09 0.3 jh at physics.ucf.edu Fixed doc bug.
> 2009-10-01 0.4 jh at physics.ucf.edu Updated docstring, imports.
> '''
> # standardize and test inputs
> if x == None:
> x = np.indices(y.length, dtype=y.dtype)
> x.shape = y.shape
>
> if y_unc == None:
> y_unc = np.ones(y.shape, dtype=y.dtype)
>
> # NR Eq. 15.2.4
> ryu2 = 1. / y_unc**2
> S = np.sum(1. * ryu2)
> Sx = np.sum(x * ryu2)
> Sy = np.sum(y * ryu2)
> # Sxx = np.sum(x**2 * ryu2) # not used in the robust method
> # Sxy = np.sum(x * y * ryu2) # not used in the robust method
>
> # NR Eq. 15.2.15 - 15.2.18 (i.e., the robust method)
> t = 1. / y_unc * (x - Sx / S)
> Stt = np.sum(t**2)
>
> b = 1. / Stt * np.sum(t * y / y_unc)
> a = (Sy - Sx * b) / S
>
> covab = -Sx / (S * Stt) # NR Eq. 15.2.21
>
> sa = np.sqrt(1. / S * (1. - Sx * covab)) # NR Eq. 15.2.19
> sb = np.sqrt(1. / Stt) # NR Eq. 15.2.20
>
> rab = covab / (sa * sb) # NR Eq. 15.2.22
>
> covar = np.array([[sa**2, covab],
> [covab, sb**2]])
>
> yfit = a + b * x
> chisq = np.sum( ((y - yfit) / y_unc)**2 )
>
> prob = 1. - ss.gammainc( (y.size - 2.) / 2., chisq / 2.)
>
> return a, b, sa, sb, chisq, prob, covar, yfit
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
More information about the SciPy-User
mailing list