[SciPy-Dev] adding linear fitting routine

David Pine djpine at gmail.com
Fri Nov 29 10:21:19 EST 2013


I have written a function called linfit for linear least-squares fitting, and I propose adding it to one of the numpy or scipy libraries.

linfit performs a full least squares chi-squared fit. Thus, it can handle data with error estimates (aka error bars), weighting the data accordingly.

linfit provides estimates of the uncertainties of the fitted parameters, the slope and the y-intercept. It lets the user choose one of three weighting schemes:

(1) use no weighting, or

(2) weight the data according to the residuals, often called relative weighting (the way it's often done in the social sciences), or

(3) use absolute uncertainties, given either for each data point individually or as a single value for all the data points (the way it's often done in the physical sciences).

These options were included with the recent discussion on weighted least squares fitting in mind.  See scipy/scipy#448.
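
To make the three weighting options concrete, here is a minimal sketch of a chi-squared straight-line fit using the standard closed-form formulas. It is not the linfit source: the function name line_fit and its parameters sigma and relative are hypothetical, chosen only for this illustration, and the sketch assumes at least three data points.

```python
import numpy as np


def line_fit(x, y, sigma=None, relative=True):
    """Fit y = slope*x + intercept by (weighted) least squares.

    Hypothetical illustration, not the linfit implementation.
    sigma:    None (no weighting), a scalar, or one value per point.
    relative: if True, only the relative sizes of sigma matter and the
              parameter uncertainties are rescaled using the residuals;
              if False, sigma is taken as an absolute uncertainty.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if sigma is None:
        # No error bars: equal weights, uncertainties come from residuals.
        w = np.ones_like(x)
        relative = True
    else:
        s = np.broadcast_to(np.asarray(sigma, dtype=float), x.shape)
        w = 1.0 / s**2

    # Weighted sums that appear in the normal equations.
    S = w.sum()
    Sx = (w * x).sum()
    Sy = (w * y).sum()
    Sxx = (w * x * x).sum()
    Sxy = (w * x * y).sum()
    delta = S * Sxx - Sx**2

    slope = (S * Sxy - Sx * Sy) / delta
    intercept = (Sxx * Sy - Sx * Sxy) / delta

    # Parameter variances assuming sigma is absolute.
    var_slope = S / delta
    var_intercept = Sxx / delta

    if relative:
        # Rescale by the reduced chi-squared (n - 2 degrees of freedom).
        chi2 = (w * (y - slope * x - intercept) ** 2).sum()
        scale = chi2 / (x.size - 2)
        var_slope *= scale
        var_intercept *= scale

    return slope, intercept, np.sqrt(var_slope), np.sqrt(var_intercept)


x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
print(line_fit(x, y))                             # (1) no weighting
print(line_fit(x, y, sigma=[1, 2, 1, 2]))         # (2) relative weighting
print(line_fit(x, y, sigma=0.5, relative=False))  # (3) absolute uncertainties
```

With relative weighting, multiplying every sigma by the same constant leaves both the fitted parameters and their uncertainties unchanged. With absolute weighting, the uncertainties scale with sigma.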

I am not sure where linfit best belongs in the numpy/scipy universe. The most reasonable places would seem to be either the polynomial package (a straight line is the simplest polynomial) or perhaps the scipy.optimize package along with curve_fit, which fits nonlinear functions to data.

I wrote the function because there really is nothing like it in numpy or scipy, and it is so basic that, in my opinion, something like it should be available. I tried to write it so that it is useful to a very wide range of users across all branches of the social and physical sciences as well as engineering.

I have added linfit to a cloned version of the numpy.polynomial module, which can be found at https://github.com/numpy/numpy/pull/4080.

The standalone linfit function can be found at https://github.com/djpine/linfit.  An ipython notebook demonstrating various ways of using linfit is available at the same site; its output can be viewed at http://nbviewer.ipython.org/github/djpine/linfit/blob/master/linfit.ipynb.

I have included a unit test, test_linfit.py, that, among other things, compares the speed of linfit to numpy.polyfit, scipy.linalg.lstsq, and scipy.stats.linregress. linfit was faster in all the cases I tested, typically by a factor of several.
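
For readers who want to try a comparison of this kind themselves, the following is a rough sketch of a timing harness. It is not test_linfit.py. The function names are hypothetical, the closed-form fit is a plain unweighted stand-in rather than linfit itself, and the scipy routines are left out so that the example depends only on numpy. No particular timings are implied; results will vary with machine and array size.

```python
import timeit

import numpy as np

# Synthetic straight-line data with Gaussian noise (illustrative values).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 1000)
y = 3.0 * x - 2.0 + rng.normal(scale=0.5, size=x.size)


def fit_polyfit():
    # Degree-1 polynomial fit; coefficients come back highest power first.
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept


def fit_lstsq():
    # General linear least squares on the design matrix [x, 1].
    A = np.column_stack([x, np.ones_like(x)])
    (slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
    return slope, intercept


def fit_closed_form():
    # Unweighted closed-form solution for a straight line.
    xm, ym = x.mean(), y.mean()
    dx = x - xm
    slope = (dx * (y - ym)).sum() / (dx * dx).sum()
    return slope, ym - slope * xm


n_calls = 200
for func in (fit_polyfit, fit_lstsq, fit_closed_form):
    seconds = timeit.timeit(func, number=n_calls)
    print(f"{func.__name__}: {seconds / n_calls * 1e6:.1f} us per call")
```

All three functions solve the same unweighted problem, so they should agree on the slope and intercept to within floating-point error; only the time per call differs.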

Finally, I am new to using Github and to contributing to numpy/scipy, so I am not sure if I have submitted everything properly, but I hope this gets the process going.

David Pine