Linear regression in NumPy

Matt Crema crema at bu.edu
Sat Mar 18 10:24:34 EST 2006


Robert Kern wrote:
> nikie wrote:
> 
>>I still don't get it...
>>My data looks like this:
>> x = [0,1,2,3]
>> y = [1,3,5,7]
>>The expected output would be something like (2, 1), as y[i] = x[i]*2+1
>>
>>(An image sometimes says more than 1000 words, so to make myself clear:
>>this is what I want to do:
>>http://www.statistics4u.info/fundstat_eng/cc_regression.html)
>>
>>So, how am I to fill these matrices?
> 
> 
> As the docstring says, the problem it solves is min ||A*x - b||_2. In order to
> get it to solve your problem, you need to cast it into this matrix form. This is
> out of scope for the docstring, but most introductory statistics or linear
> algebra texts will cover this.
> 
> In [201]: x = array([0., 1, 2, 3])
> 
> In [202]: y = array([1., 3, 5, 7])
> 
> In [203]: A = ones((len(y), 2), dtype=float)
> 
> In [204]: A[:,0] = x
> 
> In [205]: from numpy import linalg
> 
> In [206]: linalg.lstsq(A, y)
> Out[206]:
> (array([ 2.,  1.]),
>  array([  1.64987674e-30]),
>  2,
>  array([ 4.10003045,  1.09075677]))
> 

I'm new to numpy myself.

The above posters are correct to say that the problem must be cast into 
matrix form.  However, as this is such a common technique, don't most 
math/stats packages do it behind the scenes?

For example, in Matlab or Octave I could type:
polyfit(x,y,1)

and I'd get the answer with shorter, more readable code.  A one-liner!
Is there a 'canned' routine to do it in NumPy?
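
For what it's worth, NumPy does appear to ship a polyfit of its own with
the same calling convention as the Matlab/Octave one. Below is a minimal
sketch that runs it next to the explicit lstsq formulation from the
quoted session, using the same x and y. The `np` alias, the variable
names, and the `rcond=None` argument (which assumes a reasonably recent
NumPy) are my own choices and not part of the posts above.

```python
import numpy as np

# Same data as in the quoted session: y = 2*x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Explicit matrix form: first column of A is x, second column is ones,
# so A @ [slope, intercept] approximates y in the least-squares sense.
A = np.column_stack([x, np.ones_like(x)])
coeffs_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]

# 'Canned' routine: fit a degree-1 polynomial.
# Coefficients come back highest power first, i.e. (slope, intercept).
slope, intercept = np.polyfit(x, y, 1)

print("lstsq:  ", coeffs_lstsq)
print("polyfit:", slope, intercept)
```

Both calls should give a slope close to 2 and an intercept close to 1,
up to floating-point rounding. polyfit builds the design matrix
internally, which is why it fits on one line.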

btw, I am not advocating that one should not understand the concepts
behind a 'canned' routine.  If you do not understand this concept, you
should take Robert Kern's advice and dive into a linear algebra book.
It's not very difficult, and it is essential that a scientific
programmer understand it.

-Matt


