Linear regression in 3 dimensions

bernhard.voigt at gmail.com bernhard.voigt at gmail.com
Mon Sep 4 16:32:02 EDT 2006


Hi Robert,

I'm using the scipy package for such problems. In the submodule
scipy.optimize there is an implmentation of a least-square fitting
algorithm (Levenberg-Marquardt) called leastsq.

You have to define a function that computes the residuals between your
model and the data points:

import scipy.optimize

def model(parameter, x, y):
    a, b, c = parameter
    return a*x + b*y + c

def residual(parameter, data, x, y):
    res = []
    for _x in x:
        for _y in y:
            res.append(data-model(parameter,x,y)
    return res

params0 = [1., 1.,1.]
result = scipy.optimize.leastsq(resdiual, params0, (data,x,y))
fittedParams = result[0]

If you haven't used numeric, numpy or scipy before, you should take a
look at an introduction. It uses some nice extended array objects,
where you can use some neat index tricks and compute values of array
items without looping through it.

Cheers! Bernhard



Robert Kern wrote:
> wirecom at wirelessmeasurement.com wrote:
> > Hi all,
> >
> > I am seeking a module that will do the equivalent of linear regression in
> > 3D to yield a best fit a plane through a set of points (X1, Y1, Z1), (X1,
> > Y1, Z1),... (Xn, Yn, Zn).
> >
> > The resulting equation to be of the form:
> >
> >  Z = aX + bY + c
> >
> > The function I need would take the set of points and return a,c & c Any
> > pointers to existing code / modules would be very helpful.
>
> Well, that's a very unspecified problem. You haven't defined "best."
>
> But if we make the assumption that you want to minimize the squared error in Z,
> that is minimize
>
>    Sum((Z[i] - (a*X[i] + b*Y[i] + c)) ** 2)
>
> then this is a standard linear algebra problem.
>
> In [1]: import numpy as np
>
> In [2]: a = 1.0
>
> In [3]: b = 2.0
>
> In [4]: c = 3.0
>
> In [5]: rs = np.random.RandomState(1234567890)  # Specify a seed for repeatability
>
> In [6]: x = rs.uniform(size=100)
>
> In [7]: y = rs.uniform(size=100)
>
> In [8]: e = rs.standard_normal(size=100)
>
> In [9]: z = a*x + b*y + c + e
>
> In [10]: A = np.column_stack([x, y, np.ones_like(x)])
>
> In [11]: np.linalg.lstsq?
> Type:           function
> Base Class:     <type 'function'>
> String Form:    <function lstsq at 0x6df070>
> Namespace:      Interactive
> File:
> /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy-1.0b2.dev3002-py2.4-macosx-10.4-ppc.egg/numpy/linalg/linalg.py
> Definition:     np.linalg.lstsq(a, b, rcond=1e-10)
> Docstring:
>      returns x,resids,rank,s
>      where x minimizes 2-norm(|b - Ax|)
>            resids is the sum square residuals
>            rank is the rank of A
>            s is the rank of the singular values of A in descending order
>
>      If b is a matrix then x is also a matrix with corresponding columns.
>      If the rank of A is less than the number of columns of A or greater than
>      the number of rows, then residuals will be returned as an empty array
>      otherwise resids = sum((b-dot(A,x)**2).
>      Singular values less than s[0]*rcond are treated as zero.
>
>
> In [12]: abc, residuals, rank, s = np.linalg.lstsq(A, z)
>
> In [13]: abc
> Out[13]: array([ 0.93104714,  1.96780364,  3.15185125])
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless enigma
>   that is made terrible by our own mad attempt to interpret it as though it had
>   an underlying truth."
>    -- Umberto Eco




More information about the Python-list mailing list