[SciPy-dev] code for incremental least squares

Charles R Harris charlesr.harris at gmail.com
Wed Feb 17 11:03:23 EST 2010


On Wed, Feb 17, 2010 at 8:10 AM, <josef.pktd at gmail.com> wrote:

> On Wed, Feb 17, 2010 at 3:32 AM, Nathaniel Smith <njs at pobox.com> wrote:
> > 2010/2/16 Stéfan van der Walt <stefan at sun.ac.za>:
> >> Forming the product A^T A is often a bad idea from a numerical
> >> perspective.
> >
> > Right.
> >
> >> In Chapter 3 of Åke Björck's "Numerical Methods for Least
> >> Squares Problems", he talks about "update" problems.  He mentions that
> >> "...the solution should be accurate up to the limitation of data and
> >> conditioning of the problem; i.e., a stable numerical method must be
> >> used."
> >>
> >> He describes the Kalman-based update method (which bears some
> >> resemblance to yours), but says that "the main disadvantage...is its
> >> serious sensitivity to roundoff errors.  The updating algorithms based
> >> on orthogonal transformations developed in the following sections are
> >> generally to be preferred."  He then goes into more detail on updating
> >> the QR and Gram-Schmidt decompositions.
> >>
> >> Not sure if that helps, but it may be worth reading that chapter.
> >
> > Definitely -- thanks for the reference! Looks like the library has a
> > copy available, too...
> >
> > (Not that I'll necessarily bother for my own use, since like I said,
> > my problems seem to be fairly well conditioned and the sparse X'X +
> > Cholesky approach is very fast, and also generalizes easily to mixed
> > effect models, which I'm currently working on implementing.)
> >
> >>> For the QR code I use a recurrence relation I stole from some slides
> >>> by Simon Wood to compute R and Q'y incrementally; probably a real
> >>> incremental QR (e.g., "AS274", which is what R's biglm package uses)
> >>> would be better, but this one is easy to implement in terms of
> >>> non-incremental QR.
> >>
> >> Incremental QR is something we should implement in scipy.linalg,
> >> for sure.
> >
> > +1
>
> Just a follow-up question for the experts.
>
> In the description of some estimators and the optimization routines
> used, I have read that the updating is done on the Cholesky or QR
> decomposition of the inverse matrix ((X'X)^{-1} or the inverse Hessian).
> Except for simple least squares problems, the inverse is often more
> important than the original X'X.
>
> Are there algorithms available for this, and is there an advantage to
> either approach? From what I have seen in incremental_ls, the updating
> works on the QR of X'X and then the inverse of the QR decomposition is
> taken, assuming a well-behaved, non-singular X'X.
>
>
Both versions are available for Kalman filters. Using the covariance (the
inverse) is the usual approach, but the other version is also used and goes
under the name of the Kalman information filter. There are various ways to
minimise roundoff error depending on how the updates are done. For high
precision there are implementations that work with factorizations, either
the Cholesky factorization or what amounts to QR. There is a large
literature on the subject going back to the 1960s.
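
To make the factorization-based variant concrete, here is a minimal sketch
(not the incremental_ls code itself) of the stack-and-refactor recurrence
Nathaniel mentions upthread: carry the triangular factor R (with R'R = X'X
over the rows seen so far) together with f = Q'y, stack each new block of
rows underneath, and re-run a dense QR on the small stack. The block size
and the helper name qr_update are arbitrary choices for the example:

    import numpy as np
    from scipy.linalg import solve_triangular

    def qr_update(R, f, X_block, y_block):
        # Stack the new rows under the current triangular factor and
        # refactor; R_new satisfies R_new' R_new = X'X over all rows seen.
        Q1, R_new = np.linalg.qr(np.vstack([R, X_block]), mode='reduced')
        f_new = Q1.T @ np.concatenate([f, y_block])
        return R_new, f_new

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 5))
    y = X @ np.arange(1.0, 6.0) + 0.01 * rng.standard_normal(1000)

    # factor the first block, then absorb the rest 100 rows at a time
    Q0, R = np.linalg.qr(X[:100], mode='reduced')
    f = Q0.T @ y[:100]
    for i in range(100, 1000, 100):
        R, f = qr_update(R, f, X[i:i+100], y[i:i+100])

    beta = solve_triangular(R, f)  # back-substitution; beta ~ [1 .. 5]

Only R and f are carried between blocks, so the full X never has to be in
memory, and the solve goes through the triangular R rather than through
X'X, which is the numerical point Stéfan raised.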

Kalman filters are Bayesian and require a prior to get started, but such a
prior isn't hard to come by: zero mean with a large covariance suffices.
However, that might make them less suitable for an application where all
the data is available up front and dynamic estimates of the parameters
aren't needed. But the numerical parts may offer some useful bits.
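
For illustration, a bare-bones covariance-form recursion started from
exactly that kind of vague prior (the names, the 1e8 prior scale, and unit
observation noise are all assumptions of the sketch, not anything from
incremental_ls). Note that this plain form is the roundoff-sensitive
variant Björck warns about; the factorized (square-root) implementations
mentioned above exist precisely to stabilize the update of P:

    import numpy as np

    def rls_step(beta, P, x, y):
        Px = P @ x
        k = Px / (1.0 + x @ Px)           # gain, unit observation noise
        beta = beta + k * (y - x @ beta)  # correct estimate by innovation
        P = P - np.outer(k, Px)           # rank-one covariance downdate
        return beta, P

    p = 5
    beta = np.zeros(p)    # vague prior: zero mean ...
    P = 1e8 * np.eye(p)   # ... with a large covariance

    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, p))
    y = X @ np.arange(1.0, p + 1) + 0.01 * rng.standard_normal(500)
    for x_i, y_i in zip(X, y):
        beta, P = rls_step(beta, P, x_i, y_i)
    # beta now essentially matches the batch least squares solution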

Chuck

