[Numpy-discussion] Efficient orthogonalisation with scipy/numpy

josef.pktd at gmail.com
Tue Jan 19 22:02:37 EST 2010


On Tue, Jan 19, 2010 at 9:47 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Tue, Jan 19, 2010 at 6:08 PM, <josef.pktd at gmail.com> wrote:
>>
>> On Tue, Jan 19, 2010 at 6:48 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> > On Tue, Jan 19, 2010 at 2:34 PM, <josef.pktd at gmail.com> wrote:
>> >>
>> >> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris
>> >> <charlesr.harris at gmail.com> wrote:
>> >> >
>> >> >
>> >> > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux
>> >> > <gael.varoquaux at normalesup.org> wrote:
>> >> >>
>> >> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote:
>> >> >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0])
>> >> >>
>> >> >> > > with np = numpy and linalg = scipy.linalg where scipy calls
>> >> >> > > ATLAS.
>> >> >>
>> >> >> > For clarification, are you trying to find the components of the y
>> >> >> > vectors that are perpendicular to the space spanned by the 10
>> >> >> > orthonormal vectors in confounds?
>> >> >>
>> >> >> Yes. Actually, what I am doing is calculating partial correlation
>> >> >> between
>> >> >> x and y conditionally to confounds, with the following code:
>> >> >>
>> >> >> def cond_partial_cor(y, x, confounds=[]):
>> >> >>    """ Returns the partial correlation of y and x, conditionning on
>> >> >>        confounds.
>> >> >>    """
>> >> >>    # First orthogonalise y and x relative to confounds
>> >> >>    if len(confounds):
>> >> >>        y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0])
>> >> >>        x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0])
>> >> >>    return np.dot(x, y)/sqrt(np.dot(y, y)*np.dot(x, x))
>> >> >>
>> >> >> I am not sure that what I am doing is optimal.
>> >> >>
>> >> >> > > Most of the time is spent in linalg.lstsq. The length of the
>> >> >> > > vectors
>> >> >> > > is
>> >> >> > > 810, and there are about 10 confounds.
>> >> >>
>> >> >> > Exactly what are the shapes? y.shape = (810, N); confounds.shape =
>> >> >> > (810,
>> >> >> > 10)?
>> >> >>
>> >> >> Sorry, I should have been more precise:
>> >> >>
>> >> >> y.shape = (810, )
>> >> >> confounds.shape = (10, 810)
>> >> >>
>> >> >
>> >> > Column stack the bunch so that the last column is y, then do a qr
>> >> > decomposition. The last column of q is the (normalized) orthogonal
>> >> > vector
>> >> > and its amplitude is the last (bottom right) component of r.
>> >>
>> >> do you have to do qr twice, once with x and once with y in the last
>> >> column or can this be combined?
>> >>
>> >> I was trying to do something similar for partial autocorrelation for
>> >> timeseries but didn't manage or try anything better than repeated
>> >> leastsq or a variant.
>> >>
>> >
>> > Depends on what you want to do. The QR decomposition is essentially
>> > Gram-Schmidt on the columns. So if you just want an orthonormal basis
>> > for
>> > the subspace spanned by a bunch of columns, the columns of Q are they.
>> > To
>> > get the part of y orthogonal to that subspace you can do y - Q*Q.T*y,
>> > which
>> > is probably what you want if the x's are fixed and the y's vary. If
>> > there is
>> > just one y, then putting it as the last column lets the QR algorithm do
>> > that
>> > last bit of projection.
>>
>> Gram-Schmidt (looking at it for the first time) looks a lot like
>> sequential least squares projection. So, I'm trying to figure out if I
>> can use the partial results up to a specific column as partial least
>> squares and then work my way to the end by including/looking at more
>> columns.
>>
>
> I don't think the QR factorization would work for normal PLS. IIRC, one of the
> algorithms does a svd of the cross correlation matrix. The difference is
> that in some sense the svd picks out the best linear combination of columns,
> while the qr factorization without column pivoting just takes them in order.
> The QR factorization used to be the method of choice for least squares
> because it is straightforward to compute, no iterating needed as in svd,
> but these days that advantage is pretty much gone. It is still a common
> first step in the svd, however. The matrix is factored to Q*R, then the svd
> of R is computed.

I (finally) figured out svd and eigenvalue decomposition for this purpose.
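
For the orthogonalisation itself, my (untested) reading of the QR
suggestions above, with Gael's shapes (confounds.shape == (10, 810),
y.shape == (810,)) and a helper name that is only for illustration:

    import numpy as np

    def orthogonalize_against(y, confounds):
        # Orthonormal basis for the span of the confound vectors;
        # q can be reused if many y's share the same confounds.
        q, _ = np.linalg.qr(confounds.T)      # q.shape == (810, 10)
        # Subtract the projection of y onto that span: y - Q Q^T y
        return y - np.dot(q, np.dot(q.T, y))

    # Variant with y appended as the last column: a single QR gives the
    # orthogonal component directly, and abs(r[-1, -1]) is its norm.
    # a = np.column_stack((confounds.T, y))
    # q, r = np.linalg.qr(a)
    # y_orth = q[:, -1] * r[-1, -1]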

But from your description of QR, I was thinking specifically of the
case where we have a "natural" ordering of the regressors, similar to
the polynomial case you and Anne discussed. In the timeseries case the
ordering would be by increasing lags:

y_t on y_{t-1}
y_t on y_{t-1}, y_{t-2}
...
y_t on y_{t-1}, ..., y_{t-K}

or y_t on x_t and the lags of x_t

This is really sequential LS with a predefined sequence, not PLS or
PCA/PCR or similar orthogonalization by "importance".
The usual procedure for deciding on the appropriate number of lags
loops over OLS with an increasing number of regressors.
From the discussion, I thought there might be a way to "cheat" at this
using QR and Gram-Schmidt, something like the sketch below.
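
Something along these lines is what I have in mind (a rough, untested
sketch; x is a hypothetical (nobs, K) matrix whose columns are the lags
in increasing order, and y holds the current observations):

    import numpy as np
    from scipy import linalg

    def nested_ols_from_qr(x, y):
        # One reduced QR of the full lag matrix: q is (nobs, K), r is (K, K).
        q, r = np.linalg.qr(x)
        z = np.dot(q.T, y)                    # coordinates of y in the Q basis
        betas, rss = [], []
        for k in range(1, x.shape[1] + 1):
            # OLS coefficients using only the first k lags:
            # x[:, :k] = q[:, :k] r[:k, :k], so beta_k solves r[:k, :k] b = z[:k]
            betas.append(linalg.solve_triangular(r[:k, :k], z[:k]))
            # Residual sum of squares; adding lag k removes z[k-1]**2
            rss.append(np.dot(y, y) - np.dot(z[:k], z[:k]))
        return betas, rss

The number of lags could then be picked from the rss (for example with
an information criterion) without refitting K separate regressions.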

Thanks,

Josef


>
>>
>> But unfortunately I don't have time to play with it long enough to
>> figure out whether and how it works, but I keep this in mind for the
>> future.
>>
>
> Chuck
>


