[SciPy-user] OLS matrix-f(x) = 0 problem (Was: linear regression)

josef.pktd at gmail.com josef.pktd at gmail.com
Wed May 27 18:27:29 EDT 2009


On Wed, May 27, 2009 at 5:53 PM, Gael Varoquaux
<gael.varoquaux at normalesup.org> wrote:
> On Wed, May 27, 2009 at 03:55:18PM -0500, Robert Kern wrote:
>> On Wed, May 27, 2009 at 15:50, Gael Varoquaux
>> <gael.varoquaux at normalesup.org> wrote:
>> > I have been fighting a bit with an OLS regression problem (my ignorance in
>> > regression is wide), and a remark by Robert just prompted me to ask the
>> > list:
>
>> > On Wed, May 27, 2009 at 02:37:14PM -0500, Robert Kern wrote:
>> >> "f(x)=0" models can express covariances between all dimensions of x.
>
>> > Sorry for asking you about my 'homework', but people seem so
>> > knowledgeable...
>
>> > I have a multivariate dataset X, and a given sparse, lower triangular,
>> > boolean, matrix T with an empty diagonal. I am interested in finding the
>> > matrix R for which support(R) == support(T), that is the OLS solution to:
>
>> > Y = np.dot(R, Y)
>
>> Where did Y come from? And where did X and T go?
>
> Darn, sorry. Y and X are the same thing: my data. T is only there to
> specify the support of R. Another way to put it is that I know that a
> large fraction of the coefficients of R are zeros.
>
> I have a hunch that I need to 'unroll' the non-zero coefficients, and get
> back to a simpler, and well-known OLS estimation problem, but I couldn't
> do it.
>

Sounds like a recursive system of linear (simultaneous) equations with
linear restrictions to me. If you want an unbiased estimator, then
going row by row and solving each linear OLS problem with linalg.lstsq,
substituting the previous estimates of the Y's into the next step,
would be the standard way to go.
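
A minimal sketch of the row-by-row idea, under assumptions that are mine
and not from the thread: the data X is stored as N observations by K
variables, row k of R is fitted by regressing column k of X on the
columns selected by T[k], and the observed columns (not previously
fitted values) are used as regressors. The helper name fit_rows and the
toy data are hypothetical.

```python
import numpy as np

def fit_rows(X, T):
    """Fit R row by row so that X[:, k] ~ X[:, support_k] @ R[k, support_k].

    X : (N, K) array, N observations of K variables (assumed layout).
    T : (K, K) boolean array giving the allowed support of R.
    """
    X = np.asarray(X, dtype=float)
    T = np.asarray(T, dtype=bool)
    K = X.shape[1]
    R = np.zeros((K, K))
    for k in range(K):
        cols = np.flatnonzero(T[k])
        if cols.size == 0:
            continue  # no regressors allowed for this row, leave it at zero
        coef, *_ = np.linalg.lstsq(X[:, cols], X[:, k], rcond=None)
        R[k, cols] = coef
    return R

# Toy data generated from a known strictly lower triangular R_true.
rng = np.random.default_rng(0)
N, K = 5000, 4
T = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [1, 0, 1, 0]], dtype=bool)
R_true = np.zeros((K, K))
R_true[T] = [0.5, -1.0, 2.0, 0.3]
X = np.zeros((N, K))
for k in range(K):
    # Column k depends only on earlier columns plus independent noise.
    X[:, k] = X @ R_true[k] + rng.standard_normal(N)

R_hat = fit_rows(X, T)
```

In the Y = np.dot(R, Y) notation of the question, Y would be X.T here.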

There might also be a way to estimate all in one big OLS if you find
the linear transformation matrix that removes the zeros from your R
matrix. But here, I'm not sure how easy this is, and how to get back
unbiased estimators.
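
One possible way to write the "one big OLS", as a sketch under my own
assumptions (X stored as N by K, plain OLS with no cross-equation
weighting): stack the K equations into a single system with
np.kron(np.eye(K), X) and keep only the columns where T is True. The
stacked design is block diagonal, so this should give the same
coefficients as fitting each row separately; it mainly illustrates the
"unrolling" and gets expensive for large K. The variable names are
hypothetical.

```python
import numpy as np

# Toy data: N observations of K variables, with a strictly lower
# triangular boolean support T (both made up for illustration).
rng = np.random.default_rng(1)
N, K = 200, 4
T = np.tril(np.ones((K, K), dtype=bool), k=-1)
T[3, 1] = False  # drop one coefficient so the support is sparse

X = rng.standard_normal((N, K))
for k in range(1, K):
    # Add some dependence on earlier columns.
    X[:, k] += 0.5 * X[:, k - 1]

# Stacked target: columns of X placed one after another, length N*K.
y_big = X.T.ravel()

# Stacked design: block k holds X and multiplies row k of R.
# Shape is (N*K, K*K); column order matches R.ravel().
A_big = np.kron(np.eye(K), X)

# Remove the columns that correspond to forced zeros in R.
mask = T.ravel()
theta, *_ = np.linalg.lstsq(A_big[:, mask], y_big, rcond=None)

# Put the free coefficients back into a K by K matrix.
R_big = np.zeros((K, K))
R_big[T] = theta
```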

What are the dimensions of your matrices?
If Y is N by K, with N observations and K regression equations, N > K,
what is K?

Josef


