[SciPy-user] OLS matrix-f(x) = 0 problem (Was: linear regression)

josef.pktd at gmail.com josef.pktd at gmail.com
Thu May 28 20:10:32 EDT 2009


On Thu, May 28, 2009 at 4:43 PM,  <josef.pktd at gmail.com> wrote:
> On Thu, May 28, 2009 at 3:37 PM, Gael Varoquaux
> <gael.varoquaux at normalesup.org> wrote:
>> On Wed, May 27, 2009 at 06:27:29PM -0400, josef.pktd at gmail.com wrote:
>>> Sounds like a recursive system of linear (simultaneous) equations with
>>> linear restrictions to me. If you want an unbiased estimator, then
>>> going row by row, and solving each linear OLS, linalg.lstsq, would be
>>> the standard way to go, substituting the previous estimates of the Y's
>>> into the next step.
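
A minimal sketch of the row-by-row approach described above, under
assumptions that are not in the thread: the variables are the columns of a
data matrix Y, the coefficient matrix is strictly lower triangular, and each
equation uses the observed earlier columns as regressors (a simplification
of substituting earlier estimates). The names recursive_ols, B_true and
B_hat are hypothetical.

```python
import numpy as np


def recursive_ols(Y):
    """Equation-by-equation OLS for a recursive (lower triangular) system.

    Y has shape (n_samples, n_vars). Column j is regressed on columns
    0..j-1 with np.linalg.lstsq. Returns a strictly lower triangular
    coefficient matrix B with Y[:, j] ~ Y[:, :j] @ B[j, :j].
    """
    n_samples, n_vars = Y.shape
    B = np.zeros((n_vars, n_vars))
    for j in range(1, n_vars):
        # Least-squares fit of variable j on all earlier variables.
        coef, *_ = np.linalg.lstsq(Y[:, :j], Y[:, j], rcond=None)
        B[j, :j] = coef
    return B


# Simulated data (assumption: Gaussian noise, 4 variables, 2000 samples).
rng = np.random.default_rng(0)
n_samples, n_vars = 2000, 4
B_true = np.tril(np.full((n_vars, n_vars), 0.5), k=-1)
noise = rng.standard_normal((n_samples, n_vars))
Y = np.zeros((n_samples, n_vars))
Y[:, 0] = noise[:, 0]
for j in range(1, n_vars):
    # Each variable depends only on the variables that precede it.
    Y[:, j] = Y[:, :j] @ B_true[j, :j] + noise[:, j]

B_hat = recursive_ols(Y)
```

Because each equation only sees the columns that come before it, permuting
the columns of Y changes which regressions are run, which is one way to see
why the estimate depends on the ordering.
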
>>
>> Oops, I realise I forgot to answer.
>>
>> You are right, this is a way to interpret it, and I was solving the
>> system as you suggest. What I didn't like is that the solution I was
>> getting was dependent on the order of the variables, but I had forgotten
>> that the lower triangular matrix was an approximation. The
>> non-permutation-invariance came from this approximation, not the way I
>> was solving the system.
>>
>> Unfortunately, it seems that the solution to the complete problem is
>> still an open research question (FYI the problem is to find the OLS
>> solution to "M X = X + e", with M positive definite, and with a given
>> support).
>>
>> X's dimensions range anywhere from (50, 50) to (300, 500), including
>> the bad situation (300, 50).
>>
>> This is related to sparse covariance matrix estimation. I don't think there
>> is (yet) an easy answer.
>>
>> Thanks for your answer, it brought me back to Earth, making me realize
>> that I was already doing the right thing, and should look for the problem
>> elsewhere.
>>
>> Gaël
>
> I'm not sure I understand anymore.
>
> When estimating the parameters of a simultaneous system of equations
> with least squares, we need a lot of identifying restrictions; the
> lower triangular parameter matrix is the simplest one. And you don't
> get permutation invariance because the sequence of your equations is
> what identifies the parameters. In your case, you need to have enough
> identifying restrictions on the support of M, and given that you don't
> have any additional exogenous variables the identifying restrictions
> might require that it can be reordered to a lower triangular form.
> (Disclaimer: After I mixed up the bias yesterday, I should mention
> that I haven't looked at this in a pretty long time.)
>
> For the rest I'm a bit vague:
> If you don't want to impose the sequential identifying restriction,
> then you are just looking for a subspace that spans your X matrix with
> certain properties.
>
> Given that you have an X that can have more rows than columns or the
> reverse, you have either more or fewer equations than unknowns, which
> should already create a large multiplicity of solutions for some
> cases. Also I expect your X'X (or in numpy X.T * X) matrix to be
> singular.  (maybe it is X*X.T in your notation)
> So I would think that the solution will depend more on the eigenvector
> decomposition, or SVD, or pinv of X'X, and there might be many
> possibilities to span the space of X. I'm not sure how to get the
> subspace that satisfies your support in M restrictions, if M is not
> lower triangular.

Now this just sounds like a description of principal component or
factor analysis to me. ???
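
A small sketch of the rank argument in the quoted text, assuming a random X
with fewer rows than columns (the (50, 300) shape is an illustrative
assumption, not taken from the thread). It only checks that X'X is singular
and that the SVD of X carries the same information as the eigenvalues of
X'X; a proper principal component analysis would also center the columns
first.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 300))  # assumed layout: 50 rows, 300 columns

# X'X is 300 x 300 but its rank cannot exceed min(50, 300).
gram = X.T @ X
rank = np.linalg.matrix_rank(gram)

# Thin SVD of X: the rows of Vt span the row space of X.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# The squared singular values of X are the nonzero eigenvalues of X'X.
eigvals = np.linalg.eigvalsh(gram)         # ascending order
top_eigvals = eigvals[-len(s):][::-1]      # largest first, to match s**2
```

Any basis of the row space of X spans the same subspace, which is where
the multiplicity of solutions comes from when no further restrictions are
imposed.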

Josef

>
> I don't really understand what permutation-invariance you want, but if
> you want to impose some kind of symmetry maybe this gives you
> identification of a unique solution.
>
> Josef
>


