[SciPy-user] linear regression

josef.pktd at gmail.com josef.pktd at gmail.com
Thu May 28 01:01:49 EDT 2009


On Wed, May 27, 2009 at 9:39 PM,  <josef.pktd at gmail.com> wrote:
> On Wed, May 27, 2009 at 9:03 PM, Robert Kern <robert.kern at gmail.com> wrote:
>> On Wed, May 27, 2009 at 19:40,  <josef.pktd at gmail.com> wrote:
>>
>>> The variance of the error term in the regression equation is a linear
>>> combination of the variances of the true error (p_sy) and of the
>>> measurement error in x (p_sx).
>>>
>>> So the correct weighting would be according to p_sy**2 + beta**2 *
>>> p_sx**2, which is in practice not possible since we don't know beta;
>>> maybe an iterative approach would work (at least something like this
>>> should be correct).
>>
>> Yes! This is precisely what ODR does for you in the linear case, all
>> in one shot.

> Using a simple 2-step iteration to estimate the weights, I get almost
> the same mean squared error as ODR, but the bias stays higher, which I
> don't understand.

So, Robert, you were right about the bias. Since the bias didn't go
away, especially for large measurement errors, I had to look it up in
some textbooks.

In the case of measurement errors, the observed regressors are (always)
correlated with the error term in the regression equation, even if the
true (unobserved) regressor is not. The reference model I had in mind
was random regressors that are observed without error; if the observed
regressors are uncorrelated with the error term, then there is no bias.
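
To spell out the mechanics in the simplest case: with x_obs = x_true + u,
the model y = beta*x_true + e becomes y = beta*x_obs + (e - beta*u), so
the composite error contains u and is correlated with x_obs. OLS then
converges to beta * var(x_true) / (var(x_true) + var(u)), the textbook
attenuation toward zero. A quick numerical check (numbers and seed made
up):

import numpy as np

rng = np.random.RandomState(1)
n, beta = 100000, 1.0
x_true = rng.normal(size=n)          # var(x_true) = 1
u = rng.normal(size=n)               # measurement error, var(u) = 1
x_obs = x_true + u
y = beta * x_true + rng.normal(size=n)

c = np.cov(x_obs, y)
print(c[0, 1] / c[0, 0])   # ~ 0.5 = 1 / (1 + 1), not the true beta = 1.0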

Models with measurement errors show symptoms similar to, for example,
models with endogeneity bias, and the standard econometrics-textbook
solution is still instrumental variables. Given that the symptoms and
the standard treatment are (mostly) the same, I had the wrong intuition
that the disease is also the same.
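
(For completeness, the textbook IV recipe in this setting uses a second,
independently mismeasured copy of x as the instrument; the sketch below
is a hypothetical illustration, not code from the thread.)

import numpy as np

rng = np.random.RandomState(2)
n, beta = 100000, 1.0
x_true = rng.normal(size=n)
x_obs = x_true + rng.normal(size=n)   # mismeasured regressor
z = x_true + rng.normal(size=n)       # second noisy measurement, the instrument
y = beta * x_true + rng.normal(size=n)

# IV slope: cov(z, y) / cov(z, x_obs); the instrument's measurement error
# is independent of the composite regression error, so the bias drops out
b_iv = np.cov(z, y)[0, 1] / np.cov(z, x_obs)[0, 1]
print(b_iv)   # ~ 1.0, where plain OLS on x_obs would give ~ 0.5 here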

So, in all the variations of your example that I tried, the bias comes
out in favor of ODR compared to OLS. The MSEs are essentially the same,
but I assume there are cases where the MSE also deteriorates.
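
If anyone wants to reproduce the comparison, a minimal scipy.odr fit of
the linear model looks roughly like this (again a sketch with made-up
numbers, not the exact benchmark script):

import numpy as np
from scipy import odr

rng = np.random.RandomState(3)
n, beta, p_sx, p_sy = 200, 1.0, 0.5, 0.5
x_true = rng.normal(size=n)
x_obs = x_true + p_sx * rng.normal(size=n)
y = beta * x_true + p_sy * rng.normal(size=n)

def f(B, x):
    # linear model y = B[0] + B[1] * x, in scipy.odr's f(beta, x) convention
    return B[0] + B[1] * x

data = odr.RealData(x_obs, y, sx=p_sx, sy=p_sy)
out = odr.ODR(data, odr.Model(f), beta0=[0.0, 1.0]).run()
print(out.beta)   # slope should sit much closer to beta than the OLS slope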

Josef


