[SciPy-User] calculate predicted values from regression + confidence interval

Mon Oct 17 15:29:47 EDT 2011

Hello again,

It is no problem to get the covariance matrix of the parameter
estimates (with R's vcov() function), and of course I can use
that for the calculations as well...

The covariance matrix is:
             (Intercept)            L            uT  stream.order
(Intercept)   1.33059472 -0.246834193 -0.0307302435  0.0267554775
L            -0.24683419  0.054481286  0.0014007745 -0.0105982957
uT           -0.03073024  0.001400774  0.0053472652 -0.0007137384
stream.order  0.02675548 -0.010598296 -0.0007137384  0.0091419997

I probably just have to store that as a matrix in Python (I just have
to look up how to do that).
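
A minimal sketch of how the matrix above could be stored, assuming NumPy
is available. The variable names cov_beta and std_err are made up for
illustration. The printed values are rounded slightly differently above
and below the diagonal, so the sketch symmetrises the matrix first. As a
sanity check, the square roots of the diagonal should reproduce the
"Std. Error" column of the R summary quoted below.

```python
import numpy as np

# Covariance matrix of the parameter estimates, copied from the vcov() output.
# Row/column order: (Intercept), L, uT, stream.order
cov_beta = np.array([
    [ 1.33059472, -0.246834193, -0.0307302435,  0.0267554775],
    [-0.24683419,  0.054481286,  0.0014007745, -0.0105982957],
    [-0.03073024,  0.001400774,  0.0053472652, -0.0007137384],
    [ 0.02675548, -0.010598296, -0.0007137384,  0.0091419997],
])

# The printout is rounded, so the two triangles differ in the last digits.
# Averaging with the transpose gives an exactly symmetric matrix.
cov_beta = (cov_beta + cov_beta.T) / 2.0

# Standard errors are the square roots of the diagonal entries.
std_err = np.sqrt(np.diag(cov_beta))
print(std_err)
```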

But how can I now proceed?

I found the following description of prediction intervals (that's the
thing I want :)) here:
http://statmaster.sdu.dk/courses/st111/module05/module.pdf (page 11).
But can that be implemented in Python using the covariance matrix together
with the other estimates and std. errors of my regression parameters?
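
A rough sketch of how such a prediction interval could be computed with
NumPy and SciPy, following the formula quoted further down
(var(y) = x' * cov_beta * x + var_u_estimate). The function name
prediction_interval and its parameters are hypothetical. The residual
variance (sigma2) and the residual degrees of freedom (df_resid) are NOT
given in this thread, so the values used below are placeholders only; in R
they would come from summary(fit)$sigma^2 and df.residual(fit), where fit
is an assumed name for the fitted lm object. The sketch also assumes
normally distributed errors, that the coefficient order matches the
rows of the covariance matrix, and it uses a Student t quantile, as is
usual when the residual variance is estimated.

```python
import numpy as np
from scipy import stats

def prediction_interval(x_new, beta, cov_beta, sigma2, df_resid, alpha=0.05):
    """Point prediction and (1 - alpha) prediction interval for one new
    observation. x_new must include the leading 1 for the intercept."""
    x_new = np.asarray(x_new, dtype=float)
    y_hat = x_new @ beta                    # predicted value x' * beta
    var_mean = x_new @ cov_beta @ x_new     # uncertainty from the estimates
    var_pred = var_mean + sigma2            # plus residual variance
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df_resid)
    half_width = t_crit * np.sqrt(var_pred)
    return y_hat, y_hat - half_width, y_hat + half_width

# Estimates from the R summary: (Intercept), X1, X2, X3
beta = np.array([-9.00068, 1.87119, 0.39193, 0.27870])

# Covariance matrix from vcov(), symmetrised because the printout is rounded
cov_beta = np.array([
    [ 1.33059472, -0.246834193, -0.0307302435,  0.0267554775],
    [-0.24683419,  0.054481286,  0.0014007745, -0.0105982957],
    [-0.03073024,  0.001400774,  0.0053472652, -0.0007137384],
    [ 0.02675548, -0.010598296, -0.0007137384,  0.0091419997],
])
cov_beta = (cov_beta + cov_beta.T) / 2.0

# New observation: intercept term, X1 = 200, X2 = 150, X3 = 5
x_new = [1.0, 200.0, 150.0, 5.0]

# PLACEHOLDERS, not from the thread: replace with the real values from R
sigma2 = 1.0      # hypothetical residual variance
df_resid = 100    # hypothetical residual degrees of freedom

y_hat, lower, upper = prediction_interval(x_new, beta, cov_beta, sigma2, df_resid)
print(y_hat, lower, upper)
```

Dropping the "+ sigma2" term would give the narrower confidence interval
for the mean response instead of the prediction interval for a single new
observation.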

/Johannes

On 17.10.2011 at 19:00, scipy-user-request at scipy.org wrote:

>> The R Coefficients are as follows:
>>              Estimate Std. Error t value Pr(>|t|)
>> (Intercept)  -9.00068    1.15351  -7.803 8.26e-12 ***
>> Variable X1   1.87119    0.23341   8.017 2.95e-12 ***
>> Variable X2   0.39193    0.07312   5.360 5.92e-07 ***
>> Variable X3   0.27870    0.09561   2.915  0.00445 **
>> 
>> Can I use these results to manually calculate
>> a predicted value of Y with a given set of new Xs? Like
>> X1 = 200
>> X2 = 150
>> X3 = 5
>> 
>> I can easily calculate the predicted Y as
>> Y = -9 + 200*1.87 + 150*0.39 + 5*0.28
> 
> I don't think this is enough information to get the prediction
> confidence interval. You need the entire covariance matrix of the
> parameter estimates.
> 
> Roughly (I would need to check the details):
> the parameter estimate is from a multivariate normal distribution,
> your y is a linear transformation, so the prediction should be normally
> distributed with mean y = Y = X*beta, and var(y) = X' * cov_beta * X +
> var_u_estimate (dot products for appropriate shapes)
> 
> Without knowing the covariance matrix of the parameter estimates, you
> would have to assume that cov_beta is diagonal which is almost surely
> not the case.
> 
> Josef
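
The point about a diagonal cov_beta can be checked numerically. This is
only a sketch, assuming NumPy and the covariance matrix posted above; the
names var_full and var_diag are made up for illustration. It compares
x' * cov_beta * x using the full matrix against the same expression with
all off-diagonal entries set to zero, for the example values X1 = 200,
X2 = 150, X3 = 5.

```python
import numpy as np

# Covariance matrix from vcov(), symmetrised because the printout is rounded
cov_beta = np.array([
    [ 1.33059472, -0.246834193, -0.0307302435,  0.0267554775],
    [-0.24683419,  0.054481286,  0.0014007745, -0.0105982957],
    [-0.03073024,  0.001400774,  0.0053472652, -0.0007137384],
    [ 0.02675548, -0.010598296, -0.0007137384,  0.0091419997],
])
cov_beta = (cov_beta + cov_beta.T) / 2.0

# New observation including the leading 1 for the intercept
x_new = np.array([1.0, 200.0, 150.0, 5.0])

# Variance of the predicted mean with the full covariance matrix
var_full = x_new @ cov_beta @ x_new

# Same quantity if only the squared standard errors were known,
# i.e. treating cov_beta as diagonal
var_diag = x_new @ np.diag(np.diag(cov_beta)) @ x_new

print(var_full, var_diag)
```

The two numbers differ because the off-diagonal covariances are not zero,
which is why the standard errors alone are not enough.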