[SciPy-user] Limits of linrgress - underflow encountered in stdtr
wierob
wierob83 at googlemail.com
Tue Jun 9 07:57:18 EDT 2009
Hi,
for z = 30 my code sample prints
===== dependency_with_noise =====
slope: 2.0022556391
intercept: -0.771428571429
r^2: 0.953601402677
p-value: 0.0
stderr: 0.0258507089053
so I'm just confused that the p-value claims the match is absolutely
perfect while it is not (although it's pretty close to perfect). I compared
this result to R (www.r-project.org):
> summary(lm(y~x))
Call:
lm(formula = y ~ x)
Residuals:
     Min       1Q   Median       3Q      Max
-12.2624   0.7325   0.7477   0.7635   7.7511
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.77143    0.28728  -2.685  0.00745 **
x            2.00226    0.02585  77.455  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.651 on 598 degrees of freedom
Multiple R-squared: 0.9094, Adjusted R-squared: 0.9092
F-statistic: 5999 on 1 and 598 DF, p-value: < 2.2e-16
> summary(lm(y~x))$coefficients
              Estimate Std. Error   t value      Pr(>|t|)
(Intercept) -0.7714286 0.28728036 -2.685281  7.447975e-03
x            2.0022556 0.02585071 77.454574 6.009953e-314
The intercept, slope (x) and stderr values are equal, but the p-value is
6.009953e-314 and the r-squared is different. While 6.009953e-314 is small
enough to call it 0 and the result is highly significant, I just wonder
whether SciPy decides it's small enough to return 0.0, or whether it
returns 0.0 because it can't actually compute it. If 0.0 is returned
deliberately, what's the threshold for this decision? Maybe this behavior
should be documented.
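For what it's worth, the number R reports can be reproduced directly from the t statistic. A minimal sketch (the t value and degrees of freedom are taken from the R summary above; whether scipy.special.stdtr keeps the denormal value or underflows all the way to 0.0 is exactly the open question here):

```python
import numpy as np
from scipy import special, stats

t_stat = 77.454574  # t value for the slope, from the R summary above
df = 598            # residual degrees of freedom, from the R summary above

# Two-sided p-value via the Student t CDF (stdtr); this is the routine
# that produces the "underflow encountered in stdtr" warning
p_cdf = 2 * special.stdtr(df, -abs(t_stat))

# The same quantity via the t distribution's survival function
p_sf = 2 * stats.t.sf(abs(t_stat), df)

print(p_cdf, p_sf)  # both tiny; may come out denormal or exactly 0.0
```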
regards
robert
josef.pktd at gmail.com schrieb:
> On Mon, Jun 8, 2009 at 5:16 PM, wierob<wierob83 at googlemail.com> wrote:
>
>> Hi,
>>
>>
>>> turn off numpy.seterr(all="raise")
>>> as explained in the reply to your previous messages
>>>
>>> Josef
>>>
>>>
>> turning off the error reporting doesn't prevent the error. Thus the
>> result may still be wrong, mightn't it? E.g. a p-value of 0.0 looks suspicious.
>>
>>
>
> anything other than a p-value of 0 would be suspicious: you have a
> perfect fit, and the probability is zero that we observe a slope equal
> to the estimated slope under the null hypothesis (that the slope is
> zero). So (loosely speaking) we can reject the null of a zero slope with
> probability 1.
> The result is not "maybe" wrong, it is correct: your r_square is 1,
> and the standard error of the slope estimate is zero.
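To make the rejection of the zero-slope null concrete, here is a small sketch (added for illustration, not part of the original exchange): linregress on noisy data with a genuinely nonzero slope still yields a p-value numerically indistinguishable from zero, because the t statistic is huge.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.arange(100, dtype=float)
# noisy linear relation: true slope 2, true intercept -0.77
y = 2.0 * x - 0.77 + rng.normal(scale=3.0, size=x.size)

slope, intercept, r, p, stderr = stats.linregress(x, y)
print(slope, p)  # slope close to 2; p-value effectively zero
```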
>
>
> floating point calculations with inf are correct (if they don't have a
> definite answer we get a nan). Dividing a non-zero number by zero has
> a well-defined result, even if Python raises a ZeroDivisionError.
>
>
> >>> np.array(1)/0.
> inf
> >>> 1/(np.array(1)/0.)
> 0.0
> >>> np.seterr(all="raise")
> {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}
> >>> 1/(np.array(1)/0.)
> Traceback (most recent call last):
>   File "<pyshell#39>", line 1, in <module>
>     1/(np.array(1)/0.)
> FloatingPointError: divide by zero encountered in divide
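As a side note (my addition, not from the original message): np.errstate scopes the same switches to a single block instead of changing them globally with np.seterr, which avoids having to turn the reporting back off afterwards:

```python
import numpy as np

# Scope floating-point error handling to one block; the previous
# settings are restored automatically when the block exits.
with np.errstate(divide="ignore", invalid="ignore"):
    x = np.array(1.0) / 0.0  # inf, silently, even if "raise" is set globally
    y = 1.0 / x              # 1/inf -> 0.0

print(x, y)
```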
>
> Josef
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>