[SciPy-User] Scipy's probplot compared to R's qqplot

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Mar 3 14:36:39 EST 2010


On Wed, Mar 3, 2010 at 2:09 PM,  <PHobson at geosyntec.com> wrote:
> Hey folks,
>
> I've taken more of an interest in statistics and Scipy lately and decided to compare the scipy.stats.probplot() function to R's qqplot(). For a given dataset, the results are slightly different.
>
> Here's a link to the script I wrote to do the comparison.
> http://dpaste.com/167464/
>
> Basically, it does the following:
> -Uses numpy to generate some fake, normally distributed data
> -Uses both R and Scipy to compute the values needed for a quantile/probability plot
> -Computes linear regressions on the quantile data with both R and Scipy
> -Prints some output to compare the two
>
> My initial conclusions:
> 1) R's lm(y~x) and scipy.stats.linregress(x,y) yield the same slope and intercept of a linear model. (good)
> 2) R and Scipy compute the quantiles of a dataset in slightly different manners (??)
>
> Any clue as to why the discrepancy in #2 occurs? Would you consider it a big deal?

I would consider any significant deviation a big deal, unless we know
that there are differences in the definitions or underlying
assumptions.

I'm not sure what's going on, since I have never looked at the details
of probplot. However, when I plot the quantiles (qR and qS come from
the linked script)
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> plt.plot(np.sort(qR))
>>> plt.plot(qS[0])
>>> plt.show()

the two curves look almost the same, except for the first and last points.

qS[0]-np.sort(qR)

differs only in the second decimal, except for the first and last
observations. My guess is that the two implementations differ in
something like the continuity correction.

The boundary points, however, look suspicious.
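
One possible explanation, offered as an assumption rather than something
checked against the linked script: the two packages may use different
plotting positions. R's qqnorm() uses ppoints(), documented as
(i - a)/(n + 1 - 2a) with a = 3/8 for n <= 10 and a = 1/2 otherwise. The
SciPy documentation describes probplot as using Filliben's estimate,
which treats the first and last points specially. A rough sketch of the
comparison follows; the helper names and the sample size n = 37 are made
up for illustration.

```python
import numpy as np
from scipy import stats


def ppoints_r(n):
    """Plotting positions following the formula documented for R's ppoints(n)."""
    a = 3.0 / 8.0 if n <= 10 else 0.5
    i = np.arange(1, n + 1)
    return (i - a) / (n + 1 - 2 * a)


def filliben_positions(n):
    """Filliben's estimate of the uniform order statistic medians."""
    i = np.arange(1, n + 1)
    p = (i - 0.3175) / (n + 0.365)
    # The two end points get a separate formula.
    p[-1] = 0.5 ** (1.0 / n)
    p[0] = 1.0 - p[-1]
    return p


n = 37  # hypothetical sample size, not taken from the original script

# Theoretical normal quantiles under each convention.
q_r_style = stats.norm.ppf(ppoints_r(n))
q_filliben = stats.norm.ppf(filliben_positions(n))
diff = q_filliben - q_r_style

print("difference at the end points:", diff[0], diff[-1])
print("largest interior difference: ", np.abs(diff[1:-1]).max())

# Check whether probplot's theoretical quantiles follow Filliben's estimate.
rng = np.random.RandomState(0)
x = rng.normal(size=n)
(osm, osr), _ = stats.probplot(x, dist="norm")
print("probplot matches Filliben:", np.allclose(osm, q_filliben))
```

If this is the cause, the interior differences should shrink toward the
middle of the sample while the two end points differ the most, which
would be consistent with the pattern described above.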

Thanks for checking this,

Josef

> I'm using:
> Python v2.6.2 (XP) and v2.6.4 (Karmic and Snow Leopard)
> Scipy v0.7.1
> Numpy v1.4.0
> R v2.10.0
> Rpy2 v2.0.8
>
> Thanks,
> -Paul H.