[SciPy-User] stats.distributions skew, kurtosis bugs
josef.pktd at gmail.com
Thu Feb 25 14:07:09 EST 2010
I found an old script of mine to check the skew and kurtosis of the
distributions in scipy.stats with bootstrapping, and I started to
produce a nicer printout.
The attached text file lists every distribution for which the stats() method
returns a value that deviates by at least 3 bootstrap standard
deviations from the bootstrap mean. This can happen because the moment is
infinite or undefined, or because there is a bug. Some of the skew and
kurtosis calculations are certainly buggy, but I don't know which ones.
Also, this is a randomized test, so some false positives might show up
because of the random sampling.
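One way to separate a possible bug from a sampling artifact is to compare a flagged diststats value with a textbook closed form. The sketch below does this for two rows of the attached output, assuming the default parameterization (loc=0, scale=1), so that the pareto shape is the tail index b and the lognorm shape is the standard deviation s of the log. The function names are made up for this illustration and are not part of scipy.

```python
import math


def pareto_variance(b):
    """Variance of a Pareto distribution with shape b and scale 1.

    The variance is not finite for b <= 2, so inf is returned there.
    """
    if b <= 2:
        return math.inf
    return b / ((b - 1) ** 2 * (b - 2))


def lognormal_skew(s):
    """Skewness of a lognormal distribution whose log has standard deviation s."""
    w = math.exp(s * s)
    return (w + 2) * math.sqrt(w - 1)


# Shape parameters and diststats values copied from the attached output.
checks = [
    ("pareto variance", pareto_variance(2.621716532144454), 1.6034),
    ("lognorm skew", lognormal_skew(0.95368226960575331), 5.4597),
]
for label, closed_form, reported in checks:
    print(f"{label}: closed form {closed_form:.4f}, reported {reported:.4f}")
```

For these two rows the closed forms agree with the reported diststats to the printed precision, which suggests that the flags come from slow convergence of heavy-tailed sample moments rather than from a bug. This says nothing about the other rows.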
The bootstrap estimates were generated by drawing an initial sample with
rvs(size=1000) from the distribution and then running bootstrap resampling
until the bootstrap standard deviation settled down or a maximum of
30000 bootstrap samples had been reached.
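A minimal sketch of this procedure follows; it is not the original script. It uses only the standard library so that it runs without SciPy: an exponential sample stands in for rvs(size=1000), and the textbook skewness (2) and excess kurtosis (6) of the exponential distribution stand in for the stats() values. The names bootstrap_tvalue, batch and rtol are hypothetical, and the stopping rule (relative change of the bootstrap standard deviation between batches below rtol) is only one guess at what "settles down" could mean.

```python
import math
import random
import statistics


def sample_skew(xs):
    """Biased sample skewness m3 / m2**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def sample_excess_kurtosis(xs):
    """Biased sample excess kurtosis m4 / m2**2 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


def bootstrap_tvalue(sample, statistic, reference, rng,
                     batch=100, max_boot=30000, rtol=0.01):
    """Return (tvalue, bootstrap mean, bootstrap std) for one statistic.

    Resamples in batches until the bootstrap std changes by less than
    rtol between batches, or until max_boot resamples have been drawn.
    """
    n = len(sample)
    estimates = []
    prev_std = None
    while True:
        for _ in range(min(batch, max_boot - len(estimates))):
            estimates.append(statistic(rng.choices(sample, k=n)))
        std = statistics.stdev(estimates)
        settled = prev_std is not None and abs(std - prev_std) <= rtol * prev_std
        if settled or len(estimates) >= max_boot:
            break
        prev_std = std
    boot_mean = sum(estimates) / len(estimates)
    # Same definition as in the attached output:
    # tvalue = (bootstrapmean - diststatsvalue) / bootstrapstd
    return (boot_mean - reference) / std, boot_mean, std


if __name__ == "__main__":
    rng = random.Random(12345)
    sample = [rng.expovariate(1.0) for _ in range(1000)]
    # max_boot is reduced here only to keep the pure-Python demo fast.
    for name, stat, ref in [("skew", sample_skew, 2.0),
                            ("kurtosis", sample_excess_kurtosis, 6.0)]:
        t, mean, std = bootstrap_tvalue(sample, stat, ref, rng, max_boot=2000)
        print(f"{name}: tvalue={t:.4f} bootstrapmean={mean:.4f} bootstrapstd={std:.4f}")
```

With this definition an infinite reference value produces a t-value of -inf and a nan reference produces nan, which is the pattern of the -1.#IO and 1.#QNB rows in the attached output.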
This is again a warning not to rely on the higher moments in
stats.distributions without checking. If anyone finds an incorrect
calculation or can verify false positives, I would appreciate the
help.
Josef
-------------- next part --------------
Checking scipy.stats.distributions mean, variance, skew, kurtosis with bootstrap
t-statistic for bias:
tvalue = (bootstrapmean - diststatsvalue)/bootstrapstd
1.#IO = inf
1.#QNB = nan
inf in diststats usually means an unbounded mean or variance
nan in diststats can mean that the corresponding statistic does not exist (the integral does not have a defined value)
mean
distname, shapes, t-values, bootstrapmean, diststats
alpha (3.5704770516650459,): -1.#IO 0.3050 1.#IO
cauchy (): -1.#IO 2.7770 1.#IO
foldcauchy (4.7164673455831894,): -1.#IO 7.4451 1.#IO
halfcauchy (): -1.#IO 4.0620 1.#IO
levy (): -1.#IO 939.2252 1.#IO
levy_l (): -1.#IO -3147.6643 1.#IO
variance
distname, shapes, t-values, bootstrapmean, diststats
alpha (3.5704770516650459,): -1.#IO 0.0181 1.#IO
cauchy (): -1.#IO 4515.7526 1.#IO
foldcauchy (4.7164673455831894,): -1.#IO 303.3553 1.#IO
halfcauchy (): -1.#IO 407.1620 1.#IO
invgamma (2.0668996136993067,): -34.6421 1.5085 13.1320
levy (): -1.#IO 167040678.2146 1.#IO
levy_l (): -1.#IO 1340119770.2983 1.#IO
lomax (1.8771398388773268,): -1.#IO 5.0821 1.#IO
pareto (2.621716532144454,): -3.2486 0.8993 1.6034
powerlaw (1.6591133289905851,): -349.2695 0.0660 0.8586
rdist (0.90000000000000002,): -106.6559 0.5350 1.7800
tukeylambda (3.1321477856738267,): -291.4355 0.0255 0.3048
skew
distname, shapes, t-values, bootstrapmean, diststats
alpha (3.5704770516650459,): 1.#QNB 4.5995 1.#QNB
cauchy (): 1.#QNB 19.2907 1.#QNB
fatiguelife (29,): 9.2411 3.7254 0.0409
fisk (3.0857548622253179,): -30.1339 4.9220 38.7939
foldcauchy (4.7164673455831894,): 1.#QNB 9.8484 1.#QNB
foldnorm (1.9521253373555869,): -13.2886 0.1708 0.9714
gilbrat (): -20.8618 2.8363 6.1849
halfcauchy (): 1.#QNB 14.5184 1.#QNB
invgamma (2.0668996136993067,): 7.3556 5.5156 -0.4777
levy (): 1.#QNB 17.6494 1.#QNB
levy_l (): 1.#QNB -14.4021 1.#QNB
loglaplace (3.2505926592051435,): -30.3230 2.8614 16.9237
lognorm (0.95368226960575331,): -7.8588 3.3619 5.4597
lomax (1.8771398388773268,): 1.#QNB 7.4517 1.#QNB
mielke (10.4, 3.6000000000000001): -6.3266 3.2524 7.5954
ncf (27, 27, 0.41578441799226107): -301743359927.9626 1.0067 40747519832.6876
pareto (2.621716532144454,): 1.#QNB 5.4505 1.#QNB
powerlaw (1.6591133289905851,): 9.7330 -0.4061 -0.9070
t (2.7433514990818093,): 1.#QNB -2.1069 1.#QNB
kurtosis
distname, shapes, t-values, bootstrapmean, diststats
alpha (3.5704770516650459,): 1.#QNB 36.1850 1.#QNB
burr (10.5, 4.2999999999999998): -25059.8725 8.0754 112616.2702
cauchy (): 1.#QNB 511.4940 1.#QNB
dweibull (2.0685080649914673,): -58.0690 -1.0984 1.9089
fisk (3.0857548622253179,): 17.6460 44.9028 -224.6593
foldcauchy (4.7164673455831894,): 1.#QNB 118.0841 1.#QNB
foldnorm (1.9521253373555869,): -27.1183 -0.3591 2.7052
genpareto (0.10000000000000001,): -3.0045 8.8065 14.8286
gilbrat (): -79.2412 10.0417 110.9364
halfcauchy (): 1.#QNB 300.8983 1.#QNB
invgamma (2.0668996136993067,): 3.6372 45.5398 -2.8666
levy (): 1.#QNB 349.5211 1.#QNB
levy_l (): 1.#QNB 228.6728 1.#QNB
loglaplace (3.2505926592051435,): 39.7340 15.5471 -164.3326
lognorm (0.95368226960575331,): -23.8260 15.3252 81.1354
lomax (1.8771398388773268,): 1.#QNB 81.6275 1.#QNB
mielke (10.4, 3.6000000000000001): 21.3019 20.6964 -149.4051
ncf (27, 27, 0.41578441799226107): 357335608038.1621 1.6098 -239984516632.6775
nct (14, 0.24045031331198066): 2064492.8697 0.5513 -409040.4071
pareto (2.621716532144454,): 1.#QNB 49.1351 1.#QNB
rdist (0.90000000000000002,): 42.8105 -1.5425 -2.5679
t (2.7433514990818093,): 1.#QNB 30.1689 1.#QNB
tukeylambda (3.1321477856738267,): 38.5327 -0.8835 -2.9837
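The rows above can be read back for further checking with a few lines of Python. This is a sketch that assumes every data row has the layout "distname (shapes): tvalue bootstrapmean diststats" and uses the legend at the top of this file to map 1.#IO to inf and 1.#QNB to nan. The helper names parse_value, parse_row and implied_std are made up for this illustration.

```python
import ast
import math
import re

# distname, shape tuple, then the three numeric columns.
ROW = re.compile(r"^(\w+) (\(.*?\)): (\S+) (\S+) (\S+)$")


def parse_value(token):
    """Convert one numeric column, honoring the 1.#IO / 1.#QNB legend."""
    if token.endswith("1.#IO"):
        return -math.inf if token.startswith("-") else math.inf
    if token.endswith("1.#QNB"):
        return math.nan
    return float(token)


def parse_row(line):
    """Return (distname, shapes, tvalue, bootstrapmean, diststats)."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"not a data row: {line!r}")
    name, shapes, tvalue, boot_mean, dist_stats = match.groups()
    return (name, ast.literal_eval(shapes), parse_value(tvalue),
            parse_value(boot_mean), parse_value(dist_stats))


def implied_std(tvalue, boot_mean, dist_stats):
    """Recover bootstrapstd from tvalue = (bootstrapmean - diststats) / bootstrapstd."""
    if not math.isfinite(tvalue) or tvalue == 0:
        return math.nan
    return (boot_mean - dist_stats) / tvalue


name, shapes, t, boot, ref = parse_row(
    "pareto (2.621716532144454,): -3.2486 0.8993 1.6034")
print(name, shapes, implied_std(t, boot, ref))
```

Inverting the t-value formula in this way recovers the bootstrap standard deviation, which the tables do not print directly.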