[SciPy-User] stats.distributions skew, kurtosis bugs

josef.pktd at gmail.com
Thu Feb 25 14:07:09 EST 2010


I found an old script of mine to check the skew and kurtosis of the
distributions in scipy.stats with bootstrapping, and I started to
produce a nicer printout.

The attached text file lists all distributions for which the value
returned by the stats() method deviates from the bootstrap mean by at
least 3 bootstrap standard deviations. This can happen because the
moment is infinite or undefined, or because there is a bug. Some of
the skew and kurtosis calculations certainly have bugs, but I don't
know which ones. Also, this is a randomized test, so some false
positives might show up because of the random sampling.

The bootstrap estimates were generated by drawing an initial sample
with rvs(size=1000) from the distribution and then running bootstrap
resampling until the bootstrap standard deviation settles down or a
maximum of 30000 bootstrap samples has been reached.
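
The procedure described above can be sketched roughly as follows. This
is not the original script: the helper names (sample_skew,
bootstrap_stat), the batch size, and the stopping rule (relative change
in the bootstrap standard deviation below rtol between batches) are
assumptions made for illustration. Only the initial sample size of 1000
and the cap of 30000 bootstrap samples come from the description.

```python
import numpy as np


def sample_skew(x):
    """Biased sample skewness, m3 / m2**1.5."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d**3) / np.mean(d**2) ** 1.5


def bootstrap_stat(sample, statistic, batch=1000, max_boot=30000,
                   rtol=0.01, rng=None):
    """Bootstrap `statistic` until its std settles or max_boot is reached.

    Returns (bootstrap mean, bootstrap std, number of resamples).
    The stopping rule is an assumption, not the original script's.
    """
    rng = np.random.default_rng(rng)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    values = []
    prev_std = None
    while len(values) < max_boot:
        # Draw one batch of resamples, never exceeding max_boot in total.
        k = min(batch, max_boot - len(values))
        idx = rng.integers(0, n, size=(k, n))
        values.extend(statistic(sample[row]) for row in idx)
        # Stop once the std changes by less than rtol between batches.
        std = np.std(values, ddof=1)
        if prev_std is not None and abs(std - prev_std) <= rtol * prev_std:
            break
        prev_std = std
    values = np.asarray(values)
    return values.mean(), values.std(ddof=1), values.size
```

With SciPy available, the initial sample would come from a call such as
stats.lognorm(0.95).rvs(size=1000), and the bootstrap mean and standard
deviation would then be compared with the matching entry of
stats.lognorm(0.95).stats(moments='mvsk').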

This is again a warning not to rely on the higher moments in
stats.distributions without checking. If anyone finds an incorrect
calculation or can verify false positives, I would appreciate the
help.

Josef
-------------- next part --------------

Checking scipy.stats.distributions mean, variance, skew, kurtosis with bootstrap


t-statistic for bias
tvalue = (bootstrapmean - diststatsvalue)/bootstrapstd

1.#IO = inf
1.#QNB = nan

inf in diststats usually means an unbounded mean or variance
nan in diststats can mean that the corresponding statistic does not exist (the integral does not have a defined value)
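
As a small illustration of the t-value definition and of why inf and
nan in diststats show up as -inf and nan in the t-values column, here
is a minimal sketch. The function names and the bootstrap standard
deviations used in the calls are made up for the example, and treating
nan as flagged is an assumption based on the nan rows appearing in the
tables.

```python
import math


def t_value(boot_mean, boot_std, dist_value):
    """t-statistic for bias, as defined in the printout."""
    return (boot_mean - dist_value) / boot_std


def is_flagged(t, threshold=3.0):
    """True if |t| >= threshold; nan is flagged too (assumption)."""
    # nan compares False against everything, so test it explicitly.
    return math.isnan(t) or abs(t) >= threshold


# The boot_std arguments below are invented for illustration only.
print(t_value(0.3050, 0.01, math.inf))     # finite minus inf gives -inf
print(t_value(4.5995, 1.0, math.nan))      # nan propagates to the t-value
print(is_flagged(t_value(1.0, 0.5, 1.2)))  # |t| = 0.4, so not flagged
```

Note that a bootstrap standard deviation of exactly zero would raise
ZeroDivisionError here; the sketch does not guard against that.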



mean
distname,                  shapes,            t-values, bootstrapmean,   diststats
     alpha            (3.5704770516650459,):     -1.#IO     0.3050        1.#IO
    cauchy                               ():     -1.#IO     2.7770        1.#IO
foldcauchy            (4.7164673455831894,):     -1.#IO     7.4451        1.#IO
halfcauchy                               ():     -1.#IO     4.0620        1.#IO
      levy                               ():     -1.#IO   939.2252        1.#IO
    levy_l                               ():     -1.#IO -3147.6643        1.#IO

variance
distname,                  shapes,            t-values, bootstrapmean,   diststats
     alpha            (3.5704770516650459,):     -1.#IO     0.0181        1.#IO
    cauchy                               ():     -1.#IO  4515.7526        1.#IO
foldcauchy            (4.7164673455831894,):     -1.#IO   303.3553        1.#IO
halfcauchy                               ():     -1.#IO   407.1620        1.#IO
  invgamma            (2.0668996136993067,):   -34.6421     1.5085      13.1320
      levy                               ():     -1.#IO 167040678.2146        1.#IO
    levy_l                               ():     -1.#IO 1340119770.2983        1.#IO
     lomax            (1.8771398388773268,):     -1.#IO     5.0821        1.#IO
    pareto             (2.621716532144454,):    -3.2486     0.8993       1.6034
  powerlaw            (1.6591133289905851,):  -349.2695     0.0660       0.8586
     rdist           (0.90000000000000002,):  -106.6559     0.5350       1.7800
tukeylambda            (3.1321477856738267,):  -291.4355     0.0255       0.3048

skew
distname,                  shapes,            t-values, bootstrapmean,   diststats
     alpha            (3.5704770516650459,):     1.#QNB     4.5995       1.#QNB
    cauchy                               ():     1.#QNB    19.2907       1.#QNB
fatiguelife                            (29,):     9.2411     3.7254       0.0409
      fisk            (3.0857548622253179,):   -30.1339     4.9220      38.7939
foldcauchy            (4.7164673455831894,):     1.#QNB     9.8484       1.#QNB
  foldnorm            (1.9521253373555869,):   -13.2886     0.1708       0.9714
   gilbrat                               ():   -20.8618     2.8363       6.1849
halfcauchy                               ():     1.#QNB    14.5184       1.#QNB
  invgamma            (2.0668996136993067,):     7.3556     5.5156      -0.4777
      levy                               ():     1.#QNB    17.6494       1.#QNB
    levy_l                               ():     1.#QNB   -14.4021       1.#QNB
loglaplace            (3.2505926592051435,):   -30.3230     2.8614      16.9237
   lognorm           (0.95368226960575331,):    -7.8588     3.3619       5.4597
     lomax            (1.8771398388773268,):     1.#QNB     7.4517       1.#QNB
    mielke       (10.4, 3.6000000000000001):    -6.3266     3.2524       7.5954
       ncf    (27, 27, 0.41578441799226107): -301743359927.9626     1.0067 40747519832.6876
    pareto             (2.621716532144454,):     1.#QNB     5.4505       1.#QNB
  powerlaw            (1.6591133289905851,):     9.7330    -0.4061      -0.9070
         t            (2.7433514990818093,):     1.#QNB    -2.1069       1.#QNB

kurtosis
distname,                  shapes,            t-values, bootstrapmean,   diststats
     alpha            (3.5704770516650459,):     1.#QNB    36.1850       1.#QNB
      burr       (10.5, 4.2999999999999998): -25059.8725     8.0754  112616.2702
    cauchy                               ():     1.#QNB   511.4940       1.#QNB
  dweibull            (2.0685080649914673,):   -58.0690    -1.0984       1.9089
      fisk            (3.0857548622253179,):    17.6460    44.9028    -224.6593
foldcauchy            (4.7164673455831894,):     1.#QNB   118.0841       1.#QNB
  foldnorm            (1.9521253373555869,):   -27.1183    -0.3591       2.7052
 genpareto           (0.10000000000000001,):    -3.0045     8.8065      14.8286
   gilbrat                               ():   -79.2412    10.0417     110.9364
halfcauchy                               ():     1.#QNB   300.8983       1.#QNB
  invgamma            (2.0668996136993067,):     3.6372    45.5398      -2.8666
      levy                               ():     1.#QNB   349.5211       1.#QNB
    levy_l                               ():     1.#QNB   228.6728       1.#QNB
loglaplace            (3.2505926592051435,):    39.7340    15.5471    -164.3326
   lognorm           (0.95368226960575331,):   -23.8260    15.3252      81.1354
     lomax            (1.8771398388773268,):     1.#QNB    81.6275       1.#QNB
    mielke       (10.4, 3.6000000000000001):    21.3019    20.6964    -149.4051
       ncf    (27, 27, 0.41578441799226107): 357335608038.1621     1.6098 -239984516632.6775
       nct        (14, 0.24045031331198066): 2064492.8697     0.5513 -409040.4071
    pareto             (2.621716532144454,):     1.#QNB    49.1351       1.#QNB
     rdist           (0.90000000000000002,):    42.8105    -1.5425      -2.5679
         t            (2.7433514990818093,):     1.#QNB    30.1689       1.#QNB
tukeylambda            (3.1321477856738267,):    38.5327    -0.8835      -2.9837
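
One way to check a flagged row independently of both stats() and the
bootstrap is to compare it with a textbook closed form. As a sketch,
the lognormal distribution with shape s has skewness
(w + 2) * sqrt(w - 1) and excess kurtosis w**4 + 2*w**3 + 3*w**2 - 6,
where w = exp(s**2). The helper name lognorm_skew_kurt is made up for
this example, and gilbrat is treated here as the lognormal with s = 1.

```python
import math


def lognorm_skew_kurt(s):
    """Skewness and excess kurtosis of a lognormal with shape s."""
    w = math.exp(s * s)
    skew = (w + 2.0) * math.sqrt(w - 1.0)
    kurt = w**4 + 2.0 * w**3 + 3.0 * w**2 - 6.0
    return skew, kurt


# Shape parameter taken from the lognorm rows of the tables above.
print(lognorm_skew_kurt(0.95368226960575331))
# gilbrat, treated as the lognormal with s = 1.
print(lognorm_skew_kurt(1.0))
```

Evaluated by hand, these formulas closely match the diststats column
for lognorm and gilbrat in the skew and kurtosis tables, which suggests
that those rows may be false positives rather than bugs: sample skew
and kurtosis of a heavy-tailed distribution converge slowly, so a
bootstrap built on 1000 observations can sit well below the population
value.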

