random.py gives wrong results (+ a solution)

Thu Jan 25 11:20:28 EST 2001

"Janne Sinkkonen" <janne at oops.nnets.fi> wrote in message
news:m3u26oy1rw.fsf at kinos.nnets.fi...
>
> At least in Python 2.0 and earlier, the samples returned by the
> function betavariate() of random.py are not from a beta distribution
> although the function name misleadingly suggests so.
>
> The following would give beta-distributed samples:
>
> def betavariate(alpha, beta):
>      y = gammavariate(alpha,1)
>      if y==0: return 0.0
>      else: return  y/(y+gammavariate(beta,1))
>
> This is from matlab. A comment in the original matlab code refers to
> Devroye, L. (1986) Non-Uniform Random Variate Generation, theorem 4.1A
> (p. 430). Another reference would be Gelman, A. et al. (1995) Bayesian
> data analysis, p. 481, which I have checked and found to agree with
> the code above.

I'm convinced that Janne Sinkkonen is right:  The beta distribution
generator in
module random.py does not return Beta-distributed random numbers.  Janne's
suggested fix should work just fine.

Here's my guess on how and why this bug bit  -- it won't be of interest to
most but
this subject is so obscure sometimes that there needs to be a detailed
analysis.

The probability density function of the gamma distribution with (positive)
parameters
A and B is usually written

    g(x; A, B) = (x**(A-1) * exp(x/B)) / (Gamma(A) * B**A), where x, A, and
B > 0.

Here Gamma(A) is the gamma function -- for A a positive integer, Gamma(A) is
the
factorial of A - 1, Gamma(A) = (A-1)!.  In fact, this is the definition used
by the authors of random.py in defining gammavariate(alpha, beta), the gamma
distribution random number generator.

Now it happens that a gamma-distributed random variable with parameters A =
1 and
B has the (much simpler) exponential distribution with density function

    g(x; 1, B) = exp(-x/B) / B.

Keep that in mind.

The reference "Discrete Event Simulation in ," by Kevin Watkins
(McGraw-Hill, 1993)
was consulted by the random.py authors.  But this reference defines the
gamma probability distribution a little differently, as

    g1(x; A, B) =  (B**A * x**(A-1) * exp(B*x)) / Gamma(A), where x, A, B >
0.

(See p. 85).  On page 87, Watkins states (incorrectly) that if grv(A, B) is
a function which
returns a gamma random variable with parameters A and B (using his
definition on p. 85),
then the function

    brv(A, B) = grv(1, 1/B) / ( grv(1, 1/B) + grv(1, A) )              [ not
true!]

will return a random variable which has the beta distribution with
parameters A and B.

Believing Watkins to be correct, the random.py authors remembered that a
gamma
random variable with parameter A = 1 is just an exponential random variable
and
further simplified their beta generator to

   brv(A, B) = erv(1/B) / (erv(1/B) + erv(A)), where erv(K) is a random
variable

having the exponential distribution with

parameter K.

The corrected equation for a beta random variable, using Watkins' definition
of the
gamma density, is

    brv(A, B) = grv(A, 1) / ( grv(A, 1) + grv(1/B, 1) ),

which translates to

    brv(A, B) = grv(A, 1) / (grv(A, 1) + grv(B, 1)

using the more common gamma density definition (the one used in random.py).
Many standard statistical references give this equation -- two are
"Non-Uniform random Variate Generation," by Luc Devroye, Springer-Verlag,
1986,
p. 432, and  "Monte Carlo Concepts, Algorithms and Applications," by
George S. Fishman, Springer, 1996, p. 200.

--Ivan Frohne

>>>