Hypergeometric distribution

Robert Kern robert.kern at gmail.com
Mon Dec 26 16:40:32 EST 2005


Raven wrote:
> Hi to all, I need to calculate the hypergeometric distribution:
> 
> 
>                        choose(r, x) * choose(b, n-x)
>         p(x; r,b,n) =  -----------------------------
>                            choose(r+b, n)
> 
> choose(r,x) is the binomial coefficient.
> I use the factorial to calculate the above formula, but since I am using
> large numbers, the result of choose(a,b) (i.e. the binomial coefficient)
> is too big even for a large int. I've tried the scipy library, but it
> calculates the hypergeometric using factorials too, so the problem
> persists. Is there any other library or algorithm to calculate the
> hypergeometric distribution?

Use logarithms.

Specifically,

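from math import exp

from scipy import special

def logchoose(n, k):
    # Log of the binomial coefficient choose(n, k). Since
    # log(n!) = gammaln(n+1), no huge factorial is ever formed.
    lgn1 = special.gammaln(n+1)
    lgk1 = special.gammaln(k+1)
    lgnk1 = special.gammaln(n-k+1)
    return lgn1 - (lgnk1 + lgk1)

def gauss_hypergeom(x, r, b, n):
    # Add and subtract the logs, then exponentiate once at the end.
    return exp(logchoose(r, x) +
               logchoose(b, n-x) -
               logchoose(r+b, n))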
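
As a sanity check on the log-space approach, here is a minimal sketch that needs only the standard library. It assumes math.lgamma as a stand-in for special.gammaln (fine for scalar arguments), and the names log_choose and hypergeom_pmf are made up for this illustration. It compares a small case against the direct integer formula and then evaluates a case whose binomial coefficients would be far too large for a float.

```python
from math import comb, exp, lgamma

def log_choose(n, k):
    # Log of the binomial coefficient C(n, k), via log-gamma.
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def hypergeom_pmf(x, r, b, n):
    # p(x; r, b, n) computed in log space, exponentiated only at the end.
    return exp(log_choose(r, x) + log_choose(b, n - x) - log_choose(r + b, n))

# Small case: the direct formula is still cheap, so the two can be compared.
exact_small = comb(5, 2) * comb(7, 2) / comb(12, 4)
approx_small = hypergeom_pmf(2, 5, 7, 4)

# Large case: the coefficients have thousands of digits, but their logs
# are ordinary floats.
large = hypergeom_pmf(300, 10000, 20000, 1000)
```

The two small-case values should agree to within floating-point rounding, and the large case should come out as an ordinary probability between 0 and 1.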

Or you could use gmpy if you need exact rational arithmetic rather than floating
point.
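
For the exact route, a rough sketch of the idea, using the standard library's fractions.Fraction and math.comb (Python 3.8+) rather than gmpy itself, so this is a swapped-in illustration and not the gmpy API. The name hypergeom_exact is hypothetical.

```python
from fractions import Fraction
from math import comb

def hypergeom_exact(x, r, b, n):
    # Exact rational value of choose(r, x) * choose(b, n-x) / choose(r+b, n).
    # Python ints are arbitrary precision, so nothing overflows here.
    return Fraction(comb(r, x) * comb(b, n - x), comb(r + b, n))

p = hypergeom_exact(2, 5, 7, 4)   # Fraction(14, 33)

# The probabilities over all feasible x sum to exactly 1.
total = sum(hypergeom_exact(x, 5, 7, 4) for x in range(5))
```

This costs more time and memory than the log-gamma version as the numbers grow, but the result carries no rounding error until you convert it with float(p).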

-- 
Robert Kern
robert.kern at gmail.com

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter



