[SciPy-User] scipy.stats convolution of two distributions

Mon Apr 9 17:04:39 EDT 2012

Hi,

In one of my projects I built some code that depends in a nice and
generic way on the methods of rv_continuous in scipy.stats. Now it
turns out that I shot myself in the foot because I need to run a test
with the sum (convolution) of three such distributions. As far as I
can see there is no standard way to achieve this in scipy.stats. Does
anybody know of a good (generic) way to do this? If not, would it
actually be useful to add such functionality to scipy.stats, if
possible at all?

After some struggling I wrote the code below, but this is a first
attempt. I am very interested in obtaining feedback to turn this into
something that is useful for a larger population than just me.

Thanks in advance

Nicky

import numpy as np
import scipy.stats
from scipy.stats import poisson, uniform, expon
import pylab as pl

# I need a grid since np.convolve requires two arrays.

# choose the grid such that it covers the numerically relevant support
# of the distributions
grid = np.arange(0., 5., 0.001)

# I need P(D1+D2+D3 <= x)
D1 = expon(scale = 1./2)
D2 = expon(scale = 1./3)
D3 = expon(scale = 1./6)

class convolved_gen(scipy.stats.rv_continuous):
    def __init__(self, D1, D2, grid):
        self.D1 = D1
        self.D2 = D2
        delta = grid[1]-grid[0]
        p1 = self.D1.pdf(grid)
        p2 = self.D2.pdf(grid)*delta
        self.conv = np.convolve(p1, p2)

        super(convolved_gen, self).__init__(name = "convolved")

    def _cdf(self, grid):
        cdf = np.cumsum(self.conv)
        return cdf/cdf[-1] # ensure that cdf[-1] = 1

    def _pdf(self, grid):
        return self.conv[:len(grid)]

    def _stats(self):
        m = self.D1.stats("m") + self.D2.stats("m")
        v = self.D1.stats("v") + self.D2.stats("v")
        return m, v, 0., 0.

convolved = convolved_gen(D1, D2, grid)
conv = convolved()

convolved2 = convolved_gen(conv, D3, grid)
conv2 = convolved2()

pl.plot(grid,D1.cdf(grid))
pl.plot(grid,conv.cdf(grid))
pl.plot(grid,conv2.cdf(grid))
pl.show()