[SciPy-User] how to fit a given pdf

Wed Aug 11 13:48:32 EDT 2010

On Wed, Aug 11, 2010 at 11:14 AM, Jonathan Stickel <jjstickel at vcn.com>wrote:

> On 8/11/10 09:00 , scipy-user-request at scipy.org wrote:
> > Date: Wed, 11 Aug 2010 12:00:22 -0300
> > From: Renato Fabbri<renato.fabbri at gmail.com>
> > Subject: [SciPy-User] how to fit a given pdf
> > To: Discussion of Numerical Python<numpy-discussion at scipy.org>,
> >       scipy-user at scipy.org
> > Message-ID:
> >       <AANLkTinmmEbSo9_-ZmCXDzqDXLZb1fM8zFVoM-gnHo+c at mail.gmail.com<AANLkTinmmEbSo9_-ZmCXDzqDXLZb1fM8zFVoM-gnHo%2Bc at mail.gmail.com>
> >
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Dear All,
> >
> > help appreciated, thanks in advance.
> >
> > how do you fit a pdf you have with a given pdf (say gamma).
> >
> > with the file attached, you can go like:
> >
> > a=open("AC-010_ED-1m37F100P0.txt","rb")
> > aa=a.read()
> > aaa=aa[1:-1].split(",")
> > data=[int(i) for i in aaa]
> >
> > if you do pylab.plot(data); pylab.show()
> >
> > The data is something like:
> > ___|\___
> >
> > It is my pdf (probability density function).
> >
> > how can i find the right parameters to make that fit with a gamma?
> >
> > if i was looking for a normal pdf, for example, i would just find mean
> > and std and ask for the pdf.
> >
> > i've been playing with scipy.stats.distributions.gamma but i have not
> > reached anything.
> >
> > we can extend the discussion further, but this is a good starting point.
> >
> > any idea?
> >
>
> I am not familiar with the scipy.stats module, and so I do not know what
> it can do for you.  However, I would just generate a model gamma
> distribution from the mean and variance, just as for a normal
> distribution.  The gamma distribution equation can be written as
>
> p(x) = p0/(b^a*Gamma(a))*x^(a-1)*exp(-x/b)
>
> assuming x starts at zero (x>=0)
> (http://en.wikipedia.org/wiki/Gamma_distribution).  Then the the
> parameters a and b are related to the mean and variance by
>
> a = mean^2/var
> b = var/mean
>
> (I did the algebra quickly just now, and so you might want to
> double-check).  p0 is the area under the distribution and may be simply
> 1 if your distribution is normalized.
>
> Hope this helps.
>
> Jonathan
>

Note, those are the point estimators for gamma distribution, but it does not
use maximum likeihood estimation (MLE).  Using MLE, finding the estimator
for alpha is tricky, requiring convergence.  Probably best to use the fit
module as Josef mentioned.  However, there is a decent estimator for the
estimator (I heard you liked estimation...)

a = 0.5 / (ln(mean(X)) - mean(ln(X)))

Note that if you have a zero value in your dataset this won't work.  Also,
the reference is not 100% clear if it is a base 10 log or not.  I am pretty
certain it is a natural log.

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20100811/f258be5b/attachment.html>