[SciPy-User] log pdf, cdf, etc

Travis Oliphant oliphant at enthought.com
Sat May 29 16:51:38 EDT 2010


On May 28, 2010, at 9:15 AM, josef.pktd at gmail.com wrote:

> On Fri, May 28, 2010 at 7:29 AM, Chris Strickland
> <christophermarkstrickland at gmail.com> wrote:
>> Hi,
>> 
>> When using any of the distributions of scipy.stats there does not seem to be
>> the ability (or at least I cannot figure out how) to have the function return
>> the log of the pdf, cdf, sf, etc. For statistical analysis this is essential.
>> For instance, suppose we are interested in an exponential distribution for a
>> random variable x with a hyperparameter lambda; there needs to be an option
>> that returns -log(lambda)-x/lambda. It is not sufficient (numerically) to
>> calculate log(scipy.stats.expon.pdf(x,lambda)).
>> 
>> Is there a way to do this using the distributions in scipy.stats?
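
To make the numerical point above concrete, here is a minimal sketch using only
the standard library. The function names naive_logpdf and direct_logpdf are
made up for illustration and are not part of scipy.stats; lam plays the role
of the scale hyperparameter lambda.

```python
import math

def naive_logpdf(x, lam):
    # Evaluate the density first, then take the log.
    pdf = math.exp(-x / lam) / lam
    # math.log(0.0) raises ValueError, so report underflow as -inf explicitly.
    return math.log(pdf) if pdf > 0.0 else -math.inf

def direct_logpdf(x, lam):
    # Closed form quoted above: -log(lambda) - x/lambda.
    return -math.log(lam) - x / lam

lam = 2.0
for x in (1.0, 100.0, 2000.0):
    print(x, naive_logpdf(x, lam), direct_logpdf(x, lam))
```

For moderate x the two agree, but far in the tail exp(-x/lam) underflows to
zero and the naive version loses the value entirely, while the closed form
stays finite.
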
> 
> It would need a new method for each distribution, e.g. _loglike, _logpdf.
> So, this is work, and for some distributions the log wouldn't simplify much.
> 
> I proposed this once together with other improvements (but without response).
> 
> The second useful method for estimation would be _fitstart, which
> provides distribution specific starting values for fit, e.g. a moment
> estimator, or a simple rule of thumb
> http://projects.scipy.org/scipy/ticket/808
> 
> 
> Here are some of my currently planned enhancements to the distributions:
> 
> http://bazaar.launchpad.net/~scipystats/statsmodels/trunk/annotate/head:/scikits/statsmodels/sandbox/stats/distributions_patch.py

Hey Josef, 

I've been playing with distributions.py today and added logpdf, logcdf, logsf methods (based on _logpdf, _logcdf, _logsf methods in each distribution).  
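
Roughly, the pattern looks like the sketch below. This is not the actual
distributions.py code: the class names are hypothetical, it handles scalars
only (no broadcasting or argument checking), and the generic fallback of
taking the log of _pdf is an assumption about how a default could behave.

```python
import math

class SketchDistribution:
    """Hypothetical stand-in for a continuous distribution base class."""

    def _pdf(self, z):
        raise NotImplementedError

    def _logpdf(self, z):
        # Generic fallback: log of the standardized pdf (may underflow to -inf).
        p = self._pdf(z)
        return math.log(p) if p > 0.0 else -math.inf

    def logpdf(self, x, loc=0.0, scale=1.0):
        # Public method: standardize, call the private hook, adjust for scale.
        z = (x - loc) / scale
        return self._logpdf(z) - math.log(scale)

class SketchExpon(SketchDistribution):
    def _pdf(self, z):
        return math.exp(-z) if z >= 0 else 0.0

    def _logpdf(self, z):
        # Distribution-specific override: avoids the exp/log round trip.
        return -z if z >= 0 else -math.inf

print(SketchExpon().logpdf(2000.0, scale=2.0))
```

A distribution that does not override _logpdf still gets a working logpdf
through the fallback; one that does override it keeps precision in the tails.
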

I also added your _fitstart suggestion.   I would like to do something like your nnlf_fit method that allows you to fix some parameters and only solve for others, but I haven't thought through all the issues yet.  
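
As a rough illustration of fixing one parameter and solving for the other,
here is a standard-library sketch for a shifted exponential. It is not the
nnlf_fit code; the names expon_nnlf, minimize_scalar_golden,
fit_scale_with_fixed_loc and the floc argument are all made up for this
example, and the golden-section search is just a stand-in for a real optimizer.
The start value is a moment estimate, in the spirit of _fitstart.

```python
import math

def expon_nnlf(data, loc, scale):
    # Negative log-likelihood of a shifted exponential; +inf outside the support.
    if scale <= 0 or min(data) < loc:
        return math.inf
    return sum(math.log(scale) + (x - loc) / scale for x in data)

def minimize_scalar_golden(f, lo, hi, tol=1e-8):
    # Golden-section search for the minimum of a unimodal function on [lo, hi].
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - invphi * (b - a)
        d = a + invphi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0

def fit_scale_with_fixed_loc(data, floc):
    # Moment-based start value: sample mean minus the fixed location.
    start = sum(data) / len(data) - floc
    # Only the scale is free; loc is held at floc inside the objective.
    return minimize_scalar_golden(lambda s: expon_nnlf(data, floc, s),
                                  start / 10.0, start * 10.0)

data = [0.3, 1.2, 0.7, 2.5, 0.1, 1.9]
print(fit_scale_with_fixed_loc(data, floc=0.0))
```

With loc fixed at 0 the result should land close to the sample mean, which is
the known maximum likelihood estimate of the scale in that case.
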

Do you have updated code I could look at? These are relatively easy additions that I would like to put in today. Do you have check-in rights to SciPy?

Thanks,

-Travis

> 
> but I just checked, it looks like I forgot to copy the _loglike method
> that I started from my experimental scripts.
> 
> For a few distributions, where this is possible, it would also be
> useful to add the gradient with respect to the parameters, (or even
> the Hessian). But this is currently mostly just an idea, since we need
> some analytical gradients in the estimation of stats models.
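
For the exponential case discussed earlier in the thread, an analytical
gradient of the log-likelihood is easy to write down and to check against a
finite difference. This is only a sketch; the function names are hypothetical
and lam is the scale parameter.

```python
import math

def expon_loglike(data, lam):
    # Sum of -log(lam) - x/lam over the sample.
    return sum(-math.log(lam) - x / lam for x in data)

def expon_loglike_grad(data, lam):
    # d/d(lam) of the sum above: -n/lam + sum(x)/lam**2
    n = len(data)
    return -n / lam + sum(data) / lam ** 2

def central_difference(f, t, h=1e-6):
    # Symmetric finite-difference approximation of f'(t).
    return (f(t + h) - f(t - h)) / (2.0 * h)

data = [0.3, 1.2, 0.7, 2.5, 0.1, 1.9]
lam = 0.8
analytic = expon_loglike_grad(data, lam)
numeric = central_difference(lambda t: expon_loglike(data, t), lam)
print(analytic, numeric)
```

The two values should agree to several digits, and the gradient vanishes at
lam equal to the sample mean.
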
> 
> 
>> 
>> If there is not, is it possible for me to suggest that this feature be added?
>> There is such an excellent range of distributions, each with such an
>> impressive range of options, that it seems a shame to have to mostly manually
>> code up the log of pdfs and often call the log of CDFs from R.
> 
> So far I only thought about log pdf, because I wanted it for Maximum
> Likelihood estimation.
> 
> Do you have a rough idea for which distributions log cdf would work?
> That is, for which distributions is an analytical or efficient
> numerical expression possible?
> 
> I also think that scipy.stats.distributions could be one of the best
> (broadest, most consistent) collections of univariate distributions that I
> have seen so far, once we fill in some missing pieces.
> 
> As a way forward, I think we could make the distributions into a
> numerical encyclopedia by adding private methods to those
> distributions where it makes sense, like log pdf, log cdf and I also
> started to add characteristic functions to some distributions in my
> experimental scripts.
> If you have a collection of logpdf, logcdf, we could add a trac ticket for this.
> 
> However, this would miss the generic broadcasting part of the public
> functions, pdf, cdf,... but for estimation I wouldn't necessarily call
> those because of the overhead.
> 
> 
> I'm working on and off on this, so it's moving only slowly (and my
> wishlist is big).
> (For example, I was reading up on extreme value distributions in
> actuarial science and hydrology to get a better overview of the
> estimators.)
> 
> 
> So, I would really love to hear any ideas and feedback, and to see
> contributions to improving the distributions.
> 
> Josef
> 
> 
>> 
>> Thanks,
>> Chris.
>> 
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> 
>> 

---
Travis Oliphant
Enthought, Inc.
oliphant at enthought.com
1-512-536-1057
http://www.enthought.com





