[SciPy-User] log pdf, cdf, etc

josef.pktd at gmail.com josef.pktd at gmail.com
Sat May 29 00:20:51 EDT 2010


On Sat, May 29, 2010 at 12:15 AM,  <josef.pktd at gmail.com> wrote:
> On Fri, May 28, 2010 at 11:34 PM, Chris Strickland
> <christophermarkstrickland at gmail.com> wrote:
>>
>>
>> On Sat, May 29, 2010 at 12:53 PM, <josef.pktd at gmail.com> wrote:
>>>
>>> I don't think that, for many use cases, the performance and accuracy
>>> hit of taking log(stats.t.pdf), or the same for many other
>>> distributions, would be large enough to make it useless. At least, I
>>> haven't seen any other comments in this direction.
>>>
>>> One of the main use cases for me of stats.distributions is all the
>>> statistical test distributions, t, F, chi2 and so on. However, in
>>> statsmodels we have a mixture of calls to the pdf/cdf of
>>> stats.distributions and reimplementations of loglikelihood functions,
>>> where the scipy version is also just used for testing.
>>>
>> The main use for me is in specifying (log) prior distributions, (log)
>> posterior distributions and log-likelihood functions. There is simply no way
>> around using the log pdf in the vast majority of cases in MCMC analysis.
>> Whilst it is trivial for me to simply write functions when I need them, it
>> would obviously benefit the statistical community as a whole if the option
>> were available in the excellent set of distributions that are available as
>> part of Scipy.
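
To make the MCMC point concrete, here is a minimal sketch in plain Python
(only the math module, so no scipy call is implied). The function names,
the data, and the normal prior with standard deviation 10 are all made up
for illustration. It shows a likelihood that underflows as a product but
is perfectly usable as a sum of log densities.

```python
import math

def norm_logpdf(x, mu=0.0, sigma=1.0):
    """Log of the normal density, computed without calling exp()."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def log_posterior(mu, data, prior_mu=0.0, prior_sigma=10.0):
    """Unnormalized log posterior: log prior plus log-likelihood.

    Assumes unit-variance normal data and a normal prior on mu
    (both are illustrative choices, not taken from the thread).
    """
    log_prior = norm_logpdf(mu, prior_mu, prior_sigma)
    log_lik = sum(norm_logpdf(x, mu, 1.0) for x in data)
    return log_prior + log_lik

# Made-up data: 1000 values cycling through -3, -2, ..., 3.
data = [float(i % 7 - 3) for i in range(1000)]

# The plain likelihood is a product of 1000 densities, each below 0.4,
# so it underflows to exactly 0.0 in double precision.
likelihood = 1.0
for x in data:
    likelihood *= math.exp(norm_logpdf(x, 0.0, 1.0))

# A Metropolis step only needs the difference of two log posteriors,
# which stays finite even though the likelihood itself underflowed.
log_ratio = log_posterior(0.1, data) - log_posterior(0.0, data)
```

The acceptance test in a Metropolis sampler would then compare log_ratio
with the log of a uniform draw, so the densities never need to leave the
log scale.
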
>
> I agree that it would be very good to have this generally available,
> and I will appreciate it for maximum likelihood.
> For MCMC (where I know only a little about the details), it might,
> however, always be faster to work with dedicated code as in pymc.
>
>>
>>>
>>> R's license, GPL, is incompatible with the license of scipy, BSD.
>>> While they are allowed to look at our code, code that goes into scipy
>>> cannot be based on GPL licensed code.
>>>
>> Fair enough. Still at least for the normal cdf we could simply use the
>> references in the R code to write a Scipy version.
>
> If it's the C or Fortran implementation, then it is outside my
> competence; I'm a pure scripting language person.
>
> Another idea for this would be to see if any of the pymc code for this
> would fit into scipy. Since I leave Fortran to others, I never looked
> at it.

I'm contradicting and confusing myself: I don't think pymc has any cdf
code, only pdf.

Josef

>
> I think if we get the easier cases, logpdf and logcdf that don't
> require compiled versions, we would be able to cover already a
> considerable range of the distributions.
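
As an illustration of one of the "easier cases", here is a sketch of a
logpdf for Student's t written directly on the log scale with lgamma, next
to a naive pdf that calls gamma() itself. Both function names are
hypothetical and neither is the scipy implementation; the formula is the
standard t density.

```python
import math

def t_logpdf(x, df):
    """Log density of Student's t with df degrees of freedom."""
    return (math.lgamma(0.5 * (df + 1.0)) - math.lgamma(0.5 * df)
            - 0.5 * math.log(df * math.pi)
            - 0.5 * (df + 1.0) * math.log1p(x * x / df))

def t_pdf_naive(x, df):
    """Density written with gamma() directly, for comparison only."""
    return (math.gamma(0.5 * (df + 1.0)) / math.gamma(0.5 * df)
            / math.sqrt(df * math.pi)
            * (1.0 + x * x / df) ** (-0.5 * (df + 1.0)))

# For moderate df both routes agree; for df = 400 the gamma() calls in
# t_pdf_naive overflow (OverflowError), while t_logpdf stays finite.
small_df_gap = abs(math.log(t_pdf_naive(1.3, 5.0)) - t_logpdf(1.3, 5.0))
large_df_value = t_logpdf(1.3, 400.0)
```

The log version needs no compiled code beyond lgamma and log1p, and it
avoids the intermediate overflow entirely.
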
>
> However, I now also agree that having norm.logcdf would be useful for
> many other distributions.
>
>>
>>>
>>> I've never seen it mentioned before that there is a direct function for
>>> log(norm.cdf). Which functions and packages in R implement the
>>> logarithm of the cdf of these distributions?
>>
>> It is pnorm in the stats package that gives the log of the normal CDF.
>> It is kind of essential for distributions like the powernormal as well,
>> which use the normal cdf as a part of their pdf.
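
A rough sketch of what a log normal cdf could look like in pure Python,
and how a powernormal logpdf would use it. This is not R's algorithm and
not scipy's. The function names, the switch point at -20, and the
five-term tail series are assumptions chosen for illustration. The
powernormal density is taken to be c * phi(x) * Phi(-x)**(c-1).

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)

def norm_logpdf(x):
    """Log of the standard normal density."""
    return -0.5 * x * x - LOG_SQRT_2PI

def norm_logcdf(x):
    """Log of the standard normal cdf, usable in the far lower tail."""
    if x > 0.0:
        # Phi(x) = 1 - erfc(x / sqrt(2)) / 2; log1p keeps small terms.
        return math.log1p(-0.5 * math.erfc(x / math.sqrt(2.0)))
    if x > -20.0:
        # Phi(x) = erfc(-x / sqrt(2)) / 2, still well above underflow.
        return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))
    # Asymptotic series for x -> -inf:
    # Phi(x) ~ phi(x) / (-x) * (1 - 1/x^2 + 3/x^4 - 15/x^6 + 105/x^8)
    r = 1.0 / (x * x)
    series = 1.0 - r * (1.0 - 3.0 * r * (1.0 - 5.0 * r * (1.0 - 7.0 * r)))
    return norm_logpdf(x) - math.log(-x) + math.log(series)

def powernorm_logpdf(x, c):
    """Log density of the powernormal, assuming c * phi(x) * Phi(-x)**(c-1)."""
    return math.log(c) + norm_logpdf(x) + (c - 1.0) * norm_logcdf(-x)

# At x = -40 the cdf itself underflows to 0.0, so log(cdf) is unusable,
# but the log cdf computed directly is an ordinary finite number.
naive_cdf = 0.5 * math.erfc(40.0 / math.sqrt(2.0))
tail_value = norm_logcdf(-40.0)
```

With something like this, powernorm_logpdf stays finite for large positive
x, where Phi(-x) itself would be rounded to zero.
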
>
> see previous message, I never paid enough attention to see the log.p option
>
> Josef
>
>>>
>>> The cdf for several distributions (including normal) is implemented in
>>> Fortran or C in scipy.special, and I've never seen a log version for
>>> them.
>>>
>>> I looked at some of the distributions, and logpdf could be calculated
>>> more efficiently in many of them, and very often logcdf as well.
>>>
>>> I opened a ticket for this
>>> http://projects.scipy.org/scipy/ticket/1184
>>>
>>> I also saw that there are still smaller, numerical improvements
>>> possible in several distributions.
>>>
>>> Thanks,
>>>
>>> Josef
>>>
>>> ______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>>
>>
>>
>
