[SciPy-user] Making faster statistical distributions
Christopher Fonnesbeck
chris at fonnesbeck.org
Thu Jan 29 15:53:10 EST 2004
On Jan 29, 2004, at 2:17 PM, Travis Oliphant wrote:
> Christopher Fonnesbeck wrote:
>
>> I am already using pieces of SciPy in my Markov chain Monte Carlo
>> package (PyMC), mostly for plotting functionality. I would also like
>> to exploit the distributions implemented in scipy.stats, but they are
>> far too slow for use in statistical simulation applications like
>> MCMC, where millions of random draws may be taken. Therefore, I am
>> thinking of implementing many of these distributions (at least the
>> common ones) as C or Fortran extensions. I am unsure whether to use
>> Fortran through f2py for this task, or C through weave.inline (for
>> example). I have used both in the past for various tasks, and was
>> generally happy with both. Any suggestions?
>
>
> Could you specify which ones are too slow? This is a rather broad
> statement as many are implemented in C and are very fast. Some
> distributions, however, do default to using a numerical solver to
> invert the cdf and apply this to uniform random variates. You can
> improve the speed of these distributions by overriding the _ppf
> method or the _rvs method of the object to use a faster, more
> specialized method. I would use weave or Fortran with f2py to do
> this.
>
> Best,
>
> -Travis O.
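For a distribution that falls back on numerically inverting the cdf,
the kind of override Travis describes might look roughly like the sketch
below. It assumes a recent SciPy, where rv_continuous subclasses draw
samples by applying _ppf to uniform variates; the 2004 API may have
differed in detail. The class name Triangular01 and its density
f(x) = 2x on [0, 1] are made up purely for illustration.

```python
import numpy as np
from scipy import stats


class Triangular01(stats.rv_continuous):
    """Hypothetical example distribution with pdf f(x) = 2x on [0, 1]."""

    def _pdf(self, x):
        return 2.0 * x

    def _cdf(self, x):
        return x * x

    def _ppf(self, q):
        # Closed-form inverse of the cdf. Without this override the
        # generic machinery would invert _cdf with a numerical solver.
        return np.sqrt(q)


# a and b set the support of the distribution.
tri = Triangular01(a=0.0, b=1.0, name="triangular01")

# The generic sampler applies _ppf to uniform variates, so these draws
# use the closed form above rather than a root finder.
draws = tri.rvs(size=100_000, random_state=0)
```

Overriding _rvs directly is the other route, and is probably worth it
only when a sampler faster than the inverse-cdf method is available.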
Well, the binomial and normal distributions, for sure, off the top of
my head. Using the scipy distributions slows my MCMC code down
significantly (they were the bottleneck, according to the profiling
module). Reimplementing them in Fortran via f2py sped things up a lot.
I'm not talking about the generation of random deviates, necessarily,
but rather about evaluating the density functions (the pdf, or the pmf
in the binomial case), which are used for calculating likelihoods.
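To make the likelihood case concrete, here is a minimal sketch of the
two log-likelihoods written directly with NumPy instead of going through
the scipy.stats distribution objects. It is not the Fortran code
mentioned above; the function names, the sample data and the use of
scipy.special.gammaln are assumptions chosen for illustration, and the
binomial version assumes 0 < p < 1.

```python
import numpy as np
from scipy import special, stats


def normal_loglike(x, mu, sigma):
    """Sum of normal log-densities of x given mean mu and std. dev. sigma."""
    x = np.asarray(x, dtype=float)
    z = (x - mu) / sigma
    return np.sum(-0.5 * z * z - np.log(sigma) - 0.5 * np.log(2.0 * np.pi))


def binomial_loglike(k, n, p):
    """Sum of binomial log-probabilities of counts k out of n trials."""
    k = np.asarray(k, dtype=float)
    # log of the binomial coefficient, computed via log-gamma
    log_choose = (special.gammaln(n + 1.0)
                  - special.gammaln(k + 1.0)
                  - special.gammaln(n - k + 1.0))
    return np.sum(log_choose + k * np.log(p) + (n - k) * np.log1p(-p))


# Made-up sample data, used only to compare against scipy.stats.
x = np.array([0.3, 1.7, -0.4, 2.2, 0.9])
k = np.array([2, 5, 3, 0, 4])

direct_normal = normal_loglike(x, 1.0, 2.0)
reference_normal = stats.norm.logpdf(x, 1.0, 2.0).sum()
direct_binomial = binomial_loglike(k, 10, 0.3)
reference_binomial = stats.binom.logpmf(k, 10, 0.3).sum()
```

Whether this is fast enough inside an MCMC loop would have to be
checked by profiling; the point is only that the direct formulas skip
the argument checking done by the generic distribution methods.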
C.
--
Christopher J. Fonnesbeck ( c h r i s @ f o n n e s b e c k . o r g )
Georgia Cooperative Fish & Wildlife Research Unit, University of Georgia