[SciPy-User] Big performance hit when using frozen distributions on scipy 0.16.0

Nicolas Chopin nicolas.chopin at ensae.fr
Fri Oct 28 12:53:21 EDT 2016


Hi list,
I'm working on a package that does some complicated Monte Carlo experiments.
The package passes around frozen distributions quite a lot. Trying to
understand why certain parts were so slow, I did a bit of profiling, and
stumbled upon this:

> %timeit x = scipy.stats.norm.rvs(size=1000)
> 10000 loops, best of 3: 49.3 µs per loop

> %timeit dist = scipy.stats.norm(); x = dist.rvs(size=1000)
> 1000 loops, best of 3: 512 µs per loop

So there is a roughly 10x penalty when using a frozen distribution, even
though the simulated vector has size 1000. This is using scipy 0.16.0 on
Ubuntu 16.04. I cannot replicate this problem on another machine with
scipy 0.13.3 and Ubuntu 14.04 (there is a penalty, but it's much smaller).
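
The comparison above can be reproduced outside IPython with the standard timeit module. The following is a minimal sketch: the function names and the third case (reusing a single frozen object) are illustrative additions, not part of the original measurements, and the numbers will vary with the machine and the scipy version.

```python
import timeit

import scipy.stats


def direct():
    # Call rvs on the generic distribution object; nothing is frozen.
    return scipy.stats.norm.rvs(size=1000)


def freeze_each_time():
    # Build a new frozen distribution on every call, then sample from it.
    dist = scipy.stats.norm()
    return dist.rvs(size=1000)


# Freeze once, outside the timed function, and reuse the same object.
frozen = scipy.stats.norm()


def reuse_frozen():
    return frozen.rvs(size=1000)


def best_time_us(func, number=100, repeat=3):
    # Best of `repeat` runs, reported in microseconds per call.
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e6


timings = {f.__name__: best_time_us(f)
           for f in (direct, freeze_each_time, reuse_frozen)}

for name, t in timings.items():
    print("{}: {:.1f} us per call".format(name, t))
```

If the cost is in constructing the frozen object rather than in sampling, the third timing should be close to the first.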

In the profiler, I can see that a lot of time is spent doing string
operations (such as expand_tabs) in order to generate the docstring. In the
source, I see that this may depend on the -OO flag?
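
Assuming the flag in question is Python's -OO option (which strips docstrings and sets sys.flags.optimize to 2), one can check from within a session whether it is active. This is only a sketch of that check; whether scipy actually skips docstring generation under this flag should be verified against the source of the installed version.

```python
import sys


def docstrings_stripped():
    # Under "python -OO", sys.flags.optimize is 2 and docstrings are discarded.
    return sys.flags.optimize >= 2


def probe():
    """This docstring disappears when Python runs with -OO."""


print("optimize level:", sys.flags.optimize)
print("docstrings stripped:", docstrings_stripped())
print("probe.__doc__ is None:", probe.__doc__ is None)
```

Running the same script once with "python script.py" and once with "python -OO script.py" shows the difference.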

I do realise that instantiating a frozen distribution requires some
argument checking and whatnot, but here it looks too expensive. For my
package, this amounts to hours spent on ... tab expansion?

Anyway, I'd like to ask:
(a) Is this a known problem? I could not find anything online about this.
(b) Is this going to be fixed in some future version of scipy?
(c) Is there a way to fix this with *this* version of scipy using the flag
mentioned in the source, and if so, how?
(d) Or should I instead manually define my own distribution objects?
(It's really convenient for what I'm trying to do to define distributions
as objects with methods rvs, logpdf, and so on.)
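
For the last option, a hand-written distribution object could look like the minimal sketch below. The class name Normal and its layout are hypothetical, chosen only for illustration. It stores the parameters and forwards to the generic scipy.stats.norm methods, so no frozen distribution is built on construction.

```python
import numpy as np
import scipy.stats


class Normal:
    """Lightweight stand-in for a frozen normal distribution (illustrative)."""

    def __init__(self, loc=0.0, scale=1.0):
        # Only store the parameters; no scipy object is created here.
        self.loc = loc
        self.scale = scale

    def rvs(self, size=None):
        # Delegate to the generic (unfrozen) distribution.
        return scipy.stats.norm.rvs(loc=self.loc, scale=self.scale, size=size)

    def logpdf(self, x):
        return scipy.stats.norm.logpdf(x, loc=self.loc, scale=self.scale)


dist = Normal(loc=2.0, scale=3.0)
x = dist.rvs(size=1000)

# Sanity check against the equivalent frozen distribution.
reference = scipy.stats.norm(loc=2.0, scale=3.0)
print(np.allclose(dist.logpdf(x), reference.logpdf(x)))
```

The trade-off is that each supported distribution and method has to be wrapped by hand, whereas frozen distributions provide the full interface automatically.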

Many thanks for reading this! :-)
All the best