[SciPy-User] Big performance hit when using frozen distributions on scipy 0.16.0

Nicolas Chopin nicolas.chopin at ensae.fr
Fri Oct 28 13:37:34 EDT 2016


Yes, as I have just said, I agree that it is the creation of the frozen
dist that explains the difference.

I do need to create a *lot* of frozen distributions; there is no way around
that in what I do. Typically, one run may involve O(10^8) frozen
distributions; for each of these I may either simulate a vector (of size
10^2-10^3), or compute the log-pdf of a vector of the same size, or both.
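
For reference, here is a minimal sketch of the "define my own distribution
objects" route, assuming only a normal distribution is needed. The class name
`Normal`, its constructor signature and the `rng` argument are hypothetical
choices for illustration; none of this is scipy API. It keeps the rvs/logpdf
interface but does no docstring generation or argument checking at
construction time:

```python
import numpy as np

LOG_2PI = np.log(2.0 * np.pi)


class Normal:
    """Hypothetical lightweight stand-in for a frozen scipy.stats.norm."""

    __slots__ = ("loc", "scale")

    def __init__(self, loc=0.0, scale=1.0):
        # Only stores the parameters: no validation, no docstring building.
        self.loc = loc
        self.scale = scale

    def rvs(self, size=None, rng=None):
        # Draw from N(loc, scale^2). Passing a shared numpy Generator via
        # `rng` avoids creating a new one on every call.
        rng = np.random.default_rng() if rng is None else rng
        return self.loc + self.scale * rng.standard_normal(size)

    def logpdf(self, x):
        # Log-density of N(loc, scale^2), evaluated elementwise on x.
        z = (np.asarray(x) - self.loc) / self.scale
        return -0.5 * (z * z + LOG_2PI) - np.log(self.scale)
```

Something like `Normal(loc=1.0, scale=2.0).rvs(size=1000, rng=rng)` would then
replace `scipy.stats.norm(1.0, 2.0).rvs(size=1000)`, at the cost of losing
scipy's parameter validation.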

On Fri, 28 Oct 2016 at 19:29 Evgeni Burovski <evgeny.burovskiy at gmail.com>
wrote:

> On Fri, Oct 28, 2016 at 7:53 PM, Nicolas Chopin <nicolas.chopin at ensae.fr>
> wrote:
> >  Hi list,
> > I'm working on a package that does some complicated Monte Carlo experiments.
> > The package passes around frozen distributions quite a lot. Trying to
> > understand why certain parts were so slow, I did a bit of profiling, and
> > stumbled upon this:
> >
> >     %timeit x = scipy.stats.norm.rvs(size=1000)
> >     10000 loops, best of 3: 49.3 µs per loop
> >
> >     %timeit dist = scipy.stats.norm(); x = dist.rvs(size=1000)
> >     1000 loops, best of 3: 512 µs per loop
> >
> > So there is a 10x penalty when using a frozen dist, even though the
> > simulated vector has size 1000. This is using scipy 0.16.0 on Ubuntu 16.04.
> > I cannot replicate this problem on another machine with scipy 0.13.3 and
> > Ubuntu 14.04 (there is a penalty, but it's much smaller).
> >
> > In the profiler, I can see that a lot of time is spent doing string
> > operations (such as expandtabs) in order to generate the doc. In the
> > source, I see that this may depend on a certain -OO flag???
> >
> > I do realise that instantiating a frozen distribution requires some
> > argument checking and whatnot, but here it looks too expensive. For my
> > package, this amounts to hours spent on ... tab expansion?
> >
> > Anyway, I'd like to ask:
> > (a) Is this a known problem? I could not find anything online about this.
> > (b) Is this going to be fixed in some future version of scipy?
> > (c) Is there a way to fix this with *this* version of scipy using the flag
> > mentioned in the source, and if so, how?
> > (d) Or should I instead manually redefine my own distribution objects?
> > (It's really convenient for what I'm trying to do to define distributions
> > as objects with methods rvs, logpdf, and so on.)
> >
> > Many thanks for reading this! :-)
> > All the best
>
>
> Why are you including the construction time in your timings? Surely,
> if you use frozen distributions for some MC work, you're not
> recreating frozen instances in hot loops?
>
>
> In [4]: %timeit norm.rvs(size=100, random_state=123)
> The slowest run took 142.68 times longer than the fastest. This could
> mean that an intermediate result is being cached.
> 10000 loops, best of 3: 74.2 µs per loop
>
> In [5]: %timeit dist = norm(); dist.rvs(size=100, random_state=123)
> The slowest run took 4.40 times longer than the fastest. This could
> mean that an intermediate result is being cached.
> 1000 loops, best of 3: 796 µs per loop
>
> In [6]: %timeit dist = norm()
> The slowest run took 4.89 times longer than the fastest. This could
> mean that an intermediate result is being cached.
> 1000 loops, best of 3: 672 µs per loop
>
> > (b) Is this going to be fixed in some future version of scipy?
> > (c) Is there a way to fix this with *this* version of scipy using the flag
> > mentioned in the source, and if so, how?
>
> You could of course try reverting
> https://github.com/scipy/scipy/pull/3245 for your local copy of scipy.
> It went into scipy 0.14, so this is the likely suspect.
>
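
The timings quoted above mix two costs: building the frozen distribution and
drawing from it. A minimal sketch for measuring them separately outside
IPython, assuming scipy is importable; the helper name `time_per_call` is
hypothetical, and the absolute numbers will depend on the machine and the
scipy version:

```python
import timeit

from scipy import stats


def time_per_call(func, number=200):
    """Best-of-3 average wall time, in seconds, of one call to func()."""
    return min(timeit.repeat(func, number=number, repeat=3)) / number


# Built once, outside the timed calls, so that its construction cost is
# excluded from the "frozen rvs only" measurement.
frozen = stats.norm()

timings = {
    "construct only": time_per_call(lambda: stats.norm()),
    "frozen rvs only": time_per_call(lambda: frozen.rvs(size=1000)),
    "unfrozen rvs": time_per_call(lambda: stats.norm.rvs(size=1000)),
}

for label, seconds in timings.items():
    print(f"{label:>16}: {seconds * 1e6:8.1f} µs per call")
```

If "construct only" dominates, as in the quoted In [6] timing, then the cost is
in instantiation rather than in rvs itself.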