[SciPy-Dev] creation / pickling of stats distributions

Evgeni Burovski evgeny.burovskiy at gmail.com
Wed Jul 15 04:30:30 EDT 2020


Sorry, typo: I can send a PR or *review* one, if that helps.
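
To make the "share the shapes" suggestion quoted below a bit more concrete, here is a minimal sketch of what I had in mind. The `spec` tuple layout and the `logpdf_from_spec` helper are names made up for illustration, not anything that exists in scipy; the only scipy calls are `stats.norm` and its regular `logpdf(x, loc=..., scale=...)` method. I have not timed this, so treat it as an idea to try rather than a measured fix.

```python
import pickle

from scipy import stats

# Hypothetical convention: describe the distribution by its name in
# scipy.stats plus its parameters, instead of storing an rv_frozen instance.
spec = ("norm", {"loc": 1, "scale": 1})


def logpdf_from_spec(spec, x):
    """Evaluate logpdf on the regular (unfrozen) distribution object."""
    name, params = spec
    dist = getattr(stats, name)  # e.g. stats.norm, which already exists at import time
    return dist.logpdf(x, **params)


# Only the small tuple is pickled; no rv_frozen object is serialized.
payload = pickle.dumps(spec)
restored = pickle.loads(payload)

# Sanity check: same value as going through the frozen distribution.
frozen_value = stats.norm(loc=1, scale=1).logpdf(0.5)
spec_value = logpdf_from_spec(restored, 0.5)
```

The object handed to `multiprocessing.Pool` would then carry `spec` as its attribute and call something like `logpdf_from_spec` inside the worker, so neither the creation nor the pickling of `rv_frozen` is on the hot path.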

Wed, 15 Jul 2020, 11:29 Evgeni Burovski <evgeny.burovskiy at gmail.com>:

> While it's not wholly surprising these two are slow, it is surprising they
> are *that* slow.
>
> W.r.t. docstrings, I think there's room for adding a "skip_docstring"
> kwarg or some such to rv_generic. It'll need to be propagated to
> `rv_frozen.dist`. I can send a PR or one, if that helps.
> (I don't know about pickling, sadly.)
>
> All that said, if rv_frozen is a bottleneck, maybe it's easier to share the
> shapes between processes and use regular distributions. Would that help?
>
>
> Wed, 15 Jul 2020, 2:03 Andrew Nelson <andyfaff at gmail.com>:
>
>> I have some code that uses multiprocessing.Pool for parallelisation. This
>> requires that an object is pickled. This object has an `rv_frozen`
>> distribution as an attribute. It turns out that performance is much
>> improved if the `rv_frozen` distribution is not present, i.e. pickling of
>> `rv_frozen` objects is expensive. Creation of `rv_frozen` objects is also
>> expensive.
>>
>> ```
>> >>> import scipy.stats as stats
>> >>> import pickle
>> >>> %timeit stats.norm(scale=1, loc=1)
>> 694 µs ± 123 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>> >>> rv = stats.norm(scale=1, loc=1)
>> >>> %timeit s = pickle.dumps(rv); pickle.loads(s)
>> 1.02 ms ± 24 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>> ```
>>
>> I was hoping for an order of magnitude less time for either of those.
>> According to line profiling, two of the big culprits for slowness during
>> object creation are `rv_continuous._construct_doc` (50% of the total time,
>> with a large part spent in `_lib.doccer.docformat`!!) and
>> `rv_continuous._construct_argparser`.
>>
>> My questions are:
>>
>> 1) Is it possible to speed up pickling/unpickling of these objects? (e.g.
>> __setstate__/__getstate__, custom reduction, copyreg magic, ...)
>> 2) Is there any way to turn off docstring creation (or speed it up),
>> besides starting the interpreter with -OO?
>>
>>
>> _____________________________________
>> Dr. Andrew Nelson
>>