[SciPy-Dev] Optimization to stats.dlaplace.rvs

Andrew Reed reed at cs.unc.edu
Fri Nov 15 16:00:45 EST 2019


All,

Some of you may have seen a message I sent to the NumPy mailing list about
adding a two-sided geometric distribution, and/or the comments on my PR on
GitHub:
https://mail.python.org/pipermail/numpy-discussion/2019-November/080223.html
https://github.com/numpy/numpy/pull/14890

Bottom line: rather than adding it to NumPy as a new distribution, it was
suggested that I look into using it to implement stats.dlaplace.rvs (which
currently samples by inverting the CDF), and I was provided with some code to
get me started.
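
For anyone who has not followed the PR, here is a minimal sketch of the idea.
It is not the code from the PR. It relies on the fact that the difference of
two i.i.d. geometric variates with success probability p = 1 - exp(-a) has
pmf tanh(a/2) * exp(-a * |k|), which is the dlaplace pmf. The function name
dlaplace_rvs_sketch is hypothetical and only used for illustration.

```python
import numpy as np


def dlaplace_rvs_sketch(a, size=None, rng=None):
    """Hypothetical helper: discrete Laplace variates as a difference of geometrics."""
    rng = np.random.default_rng(rng)
    # Choose the success probability so that 1 - p == exp(-a).
    # expm1 keeps precision when a is small.
    p = -np.expm1(-a)
    # Both draws live on {1, 2, ...}; the offset cancels in the difference.
    return rng.geometric(p, size=size) - rng.geometric(p, size=size)


samples = dlaplace_rvs_sketch(0.8, size=200_000, rng=12345)
```

The result is symmetric about zero, and the fraction of zeros should be close
to tanh(a/2).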

I have been able to add the suggested code, with only minor tweaks, to
SciPy.  A few tests with timeit seem to confirm that the new code provides
a speedup of about 250% on my machine.  Furthermore, the default rvs
function would get killed when I tried to generate 100 million samples,
whereas this new code can generate at least 100 million samples (I get a
MemoryError on my VM when I try to go any higher).

I think I'm at the point now where I need to start working through some
broadcasting errors, but before I do, I wanted to gauge the potential
interest in these improvements.
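
As an illustration of the kind of broadcasting involved (a sketch under my
own assumptions, using NumPy's Generator directly rather than the SciPy
internals), an array-valued shape parameter has to broadcast against the
requested output size:

```python
import numpy as np

rng = np.random.default_rng(2019)

# One shape parameter per column (the values are arbitrary examples).
a = np.array([0.5, 1.0, 2.0])
p = -np.expm1(-a)  # p = 1 - exp(-a), elementwise

# p has shape (3,), so it broadcasts against a requested size of (4, 3).
size = (4, 3)
draws = rng.geometric(p, size=size) - rng.geometric(p, size=size)
```

Each column of draws then corresponds to one value of a.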

I imagine that implementing a new method for rvs will probably break
reproducibility with previous versions of SciPy (the same seed would no
longer produce the same samples), and I'm not sure if this is a
distribution that warrants optimization.

Thanks,
Andrew