[SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats

Tirth Patel tirthasheshpatel at gmail.com
Sat Apr 3 14:15:53 EDT 2021


Hi Christoph,

Thanks for the reply!

On 4/3/21, Christoph Baumgarten <christoph.baumgarten at gmail.com> wrote:
> Hi Tirth,
>
> great to hear that you are interested in the project! My main goal would be
> to add the "universal" rv generation methods to SciPy, e.g. PINV, TDR (see
> the UNU.RAN User Manual:
> <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Methods_005ffor_005fCONT>).
> At the moment, we just have one such function in SciPy (see the random
> variate generation section of the scipy.stats reference guide, v1.6.2:
> <https://docs.scipy.org/doc/scipy/reference/stats.html#random-variate-generation>)
> and it is very basic (I implemented it a while ago). Such functionality is
> very useful in many situations; see e.g. scipy/scipy issue #13051,
> "OverflowError when sampling from some handmade stats distributions":
> <https://github.com/scipy/scipy/issues/13051>. So the API would rather be
> name_of_sampling_method(pdf / cdf, parameters of the sampling methods).
>

I think I get a general idea now. Thanks!
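
To check my understanding of that call shape, here is a minimal sketch in
plain Python (standard library only). The name `sample_by_inversion` and its
parameters are hypothetical and are not a SciPy or UNU.RAN API, and it inverts
the CDF by simple bisection rather than by the polynomial interpolation that
PINV uses:

```python
import math
import random


def sample_by_inversion(cdf, lo, hi, size, rng=None, tol=1e-10):
    """Draw `size` variates by numerically inverting `cdf` on [lo, hi].

    Hypothetical helper for illustration only (not a SciPy/UNU.RAN API).
    Assumes `cdf` is continuous and increasing, with cdf(lo) close to 0
    and cdf(hi) close to 1.
    """
    rng = rng or random.Random()
    samples = []
    for _ in range(size):
        u = rng.random()      # target probability, uniform on [0, 1)
        a, b = lo, hi
        while b - a > tol:    # bisection: narrow down x with cdf(x) = u
            mid = 0.5 * (a + b)
            if cdf(mid) < u:
                a = mid
            else:
                b = mid
        samples.append(0.5 * (a + b))
    return samples


def expon_cdf(x):
    # CDF of the standard exponential distribution (mean 1)
    return 1.0 - math.exp(-x)


# Usage: the caller supplies a CDF plus the sampler's own parameters.
draws = sample_by_inversion(expon_cdf, 0.0, 50.0, size=2000,
                            rng=random.Random(0))
sample_mean = sum(draws) / len(draws)
```

The point is only the interface: the caller passes a CDF together with the
sampler's tuning parameters and gets variates back, without having to define
a full distribution object first.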

> Whether one should add a keyword to distribution.rvs(...) that allows the
> user to choose the sampling method might be a question for a follow-up
> project. This would also be quite time-consuming since you need to verify
> which method is appropriate for a given distribution. A simpler task could
> be to check if the rvs method of a specific distribution could be
> overridden with the corresponding method in UNU.RAN (see the UNU.RAN User
> Manual:
> <http://statmath.wu.ac.at/software/unuran/doc/unuran.html#Stddist>).
> For example, geninvgauss in SciPy relies on a Python implementation of a
> rejection method / RoU and the implementation in UNU.RAN (gig / gig2) might
> be faster. Also, distributions with slow ppf methods relying on special
> functions would be natural candidates. But that would also be of lower
> priority for me.
>
> I hope it helps. Feel free to reach out if you have more questions.
>
> Christoph
>
> On Fri, Apr 2, 2021 at 6:00 PM <scipy-dev-request at python.org> wrote:
>
>> Today's Topics:
>>
>>    1. Multivariate non-central hypergeometric distributions
>>       (Wallenius' and Fisher's) (Ivan Naydonov)
>>    2. GSoC: Integrate library UNU.RAN into scipy.stats (Tirth Patel)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 2 Apr 2021 00:05:10 +0200
>> From: Ivan Naydonov <samogot at gmail.com>
>> To: scipy-dev at python.org
>> Subject: [SciPy-Dev] Multivariate non-central hypergeometric
>>         distributions (Wallenius' and Fisher's)
>> Message-ID:
>>         <CAMJZOa0xemJR7WCjcMcEyGUKvqMqPF6YDtLxbb+r2e4d1k-HDA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi everyone.
>>
>> Univariate versions of non-central hypergeometric distributions based
>> on Agner Fog's BiasedUrn C++ code were added recently (in
>> https://github.com/scipy/scipy/pull/13330). C++ code added in that PR
>> already contains the implementation of multivariate versions of the same
>> distributions. As far as I understand, the only things needed for
>> multivariate distributions to work are Python wrapper and probably some
>> tests.
>>
>> Is anyone interested in adding them? If not, I might get to it myself
>> later this month, but as I haven't made any SciPy contributions yet and
>> am not familiar with the codebase, I will need much more time to ramp up
>> than an experienced contributor :)
>>
>> --
>> Regards,
>> Ivan Naydonov
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Fri, 2 Apr 2021 19:19:26 +0530
>> From: Tirth Patel <tirthasheshpatel at gmail.com>
>> To: scipy-dev <scipy-dev at python.org>
>> Subject: [SciPy-Dev] GSoC: Integrate library UNU.RAN into scipy.stats
>> Message-ID:
>>         <CABpuv38XtcJWOT6kskF_Rv3T=_0iSoNCVr7gtnupL0kGQixfWg at mail.gmail.com>
>> Content-Type: text/plain; charset="UTF-8"
>>
>> Hi all,
>>
>> I would like to participate in GSoC this year and found this project
>> very interesting!
>>
>> TL;DR: I have a few questions regarding the project:
>>   - Should the user interface be a separate Python submodule (inside
>> `scipy.stats`), or should it be an extension of the `rvs` method?
>>   - Should the UNU.RAN C library be included directly within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub repository as a
>> git submodule (e.g. gh-13328)?
>>
>> About Me
>> ********
>> I am Tirth (@tirthasheshpatel on GitHub), a third-year computer
>> science undergrad student. I am quite familiar with Cython and a lot
>> of my college courses make use of C. I have a good knowledge of
>> probability theory and statistics.
>>
>> Open Source work: I participated in GSoC with the PyMC team last
>> year. I have been contributing to SciPy since May 2020 and recently
>> became a maintainer.
>>
>> About Project
>> *************
>> I have a question about the project. Should the user interface be a
>> separate Python submodule inside `scipy.stats`, like this:
>>
>>     import scipy.stats as stats
>>
>>     # sample 1000 variates from a normal distribution
>>     # with mean 0 and std 1.5. Let UNU.RAN choose the method
>>     rvs = stats.random.normal(0., 1.5, size=1000, method='auto')
>>
>>     # draw 100 variates from the beta distribution using the TDR method
>>     beta_rvs = stats.random.beta(1, 2, size=100, method='tdr')
>>
>>     # the `rvs` method remains unaffected.
>>     norm_rvs = stats.norm.rvs(0, 1.5, size=1000)
>>
>> Or should it be an extension of the `rvs` method:
>>
>>     from scipy.stats import norm, beta
>>
>>     # something like this:
>>     # method = None => same behaviour as previous versions
>>     # method = 'auto' => use UNU.RAN and let it choose the method
>>     rvs = norm.rvs(0, 1.5, size=1000, method='auto')
>>
>>     beta_rvs = beta.rvs(1, 2, size=100, method='tdr')
>>
>> Also, should the UNU.RAN C library be included directly within SciPy
>> (e.g. gh-12043) or be cloned from a separate GitHub repository as a
>> git submodule (e.g. gh-13328)?
>>
>>
>> --
>> Kind Regards,
>> Tirth
>>
>>
>
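
On the rejection / RoU point, here is a rough sketch of the ratio-of-uniforms
idea in plain Python, mostly for my own reference. `rou_sample` and
`normal_kernel` are made-up names, the bounding rectangle is worked out by
hand for the standard normal kernel, and it is not meant to mirror how
geninvgauss or UNU.RAN actually implement the method:

```python
import math
import random


def rou_sample(pdf, umax, vmin, vmax, size, rng=None):
    """Ratio-of-uniforms rejection sampler (hypothetical helper).

    `pdf` may be unnormalized. The rectangle must satisfy
    umax >= sup sqrt(pdf(x)), vmin <= inf x * sqrt(pdf(x)) and
    vmax >= sup x * sqrt(pdf(x)); choosing it is the caller's job.
    """
    rng = rng or random.Random()
    out = []
    while len(out) < size:
        u = rng.uniform(0.0, umax)
        v = rng.uniform(vmin, vmax)
        if u == 0.0:
            continue              # avoid division by zero
        x = v / u
        if u * u <= pdf(x):       # accept when (u, v) lies in the RoU region
            out.append(x)
    return out


def normal_kernel(x):
    # Unnormalized standard normal density
    return math.exp(-0.5 * x * x)


# For this kernel sup sqrt(pdf) = 1 and sup |x| * sqrt(pdf) = sqrt(2 / e).
v_bound = math.sqrt(2.0 / math.e)
normal_draws = rou_sample(normal_kernel, 1.0, -v_bound, v_bound,
                          size=5000, rng=random.Random(1))
```

A pure-Python loop like this is where a compiled implementation could plausibly
save time, which is how I read the geninvgauss suggestion.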


-- 
Kind Regards,
Tirth Patel

