[SciPy-Dev] Generate random variates using Cython

Christoph Baumgarten christoph.baumgarten at gmail.com
Wed Jan 22 15:01:43 EST 2020


Hi,

After upgrading to Numpy 1.18, I could indeed compile the code
https://docs.scipy.org/doc/numpy-1.17.0/reference/random/extending.html and
it gives a substantial speedup.

I am still wondering how to integrate such an approach into SciPy. To
generate uniform samples in the _rvs method of a continuous distribution i
scipy.stats, I would write:

self._random_state.random_sample

If I want to use Cython, how can I ensure that it relies on the same random
state?

In the future, one needs to support RandomState and Generator (cross-ref to
the recent PR https://github.com/scipy/scipy/pull/11402)

Thanks for your help.

Christoph

P.S. code used

import numpy as np
cimport numpy as np
cimport cython
from cpython.pycapsule cimport PyCapsule_IsValid, PyCapsule_GetPointer
from numpy.random cimport bitgen_t
from numpy.random import PCG64

@cython.boundscheck(False)
@cython.wraparound(False)
def rvs(Py_ssize_t n):
    cdef Py_ssize_t i
    cdef bitgen_t *rng
    cdef const char *capsule_name = "BitGenerator"
    cdef double u, v
    cdef double[::1] random_values
    x = PCG64()
    capsule = x.capsule
    # Optional check that the capsule if from a BitGenerator
    if not PyCapsule_IsValid(capsule, capsule_name):
        raise ValueError("Invalid pointer to anon_func_state")
    # Cast the pointer
    rng = <bitgen_t *> PyCapsule_GetPointer(capsule, capsule_name)
    random_values = np.empty(n, dtype='float64')
    with x.lock, nogil:
        for i in range(n):
            while (1):
                u = rng.next_double(rng.state)
                v = rng.next_double(rng.state)
                if u <= v*v:
                    break
            random_values[i] = v
    randoms = np.asarray(random_values)
    return randoms




On Thu, Jan 16, 2020 at 12:46 PM <scipy-dev-request at python.org> wrote:

> Send SciPy-Dev mailing list submissions to
>         scipy-dev at python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://mail.python.org/mailman/listinfo/scipy-dev
> or, via email, send a message with subject or body 'help' to
>         scipy-dev-request at python.org
>
> You can reach the person managing the list at
>         scipy-dev-owner at python.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of SciPy-Dev digest..."
>
>
> Today's Topics:
>
>    1. Generate random variates using Cython (Christoph Baumgarten)
>    2. Re: Generate random variates using Cython (Matti Picus)
>    3. Re: Missing whitespace after commas in pycodestyle check
>       (Joshua Wilson)
>    4. cuSignal: GPU accelerated version of SciPy Signal package
>       (Robert Lucente - Pipeline.Com)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 15 Jan 2020 20:14:07 +0100
> From: Christoph Baumgarten <christoph.baumgarten at gmail.com>
> To: scipy-dev at python.org
> Subject: [SciPy-Dev] Generate random variates using Cython
> Message-ID:
>         <
> CABXY2qC6PLydMVi5on7x3tqn2qED5nBD19ACtKA81OYm_Y4GAA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi,
>
> I recently looked into rejection algorithms to generate random variates
> that rely on repeatedly generating uniform random variates until a
> termination condition is satisfied.
> A pure Python implementation is reasonably fast but much slower than a C
> implementation (e.g. Poisson distribution in numpy). I tried to use Cython
> to adapt my Python code (I just started to look into Cython, so it could be
> that I overlook something very basic) but I did not succeed and I would
> like to ask for help.
>
> The following code works that generates rvs on [0, 1] with density 3 x**2
> as an example (compiling it generates a warning though: Warning Msg: Using
> deprecated NumPy API, disable it with #define NPY_NO_DEPRECATED_API
> NPY_1_7_API_VERSION)  but I struggle to get rid of the Python interaction
> np.random.rand() to speed it up.
>
> cimport numpy as np
> from numpy cimport ndarray, float64_t
> import numpy as np
> def rvs(int N):
>     cdef double u, v
>     cdef int i
>     cdef np.ndarray[np.float64_t, ndim=1] x = np.empty((N, ),
> dtype=np.float64)
>     for i in range(N):
>         while (1):
>             u = np.random.rand()
>             v = np.random.rand()
>             if u <= v*v:
>                 break
>         x[i] = v
>     return x
>
> Following https://numpy.org/devdocs/reference/random/c-api.html (Cython
> API
> for random), I tried
> cimport numpy.random to use random_standard_uniform.
> However, I get an error: 'numpy\random.pxd' not found
>
> I get similar errors when trying to compile the Cython examles here:
> https://docs.scipy.org/doc/numpy-1.17.0/reference/random/extending.html
>
> Cython version: 0.29.14
> Numpy 1.17.4
> python 3.7.1
>
> Any guidance on how to access the uniform distribution in numpy using
> Cython would be great. Thanks
>
> Christoph
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <
> http://mail.python.org/pipermail/scipy-dev/attachments/20200115/077cdcf8/attachment-0001.html
> >
>
> ------------------------------
>
> Message: 2
> Date: Thu, 16 Jan 2020 07:14:41 +1100
> From: Matti Picus <matti.picus at gmail.com>
> To: scipy-dev at python.org
> Subject: Re: [SciPy-Dev] Generate random variates using Cython
> Message-ID: <5a73b945-afd1-e685-6d32-8cd2ebd822e5 at gmail.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> On 16/1/20 6:14 am, Christoph Baumgarten wrote:
> >
> > Hi,
> >
> > I recently looked into rejection algorithms to generate random
> > variates that rely on repeatedly generating uniform random variates
> > until a termination condition is satisfied.
> > A pure Python implementation is reasonably fast but much slower than a
> > C implementation (e.g. Poisson distribution in numpy). I tried to use
> > Cython to adapt my Python code (I just started to look into Cython, so
> > it could be that I overlook something very basic) but I did not
> > succeed and I would like to ask for help.
> >
> > The following code works that generates rvs on [0, 1] with density 3
> > x**2 as an example (compiling it generates a warning though: Warning
> > Msg: Using deprecated NumPy API, disable it with #define
> > NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION)? but I struggle to get rid
> > of the Python interaction np.random.rand() to speed it up.
> >
> > cimport numpy as np
> > from numpy cimport ndarray, float64_t
> > import numpy as np
> > def rvs(int N):
> > ??? cdef double u, v
> > ??? cdef int i
> > ??? cdef np.ndarray[np.float64_t, ndim=1] x = np.empty((N, ),
> > dtype=np.float64)
> > ??? for i in range(N):
> > ??????? while (1):
> > ??????????? u = np.random.rand()
> > ??????????? v = np.random.rand()
> > ??????????? if u <= v*v:
> > ??????????????? break
> > ??????? x[i] = v
> > ??? return x
> >
> > Following https://numpy.org/devdocs/reference/random/c-api.html
> > (Cython API for random), I tried
> > cimport numpy.random to use random_standard_uniform.
> > However, I get an error: 'numpy\random.pxd' not found
> >
> > I get similar errors when trying to compile the Cython examles here:
> > https://docs.scipy.org/doc/numpy-1.17.0/reference/random/extending.html
> >
> > Cython version: 0.29.14
> > Numpy 1.17.4
> > python 3.7.1
> >
> > Any guidance on how to access the uniform distribution in numpy using
> > Cython would be great. Thanks
> >
> > Christoph
> >
>
> The new random c-api was broken in NumPy 1.17. Please try with NumPy 1.18.
>
> Matti
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 15 Jan 2020 19:06:18 -0800
> From: Joshua Wilson <josh.craig.wilson at gmail.com>
> To: SciPy Developers List <scipy-dev at python.org>
> Subject: Re: [SciPy-Dev] Missing whitespace after commas in
>         pycodestyle check
> Message-ID:
>         <CAKFGQGz83YjT5Zx3n=
> 52M4xAkVA+hq9LhviywUEYnk2YB8y9Qg at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Seems like there are no objections to adding the checks to just the
> diffs; I'll set that up in upcoming weeks. Very much looking forward
> to eliminating that aspect of code review.
>
> - Josh
>
> On Tue, Jan 14, 2020 at 2:47 AM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
> >
> >
> >
> > On Tue, Jan 14, 2020 at 12:15 AM Ilhan Polat <ilhanpolat at gmail.com>
> wrote:
> >>
> >> I actually got pretty quick at fixing these stuff by the aid of some
> very mild nonaggressive scripts. Hence if we all agree I can go through all
> repo and fix everything once and ask the current PR owners to rebase.
> >
> >
> > You mean you want to fix up all PRs, right? Or all code in master? If we
> implement the diff checks I'm not sure the former is needed, but perhaps it
> helps. If the latter, I don't think that's a good idea.
> >
> > Ralf
> >
> >
> >>
> >> Sounds like a chore to everyone but I guess biting the bullet once and
> then turning on all the checks seems the least painful way. Otherwise
> Ralf's suggestion is also a good idea.
> >>
> >> On Mon, Jan 13, 2020 at 11:10 PM Ralf Gommers <ralf.gommers at gmail.com>
> wrote:
> >>>
> >>> Hey Josh,
> >>>
> >>>
> >>> On Mon, Jan 13, 2020 at 1:44 AM Joshua Wilson <
> josh.craig.wilson at gmail.com> wrote:
> >>>>
> >>>> Hey everyone,
> >>>>
> >>>> During code review, it seems that we care about missing whitespace
> >>>> after commas. (I have no data on this, but I suspect it is our most
> >>>> common linting nit.) Linting for this is currently disabled. Now on
> >>>> the one hand, we generally avoid turning on checks like this because
> >>>> of the disruption to open PRs and what they do to git history. On the
> >>>> other hand, I don't think enforcing this check by hand works.
> >>>> Reminding an author to add spaces after commas can easily add several
> >>>> rounds of extra review (some places almost invariably get missed on
> >>>> the first try, and then the problem tends to sneak back in in later
> >>>> commits), and I usually just give up after a while. So I don't think
> >>>> we are heading in the direction of "in 5 years all the bad cases will
> >>>> be gone and we can turn on the check without disruption".
> >>>
> >>>
> >>> I agree, the current workflow is not optimal.
> >>>
> >>>>
> >>>> So I would kind of like to push us to either:
> >>>>
> >>>> 1. Turn the lint check on. We can do this module-by-module or even
> >>>> file-by-file if necessary to minimize disruption.
> >>>> 2. Declare that this is not something we care about in code review.
> >>>
> >>>
> >>> Another option would be to run pycodestyle/pyflakes with all checks we
> want enabled, but only on the diff in the PR. This is what PyTorch does for
> example. The nice thing is you can test what you want without having to
> deal with code churn before enabling a check. The downside is that it's a
> bit convoluted to reproduce the "run on the diff to master" command locally
> (but we could add a script to streamline that probably).
> >>>
> >>> Cheers,
> >>> Ralf
> >>>
> >>> _______________________________________________
> >>> SciPy-Dev mailing list
> >>> SciPy-Dev at python.org
> >>> https://mail.python.org/mailman/listinfo/scipy-dev
> >>
> >> _______________________________________________
> >> SciPy-Dev mailing list
> >> SciPy-Dev at python.org
> >> https://mail.python.org/mailman/listinfo/scipy-dev
> >
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev at python.org
> > https://mail.python.org/mailman/listinfo/scipy-dev
>
>
> ------------------------------
>
> Message: 4
> Date: Thu, 16 Jan 2020 06:45:05 -0500
> From: "Robert Lucente - Pipeline.Com" <rlucente at pipeline.com>
> To: "'SciPy Developers List'" <scipy-dev at python.org>
> Cc: "Robert Gmail Backup 2 Lucente"
>         <robert.backup.2.lucente at gmail.com>
> Subject: [SciPy-Dev] cuSignal: GPU accelerated version of SciPy Signal
>         package
> Message-ID: <038f01d5cc62$6572a090$3057e1b0$@pipeline.com>
> Content-Type: text/plain;       charset="us-ascii"
>
> https://github.com/rapidsai/cusignal
>
> I was wondering what people thought of cuSignal?
>
> If the topic is not appropriate for this mailing list, please consider
> replying directly to me.
>
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
>
> ------------------------------
>
> End of SciPy-Dev Digest, Vol 195, Issue 13
> ******************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20200122/a40dacd5/attachment-0001.html>


More information about the SciPy-Dev mailing list