[Numpy-discussion] locking np.random.Generator in a cython nogil context?

Robert Kern robert.kern at gmail.com
Mon Dec 14 16:59:23 EST 2020


On Mon, Dec 14, 2020 at 3:27 PM Evgeni Burovski <evgeny.burovskiy at gmail.com>
wrote:

> <snip>
>
> > I also think that the lock only matters for Multithreaded code not
> Multiprocess.  I believe the latter pickles and unpickles any Generator
> object (and the underlying BitGenerator) and so each process has its own
> version.  Note that when multiprocessing the recommended procedure is to
> use spawn() to generate a sequence of BitGenerators and to use a distinct
> BitGenerator in each process. If you do this then you are free from the
> lock.
>
> Thanks. Just to confirm: does using SeedSequence spawn_key arg
> generate distinct BitGenerators? As in
>
> cdef class Wrapper():
>     def __init__(self, seed):
>         entropy, num = seed
>         py_gen = PCG64(SeedSequence(entropy, spawn_key=(spawn_key,)))
>         self.rng = <bitgen_t *>py_gen.capsule.PyCapsule_GetPointer(capsule, "BitGenerator")    # <--- this
>
> cdef Wrapper rng_0 = Wrapper(seed=(123, 0))
> cdef Wrapper rng_1 = Wrapper(seed=(123, 1))
>
> And then, do these two objects have distinct BitGenerators?
>

The code you wrote doesn't work (`spawn_key` is never assigned). I can
guess what you meant to write, though, and yes, you would get distinct
`BitGenerator`s. However, I do not recommend using `spawn_key` explicitly.
The `SeedSequence.spawn()` method internally keeps track of how many
children it has spawned and uses that to construct the `spawn_key`s for its
subsequent children. If you play around with making your own `spawn_key`s,
then the parent `SeedSequence(entropy)` might spawn identical
`SeedSequence`s to the ones you constructed.
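
To make the collision concrete, here is a small sketch in plain Python (not Cython). The entropy value 123 is just the one from your example, and the variable names are only illustrative.

```python
from numpy.random import SeedSequence

parent = SeedSequence(123)

# A hand-built SeedSequence using an explicit spawn_key.
manual = SeedSequence(123, spawn_key=(0,))

# The first child spawned by the parent is assigned spawn_key=(0,) as well.
child = parent.spawn(1)[0]

# Same entropy and same spawn_key, so both produce the same state words.
same = bool((manual.generate_state(4) == child.generate_state(4)).all())
print(child.spawn_key, same)
```

Any BitGenerator seeded from `manual` would therefore produce the same stream as one seeded from `child`, which is exactly the overlap to avoid.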

If you don't want to use the `spawn()` API to construct the separate
`SeedSequence`s but still want to incorporate some per-process information
into the seeds (e.g. the 0 and 1 in your example), then note that a tuple
of integers is a valid value for the `entropy` argument. You can have the
first item be the same (i.e. per-run information) and the second item be a
per-process ID or counter.

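from cpython.pycapsule cimport PyCapsule_GetPointer
from numpy.random cimport bitgen_t
from numpy.random import PCG64, SeedSequence

cdef class Wrapper():
    cdef object py_gen    # keep the Python BitGenerator alive
    cdef bitgen_t *rng    # borrowed pointer, valid while py_gen lives

    def __init__(self, seed):
        # seed may be a tuple of ints, e.g. (per_run, per_process)
        self.py_gen = PCG64(SeedSequence(seed))
        self.rng = <bitgen_t *>PyCapsule_GetPointer(self.py_gen.capsule, "BitGenerator")

cdef Wrapper rng_0 = Wrapper(seed=(123, 0))
cdef Wrapper rng_1 = Wrapper(seed=(123, 1))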
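
If you want to sanity-check the tuple-seed approach outside Cython, something like the following plain-Python sketch should do. `make_generator` is a hypothetical helper of mine, not a NumPy function, and 123 is again just the value from your example.

```python
from numpy.random import PCG64, Generator, SeedSequence

def make_generator(run_seed, worker_id):
    # Hypothetical helper: the first tuple item identifies the run,
    # the second identifies the process or worker.
    return Generator(PCG64(SeedSequence((run_seed, worker_id))))

a = make_generator(123, 0).integers(0, 2**32, size=4)
b = make_generator(123, 1).integers(0, 2**32, size=4)
a_again = make_generator(123, 0).integers(0, 2**32, size=4)

# Different worker IDs give different streams; the same tuple is reproducible.
print(bool((a == b).all()), bool((a == a_again).all()))
```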

-- 
Robert Kern