[Numpy-discussion] locking np.random.Generator in a cython nogil context?

Robert Kern robert.kern at gmail.com
Thu Dec 17 10:17:55 EST 2020


On Thu, Dec 17, 2020 at 9:56 AM Evgeni Burovski <evgeny.burovskiy at gmail.com>
wrote:

> On Thu, Dec 17, 2020 at 1:01 PM Matti Picus <matti.picus at gmail.com> wrote:
> >
> >
> > On 12/17/20 11:47 AM, Evgeni Burovski wrote:
> > > Just as a side note, this is not very prominent in the docs, and I'm
> > > ready to volunteer to send a doc PR --- I'm only not sure which part
> > > of the docs, and would appreciate a pointer.
> >
> > Maybe here
> >
> >
> https://numpy.org/devdocs/reference/random/bit_generators/index.html#seeding-and-entropy
> >
> > which is here in the sources
> >
> >
> https://github.com/numpy/numpy/blob/master/doc/source/reference/random/bit_generators/index.rst#seeding-and-entropy
> >
> >
> > And/or in the SeedSequence docstring documentation
> >
> >
> https://numpy.org/devdocs/reference/random/bit_generators/generated/numpy.random.SeedSequence.html#numpy.random.SeedSequence
> >
> > which is here in the sources
> >
> >
> https://github.com/numpy/numpy/blob/master/numpy/random/bit_generator.pyx#L255
>
>
> Here's the PR, https://github.com/numpy/numpy/pull/18014
>
> Two minor comments, both OT for the PR:
>
> 1. The recommendation to seed the generators from the OS --- I've been
> bitten by exactly this once. That was a rather exotic combination of a
> vendor RNG and a batch queueing system, and some of my runs did end up
> with identical random streams. Given that the recommendation is what
> it is, it probably means that experience is a singular point and it no
> longer happens with modern generators.
>

I suspect the vendor RNG was rolling its own entropy using time. We use
`secrets.randbits()`, which ultimately uses the best cryptographic
entropy source available. And if there is no cryptographic entropy source
available, I think we fail hard instead of falling back to less reliable
things like time. I'm not entirely sure that's a feature, but it is safe!
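
A minimal sketch of that seeding behaviour, for illustration only. It assumes
the documented behaviour that `SeedSequence()` called with no arguments draws
its own entropy from the OS and exposes it as `.entropy`; the variable names
are hypothetical, and the 128-bit width in the last line is an assumption
about the default rather than something stated in this thread. The
standard-library call is spelled `secrets.randbits()`.

```python
import secrets

import numpy as np

# With no arguments, SeedSequence pulls fresh entropy from the OS.
ss = np.random.SeedSequence()
rng = np.random.default_rng(ss)

# Record the entropy so that this particular run can be replayed later.
recorded_entropy = ss.entropy

# Re-creating the SeedSequence from the recorded value reproduces the stream.
replayed = np.random.default_rng(np.random.SeedSequence(recorded_entropy))
first = rng.integers(0, 2**32, size=4)
again = replayed.integers(0, 2**32, size=4)

# The same kind of OS entropy can also be drawn explicitly
# (128 bits is an assumed width, chosen here for illustration).
explicit_entropy = secrets.randbits(128)
```

Logging `recorded_entropy` alongside the results is one way to keep
OS-seeded runs reproducible after the fact.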


> 2. Robert's comment that `SeedSequence(..., spawn_key=(num,))`  is not
> equivalent to `SeedSequence(...).spawn(num)[num]` and that the former
> is not recommended. I'm not questioning the recommendation, but then
> __repr__ seems to suggest the equivalence:
>

I was saying that they were equivalent. That's precisely why it's not
recommended: it's too easy to do both and get identical streams
inadvertently.
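
The equivalence can be checked with a short sketch like the one below. The
entropy value and variable names are hypothetical. Note that `spawn(n)`
returns `n` children indexed `0..n-1`, so the sketch spawns `num + 1`
children in order to index child `num`.

```python
import numpy as np

entropy = 12345  # hypothetical fixed entropy, for illustration only
num = 3

# Child number `num` obtained through the recommended spawn() API.
child_via_spawn = np.random.SeedSequence(entropy).spawn(num + 1)[num]

# The same child constructed by hand through spawn_key (not recommended).
child_via_key = np.random.SeedSequence(entropy, spawn_key=(num,))

# Both carry the same spawn_key and generate the same state words...
same_key = child_via_spawn.spawn_key == child_via_key.spawn_key
same_state = np.array_equal(
    child_via_spawn.generate_state(4),
    child_via_key.generate_state(4),
)

# ...so generators seeded from them produce identical streams.
a = np.random.default_rng(child_via_spawn).random(3)
b = np.random.default_rng(child_via_key).random(3)
```

Mixing the two styles in one program is exactly how two workers can end up
sharing a stream without anyone noticing.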

-- 
Robert Kern