[Numpy-discussion] random number generator, entropy and pickling

Gael Varoquaux gael.varoquaux at normalesup.org
Mon Apr 25 14:15:24 EDT 2011


On Mon, Apr 25, 2011 at 11:05:05AM -0700, T J wrote:
> If code A relies on code B (eg, some numpy function) and code B
> changes, then the stream of random numbers will no longer be the same.
>  The point here is that the user wrote code A but depended on code B,
> and even though code A was unchanged, their random numbers were not
> the same.

Yes, that's exactly why we want the different objects to be able to receive
their own PRNG.
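
To make that concrete, here is a minimal sketch of an object that owns its
PRNG. The class name `Sampler` and its `draw` method are hypothetical and
only serve the illustration; the point is that draws made elsewhere on the
global numpy stream do not disturb the object's own stream.

```python
import numpy as np


class Sampler:
    """Hypothetical object that carries its own PRNG."""

    def __init__(self, seed):
        # A private RandomState, independent of numpy's global PRNG.
        self.prng = np.random.RandomState(seed)

    def draw(self, n):
        return self.prng.rand(n)


first = Sampler(0).draw(3)

# Unrelated code (the "code B" of the example above) consumes numbers
# from the global stream in between.
np.random.rand(10)

# A fresh object with the same seed still reproduces the same numbers.
second = Sampler(0).draw(3)
```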

> The situation is improved if scikits.learn used its own global
> RandomState instance.  Then code A will at least give the same stream
> of random numbers for a fixed version of scikits.learn.  It should be
> made very clear though that the data stream cannot be expected to be
> the same across versions.

The use case that we are trying to cater for, with the global PRNG, is
that of Mr. Joe Average, who is used to seeding the numpy PRNG to control
what is going on. In my experience, the less you need to teach Mr. Joe A.,
the better (I am not dumbing down Joe A., just acknowledging the fact
that he probably has many other things to worry about).

> As to each object having its own RandomState instance, I definitely
> see that it makes restoring the overall state of a piece of code
> harder, but perhaps utility functions could make this easier. 

That's what we are leaning toward: a utility function that by default
returns the numpy PRNG object, but enables the use of specific PRNGs or
seeds. In other words, we are thinking of following Robert's suggestion
(option 'a' in the original mail, but enriched with Robert's input on
mtrand.rand). We'll probably wait a bit for feedback before making a
decision.
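
A rough sketch of what such a utility function could look like is below.
Nothing here is decided: the name `get_prng` is hypothetical, and the
use of the private singleton `np.random.mtrand._rand` to reach the global
numpy PRNG is an assumption of this sketch, not a settled choice.

```python
import numbers

import numpy as np


def get_prng(seed=None):
    """Hypothetical helper: turn `seed` into a RandomState instance.

    None             -> the global numpy PRNG (what np.random.seed controls)
    an integer       -> a new RandomState seeded with that integer
    a RandomState    -> returned unchanged
    """
    if seed is None:
        # Assumption: the global PRNG is the private singleton in mtrand.
        return np.random.mtrand._rand
    if isinstance(seed, numbers.Integral):
        return np.random.RandomState(seed)
    if isinstance(seed, np.random.RandomState):
        return seed
    raise ValueError("%r cannot be used to obtain a RandomState" % (seed,))
```

With this, Joe A. can keep calling np.random.seed() and get the behaviour
he expects from the default, while somebody who wants a reproducible,
isolated stream passes an integer or their own RandomState.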

Thanks for all your input,

G


