[Numpy-discussion] Adopt Mersenne Twister 64bit?

josef.pktd at gmail.com josef.pktd at gmail.com
Tue Mar 12 20:48:14 EDT 2013


On Tue, Mar 12, 2013 at 7:10 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Tue, Mar 12, 2013 at 10:38 PM, Neal Becker <ndbecker2 at gmail.com> wrote:
>> Nathaniel Smith wrote:
>>
>>> On Tue, Mar 12, 2013 at 9:25 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>>> On Mon, Mar 11, 2013 at 9:46 AM, Robert Kern <robert.kern at gmail.com> wrote:
>>>>> On Sun, Mar 10, 2013 at 6:12 PM, Siu Kwan Lam <siu at continuum.io> wrote:
>>>>>> My suggestion to overcome (1) and (2) is to allow the user to select
>>>>>> between the two implementations (and possibly different algorithms in the
>>>>>> future). If user does not provide a choice, we use the MT19937-32 by
>>>>>> default.
>>>>>>
>>>>>>         numpy.random.set_state("MT19937_64", …)   # choose the 64-bit implementation
>>>>>
>>>>> Most likely, the different PRNGs should be different subclasses of
>>>>> RandomState. The module-level convenience API should probably be left
>>>>> alone. If you need to control the PRNG that you are using, you really
>>>>> need to be passing around a RandomState instance and not relying on
>>>>> reseeding the shared global instance.
>>>>
>>>> +1
>>>>
>>>>> Aside: I really wish we hadn't
>>>>> exposed `set_state()` in the module API. It's an attractive nuisance.
>>>>
>>>> And our own test suite is a serious offender in this regard, we have
>>>> tests that fail if you run the test suite in a non-default order...
>>>>   https://github.com/numpy/numpy/issues/347
>>>>
>>>> I wonder if we dare deprecate it? The whole idea of a global random
>>>> state is just a bad one, like every other sort of global shared state.
>>>> But it's one that's deeply baked into a lot of scientific programmers'
>>>> expectations about how APIs work...
>>>
>>> (To be clear, by 'it' here I meant np.random.set_seed(), not the whole
>>> np.random API. Probably. And by 'deprecate' I mean 'whine loudly in
>>> some fashion when people use it', not 'rip out in a few releases'. I
>>> think.)
>>>
>>> -n
>>
>> What do you mean that the idea of global shared state is a bad one?
>
> The words "global shared state" strike fear into the hearts of
> experienced programmers everywhere, whatever the context. :-) It's
> rarely a *good* idea.
>
>> How would
>> you prefer the API to look?
>
> There are two current APIs:
>
> 1. Instantiate RandomState and call its methods
> 2. Just call the functions in numpy.random
>
> The latter has a shared global state. In fact, all of those
> "functions" are just references to the methods on a shared global
> RandomState instance.
>
> We advocate using the former API. Note that it already exists. It was
> the recommended API from day one. No one is recommending adding a new
> API.

I never saw much advertising for the RandomState API, and until
recently I wasn't sure why using the global random state functions
(np.random.normal, ...) should be a bad idea.
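
A minimal sketch of the problem, as I understand it: with the global
functions, any other code that draws a number shifts the stream you get,
while a private RandomState instance is unaffected. The seed value and
the helper name draw_with_instance are only illustrative.

```python
import numpy as np


def draw_with_instance(seed, n=3):
    """Draw n standard normals from a private RandomState (illustrative helper)."""
    rng = np.random.RandomState(seed)
    return rng.normal(size=n)


# API 1: a private instance. Unrelated use of the global functions
# in between does not change what we get for the same seed.
a = draw_with_instance(12345)
np.random.normal(size=10)  # some other code using the global state
b = draw_with_instance(12345)
same_with_instance = np.array_equal(a, b)

# API 2: the shared global state. Another draw between seeding and
# our own call shifts the stream, so the results differ.
np.random.seed(12345)
c = np.random.normal(size=3)
np.random.seed(12345)
np.random.normal(size=1)  # some other code consuming from the shared stream
d = np.random.normal(size=3)
same_with_global = np.array_equal(c, d)

print(same_with_instance, same_with_global)
```

The second case is what bites in test suites: the result depends on
what else ran in between, not only on the seed.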

Learning by example, and seeing almost all examples using the global
state, is not exactly conducive to figuring out that there is an
issue.

All of the random number generation in scipy.stats.distributions uses
the global random state. (I guess I should open a ticket.)
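
One way a library function could avoid the global state is to accept a
RandomState instance from the caller. This is only a sketch under my own
assumptions: sample_normal and its random_state argument are hypothetical
names, not something scipy.stats provides in this form.

```python
import numpy as np


def sample_normal(size, random_state=None):
    """Hypothetical helper: draw standard normals from an explicit generator.

    random_state may be None (use a fresh, unseeded instance) or an
    existing numpy.random.RandomState supplied by the caller.
    """
    if random_state is None:
        # A new instance seeded from the OS; the global state is not used.
        random_state = np.random.RandomState()
    return random_state.normal(size=size)


# Two instances with the same seed give the same draws.
x = sample_normal(5, np.random.RandomState(0))
y = sample_normal(5, np.random.RandomState(0))
reproducible = np.array_equal(x, y)

# The shared global state is left untouched by the helper.
before = np.random.get_state()
sample_normal(5, np.random.RandomState(1))
after = np.random.get_state()
global_untouched = np.array_equal(before[1], after[1]) and before[2] == after[2]

print(reproducible, global_untouched)
```

The caller decides which generator is used, so two parts of a program
can draw random numbers without interfering with each other.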

Josef

>
>> An alternative is a stateless rng, where you have
>> to pass it its state on each invocation, which it would update and return.  I
>> hope you're not advocating that.
>
> No. This is a place where OOP solved the problem neatly.
>
> --
> Robert Kern
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


