[Numpy-discussion] stability of numpy.random.RandomState API?

Thu Nov 6 17:58:46 EST 2008

On Thu, Nov 6, 2008 at 1:55 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Thu, Nov 6, 2008 at 15:12, Barry Wark <barrywark at gmail.com> wrote:
>> On Thu, Nov 6, 2008 at 12:09 PM, Robert Kern <robert.kern at gmail.com> wrote:
>>> On Thu, Nov 6, 2008 at 14:05, Barry Wark <barrywark at gmail.com> wrote:
>>>> I'm just about to embark on a long-term research project and was
>>>> planning to use numpy.random to generate stimuli for our experiments.
>>>> We plan to store only the parameters and RandomState seed for each
>>>> stimulus and I'm concerned about stability of the API in the long
>>>> term: will the parameters and random seed we store now work with
>>>> future versions of numpy.random?
>>>
>>> It should. But just in case, make sure you explicitly instantiate
>>> RandomState objects instead of using the functions in numpy.random.
>>> That way, should we need to fix some bug that might change the
>>> results, you can always pull out the current mtrand code and use it
>>> independently.
>>
>> That is our working plan, along with recording the numpy.__version__
>> that was used to generate the original stimulus. Thanks for the
>> confirmation.
>>
>> On a side note, this seems like a potentially big issue for many
>> scientific users. Perhaps making a policy of keeping incompatible
>> revisions to RandomState noted in its documentation (if they ever
>> come up) would be useful. Even better, a module function or class
>> method that returns an instance of RandomState as it was at a
>> particular numpy version:
>>
>> r = numpy.random.RandomState.from_version(my_numpy_version, seed=None)
>>
>> Hmm. Sounds like a bit of work. I'll give it a go, if you think this
>> is a valuable approach.
>>
>>>
>>>> I think I recall that there was a
>>>> change in the random seed format some time around numpy 1.0.
>>>
>>> I don't think I changed it after 1.0. Before 1.0, we explicitly warned
>>> people about API instability.
>>
>> I believe you. We've been developing this app since before numpy 1.0,
>> so I'm sure the issue cropped up from data generated pre-1.0.
>
> Okay. Actually, now that I think about it, there have been changes
> that would affect results using the nonuniform distributions. These
> should only have arisen from fixing bugs (i.e. the previous results
> were wrong, not just different). Do you have any thoughts on how you
> would want us to handle that case?

In our usage (neural physiology), we've recorded the physiological
response to a given stimulus. So being able to recover the _exact_
original stimulus that produced the recorded data is critical. This is
why I suggested an API which would let us get an instance of the
RandomState as it was at a particular revision (including bugs) so
that we could regenerate the exact original sequence. Obviously, we're
happy to have the bug fixes in, and continue to use the current
RandomState for new experiments.
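
For concreteness, here is a minimal sketch of the workflow discussed in this thread: instantiate RandomState explicitly, and store the seed, the stimulus parameters and numpy.__version__ with each stimulus. The helper names (make_stimulus_record, generate_stimulus, version_matches) and the record fields are hypothetical, made up for illustration; only RandomState, standard_normal and numpy.__version__ come from numpy itself.

```python
import json

import numpy as np


def make_stimulus_record(seed, n_samples):
    """Collect everything needed to regenerate a stimulus (hypothetical layout)."""
    return {
        "numpy_version": np.__version__,  # version that produced the original
        "seed": int(seed),
        "n_samples": int(n_samples),
    }


def generate_stimulus(record):
    """Regenerate the stimulus from a stored record.

    A private RandomState instance is used instead of the module-level
    functions in numpy.random, so no other code can disturb the stream.
    """
    rng = np.random.RandomState(record["seed"])
    return rng.standard_normal(record["n_samples"])


def version_matches(record):
    """True if the running numpy is the version that made the record."""
    return record["numpy_version"] == np.__version__


# Store only the record (here as JSON), not the samples themselves.
record = make_stimulus_record(seed=12345, n_samples=1000)
stored = json.dumps(record)

original = generate_stimulus(record)
regenerated = generate_stimulus(json.loads(stored))
```

Within a single numpy version the two arrays are identical. The version check is only a flag: if it fails, the regenerated sequence may differ (for example after a bug fix in a nonuniform distribution), and the original mtrand code would be needed to recover the exact stimulus.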

>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>  -- Umberto Eco