[Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

Antony Lee antony.lee at berkeley.edu
Sun May 24 04:22:21 EDT 2015


Hi,

As mentioned in

#1450: Patch with Ziggurat method for Normal distribution
#5158: ENH: More efficient algorithm for unweighted random choice without
replacement
#5299: using `random.choice` to sample integers in a large range
#5851: Bug in np.random.dirichlet for small alpha parameters

some methods on np.random.RandomState are either implemented non-optimally
(#1450, #5158, #5299) or have outright bugs (#5851), but cannot be easily
changed due to backwards compatibility concerns.  While some have suggested
new methods deprecating the old ones (see e.g. #5872), some consensus has
formed around the following ideas (see #5299 for original discussion,
followed by private discussions with @njsmith):

- Backwards compatibility should only be provided to those who were
explicitly instantiating a seeded RandomState object, or reseeding a
RandomState object to a given value, and drawing variates from it.  Using
the global methods (or a None-seeded RandomState) was already
non-reproducible anyway, since e.g. other libraries could be drawing variates
from the global RandomState (of which the free functions in np.random are
actually methods).  Thus, the global RandomState object should use the
latest implementation of the methods.

- "RandomState(seed)" and "r = RandomState(...); r.seed(seed)" should offer
backwards-compatibility guarantees (see e.g.
https://docs.python.org/3.4/library/random.html#notes-on-reproducibility).
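
To illustrate the first point, here is a minimal sketch of why a shared global
generator is not reproducible while a privately seeded one is.  It uses the
standard library's random module as a stand-in for np.random; the names
third_party_helper, draw_global and draw_private are hypothetical and exist
only for this example.

```python
import random


def third_party_helper():
    # Stand-in for another library that silently draws from the shared
    # global generator (hypothetical, for illustration only).
    return random.random()


def draw_global(n, interleave):
    # Seed the *global* generator, then draw n variates from it.
    random.seed(12345)
    out = []
    for _ in range(n):
        if interleave:
            third_party_helper()  # consumes a value from the same stream
        out.append(random.random())
    return out


def draw_private(n, interleave):
    # Seed a *private* generator that no other code can reach.
    rng = random.Random(12345)
    out = []
    for _ in range(n):
        if interleave:
            third_party_helper()  # touches only the global stream
        out.append(rng.random())
    return out


# The global stream changes as soon as someone else draws from it ...
global_changed = draw_global(3, False) != draw_global(3, True)
# ... whereas the private, explicitly seeded stream does not.
private_changed = draw_private(3, False) != draw_private(3, True)
```

Only the explicitly seeded, privately held object gives a stream that users
could have been relying on, which is why it is the only case where a
compatibility guarantee is meaningful.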

As such, we propose the following improvements to the API:

- RandomState gains a (keyword-only) parameter, "version", also accessible
as a read-only attribute.  This indicates the version of the methods on the
object.  The current implementation of RandomState is retroactively assigned
version 0.  The latest version is available as
np.random.LATEST_VERSION.  Backwards-incompatible improvements to
RandomState methods can be introduced, but each must increase LATEST_VERSION.

- The global RandomState is instantiated as
RandomState(version=LATEST_VERSION).

- RandomState() and rs.seed() set the version to LATEST_VERSION.

- RandomState(seed) and rs.seed(seed), with seed not None, set the version
to 0.

A proof-of-concept implementation, still missing tests, is tracked as
#5911.  It includes the patch proposed in #5158 as an example of how to
include an improved version of random.choice.

Comments and help with writing tests (in particular to make sure backwards
compatibility is maintained) are welcome.

Antony Lee