[Numpy-discussion] Moving NumPy's PRNG Forward

Robert Kern robert.kern at gmail.com
Fri Jan 19 17:34:12 EST 2018


On Sat, Jan 20, 2018 at 2:27 AM, <josef.pktd at gmail.com> wrote:

> I'm not sure I fully understand.
> Is the proposal to drop stream-backward compatibility completely for the
> future or just a one-time change?

For all future.

> > No version-selection API would be required as you select the version by
> > installing the desired version of numpy.
>
> That's not useful if we want to have unit tests that run in the same way
> across numpy versions.
>
> There are many unit tests that rely on fixed streams and have hard coded
> results that rely on specific numbers (up to floating point, numerical
> noise).
> Giving up stream compatibility would essentially kill using np.random for
> these unit tests.

This is a use case that I am sensitive to. However, it should be noted that
relying on the exact stream for unit tests makes you vulnerable to platform
differences. That's something that we've never guaranteed (because we
can't). That said, some of the simpler distributions are more robust to
such differences, and those are fairly typical in unit tests. As
I mentioned, I am open to a small set of methods that we do guarantee
stream-compatibility for. I think that unit tests are the obvious use case
that should determine what that set is. Unit tests rarely need
`noncentral_chisquare()`, for instance.
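
To make the distinction concrete, here is a minimal sketch of the two styles
of seeded unit test. It uses only `np.random.RandomState` and its existing
`uniform()` and `normal()` methods; the seed, sample size, tolerances, and
test names are illustrative assumptions, not values from this thread. The
first test needs only that a seed reproduces its own stream within one
installation. The second checks statistical properties within a tolerance,
so it would not depend on the exact numbers drawn.

```python
import numpy as np

SEED = 12345  # arbitrary illustrative seed


def test_same_seed_reproduces_stream():
    # Within one numpy installation on one platform, the same seed
    # yields the same draws. No values are hard-coded.
    a = np.random.RandomState(SEED).uniform(size=5)
    b = np.random.RandomState(SEED).uniform(size=5)
    np.testing.assert_array_equal(a, b)


def test_normal_moments_within_tolerance():
    # Check statistical properties instead of exact stream values.
    # With 100,000 draws the standard error of the mean is about 0.003,
    # so a tolerance of 0.02 is loose for any reasonable stream.
    prng = np.random.RandomState(SEED)
    x = prng.normal(loc=0.0, scale=1.0, size=100000)
    assert abs(x.mean()) < 0.02
    assert abs(x.std() - 1.0) < 0.02
```

Tests that hard-code specific drawn values are the ones that depend on
stream compatibility across versions and platforms.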

I'd also be willing to make the API a little clunkier in order to maintain
the stable set of methods. For example, two methods that are common in unit
testing are `normal()` and `choice()`, but those have been the target of
the most attempted innovation. I'd be willing to leave them alone while
providing other methods that do the same thing but are allowed to innovate.
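
A rough sketch of what that "clunkier" shape could look like, purely as an
illustration: `SketchPRNG` and `normal_unstable()` are hypothetical names
invented here, not an existing or proposed NumPy API. The frozen method
delegates to the current `RandomState.normal()`; the second method stands in
for one that is allowed to change its algorithm, and uses a basic Box-Muller
transform only as a placeholder.

```python
import numpy as np


class SketchPRNG:
    """Hypothetical wrapper for illustration; not part of NumPy."""

    def __init__(self, seed):
        self._rs = np.random.RandomState(seed)

    def normal(self, loc=0.0, scale=1.0, size=None):
        # "Stable" method: always delegates to the existing algorithm,
        # so its stream would stay fixed for a given seed.
        return self._rs.normal(loc, scale, size)

    def normal_unstable(self, loc=0.0, scale=1.0, size=1):
        # "Innovating" method: same distribution, but the algorithm (and
        # therefore the stream) may change between releases.
        # Placeholder algorithm: basic Box-Muller from uniform draws.
        n = int(size)
        u1 = 1.0 - self._rs.random_sample(n)  # in (0, 1], avoids log(0)
        u2 = self._rs.random_sample(n)
        z = np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)
        return loc + scale * z
```

Under this sketch, unit tests would call `normal()`, while code that cares
more about new features than about a fixed stream would call the other
method.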

> Similarly, reproducibility from another user, e.g. in notebooks, would
> break without stream compatibility across numpy versions.

That is the reproducible-research use case that I discussed already. I
argued that the stability that our policy actually provides is rather more
limited than it seems on its face.

> One possibility is to keep the current stream-compatible np.random
> version and maintain it in future for those use cases, and add a new
> "high-performance" version with the new features.

That is one of the alternatives I raised.

--
Robert Kern