[issue21470] Better seeding for the random module

Sun May 11 22:32:22 CEST 2014

Raymond Hettinger added the comment:

Looking back over this tracker item, I realize that I didn't elaborate sufficiently on the problem being addressed:

MT is equidistributed.  This is a major point in its favor but also implies that there are long stretches of "uninteresting" sequences.  When we seed with only a subset of the state space, there is a risk of systematically landing in those stretches.

That is why the cited best practices paper recommends filling the seed space and likely is why the cited reference implementation uses /urandom to fill the state space.

We've previously had this problem with MT (since resolved), where it landed in a very non-random zone.  We can't avoid the *possibility* of landing in one of these zones, but we can make sure that it isn't a *systematic* recurring problem.  By using a sufficiently large seed, we avoid biasing the selection of which sequences are visitable out of the huge MT period.

Though the current 32 bytes is pretty good (I hope so, I'm the one that chose that constant), there is no question that there is some benefit from the larger seed and that we are following in the footsteps of published reference implementations.
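
To put rough numbers on the seed-size argument, here is a minimal sketch.  It assumes the standard MT19937 layout (624 words of 32 bits, 19937 bits of effective state); the constant names and the reachable_bits() helper are made up for illustration and are not part of the random module.

```python
import random

# Standard MT19937 parameters (assumed here, not read from the module).
MT_WORDS = 624          # number of 32-bit words in the state array
MT_WORD_BITS = 32
MT_STATE_BITS = 19937   # effective state size; the period is 2**19937 - 1

# Bytes needed to fill the whole state array.
state_bytes = MT_WORDS * MT_WORD_BITS // 8   # 2496


def reachable_bits(seed_bytes):
    """Upper bound, in bits, on how many distinct starting states a seed
    of the given size can select (hypothetical helper)."""
    return min(seed_bytes * 8, MT_STATE_BITS)


# CPython exposes the state as 624 words plus a position index.
internal_state = random.getstate()[1]

print("bytes to fill the state array:", state_bytes)
print("words in CPython's state tuple:", len(internal_state))
print("bits selectable with a 32-byte seed:", reachable_bits(32))
print("bits selectable with a 2500-byte seed:", reachable_bits(2500))
```

Under these assumptions a 32-byte seed can pick at most 2**256 starting points out of roughly 2**19937, while 2500 bytes is enough to cover the whole state array.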

The real question is whether there is an actual downside to calling urandom(2500).  AFAICT, the impact is insignificant (increasing the cost of initialization by microseconds, a percentage increase so small that I can't measure it).
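
For anyone who wants to check the cost on their own machine, a rough timing sketch follows.  It is not the patch itself: seeded_generator() and time_seeding() are hypothetical helpers, and converting the urandom bytes to an int before seeding is just one plausible way to feed them to the generator.  Results will vary by platform, so no numbers are quoted here.

```python
import os
import random
import timeit


def seeded_generator(nbytes):
    """Return a random.Random seeded from nbytes of os.urandom output
    (hypothetical helper, for illustration only)."""
    seed = int.from_bytes(os.urandom(nbytes), "big")
    return random.Random(seed)


def time_seeding(nbytes, repeats=1000):
    """Average wall-clock seconds to create one seeded generator."""
    total = timeit.timeit(lambda: seeded_generator(nbytes), number=repeats)
    return total / repeats


if __name__ == "__main__":
    for nbytes in (32, 2500):
        per_call = time_seeding(nbytes)
        print(f"urandom({nbytes}): {per_call * 1e6:.1f} microseconds per seeding")
```

Comparing the two lines of output gives the per-initialization difference between the current 32-byte seed and a 2500-byte one.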

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21470>
_______________________________________