random number generator thread safety

Mike Brown mike at skew.org
Tue Nov 8 23:43:49 EST 2005


Raymond Hettinger wrote:
> Mike Brown wrote:
> > I have questions about thread safety in the 'random' module.
> >
> > When using the random.Random class (be it Mersenne Twister or Wichmann-Hill
> > based), is it sufficiently thread-safe (preserving entropy and guarding
> > against attack) to just have each thread work with its own random.Random
> > instance? Or do I need to wrap my calls to each instance's methods to use
> > locks? Wichmann-Hill in particular has a thread-safety warning on its
> > .random() method; do I need to make an exception for that case?
> 
> Thread-safety has nothing to do with preserving entropy or guarding
> against attack.

Well, the Wichmann-Hill implementation claims to be "not thread safe" but 
really, as Paul Rubin pointed out, it's about the risk (depending on the 
implementation) that when two threads have access to the same RNG, there is a 
loss of confidence in the 'randomness' of the results that each thread sees... 
e.g., can thread 2 manipulate the state of the RNG while thread 1 is using it? 
and can thread 2 see the same result as thread 1?

I think both of these situations are alleviated just by using separate RNG 
instances, but I don't know enough to know if I should still be doing some
blocking when calling some functions & methods, such as os.urandom and
random.SystemRandom's methods.
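
A minimal sketch of the separate-instances idea, assuming a current Python
with the standard threading module. The names worker and run_workers are
made up for illustration. Each thread constructs its own random.Random, so
no other thread holds a reference to its state. This only shows the
isolation; it does not settle whether os.urandom or SystemRandom calls need
a lock.

```python
import random
import threading


def worker(results, index, count):
    # Each thread builds its own generator; nothing else can reach its state.
    rng = random.Random()  # default seeding (OS entropy or the clock)
    results[index] = [rng.random() for _ in range(count)]


def run_workers(num_threads=4, count=5):
    # One result slot per thread, so threads never write to the same index.
    results = [None] * num_threads
    threads = [
        threading.Thread(target=worker, args=(results, i, count))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```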

> Nothing in the random module provides cryptographic guarantees.

I don't need cryptographic guarantees. I just need to expose some RNG 
functions in a multithreaded application that supports Python 2.2 and up 
(hence the concern about WH).

[ I also intend to use stdlib and will work around whatever bugs/issues I know 
about, such as the undesirable NotImplementedError (i.e., I will fall back on 
Mersenne Twister if os.urandom or SystemRandom methods fail); the WH 'thread 
safety' issue (which affects my Py 2.2 users); the Py 2.3.0 'no milliseconds 
in default seed' issue (I will make sure whatever RNGs are available are 
better seeded by default); the Py 2.4.0-2.4.1 posix os.urandom filehandle 
caching/hold-open issue (I will use 2.4.2's os.urandom); and anything else 
that I should be concerned about. Since I'll be reimplementing os.urandom, 
I'll be making use of it wherever I can in 2.2-2.3, including as the preferred 
source over WH & MT, and as a better seed source than system clock. ]
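
The fallback described above could look roughly like the following sketch.
The helper name random_bytes and its fallback_rng parameter are hypothetical,
and the sketch assumes that os.urandom signals a missing randomness source
by raising NotImplementedError, as its documentation states. The fallback
bytes come from a random.Random instance and are no stronger than that
generator.

```python
import os
import random


def random_bytes(n, fallback_rng=None):
    """Hypothetical helper: prefer os.urandom, else fall back to a PRNG."""
    try:
        return os.urandom(n)
    except NotImplementedError:
        # No OS randomness source: use the supplied generator, or a fresh
        # random.Random (Mersenne Twister in Python 2.3 and later).
        rng = fallback_rng if fallback_rng is not None else random.Random()
        return bytes(bytearray(rng.randrange(256) for _ in range(n)))
```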

I am asking for advice on how to mitigate the risks related to multithreading 
-- I don't want the functions to be any less 'random' or more vulnerable than they 
would be in a single-threaded app. I don't know enough about the subject to 
know for sure whether the answer to "should I block when polling a PRNG" is 
"no", or whether the answer to "should I block when polling a system RNG 
(likely a PRNG seeded often from a hash of multiple sources)" would be 
different. That's why I am asking here.

The random module's docs suggest that I'll want to mitigate such risks by 
using my own instance(s) of random.Random, and I'm asking here if that alone 
is going to be sufficient -- I think that in Py 2.2 it will keep me from 
having to block when calling .random(), right? -- or if there are some risks 
that using separate random.Random instances doesn't take care of, and that 
would thus require me to use locks on more of my calls. I'm also asking 
whether the same answers would apply to the use of random.SystemRandom and 
os.urandom(); e.g., even if I don't need to block when calling random.Random 
methods, might it still be a good idea to block when accessing 
SystemRandom/urandom?
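
If blocking did turn out to be necessary for a generator shared between
threads, it might look like this sketch. LockedRandom is a hypothetical
wrapper, not something in the stdlib, and whether such locking is needed at
all is exactly the open question. It uses acquire/release with try/finally
rather than a with statement so that the same pattern would work on old
interpreters.

```python
import random
import threading


class LockedRandom(object):
    """Hypothetical wrapper that serializes calls to one shared generator."""

    def __init__(self, rng=None):
        self._rng = rng if rng is not None else random.Random()
        self._lock = threading.Lock()

    def random(self):
        # Only one thread at a time may advance the underlying state.
        self._lock.acquire()
        try:
            return self._rng.random()
        finally:
            self._lock.release()
```

The same wrapper could hold a random.SystemRandom instance instead, at the
cost of making every caller wait its turn.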

-Mike



