[Tutor] What are these things urandom() returns?

Wed Oct 11 07:01:50 CEST 2006

[Dick Moores]
>>> Would this be a better random source than choice([0,1]), which uses
>>> random()?

[Tim Peters]
>> "Better" as measured against what criteria?

[Dick]
> I meant would it be closer to true randomness than random(), even if
> much slower?

Define "closer to true randomness" ;-)

I don't mean to be difficult, but if the phrase "cryptographically
strong" doesn't mean something vital to you already, there's little
for you to worry about here.  If you want /true/ randomness, you can
buy a certified hardware random number generator, based on
non-deterministic physical processes (like timing radioactive decay,
or measuring thermal noise).

Anything short of that is more-or-less a hack designed to pass various
tests for randomness.  The Mersenne Twister is one of the best-tested
generators in existence, producing a sequence indistinguishable from
"truly random" via all tests in common use.  Nevertheless, if an
intelligent /adversary/ knows you're using the Mersenne Twister, they
can exploit that under some conditions to predict the sequence you're
using.  OTOH, if you use a "cryptographically strong" generator, or a
hardware source of true randomness, and your adversary knows that,
current theory says that even if they have enormous computer
resources to throw at it, and you don't make "stupid mistakes" in the
/way/ you use it, they won't be able to predict the sequence you're
seeing at significantly better than chance rate.

The /pragmatic/ answer to "would it be closer to true randomness?"
thus depends on what you're trying to achieve.  The theoretical answer
is "yes", but that may be of no relevance to what you're trying to
achieve.
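
To make the two kinds of source concrete, here is a rough sketch in
present-day Python 3 (not the Python of this thread).  The helper names
are made up for illustration; random.Random is the Mersenne Twister,
and random.SystemRandom draws on os.urandom():

```python
import os
import random

def coin_flips_mt(n, seed=None):
    """n coin flips from the Mersenne Twister; repeatable when seeded."""
    gen = random.Random(seed)
    return [gen.choice([0, 1]) for _ in range(n)]

def coin_flips_os(n):
    """n coin flips from the OS entropy source; seeding has no effect."""
    gen = random.SystemRandom()
    return [gen.choice([0, 1]) for _ in range(n)]

def bits_from_urandom(nbytes):
    """Unpack the raw bytes os.urandom() returns into a list of 0/1 bits."""
    data = os.urandom(nbytes)
    # Iterating a bytes object in Python 3 yields ints in range(256).
    return [(byte >> i) & 1 for byte in data for i in range(8)]
```

Calling coin_flips_mt(20, seed=42) twice gives the same list both
times; coin_flips_os(20) offers no way to ask for that.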

>>  It's very much slower than using choice([0, 1])  (or choice("01"), or
>> randrange(2), or ...), and can't (even in theory) be reproduced by setting
>> a seed.

> That's fine. Why would one want to?

For example, any non-trivial program eventually grows a test suite to
verify that the program continues to run correctly as time goes on.
Testing a program that makes random decisions is, in general, very
much easier if you have a way to force it to make the same decisions
each time the test is run.  For example, here's a tiny piece of
Python's Lib/test/test_random.py:

    def test_genrandbits(self):
        # Verify cross-platform repeatability
        self.gen.seed(1234567)
        self.assertEqual(self.gen.getrandbits(100),
                         97904845777343510404718956115L)

Without the ability to force the seed, that test wouldn't be possible.
As is, it tests that given the specific seed 1234567, the next call
to getrandbits(100) will return exactly 97904845777343510404718956115
regardless of platform.
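
The quoted test is Python 2 code (note the trailing L on the long
literal).  A minimal sketch of the same idea in Python 3 follows; the
class and method names are hypothetical, and it checks only that a
fixed seed repeats itself rather than asserting any particular value:

```python
import random
import unittest

class SeededGeneratorTest(unittest.TestCase):
    def setUp(self):
        # A private generator, so the test doesn't disturb the shared
        # module-level one.
        self.gen = random.Random()

    def test_same_seed_same_bits(self):
        # Re-seeding with the same value must replay the same output.
        self.gen.seed(1234567)
        first = self.gen.getrandbits(100)
        self.gen.seed(1234567)
        second = self.gen.getrandbits(100)
        self.assertEqual(first, second)
```

Saved in a file, this can be run with "python -m unittest".  No such
test can be written against os.urandom(), since there is no seed to
force.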

Even if there is no test, non-trivial programs do "surprising" things
at times.  In a program that makes random decisions, if something
/very/ surprising is seen, and you have no way to reproduce the
decisions made before the surprising occurrence, you may be completely
stuck in debugging it.  OTOH, if there's a way to force the same
decisions to be made, you can, e.g., run the program under a debugger
to see everything that happens before disaster strikes.
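
One common way to get that, sketched here under the assumption that
the program owns its generator (make_generator and simulate are
invented names, not anything from the standard library): pick a seed
at startup, report it, and accept the same seed back when a run needs
to be replayed.

```python
import os
import random

def make_generator(seed=None):
    """Return (seed, generator), choosing a fresh seed if none is given."""
    if seed is None:
        # An unpredictable seed from the OS; recorded so the run can be replayed.
        seed = int.from_bytes(os.urandom(8), "big")
    return seed, random.Random(seed)

def simulate(gen, steps=5):
    """Stand-in for a program that makes random decisions."""
    return [gen.randrange(100) for _ in range(steps)]

seed, gen = make_generator()
print("seed used:", seed)   # in a real program, write this to the log
first_run = simulate(gen)

# Later, e.g. under a debugger: force the same decisions again.
_, replay_gen = make_generator(seed)
assert simulate(replay_gen) == first_run
```

The seed itself comes from os.urandom(), so ordinary runs still differ
from one another, but any one of them can be reproduced from its
logged seed.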