[Python-ideas] Python's Source of Randomness and the random.py module Redux

Tim Peters tim.peters at gmail.com
Thu Sep 10 19:23:48 CEST 2015


[Donald Stufft <donald at stufft.io>, on arc4random speed]
> I wanted to try and test this. These are not super scientific since I just ran
> them on a single computer once (but 10 million iterations each) but I think it
> can probably give us an indication of the differences?
>
> I put the code up at https://github.com/dstufft/randtest but it's a pretty
> simple module. I'm not sure if (double)arc4random() / UINT_MAX is a reasonable
> way to get a double out of arc4random (which returns a uint) that is between
> 0.0 and 1.0, but I assume it's fine at least for this test.

arc4random() specifically returns uint32_t, which is 21 bits shy of
what's needed to generate a reasonable random double.  Our MT wrapping
internally generates two 32-bit uint32_t thingies, and pastes them
together like so (Python's C code here):

"""
/* random_random is the function named genrand_res53 in the original code;
 * generates a random number on [0,1) with 53-bit resolution; note that
 * 9007199254740992 == 2**53; I assume they're spelling "/2**53" as
 * multiply-by-reciprocal in the (likely vain) hope that the compiler will
 * optimize the division away at compile-time.  67108864 is 2**26.  In
 * effect, a contains 27 random bits shifted left 26, and b fills in the
 * lower 26 bits of the 53-bit numerator.
 * The orginal code credited Isaku Wada for this algorithm, 2002/01/09.
 */
static PyObject *
random_random(RandomObject *self)
{
    PY_UINT32_T a=genrand_int32(self)>>5, b=genrand_int32(self)>>6;
    return PyFloat_FromDouble((a*67108864.0+b)*(1.0/9007199254740992.0));
}
"""

So now you know how to make it more directly comparable.  The
high-order bit is that it requires 2 calls to the 32-bit uint integer
primitive to get a double, and that can indeed be significant.


> Here's the results from running the test on my personal computer which is
> running the OSX El Capitan public Beta:
>
>     $ python test.py
>     Number of Calls:  10000000
>     +---------------+--------------------+
>     | method        | usecs per call     |
>     +---------------+--------------------+
>     | deterministic | 0.0586802460020408 |
>     | system        | 1.6681434757076203 |
>     | userland      | 0.1534261149005033 |
>     +---------------+--------------------+
>
>
> I'll try it against OpenBSD later to see if their implementation of arc4random
> is faster than OSX.

Just noting that most people timing the OpenBSD version seem to
comment out the "get stuff from the kernel periodically" part first,
in order to time the algorithm instead of the kernel ;-)  In real
life, though, they both count, so I like what you're doing better.


More information about the Python-ideas mailing list