[Numpy-discussion] [Python-ideas] Should our default random number generator be secure?

Nathaniel Smith njs at pobox.com
Sun Sep 13 22:54:51 EDT 2015


[This is getting fairly off-topic for python-ideas (since AFAICT there
is no particular reason right now to add a new deterministic generator
to the stdlib), so CC'ing numpy-discussion and I'd suggest followups
be directed to there alone.]

On Thu, Sep 10, 2015 at 10:41 AM, Robert Kern <robert.kern at gmail.com> wrote:
> On 2015-09-10 04:56, Nathaniel Smith wrote:
>>
>> On Wed, Sep 9, 2015 at 8:35 PM, Tim Peters <tim.peters at gmail.com> wrote:
>>>
>>> There are some clean and easy approaches to this based on
>>> crypto-inspired schemes, but giving up crypto strength for speed.  If
>>> you haven't read it, this paper is delightful:
>>>
>>>      http://www.thesalmons.org/john/random123/papers/random123sc11.pdf
>>
>>
>> It really is! As AES acceleration instructions become more common
>> (they're now standard IIUC on x86, x86-64, and even recent ARM?), even
>> just using AES in CTR mode becomes pretty compelling -- it's fast,
>> deterministic, provably equidistributed, *and* cryptographically
>> secure enough for many purposes.
>
>
> I'll also recommend the PCG paper (and algorithm) as the author's
> cross-PRNGs comparisons have been bandied about in this thread already. The
> paper lays out a lot of the relevant issues and balances the various
> qualities that are important: statistical quality, speed, and security (of
> various flavors).
>
>   http://www.pcg-random.org/paper.html
>
> I'm actually not that impressed with Random123. The core idea is nice and
> clean, but the implementation is hideously complex.

I'm curious what you mean by this? In terms of the code, or...?

I haven't looked at the code, but the simplest generator they
recommend in the paper is literally just

def rng_stream(seed):
    counter = 0
    while True:
        # AES128 takes a 128 bit key and 128 bits of data and returns
128 bits of encrypted data
        val = AES128(key=seed, data=counter)
        yield low_64_bits(val)
        yield high_64_bits(val)
        counter += 1

which gives a 64-bit generator with a period of 2^129. They benchmark
it as faster than the Mersenne Twister on modern CPUs (<2
cycles-per-byte on recent x86, x86-64, ARMv8), it requires less
scratch space, is incredibly simple to work with -- you can
parallelize it, get independent random streams, etc., in a totally
trivial way -- and has a *way* stronger guarantee of
random-looking-ness than merely passing TestU01.

The downsides are that it does still require 176 bytes of read-only
scratch storage (used to cache an expanded version of the "key"), the
need for a modern CPU to get that speed, and that it doesn't play well
with GPUs. So they also provide a set of three more ad hoc generators
designed to solve these problems. I'm not as convinced about these,
but hey, they pass TestU01...

The PCG paper does a much better job of all the other stuff *around*
making a good RNG -- it has the nice website, clear comparisons, nice
code, etc. -- which is definitely important. But to me the increase in
speed and reduction in memory use doesn't seem worth it given how fast
these generators are to start with + the nice properties of counter
mode + and cryptographic guarantees of randomness that you get from
AES, for code that's generally targeting non-embedded non-GPU systems.

-n

-- 
Nathaniel J. Smith -- http://vorpus.org



More information about the NumPy-Discussion mailing list