[Python-ideas] Should our default random number generator be secure?

Tue Sep 15 16:20:46 CEST 2015

On 15.09.2015 14:09, Sturla Molden wrote:
> On 15/09/15 10:45, M.-A. Lemburg wrote:
> 
>> k-dim equidistribution is a way to measure how well your
>> PRNG behaves, because it describes in analytical terms how
>> far you can get with increasing the linear complexity of your
>> RNG output.
> 
> Yes and no. Conceptually it means that k subsequent samples will have exactly zero correlation. But
> any PRNG that produces detectable correlation between samples 623 steps apart is junk anyway. The MT
> have proven equidistribution for k=623, but many have measured equidistribution for far longer
> periods than that. Numerical computations are subject to rounding error and truncation error
> whatever you do. The question is whether the deviation from k-dim equidistribution will show up in
> your simulation result or drown in the error terms.

I guess the answer is: it depends :-)

According to the SFMT paper:

"""
...it requires 10**28 samples to detect
an F2-linear relation with 15 (or more) terms among 521 bits,
by a standard statistical test. If the number of bits is
increased, the necessary sample size is increased rapidly. Thus, it seems that
k(v) of SFMT19937 is sufficiently large, far beyond the level of the observable
bias. On the other hand, the speed of the generator is observable.
"""
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/M062821.pdf
(which again refers to this paper:
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/HONGKONG/hong-fin4.pdf)

10**28 is already a lot of data, but YMMV, of course.

Here's a quote for the WELL family of PRNGs:

"""
The WELL generators mentioned in Table IV successfully passed all the
statistical tests included ... TestU01 ..., except those that look for linear
dependencies in a long sequence of bits, such as the matrix-rank test
... for very large binary matrices and the linear complexity tests ...
This is in fact a limitation of all F2-linear generators, including
the Mersenne twister, the TT800, etc. Because of their linear nature,
the sequences produced by these generators just cannot have
the linear complexity of a truly random sequence. This is definitely
unacceptable in cryptology, for example, but is quite acceptable for the
vast majority of simulation applications if the linear dependencies are
of long range and high order.
"""
http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng.pdf

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Sep 15 2015)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...           http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...           http://zope.egenix.com/
________________________________________________________________________
2015-09-14: Released mxODBC Plone/Zope DA 2.2.3   http://egenix.com/go84
2015-09-18: PyCon UK 2015 ...                               3 days to go
2015-09-26: Python Meeting Duesseldorf Sprint 2015         11 days to go

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/