[Numpy-discussion] On the quality of the numpy.random.normal() distribution

Charles R Harris charlesr.harris at gmail.com
Wed Dec 10 15:30:44 EST 2008


On Wed, Dec 10, 2008 at 12:03 PM, Michael Gilbert <
michael.s.gilbert at gmail.com> wrote:

> Hello,
>
>  I have been reading that there may be potential issues with the
>  Box-Muller transform, which is used by the numpy.random.normal()
>  function.  Supposedly, since f*x1 and f*x2 are not independent variables,
> then
>  the individual elements (corresponding to f*x1 and f*x2 ) of the
>  distribution also won't be independent.  For example, see "Stochastic
>  Simulation" by Ripley, pages 54-59, where the random values end up
>  distributed on a spiral.  Note that they mention that they only looked
>  at "congruential generators."  Is the random number generator used
>  by numpy congruential?
>
>  I have tried to generate plots that demonstrate this problem, but have
>  come up short.  For example:
>
>   import numpy , pylab
>   nsamples = 10**6
>   n = numpy.random.normal( 0.0 , 1.0 , nsamples )
>   pylab.scatter( n[0:-1:2] , n[1:-1:2] , 0.1 )
>   pylab.show()
>
>  I can zoom in and out, and the scatter still looks random (white
>  noise -- almost like tv static).  Does this prove that there is no
>  problem?  And if so, why does numpy do a better job than as
>  demonstrated by Ripley?
>

Bruce Carneal did some tests of robustness and speed for various normal
generators. I don't know what his final tests showed for Box-Muller. IIRC,
it had some failures but nothing spectacular. The tests were pretty
stringent and based on using the erf to turn the normal distribution into a
uniform distribution and using the crush tests on the latter.. You could
send him a note and ask: bcarneal at gmail.com. Here are the timings he got:

In what follows the uniform variate generators are:
 lcg64
 mwc8222
 mt19937
 mt19937_64
 yarn5

And the normal distribution codes are:
 trng  - default normal distribution code in TRNG
 boxm  - Box-Muller, mtrand lookalike, remembers/uses 2nd value
 zig7  - a 'Harris' ziggurat indexed by 7 bits
 zig8  - a 'Harris' ziggurat indexed by 8 bits
 zig9  - a 'Harris' ziggurat indexed by 9 bits

Here are the numbers in more detail:

# Timings from icc -O2 running on 2.4GhZ Core-2

lcg64 trng: 6.52459e+06 ops per second
lcg64 boxm: 2.18453e+07 ops per second
lcg64 zig7: 1.80616e+08 ops per second
lcg64 zig8: 2.01865e+08 ops per second
lcg64 zig9: 2.05156e+08 ops per second

mwc8222 trng: 6.52459e+06 ops per second
mwc8222 boxm: 2.08787e+07 ops per second
mwc8222 zig7: 9.44663e+07 ops per second
mwc8222 zig8: 1.05326e+08 ops per second
mwc8222 zig9: 1.03478e+08 ops per second

mt19937 trng: 6.41112e+06 ops per second
mt19937 boxm: 1.64986e+07 ops per second
mt19937 zig7: 4.23762e+07 ops per second
mt19937 zig8: 4.52623e+07 ops per second
mt19937 zig9: 4.52623e+07 ops per second

mt19937_64 trng: 6.42509e+06 ops per second
mt19937_64 boxm: 1.93226e+07 ops per second
mt19937_64 zig7: 5.8762e+07 ops per second
mt19937_64 zig8: 6.17213e+07 ops per second
mt19937_64 zig9: 6.29146e+07 ops per second

yarn5 trng: 5.95781e+06 ops per second
yarn5 boxm: 1.19156e+07 ops per second
yarn5 zig7: 1.48945e+07 ops per second
yarn5 zig8: 1.54809e+07 ops per second
yarn5 zig9: 1.53201e+07 ops per second


# Timings from g++ -O2 running on a 2.4GhZ Core-2

lcg64 trng: 6.72163e+06 ops per second
lcg64 boxm: 1.50465e+07 ops per second
lcg64 zig7: 1.31072e+08 ops per second
lcg64 zig8: 1.48383e+08 ops per second
lcg64 zig9: 1.6036e+08 ops per second

mwc8222 trng: 6.64215e+06 ops per second
mwc8222 boxm: 1.44299e+07 ops per second
mwc8222 zig7: 8.903e+07 ops per second
mwc8222 zig8: 1.00825e+08 ops per second
mwc8222 zig9: 1.03478e+08 ops per second

mt19937 trng: 6.52459e+06 ops per second
mt19937 boxm: 1.28223e+07 ops per second
mt19937 zig7: 5.00116e+07 ops per second
mt19937 zig8: 5.41123e+07 ops per second
mt19937 zig9: 5.47083e+07 ops per second

mt19937_64 trng: 6.58285e+06 ops per second
mt19937_64 boxm: 1.42988e+07 ops per second
mt19937_64 zig7: 6.72164e+07 ops per second
mt19937_64 zig8: 7.39591e+07 ops per second
mt19937_64 zig9: 7.46022e+07 ops per second

yarn5 trng: 6.25144e+06 ops per second
yarn5 boxm: 8.93672e+06 ops per second
yarn5 zig7: 1.50465e+07 ops per second
yarn5 zig8: 1.57496e+07 ops per second
yarn5 zig9: 1.56038e+07 ops per second

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20081210/e5bf60bc/attachment.html>


More information about the NumPy-Discussion mailing list