[SciPy-user] Bias in numpy.random.multivariate_normal?

John Reid j.reid at mail.cryst.bbk.ac.uk
Sat Nov 10 13:55:47 EST 2007


Could someone confirm I've got this right?

In a multivariate normal (MVN) the covariance is  E(X.X') - E(X).E(X'). 
I sample many times from a MVN with given mean and covariance. I then 
calculate the covariance of the sample. This covariance matrix should 
show no bias to be larger or smaller than the distribution's covariance 
but I see that some entries are always larger and others are smaller.

See this code:

from numpy.random import multivariate_normal
from numpy.linalg import inv
from numpy import outer

mu = [10.,10.]
precision = [[.6, .3], [.4, .3]]
sigma = inv(precision)
sample_size = 10000
num_tests = 100

for test in xrange(num_tests):
   samples = multivariate_normal(mu, sigma, [sample_size])
   sample_mean = samples.sum(axis=0)/sample_size
   #print (sample_mean - mu) > 0.
   sample_covariance = sum(outer(sample-sample_mean, sample-sample_mean) 
for sample in samples) / sample_size
   print (sample_covariance - sigma) > 0


'True' is printed when the entry is larger and False otherwise

Did I get this wrong or is this acceptable behaviour for a random 
variate generator or is it a bug?

Thanks,
John.




More information about the SciPy-User mailing list