Computing correlations with SciPy

Travis Oliphant oliphant.travis at ieee.org
Fri Mar 17 13:49:43 EST 2006


tkpmep at hotmail.com wrote:
> I want to compute the correlation between two sequences X and Y, and
> tried using SciPy to do so without success.l Here's what I have, how
> can I correct it?
> 

This was a bug in NumPy (inherited from Numeric actually).  The fix is 
in SVN of NumPy.

Here are the new versions of those functions that should work as you 
wish (again, these are in SVN, but perhaps you have a binary install).

These functions belong in <site-packages>/numpy/lib/function_base.py



def cov(m,y=None, rowvar=1, bias=0):
     """Estimate the covariance matrix.

     If m is a vector, return the variance.  For matrices return the
     covariance matrix.

     If y is given it is treated as an additional (set of)
     variable(s).

     Normalization is by (N-1) where N is the number of observations
     (unbiased estimate).  If bias is 1 then normalization is by N.

     If rowvar is non-zero (default), then each row is a variable with
     observations in the columns, otherwise each column
     is a variable and the observations are in the rows.
     """

     X = asarray(m,ndmin=2)
     if X.shape[0] == 1:
         rowvar = 1
     if rowvar:
         axis = 0
         tup = (slice(None),newaxis)
     else:
         axis = 1
         tup = (newaxis, slice(None))


     if y is not None:
         y = asarray(y,ndmin=2)
         X = concatenate((X,y),axis)

     X -= X.mean(axis=1-axis)[tup]
     if rowvar:
         N = X.shape[1]
     else:
         N = X.shape[0]

     if bias:
         fact = N*1.0
     else:
         fact = N-1.0

     if not rowvar:
         return (dot(X.transpose(), X.conj()) / fact).squeeze()
     else:
         return (dot(X,X.transpose().conj())/fact).squeeze()

def corrcoef(x, y=None, rowvar=1, bias=0):
     """The correlation coefficients
     """
     c = cov(x, y, rowvar, bias)
     try:
         d = diag(c)
     except ValueError: # scalar covariance
         return 1
     return c/sqrt(multiply.outer(d,d))




More information about the Python-list mailing list