[SciPy-User] corrcoef and dump

Vincent Davis vincent at vincentdavis.net
Sat Jun 26 20:31:42 EDT 2010


Here is a by-row corr function. You'll need to decide how you want the
results returned. I took this from a larger function, and although it is
called by_row_corr(), I think it returns results by column. I was using
this on a 120,000 x 120,000 array. Slow, but no memory problems.

import numpy as np

def by_row_corr(anarray, test_array):
    # standardize each column of the original array
    stdarray = (anarray - anarray.mean(0)) / anarray.std(0)
    # append the test row, then standardize the enlarged array
    stdtestarray = np.append(anarray, [test_array], axis=0)
    stdtestarray = (stdtestarray - stdtestarray.mean(0)) / stdtestarray.std(0)
    nobs, nvars = stdarray.shape  # for the test array, nobs increases by 1
    sumcorrdiff = np.empty(nvars)  # unused in this excerpt
    # calculate correlation coefficient for each variable with all others
    for col in range(nvars):  # was xrange under Python 2
        corr = np.dot(stdarray[:, col], stdarray) / nobs
        # print('corr', corr)
        corrt = np.dot(stdtestarray[:, col], stdtestarray) / (nobs + 1)

I think you will want a yield statement at the end. As I said, I took
this from a larger function.
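
For what it's worth, here is a rough sketch of what the generator version
might look like. The name iter_col_corr is made up for this example, and
yielding the (corr, corrt) pair is just one guess at how you might want the
results; it assumes test_array is a single 1-D row with one value per column.

```python
import numpy as np

def iter_col_corr(anarray, test_array):
    """Yield (corr, corrt) for each column: its correlation with all columns,
    without and with the extra test row. Hypothetical helper, not from the
    original larger function."""
    anarray = np.asarray(anarray, dtype=float)
    # standardize each column (population std, ddof=0)
    stdarray = (anarray - anarray.mean(0)) / anarray.std(0)
    # append the test row, then standardize the enlarged array
    testarray = np.append(anarray, [test_array], axis=0)
    stdtestarray = (testarray - testarray.mean(0)) / testarray.std(0)
    nobs, nvars = stdarray.shape
    for col in range(nvars):
        # each dot product gives one row of the correlation matrix
        corr = np.dot(stdarray[:, col], stdarray) / nobs
        corrt = np.dot(stdtestarray[:, col], stdtestarray) / (nobs + 1)
        yield corr, corrt
```

Only one row of each correlation matrix is built per iteration, so the full
nvars x nvars result never has to be built in one piece. The standardized
copies of the input are still held in memory, though.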

Vincent


On Fri, Jun 25, 2010 at 11:04 AM, R. Padraic Springuel
<R.Springuel at umit.maine.edu> wrote:
> Each of the 15 arrays between which I want the calculations has
> 6,695,970 entries in it.  I tried feeding corrcoef just two of the lists
> and while I don't get a MemoryError (the python exception), I do get a
> couple of these errors:
>
> Python(337) malloc: *** mmap(size=2678390784) failed (error code=12)
> *** error: can't allocate region
> *** set a breakpoint in malloc_error_break to debug
>
> They don't seem to make the program fail, however, as I still get a
> result from corrcoef.
>
> I found a function I'd written some time before to calculate the
> correlation coefficient that doesn't raise that error and its results
> agree with those from corrcoef, so there has to be an implementation
> thing going on here with memory usage.
> --
>
> R. Padraic Springuel
> Research Assistant
> Department of Physics and Astronomy
> University of Maine
> Bennett 309
> Office Hours: By Appointment Only