[SciPy-User] Strange behaviour from corrcoef when calculating correlation-matrix in SciPy/NumPy.

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Mar 2 14:36:18 EST 2011


On Wed, Mar 2, 2011 at 2:28 PM, Pauli Virtanen <pav at iki.fi> wrote:
> On Wed, 02 Mar 2011 14:06:23 -0500, josef.pktd wrote:
> [clip]
>> I also found it a bit strange that corrcoef(x,y) creates the stacked
>> version. scipy.stats.spearmanr inherits this behavior since I rewrote
>> it. scipy.stats.pearsonr hasn't been rewritten yet.
>>
>> It didn't bug me enough to figure out whether there is a reason for
>> this stacking behavior or not.
>
> The Matlab convention
>
>        corrcoef(x, y) == corrcoef(c_[x.ravel(), y.ravel()])

I don't remember Matlab exactly, but I don't think there is a ravel,
and I think R also does

cov(x, y) = np.dot((x - x.mean(0)).T, y - y.mean(0)) / (n - 1)

with variables in columns and n observations, and normalizes this by
the standard deviations for corrcoef.

That is, it returns just the off-diagonal block of the matrix, x'y,
instead of also x'x and y'y.
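
To make the difference concrete, here is a minimal sketch. The array
shapes, the seed, and the helper name cross_corr are made up for
illustration, and variables are assumed to sit in columns
(rowvar=False). It compares the stacked matrix that np.corrcoef(x, y)
returns with the off-diagonal x'y block computed directly.

```python
import numpy as np


def cross_corr(x, y):
    """Correlation of each column of x with each column of y.

    Hypothetical helper, not a NumPy/SciPy function: it returns only
    the off-diagonal x'y block, with shape (x.shape[1], y.shape[1]).
    """
    n = x.shape[0]
    xc = x - x.mean(axis=0)          # center each column of x
    yc = y - y.mean(axis=0)          # center each column of y
    cov_xy = xc.T @ yc / (n - 1)     # cross-covariance block
    # normalize by the sample standard deviations of the columns
    scale = np.outer(x.std(axis=0, ddof=1), y.std(axis=0, ddof=1))
    return cov_xy / scale


rng = np.random.default_rng(0)
x = rng.standard_normal((200, 3))    # 3 variables, 200 observations
y = rng.standard_normal((200, 2))    # 2 variables, 200 observations

# np.corrcoef stacks x and y: the result covers all 3 + 2 variables,
# i.e. it contains the x'x, x'y, y'x and y'y blocks.
stacked = np.corrcoef(x, y, rowvar=False)

# The direct computation gives only the x'y block.
block = cross_corr(x, y)

print(stacked.shape)   # (5, 5)
print(block.shape)     # (3, 2)
print(np.allclose(block, stacked[:3, 3:]))
```

So the stacked result still contains the block in question as
stacked[:3, 3:]; the stacking only adds the x'x and y'y parts around it.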

Josef


>
> is actually also a bit peculiar if you haven't seen it before -- how come
> there are now two variables, if x had variables on the rows (why not bail
> out with an error?).
>
> I don't typically deal with stuff that requires these functions, so I
> don't have an opinion, but it would have been better to do the same thing
> even if there is no real reason for it...
>
> --
> Pauli Virtanen


