[Numpy-discussion] speedy remove mean of rows

Thu Feb 27 21:40:17 EST 2003

Hey John,

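I think broadcasting is your best bet: it subtracts a column of row means directly, without building the full-size temporary array that resize() creates.  Here is a snippet using scipy (Numeric will be pretty much the same).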

>>> from scipy import *
>>> a = stats.random((4,3))
>>> a
array([[ 0.94058263,  0.24342623,  0.74673623],
       [ 0.53151542,  0.07523929,  0.49730805],
       [ 0.5161854 ,  0.51049614,  0.70360875],
       [ 0.09470515,  0.60604334,  0.64941102]])
>>> stats.mean(a) # axis=-1 by default in scipy
array([ 0.6435817 ,  0.36802092,  0.57676343,  0.45005317])
>>> a-stats.mean(a)[:,NewAxis]
array([[ 0.29700093, -0.40015546,  0.10315453],
       [ 0.1634945 , -0.29278163,  0.12928713],
       [-0.06057803, -0.06626729,  0.12684532],
       [-0.35534802,  0.15599017,  0.19935785]])

eric

John Hunter <jdhunter at ace.bsd.uchicago.edu> wrote:
> 
> I have a large (40,000 x 128) Numeric array, X, with typecode Float.
> In some cases the number of rows may be approx 10x greater.
> 
> I want to create an array Y with the same dimensions as X, where each
> element of Y is the corresponding element of X with the mean of the
> row on which it occurs subtracted away.  I.e.,
> 
>   Y = X - transpose(resize(mean(X,1), (X.shape[1],X.shape[0])))
> 
> I am wondering if this is the most efficient way (speed and memory).
> 
> Thanks for any suggestions,
> John Hunter