[Numpy-discussion] Faster

Fri May 2 21:16:05 EDT 2008

On Fri, May 2, 2008 at 6:05 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Fri, May 2, 2008 at 7:24 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
>
>
> > How can I make this function faster? It removes the i-th row and
>  >  column from an array.
>  >
>  >  def cut(x, i):
>  >     idx = range(x.shape[0])
>  >     idx.remove(i)
>  >     y = x[idx,:]
>  >     y = y[:,idx]
>  >     return y
>  >
>  >  >> import numpy as np
>  >  >> x = np.random.rand(500,500)
>  >  >> timeit cut(x, 100)
>  >  100 loops, best of 3: 16.8 ms per loop
>
>  I can get a ~20% improvement with the following:
>
>  In [8]: %timeit cut(x, 100)
>  10 loops, best of 3: 21.6 ms per loop
>
>  In [9]: def mycut(x, i):
>    ...:     A = x[:i,:i]
>    ...:     B = x[:i,i+1:]
>    ...:     C = x[i+1:,:i]
>    ...:     D = x[i+1:,i+1:]
>    ...:     return hstack([vstack([A,C]),vstack([B,D])])
>    ...:
>
>  In [10]: %timeit mycut(x, 100)
>  10 loops, best of 3: 17.3 ms per loop
>
>  The hstack(vstack, vstack) seems to be somewhat better than
>  vstack(hstack, hstack), at least for these sizes.

Wow. I never would have come up with that. And I probably never will.

Original code:
>> cluster.test2(500)
n = 500 took 5.28 seconds

Your improvement:
>> cluster.test2(500)
n = 500 took 3.52 seconds

Much more than a 20% improvement when used in the larger program. Thank you.