[Numpy-discussion] the neighbourhood of each element of an array

Sun Feb 25 17:15:33 EST 2007

Thanks for all the useful comments. Some feedback about the improved version of
my code snippet. For a 40x40 matrix and d=1 the new version is 44 times faster,
and for d=2 it's 27 times faster. For my astronomical images (typical
2000x2000)
the new version saves my day.

# improved version
d = 1
Nrow = x.shape[0]
Ncol = x.shape[1]
s = 2*d+1
y = empty((s*s,Nrow-2*d,Ncol-2*d), dtype=x.dtype)    
for i in xrange(-d,d+1):
    for j in xrange(-d,d+1):
        y[(j+d)+s*(i+d)]= x[d+i:Nrow-d+i,d+j:Ncol-d+j]  
y = y.swapaxes(0,2).swapaxes(0,1)

The suggestion of Fransesc is a really cool one. But combining it with the
suggestion of Bryan does not seem possible in this particular case as the
swapaxis operations would no longer be possible as the resulting array would be
one with shape (s*s,) containing objects with shape (Nrow-2*d,Ncol-2*d).

Cheers,
Joris

Quoting Francesc Altet <faltet at carabos.com>:

> A Divendres 23 Febrer 2007 17:38, joris at ster.kuleuven.ac.be escrigué:
> > Hi,
> >
> > Given a (possibly masked) 2d array x, is there a fast(er) way in Numpy to
> > obtain the same result as the following few lines?
> >
> > d = 1                                  # neighbourhood 'radius'
> > Nrow = x.shape[0]
> > Ncol = x.shape[1]
> > y = array([[x[i-d:i+d+1,j-d:j+d+1].ravel() for j in range(d,Ncol-d)]   \
> >                                            for i in range(d,Nrow-d)])
> >
> > What you get is an array containing all the elements in a neighbourhood
> for
> > each element, disregarding the edges to avoid out-of-range problems. The
> > code above becomes quite slow for e.g. a 2000x2000 array. Does anyone know
> > a better approach?
> 
> Well, it seems that copying data here is taking most of the CPU. Perhaps you
> 
> may want to try getting *references* to the original slices better. For 
> example, if rd = 2+d, you can write:
> 
> def get_neighbors_views_ravel(x):
>     # The next is for an array of references to *views* of neighborgs
>     y = numpy.empty((Nrow-2*d, Ncol-2*d), dtype='object')
> 
>     for i in xrange(0, Nrow-2*d):
>         x2 = x[i:i+rd]   # Get a view of the first dimension slice
>         for j in xrange(0, Ncol-2*d):
>             y[i, j] = x2[:,j:j+rd].ravel()
>     return y
> 
> which is a 1.34x (on my machine) faster than your current approach.
> 
> If you want more speed, you may want to not .ravel() in the new array
> creation 
> time (you can always use .ravel() when you are going to use the data). The 
> removal of the .ravel() call makes the above function 2.56x faster.
> 
> Finally, if your machine has an x86 architecture, you can also take advantge
> 
> of Psyco so as to accelerate a bit more. With Psyco and not raveling, you can
> 
> get up to 3x better times than your original approach (without using Psyco).
> 
> Of course, more speed-ups could be possible by using Pyrex or any other easy
> 
> method for doing C-extensions.
> 
> I'm attaching a small benchmark, and here are the results for my machine:
> 
> ref time--> 3.021
> views (ravel) time--> 2.258   speed-up--> 1.34
> views (no ravel) time--> 1.179   speed-up--> 2.56
> 
> and if we use psyco:
> 
> ref time--> 2.368
> views (ravel) time--> 1.636   speed-up--> 1.45
> views (no ravel) time--> 0.935   speed-up--> 2.53
> 
> Cheers,
> 
> -- 
> >0,0<   Francesc Altet     http://www.carabos.com/
> V   V   Cárabos Coop. V.   Enjoy Data
>  "-"
> 

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm