[SciPy-User] down-sampling an array by averaging - vectorized form?

Zachary Pincus zachary.pincus at yale.edu
Sat Feb 11 14:56:54 EST 2012


Hi Andrew,

> None of them are 'vectorized' per se, but all are more clever or efficient ways of getting at the same problem.  I thought I'd write a couple of quick comments.

It depends on what you mean by "vectorized" -- none of them use SIMD instructions on the chip, but from the Matlab/numpy perspective, people often mean "vectorized" as "multiple data elements acted on by a single command", such as C = A + B where A and B are matrices.

In any case, the reshaping approach is "vectorized" according to the latter definition, which really just means "the loops are pushed down into C"...

> The binning followed by scalar division is wicked fast and yields results that are very close to those of my original algorithm.  The reshaping seems very clever and I am going to read it more carefully to learn some lessons there, I think.

For more information on reshaping to do decimation, see the "avoiding loops when downsampling arrays" thread on the numpy-discussion list from last week:
https://groups.google.com/forum/#!topic/numpy/qyDKJTj5jx4

There's a bit more discussion about how this works, and some memory-layout caveats to be aware of.
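For concreteness, here's a minimal sketch of the reshape trick for 2x2 block averaging (my example, assuming both dimensions of A are even):

import numpy as np

A = np.arange(16, dtype=float).reshape(4, 4)
# Split each axis into (n_blocks, block_size), then average over the
# block-size axes; this gives the mean of each 2x2 tile.
B = A.reshape(A.shape[0] // 2, 2, A.shape[1] // 2, 2).mean(axis=3).mean(axis=1)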

Also, instead of doing the reshaping, you could see if hard-coding the averaging is faster. Here's how to do it for the 2x2 case (assuming both dimensions of A are even):
B = (A[::2,::2] + A[1::2,::2] + A[::2,1::2] + A[1::2,1::2])/4.0

> The ndimage.zoom approach is a very general approach (and roughly as quick as the others).  As far as I can tell, that function uses spline interpolation for zoom factors > 1, and I'm unsure how it deals with zoom factors < 1.  It might do nearest neighbor or something like that; I wasn't able to determine quickly from glancing at the source.  If anyone knows, it would be cool to hear.

I'm pretty certain that the zoom function doesn't do anything smart for image decimation/minification -- it just uses the requested interpolation order to take a point-sample of the image at the calculated coordinates. Lack of good decimation is a limitation of ndimage. I know that there are decimation routines in scipy.signal, but I'm not sure if they're just 1D.
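If decimate is indeed 1D-only, one option (a sketch, not something I've benchmarked) would be to apply it along each axis in turn via its axis argument:

import numpy as np
from scipy import signal

A = np.random.rand(128, 128)
# decimate() low-pass filters (Chebyshev IIR by default) and then
# downsamples along a single axis, so run it once per axis.
B = signal.decimate(A, 2, axis=0)
B = signal.decimate(B, 2, axis=1)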

In general, for integer-factor downsampling, I either average with slicing as in the example above, or prefilter with a convolution (e.g. ndimage.gaussian_filter with an appropriate bandwidth, which is quite fast) and then take a view such as A[::3,::3] to downsample by a factor of three.
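For example (a sketch; the sigma here is just my rule-of-thumb guess of roughly half the decimation factor, not a tuned value):

import numpy as np
from scipy import ndimage

A = np.random.rand(99, 99)
# Gaussian low-pass to suppress frequencies that would alias after
# subsampling, then take every 3rd sample as a strided view (no copy).
smoothed = ndimage.gaussian_filter(A, sigma=1.5)
B = smoothed[::3, ::3]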

Zach