[Numpy-discussion] Need help for implementing a fast clip in numpy (was slow clip)

Fri Jan 12 00:50:40 EST 2007

Timothy Hochberg wrote:
>
>
> On 1/11/07, *Christopher Barker* <Chris.Barker at noaa.gov 
> <mailto:Chris.Barker at noaa.gov>> wrote:
>
> [CHOP]
>  
>
>     I'd still like to know if anyone knows how to efficiently loop through
>     all the elements of a non-contiguous array in C.
>
>
> First, it's not that important if the array is contiguous for this 
> sort of thing. What you really care about is whether it's 
> simply-strided (or maybe single-strided would be a better term). 
> Anyway, the last dimension of the array can be strided without making 
> things more difficult. All you need to be able to do is to address the 
> elements of the array as thedata[offset + stride*index].
This is right.  Just remember that strides are "byte-strides" and not 
"element-strides" and so thedata had better be a pointer to a byte.
>
> That being said, the strategy that the ufuncs use, and possibly other 
> functions as well, is to have the core functions operate only on 
> simply-strided chunks of data. At a higher level, there is some magic 
> that parcels up non-contiguous array into simply-strided chunks and 
> feeds them to core functions. How efficient this is obviously depends 
> on how large the chunks that the magic parceler manages to extract 
> are, but typically it seems to work pretty well.
This is one thing I've exposed (and made use of in more than one place) 
with NumPy.  In Numeric, the magic was in a few lines of the ufuncobject 
file).  Now, it is exposed in the concept of an array iterator.  Anybody 
can take advantage of it as it there is a C-API call to get an array 
iterator from the array (it's actually the object returned by the .flat 
method).   You can even get an iterator that iterates over one-less 
dimension than the array has (with the dimension using the smallest 
strides left "un-iterated" so that you can call an inner loop with it).

You can see examples of this in several places.  The basic template is:

int axis=-1;  /* -1 means let the code decide which axis, otherwise you 
can choose */
dit = (PyArrayIterObject *)PyArray_IterAllButAxis(dest, &axis);
if (dit==NULL) /* error return*/
while (dit->index < dit->size) {
          /* do something
              dit->dataptr points to the first position of the first "line"
              PyArray_STRIDE(dest, axis) is the striding
              PyArray_DIM(dest, axis) is the size of the "line"
          */
          PyArray_ITER_NEXT(dit);   /* move to the next  line */
}

You see code like this all over in NumPy.

The broadcasting of ufuncs is also exposed (so you can do this on 
multiple arrays).  See the multi-iterator objects.

-Travis

-Travis