[Numpy-discussion] Need help for implementing a fast clip in numpy (was slow clip)

Thu Jan 11 18:51:08 EST 2007

David Cournapeau wrote:

> To do the actual clipping if the datatypes are 'native' is trivial in C: 
> a single loop, a comparison, that's it.

only if it's also Contiguous. From a quick look at your posted code, it 
looks like it only works for contiguous arrays.

 >  I don't think C++ is  welcomed in core numpy

no , it's not. I avoid C and C++ whenever possible, but there looks to 
be some real promise in C++ templates, etc -- but it does get ugly! I"ve 
always wanted to give Blitz++ a try...

> autogen  works well enough for me;

I didn't know about autogen -- that may be all we need.

> Now, I didn't know that clip was supposed to handle arrays as min/max 
> values.

one more nifty feature...And if you want to support broadcasting, even 
more so!

> At first, I didn't understand the need to care about 
> contiguous/non contiguous; having non scalar for min/max makes it 
> necessary to have special case for non contiguous.

I'm confused. This issue is that you can't just increment the pointer to 
get the next element if the array is non-contiguous.. you need to do all 
the strides, etc, math.

 > But again, it is
> important not to lose sight... The goal was to have faster clipping for 
> matplotlib, and this cases are easy, because it is native type and 
> scalar min/max, where contiguous or not does not matter as we traverse 
> the input arrays element by element.

You still have to traverse them properly, and I didn't find a way to do 
that equally efficiently for contiguous and non-contiguous arrays -- 
thus I special cased it. That doesn't mean there isn't a a way -- I 
really don't know much what I'm doing!

What you've [posted doesn't look like it will work for non-contiguous 
arrays:

> static int native_scalar_fbl_clip(npy_float* in, 
>         npy_intp size, npy_float min, npy_float max) 
> {
>     //npy_float* kcoeff, npy_float* err)
>     npy_intp i;
> 
>     for (i = 0; i < size; ++i) {
>         if (in[i] <= min) {
>             in[i]   = min;
>         }
>         if (in[i] >= max) {
>             in[i]   = max;
>         }
>     }
>     return 0;
> }

This just loops through the whole C array from start to size -- only 
works for contiguous data.

> If we pass non native endian, non 
> contiguous arrays, there is actually a pretty good chance that the 
> current implementation is already fast enough, and does not need to be 
> changed anyway.

good point -- optimizing the important cases, while allowing all cases 
is a fine way to go. If anyone needs other cases optimized, that can be 
added later.

I'd still like to know if anyone knows how to efficiently loop through 
all the elements of a non-contiguous array in C.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov