[Numpy-discussion] Need help for implementing a fast clip in numpy (was slow clip)
David Cournapeau
david at ar.media.kyoto-u.ac.jp
Thu Jan 11 00:41:55 EST 2007
Christopher Barker wrote:
>
>
> A. M. Archibald wrote:
>
>> Why not write the algorithm in C?
>
> I did just that a while back, for Numeric. I've enclosed the code for
> reference.
>
> Unfortunately, I never did figure out an efficient way to write this
> sort of thing for all types, so it only does doubles. Also, it does a
> bunch of special casing for discontiguous vs. contiguous arrays, and
> clipping to an array vs a scalar for the min and max arguments.
Doing the actual clipping when the datatypes are 'native' is trivial in C:
a single loop, a comparison, that's it. I have become an irrational C++
hater over time, so I don't use templates (and I don't think C++ is
welcome in core numpy). For those easy cases of template use, autogen
works well enough for me; my impression is that numpy uses a similar
system, e.g. for ufuncs. You can look at
scipy/Lib/sandbox/cdavid/src/levinson1d.tpl for an example of using autogen
to generate a function for any datatype you want; if you need fancier
template facilities like partial specialization and other crazy C++ things
that mere mortals like me will never understand, then I am not sure
autogen can be used.
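
As a rough sketch of the "single loop, a comparison" idea, here is one way to get a clip function per native type in plain C. It uses a preprocessor macro as a stand-in for an autogen template; it is not the content of levinson1d.tpl, nor numpy's own code generation. The names DEFINE_CLIP, clip_double, clip_float and clip_int are hypothetical, and the sketch assumes a contiguous buffer, scalar bounds, and lo <= hi.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro (not from numpy or autogen): expands to one in-place
 * clip function for the given element type. Assumes a contiguous buffer
 * of n elements and scalar bounds with lo <= hi. */
#define DEFINE_CLIP(NAME, TYPE)                                   \
    void NAME(TYPE *data, size_t n, TYPE lo, TYPE hi)             \
    {                                                             \
        size_t i;                                                 \
        assert(lo <= hi);                                         \
        for (i = 0; i < n; ++i) {                                 \
            if (data[i] < lo) {                                   \
                data[i] = lo;       /* below range: raise to lo */ \
            } else if (data[i] > hi) {                            \
                data[i] = hi;       /* above range: lower to hi */ \
            }                                                     \
        }                                                         \
    }

/* One expansion per native type we care about. */
DEFINE_CLIP(clip_double, double)
DEFINE_CLIP(clip_float, float)
DEFINE_CLIP(clip_int, int)
```

Calling clip_double(buf, n, 0.0, 1.0) then clips buf in place; adding another type is one more DEFINE_CLIP line, which is roughly what a template tool would automate.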
I guess the method used in numpy is the better one for core
functionality, as it avoids the burden of installing one more
development tool.
Now, I didn't know that clip was supposed to handle arrays as min/max
values. At first, I didn't understand the need to care about
contiguous vs. non-contiguous arrays; having non-scalar min/max values
makes a special case for non-contiguous arrays necessary. But again, it is
important not to lose sight of the goal, which was to have faster clipping
for matplotlib. These cases are easy, because they involve native types and
scalar min/max, where contiguous or not does not matter since we traverse
the input arrays element by element. If we pass non-native-endian,
non-contiguous arrays, there is actually a pretty good chance that the
current implementation is already fast enough and does not need to be
changed anyway.
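
To illustrate why contiguity does not matter in the scalar min/max case, here is a minimal sketch of an element-by-element traversal driven by a byte stride. The function name clip_double_strided and its signature are hypothetical, not numpy's API; it assumes one dimension, native byte order, properly aligned doubles, and lo <= hi.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not numpy's API): clip n doubles in place, where
 * consecutive elements are `stride` bytes apart. A contiguous array is
 * just the special case stride == sizeof(double). Assumes native byte
 * order and properly aligned elements. */
void clip_double_strided(char *base, size_t n, ptrdiff_t stride,
                         double lo, double hi)
{
    size_t i;
    assert(lo <= hi);
    for (i = 0; i < n; ++i) {
        /* Address of the i-th element, computed from the byte stride. */
        double *p = (double *)(base + (ptrdiff_t)i * stride);
        if (*p < lo) {
            *p = lo;
        } else if (*p > hi) {
            *p = hi;
        }
    }
}
```

With a stride of 2 * sizeof(double), for example, the loop touches every other element of a buffer and leaves the ones in between alone; the loop body is the same as in the contiguous case.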
Thanks for the suggestion and the clarifications,
David