[Numpy-discussion] fast_any_all , a trivial but fast/useful helper function for numpy

Graeme B. Bell grb at skogoglandskap.no
Wed Sep 4 06:05:32 EDT 2013


In my current GIS raster work I often end up writing code something like this:

         np.any([A>4, A==2, B==5, ...], axis=0)

However, np.any() is quite slow when used this way.

It's possible to use np.logical_or to solve the problem, but then you get nested logical_or calls, since logical_or combines only two arrays at a time.
It's also possible to use integer maths, e.g. (A>4)+(A==2)+(B==5)>0.
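To make the comparison concrete, here is a minimal sketch of the three spellings side by side. The arrays A and B are small made-up examples, and the sketch assumes the intent is an elementwise combination of the masks, which for np.any() means reducing along axis 0 of the stacked list.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
A = np.array([[5, 2, 0], [1, 4, 7]])
B = np.array([[5, 0, 0], [5, 3, 0]])

# 1. np.any() over a list of masks, reduced along the stacking axis
#    (assumption: an elementwise result is what is wanted).
via_any = np.any([A > 4, A == 2, B == 5], axis=0)

# 2. Nested np.logical_or calls, since each call combines two arrays.
via_or = np.logical_or(np.logical_or(A > 4, A == 2), B == 5)

# 3. The arithmetic spelling from the text: add the masks, then test > 0.
via_sum = (A > 4) + (A == 2) + (B == 5) > 0

print(via_any)
```

All three produce the same boolean array; they differ only in readability and speed.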

The question is: which is best (syntactically, in terms of performance, etc.)?

I've written a little helper function to provide a faster version of any() and all(). It's embarrassingly simple: just a for loop. Still, I think there's a syntactic advantage to using a helper function for this situation rather than writing the idiom out each time, and it reduces the chance of a bug in a hand-written version. Note that the code does not cover all the use cases currently addressed by np.any() and np.all().
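For readers who want a feel for what such a helper might look like, here is a rough sketch of a for-loop wrapper around np.logical_or and np.logical_and. The names fast_any and fast_all, their signatures, and the same-shape restriction are assumptions made for this illustration; the code in the linked repository may differ.

```python
import numpy as np

def _reduce_masks(masks, op):
    # Fold a sequence of same-shaped boolean arrays with a binary ufunc.
    masks = list(masks)
    if not masks:
        raise ValueError("need at least one array")
    # Copy the first mask so the caller's arrays are never modified.
    result = np.array(masks[0], dtype=bool, copy=True)
    for m in masks[1:]:
        # Accumulate in place to avoid allocating a new array per step.
        op(result, m, out=result)
    return result

def fast_any(masks):
    """Elementwise OR of a sequence of boolean arrays (hypothetical helper)."""
    return _reduce_masks(masks, np.logical_or)

def fast_all(masks):
    """Elementwise AND of a sequence of boolean arrays (hypothetical helper)."""
    return _reduce_masks(masks, np.logical_and)

# Usage with made-up data:
A = np.array([5, 2, 0, 1])
B = np.array([5, 0, 0, 5])
print(fast_any([A > 4, A == 2, B == 5]))
print(fast_all([A > 0, B == 5]))
```

Unlike np.any() and np.all(), this sketch has no axis, keepdims or where arguments and assumes all inputs share one shape.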

I benchmarked to pick the fastest underlying implementation (logical_or rather than integer maths). 

The result is 14 to 17x faster than np.any() for this use case.*
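A comparison along these lines can be reproduced with timeit. The sketch below is not the benchmark from the repository; the array sizes, random data and function names are assumptions, and the timings will vary with machine and NumPy version.

```python
import timeit
import numpy as np

# Hypothetical raster-like test data.
rng = np.random.default_rng(0)
A = rng.integers(0, 10, size=(500, 500))
B = rng.integers(0, 10, size=(500, 500))

def with_np_any():
    return np.any([A > 4, A == 2, B == 5], axis=0)

def with_logical_or():
    return np.logical_or(np.logical_or(A > 4, A == 2), B == 5)

def with_addition():
    return (A > 4) + (A == 2) + (B == 5) > 0

# Take the best of a few repeats to reduce noise from other processes.
timings = {}
for fn in (with_np_any, with_logical_or, with_addition):
    timings[fn.__name__] = min(timeit.repeat(fn, number=5, repeat=3))
    print(f"{fn.__name__}: {timings[fn.__name__]:.4f} s for 5 calls")
```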

Code & benchmark here:

      https://github.com/gbb/numpy-fast-any-all

Please feel welcome to use it or improve it :-)

Graeme.


* (Should this become an execution path in np.any()... ?)

