[Numpy-discussion] faster in1d() for monotonic case?

Tue Jun 21 03:05:39 EDT 2011

The following call is a bottleneck for me:

    np.in1d( large_array.field_of_interest, values_of_interest )

I'm not sure how in1d() is implemented, but this call seems to be slower than 
O(n) and faster than O( n**2 ), so perhaps it sorts the values_of_interest and 
does a binary search for each element of large_array?

In any case, in my situation I actually know that field_of_interest increases 
monotonically across the large_array. So if I were writing this in C, I could do 
a simple O(n) loop by sorting values_of_interest and then just checking each 
value of large_array against values_of_interest[ i ] and values_of_interest[ i + 
1 ], and any time it matched values_of_interest[ i + 1 ] increment i.

Is there some way to achieve that same efficiency in numpy, taking advantage of 
the monotonic nature of field_of_interest?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20110621/b4cc2606/attachment.html>