[Numpy-discussion] extract elements of an array that are contained in another array?

Fri Jun 5 02:04:26 EDT 2009

josef.pktd at gmail.com wrote:
> On Fri, Jun 5, 2009 at 1:48 AM, Robert Cimrman <cimrman3 at ntc.zcu.cz> wrote:
>> josef.pktd at gmail.com wrote:
>>> On Thu, Jun 4, 2009 at 4:30 PM, Gael Varoquaux
>>> <gael.varoquaux at normalesup.org> wrote:
>>>> On Thu, Jun 04, 2009 at 10:27:11PM +0200, Kim Hansen wrote:
>>>>> "in(b)" or "in_iterable(b)" method, such that you could do a.in(b)
>>>>> which would return a boolean array of the same shape as a with
>>>>> elements true if the equivalent a members were members in the iterable
>>>>> b.
>>>> That would really be what I would be looking for.
>>>>
>>> Just using "in" might promise more than it does, e.g. it works only for
>>> one-dimensional arrays; maybe "in1d". With "in", I would expect a
>>> generic function as in python that works with many array types and
>>> dimensions. (But I haven't checked whether it would work with a 1d
>>> structured array or object array.)
>>>
>>> I found arraysetops because of unique1d, but I didn't figure out what
>>> the subpackage really does, because I was reading "arrayse-tops"
>>> instead of "array-set-ops".
>> I am bad at choosing names, but note that numpy sub-modules usually do
>> not use underscores, so array_set_ops would not fit well.
> 
> I would have chosen something like setfun. Since this is in numpy,
> it should be implied that the sets are arrays.

Yes, good idea. I am not sure how to proceed if people agree (the name
contest is open!). What about adding an alias named setfun and deprecating
the name arraysetops?

>>> BTW, for the docs, I haven't found a counterexample where
>>> np.setdiff1d gives the wrong answer for non-unique arrays.
>> In [4]: np.setmember1d( [1, 1, 2, 4, 2], [3, 2, 4] )
>> Out[4]: array([ True, False,  True,  True,  True], dtype=bool)
> 
> setdiff1d    diff  not  member
> Looking at the source, I think setdiff always works, even for
> non-unique arrays.

Whoops, sorry. setdiff1d really does seem to work for non-unique arrays - it
relies on the setmember1d behaviour above though :) - there is always one
correct False even for repeated entries in the first array.
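
For illustration, a minimal sketch of the non-unique case, assuming a
recent numpy where np.isin is available (it did not exist at the time of
this thread, and setmember1d has since been removed). The hand-built
version is only an assumption about how a set difference can be formed
from a membership mask; it is not the arraysetops source discussed above.

```python
import numpy as np

a = np.array([1, 1, 2, 4, 2])   # first array, with repeated entries
b = np.array([3, 2, 4])

# Set difference: the unique values of a that do not occur in b.
diff = np.setdiff1d(a, b)

# The same result built by hand from a membership mask.
mask = np.isin(a, b)            # boolean array with the same shape as a
by_hand = np.unique(a[~mask])   # keep non-members, then drop repeats

print(diff)
print(mask)
print(by_hand)
```

Both diff and by_hand contain only the value 1 here, even though 1 is
repeated in the first array.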

r.