[Numpy-discussion] What should np.ndarray.__contains__ do

Nathaniel Smith njs at pobox.com
Tue Feb 26 05:44:53 EST 2013


On Tue, Feb 26, 2013 at 10:21 AM, Sebastian Berg
<sebastian at sipsolutions.net> wrote:
> On Mon, 2013-02-25 at 16:33 +0000, Nathaniel Smith wrote:
>> On Mon, Feb 25, 2013 at 3:10 PM, Sebastian Berg
>> <sebastian at sipsolutions.net> wrote:
>> > Hello all,
>> >
>> > Currently the `__contains__` method (i.e. the `in` operator) on arrays does
>> > not return what the user would expect when, in the operation `a in b`, `a`
>> > is not a single element (see "In [3]-[4]" below).
>>
>> True, I did not expect that!
>>
> <snip>
>
>> The two approaches that I can see, and which generalize the behaviour
>> of simple Python lists in natural ways, are:
>>
>> a) the left argument is coerced to a scalar of the appropriate type,
>> then we check if that value appears anywhere in the array (basically
>> raveling the right argument).
>>
>
> How did I misread that? I guess you mean element and never subarray
> matching. Actually I am starting to think that is best. Subarray
> matching may be useful, but would probably be better off inside its own
> function.
> That also might be best with object arrays, since it is difficult to
> know if the user means a tuple as an element or a two element subarray,
> unless you say "input is array-like", which is possible (or more
> sensible) for a function.
>
> That would mean simply turning the use cases that currently give weird
> results into errors. Those errors could perhaps point to np.in1d and, if
> numpy ever gets one, to a dedicated subarray matching function.
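
To make the element-only reading concrete, here is a minimal sketch, not
anything numpy ships. The name `contains_element` is hypothetical, and using
`np.ndim` to decide what counts as a scalar is an assumption that sidesteps
the object-array ambiguity mentioned above. The last line shows the
broadcasting comparison that, as far as I can tell, is behind the surprising
results today.

```python
import numpy as np

def contains_element(arr, value):
    """Hypothetical helper: element-only membership test (option a)."""
    arr = np.asarray(arr)
    if np.ndim(value) != 0:
        # Reject sequences/arrays instead of silently broadcasting them.
        raise TypeError(
            "expected a scalar; see np.in1d for testing several values"
        )
    # Compare against every element, ignoring the array's shape.
    return bool((arr.ravel() == value).any())

b = np.array([[1, 3], [4, 2]])
found = contains_element(b, 4)             # 4 is an element of b
missing = contains_element(b, 5)           # 5 is not
# Broadcasting comparison: there is no row [1, 2] in b, yet this is True.
broadcast_hit = bool((b == [1, 2]).any())
```

With this reading, `contains_element(b, [1, 2])` raises a TypeError rather
than returning a value that depends on broadcasting.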

Yeah, I don't have a strong opinion on whether or not sub-array
matching should work, but personally I lean towards "not". There's
precedent both ways:

In [2]: "bc" in "abcd"
Out[2]: True

In [3]: ["b", "c"] in ["a", "b", "c", "d"]
Out[3]: False

But arrays are much more like lists than they are like strings. And
it's not clear to what extent anyone even wants this kind of sub-array
matching. (I can't think of any use cases, really. I can for a version
that returns all the match locations, or a similarity map, like
cv.MatchTemplate, but not for this itself...) And there's a lot of
ambiguity about which axes are matched with which axes. So maybe
subarray matching is better off in its own function that can have some
extra arguments and more flexibility in its return value and so forth.
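
As a rough illustration of the "returns all the match locations" variant,
here is a sketch restricted to 1-D inputs, which avoids the question of which
axes are matched with which. The name `find_subarray` and its behaviour
(including returning no matches for an empty needle) are assumptions for
illustration only, not an existing or proposed numpy API.

```python
import numpy as np

def find_subarray(haystack, needle):
    """Hypothetical helper: start indices where 1-D needle occurs in 1-D haystack."""
    haystack = np.asarray(haystack)
    needle = np.asarray(needle)
    if haystack.ndim != 1 or needle.ndim != 1:
        raise ValueError("this sketch only handles 1-D inputs")
    n = needle.size
    if n == 0 or n > haystack.size:
        # Assumed convention: nothing to match means no match locations.
        return np.array([], dtype=np.intp)
    # Build every length-n window: row i holds haystack[i:i+n].
    starts = np.arange(haystack.size - n + 1)
    windows = haystack[starts[:, None] + np.arange(n)]
    # A window matches when all of its elements equal the needle's.
    return np.flatnonzero((windows == needle).all(axis=1))

matches = find_subarray([1, 2, 3, 2, 3], [2, 3])
```

Returning indices rather than a bool is the extra flexibility in the return
value: a caller who only wants a yes/no answer can check `matches.size > 0`.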

-n


