[Numpy-discussion] in1d, but preserve shape of ar1

Stephan Hoyer shoyer at gmail.com
Mon Dec 19 20:43:41 EST 2016


I think this is a great idea!

I agree that we need a new function. Because the new API is almost strictly
superior, we should try to pick a more general name that we can encourage
users to switch to from in1d.

Pandas calls this method "isin", which I think is a perfectly good name for
the multi-dimensional NumPy version, too:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.isin.html
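
For concreteness, here is a minimal sketch of the shape-preserving behavior
under discussion. The name `isin_sketch` is hypothetical, and the brute-force
comparison only stands in for the in1d call from the proposal so that the
example is self-contained; it is not how NumPy implements in1d:

```python
import numpy as np

def isin_sketch(element, test_elements):
    """Hypothetical shape-preserving membership test (illustration only)."""
    element = np.asarray(element)
    test_elements = np.asarray(test_elements).ravel()
    # Compare every flattened element against every test element; this
    # brute-force step stands in for the in1d call discussed in the thread.
    flat = (element.ravel()[:, None] == test_elements[None, :]).any(axis=1)
    # Restore the shape of the first argument.
    return flat.reshape(element.shape)

a = np.array([[0, 1, 2], [3, 4, 5]])
mask = isin_sketch(a, [1, 4, 5])
print(mask.shape)  # (2, 3)
```

The result has the same shape as the first argument, so it can be used
directly as a boolean mask, e.g. `a[mask]`.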

It's a subjective call, but I would probably keep the new function in
arraysetops.py. (This is the sort of question well suited to GitHub rather
than the mailing list, though.)


On Mon, Dec 19, 2016 at 3:25 PM, Brenton R S Recht <brstone at gmail.com>
wrote:

> I started an enhancement request in the GitHub bug tracker at
> https://github.com/numpy/numpy/issues/8331 , but Jaime Frio recommended I
> bring it to the mailing list.
>
> `in1d` takes two arrays, `ar1` and `ar2`, and returns a 1d array with the
> same number of elements as `ar1`. The logical extension would be a function
> that does the same thing but returns a (possibly multi-dimensional) array
> of the same shape as `ar1`. The code already has a comment suggesting this
> could be done (see
> https://github.com/numpy/numpy/blob/master/numpy/lib/arraysetops.py#L444 ).
>
> I agree that changing the behavior of the existing function isn't an
> option, since it would break backwards compatibility. I'm not sure adding
> an option keep_shape is good, since the name of the function ("1d")
> wouldn't match what it does (returns an array that might not be 1d). I
> think a new function is the way to go. This would be it, more or less:
>
> def items_in(ar1, ar2, **kwargs):
>   # Convert first so that lists and other array-likes have a .shape.
>   ar1 = np.asarray(ar1)
>   return np.in1d(ar1, ar2, **kwargs).reshape(ar1.shape)
>
> Questions I have are:
> * Function name? I was thinking something like `items_in` or `item_in`:
> the function returns whether each item in `ar1` is in `ar2`. Is "item" or
> "element" the right term here?
> * Are there any other changes that need to happen in arraysetops.py? Or
> other files? I ask this because although the file says "Set operations for
> 1D numeric arrays" right at the top, it's growing increasingly not 1D:
> `unique` recently changed to operate on multidimensional arrays, and I'm
> proposing a multidimensional version of `in1d`. `ediff1d` could probably be
> tweaked into a version that operates along an axis the same way unique does
> now, fwiw. Mostly I want to know if I should put my code changes in this
> file or somewhere else.
>
> Thanks,
>
> -brsr