[Numpy-discussion] Use-case for np.choose

Sun Nov 8 22:03:12 EST 2009

OK, now I'm trying to wrap my brain around broadcasting in choose when both
`a` *and* `choices` need to be (non-trivially) broadcast in order to arrive
at a common shape, e.g.:

>>> c=np.arange(4).reshape((2,1,2)) # shape is (2,1,2)
>>> a=np.eye(2, dtype=int) # shape is (2,2)
>>> np.choose(a,c)
array([[2, 1],
       [0, 3]])

(Unfortunately, the implementation is in C, so I can't easily insert print
statements to see intermediate results.)

First, let me confirm that the above is indeed an example of what I think it
is, i.e., both `a` and `choices` are broadcast in order for this to work,
correct?  (And if incorrect, how is one broadcast to the shape of the
other?)  Second, both are broadcast to shape (2,2,2), correct?  But how,
precisely, i.e., does c become

[[[0, 1], [2, 3]],        [[[0, 1], [0, 1]],
 [[0, 1], [2, 3]]]   or   [[2, 3], [2, 3]]]

and same question for a?  Then, once a is broadcast to a (2,2,2) shape,
precisely how does it "pick and choose" from c to create a (2,2) result?
For example, suppose a is broadcast to:

[[[1, 0], [0, 1]],
 [[1, 0], [0, 1]]]

(as indicated above, I'm uncertain at this point if this is indeed what a is
broadcast to); how does this create the (2,2) result obtained above?
(Obviously this depends in part on precisely how c is broadcast, I do
recognize that much.)

Finally, a seemingly relevant comment in the C source is:

/* Broadcast all arrays to each other, index array at the end.*/

This would appear to confirm that "co-broadcasting" is performed if
necessary, but what does the "index array at the end" phrase mean?

Thanks for your continued patience and tutelage.

DG

On Sun, Nov 8, 2009 at 5:36 AM, <josef.pktd at gmail.com> wrote:

> On Sun, Nov 8, 2009 at 5:00 AM, David Goldsmith <d.l.goldsmith at gmail.com>
> wrote:
> > On Sun, Nov 8, 2009 at 12:57 AM, Anne Archibald <
> peridot.faceted at gmail.com>
> > wrote:
> >>
> >> 2009/11/8 David Goldsmith <d.l.goldsmith at gmail.com>:
> >> > On Sat, Nov 7, 2009 at 11:59 PM, Anne Archibald
> >> > <peridot.faceted at gmail.com>
> >> > wrote:
> >> >>
> >> >> 2009/11/7 David Goldsmith <d.l.goldsmith at gmail.com>:
> >> >> > So in essence, at least as it presently functions, the shape of 'a'
> >> >> > *defines* what the individual choices are within 'choices`, and if
> >> >> > 'choices'
> >> >> > can't be parsed into an integer number of such individual choices,
> >> >> > that's
> >> >> > when an exception is raised?
> >> >>
> >> >> Um, I don't think so.
> >> >>
> >> >> Think of it this way: you provide np.choose with a selector array, a,
> >> >> and a list (not array!) [c0, c1, ..., cM] of choices. You construct
> an
> >> >> output array, say r, the same shape as a (no matter how many
> >> >> dimensions it has).
> >> >
> >> > Except that I haven't yet seen a working example with 'a' greater than
> >> > 1-D,
> >> > Josef's last two examples notwithstanding; or is that what you're
> saying
> >> > is
> >> > the bug.
> >>
> >> There's nothing magic about A being one-dimensional.
> >>
> >> C = np.random.randn(2,3,5)
> >> A = (C>-1).astype(int) + (C>0).astype(int) + (C>1).astype(int)
> >>
> >> R = np.choose(A, (-1, -C, C, 1))
> >
> > OK, now I get it: np.choose(A[0,:,:], (-1,-C,C,-1)) and
> > np.choose(A[0,:,0].reshape((3,1)), (-1,-C,C,1)), e.g., also work, but
> > np.choose(A[0,:,0], (-1,-C,C,-1)) doesn't - what's necessary for choose's
> > arguments is that both can be broadcast to a common shape (as you state
> > below), but choose won't reshape the arguments for you to make this
> > possible, you have to do so yourself first, if necessary.  That does
> appear
> > to be what's happening now; but do we want choose to be smarter than that
> > (e.g., for np.choose(A[0,:,0], (-1,-C,C,-1)) to work, so that the user
> > doesn't need to include the .reshape((3,1)))?
>
> No, I don't think we want to be that smart.
>
> If standard broadcasting rules apply, as I think they do, then I wouldn't
> want
> any special newaxis or reshapes done automatically. It will be confusing,
> the function wouldn't know what to do if there are, e.g., as many rows as
> columns, and this looks like a big source of errors.
> Standard broadcasting is pretty nice (once I got the hang of it), and
> adding
> a lot of np.newaxis (or some reshapes) to the code is only a small price to
> pay.
>
> Josef
>
>
>
> >
> > DG
> >
> >>
> >> Requv = np.minimum(np.abs(C),1)
> >>
> >> or:
> >>
> >> def wedge(*functions):
> >>     """Return a function whose value is the minimum of those of
> >> functions"""
> >>     def wedgef(X):
> >>          fXs = [f(X) for f in functions]
> >>          A = np.argmin(fXs, axis=0)
> >>          return np.choose(A,fXs)
> >>     return wedgef
> >>
> >> so e.g. np.abs is -wedge(lambda X: X, lambda X: -X)
> >>
> >> This works no matter what shape of X the user supplies - so a wedged
> >> function can be somewhat ufunclike - by making A the same shape.
> >>
> >> >> The (i0, i1, ..., iN) element of the output array
> >> >> is obtained by looking at the (i0, i1, ..., iN) element of a, which
> >> >> should be an integer no larger than M; say j. Then r[i0, i1, ..., iN]
> >> >> = cj[i0, i1, ..., iN]. That is, each element of the selector array
> >> >> determines which of the choice arrays to pull the corresponding
> >> >> element from.
> >> >
> >> > That's pretty clear (thanks for doing my work for me). ;-), Yet, see
> >> > above.
> >> >
> >> >> For example, suppose that you are processing an array C, and have
> >> >> constructed a selector array A the same shape as C in which a value
> is
> >> >> 0, 1, or 2 depending on whether the C value is too small, okay, or
> too
> >> >> big respectively. Then you might do something like:
> >> >>
> >> >> C = np.choose(A, [-inf, C, inf])
> >> >>
> >> >> This is something you might want to do no matter what shape A and C
> >> >> have. It's important not to require that the choices be an array of
> >> >> choices, because they often have quite different shapes (here, two
> are
> >> >> scalars) and it would be wasteful to broadcast them up to the same
> >> >> shape as C, just to stack them.
> >> >
> >> > OK, that's a pretty generic use-case, thanks; let me see if I
> understand
> >> > it
> >> > correctly: A is some how created independently with a 0 everywhere C
> is
> >> > too
> >> > small, a 1 everywhere C is OK, and a 2 everywhere C is too big; then
> >> > np.choose(A, [-inf, C, inf]) creates an array that is -inf everywhere
> C
> >> > is
> >> > too small, inf everywhere C is too large, and C otherwise (and since
> >> > -inf
> >> > and inf are scalars, this implies broadcasting of these is taking
> >> > place).
> >> > This is what you're asserting *should* be the behavior.  So, unless
> >> > there is
> >> > disagreement about this (you yourself said the opposite viewpoint
> might
> >> > rationally be held) np.choose definitely presently has a bug, namely,
> >> > the
> >> > index array can't be of arbitrary shape.
> >>
> >> There seems to be some disagreement between versions, but both Josef
> >> and I find that the index array *can* be arbitrary shape. In numpy
> >> 1.2.1 I find that all the choose items must be the same shape as it,
> >> which I think is a bug.
> >>
> >> What I suggested might be okay was if the index array was not
> >> broadcasted, so that the outputs always had exactly the same shape as
> >> the index array. But upon reflection it's useful to be able to use a
> >> 1-d array to select rows from a set of matrices, so I now think that
> >> all of A and the elements of choose should be broadcast to the same
> >> shape. This seems to be what Josef observes in his version of numpy,
> >> so maybe there's nothing to do.
> >>
> >> Anne
> >>
> >> > DG
> >> >
> >> >>
> >> >> Anne
> >> >> _______________________________________________
> >> >> NumPy-Discussion mailing list
> >> >> NumPy-Discussion at scipy.org
> >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >> >
> >> >
> >> > _______________________________________________
> >> > NumPy-Discussion mailing list
> >> > NumPy-Discussion at scipy.org
> >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >> >
> >> >
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20091108/2cbcd945/attachment.html>