[Numpy-discussion] Advanced indexing: "fancy" vs. orthogonal

Eric Firing efiring at hawaii.edu
Thu Apr 2 14:03:27 EDT 2015


On 2015/04/02 4:15 AM, Jaime Fernández del Río wrote:
> We probably need more traction on the "should this be done?" discussion
> than on the "can this be done?" one, the need for a reordering of the
> axes swings me slightly in favor, but I mostly don't see it yet.

As a long-time user of numpy, and an advocate and teacher of Python for 
science, here is my perspective:

Fancy indexing is a horrible design mistake--a case of cleverness run 
amok.  As you can read in the Numpy documentation, it is hard to 
explain, hard to understand, hard to remember.  Its use easily leads to 
unreadable code and hard-to-see errors.  Here is the essence of an 
example that a student presented me with just this week, in the context 
of reordering eigenvectors based on argsort applied to eigenvalues:

In [25]: xx = np.arange(2*3*4).reshape((2, 3, 4))

In [26]: ii = np.arange(4)

In [27]: print(xx[0])
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

In [28]: print(xx[0, :, ii])
[[ 0  4  8]
 [ 1  5  9]
 [ 2  6 10]
 [ 3  7 11]]

Quickly now, how many numpy users would look at that last expression and 
say, "Of course, that is equivalent to transposing xx[0]"?  And, "Of 
course, that expression should give a completely different result from 
xx[0][:, ii]"?

I would guess fewer than 1% would.  That should tell you right away 
that we have a real problem here.  Fancy indexing can't be *read* by a 
sub-genius--it has to be laboriously figured out piece by piece, with 
frequent reference to the baffling descriptions in the Numpy docs.
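
The difference between the two expressions can be checked directly. This is 
only a sketch using the same xx and ii as in the session above; the names 
fancy and chained are introduced here for illustration:

```python
import numpy as np

xx = np.arange(2 * 3 * 4).reshape((2, 3, 4))
ii = np.arange(4)

# The integer 0 and the array ii are both treated as advanced indices.
# Because a slice separates them, the broadcast index dimension (length 4)
# is moved to the front of the result, ahead of the sliced axis (length 3).
fancy = xx[0, :, ii]

# Indexing in two steps: xx[0] is a plain (3, 4) array, and ii then
# selects columns along its last axis, which stays in place.
chained = xx[0][:, ii]

print(fancy.shape)    # (4, 3)
print(chained.shape)  # (3, 4)
print(np.array_equal(fancy, xx[0].T))  # True
print(np.array_equal(chained, xx[0]))  # True
```

With ii = np.arange(4) the selection itself is a no-op, so the only visible 
effect of the one-step form is the unexpected transpose.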

So I think you should turn the question around and ask, "What is the 
actual real-world use case for fancy indexing?"  How often does real 
code rely on it?  I have taken advantage of it occasionally, maybe you 
have too, but I think a survey of existing code would show that the need 
for it is *far* less common than the need for simple orthogonal 
indexing.  That tells me that it is fancy indexing, not orthogonal 
indexing, that should be available through a function and/or special 
indexing attribute.  The question is then how to make that transition.
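
For contrast, here is a small sketch of the two behaviours on a 2-D array. 
np.ix_ is an existing NumPy helper that builds an orthogonal (outer) index 
from 1-D sequences; the array a and the lists rows and cols are made up for 
illustration:

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
rows = [0, 2]
cols = [1, 3]

# Fancy indexing pairs the index arrays element by element:
# it picks a[0, 1] and a[2, 3], giving a 1-D result.
paired = a[rows, cols]

# Orthogonal indexing takes every combination of the selected rows
# and columns, giving the 2x2 sub-block.
block = a[np.ix_(rows, cols)]

print(paired)  # [ 1 11]
print(block)
# [[ 1  3]
#  [ 9 11]]
```

Today the orthogonal form is the one that needs the helper function; the 
proposal above would swap those roles.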

Eric