[MATRIX-SIG] reverse of take?

Andrew P. Mullhaupt amullhau@ix.netcom.com
Tue, 01 Jul 1997 14:15:35 -0400


At 07:41 AM 7/1/97 -0700, Zane C. Motteler wrote:
>On Mon, 30 Jun 1997, Andrew P. Mullhaupt wrote:
>
>>
>>Can you index a multi-dimensional array with a one-dimensional array?
>>
>>Can you index a one-dimensional array with a multi-dimensional array?
>
>Both cases are ambiguous in NumPy. NumPy arrays are not necessarily
>stored contiguously (transpose merely exchanges some dimensions and
>strides, but doesn't move any elements).

They shouldn't be ambiguous at all. And if you're thinking of what I
think you're thinking of - well, I don't think anyone would want _that_ at
all. I think you interpret the indices as referring to the _locations at
which the elements are stored_ and this is not always true.

The way to sensibly index a multi-dimensional array with a one-dimensional
array is _not_ to interpret the indices in terms of the possibly unpredictable
order the elements are stored in, but to treat the one-dimensional array as
indices into the 'full dimensional' indices of the array as determined by
the shape of the array. Thus, it is completely unambiguous, independent of
the order in which the actual elements are laid out, and it also
corresponds to what most users would expect.
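
A minimal sketch of that rule in plain Python, using nested lists so that
nothing depends on memory layout. The helper names `unravel` and
`index_by_vector` are hypothetical, and the choice that the last index varies
fastest is an assumption made only for illustration; the point is that the
flat indices are decoded from the shape, not from where elements are stored.

```python
def unravel(k, shape):
    """Decode flat index k into one index per dimension of `shape`.

    Assumption for this sketch: the last index varies fastest.
    """
    idx = []
    for n in reversed(shape):
        k, r = divmod(k, n)
        idx.append(r)
    return tuple(reversed(idx))


def index_by_vector(a, shape, ks):
    """Index nested-list array `a` (of the given shape) with flat indices ks."""
    out = []
    for k in ks:
        elem = a
        for i in unravel(k, shape):
            elem = elem[i]
        out.append(elem)
    return out


a = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]
picked = index_by_vector(a, (3, 4), [0, 5, 11])

# The transpose of `a`, written out as its own nested list (shape 4 x 3).
t = [[0, 4, 8],
     [1, 5, 9],
     [2, 6, 10],
     [3, 7, 11]]
picked_t = index_by_vector(t, (4, 3), [0, 1, 2])
```

Here `picked` is `[0, 5, 11]` and `picked_t` is `[0, 4, 8]`: the result follows
the logical shape of each array, whatever the underlying storage might be.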

The other case, indexing a one-dimensional array by a multi-dimensional
array is also completely unambiguous - you are using the elements of the
multi-dimensional array as indices into the one-d array and then giving the
result the shape of the original m-d array. Similar to the previous case,
this does not depend on the order in which the elements of the one-d array
are actually stored.
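
The same idea as a short plain-Python sketch, again with nested lists. The
helper name `index_by_array` is hypothetical; it only illustrates that the
result takes the shape of the index array.

```python
def index_by_array(v, idx):
    """Index the one-d list `v` with a (possibly nested) list of indices.

    The result has the same nesting, i.e. the same shape, as `idx`.
    """
    if isinstance(idx, list):
        return [index_by_array(v, sub) for sub in idx]
    return v[idx]


v = [10, 20, 30, 40]
idx = [[0, 3],
       [2, 2],
       [1, 0]]
result = index_by_array(v, idx)
```

With these inputs `result` is `[[10, 40], [30, 30], [20, 10]]`, a 3 x 2 array
like `idx`.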

> Secondly, to do these two things, even if arrays are contiguous,
>means that users have to know whether arrays are stored in row-major or
>column-major order, which is antithetical to the idea of a high-level
>language.

Actually, it doesn't mean that at all. It is quite clear that you can
construct a mathematical theory of APL indexing, which supports these
constructs without any reference to row or column majority. (I've alluded
to this mathematical construction before.) In fact, as I have pointed out
above, the _indexing_ can be (and in some cases in python _needs_ to be)
independent of the storage order.

However, for reasons of speed, whether you tell them or not, people will
find out many things about how the elements of arrays are stored. The APL
implementation community tried very hard for a long time to hide this from
users to no avail.

How, for example, will a user be able to pass an array to his favorite
multi-dimensional graphics package unless he can put the elements in the
right places for that package? It is quite simple to see that users _need_
to be able to find out, in any particular case, as necessary, the order of
the elements.
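
As an illustration only, here is a plain-Python sketch of producing the two
usual element orders from the same two-dimensional array. The function names
are hypothetical, and which order a given package expects is something the
user would have to find out for that package.

```python
def flatten_row_major(a):
    """Flatten a nested list so that the last index varies fastest."""
    return [x for row in a for x in row]


def flatten_column_major(a):
    """Flatten a nested list so that the first index varies fastest."""
    return [row[j] for j in range(len(a[0])) for row in a]


a = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]
row_major = flatten_row_major(a)
col_major = flatten_column_major(a)
```

Here `row_major` is `[0, 1, ..., 11]` and `col_major` is
`[0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]`.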

The goal of the higher level language is to reduce this need to a minimum,
_but no further_.

>>
>>If array0 is d-dimensional and arrayk is 1-dimensional for  1<=k<=d, does
>>array0[array1, ..., arrayd] refer to the elements corresponding to the
>>Cartesian product of array1 ... arrayd? It should. What is the shape
>>of the result when some of the index arrays have length 1?
>
>Getting too esoteric here. I'm not even too sure how useful this would
>be. It really demands intimate knowledge of the internal structure of
>arrays.

I think you're confused about the meaning of this based on your previous
statements.

As to useful, it is perhaps the single most common case of indexing in
array languages, and not only that, perhaps the _most efficient_.

Here's an example:

    a = reshape(range(12), [3,4])

which looks like

     0  1  2  3
     4  5  6  7
     8  9 10 11

then you want

    a[[0,2], [2,3]]

to be

     2  3
    10 11

right? I think a lot of people have been thinking that this would be the
case (like APL, S, Matlab, etc.).
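
A plain-Python sketch of the Cartesian-product behavior for the
two-dimensional case, using nested lists. The helper name `cartesian_index`
is hypothetical and this is not a proposal for the actual implementation.

```python
def cartesian_index(a, rows, cols):
    """Select every combination of the given row and column indices."""
    return [[a[r][c] for c in cols] for r in rows]


a = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]
sub = cartesian_index(a, [0, 2], [2, 3])
```

This gives `sub == [[2, 3], [10, 11]]`, matching the table above. (For readers
on present-day NumPy: there `a[[0, 2], [2, 3]]` pairs the indices and yields
`[2, 11]`, while `a[np.ix_([0, 2], [2, 3])]` gives the Cartesian product.)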

The question here is actually not _can_ you do this. (I can easily give a
good implementation of what I have outlined so far, but we are not at that
point yet.) The indexing questions I have asked are all about determining
what behavior
people want. There are many consistent behaviors which people coming from
different languages are already happy with, and the trick here is to figure
out which one people want. There are _no_ issues of "can't get there from
here" in the context of Python's particular legacy choices - the questions
are all of the "do we _want_ to get there from here" sort.

There is no question that indexing will get improved in the way people use
Numerical Python - that is not the issue either. Too many people have done
too much cool stuff (for decades) for that to be a serious issue.

The question is not _how these things can be done_. Almost everything that
anyone has suggested so far falls well within the scope of things that I
know can be done quite practically, and other people may know how to do
many things I don't, so as far as we have seen to date, it's all quite doable.

The serious issue is _which flavor to choose_. It will be a really good
thing if the people in the discussion make themselves familiar with the
various choices before trying to implement something serious - or else we
will end up with several parallel efforts which conflict in some aspects. I can
think of at least two completely reasonable ways to specify indexing which
are not compatible. For example, consider APL's rigid respect of dimensions
as opposed to S's "cyclic reuse" policy: there are advantages and
disadvantages to both. I know which one I like better, but either would be
fine for python.
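
To make the contrast concrete, here is a loose plain-Python sketch of the two
policies applied to indexed assignment. The function names are hypothetical
and neither is a faithful emulation of APL or S; they only show why the two
conventions cannot share one syntax.

```python
from itertools import cycle


def assign_strict(v, idx, values):
    """Strict policy: the index and value lengths must agree exactly."""
    if len(idx) != len(values):
        raise ValueError("length mismatch between indices and values")
    out = list(v)
    for i, x in zip(idx, values):
        out[i] = x
    return out


def assign_cyclic(v, idx, values):
    """Cyclic-reuse policy: the values are recycled to cover every index."""
    out = list(v)
    for i, x in zip(idx, cycle(values)):
        out[i] = x
    return out


v = [0, 0, 0, 0]
cyclic_result = assign_cyclic(v, [0, 1, 2, 3], [7, 8])

try:
    assign_strict(v, [0, 1, 2, 3], [7, 8])
    strict_raised = False
except ValueError:
    strict_raised = True
```

The cyclic version returns `[7, 8, 7, 8]`, while the strict version rejects
the same call, so `strict_raised` ends up `True`.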

The reason that indexing is _different_ than other cases where the new
functionality has "function" syntax is that _because_ we call it "indexing"
we are all talking about syntax which is going to overlap (read collide)
heavily in different implementations. So it is likely that only _one_
approach can survive. It would be really nice if this choice isn't
determined solely by the first version to be implemented.

Later,
Andrew Mullhaupt

_______________
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
_______________