[PYTHON MATRIX-SIG] A problem with slicing
Jim Fulton, U.S. Geological Survey
jfulton@usgs.gov
Thu, 14 Sep 1995 10:15:42 -0400
On Thu, 14 Sep 1995 09:20:28 -0400
Guido van Rossum said:
> [The third in a series of short essays on subjects raised in the
> Matrix discussion.]
>
> Here's a problem where I have neither a strong opinion nor a perfect
> solution...
>
> Jim Fulton proposes an elegant indexing syntax for matrix objects
> which doesn't require any changes to the language:
>
> M[i][j]
>
> references the element at column i and row j (or was that column j and
> row i? Never mind...).
Actually, it's element j of sub-matrix i. If M is a 2-d matrix, then
you may choose to call submatrices either "rows" or "columns". I
prefer "columns".
> This nicely generalizes to slicing, so you can write:
>
> M[i][j1:j2]
>
> meaning the column vector at column i with row indices j1...j2-1.
>
> Unfortunately, the analogous expression for a row vector won't work:
>
> M[i1:i2][j]
>
> The reason for this is that it works by interpreting M as a sequence
> of columns (and it's all evaluated one thing at a time -- M[i][j]
> means (M[i])[j], and so on). M[i] is column i, so M[i][j] is the
> element at row j thereof. But slice semantics imply that if M is a
> sequence of X'es, then M[i1:j1] is still a sequence of X'es -- just
> shorter. So M[p:q][r] is really the same as M[p+r] (assuming r<q-p).
>
>
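The slicing problem described above can be reproduced with ordinary
nested lists. This is only a sketch: it assumes a matrix stored as a
list of columns, and the names M, p, q and r are made up for
illustration.

```python
# Hypothetical 3x2 matrix stored as a list of columns (an assumption).
M = [[1, 2], [3, 4], [5, 6]]      # M[i] is column i

p, q, r = 1, 3, 1
shorter = M[p:q]                  # still a sequence of columns, just shorter
whole_column = shorter[r]         # indexing picks a column, not a row vector

# What a "row vector" at row r would actually need: one element per column.
row_vector = [column[r] for column in M[p:q]]
```

Here M[p:q][r] gives back the whole column M[p+r], while the row
vector has to be built by visiting each column in the slice.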
> One way out of this is to adopt the syntax
>
> M[i, j]
>
> for simple indexing. This would require only a minor tweaking of the
> grammar I believe.
In fact, this could be as simple as saying that the comma operator
generates tuples inside of []s. That is:
M[i,j] is equivalent to M[(i,j)].
or even:
M[i,] is equivalent to M[(i,)]
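As a small check of this equivalence, here is a sketch that behaves
this way in present-day Python, where a comma inside []s does build a
tuple. The KeyEcho class is a hypothetical helper, not part of any
matrix proposal; it simply hands back whatever key [] received.

```python
class KeyEcho:
    """Hypothetical helper: returns the key passed to the [] operator."""

    def __getitem__(self, key):
        return key


M = KeyEcho()
i, j = 2, 5

pair = M[i, j]        # the comma builds a tuple, same as M[(i, j)]
single = M[i,]        # a trailing comma builds a 1-tuple, same as M[(i,)]
plain = M[i]          # no comma: the key is the bare integer
```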
> This could be extended to support
>
> M[i1:i2, j]
> M[i1:i2, j1:j2]
> M[i, j1:j2]
>
> (and of course higher-dimensional equivalents).
>
> This would require considerable changes to the run-time architecture
> of slicing and indexing, since currently everything is geared
> towards one-dimensional indexing/slicing, but I suppose it would be
> doable.
I agree.
> (Funny how I'm accepting this possibility of changing the language
> here, while I'm violently opposed to it for operator definitions. I
Yeah. Strange even! ;-)
> guess with adding operators there is no end to the number of new
> operators you could dream up, so there would be no end to the change;
> while here there's a clear-cut one-time change.)
Hm.
I really don't think this is a good idea. I don't really think we
need M[i1:i2, j1:j2]. M[range(i1,i2),range(j1,j2)] is fine for me.
Plus, it also allows: M[(1,3,5),(2,4,6)], in other words, we can
simply allow a sequence of indexes for a dimension and then let range
generate the desired sequence when we want a range.
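A rough sketch of what I mean, assuming a hypothetical column-major
matrix class (SeqIndexMatrix is a made-up name) and assuming that the
result takes every combination of the listed column and row indexes:

```python
class SeqIndexMatrix:
    """Hypothetical matrix stored as a list of columns (an assumption)."""

    def __init__(self, columns):
        self.columns = [list(c) for c in columns]

    def __getitem__(self, key):
        # key is a tuple holding one sequence of indexes per dimension.
        cols, rows = key
        return SeqIndexMatrix(
            [[self.columns[i][j] for j in rows] for i in cols]
        )


M = SeqIndexMatrix([[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]])

sub = M[range(0, 2), range(1, 3)]   # plays the role of M[0:2, 1:3]
picked = M[(0, 2), (1, 3)]          # arbitrary index sequences also work
```

No slice syntax is involved; range() supplies the index sequence
whenever a contiguous run is wanted.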
>
> Of course adopting such a change would completely ruin any possibility
> of using things like
>
> M[3, 4, 7] = [1, 10, 100]
>
> as roughly equivalent to
>
> M[3] = 1
> M[4] = 10
> M[7] = 100
>
> but then again I'm not too fond of that anyway (as a matter of fact,
> I'd oppose it strongly).
>
>
> Some other things that I haven't completely followed through, and that
> may cause complications for the theoretical foundation of it all:
>
> - Allowing M[i, j] for (multidimensional) sequence types would also
> mean that D[i, j] would be equivalent to D[(i, j)] for
> dictionaries.
I see no reason to support M[i,j] for arbitrary sequence types. I'd
say that if a type wants to support multiple arguments to [], then it
should provide mapping behavior and have the mapping implementation
sniff for either an integer or a tuple argument and do the right
thing.
I am *very much* against a language change to support this.
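To illustrate the "sniffing" idea, here is a minimal sketch. The
SniffingMatrix class and its list-of-columns storage are assumptions
made for the example, not a proposed implementation.

```python
class SniffingMatrix:
    """Hypothetical matrix whose [] accepts an integer or a tuple."""

    def __init__(self, columns):
        self.columns = [list(c) for c in columns]

    def __getitem__(self, key):
        if isinstance(key, tuple):      # M[i, j]: tuple argument
            i, j = key
            return self.columns[i][j]
        if isinstance(key, int):        # M[i]: plain integer argument
            return self.columns[key]
        raise TypeError("index must be an integer or a tuple of integers")


M = SniffingMatrix([[1, 2], [3, 4], [5, 6]])

element = M[2, 1]     # tuple key: element 1 of sub-matrix 2
column = M[2]         # integer key: sub-matrix 2 as a whole
chained = M[2][1]     # ordinary chained indexing still works
```

With this arrangement M[i][j] and M[i, j] name the same element, and
the type itself decides how to interpret the key.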
> - Should M[i][j] still be equivalent to M[i, j]?
Yes. M[i,j] is really a compact form of M[((i),(j))].
> - Now we have multidimensional sequence types, should we have a
> multidimensional equivalent of len()? Some ideas:
I'm against multi-dimension sequence types. 8->
> - len(a, i) would return a's length in dimension i; len(a, 0) == len(a)
>
> - dim(a) (or rank(a)?) would return the number of dimensions
>
> - shape(a) would return a tuple giving a's dimensions, e.g. for a
> 3x4 matrix it would return (3, 4), and for a one-dimensional
> sequence such as a string or list, it would return a singleton
> tuple: (len(a),).
Unnecessary. Matrices can provide special methods for this.
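For instance, a matrix type could expose this information itself. The
sketch below assumes a hypothetical ShapedMatrix class stored as a
list of equal-length columns; the method names shape and rank are
borrowed from the suggestions quoted above.

```python
class ShapedMatrix:
    """Hypothetical 2-d matrix stored as a list of equal-length columns."""

    def __init__(self, columns):
        self.columns = [list(c) for c in columns]

    def shape(self):
        # (number of sub-matrices, length of each sub-matrix)
        inner = len(self.columns[0]) if self.columns else 0
        return (len(self.columns), inner)

    def rank(self):
        # Number of dimensions.
        return len(self.shape())


M = ShapedMatrix([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
dims = M.shape()
ndims = M.rank()
```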
> - How about multidimensional map(), filter(), reduce()?
>
> - map() seems easy (except there seems to be no easy way to specify
> the rank of the operator): it returns a similarly shaped
> multidimensional object whose elements are the function results for
> the corresponding elements of the input matrix
>
> - filter() is problematic since the columns won't be of the same
> length
>
> - reduce()??? -- someone who knows APL tell me what it should mean
There have been a number of proposals for generic functions that
operate over matrices in some fashion. I have not had time to digest
them yet. Stay tuned. :-) (Geez, I really need to get back to my day
job.)
> - Multidimensional for loops? Or should for iterate over the first
> dimension only?
What is wrong with nested for loops?
Ee-gads, what's gotten into you? :-]
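A sketch of the nested-loop approach, assuming a matrix that iterates
over its sub-matrices the way a list of columns does (the data here is
made up):

```python
M = [[1, 2], [3, 4], [5, 6]]   # hypothetical list-of-columns matrix

total = 0
for column in M:               # the outer loop walks the first dimension
    for element in column:     # the inner loop walks the second dimension
        total += element
```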
> One sees there are many potential consequences of a seemingly simple
> change --
Simple?
I agree that enabling the tuplefication operator, ",", in []s is
simple, but not adding multi-dimensional behavior to sequences.
> that's why I insist that language changes be thought through
> in extreme detail before being introduced...
I really don't see any reason why the matrix type should require
language changes (aside from the minor impact of the tuplefication
operator).
Jim
=================
MATRIX-SIG - SIG on Matrix Math for Python
send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
=================