Allowing zero-dimensional subscripts

Thu Jun 8 17:10:40 EDT 2006

Hello,

I discovered that I needed a small change to the Python grammar. I
would like to hear what you think about it.

In two lines:
Currently, the expression "x[]" is a syntax error.
I suggest that it will be evaluated like "x[()]", just as "x[a, b]" is
evaluated like "x[(a, b)]" right now.

In a few more words: Currently, an object can be subscripted by a few
elements, separated by commas. It is evaluated as if the object was
subscripted by a tuple containing those elements. I suggest that an
object will also be subscriptable with no elements at all, and it will
be evaluated as if the object was subscripted by an empty tuple.

It involves no backwards incompatibilities, since we are dealing with
the legalization of a currently illegal syntax.

It is consistent with the current syntax. Consider that these
identities currently hold:

x[i, j, k]  <-->  x[(i, j, k)]
x[i, j]  <-->  x[(i, j)]
x[i, ]  <-->  x[(i, )]
x[i]  <-->  x[(i)]

I suggest that the next identity will hold too:

x[]  <-->  x[()]

I need this in order to be able to refer to zero-dimensional arrays
nicely. In NumPy, you can have arrays with a different number of
dimensions. In order to refer to a value in a two-dimensional array,
you write a[i, j]. In order to refer to a value in a one-dimensional
array, you write a[i]. You can also have a zero-dimensional array,
which holds a single value (a scalar). To refer to its value, you
currently need to write a[()], which is unexpected - the user may not
even know that when he writes a[i, j] he constructs a tuple, so he
won't guess the a[()] syntax. If my suggestion is accepted, he will be
able to write a[] in order to refer to the value, as expected. It will
even work without changing the NumPy package at all!

In the normal use of NumPy, you usually don't encounter
zero-dimensional arrays. However, I'm designing another library for
managing multi-dimensional arrays of data. Its purpose is similiar to
that of a spreadsheet - analyze data and preserve the relations between
a source of a calculation and its destination. In such an environment
you may have a lot of multi-dimensional arrays - for example, the sales
of several products over several time periods. But you may also have a
lot of zero-dimensional arrays, that is, single values - for example,
the income tax. I want the access to the zero-dimensional arrays to be
consistent with the access to the multi-dimensional arrays. Just using
the name of the zero-dimensional array to obtain its value isn't going
to work - the array and the value it contains have to be distinguished.

I have tried to change CPython to support it, and it was fairly easy.
You can see the diff against the current SVN here:
http://python.pastebin.com/768317
The test suite passes without changes, as expected. I didn't include
diffs of autogenerated files. I know almost nothing about the AST, so I
would appreciate it if someone who is familiar with the AST will check
to see if I did it right. It does seem to work, though.

Well, what do you think about this?

Have a good day,
Noam Raphael