Proposed PEP: New style indexing, was Re: Bug in slice type
Bryan Olson
fakeaddress at nowhere.org
Sat Aug 20 15:22:12 EDT 2005
Steven Bethard wrote:
> Well, I couldn't find where the general semantics of a negative stride
> index are defined, but for sequences at least[1]:
>
> "The slice of s from i to j with step k is defined as the sequence of
> items with index x = i + n*k such that 0 <= n < (j-i)/k."
>
> This seems to contradict list behavior though. [...]
The conclusion is inescapable: Python's handling of negative
subscripts is a wart. Indexing from the high end is too useful
to give up, but it should be specified by the slicing/indexing
operation, not by the value of the index expression.
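A small illustration of the wart in today's Python, offered as a sketch. The helper names `last_n_naive` and `last_n_explicit` are made up for this example. Because -0 equals 0, the value of the index expression silently changes what the slice means:

```python
seq = list("abcdefgh")

def last_n_naive(seq, n):
    # Relies on a negative subscript meaning "count from the high end".
    return seq[-n:]

def last_n_explicit(seq, n):
    # States the high-end offset explicitly via the length.
    return seq[len(seq) - n:]

print(last_n_naive(seq, 3))     # ['f', 'g', 'h']
print(last_n_explicit(seq, 3))  # ['f', 'g', 'h']
print(last_n_naive(seq, 0))     # the whole list, because -0 == 0
print(last_n_explicit(seq, 0))  # []
```

The two helpers agree for every positive n and disagree only at n == 0, which is the kind of edge case that tends to go unnoticed.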
PPEP (Proposed Python Enhancement Proposal): New-Style Indexing
Instead of:

    sequence[start : stop : step]

new-style slicing uses the syntax:

    sequence[start ; stop ; step]
It works like current slicing, except that negative start or
stop values do not trigger from-the-high-end interpretation.
Omissions and None work the same as in old-style slicing.
Within the square brackets, the '$' symbol stands for the length
of the sequence being indexed. One can index from the high end by
subtracting an offset from '$'. Instead of:

    seq[3 : -4]

we write:

    seq[3 ; $ - 4]
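Since the ';' and '$' syntax does not exist, the intended semantics can only be approximated with an ordinary function. The sketch below is one possible reading, not a definitive one: `new_style_slice` is a hypothetical name, `len(seq)` plays the role of '$', and out-of-range values are assumed to be clamped the way current slicing clamps them. With a negative step, a stop of -1 is assumed to mean "just before index 0".

```python
def new_style_slice(seq, start=None, stop=None, step=None):
    """Slice without from-the-high-end interpretation of negatives."""
    n = len(seq)
    step = 1 if step is None else step
    if step == 0:
        raise ValueError("slice step cannot be zero")
    if step > 0:
        # Clamp both ends into [0, n]; negatives simply become 0.
        lo = 0 if start is None else min(max(start, 0), n)
        hi = n if stop is None else min(max(stop, 0), n)
        return [seq[i] for i in range(lo, hi, step)]
    # Negative step: clamp into [-1, n - 1]; -1 means "before index 0".
    hi = n - 1 if start is None else min(max(start, -1), n - 1)
    lo = -1 if stop is None else min(max(stop, -1), n - 1)
    return [seq[i] for i in range(hi, lo, step)]

seq = list("abcdefghij")
print(new_style_slice(seq, 3, len(seq) - 4))  # ['d', 'e', 'f'], same as seq[3:-4]
print(new_style_slice(seq, -2, 3))            # ['a', 'b', 'c']; seq[-2:3] is []
print(new_style_slice(seq, 2, -1, -1))        # ['c', 'b', 'a']; seq[2:-1:-1] is []
```

The last call shows the practical payoff: counting down to the start of a sequence no longer needs a special case for a stop value that happens to reach -1.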
When square brackets appear within other square brackets, the
innermost bracket pair determines which sequence '$' describes.
(Perhaps '$$' should be the length of the next containing
bracket pair, '$$$' the one outside that, and so on?)
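To make the nesting rule concrete, here is how two nested subscripts would translate into today's Python, with `len()` standing in for '$'. The lists `outer` and `inner` are invented for the example, and the '$$' line follows the tentative idea in the parenthetical, so treat it as speculative.

```python
outer = ["zero", "one", "two", "three"]
inner = [0, 3, 1]

# Proposed: outer[inner[$ - 1]]
# '$' sits in the innermost brackets, so it means len(inner).
a = outer[inner[len(inner) - 1]]

# Speculative: outer[inner[$$ - 3]]
# '$$' would refer to the next enclosing brackets, so len(outer).
b = outer[inner[len(outer) - 3]]

print(a)  # one
print(b)  # three
```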
So far, I don't think the proposal breaks anything; let's keep
it that way. The next bit is tricky...
Obviously '$' should also work in simple (non-slice) indexing.
Instead of:

    seq[-2]

we write:

    seq[$ - 2]
So really seq[-2] should be out of bounds. Alas, that would
break way too much code. For now, simple indexing with a
negative subscript (and no '$') should continue to index from
the high end, as a deprecated feature. The presence of '$'
always indicates new-style indexing, so a programmer who needs a
negative index to trigger a range error can add a zero-valued
'$' expression to the subscript:

    seq[($ - $) + index]

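In current Python the same effect needs an explicit check. A minimal sketch, assuming a hypothetical helper named `strict_getitem`; it shows the behavior the '$' form would request, with `len(seq)` again standing in for '$':

```python
def strict_getitem(seq, index):
    """Index without from-the-high-end interpretation of negatives."""
    if index < 0:
        raise IndexError("negative index is out of range")
    return seq[index]  # too-large indexes raise IndexError as usual

seq = [10, 20, 30, 40]
print(strict_getitem(seq, len(seq) - 2))  # 30, the proposed seq[$ - 2]

try:
    strict_getitem(seq, -2)
except IndexError as exc:
    print("range error:", exc)
```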
An Alternative Variant:
Suppose instead of using semicolons as the PPEP proposes, we use
commas, as in:

    sequence[start, stop, step]

Commas are already in use to form tuples, and we let them do
just that. A slice is a subscript that is a tuple (or perhaps we
should allow any sequence). We could just as well write:

    index_tuple = (start, stop, step)
    sequence[index_tuple]

This variant *reduces* the number and complexity of rules that
define Python semantics. There is no special interpretation of
the comma, and no need for a distinct slice type.
The '$' character works as in the PPEP above. It is undefined
outside square brackets, but that makes no real difference; the
programmer can use len(sequence).
This variant might break some tricky code, namely code that
already passes a tuple as a subscript.
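The comma variant can be tried out today on a user-defined class, because a comma-separated subscript already reaches `__getitem__` as a tuple. The class below is only a sketch: `CommaSliced` is a hypothetical name, built-in lists do not behave this way, and for brevity it handles positive steps only, clamping negative start or stop values to 0 instead of counting from the high end.

```python
class CommaSliced:
    """List wrapper that treats a tuple subscript as (start, stop, step)."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        if not isinstance(index, tuple):
            # Simple indexing: a negative subscript is out of bounds.
            if index < 0:
                raise IndexError("negative index is out of range")
            return self._items[index]
        if len(index) > 3:
            raise TypeError("expected at most (start, stop, step)")
        # Pad a short tuple with None, mirroring omitted slice fields.
        start, stop, step = (index + (None, None, None))[:3]
        n = len(self._items)
        step = 1 if step is None else step
        if step <= 0:
            raise ValueError("this sketch handles positive steps only")
        lo = 0 if start is None else min(max(start, 0), n)
        hi = n if stop is None else min(max(stop, 0), n)
        return [self._items[i] for i in range(lo, hi, step)]

seq = CommaSliced(range(10))
print(seq[1, 7, 2])               # [1, 3, 5]
index_tuple = (3, len(seq) - 4)   # len(seq) stands in for '$'
print(seq[index_tuple])           # [3, 4, 5]
```

Building `index_tuple` outside the brackets is where '$' would be unavailable, which is why the example falls back on `len(seq)`.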
--
--Bryan