Proposed PEP: New style indexing, was Re: Bug in slice type

Bryan Olson fakeaddress at nowhere.org
Sat Aug 20 15:22:12 EDT 2005


Steven Bethard wrote:
 > Well, I couldn't find where the general semantics of a negative stride
 > index are defined, but for sequences at least[1]:
 >
 > "The slice of s from i to j with step k is defined as the sequence of
 > items with index x = i + n*k such that 0 <= n < (j-i)/k."
 >
 > This seems to contradict list behavior though. [...]

The conclusion is inescapable: Python's handling of negative
subscripts is a wart. Indexing from the high end is too useful
to give up, but it should be specified by the slicing/indexing
operation, not by the value of the index expression.


PPEP (Proposed Python Enhancement Proposal): New-Style Indexing

Instead of:

     sequence[start : stop : step]

new-style slicing uses the syntax:

     sequence[start ; stop ; step]

It works like current slicing, except that negative start or
stop values do not trigger from-the-high-end interpretation.
Omissions and None work the same as in old-style slicing.
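To make the intended semantics concrete, here is a rough emulation in today's Python. The function name new_style_slice is made up for illustration, and the clamping of out-of-range values is my assumption about what "works like current slicing" should mean; for simplicity it always returns a list.

```python
def new_style_slice(seq, start=None, stop=None, step=None):
    """Emulate seq[start ; stop ; step]: negative values never wrap around."""
    step = 1 if step is None else step
    if step == 0:
        raise ValueError("slice step cannot be zero")
    n = len(seq)
    if step > 0:
        lo, hi = 0, n          # valid bounds when walking upward
        start = lo if start is None else min(max(start, lo), hi)
        stop = hi if stop is None else min(max(stop, lo), hi)
    else:
        lo, hi = -1, n - 1     # -1 means "just before index 0"
        start = hi if start is None else min(max(start, lo), hi)
        stop = lo if stop is None else min(max(stop, lo), hi)
    return [seq[i] for i in range(start, stop, step)]


s = list(range(10))
print(new_style_slice(s, 3, len(s) - 4))   # [3, 4, 5], same as s[3:-4]
print(new_style_slice(s, 3, -4))           # [], the negative stop does not wrap
print(new_style_slice(s, 5, -1, -1))       # [5, 4, 3, 2, 1, 0]
print(s[5:-1:-1])                          # [], old-style: -1 means the last item
```

The last two lines show the case that motivates the proposal: with a negative step, old-style slicing cannot express "down to and including index 0" with an explicit stop value.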

Within the square brackets, the '$' symbol stands for the length
of the sequence. One can index from the high end by subtracting
an offset from '$'. Instead of:

     seq[3 : -4]

we write:

     seq[3 ; $ - 4]
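Since '$' is not valid Python, one way to experiment with the idea is a placeholder object that remembers an offset and is resolved against len(seq) later. Everything named here (FromEnd, END, dollar_slice) is hypothetical, and the sketch only covers start and stop with a step of 1.

```python
class FromEnd:
    """Stand-in for '$ + offset': a position measured from len(seq)."""

    def __init__(self, offset=0):
        self.offset = offset

    def __add__(self, k):
        return FromEnd(self.offset + k)

    def __sub__(self, k):
        return FromEnd(self.offset - k)

    def resolve(self, length):
        return length + self.offset


END = FromEnd()   # plays the role of '$'


def dollar_slice(seq, start, stop):
    """Emulate seq[start ; stop], where either bound may use END."""
    n = len(seq)
    lo = start.resolve(n) if isinstance(start, FromEnd) else start
    hi = stop.resolve(n) if isinstance(stop, FromEnd) else stop
    # Clamp negatives to 0 so the built-in slice never wraps around.
    return seq[max(lo, 0):max(hi, 0)]


seq = "abcdefghij"
print(dollar_slice(seq, 3, END - 4))   # def, same as seq[3:-4]
print(repr(dollar_slice(seq, 3, -4)))  # '', a plain -4 no longer counts from the end
```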

When square brackets appear within other square brackets, the
innermost bracket pair determines which sequence '$' describes.
(Perhaps '$$' should be the length of the next containing
bracket pair, and '$$$' the next-out and...?)

So far, I don't think the proposal breaks anything; let's keep
it that way. The next bit is tricky...

Obviously '$' should also work in simple (non-slice) indexing.
Instead of:

     seq[-2]

we write:

     seq[$ - 2]

So really seq[-2] should be out-of-bounds. Alas, that would
break way too much code. For now, simple indexing with a
negative subscript (and no '$') should continue to index from
the high end, as a deprecated feature. The presence of '$'
always indicates new-style indexing, so a programmer who needs a
negative index to trigger a range error can write:

     seq[($ - $) + index]
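For comparison, the strict behaviour that seq[($ - $) + index] is meant to force can be approximated today with a small helper. The name strict_index is made up for this sketch; it is not part of the proposal.

```python
def strict_index(seq, index):
    """Index without from-the-high-end interpretation of negatives."""
    if not 0 <= index < len(seq):
        raise IndexError("index out of range: %d" % index)
    return seq[index]


seq = [10, 20, 30, 40]
print(strict_index(seq, len(seq) - 2))   # 30, the seq[$ - 2] case
try:
    strict_index(seq, -2)                # the seq[($ - $) + index] case
except IndexError as exc:
    print("range error:", exc)
```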



An Alternative Variant:

Suppose instead of using semicolons as the PPEP proposes, we use
commas, as in:

     sequence[start, stop, step]

Commas are already in use to form tuples, and we let them do
just that. A slice is a subscript that is a tuple (or perhaps we
should allow any sequence). We could just as well write:

     index_tuple = (start, stop, step)
     sequence[index_tuple]

This variant *reduces* the number and complexity of rules that
define Python semantics. There is no special interpretation of
the comma, and no need for a distinct slice type.
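A wrapper class can mimic the tuple-as-slice idea in current Python, since a comma inside a subscript already hands __getitem__ a tuple. TupleIndexed is a hypothetical name, and as a simplification it delegates to the built-in slice, so negative values here still count from the high end.

```python
class TupleIndexed:
    """Wrap a sequence so that a tuple subscript acts as (start, stop, step)."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, key):
        if isinstance(key, tuple):
            # Pad missing trailing fields with None, as omitted slice parts.
            start, stop, step = (key + (None, None, None))[:3]
            return self.data[slice(start, stop, step)]
        return self.data[key]


s = TupleIndexed(list(range(10)))
print(s[2, 8, 3])        # [2, 5]
index_tuple = (1, 4)
print(s[index_tuple])    # [1, 2, 3], same subscript built ahead of time
print(s[4])              # 4, plain indexing is unchanged
```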

The '$' character works as in the PPEP above. It is undefined
outside square brackets, but that makes no real difference; the
programmer can use len(sequence).

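This variant might break some tricky code, such as types whose
__getitem__ already accepts a tuple subscript (multi-dimensional
arrays, for example).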


-- 
--Bryan


