Design for slices

Wed Jan 31 19:50:02 EST 2001

Gustaf Liljegren <gustafl at algonet.se> wrote in comp.lang.python:
> I'm learning Python and have a question about defining ranges/slices in 
> strings. I can't find any defence for what looks like a poor design, so I 
> have to ask. Why is the first character in the string x defined as x[0] and 
> not x[1]? This looks just like the typical geek thinking that I'm trying to 
> avoid with Python.

Because experience has taught generations of programmers that this is the
right thing to do.

For a string x of length n, the following simple equations hold:
x == x[0:i]+x[i:n]   for any i (even negative i)
len(x[i:n]) == n-i   for 0 <= i <= n
x[i:i] == ""         for any i

This means that splitting a string is always simple: cut at any position i
and the two pieces join back into the original, with nothing lost or
duplicated. This is also the reason that a string slice x[i:j] *does not*
include x[j]. It removes the "-1" and "+1" adjustments that would otherwise
be needed everywhere. "Off-by-one errors" have caused lots of problems in
the past; the Python choice avoids them in the common cases.
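
If you want to convince yourself, the identities above can be checked
directly in the interpreter. This is only a small sketch; the helper name
split_at is made up for the illustration and is not anything built into
Python.

```python
def split_at(x, i):
    """Split x at index i into (head, tail) using plain slices.

    split_at is a hypothetical helper, introduced only for this example.
    """
    return x[0:i], x[i:len(x)]


x = "Python"
n = len(x)

# Cutting at any i (negative or out of range included) loses nothing,
# and an empty slice of a string is the empty string.
for i in range(-n - 2, n + 3):
    head, tail = split_at(x, i)
    assert head + tail == x
    assert x[i:i] == ""

# The length of the tail is a plain subtraction, no "+1" or "-1" needed.
for i in range(0, n + 1):
    assert len(x[i:n]) == n - i

print(split_at(x, 2))  # prints ('Py', 'thon')
```

With 1-based indexes and inclusive end points, the same checks would need
a "+1" or "-1" in nearly every line.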

People often say that you should think of the indexes as marks *between*
the actual values. Say you have the string "Python"; then think of it as

 P y t h o n
0 1 2 3 4 5 6

And now it's obvious that "Python"[2:5] == "tho", a string of length 3,
which is expected since 5-2 = 3.
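
The "marks between the values" picture can also be tried out in code. A
minimal sketch, where the list of cut positions (cuts) is just an arbitrary
choice for the example: each pair of neighbouring marks gives one slice,
the slice lengths are plain differences, and the pieces fit together with
no overlap and no gap.

```python
word = "Python"

# Marks between the characters: 0 is before "P", 6 is after "n".
# These particular cut positions are an arbitrary choice for the example.
cuts = [0, 2, 5, 6]

# Each pair of neighbouring marks (a, b) selects the slice word[a:b].
pieces = [word[a:b] for a, b in zip(cuts, cuts[1:])]
lengths = [b - a for a, b in zip(cuts, cuts[1:])]

print(pieces)   # prints ['Py', 'tho', 'n']
print(lengths)  # prints [2, 3, 1]

# The length of each slice is simply the difference of its two marks,
# and the slices join back into the original string.
assert [len(p) for p in pieces] == lengths
assert "".join(pieces) == word
```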

There are more reasons, but you'll come to appreciate them over time.

-- 
Remco Gerlich