python list index - an easy question

Steve D'Aprano steve+python at pearwood.info
Mon Dec 19 19:49:19 EST 2016


On Mon, 19 Dec 2016 03:21 am, BartC wrote:

> On 18/12/2016 10:59, Paul Götze wrote:
>> Hi John,
>>
>> there is a nice short article by E. W. Dijkstra about why it makes sense
>> to start numbering at zero (and exclude the upper given bound) while
>> slicing a list. Might give a bit of additional understanding.
>>
>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
> 
> (This from somebody who apparently can't use a typewriter?!)

Oh Bart don't be so hard on yourself, I don't think it matters whether or
not you can use a typewriter, your comments would be as equally interesting
whether you used a typewriter, spoke them allowed and had them transcribed
by voice recognition software, or wrote them out by hand.

(Handwriting is a lost skill in this day and age of people poking at their
JesusPhones with their thumb, typing out txt sp3k 1 ltr at a time.)


> I don't know if the arguments there are that convincing. Both lower
> bounds of 0 and 1 are useful; 

Dijkstra doesn't deny that other bounds can be useful.

The unstated assumption in his argument is you can only pick one system for
indexing into arrays. But of course that's not strictly correct --
different languages can and do choose different schemes, or even choose
different schemes for each language.


> some languages will use 0, some 1, and 
> some can have any lower bound.
> 
> But a strong argument for using 1 is that in real life things are
> usually counted from 1 (and measured from 0).

I'm sure that when European merchants first learned of this
new-fangled "zero" from their Arabic and Indian colleagues in the Middle
Ages, they probably had a quite similar reaction.

After all, didn't the Greeks themselves claim that the smallest number was
two? Zero is clearly not a number, it is the absence of number, and 1 is
not a number, it is unity.

http://math.furman.edu/~mwoodard/fuejum/content/2012/paper1_2012.pdf

You do make a point that even children (at least those about the age of
three or four) can understand 1-based indexing, but I don't think it is a
*critical* point. Most programmers have learned a bit more than the average
four year old.

If you can understand threads, OOP design patterns, binary search or C
pointers, counting indexes 0, 1, 2, 3, ... should hold no fears for you.


> So if you wanted a simple list giving the titles of the chapters in a
> book or on a DVD, on the colour of the front doors for each house in a
> street, usually you wouldn't be able to use element 0.
> 
> As for slice notation, I tend to informally use (not for any particulr
> language) A..B for an inclusive range, and A:N for a range of length N
> starting from A.

An interesting thought... I like the idea of being able to specify slices by
index or by length.


 
> In Python you can also have a third operand for a range, A:B:C, which
> can mean that B is not necessarily one past the last in the range, and
> that the A <= i < B condition in that paper is no longer quite true.

You cannot use a simple comparison like A <= i < B to represent a range with
a step-size not equal to 1. How would you specify the sequence:

    2, 4, 6, 8, 10, 12, 14

as a single expression with ONLY an upper and lower bound?

    2 <= i <= 14

clearly doesn't work, and nor does any alternative.

Dijkstra doesn't discuss range() with non-unit step sizes.


> In fact, A:B:-1 corresponds to A >= i > B, which I think is the same as
> condition (b) in the paper (but backwards), rather (a) which is favoured.

I have no way of knowing what it corresponds to in your head, but in Python,
a slice A:B:-1 corresponds to the elements at indices:

    A, A-1, A-2, A-3, ..., B+1

*in that order*.

py> '0123456789'[7:2:-1]
'76543'

Since Dijkstra doesn't discuss non-unit step sizes, it is unclear what he
would say about counting backwards.


> Another little anomaly in Python is that when negative indices are used,
> it suddenly switches to 1-based indexing! Or least, when -index is
> considered:
> 
>    x = [-4,-3,-2,-1]
> 
>    print x[-1]       # -1  Notice the correspondence here...
>    print x[-2]       # -2

Congratulations, you've discovered that 4-1 equals 3. Well done!

>    x = [1, 2, 3, 4]
> 
>    print x[1]        # 2   ...and the lack of it here
>    print x[2]        # 3

That's silly. You can get whichever correspondence you like by choosing
values which do or don't match the indices used by Python:

    x = [-3, -2, -1, 0]
    print x[-1]  # Notice the lack of correspondence

    x = [0, 1, 2, 3]
    print x[1]  # Notice the correspondence here

    x = [6, 7, 8, 9]
    print x[1]  # Notice the lack of correspondence here

What's your point?


I infer that you're having difficulty with Python slicing and indexing
because you have a mental model where indexes label the items themselves
(best viewed in a fixed width font):

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
..0...1...2...3...4...5        # Indexes
.-6..-5..-4..-3..-2..-1        # 6-1 = 5, 6-2 = 4 etc.


That makes 'Python'[1] easy to visualise (its obviously just 'y') but
slicing becomes more complex:

    'Python'[1:4]
    => *include* index 1, *exclude* index 4
    => 'yth'

But a better model is to consider the *boundaries* labelled:

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
0...1...2...3...4...5...6        # Indexes
-6..-5..-4..-3..-2..-1


A simple index is a bit more complex (think of it as the element immediately
following the boundary, hence 'Python'[1] is still 'y') but slicing now
becomes simple. Visualise slicing along the boundaries (that's where the
name comes from!) and now it is obvious that the start index is included
and the stop index is excluded/

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
....^...........^
cut here    and here


There's no mental model that makes slicing with non-zero step sizes easy,
*especially* if the step size is a non-unit negative number (say, -3). For
those you really have to simulate the process of actually iterating over
the indexes one by one.



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.




More information about the Python-list mailing list