on slices, negative indices, which are the equivalent procedures?

Wed Aug 11 10:59:11 EDT 2021

dn <PythonList at DancesWithMice.info> writes:

> Apologies for lateness.

That's alright.  Thanks for the marvelous post.

> Coincidentally, I've been asked to speak to our local Python Users'
> Group on slicing. Herewith an attempt to modify those demos around your
> data/question. Apologies if the result is thus somewhat lacking in flow.
> Also, whereas I prefer to illustrate 'how it works', I perceive that you
> are used to learning 'rules' and only thereafter their application (the
> teaching-practice under which most of us learned) - so, another reason
> for mixing-things-up, to suit yourself (hopefully).

Well observed.  (I suppose it is a habit of mine to try to infer the
axioms of things.)

[...]

> On 06/08/2021 05.35, Jack Brandom wrote:
>> The FAQ at 
>> 
>>   https://docs.python.org/3/faq/programming.html#what-s-a-negative-index
>> 
>> makes me think that I can always replace negative indices with positive
>> ones --- even in slices, although the FAQ seems not to say anything
>> about slices.  
>
> Yes, it can be done.

I thought your own post confirmed it cannot.  I'll point it out below.

>> With slices, it doesn't seem to always work.  For instance, I can
>> reverse a "Jack" this way:
>> 
>>>>> s = "Jack Brandom"
>>>>> s[3 : -13 : -1]
>> 'kcaJ'
>> 
>> I have no idea how to replace that -13 with a positive index.  Is it
>> possible at all?  
>
> Yes it is - but read on, gentle reader...
>
> If we envisage a string:
> - a positive index enables the identification of characters, from
> left-to-right
> - a negative index 'counts' from right-to-left, ie it takes the
> right-most elements/characters.
>
> The length of a string (sequence) is defined as len( str ). Accordingly,
> there is a formula which can 'translate' the length and either statement
> of the index's relative-location, to the other. To quote one of the
> web.refs (offered below) "If i or j is negative, the index is relative
> to the end of sequence s: len(s) + i or len(s) + j is substituted. But
> note that -0 is still 0."
>
> Clear as mud? Please continue...
>
>
>> But this example gives me the idea that perhaps each slice is equivalent
>> to a certain loop (which I could write in a procedure).  So I'm looking
>> for these procedures.  If I can have the procedures in mind, I think I
>> will be able to understand slices without getting surprised.
>> 
>> Do you have these procedures from the top of your mind?  While I haven't
>> given up yet, I am not getting too close.  Thank you!
>
>
> Us Silver-Surfers have to stick-together. So, let's try closing the gap.
>
> Rather than attempting to re-build Python, perhaps some experimenting
> with sequences, indexing, and slicing is what is required? I can see
> writing a simulation routine as a perfectly-valid (proof of)
> learning-approach - and one we often applied 'back then'. However, is it
> the 'best' approach? (Only you can judge!)

I like extreme clarity and certainty.  While my approach did not give me
too much certainty, at least I got a conjecture --- so I posted my
hypothesis here to give everyone a chance to look and spot: ``oh, that's
not gonna work''.  Simulating in software is a great way to precisely
describe how something works, so I did it.

[...]

> Alternately, (with apologies for 'cheating') this may be some help:

That's great.

>>>> print( "  Pos Ltr Neg" )
>>>> print( "  ndx     ndx" )
>>>> for positive_index, character, negative_index in zip(
> ...         range( 12 ), name, range( -12, 0 ) ):
> ...     print( f"{ positive_index:4}   { character } { negative_index:4}" )
>
>   Pos Ltr Neg
>   ndx     ndx
>    0   J  -12
>    1   a  -11
>    2   c  -10
>    3   k   -9
>    4       -8
>    5   B   -7
>    6   r   -6
>    7   a   -5
>    8   n   -4
>    9   d   -3
>   10   o   -2
>   11   m   -1

So if we begin at, say, -2 and work backwards to the beginning of the
string so as to include the "J", the index must be -13 --- there is no
positive index that could take the place of -13.  (And that's because
--- I eventually realized --- there is no positive number to the left of
zero.  You, of course, notice the same.  So why are you saying it's
possible to substitute negative indices with positive?)
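
A quick check of that claim, as a sketch rather than anything taken from
the thread (the names naive_stop and matching_stops are only for
illustration). It tries every non-negative stop and also shows that
leaving the stop out, or passing None, gives the same result as -13:

```python
s = "Jack Brandom"

# The original slice, and the same slice with the stop omitted or None.
print(s[3:-13:-1])    # kcaJ
print(s[3::-1])       # kcaJ
print(s[3:None:-1])   # kcaJ

# The index formula len(s) + (-13) gives -1, which a slice reads as
# "the last character", so the result is empty rather than 'kcaJ'.
naive_stop = len(s) + (-13)
print(naive_stop, repr(s[3:naive_stop:-1]))   # -1 ''

# Brute force: no stop in 0..len(s) reproduces 'kcaJ' when starting at 3.
matching_stops = [stop for stop in range(len(s) + 1)
                  if s[3:stop:-1] == "kcaJ"]
print(matching_stops)   # []
```

So the only spellings that reach the "J" going backwards are a negative
stop beyond the left edge, an omitted stop, or None.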

[...]

> Another reason why folk have difficulty with using an index or offset is
> the subtle difference between "cardinal" and "ordinal" numbers. Cardinal
> numbers are for counting, eg the number of letters in the name. Whereas,
> ordinal numbers refer to a position, eg first, second, third... (there
> are also "nominal numbers" which are simply labels, eg Channel 3 TV -
> but we're not concerned with them). There is an implicit problem in that
> Python uses zero-based indexing, whereas spoken language starts with a
> word like "first" or "1st" which looks and sounds more like "one" than
> "zero". Despite it being a junior-school 'math' topic, it is easy to
> fail to recognise which of the two types of number is in-use. The len(
> sequence ) provides a cardinal number. An index is an ordinal number.

That's well-observed.  But, as you say, "first" takes us to the number
1.  It is not very easy to equate "first" with zero.  So I might as well
just forget about the distinction between cardinal and ordinal in Python
indexing.

> Using the "rules" (above) and transforming around len( "Jack" ), one
> arrives at suitable 'translation' formulae/formulas:
>
>     negative-index = -1 * ( len( sequence ) - positive-index )
>     positive-index = len( sequence ) + negative-index

That's nice.  I didn't write them in this form and I suppose yours is
superior.
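
Written out as code, the two formulae might look like the sketch below.
The function names are my own, and this covers plain indexing only, not
slice bounds:

```python
def to_negative_index(positive_index, length):
    # negative-index = -1 * ( len( sequence ) - positive-index )
    return -1 * (length - positive_index)


def to_positive_index(negative_index, length):
    # positive-index = len( sequence ) + negative-index
    return length + negative_index


name = "Jack Brandom"
length = len(name)

# Each valid position names the same character under both schemes,
# and the two translations undo each other.
for positive_index in range(length):
    negative_index = to_negative_index(positive_index, length)
    assert name[positive_index] == name[negative_index]
    assert to_positive_index(negative_index, length) == positive_index

print(to_negative_index(3, length))    # -9
print(to_positive_index(-7, length))   # 5
```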

> The above rules describing an index do not apply to a slice's "start",
> "stop", or "step" values! The documentation expresses:
>
> proper_slice ::=  [lower_bound] ":" [upper_bound] [ ":" [stride] ]
> lower_bound  ::=  expression
> upper_bound  ::=  expression
> stride       ::=  expression
>
> Elsewhere in the docs the rules are described using i, j, and k. NB
> in this context "k" is not the "stride"!

Where are these production rules coming from?  They're not at 

  https://docs.python.org/3/reference/grammar.html

The word ``stride'' doesn't appear in this grammar.

[...]

> Lesson:
> Whereas some like to talk of ( start:stop:step ), it might be more
> helpful to verbalise the relationships as "starting at:" and "stopping
> before"!

That's a nice vocabulary for the context.  Thanks.

[...]

> Many people consider the "stride"/step to indicate a sub-selection of
> elements from within the slice's defined extent. However, it may be more
> helpful to consider the rôle of the stride as defining the 'direction'
> in which elements will be selected (per 'reverse') - and thereafter its
> 'step effect'.

I agree.  And also I stuck with thinking of slices as the procedures I
wrote, so I will always think of them as loops.
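
A small illustration of the sign of the step setting the direction
first and the spacing second. This is only a sketch using the thread's
example string, with range() standing in for the loop:

```python
name = "Jack Brandom"

# Positive step: walk left to right, starting at 1, stopping before 9.
print(name[1:9:2])              # akBa
print(list(range(1, 9, 2)))     # [1, 3, 5, 7]

# Negative step: walk right to left, starting at 9, stopping before 1.
print(name[9:1:-2])             # daBk
print(list(range(9, 1, -2)))    # [9, 7, 5, 3]

# Bounds that point the wrong way for the step select nothing.
print(repr(name[1:9:-1]))       # ''
```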

> Remembering the rules governing indexes/indices, be advised that these
> do not apply to either the lower_bound or the upper_bound of a slice.
> For example:
>
>>>> name[ +100 ]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> IndexError: string index out of range
>
> and:
>
>>>> name[ -100 ]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> IndexError: string index out of range
>
> yet:
>
>>>> name[ -100:+100 ]
> 'Jack Brandom'
>
> Just as a "lower_bound" of 0 or None is taken to mean the first item in
> the sequence, so any ludicrous value is also 'translated'.
>
> Similarly, despite knowing the length (of the example-data) is 12, a
> slice's upper-bound that evaluates to a larger number will be 'rounded
> down'.

Nice.  That's a case where my procedures (simulating slices) fail.
Thanks!  I'll refine them.
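
One way to see the clamping without guessing is the built-in
slice.indices() method, which reports the start, stop and step a slice
would use for a sequence of a given length. The helper slice_as_loop
below is a hypothetical name of mine and only a sketch:

```python
name = "Jack Brandom"

# Out-of-range bounds are clamped instead of raising IndexError.
print(slice(-100, 100).indices(len(name)))        # (0, 12, 1)
# With a negative step, a stop left of the start of the string becomes -1.
print(slice(3, -13, -1).indices(len(name)))       # (3, -1, -1)
print(slice(None, None, -1).indices(len(name)))   # (11, -1, -1)


def slice_as_loop(sequence, lower_bound, upper_bound, stride=None):
    # Let slice.indices() resolve the bounds, then loop with range().
    start, stop, step = slice(lower_bound, upper_bound, stride).indices(
        len(sequence))
    return [sequence[index] for index in range(start, stop, step)]


print("".join(slice_as_loop(name, -100, 100)))    # Jack Brandom
print("".join(slice_as_loop(name, 3, -13, -1)))   # kcaJ
```

The resolved numbers are meant for range(), not for writing back into a
slice: a stop of -1 fed back into name[3:-1:-1] would again mean "the
last character".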

[...]

> The tricks differ if the 'target' is the family-name:
>
>>>> name[ 5: ]
> 'Brandom'
>>>> name[ -7: ]
> 'Brandom'
>
> but to demonstrate with some non-default "bounds", (if you'll permit the
> (weak-)humor - and with no insult intended to your good name) we could
> describe distaste for a popular breakfast cereal (All Bran™):
>
>>>> name[ 1:9 ]
> 'ack Bran'
>>>> name[ -11:-3 ]
> 'ack Bran'

Lol.  It's not even my name.  I think maybe we should all use a
different name at every post.  Or perhaps a different name at every
thread.  The USENET is a place where poster names don't matter at all.
It's only the content that matters.  (``USENET is a strange place.'' --
Dennis Ritchie.)

[...]

> To round things out, here's some 'theory':-
>
> Contrary to practice, the Python Reference Manual (6.3.2 Subscriptions) says:
> <<<
> The formal syntax makes no special provision for negative indices in
> sequences; however, built-in sequences all provide a __getitem__()
> method that interprets negative indices by adding the length of the
> sequence to the index (so that x[-1] selects the last item of x). The
> resulting value must be a nonnegative integer less than the number of
> items in the sequence, and the subscription selects the item whose index
> is that value (counting from zero). Since the support for negative
> indices and slicing occurs in the object’s __getitem__() method,
> subclasses overriding this method will need to explicitly add that support.

That's interesting.  Thanks for the reference.  It is precisely this
kind of thing I was looking for.
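
To make the quoted passage concrete, here is a sketch of a class that
supplies its own __getitem__() and therefore has to add the support for
negative indices and slices itself. The class Letters is hypothetical
and exists only for this illustration:

```python
class Letters:
    """Hypothetical minimal sequence illustrating the quoted passage."""

    def __init__(self, text):
        self._items = list(text)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slices: let slice.indices() clamp and translate the bounds.
            start, stop, step = key.indices(len(self))
            return [self._items[i] for i in range(start, stop, step)]
        # Plain index: add the length to a negative index, then range-check.
        index = key + len(self) if key < 0 else key
        if not 0 <= index < len(self):
            raise IndexError("Letters index out of range")
        return self._items[index]


letters = Letters("Jack Brandom")
print(letters[-1])                   # m
print("".join(letters[3:-13:-1]))    # kcaJ
print("".join(letters[-100:100]))    # Jack Brandom
try:
    letters[-100]
except IndexError as error:
    print("IndexError:", error)      # IndexError: Letters index out of range
```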

[...]

> Finally, it is not forgotten that you want to code a loop which
> simulates a slice with negative attributes. (although it is hoped that
> after the above explanations (and further reading) such has become
> unnecessary as a learning-exercise!)

Has it become unnecessary?  For using Python, sure.  But as you noticed
--- I like to see the rules and be able to write them myself.  So it is
necessary for certain types. :-) 

> Please recall that whilst a slice-object will not, a range-object will
> work with a for-loop. So:
>
>>>> rng = range( 4, 0, -1 )
>>>> list( rng )
> [4, 3, 2, 1]
>>>> for index in rng:
> ...     print( name[ index ], end="   " )
> ...
>     k   c   a   >>>
>
> Oops! This looks familiar, so apply the same 'solution':
>
>>>> rng = range( 4, -1, -1 )
>>>> list( rng )
> [4, 3, 2, 1, 0]
>>>> for index in rng:
> ...     print( name[ index ], end="   " )
> ...
>     k   c   a   J
>
> The 'bottom line' is that such simulation code will become torturous
> simply because indexes/indices follow different rules to slices!
>
>
> Should you wish to persist, then may I suggest modifying mySlice(it,
> beg, end, step = 1) to:
>
> def my_slice( sequence, lower_bound, upper_bound, stride=1 ):

I will.  Thanks for the advice.

> and first splitting the implementation's decision-tree into two paths,
> according to whether the stride is positive or negative, before getting
> into 'the nitty-gritty'.

That I had already done.
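
For the record, a sketch of how such a procedure could look once the two
paths are separated. It keeps the suggested signature, returns a list
rather than a string, and is my reading of the behaviour observed in this
thread, not the reference implementation. The helper adjust and the
names lowest and highest are assumptions of mine:

```python
def my_slice(sequence, lower_bound, upper_bound, stride=1):
    length = len(sequence)
    if stride is None:
        stride = 1
    if stride == 0:
        raise ValueError("slice step cannot be zero")

    if stride > 0:
        # Left to right: bounds are clamped into 0..length.
        lowest, highest = 0, length
        default_start, default_stop = 0, length
    else:
        # Right to left: bounds are clamped into -1..length - 1,
        # where -1 means "one step before the first item".
        lowest, highest = -1, length - 1
        default_start, default_stop = length - 1, -1

    def adjust(bound, default):
        if bound is None:
            return default
        if bound < 0:
            bound += length          # count from the right-hand end
        return max(lowest, min(bound, highest))

    start = adjust(lower_bound, default_start)
    stop = adjust(upper_bound, default_stop)
    return [sequence[index] for index in range(start, stop, stride)]


name = "Jack Brandom"
print("".join(my_slice(name, 3, -13, -1)))     # kcaJ
print("".join(my_slice(name, -100, 100)))      # Jack Brandom
print("".join(my_slice(name, -11, -3)))        # ack Bran
```

Comparing the result against list(name[a:b:k]) for a grid of bounds and
strides, None included, is a cheap way to look for cases it gets wrong.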

> Perversely (if not, foolishly) I have indulged (and can't recommend
> it!). Nevertheless, if you are determined, I will be happy to forward
> some test conditions, upon request (caveat emptor!)...

Yes, send me your tests! (``Caveat emptor'' noted.) (I had included all
tests that appeared in this thread, but had missed, for example, "Jack
Brandom"[-100 : 100].  Didn't occur to me this wouldn't blow an
exception.)

> Web.Refs/Further reading:
> https://docs.python.org/3/tutorial/introduction.html
> https://docs.python.org/3/reference/expressions.html#primaries
> https://docs.python.org/3/library/functions.html
> https://docs.python.org/3/tutorial/controlflow.html
> https://docs.python.org/3/library/stdtypes.html#typesseq
> https://docs.python.org/3/library/stdtypes.html#ranges
> https://docs.python.org/3/reference/simple_stmts.html
> https://docs.python.org/3/glossary.html
> https://web.archive.org/web/20190321101606/https://plus.google.com/115212051037621986145/posts/YTUxbXYZyfi
> https://docs.python.org/3/reference/datamodel.html

Thank you so much.