on slices, negative indices, which are the equivalent procedures?

Wed Aug 11 20:40:32 EDT 2021

On 12/08/2021 02.59, Jack Brandom wrote:
> dn <PythonList at DancesWithMice.info> writes:
> 
...

>> Also, whereas I prefer to illustrate 'how it works', I perceive that you
>> are used to learning 'rules' and only thereafter their application (the
>> teaching-practice under which most of us learned) - so, another reason
>> for mixing-things-up, to suit yourself (hopefully).
> 
> Well observed.  (I suppose it is a habit of mine to try to infer the
> axioms of things.)

After a career involving various forms of vocational training and
tertiary education, I've met many different folk, (ultimately) each with
his/her own approach. Undoubtedly, the best way to answer someone's
question is to understand why they are asking in the first place.

Learning a programming language requires the alignment of one's own
"mental model" with that of the language's rules and idioms. Thus,
understanding your mental model enables a third-party to understand
where correction may be necessary or omission exists. [can you tell that
my colleagues and I were having just this discussion this morning?]

Having spent the last few years working with a series of MOOCs (usual
disclaimers: edX platform, not Python topics) I've noticed that when we
first started, our trainees were almost-exclusively post-grads, ie had a
'learning habit', curiosity, a set of expectations, and (most of the
time) some computing background. As time has gone by, these
characteristics have all changed, eg the expectations have broadened and
may be more 'interest' than 'professional', and we cannot expect that
everyone has met software development previously...

That's also a characteristic of the Python language: that as it has
become more popular, the 'new entrants' are very different people, and
the 'new features' and (newly) added libraries have become more diverse
(in application)!

NB "axioms" is basically "mental model".

>> On 06/08/2021 05.35, Jack Brandom wrote:
>>> The FAQ at 
>>>
>>>   https://docs.python.org/3/faq/programming.html#what-s-a-negative-index
>>>
>>> makes me think that I can always replace negative indices with positive
>>> ones --- even in slices, although the FAQ seems not to say anything
>>> about slices.  
>>
>> Yes, it can be done.
> 
> I thought your own post confirmed it cannot.  I'll point it out below.

Yes, it can!
(key means to starting a 'religious war').

(not starting a 'war', or even meaning to be unkind, I'm embarking on a
Socratic approach rather than giving you an immediate answer - you can
handle it/beat me up later...)

Just maybe you are guilty of (only) 'thinking inside the box'?

Have you come-across the 'connect nine dots with four lines'
brain-teaser? https://lateralaction.com/articles/thinking-outside-the-box/

>>> With slices, it doesn't seem to always work.  For instance, I can
>>> reverse a "Jack" this way:
>>>
>>>>>> s = "Jack Brandom"
>>>>>> s[3 : -13 : -1]
>>> 'kcaJ'

If this were a fight (between us), there is me 'hitting' you with the
answer. However, the 'fight' is waiting for 'the penny to drop' in your
mind.

So, please continue:-

As warned, here's where the misapprehensions begin:-

>>> But this example gives me the idea that perhaps each slice is equivalent
>>> to a certain loop (which I could write in a procedure).  So I'm looking
...

Then the old men begin waving their walking-sticks around (whilst others
wish there was such a thing as a negative-walking-stick):-

>> Us Silver-Surfers have to stick-together. So, let's try closing the gap.
...

> I like extreme clarity and certainty.  While my approach did not give me
> too much certainty, at least I got a conjecture --- so I posted my
> hypothesis here to give everyone a chance to look and spot: ``oh, that's
> not gonna work''.  Simulating in software is a great way to precisely
> describe how something works, so I did it.

Eminently reasonable.

The 'problem' (and it lies at the very roots of instructional design) is
that 'learning Python' involves BOTH width and depth, ie there's a lot
to learn! 'Pop Psy' talks about a 'learning curve' and how steep it
might be...

After a while, one realises that it is not necessary to learn the full
'width' - after all, does that mean just 'Python', does it include the
PSL (Python Standard Library), does it stop at PyPI, or where?
Similarly, with depth - is it worth knowing all there is to know about
slicing or might a study of comprehensions yield more 'fruit'?

On this list, I've referred more than once to, "some 'dark corner of
Python' in which I've never previously ventured".

Python is described as having a relatively-shallow learning curve (cf
Java, for example), which evidences/presumes a more breadth-first approach.

Accordingly, how does one split-up 'learning Python' into manageable
"chunks" - or in the words of software development, an MVP (Minimum
Viable Product)?

An answer to that may be to get 'a good handle on' positive indexing,
and then 'move on'. Thus, characteristics such as zero-base, < len(
sequence ), etc.

Later, when either an advanced topic or a work-assignment demands, only
then might we need to return to indexing to learn how negative-indexes
facilitate access from 'right-to-left' (string imagery).

Thus, in Python parlance: we re-factor our own knowledge-base and
capabilities, in the same way that re-factoring code improves same!

So, *providing* we realise and remember that positive indexes/indices
are only a shallow-understanding, and when the need arises, return to
augment that basic knowledge; such a learning-practice is (at least)
'good enough'.

(cf my sarcastic term of "Stack Overflow-driven development", which all
too often translates to zero-learning. Therein appears to lie
'survival', but 'blind-acceptance' rather than absorbed-knowledge, also
risks a close encounter with 'danger' - "here be dragons"!)

Accordingly, (on my side of the fence) a Python Apprentice needs to
learn positive-indexing in order to be useful on his/her first day at
work. (S)he becomes a Python Journeyman once negative-indexing has been
learned (and applied). Onwards and upwards!

Thus, does being able to program(me) replacement code for built-in
facility, ie coding axioms, fall into the field of the Apprentice, the
Journeyman, or the Master?

Some like to talk of "learning styles" (again: PopPsy, thoroughly
debunked, and without serious proof, despite its initial 'attraction').
The application of styles should be to topic, even sub-topic. It is
*very* limiting when an individual assumes/determines that (s)he has a
particular learning style, ie "I am a kinaesthetic/kinesthetic learner"
meaning 'has to move or feel whatever is being learned', when it is
followed by 'and this is the *only* way I can learn'. The person has
immediately limited him-/her-self!

I suggest that learning Python involves 'movement' in the sense that the
best way to prove learning (as indeed you have said) is to write code -
perhaps not because that might involve typing, head-scratching, throwing
things 'out the window' in frustration, etc. However, the 'learning'
likely involves reading and thinking (to form that 'mental model').
After that, it is the 'proof of learning' which involves 'movement'!
(what a tenuous claim!)

Thus, 'learning styles' are best adapted: what is the best style to use
when learning (training) this particular 'chunk' of material? Can I
express this in more than one manner (recall my disappointment at not
being able to include schematic diagrams in a list-post)? Accordingly,
adaptability cf a rigid approach (despite phrases such as 'time-tested'
because they, like 'my [only] style' are ultimately self-limiting)!

Back to allusions with computer science, where we find an interesting
conundrum when looking through a mass of inter-connected data (usually
called a "network"). In which sequence do we process (learn) all of the
individual items? Do we work our way 'across' and then 'down', or go all
the way 'down' and then come back 'up' in order to work our way
'across'? eg Graph Traversals - Breadth First and Depth First
(https://www.youtube.com/watch?v=bIA8HEEUxZI)

You can likely see where I'm going with this: the less one has to learn
in one "chunk", the easier the learning. Accordingly, something of a
width-first approach. (yes, contrarily: "a little bit of knowledge is a
dangerous thing") However, this depends to some degree upon one's
ability to keep that in-mind when some new 'challenge' arises.
Particularly that one not feel that re-visiting a topic, eg to pick-up
negative-indexing, is 'going backwards', but recognise that it is adding
"depth" to one's existing (wider) knowledge!

Further reading:
Perhaps your "satiable curtiosity" will tempt you to take a dip in the
Limpopo River and contemplate the "six honest serving-men...What and
Where and When And How and Why and Who" (Rudyard Kipling, but not India
for a change. http://www.online-literature.com/kipling/165/)

Also a book, if you'd like to leave syntax and semantics and delve into
some related philosophy, "Insatiable Curiosity - Innovation in a Fragile
Future" (https://mitpress.mit.edu/books/insatiable-curiosity)

...
> So if we begin at, say, -2 and work backwards to the beginning of the
> string so as to include the "J", the index must be -13 --- there is no
> positive index that could take the place of -13.  (And that's because
> --- I eventually realized --- there is no positive number to the left of
> zero.  You, of course, notice the same.  So why are you saying it's
> possible to substitute negative indices with positive?)

Because the statement (above) conflates two concepts: indexing and
slicing! Please recall earlier warnings that the two exhibit fundamental
differences!

Also, remember our "fudging" that zero be treated as a positive-integer,
when it is plainly not? There is also the concept that a slice with
'missing' values, ie which invokes the defaults, is also 'fudging'
whether the None is a positive- or negative- integer (depending upon the
'direction' of the stride - or its default)!

Hey, if we're going to 'cheat' like this, then let's go whole-hog.
Sorry, wrong animal: "you might as well be hung for a sheep as for a
lamb" - and not a snake in sight!

If we talk about the "J", then an index is used to access that element
of the sequence - forwards or backwards, as desired. So, an index of -12
becomes +0. Exactly as claimed. Exactly per docs.
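
A small sketch of that translation, for indexes only. The helper names
to_positive and to_negative are made up for illustration and are not
part of Python:

```python
name = "Jack Brandom"


def to_positive(sequence, negative_index):
    """Hypothetical helper: translate a valid negative index to its positive twin."""
    return len(sequence) + negative_index


def to_negative(sequence, positive_index):
    """Hypothetical helper: translate a valid positive index to its negative twin."""
    return -1 * (len(sequence) - positive_index)


# every valid index has an equivalent of the other sign
# (fudging zero as 'positive', as before)
for positive_index in range(len(name)):
    negative_index = to_negative(name, positive_index)
    assert to_positive(name, negative_index) == positive_index
    assert name[positive_index] == name[negative_index]

print(to_positive(name, -12), name[-12])   # 0 J
```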

However, the task is "begin at, say, -2 and work backwards to the
beginning of the string". This is a task for slicing not indexing!

(if you haven't 'twigged' after the 'think outside the box' hint, above;
then I'm teasing you. Forget 'war', and please read on...)

>> Another reason why folk have difficulty with using an index or offset is
>> the subtle difference between "cardinal" and "ordinal" numbers. Cardinal
>> numbers are for counting, eg the number of letters in the name. Whereas,
>> ordinal numbers refer to a position, eg first, second, third... (there
>> are also "nominal numbers" which are simply labels, eg Channel 3 TV -
>> but we're not concerned with them). There is an implicit problem in that
>> Python uses zero-based indexing, whereas spoken language starts with a
>> word like "first" or "1st" which looks and sounds more like "one" than
>> "zero". Despite it being a junior-school 'math' topic, it is easy to
>> fail to recognise which of the two types of number is in-use. The len(
>> sequence ) provides a cardinal number. An index is an ordinal number.
> 
> That's well-observed.  But, as you say, "first" takes us to the number
> 1.  It is not very easy to equate "first" with zero.  So I might as well
> just forget about the distinction between cardinal and ordinal in Python
> indexing.

No! The difference is extremely important, even if you start talking
(?babbling) about 'the zero-th entry'!

Read on...

>> Using the "rules" (above) and transforming around len( "Jack" ), one
>> arrives at suitable 'translation' formulae/formulas:
>>
>>     negative-index = -1 * ( len( sequence ) - positive-index )
>>     positive-index = len( sequence ) + negative-index
> 
> That's nice.  I didn't write them in this form and I suppose yours is
> superior.

Superior is a lake - and not the one you're on. You're on "Huron".

(apologies, a little Illinois humor - if you can call it that)

>> The above rules describing an index do not apply to a slice's "start",
>> "stop", or "step" values! The documentation expresses:
>>
>> proper_slice ::=  [lower_bound] ":" [upper_bound] [ ":" [stride] ]
>> lower_bound  ::=  expression
>> upper_bound  ::=  expression
>> stride       ::=  expression
...

> Where are these production rules coming from?  They're not at 
>   https://docs.python.org/3/reference/grammar.html

You'll find them, when you have time to catch-up with all the web.refs -
that breadth-first/depth-first conundrum rises again!

(- way before the "Full Grammar" (which is a summary of the entire doc),
in https://docs.python.org/3/reference/expressions.html#primaries)

> The word ``stride'' doesn't appear in this grammar.

No but "slices" do!
(this topic seems to have collected a confusing collection of
nomenclature, to the point of obfuscation!)

> [...]
> 
>> Lesson:
>> Whereas some like to talk of ( start:stop:step ), it might be more
>> helpful to verbalise the relationships as "starting at:" and "stopping
>> before"!
> 
> That's a nice vocabulary for the context.  Thanks.

idiom > noun
(don't tell any English-language linguists I said that!)

> [...]
>> Many people consider the "stride"/step to indicate a sub-selection of
>> elements from within the slice's defined extent. However, it may be more
>> helpful to consider the rôle of the stride as defining the 'direction'
>> in which elements will be selected (per 'reverse') - and thereafter its
>> 'step effect'.
> 
> I agree.  And also I stuck with thinking of slices as the procedures I
> wrote, so I will always think of them as loops.

Yes!
(your thinking is climbing out of that "box"!
- but am still concerned about 'coding the axiom' holding back your
understanding)

>> Remembering the rules governing indexes/indices, be advised that these
>> do not apply to either the lower_bound or the upper_bound of a slice.
...

>>>>> name[ -100:+100 ]
>> 'Jack Brandom'

Another hint.
(OK, now he's just being cruel!)

>> Just as a "lower_bound" of 0 or None is taken to mean the first item in
>> the sequence, so any ludicrous value is also 'translated'.
>>
>> Similarly, despite knowing the length (of the example-data) is 12, a
>> slice's upper-bound that evaluates to a larger number will be 'rounded
>> down'.
> 
> Nice.  That's a case where my procedures (simulating slices) fail.
> Thanks!  I'll refine them.

You see - it's actually 'right there', standing at the very edge of your
reasoning!
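
For anyone wanting to see that 'translation' spelled out, the built-in
slice object has an indices() method which reports the start, stop, and
step that would actually be applied to a sequence of a given length. A
minimal sketch (the variable names are mine):

```python
name = "Jack Brandom"

# slice.indices( length ) returns the (start, stop, step) actually applied
ludicrous = slice(-100, +100)
start, stop, step = ludicrous.indices(len(name))
print(start, stop, step)            # 0 12 1

# the clamped values select exactly the same characters
assert name[-100:+100] == name[start:stop:step] == "Jack Brandom"

# an index enjoys no such leniency
try:
    name[+100]
except IndexError as error:
    print("IndexError:", error)     # IndexError: string index out of range
```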

> [...]
>> but to demonstrate with some non-default "bounds", (if you'll permit the
>> (weak-)humor - and with no insult intended to your good name) we could
>> describe distaste for a popular breakfast cereal (All Bran™):
>>
>>>>> name[ 1:9 ]
>> 'ack Bran'
>>>>> name[ -11:-3 ]
>> 'ack Bran'
> 
> Lol.  It's not even my name.  I think maybe we should all use a
> different name at every post.  Or perhaps a different name at every
> thread.  The USENET is a place where poster names don't matter at all.
> It's only the content that matters.  (``USENET is a strange place.'' --
> Dennis Ritchie.)

Remote courses (like ours) are (justly) criticised for lacking the
social interaction of a 'class room', particularly the inter-personal
relationship between tutor and trainee. Nevertheless, it surprises me
that it is quite possible to recognise people by their writing, the way
they ask questions, etc. (it shouldn't (be a surprise), because Morse
Code operators were able to identify each other purely by their "fist" -
not even with the hint of the name/label which we can see on email,
discussion boards, etc)

Whereas lists are general - this response goes not only to you, but to
anyone/everyone else who may be interested. Regardless of "lurkers"
(people paying attention but not making their 'presence' known) I'd
argue that it is better to be able to address a person as an individual.

Accordingly, I prefer 'real names'. That said, I go by "dn" which
appears two-faced. The rationale is that there seems to always be
another "David" within hearing distance; and my family name, "Neil", is
also used as a given-name, and also reasonably popular. A pathetic
attempt at asserting my uniqueness - maybe...

The other side of that is the use of illegal email addresses. These are
a pain to server admins and list admins alike, as they generate
bounce-back messages and failures to the log. This 'polluting traffic'
is why email practice changed to never 'returning' spam messages, but
only allowing them to become lost in cyber-space. Fine for spam, but
irritating should one slightly mis-type a correct address (and thus not
be aware of the communication-failure until 'it is too late'!).

The related consideration may be allayed by the fact that (to-date) I
have not experienced either this list's email address or my (slightly
different) address over on the Python-Tutor list, being 'harvested' and
used for spam.

There's also an inherent irony in being prepared to ask others for help,
but not being sufficiently 'honest' to use your own name and address.

The net effect (punny hah, hah!) is to add traffic (to the community) in
an attempt to reduce traffic (for yourself)...

> [...]
>> <<<
>> The formal syntax makes no special provision for negative indices in
>> sequences; however, built-in sequences all provide a __getitem__()
>> method that interprets negative indices by adding the length of the
>> sequence to the index (so that x[-1] selects the last item of x). The
>> resulting value must be a nonnegative integer less than the number of
>> items in the sequence, and the subscription selects the item whose index
>> is that value (counting from zero). Since the support for negative
>> indices and slicing occurs in the object’s __getitem__() method,
>> subclasses overriding this method will need to explicitly add that support.
> 
> That's interesting.  Thanks for the reference.  It is precisely this
> kind of thing I was looking for.

Utilising this is way 'up' in Python-Master country!
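
To make the last sentence of that docs extract concrete, here is a rough
sketch of a class which has to add the support itself. The class name
Letters is invented for the example, and real code would more likely
build upon collections.abc.Sequence:

```python
class Letters:
    """Hypothetical minimal sequence: negative indexes and slices handled by hand."""

    def __init__(self, text):
        self._text = text

    def __len__(self):
        return len(self._text)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # slices: let slice.indices() translate and clamp, then loop over a range
            return "".join(self._text[i] for i in range(*key.indices(len(self))))
        if key < 0:
            # indexes: add the length, as the docs describe
            key += len(self)
        if not 0 <= key < len(self):
            raise IndexError("Letters index out of range")
        return self._text[key]


name = Letters("Jack Brandom")
print(name[-12], name[0])       # J J
print(name[-2:-13:-1])          # odnarB kcaJ
```

Note the two separate code-paths: one set of rules for an index, another
for a slice.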

> [...]
>> Finally, it is not forgotten that you want to code a loop which
>> simulates a slice with negative attributes. (although it is hoped that
>> after the above explanations (and further reading) such has become
>> unnecessary as a learning-exercise!)
> 
> Has it become unnecessary?  For using Python, sure.  But as you noticed
> --- I like to see the rules and be able to write them myself.  So it is
> necessary for certain types. :-) 

Accordingly, we shall continue...
(whilst also keeping range and slice objects, and their differences,
in-mind)

>> Please recall that whilst a slice-object will not, a range-object will
>> work with a for-loop. So:
>>
>>>>> rng = range( 4, 0, -1 )
>>>>> list( rng )
>> [4, 3, 2, 1]
>>>>> for index in rng:
>> ...     print( name[ index ], end="   " )
>> ...
>>     k   c   a   >>>
>>
>> Oops! This looks familiar, so apply the same 'solution':
>>
>>>>> rng = range( 4, -1, -1 )
>>>>> list( rng )
>> [4, 3, 2, 1, 0]
>>>>> for index in rng:
>> ...     print( name[ index ], end="   " )
>> ...
>>     k   c   a   J

Here is the 'solution' to your unsatisfied question, but using 'positive
indexing'!
(with negative entries in the range, including the stride!)
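
A related trick, offered as a sketch rather than as 'the' answer: a
range object can itself be sliced, so slicing range( len( name ) )
yields the positive indexes which any slice, negative bounds and all,
will visit:

```python
name = "Jack Brandom"

positions = range(len(name))        # 0, 1, ... 11

# slicing a range gives another range - of plain, positive indexes
backwards = positions[-2:-13:-1]
print(list(backwards))              # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# so a for-loop over it reproduces the slice, using only positive indexing
assert "".join(name[index] for index in backwards) == name[-2:-13:-1]
```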

>> The 'bottom line' is that such simulation code will become torturous
>> simply because indexes/indices follow different rules to slices!

Did I say "torturous"?
Should it have been "tortuous"?

Why did I say it?
Does either or both apply?

Right, back to your question (finally!):
<<<
So if we begin at, say, -2 and work backwards to the beginning of the
string so as to include the "J", the index must be -13 --- there is no
positive index that could take the place of -13.  (And that's because
--- I eventually realized --- there is no positive number to the left of
zero.  You, of course, notice the same.  So why are you saying it's
possible to substitute negative indices with positive?)
>>>

The problem appears as:

>>> name[ -10:-12:-1 ]
'ca'

Treating this as an indexing-problem, we can go back to the 'chart':

  Pos Ltr Neg
  ndx     ndx
   0   J  -12
   1   a  -11
   2   c  -10
   3   k   -9
   ...

and we can see that an 'out by one' issue is (apparently) preventing us
from getting out the "J". Now J has the index -12, and we've used that;
but a slice treats that value as the "stop-before" position, whereas we
want it included ("closed" not "open").

Oops!
or arggggghhhhh!

If we continue to treat it as an indexing-problem, what we'd like to do
is use an index of -13 - but that ain't goin' to fly [Wilbur]:

>>> name[ -13 ]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

The issue stems from the warning that indexes follow one set of rules,
and slices (which might be seen as presenting like multiple
indexes/indices) march to the tune of a different drum!

Thus, change gear and approach the matter as a slicing-problem!

Now, we can go outside the box/off the top of the chart (reproduced above):

>>> name[ -2:-13:-1 ]
'odnarB kcaJ'

or

we can take advantage of the idea that slices can be specified with
missing components - in this case the upper-bound. Instead, asking for
default-values to be substituted*:

>>> name[ -2::-1 ]
'odnarB kcaJ'

* either len( sequence )
or because this case involves a negative-stride: -len( sequence ) - 1
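
A quick check of that footnote (a sketch only; the point is that the
default is None, and typing -1 in its place does not mean the same thing):

```python
name = "Jack Brandom"

# an omitted upper-bound is None, not a number
assert name[-2::-1] == name[-2:None:-1] == "odnarB kcaJ"

# with a negative stride, None behaves like -len( sequence ) - 1
assert name[-2:-len(name) - 1:-1] == "odnarB kcaJ"

# slice.indices() reports the 'stop' as -1, but typing -1 means 'the last item'
print(slice(-2, None, -1).indices(len(name)))   # (10, -1, -1)
print(repr(name[10:-1:-1]))                     # ''
```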

PS the ability to 'over-run'/'under-run' the negative index 'limit' in a
slice was mentioned (but has been edited-out of the response):

<<<
The reverse direction requires a re-think of the lower-bound value
we were using (above) and may necessitate a review of the schematics
presented earlier, in conjunction with the closed-open convention:

>>> name[ 4-1::-1 ]
'kcaJ'
>>>

and

<<<
What is rather handy, is that using a negative (unity) stride has the
effect of reversing the string (sequence):

>>> name[ ::-1 ]
'modnarB kcaJ'
>>>

...and now to make it all about 'me':
If such was not sufficiently emphasised, do I need to review the topic
to improve coverage?

Now the boot is on the other foot: perhaps I should be using the program
design advice (begin the decision tree by looking at the stride's
direction) and apply that to proof-reading my talk's coverage to ensure
that each 'half' of the 'network tree' is equally-well/sufficiently covered!

Which brings us to:-

>> and first splitting the implementation's decision-tree into two paths,
>> according to whether the stride is positive or negative, before getting
>> into 'the nitty-gritty'.
> 
> That I had already done.

Great!

...
> Yes, send me your tests! (``Caveat emptor'' noted.) (I had included all
> tests that appeared in this thread, but had missed, for example, "Jack
> Brandom"[-100 : 100].  Didn't occur to me this wouldn't blow an
> exception.)

(and here's the justification for the earlier rave about using real
email addresses - I should be sending this to you-alone, and not
'cluttering-up' the entire list (and its archive)!)

<<<
# would normally put TDD/unit tests in separate file from 'real code'

"""Three-way tests comparing:
   1 the output of the slice simulation function (under test)
   2 the expected output string
   3 the actual output from a slice
"""

name = "Jack Brandom"

# assert web.ref (in case needed):
# https://www.simplilearn.com/tutorials/python-tutorial/assert-in-python

assert ( my_slice( name, lower_bound=0, upper_bound=12 ) ==
         'Jack Brandom' == name[ 0:12 ] )
assert ( my_slice( name, upper_bound=6 ) == 'Jack B' == name[ :6 ] )
assert ( my_slice( name, lower_bound=5 ) == 'Brandom' == name[ 5: ] )
assert ( my_slice( name ) == 'Jack Brandom' == name[ : ] )
assert ( my_slice( name, stride=2 ) == 'Jc rno' == name[ ::2 ] )
assert ( my_slice( name, lower_bound=-100, upper_bound=+100 ) ==
         'Jack Brandom' == name[ -100:100 ] )
assert ( my_slice( name, upper_bound=4 ) == 'Jack' == name[ :4 ] )
assert ( my_slice( name, upper_bound=-8 ) == 'Jack' == name[ :-8 ] )
assert ( my_slice( name, lower_bound=5 ) == 'Brandom' == name[ 5: ] )
assert ( my_slice( name, lower_bound=-7 ) == 'Brandom' == name[ -7: ] )
assert ( my_slice( name, lower_bound=1, upper_bound=9 ) ==
         'ack Bran' == name[ 1:9 ] )
assert ( my_slice( name, lower_bound=-11, upper_bound=-3 ) ==
         'ack Bran' == name[ -11:-3 ] )

assert ( my_slice( name, lower_bound=3, stride=-1 ) ==
         'kcaJ' == name[ 3::-1 ] )
assert ( my_slice( name, lower_bound=-9, stride=-1 ) ==
         'kcaJ' == name[ -9::-1 ] )
assert ( my_slice( name, upper_bound=4, stride=-1 ) ==
         'modnarB' == name[ :4:-1 ] )
assert ( my_slice( name, upper_bound=-8, stride=-1 ) ==
         'modnarB' == name[ :-8:-1 ] )
assert ( my_slice( name, lower_bound=6, upper_bound=2, stride=-1 ) ==
         'rB k' == name[ 6:2:-1 ] )
assert ( my_slice( name,
                   lower_bound=-6,
                   upper_bound=-10,
                   stride=-1 ) ==
                   'rB k' ==
                   name[ -6:-10:-1 ] )
>>>
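
For comparison purposes only, here is one possible shape for such a
simulation. It is a sketch under my own assumptions, not your code: the
name my_slice and its keyword parameters are simply taken from the
tests, the inner helper translate is invented, and it only builds
strings. It splits on the stride's direction first, then applies
"starting at" and "stopping before":

```python
def my_slice(sequence, lower_bound=None, upper_bound=None, stride=None):
    """Hypothetical simulation of sequence[ lower_bound:upper_bound:stride ] for str."""
    length = len(sequence)
    step = 1 if stride is None else stride
    if step == 0:
        raise ValueError("slice step cannot be zero")

    def translate(bound, default, lowest, highest):
        """Apply the default, add the length to a negative bound, then clamp."""
        if bound is None:
            return default
        if bound < 0:
            bound += length
        return max(lowest, min(bound, highest))

    # first split the decision-tree according to the stride's direction
    if step > 0:
        index = translate(lower_bound, 0, 0, length)
        stop = translate(upper_bound, length, 0, length)
    else:
        index = translate(lower_bound, length - 1, -1, length - 1)
        stop = translate(upper_bound, -1, -1, length - 1)

    letters = []
    # "starting at" index, "stopping before" stop
    while (index < stop) if step > 0 else (index > stop):
        letters.append(sequence[index])     # only ever a plain, positive index
        index += step
    return "".join(letters)


name = "Jack Brandom"
print(my_slice(name, lower_bound=-2, upper_bound=-13, stride=-1))   # odnarB kcaJ
```

Inside the function a 'stop' of -1 is quite legal, because it is only
ever compared and never used as an index.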

> Thank you so much.

and thank you for helping to improve my talk...
-- 
Regards,
=dn