Possible improvement to slice operations.

Mon Sep 5 08:08:22 EDT 2005

Ron Adam wrote:
> Slicing is one of the best features of Python in my opinion, but
> when you try to use negative index's and or negative step increments
> it can be tricky and lead to unexpected results.

Hm... Just as with positive indexes, you just need to understand
the concept properly.

> This topic has come up fairly often on comp.lang.python, and often 
> times, the responses include:
> 
>     * Beginners should avoid negative extended slices.
> 
>     * Slices with negative step increments are for advanced
>       python programmers.

I certainly wouldn't respond like that...

>     * It's not broke if you look at it in a different way.
> 
>     * You should do it a different way.

Those are correct, but need to be complemented with helpful
explanations. Things often seem strange simply because we aren't
looking at them the right way, and we need a new perspective.
For instance, it took me a long time to understand OOP, but once
the concept was clear, things fell into place. Often, we fail to
see things due to preconceptions that mislead us. Like Yoda said:
"You must unlearn what you have learned."

>     - Extended slices with negative values return values that
>       have less items than currently.
> 
>     - Slices with negative step values return entirely different
>       results.

Over my dead body! ;) Honestly, I can't imagine you'll get agreement
over such a change.

> REVERSE ORDER STEPPING
> ----------------------
> When negative steps are used, a slice operation
> does the following.  (or the equivalent)
> 
>    1. reverse the list
>    2. cut the reversed sequence using start and stop
>    3. iterate forward using the absolute value of step.

I think you are looking at this from the wrong perspective.

Whatever sign c has:
For s[a:b:c], a is the index for the first item to include,
b is the item after the last to include (just like .end() in
C++ iterators for instance), and c describes the step size.

To get a non-empty result, you obviously must have a > b iff
c < 0.

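a defaults to 0, b defaults to None (which represents the
item beyond the end of the sequence) and c defaults to 1.
(With c < 0, the defaults for a and b swap ends: a is the
last item and b is the position before the first item.)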

This is basically all you need to understand before you use
a or b < 0. There are no special cases or exceptions.
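
A quick sketch of that rule, using a made-up six-item list (the
names s, forward, backward and so on are only for illustration).
For indices inside the sequence, s[a:b:c] picks the same items as
range(a, b, c):

```python
s = list("abcdef")

# a is the first index included, b is the first index NOT included,
# and c is the step, whatever its sign.
forward = s[1:4]          # indices 1, 2, 3
backward = s[4:1:-1]      # indices 4, 3, 2
skipping = s[5:0:-2]      # indices 5, 3, 1

# With c < 0 you need a > b; otherwise nothing is selected.
empty = s[1:4:-1]

print(forward)   # ['b', 'c', 'd']
print(backward)  # ['e', 'd', 'c']
print(skipping)  # ['f', 'd', 'b']
print(empty)     # []

# For in-range, non-negative a and b the slice matches range(a, b, c).
assert backward == [s[i] for i in range(4, 1, -1)]
```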

The concept of negative indices is completely orthogonal
to the concept of slicing today. You can learn and
understand them independently, and will automatically
be able to understand how to use the concepts together,
iff you actually understood both concepts correctly.
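
To illustrate the orthogonality, here's a small sketch (with an
example list of my own): a negative index i that is in range simply
means len(s) + i, both for plain indexing and for slice bounds.

```python
s = list("abcdef")
n = len(s)

# Plain indexing: s[-k] is s[n - k] for 1 <= k <= n.
assert all(s[-k] == s[n - k] for k in range(1, n + 1))

# The same translation works inside a slice, with either sign of step.
assert s[-4:-1] == s[n - 4:n - 1] == ['c', 'd', 'e']
assert s[-1:-4:-1] == s[n - 1:n - 4:-1] == ['f', 'e', 'd']
```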

>    1. cut sequence using start and stop.
>    2  reverse the order of the results.
>    3. iterate forward using the absolute value of step.

With this solution, you suddenly have two different cases
to consider. You're suggesting that with c < 0, a should be
the end, and b should be the start. Now, it's no longer
obvious whether a or b should be excluded from the result.
I'm pretty sure the number of bugs and questions regarding
negative slices would grow quite a lot.
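
To make that concrete, here's my reading of the three proposed
steps as a hypothetical helper (proposed_slice is a name I made up,
and this is only a sketch of the proposal as I understand it, not
anything Python does):

```python
def proposed_slice(seq, a, b, c):
    """Hypothetical: cut with a and b first, then reverse, then step."""
    cut = seq[a:b]              # 1. cut sequence using start and stop
    if c < 0:
        cut = cut[::-1]         # 2. reverse the order of the results
    return cut[::abs(c)]        # 3. iterate forward using abs(step)

s = list("abcdef")

today = s[4:1:-1]                       # starts at a=4, stops before b=1
proposed = proposed_slice(s, 1, 4, -1)  # starts at b-1=3, ends at a=1

print(today)     # ['e', 'd', 'c']
print(proposed)  # ['d', 'c', 'b']
```

With today's rule the result always begins with s[a]. With the
sketch above, a negative step makes it begin with s[b-1] and end
with s[a], so which bound gets excluded depends on the sign of c.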

> CURRENT INDEXING
> ----------------
> 
> Given a range [a,b,c]:
> 
>   Positive indexing
> 
>   | a | b | c |
>   +---+---+---+
>   0   1   2   3
> 
>   Current negative indexing.
> 
>   | a | b | c |
>   +---+---+---+
>  -3  -2  -1  -0

Almost right, there's no such thing as -0 in Python.
It's more like this:

>   Current negative indexing.
>
>   | a | b | c |
>   +---+---+---+
>  -3  -2  -1  None

Of course, [a,b,c][None] won't work, since it would
access beyond the list. It actually raises a TypeError
rather than an IndexError, and you might argue whether this
is the right thing to do. No big deal in my opinion. For
slices, None works as intended, giving the default in all
three positions. This means that if you want a negative
index that might be "-0" for the slice end, you simply
write something like

seq[start:end or None]
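
A minimal sketch of why the "or None" matters, assuming a
hypothetical helper drop_tail (my name, not a real function) that
removes the last n items:

```python
def drop_tail(seq, n):
    """Hypothetical helper: return seq without its last n items (n >= 0)."""
    # -n is 0 when n == 0, and seq[:0] would be empty, so fall back to None.
    return seq[:-n or None]

s = ['a', 'b', 'c']

print(s[:-0])           # [] because -0 is just 0
print(drop_tail(s, 0))  # ['a', 'b', 'c']
print(drop_tail(s, 2))  # ['a']

# None is fine inside a slice but not as a plain index.
try:
    s[None]
except TypeError as exc:
    print("TypeError:", exc)
```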

> Accessing a range at the end of a list numerically
> becomes inconvenient when negative index's are used
> as the '-0'th position can not be specified numerically
> with negative values.

But None works as is...

> ONES BASED NEGATIVE INDEXING
> ----------------------------
> Making negative index's Ones based, and selecting
> individual item to the left of negative index's would enable
> addressing the end of the list numerically.
> 
>   Ones based negative index's.
> 
>   | a | b | c |
>   +---+---+---+
>  -4  -3  -2  -1

Right. 0-based positive indices, and 1-based negative indices, but
you don't actually get the indexed value, but the one before...
I fear that your cure is worse than the "disease".

Today, a[x:x+1 or None] is always the same as [a[x]] for cases
where the latter doesn't yield an exception. With your change,
it won't be. That's not an improvement in my mind.
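
That identity is easy to check by brute force over every valid
index of a small example list (my own test data):

```python
a = ['a', 'b', 'c']

# Every index that doesn't raise IndexError: -3 .. 2.
for x in range(-len(a), len(a)):
    # x + 1 is 0 only when x == -1; "or None" then means "to the end".
    assert a[x:x + 1 or None] == [a[x]]

# Without "or None" the x == -1 case breaks: a[-1:0] is empty.
print(a[-1:0])          # []
print(a[-1:0 or None])  # ['c']
```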

>    a[-3:-2] -> b  # item between index's
But a[-3] -> a! Shrug!

> The '~' is the binary not symbol which when used
> with integers returns the two's compliment. This
> works nice with indexing from the end of a list
> because convieniently ~0 == -1.

But that's only helpful if you want to write literal sequence
indexes, and that's not how real programming works. Further,
it's only an advantage if we agree that your suggestion is
broken and we want some kind of half-baked fix, to make it
work as it should.

> This creates a numerical symmetry between positive
> indexing and '~' "End of list" indexing.
> 
>    a[0]  -> first item in the list.
>    a[~0] -> last item in the list.

You can use this silly trick today too, if you think
it looks prettier than a[-1], but for the newbie,
this either means that you introduce magic, or that
you must teach two concepts instead of one.
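
For the record, the trick works today because ~i equals -i - 1 for
any integer, so ~i counts from the end the way i counts from the
start. A small sketch with an example list of mine:

```python
a = ['a', 'b', 'c', 'd']

# ~i == -i - 1, so ~0 == -1, ~1 == -2, and so on.
assert all(~i == -i - 1 for i in range(10))

print(a[0], a[~0])  # a d
print(a[1], a[~1])  # b c

# a[~i] mirrors a[i] around the middle of the list.
assert [a[~i] for i in range(len(a))] == a[::-1]
```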

>    a[0:~0] -> whole list.

I don't see why that's better than a[:], or a[0:None].
After all, the visual aspect only appears when you type
literals, which isn't very interesting. For calculated
values on the slice borders, you still have -1 as end
value.

>    a[1:~1] -> center, one position from both ends.

This is just a convoluted way of writing a[1:-2], which
is exactly what you would write today.
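
With today's semantics the "~" bounds are just ordinary negative
bounds, as a quick sketch shows (the example list is mine):

```python
a = ['a', 'b', 'c', 'd', 'e']

# ~0 == -1 and ~1 == -2, so these are plain negative slice ends.
assert a[0:~0] == a[0:-1] == ['a', 'b', 'c', 'd']  # not the whole list today
assert a[1:~1] == a[1:-2] == ['b', 'c']

# The whole list, and "one in from both ends", as written today.
assert a[:] == a[0:None] == ['a', 'b', 'c', 'd', 'e']
assert a[1:-1] == ['b', 'c', 'd']
```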