Sequence iterators with __index__

schickb schickb at gmail.com
Wed Jun 25 03:11:54 EDT 2008


On Jun 24, 5:46 pm, Terry Reedy <tjre... at udel.edu> wrote:
>
> Wanting to slice while iterating is a *very* specialized usage.

I disagree because iterators mark positions, which for sequences are
just offsets. And slicing is all about offsets. Here is a quote from
the already implemented PEP 357:

"Currently integers and long integers play a special role in slicing
in that they are the only objects allowed in slice syntax. In other
words, if X is an object implementing the sequence protocol, then
X[obj1:obj2] is only valid if obj1 and obj2 are both integers or long
integers.  There is no way for obj1 and obj2 to tell Python that they
could be reasonably used as indexes into a sequence.  This is an
unnecessary limitation."

But this isn't just about slicing. I'd like sequence iterators to be
usable as simple indexes as well, as in a[it] (which __index__ would
also provide).
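
To make that concrete, here is a rough sketch of the behavior I have in
mind. The SeqIter class is made up for illustration; the built-in
sequence iterators do not work this way today. It is written for
Python 3 (on the 2.x line the method would be spelled next rather than
__next__).

```python
class SeqIter:
    """Hypothetical sequence iterator that exposes its offset via __index__."""

    def __init__(self, seq):
        self._seq = seq
        self._pos = 0  # offset of the next item to be returned

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._seq):
            raise StopIteration
        item = self._seq[self._pos]
        self._pos += 1
        return item

    def __index__(self):
        # PEP 357 hook: lets the iterator stand in for its current offset
        return self._pos


a = [10, 20, 30, 40, 50]
it = SeqIter(a)
next(it)
next(it)          # two items consumed, so the iterator now marks offset 2

rest = a[it:]     # slice from the iterator's position to the end
current = a[it]   # plain indexing goes through __index__ as well
```

Here rest is [30, 40, 50] and current is 30. No separate counter is
needed beyond the offset the iterator has to keep anyway.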

> In any case:
> A. If the iterator uses an incrementing index to iterate, you want access.
> B. Using an iterator as an integer will strike most people as
> conceptually bizarre; it will never be accepted.

It's not meant to be used as an integer. It's meant to be used as a
position in the sequence, which iterators already are. The fact that
the position is represented as an integer is not that important
(except to Python). I'll grant you that it is conceptually strange
that you could use an iterator on one sequence as an index into
another.

> C. Doing so is unnecessary since the internal index can just as easily
> be exposed as an integer attribute called 'index' or, more generally,
> 'count'.
> a[it.count:] looks *much* better.
> D. You can easily add .index or .count to any iterator you write.  The
> iterator protocol is a minimum rather than maximum specification.

Following that line of reasoning, the __index__ special method
shouldn't really exist at all. Your arguments would suggest that NumPy
shouldn't use __index__ either because:
a[ushort.index] "looks *much* better".

> E. You can easily wrap any iterable/iterator in an iterator class that
> provides .count for *any* iteration process.

Sure, and that is why I mentioned this in my original post. But the
idea is to avoid redundant code and data in the case of sequences, and
make it a standard feature.
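
For reference, the wrapper approach looks roughly like the sketch
below. The Counted class and its count attribute are names I picked
for illustration, not anything standard.

```python
class Counted:
    """Hypothetical wrapper that counts the items pulled from any iterable."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.count = 0  # number of items handed out so far

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates to the caller
        self.count += 1
        return item


a = list("abcdef")
it = Counted(a)
next(it)
next(it)

tail = a[it.count:]  # ['c', 'd', 'e', 'f']
```

It works, but for a sequence the wrapper keeps a second counter
alongside whatever offset the underlying iterator already tracks,
which is the redundancy I'd like to avoid.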

> F. Even this should be unnecessary for most usages.  Built-in function
> enumerate(iterable) generates count,item pairs in much the same manner:

I am not aware of a way to get the current position out of an
enumerate object without advancing it (or creating a custom wrapper).
If the special __index__ method were added it might be interesting ;)
But iterators are already a clean and abstract position marker, and
for sequences it seems surprising to me that they can't really be used
as such.
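
A quick sketch of what I mean about enumerate, using only built-ins
(the variable names are arbitrary):

```python
letters = ["x", "y", "z"]
pairs = enumerate(letters)

# The position only arrives together with an item, by advancing.
first = next(pairs)  # (0, 'x')

# The enumerate object itself offers no __index__ hook ...
has_hook = hasattr(pairs, "__index__")

# ... so it cannot be used as a slice bound.
try:
    letters[pairs:]
    sliced = True
except TypeError:
    sliced = False
```

After running this, has_hook and sliced are both False: the only way
to learn where the enumerate object stands is to consume the next
pair.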

-Brad


