How about adding slice notation to iterators/generators?

Stephen Hansen apt.shansen at gmail.com
Fri Oct 16 11:31:36 EDT 2009


On Fri, Oct 16, 2009 at 2:02 AM, Bearophile <bearophileHUGS at lycos.com> wrote:

> Terry Reedy:
> >2. iterator protocol is intentionally simple.<
>
> Slice syntax is already available for lists, tuples, strings, arrays,
> numpy, etc, so adding it to iterators too doesn't look like adding
> that large amount of information to the mind of the programmer.
>
>
Except it does; not necessarily for the user, but for the implementor. Lists,
tuples, strings, arrays, numpy are not protocols but data types -- sure,
data types which share certain sets of methods that must be implemented for
a custom data type to successfully replace one of them, but they aren't
really protocols unto themselves.

Iterators are completely different; iteration is a common protocol which any
type can implement to support traversal of its internal data. Right now it's
really simple to implement that protocol in new and different types... if
slice syntax were added to iterators, that would be more complicated. The
required protocol would be twice (or more!) as complicated to implement and
support.

I use custom types which are iterables a lot, and about 75% of them are
just really simple. I don't need to worry about thread safety or anything
like that, so when someone uses one in a loop and Python calls __iter__ to
get an iterator for my data type, I simply set an internal state variable
and 'return self'.

Why? Because it's just easy that way. There's no *need* to do anything more
complicated. The next/__next__ method on the same object then just returns
an item from its collection and advances the internal state variable. All is
easy.
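
To make that concrete, here is a minimal sketch of the pattern in Python 3
spelling (__next__ rather than next). The Countdown class and its attribute
names are made up for illustration; they are not from any real code of mine.

```python
class Countdown:
    """Hypothetical container that acts as its own iterator."""

    def __init__(self, values):
        self._values = list(values)
        self._pos = 0  # the single internal state variable

    def __iter__(self):
        # Reset the cursor and hand back the object itself.
        self._pos = 0
        return self

    def __next__(self):
        # Return the current item and advance the cursor.
        if self._pos >= len(self._values):
            raise StopIteration
        item = self._values[self._pos]
        self._pos += 1
        return item


c = Countdown([3, 2, 1])
first_pass = list(c)   # [3, 2, 1]
second_pass = list(c)  # [3, 2, 1] again, because __iter__ resets the cursor
```

The trade-off is that every loop over the same object shares one cursor,
which is fine as long as nothing iterates over it concurrently.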

If you could directly slice iterators, this would completely fail. I
wouldn't be able to just use the object as its own iterator in that 75% of
cases, because the object _already_ has support for __getitem__ (and
occasionally, __getslice__) returning data directly. So when I want to
make a type iterable, I now /have/ to have __iter__ return a special class.
That's very unfortunate.
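
A rough sketch of the collision, again in Python 3 spelling. The Deck class
is hypothetical, and "iterator-style slicing" here is my assumption about
what such a feature would mean (take the next N remaining items), emulated
with plain next() calls.

```python
class Deck:
    """Hypothetical container that is also its own iterator."""

    def __init__(self, cards):
        self._cards = list(cards)
        self._pos = 0

    def __getitem__(self, index):
        # Container meaning: positions are counted from the start of the data.
        return self._cards[index]

    def __iter__(self):
        self._pos = 0
        return self

    def __next__(self):
        if self._pos >= len(self._cards):
            raise StopIteration
        card = self._cards[self._pos]
        self._pos += 1
        return card


deck = Deck(["A", "K", "Q", "J"])
it = iter(deck)  # the very same object as deck
next(it)         # consume "A"

container_view = deck[0:2]            # ['A', 'K'], counted from the start
iterator_view = [next(it), next(it)]  # ['K', 'Q'], counted from the cursor
```

Since deck and it are one object, a single [0:2] cannot mean both things;
one of the two readings would have to move to a separate class.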

Sure, in the other 25% of cases where I make iterables, I'm already doing
more complex work and so have a special iterator returned from __iter__;
sometimes because I need to be able to iterate over the contents from more
than one thread and so need iterator-specific state, sometimes because
there's more than one obvious meaning of "the next item" in a particular
collection. For those, adding __getslice__ doesn't break existing
implementations... but it also isn't a completely straightforward thing to
add in all cases.
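
For comparison, a sketch of the more involved shape, with the cursor living
on a separate iterator object. Inventory and _InventoryIterator are invented
names, and this only shows independent per-iterator state; it is not by
itself a thread-safety recipe.

```python
class Inventory:
    """Hypothetical container that returns a fresh iterator each time."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return _InventoryIterator(self._items)


class _InventoryIterator:
    """Holds the cursor, so each loop gets its own position."""

    def __init__(self, items):
        self._items = items
        self._pos = 0  # state lives on the iterator, not the container

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item


inv = Inventory([1, 2, 3])
first, second = iter(inv), iter(inv)
next(first)                         # advances only the first iterator
pair = (next(first), next(second))  # (2, 1)
```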

The *protocol* is intentionally simple so many things can easily be
iterables and produce iterators... adding slicing directly would complicate
that.

I'd much rather leave iterator slicing in itertools-- personally.
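
For reference, this is roughly what that looks like today with
itertools.islice; the squares generator is just a made-up example source.

```python
from itertools import islice


def squares():
    """Hypothetical endless generator: 0, 1, 4, 9, ..."""
    n = 0
    while True:
        yield n * n
        n += 1


# Like squares()[:5], but works on any iterator.
first_five = list(islice(squares(), 5))      # [0, 1, 4, 9, 16]

# Like squares()[2:10:3]: items at positions 2, 5 and 8.
strided = list(islice(squares(), 2, 10, 3))  # [4, 25, 64]
```

Note that islice consumes the underlying iterator as it goes and does not
accept negative indices, which is part of why it is a function rather than
syntax.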

--S

(On reflection, I suppose it is possible that Python could just not use
__getslice__ on iterators, and instead basically invoke itertools.islice
when one uses slicing syntax on them... except in /that/ case, I would
oppose it on grounds of black magic and witchcraft :) Syntax produces
special method calls on objects to implement functionality! It doesn't do so
for some types of objects and then go invoke mystery code for other kinds
:))
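
A small sketch of that dispatch rule as it behaves in Python 3, where slice
syntax always goes through __getitem__ with a slice object. Probe is a
made-up class used only to show what the method receives.

```python
class Probe:
    """Hypothetical class that echoes whatever the slice syntax passes in."""

    def __getitem__(self, key):
        return key


received = Probe()[1:3]  # slice(1, 3, None)

# A generator defines no __getitem__, so the same syntax simply fails.
gen = (x for x in range(10))
try:
    gen[1:3]
    sliced_ok = True
except TypeError:
    sliced_ok = False
```

The syntax does nothing special for the generator; there is no method to
call, so Python raises TypeError instead of quietly picking another code
path.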