writable iterators?

Ian Kelly ian.g.kelly at gmail.com
Thu Jun 23 11:02:31 EDT 2011


On Wed, Jun 22, 2011 at 3:54 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Fortunately, that's not how it works, and far from being a "limitation",
> it would be *disastrous* if iterables worked that way. I can't imagine
> how many bugs would occur from people reassigning to the loop variable,
> forgetting that it had a side-effect of also reassigning to the iterable.
> Fortunately, Python is not that badly designed.

The example syntax is a non-starter, but there's nothing wrong with
the basic idea.  The STL of C++ uses output iterators and a quick
Google search doesn't turn up any "harmful"-style rants about those.

Of course, there are a couple of major differences between C++
iterators and Python iterators.  FIrst, C++ iterators have an explicit
dereference step, which keeps the iterator variable separate from the
value that it accesses and also provides a possible target for
assignment.  You could say that next(iterator) is the corresponding
dereference step in Python, but it is not accessible in a for loop and
it does not provide an assignment target in any case.

Second, C++ iterators separate out the dereference step from the
iterator advancement step.  In Python, both next(iterator) and
generator.send() are expected to advance the iterator, which would be
problematic for creating an iterator that does both input and output.

I don't think that output iterators would be a "disaster" in Python,
but I also don't see a clean way to add them to the existing iterator
protocol.

> If you want to change the source iterable, you have to explicitly do so.
> Whether you can or not depends on the source:
>
> * iterators are lazy sequences, and cannot be changed because there's
> nothing to change (they don't store their values anywhere, but calculate
> them one by one on demand and then immediately forget that value);

No, an iterator is an object that allows traversal over a collection
in a manner independent of the implementation of that collection.  In
many instances, especially in Python and similar languages, the
"collection" is abstracted to an operation over another collection, or
even to the results of a serial computation where there is no actual
"collection" in memory.

Iterators are not lazy sequences, because they do not behave like
sequences.  You can't index them, you can't reiterate them, you can't
get their length (and before you point out that there are ways of
doing each of these things -- yes, but none of those ways use
sequence-like syntax).  For true lazy sequences, consider the concept
of streams and promises in the functional languages.

In any case, the desired behavior of an output iterator on a source
iterator is clear enough to me.  If the source iterator is also an
output iterator, then it propagates the write to it.  If the source
iterator is not an output iterator, then it raises a TypeError.

> * mutable sequences like lists can be changed. The standard idiom for
> that is to use enumerate:
>
> for i, e in enumerate(seq):
>    seq[i] = e + 42

Unless the underlying collection is a dict, in which case I need to do:

for k, v in d.items():
    d[k] = v + 42

Or a file:

for line in f:
    # I'm not even sure whether this actually works.
    f.seek(-len(line))
    f.write(line.upper())

As I said above, iterators are supposed to provide
implementation-independent traversal over a collection.  For writing,
enumerate fails in this regard.



More information about the Python-list mailing list