writable iterators?

Neal Becker ndbecker2 at gmail.com
Thu Jun 23 12:06:10 EDT 2011


Ian Kelly wrote:

> On Wed, Jun 22, 2011 at 3:54 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> Fortunately, that's not how it works, and far from being a "limitation",
>> it would be *disastrous* if iterables worked that way. I can't imagine
>> how many bugs would occur from people reassigning to the loop variable,
>> forgetting that it had a side-effect of also reassigning to the iterable.
>> Fortunately, Python is not that badly designed.
> 
> The example syntax is a non-starter, but there's nothing wrong with
> the basic idea.  The STL of C++ uses output iterators and a quick
> Google search doesn't turn up any "harmful"-style rants about those.
> 
> Of course, there are a couple of major differences between C++
> iterators and Python iterators.  First, C++ iterators have an explicit
> dereference step, which keeps the iterator variable separate from the
> value that it accesses and also provides a possible target for
> assignment.  You could say that next(iterator) is the corresponding
> dereference step in Python, but it is not accessible in a for loop and
> it does not provide an assignment target in any case.
> 
> Second, C++ iterators separate out the dereference step from the
> iterator advancement step.  In Python, both next(iterator) and
> generator.send() are expected to advance the iterator, which would be
> problematic for creating an iterator that does both input and output.
> 
> I don't think that output iterators would be a "disaster" in Python,
> but I also don't see a clean way to add them to the existing iterator
> protocol.
> 
>> If you want to change the source iterable, you have to explicitly do so.
>> Whether you can or not depends on the source:
>>
>> * iterators are lazy sequences, and cannot be changed because there's
>> nothing to change (they don't store their values anywhere, but calculate
>> them one by one on demand and then immediately forget that value);
> 
> No, an iterator is an object that allows traversal over a collection
> in a manner independent of the implementation of that collection.  In
> many instances, especially in Python and similar languages, the
> "collection" is abstracted to an operation over another collection, or
> even to the results of a serial computation where there is no actual
> "collection" in memory.
> 
> Iterators are not lazy sequences, because they do not behave like
> sequences.  You can't index them, you can't reiterate them, you can't
> get their length (and before you point out that there are ways of
> doing each of these things -- yes, but none of those ways use
> sequence-like syntax).  For true lazy sequences, consider the concept
> of streams and promises in the functional languages.
> 
> In any case, the desired behavior of an output iterator on a source
> iterator is clear enough to me.  If the source iterator is also an
> output iterator, then it propagates the write to it.  If the source
> iterator is not an output iterator, then it raises a TypeError.
> 
>> * mutable sequences like lists can be changed. The standard idiom for
>> that is to use enumerate:
>>
>> for i, e in enumerate(seq):
>>     seq[i] = e + 42
> 
> Unless the underlying collection is a dict, in which case I need to do:
> 
> for k, v in d.items():
>     d[k] = v + 42
> 
> Or a file:
> 
> for line in f:
>     # I'm not even sure whether this actually works.
>     f.seek(-len(line))
>     f.write(line.upper())
> 
> As I said above, iterators are supposed to provide
> implementation-independent traversal over a collection.  For writing,
> enumerate fails in this regard.


While Python itself may not have output iterators, NumPy has, interestingly,
just added this capability as part of nditer.  So this may suggest a syntax.
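
For reference, here is a minimal sketch of what writing through nditer looks
like.  The array contents and the "+ 42" update are just my illustration, and
the "with" form assumes a reasonably recent NumPy (older releases used the
same loop without the context manager):

```python
import numpy as np

# Example data only; any writable array would do.
a = np.arange(6, dtype=np.int64).reshape(2, 3)

# op_flags=['readwrite'] asks nditer for operands that can be written to.
with np.nditer(a, op_flags=['readwrite']) as it:
    for x in it:
        # x is a zero-dimensional array; assigning to x[...] writes through
        # to the corresponding element of a instead of rebinding the name x.
        x[...] = x + 42
```

Note that the write goes through the explicit x[...] assignment, so plain
rebinding of the loop variable still has no side effect on the array.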

A number of responses to my question suggest using indexing (maybe with
enumerate).  Once again, this is not suitable for many data structures.  C++
and the STL teach that iteration is often far more efficient than indexing.
Think of a linked list, where reaching element i means walking i nodes.  Even
for a dense multi-dimensional array, index calculations are much slower than
iteration.
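
To make the linked-list point concrete, here is a rough sketch.  The Node and
LinkedList classes and the cells() method are hypothetical names of my own,
not anything from the standard library.  Indexed assignment has to walk from
the head each time, while an iterator that yields the nodes themselves gives
an assignment target at each step:

```python
class Node:
    """One cell of a singly linked list (hypothetical helper)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


class LinkedList:
    """Minimal singly linked list, just enough to compare the two styles."""
    def __init__(self, values=()):
        self.head = None
        for v in reversed(list(values)):
            self.head = Node(v, self.head)

    def __iter__(self):
        # Ordinary read-only iteration: yields the stored values.
        node = self.head
        while node is not None:
            yield node.value
            node = node.next

    def cells(self):
        # "Writable" iteration: yields the nodes, so the caller can assign
        # to cell.value without any index lookup.
        node = self.head
        while node is not None:
            yield node
            node = node.next

    def __setitem__(self, index, value):
        # Indexed write: walks index nodes from the head (no bounds check),
        # so the enumerate idiom costs O(n**2) over the whole list.
        node = self.head
        for _ in range(index):
            node = node.next
        node.value = value


lst = LinkedList([1, 2, 3])

# One pass, O(n) in total: each write goes straight to the current node.
for cell in lst.cells():
    cell.value += 42

result = list(lst)
```

This works, but every container has to invent its own cells()-style method,
which is exactly the implementation dependence an iterator is meant to hide.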

I believe the lack of output iterators is a deficiency in the Python iterator
concept.




