[Python-Dev] The iterator story

David Abrahams <david.abrahams@rcn.com>
Fri, 19 Jul 2002 10:15:46 -0400


From: "Oren Tirosh" <oren-py-d@hishome.net>

> > The Destructive-For Issue:
> >
> >     In most languages I can think of, and in Python for the most
> >     part, a statement such as "for x in y: print x" is a
> >     non-destructive operation on y.  Repeating "for x in y: print x"
> >     will produce exactly the same results once more.
> >
> >     For pre-iterator versions of Python, this fails to be true only
> >     if y's __getitem__ method mutates y.  The introduction of
> >     iterators has caused this to now be untrue when y is any iterator.
>
> The most significant example of an object that mutates on __getitem__ in
> pre-iterator Python is the xreadlines object.  Its __getitem__ method
> increments an internal counter and raises an exception if accessed out of
> order.  This hack may be the 'original sin' - the first widely used
> destructive for.
>
> I just wish the time machine could have picked up your posting when the
> iteration protocols were designed. Good work.

Yeah, Ping's article sure went "thunk" when I read it.

At the risk of boring everyone, I think I should explain why I started the
multipass iterator thread. One of the most important jobs of Boost.Python
is the conversion between C++ and Python types (and if you don't give a fig
for C++, hang on, because I hope this will be relevant to pure Python
also). In order to support wrapping of overloaded C++ functions and member
functions, it's important to be able to do this in two steps:

1. Discover whether a Python object is convertible to a given C++ type
2. Perform the conversion

The overload resolution mechanism is currently pretty simple-minded: it
looks through the overloaded function objects until it can find one for
which all the arguments are convertible to the corresponding C++ type, then
it converts them and calls the wrapped C++ function.
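
The two-step scheme described above can be sketched in pure Python. This is
only an illustration under stated assumptions: Converter, resolve, to_int and
to_str are hypothetical names, not Boost.Python's actual API, and the code is
written for Python 3 rather than the Python of this thread.

```python
# Hypothetical sketch of two-step overload resolution.
# None of these names come from Boost.Python itself.

class Converter:
    """Pairs a convertibility check (step 1) with a conversion (step 2)."""
    def __init__(self, check, convert):
        self.check = check
        self.convert = convert

# Illustrative converters standing in for two C++ parameter types.
to_int = Converter(lambda o: isinstance(o, int), int)
to_str = Converter(lambda o: isinstance(o, str), str)

def resolve(overloads, args):
    """Call the first overload whose converters accept every argument."""
    for converters, func in overloads:
        if len(converters) != len(args):
            continue
        # Step 1: discover convertibility without converting anything yet.
        if all(c.check(a) for c, a in zip(converters, args)):
            # Step 2: convert the arguments, then call the wrapped function.
            return func(*(c.convert(a) for c, a in zip(converters, args)))
    raise TypeError("no matching overload")

overloads = [
    ((to_int, to_int), lambda a, b: ("int,int", a + b)),
    ((to_str, to_int), lambda s, n: ("str,int", s * n)),
]
```

Calling resolve(overloads, ("ab", 2)) skips the first overload at step 1 and
dispatches to the second. The scheme only works if the step 1 checks leave
every argument untouched for the overloads that are tried afterwards.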

My users really want to be able to define converters which, given any
Python iterable/sequence type, can extract a particular C++ container type.
In order to do that, we might commonly need to inspect each element of the
source object to see that it's convertible to the C++ container's value
type. It's pretty easy to see that if step 1 destroys the state of an
argument, it can foul the whole scheme: even if we store the result
somewhere so that step 2 can re-use it, overload resolution might fail for
arguments later in the function signature. Then the other overloads will be
looking at a different argument object.

What we were looking for was a way to quickly reject an overload if the
source object was not re-iterable, without modifying it.
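
One possible approximation of such a test in pure Python relies on the
convention that an iterator returns itself from iter(). This is a heuristic
sketch, not a guarantee: looks_reiterable is a hypothetical helper, it assumes
the object follows the usual iterator protocol, and a container whose __iter__
hands out a shared one-shot iterator would fool it. It is written for Python 3.

```python
def looks_reiterable(obj):
    """Heuristic check that does not consume any elements.

    Assumption: iterators follow the convention that iter(it) is it,
    while re-iterable containers return a fresh iterator each time.
    """
    try:
        it = iter(obj)
    except TypeError:
        return False          # not iterable at all
    return it is not obj      # an iterator is its own iterator
```

For example, looks_reiterable([1, 2]) is true, while a generator or the result
of iter([1, 2]) is rejected without any of its elements being read.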

It sure seems to me that we'd benefit from being able to do the same sort
of thing in pure Python. It's not clear to me that anyone else cares about
this, but I hope one day we'll get built-in overloading or multimethod
dispatch in Python beyond what's currently offered by the numeric
operators.

Incidentally, I'm not sure whether PEP 246 provides much help here. If the
adaptation protocol only gives us a way to say "is this, or can this be
adapted to be a re-iterable sequence", something could easily answer:

    [ x for x in y ]

Which would produce a re-iterable sequence, but might also destroy the
source. Of course, I'll say up front I've only skimmed the PEP and might've
missed something crucial.
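
The concern about the list comprehension can be made concrete with a short
Python 3 sketch. Here source is an assumed stand-in for y, chosen to be a
one-shot iterator. The copy is re-iterable, but building it drains the
original.

```python
source = iter([1, 2, 3])          # a one-shot iterator standing in for y

adapted = [x for x in source]     # the "adaptation": a re-iterable list

first_pass = list(adapted)
second_pass = list(adapted)       # the copy can be traversed again
left_in_source = list(source)     # but the original has been drained
```

After this runs, both passes over adapted see the same three elements, while
left_in_source is empty.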

-Dave