[Python-Dev] iterators

M.-A. Lemburg mal@lemburg.com
Tue, 22 Aug 2000 16:43:50 +0200


Guido van Rossum wrote:
> 
> > > [MAL]
> > > > How about a third variant:
> > > >
> > > > #3:
> > > > __iter = <object>.iterator()
> > > > while __iter:
> > > >    <variable> = __iter.next()
> > > >    <block>
> > > >
> > > > This adds a slot call, but removes the malloc overhead introduced
> > > > by returning a tuple for every iteration (which is likely to be
> > > > a performance problem).
> > >
> > > Are you sure the slot call doesn't cause some malloc overhead as well?
> >
> > Quite likely not, since the slot in question doesn't generate
> > Python objects (nb_nonzero).
> 
> Agreed only for built-in objects like lists.  For class instances this
> would be way more expensive, because of the two calls vs. one!

True.
 
> > > Anyway, the problem with this one is that it requires a dynamic
> > > iterator (one that generates values on the fly, e.g. something reading
> > > lines from a pipe) to hold on to the next value between "while __iter"
> > > and "__iter.next()".
> >
> > Hmm, that depends on how you look at it: I was thinking in terms
> > of reading from a file -- feof() is true as soon as the end of
> > file is reached. The same could be done for iterators.
> 
> But feof() needs to read an extra character into the buffer if the
> buffer is empty -- so it needs buffering!  That's what I'm trying to
> avoid.

Ok.
 
> > We might also consider a mixed approach:
> >
> > #5:
> > __iter = <object>.iterator()
> > while __iter:
> >    try:
> >        <variable> = __iter.next()
> >    except ExhaustedIterator:
> >        break
> >    <block>
> >
> > Some iterators may want to signal the end of iteration using
> > an exception, others via the truth test prior to calling .next(),
> > e.g. a list iterator can easily implement the truth test
> > variant, while an iterator with complex .next() processing
> > might want to use the exception variant.
> 
> Belt and suspenders.  What does this buy you over "while 1"?

It gives you two possible ways to signal "end of iteration".
But your argument about Python iterators (as opposed to
builtin ones) applies here as well, so I withdraw this one :-)
 
> > Another possibility would be using exception class objects
> > as singleton indicators of "end of iteration":
> >
> > #6:
> > __iter = <object>.iterator()
> > while 1:
> >    try:
> >        rc = __iter.next()
> >    except ExhaustedIterator:
> >        break
> >    else:
> >        if rc is ExhaustedIterator:
> >            break
> >    <variable> = rc
> >    <block>
> 
> Then I'd prefer to use a single protocol:
> 
>     #7:
>     __iter = <object>.iterator()
>     while 1:
>        rc = __iter.next()
>        if rc is ExhaustedIterator:
>            break
>        <variable> = rc
>        <block>
> 
> This means there's a special value that you can't store in lists
> though, and that would bite some introspection code (e.g. code listing
> all internal objects).

Which brings us back to the good old "end of iteration" == raise
an exception logic :-)
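
To make the objection to #7 concrete, here is a rough Python sketch.
The class SentinelIterator and the helper collect() are made up for
illustration only; ExhaustedIterator is just the placeholder name
used in the variants above, not an existing builtin.

```python
# Hypothetical marker object; the name is borrowed from variant #7.
class ExhaustedIterator(Exception):
    pass


class SentinelIterator:
    """Illustrative iterator: .next() returns the marker instead of raising."""

    def __init__(self, seq):
        self.seq = seq
        self.pos = 0

    def next(self):
        if self.pos >= len(self.seq):
            return ExhaustedIterator  # "end of iteration" as a return value
        item = self.seq[self.pos]
        self.pos += 1
        return item


def collect(seq):
    """Run the #7 loop and return the values it actually visited."""
    seen = []
    it = SentinelIterator(seq)
    while 1:
        rc = it.next()
        if rc is ExhaustedIterator:
            break
        seen.append(rc)
    return seen


# A list that legitimately stores the marker is cut short at that element.
objects = [1, ExhaustedIterator, 2]
visited = collect(objects)
```

With this sketch, visited ends up holding only the first element,
which is the kind of surprise introspection code would run into.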

Would this really hurt all that much in terms of performance?
I mean, today's for-loop code uses IndexError for much the same
thing...
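
For reference, a small sketch of that existing behaviour: the
for-loop calls __getitem__ with 0, 1, 2, ... and stops at the first
IndexError. The class Squares is an invented example, not part of
any proposal.

```python
class Squares:
    """Sequence-style object driven by the for-loop via __getitem__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            # Raising IndexError is what tells the for-loop to stop.
            raise IndexError(index)
        return index * index


result = []
for x in Squares(4):
    result.append(x)
```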
 
    #8:
    __iter = <object>.iterator()
    while 1:
       try:
           <variable> = __iter.next()
       except ExhaustedIterator:
           break
       <block>

Since this will be written in C, we don't even have the costs
of setting up an exception block.
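
A Python-level sketch of #8, only to show the control flow. The
names ListIterator and for_each are assumptions made up for this
example, and ExhaustedIterator is still just a placeholder name.

```python
class ExhaustedIterator(Exception):
    """Placeholder exception name from variant #8 (hypothetical)."""


class ListIterator:
    """Illustrative iterator over a sequence using the .next() protocol."""

    def __init__(self, seq):
        self._seq = seq
        self._pos = 0

    def next(self):
        if self._pos >= len(self._seq):
            raise ExhaustedIterator
        item = self._seq[self._pos]
        self._pos += 1
        return item


def for_each(iterator, block):
    """Emulate the #8 loop: call block(value) until the iterator is exhausted."""
    while 1:
        try:
            variable = iterator.next()
        except ExhaustedIterator:
            break
        block(variable)


out = []
for_each(ListIterator(["a", "b", "c"]), out.append)
```

Unlike the C version, this sketch does set up a try block on every
pass, so it says nothing about the real cost.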

I would still suggest that the iterator provides the current
position and iteration value as attributes. This avoids some
caching of those values and also helps when debugging code
using introspection tools.

The positional attribute will probably have to be optional
since not all iterators can supply this information, but
the .value attribute is certainly within range (it would
cache the value returned by the last .next() or .prev()
call).
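
Roughly what I have in mind, as a sketch under assumptions: the
attribute name .position, the class TrackingIterator and the exact
.prev() semantics are invented here for illustration.

```python
class ExhaustedIterator(Exception):
    """Placeholder exception name (hypothetical)."""


class TrackingIterator:
    """Illustrative iterator exposing .value and an optional .position."""

    def __init__(self, seq):
        self._seq = seq
        self.position = -1  # index of the last value returned; -1 before start
        self.value = None   # value returned by the last .next() or .prev()

    def next(self):
        if self.position + 1 >= len(self._seq):
            raise ExhaustedIterator
        self.position += 1
        self.value = self._seq[self.position]
        return self.value

    def prev(self):
        if self.position <= 0:
            raise ExhaustedIterator
        self.position -= 1
        self.value = self._seq[self.position]
        return self.value


it = TrackingIterator("abc")
it.next()
it.next()
state_after_next = (it.position, it.value)
it.prev()
state_after_prev = (it.position, it.value)
```

A debugger can then inspect it.value and it.position without
calling .next() and disturbing the iteration.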

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/