[Python-Dev] iterators

Guido van Rossum guido@beopen.com
Tue, 22 Aug 2000 08:03:28 -0500


> > [MAL]
> > > How about a third variant:
> > >
> > > #3:
> > > __iter = <object>.iterator()
> > > while __iter:
> > >    <variable> = __iter.next()
> > >    <block>
> > >
> > > This adds a slot call, but removes the malloc overhead introduced
> > > by returning a tuple for every iteration (which is likely to be
> > > a performance problem).
> > 
> > Are you sure the slot call doesn't cause some malloc overhead as well?
> 
> Quite likely not, since the slot in question doesn't generate
> Python objects (nb_nonzero).

Agreed only for built-in objects like lists.  For class instances this
would be way more expensive, because every iteration then makes two
Python-level method calls (the truth test and next()) instead of one!
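To make the call count concrete, here is a minimal sketch of variant #3
driven by a class instance.  CountingIterator and its calls counter are
hypothetical names used only for illustration, and the code is written
for current Python, where the truth-test hook is spelled __bool__
(it was __nonzero__ in the Python this thread discusses).

```python
class CountingIterator:
    """Hypothetical class-instance iterator for variant #3."""

    def __init__(self, items):
        self.items = list(items)
        self.pos = 0
        self.calls = 0  # Python-level method calls triggered by the loop

    def __bool__(self):  # assumption: stands in for __nonzero__
        self.calls += 1
        return self.pos < len(self.items)

    def next(self):
        self.calls += 1
        value = self.items[self.pos]
        self.pos += 1
        return value


it = CountingIterator("abc")
seen = []
while it:                   # first call of each iteration: the truth test
    seen.append(it.next())  # second call: fetching the value

print(seen, it.calls)
```

For three items the loop runs the truth test four times (the last one
returns false) and next() three times, so roughly two method calls per
item where an exception-based protocol would need one.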

> > Anyway, the problem with this one is that it requires a dynamic
> > iterator (one that generates values on the fly, e.g. something reading
> > lines from a pipe) to hold on to the next value between "while __iter"
> > and "__iter.next()".
> 
> Hmm, that depends on how you look at it: I was thinking in terms
> of reading from a file -- feof() is true as soon as the end of
> file is reached. The same could be done for iterators.

But for feof() to report end-of-file before the next read, it would
have to read an extra character into the buffer whenever the buffer is
empty -- so it needs buffering!  That's what I'm trying to avoid.
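A rough sketch of the buffering I mean, assuming a source that
produces values on the fly.  TruthTestIterator, _EMPTY and
lines_from_pipe are hypothetical names made up for this example; the
code targets current Python, with __bool__ standing in for the
truth-test slot.

```python
class TruthTestIterator:
    """Hypothetical adapter giving a dynamic source the 'while __iter' protocol."""

    _EMPTY = object()  # marks "nothing buffered"

    def __init__(self, source):
        self._source = iter(source)
        self._buffered = self._EMPTY

    def __bool__(self):
        # The only way to know whether another value exists is to fetch it,
        # so the fetched value has to be held until next() is called.
        if self._buffered is self._EMPTY:
            try:
                self._buffered = next(self._source)
            except StopIteration:
                return False
        return True

    def next(self):
        if not self:  # fills the buffer if it is empty
            raise IndexError("iterator exhausted")
        value, self._buffered = self._buffered, self._EMPTY
        return value


def lines_from_pipe():
    # Stand-in for something that reads lines from a pipe.
    yield "first"
    yield "second"


it = TruthTestIterator(lines_from_pipe())
collected = []
while it:
    collected.append(it.next())
```

The truth test has already consumed a value from the source by the
time next() is called, which is exactly the state I would rather not
force every dynamic iterator to carry.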

> We might also consider a mixed approach:
> 
> #5:
> __iter = <object>.iterator()
> while __iter:
>    try:
>        <variable> = __iter.next()
>    except ExhaustedIterator:
>        break
>    <block>
> 
> Some iterators may want to signal the end of iteration using
> an exception, others via the truth test prior to calling .next(),
> e.g. a list iterator can easily implement the truth test
> variant, while an iterator with complex .next() processing
> might want to use the exception variant.

Belt and suspenders.  What does this buy you over "while 1"?

> Another possibility would be using exception class objects
> as singleton indicators of "end of iteration":
> 
> #6:
> __iter = <object>.iterator()
> while 1:
>    try:
>        rc = __iter.next()
>    except ExhaustedIterator:
>        break
>    else:
>        if rc is ExhaustedIterator:
>            break
>    <variable> = rc
>    <block>

Then I'd prefer to use a single protocol:

    #7:
    __iter = <object>.iterator()
    while 1:
       rc = __iter.next()
       if rc is ExhaustedIterator:
           break
       <variable> = rc
       <block>

This means there's a special value that you can't store in lists
though, and that would bite some introspection code (e.g. code listing
all internal objects).
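
Here is a small sketch of #7 and of the drawback just mentioned.
ListIterator and iterate are hypothetical helpers written for this
example, and ExhaustedIterator is defined here as a plain class only
so the sketch is self-contained.

```python
class ExhaustedIterator:
    """Stand-in sentinel: the class object itself signals end of iteration."""


class ListIterator:
    """Hypothetical iterator over a list that follows protocol #7."""

    def __init__(self, items):
        self._items = items
        self._pos = 0

    def next(self):
        if self._pos >= len(self._items):
            return ExhaustedIterator
        value = self._items[self._pos]
        self._pos += 1
        return value


def iterate(obj_iter):
    # The loop from #7, collecting each <variable> instead of running <block>.
    result = []
    while 1:
        rc = obj_iter.next()
        if rc is ExhaustedIterator:
            break
        result.append(rc)
    return result


normal = iterate(ListIterator([1, 2, 3]))
truncated = iterate(ListIterator([1, ExhaustedIterator, 3]))
print(normal, truncated)
```

The second loop stops after the first element, because the sentinel
stored in the list is indistinguishable from the end-of-iteration
signal.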

--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)