An Iterator Idiom -- another proposal

Michael Haggerty mhagger at blizzard.harvard.edu
Tue May 4 17:05:32 EDT 1999


Moshe Zadka <moshez at math.huji.ac.il> writes:

> --------------- cut here -------------
> class iterator:
>         def __init__(self, f, *args):
>                 self.f=f
>                 self.args=args
>                 self.i=0
>         def __getitem__(self, i):
>                 if self.i<>i:
>                         raise ValueError, 'items not accessed consecutively'
>                 val=apply(self.f, self.args)
>                 if not val:
>                         raise IndexError, 'no more items'
>                 self.i=i+1
>                 return val
> ------------- cut here --------------------

We've seen such things posted before, but I suspect the reason they
are not used more is the overhead needed inside __getitem__ to
verify that the requested index is in sequence.  The problem here is
that a random-access array has to be simulated when in fact only a
sequential-access data structure is provided.  Because such iterators
are accessed using __getitem__ but don't properly implement
__getitem__, they have the flavor of a hack.

How about giving us the possibility of emulating a sequential-access
data structure with a new magic method called, say, __getnext__, which
would be used as follows:

class iterator:
        def __init__(self, f, *args):
                self.f=f
                self.args=args
        def __getnext__(self):
                val = apply(self.f, self.args)
                if not val:
                        raise IndexError, 'no more items'
                return val

Then `for' loops would first try to use __getnext__ for the iteration
and would fall back to __getitem__ only if __getnext__ were not
available.  This paradigm would be useful, I think, in many other
situations as well.
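
To make the lookup order concrete, here is a rough sketch of what the
loop would do, written as an ordinary function in present-day Python 3
syntax.  Nothing here is real interpreter behavior: __getnext__ is
only the method proposed above, and Countdown and for_each are
hypothetical names made up for the illustration.

```python
# Sketch only: __getnext__ is the hypothetical method proposed in this
# post, not a protocol the interpreter knows about.

class Countdown:
    """Sequential-access object: it only knows how to produce its next item."""
    def __init__(self, start):
        self.n = start

    def __getnext__(self):
        if self.n <= 0:
            raise IndexError('no more items')
        self.n -= 1
        return self.n + 1


def for_each(obj, body):
    """Emulate the proposed `for' loop: prefer __getnext__, else __getitem__."""
    getnext = getattr(obj, '__getnext__', None)  # looked up once per loop
    if getnext is not None:
        # Sequential protocol: no index is passed or checked.
        while True:
            try:
                item = getnext()
            except IndexError:
                break
            body(item)
    else:
        # Fallback: the existing protocol, indexing from 0 until IndexError.
        i = 0
        while True:
            try:
                item = obj[i]
            except IndexError:
                break
            body(item)
            i += 1


seen = []
for_each(Countdown(3), seen.append)   # takes the __getnext__ path
for_each([10, 20], seen.append)       # a plain list takes the fallback path
```

An object that defines only __getitem__ is handled exactly as before,
which is why old code should keep working.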

It might be that __getnext__ should raise a new exception type
distinct from IndexError.
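
One reason a distinct exception might be worth having: if the end of
the sequence is signalled with IndexError, a genuine IndexError
raised by a bug inside f is indistinguishable from "no more items" and
would quietly end the loop.  A minimal sketch in present-day Python 3
syntax, assuming a made-up exception name NoMoreItems (drain and buggy
are likewise hypothetical helpers for the illustration):

```python
import io


class NoMoreItems(Exception):
    """Hypothetical end-of-sequence signal, distinct from IndexError."""


class iterator:
    def __init__(self, f, *args):
        self.f = f
        self.args = args

    def __getnext__(self):
        val = self.f(*self.args)
        if not val:
            raise NoMoreItems('no more items')
        return val


def drain(it):
    """Collect items until the iterator says it is finished."""
    items = []
    while True:
        try:
            items.append(it.__getnext__())
        except NoMoreItems:
            return items


def buggy():
    return [][0]   # a genuine IndexError caused by a bug


# readline returns '' at end of file, which ends the iteration.
lines = drain(iterator(io.StringIO('a\nb\n').readline))

# The bug is not mistaken for the end of the sequence; it propagates.
try:
    drain(iterator(buggy))
    bug_propagated = False
except IndexError:
    bug_propagated = True
```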

Advantages:
  Sequentially accessible or iterable objects can be written more
  naturally, with no unused index or superfluous error checking.

  Doesn't break any old code.

Disadvantages:
  Another special function to remember.

  `for' would need to look up __getnext__ and if that is not found,
  __getitem__.  Since the looked-up value can be cached, this amounts
  to at most one extra lookup of overhead each time a for loop is
  entered.

Yours,
Michael

-- 
Michael Haggerty
mhagger at blizzard.harvard.edu



