Iterator length

Steven D'Aprano steve at REMOVE.THIS.cybersource.com.au
Fri Jan 19 06:36:29 EST 2007


On Thu, 18 Jan 2007 16:55:39 -0800, bearophileHUGS wrote:

> What's your point? Maybe you mean that it consumes the given iterator?
> I am aware of that, it's written in the function docstring too. But
> sometimes you don't need the elements of a given iterator, you just
> need to know how many elements it has. A very simple example:
> 
> s = "aaabbbbbaabbbbbb"
> from itertools import groupby
> print [(h,leniter(g)) for h,g in groupby(s)]

s isn't an iterator. It's a sequence, a string, and an iterable, but not
an iterator.

I hope you know what sequences and strings are :-)

An iterable is anything that can be iterated over -- it includes sequences
and iterators.

An iterator, on the other hand, is something that implements the iterator
protocol: it has a next() method, which raises StopIteration when there
are no more items.

>>> s = "aaabbbbbaabbbbbb"
>>> s.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'str' object has no attribute 'next'

An iterator should return itself if you pass it to iter():

>>> iter(s) is s
False
>>> it = iter(s); iter(it) is it
True
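
The protocol described above can be sketched as a small class. This is
only an illustration, not code from the thread: the Countdown name is made
up, and it is written for Python 3, where the method is spelled __next__()
and is normally called through the built-in next() rather than as the
next() method shown in the sessions above.

```python
class Countdown(object):
    """Hypothetical iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself when passed to iter().
        return self

    def __next__(self):
        # Signal exhaustion by raising StopIteration.
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


it = Countdown(3)
same_object = iter(it) is it   # iter() hands back the iterator itself
items = list(it)               # collects every remaining item
leftover = list(it)            # a second pass finds nothing left
```

Once the first list() call has run, the iterator is exhausted, which is
why the second one comes back empty.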

You've said that you understand that taking the len of an iterator will
consume it, and that you don't think that matters. It might not matter in
a tiny percentage of cases, but it will certainly matter the rest of the
time!
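
To make the consumption concrete, here is a rough sketch. The thread never
shows the body of leniter, so the one-liner below is an assumed
implementation that simply counts items; the point is only what happens to
the iterator afterwards.

```python
def leniter(iterator):
    # Assumed implementation: count the items by exhausting the iterator.
    return sum(1 for _ in iterator)


it = iter("aaabbbbbaabbbbbb")
count = leniter(it)      # walks the whole iterator to count it
remaining = list(it)     # whatever is still available afterwards
```

After the call, remaining is an empty list, so any code that still needed
the items has to get them some other way.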

And let's not forget, in general you CAN'T calculate the length of an
iterator, not even in theory:

import random

def randnums():
    # Keeps going until random.random() returns exactly this value.
    while random.random() != 0.123456789:
        yield "Not finished yet"
    yield "Finished"

What should len(randnums()) return?

One last thing which people forget... iterators can have a length, the
same as any other object, if they have a __len__ method:

>>> s = "aaabbbbbaabbbbbb"
>>> it = iter(s)
>>> len(it)
16

So, if you want the length of an arbitrary iterator, just call len()
and deal with the TypeError it raises when the iterator has no __len__
method.
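
A minimal sketch of that approach, assuming the exception of interest is
the TypeError that len() raises for objects without a __len__ method. The
helper name length_or_none and the SizedIter class are invented for
illustration, and the code is written for Python 3 (hence __next__).

```python
def length_or_none(obj):
    """Return len(obj), or None if obj does not support len()."""
    try:
        return len(obj)
    except TypeError:
        return None


class SizedIter(object):
    """Hypothetical iterator that also knows how many items remain."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item

    def __len__(self):
        # Number of items not yet handed out.
        return len(self._items) - self._pos


def gen():
    yield "a"
    yield "b"


with_len = length_or_none(SizedIter("abc"))   # defines __len__
without_len = length_or_none(gen())           # generators define no __len__
```

Neither call consumes anything: len() either asks the object for its
length or fails straight away, so the caller can then decide whether
counting by exhaustion is acceptable.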



-- 
Steven.



