when an iterable object is exhausted or not

Sat Aug 4 17:04:51 EDT 2012

On 8/4/2012 4:24 PM, Tim Chase wrote:
> On 08/04/12 14:20, Franck Ditter wrote:
>> Two similar iterable objects but with a different behavior :
>>
>> $$$ i = range(2,5)
>> $$$ for x in i : print(x,end=' ')
>>
>> 2 3 4
>> $$$ for x in i : print(x,end=' ')        # i is not exhausted
>>
>> 2 3 4
>>
>> --------- Compare with :
>>
>> $$$ i = filter(lambda c : c.isdigit(), 'a1b2c3')
>> $$$ for x in i : print(x,end=' ')
>>
>> 1 2 3
>> $$$ for x in i : print(x,end=' ')        # i is exhausted
>>
>> $$$
>>
>> IMHO, this should not happen in Py3k.
>> What is the rationale of this (bad ?) design, which forces the programmer
>> to memorize which one is exhaustable and which one is not ?...
>
> I can't speak to the rationale, but it seems that a range() object
> has some extra features that a normal iter doesn't:
>
>    >>> i = iter(range(2,5))
>    >>> for x in i: print (x, end=' ')
>    ...
>    2 3 4 >>> for x in i: print (x, end=' ')
>    ...
>
> (your 2nd behavior, and what I'd expect).
>
> So my guess would be that the "for {var} in {thing}" triggers a
> re-calling of range.__iter__ since it's not an iterator to begin with.

range produces a re-iterable range object because it can. The result is 
self-contained with 3 data attributes (start, stop, and step), so it can 
create rangeiterators on demand.
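
A minimal sketch of that difference, assuming Python 3.3 or later (where 
the start, stop, and step attributes are readable on range objects). The 
variable names are only for illustration:

```python
r = range(2, 5)

# The three data attributes that fully describe the range.
attrs = (r.start, r.stop, r.step)

# Each call to iter() builds a fresh, independent range iterator,
# which is what a for loop does behind the scenes.
it1 = iter(r)
it2 = iter(r)
first_pass = list(it1)
second_pass = list(it2)

# A range is an iterable but not an iterator: iter(r) is a new object.
range_is_own_iterator = iter(r) is r

# A filter object is its own iterator, so a second loop finds it exhausted.
f = filter(str.isdigit, 'a1b2c3')
filter_is_own_iterator = iter(f) is f
filter_first = list(f)
filter_second = list(f)
```

Here both passes over the range give [2, 3, 4], while the second pass over 
the filter object gives an empty list.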

filter, on the other hand, depends on an external iterable, and it cannot 
depend on that external object being re-iterable. So even if we 
programmed filter() to produce a filter object that produced 
filteriterators, often only the first of them would work. Also, 
if you want the filtered collection more than once, you should just save 
it. On the other hand, reproducing counts with a rangeiterator is nearly 
as fast as looking them up in a saved list, and much more memory efficient.
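
A rough sketch of both points, using a generator as a stand-in for an 
external iterable that cannot be re-iterated. The names and the size n are 
arbitrary choices for illustration, and sys.getsizeof reports only the 
container's own size, so treat the comparison as indicative:

```python
import sys

# A generator is a one-shot source: filter cannot assume it is re-iterable.
source = (c for c in 'a1b2c3')

# Save the filtered result once; the list can be traversed repeatedly.
digits = list(filter(str.isdigit, source))
first_pass = ''.join(digits)
second_pass = ''.join(digits)

# The source is now exhausted, so filtering it again yields nothing.
refiltered = list(filter(str.isdigit, source))

# Memory: a range stores only start/stop/step; a list stores every element.
n = 10**6
range_size = sys.getsizeof(range(n))
list_size = sys.getsizeof(list(range(n)))
```

The saved list can be looped over as often as needed, and range_size 
stays the same however large n is, while list_size grows with n.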

-- 
Terry Jan Reedy