Distinguishing active generators from exhausted ones

Terry Reedy tjreedy at udel.edu
Mon Jul 27 16:47:34 EDT 2009


Steven D'Aprano wrote:
> On Mon, 27 Jul 2009 02:02:19 -0400, Terry Reedy wrote:
> 
>> Steven D'Aprano wrote:
>>> On Sun, 26 Jul 2009 20:10:00 -0400, Terry Reedy wrote:
>>>
>>>> Michal Kwiatkowski wrote:
>>>>
>>>>> The thing is I don't need the next item. I need to know if the
>>>>> generator has stopped without invoking it.
>>>> Write a one-ahead iterator class, which I have posted before, that
>>>> sets .exhausted to True when next fails.
>>>
>>> And hope that the generator doesn't have side-effects...
>> If run to exhaustion, the same number of side-effects happen. The only
>> difference is that they each happen one step sooner. For
>> reading a file that is irrelevant. Much else, and the iterator is not
>> just an iterator.
> 
> I believe the OP specifically said he needs to detect whether or not an 
> iterator is exhausted, without running it to exhaustion, so you shouldn't 
> assume that the generator has been exhausted.

I believe the OP said he needs to determine whether or not an iterator 
(specifically generator) is exhausted without consuming an item when it 
is not. That is slightly different. The wrapper I suggested makes that 
easy. I am obviously not assuming exhaustion when there is a .exhausted 
True/False flag to check.
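
A minimal sketch of such a one-ahead wrapper, written for Python 3. The
class name OneAhead and its private attributes are illustrative
assumptions, not the code posted earlier.

```python
class OneAhead:
    """Iterator wrapper that always holds one prefetched item.

    Hypothetical sketch: .exhausted becomes True as soon as the
    underlying iterator raises StopIteration on the look-ahead fetch.
    """

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.exhausted = False
        self._advance()  # prefetch the first item

    def _advance(self):
        # Pull one item ahead; note exhaustion instead of propagating it.
        try:
            self._pending = next(self._it)
        except StopIteration:
            self._pending = None
            self.exhausted = True

    def __iter__(self):
        return self

    def __next__(self):
        if self.exhausted:
            raise StopIteration
        item = self._pending
        self._advance()  # any side effect of the next step happens here
        return item
```

With this sketch, a loop such as "while not wrapped.exhausted:
item = next(wrapped)" never sees StopIteration, and the flag can be
checked at any time without consuming an item.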

There are two possible definitions of 'exhausted': 1) will raise 
StopIteration on the next next() call; 2) has raised StopIteration at 
least once. The wrapper converts 2) to 1), which is to say, it obeys 
definition 1 once the underlying iteration has obeyed definition 2.

Since it is trivial to set 'exhausted=True' in the generator user code 
once StopIteration has been raised (meaning 2), I presume the OP wants 
the predictive meaning 1).
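
For comparison, the bookkeeping for meaning 2 in user code might look
like the following sketch (the generator and variable names are made up
for illustration). The flag only changes after the failed next() call,
so it is still False right after the last real item has been obtained.

```python
def numbers():
    yield 1
    yield 2

gen = numbers()
exhausted = False
flag_after_each_item = []

while not exhausted:
    try:
        item = next(gen)
    except StopIteration:
        exhausted = True  # meaning 2: StopIteration has now been raised
    else:
        # Record what the flag says at the moment each item is in hand.
        flag_after_each_item.append((item, exhausted))
```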

Without an iterator class wrapper, I see no way to predict what a 
generator will do (especially, raise StopIteration the first time) 
without analyzing its code and local variable state.
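
As an illustration of that limit, the sketch below uses
inspect.getgeneratorstate, which exists only in Python 3.2 and later
(newer than this thread). It reports meaning 2 and nothing more: after
the last item has been yielded the generator still looks merely
suspended. The generator name is made up.

```python
import inspect

def two_items():
    yield 1
    yield 2

gen = two_items()
items = [next(gen), next(gen)]  # both items have been consumed

# Nothing is left to yield, yet the generator is only "suspended".
state_before = inspect.getgeneratorstate(gen)

try:
    next(gen)
except StopIteration:
    pass

# Only after StopIteration has been raised does the state change.
state_after = inspect.getgeneratorstate(gen)
```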

I said in response to YOU that once exhaustion has occurred, then the 
same number of side effects would have occurred.

> When it comes to side-effects, timing matters.

Sometimes. And I admitted that possibility (slightly garbled).

> For example, a generator
> which cleans up after it has run (deleting temporary files, closing 
> sockets, etc.) will leave the environment in a different state if run to 
> exhaustion than just before exhaustion. Even if you store the final 
> result in a one-ahead class, you haven't saved the environment, and that 
> may be significant.

Of course, an eager-beaver generator written to be a good citizen might 
well close resources as soon as it knows *they* are exhausted, long 
before *it* yields the last items from the in-memory last block read. 
For all I know, file.readlines could do such.
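
A sketch of what such an eager generator could look like, assuming a
file-like object with readline() and close(). The function name and the
block_size parameter are hypothetical, and nothing here is a claim about
how file.readlines actually behaves.

```python
import io

def eager_lines(f, block_size=2):
    """Yield lines from f, closing f as soon as f itself runs dry."""
    while not f.closed:
        block = []
        for _ in range(block_size):
            line = f.readline()
            if not line:   # the resource is exhausted
                f.close()  # clean up right away
                break
            block.append(line)
        # The lines of the final block are yielded after the close.
        for line in block:
            yield line

source = io.StringIO("a\nb\nc\n")
gen = eager_lines(source)
first_two = [next(gen), next(gen)]
closed_before_last = source.closed  # still open: "c" has not been read
last = next(gen)
closed_after_last = source.closed   # closed before "c" was handed out
```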

Assuming that is not the case, the cleanup will not happen until 
what turns out to be the final item is requested from the wrapper. Once 
cleanup has happened, .exhausted will be set to True. If proper 
processing of even the last item requires that cleanup not have 
happened, then that and prediction of exhaustion are incompatible. One 
who wants both should write an iterator class instead of a generator function.
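
A sketch of that alternative, under the assumption that the iterator can
tell from its own state when no items remain. The class name Countdown
and the log list standing in for real cleanup are hypothetical.

```python
class Countdown:
    """Iterator class that predicts exhaustion and defers its cleanup."""

    def __init__(self, n, log):
        self._remaining = n
        self._log = log        # stands in for real cleanup work
        self._cleaned = False

    @property
    def exhausted(self):
        # Predictive meaning 1: the next next() call will raise.
        return self._remaining == 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._remaining == 0:
            # Cleanup runs only when the caller actually asks past the end.
            if not self._cleaned:
                self._log.append("cleanup")
                self._cleaned = True
            raise StopIteration
        value = self._remaining
        self._remaining -= 1
        return value
```

Here .exhausted turns True as soon as the last item has been handed out,
while the cleanup waits until next() is called once more, so the last
item can be processed with the environment still intact.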

> (Of course, it's possible that it isn't significant. Not all differences 
> make a difference.)
> 
> The best advice is, try to avoid side-effects, especially in generators.

Agreed.

Terry Jan Reedy





