[Python-ideas] Deterministic iterator cleanup

Terry Reedy tjreedy at udel.edu
Wed Oct 19 22:07:18 EDT 2016


On 10/19/2016 12:38 AM, Nathaniel Smith wrote:

> I'd like to propose that Python's iterator protocol be enhanced to add
> a first-class notion of completion / cleanup.

With respect to the standard iterator protocol, a very solid -1 from 
me.  (I leave commenting specifically on __aiterclose__ to Yury.)

1. I consider the introduction of iterables and the new iterator 
protocol in 2.2 and their gradual replacement of lists in many 
situations to be the greatest enhancement to Python since 1.3 (my first 
version).  They are, to me, one of Python's greatest features, and 
the minimal nature of the protocol is an essential part of what makes 
them great.

2. I think you greatly underestimate the negative impact, just as we did 
with changing 'str is bytes' to 'str is unicode'.  The change itself, 
embodied in for loops, will break most non-trivial programs.  You 
yourself note that there will have to be pervasive changes in the stdlib 
just to begin fixing the breakage.

3. Though perhaps common for what you do, the need for the change is 
extremely rare in the overall Python world.  Iterators depending on an 
external resource are rare (< 1%, I would think).  Incomplete iteration 
is also rare (also < 1%, I think).  And resources do not always need to 
be released immediately.

4. Previous proposals to officially augment the iterator protocol, even 
with optional methods, have been rejected, and I think this one should 
be too.

a. Add .__len__ as an option.  We added __length_hint__, which an 
iterator may implement, but which is not part of the iterator protocol. 
It is also ignored by bool().

b., c. Add __bool__ and/or peek().  I posted a LookAhead wrapper class 
that implements both for most any iterable.  I suspect that it is 
rarely used.
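
To make points a. through c. concrete, here is a minimal sketch.  It is 
not the LookAhead class that was posted; Countdown, LookAhead, _EMPTY and 
the peek(default) signature are hypothetical names chosen for 
illustration.  It shows that operator.length_hint() consults 
__length_hint__ while bool() does not, and that a wrapper buffering one 
item can supply both __bool__ and peek() without touching the protocol.

```python
import operator


class Countdown:
    """Hypothetical iterator that reports how many items remain."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n

    def __length_hint__(self):
        return self.n


class LookAhead:
    """Hypothetical wrapper: buffers one item to support bool() and peek()."""

    _EMPTY = object()  # sentinel meaning "nothing buffered, source exhausted"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = next(self._it, self._EMPTY)

    def __iter__(self):
        return self

    def __next__(self):
        if self._buffer is self._EMPTY:
            raise StopIteration
        value = self._buffer
        self._buffer = next(self._it, self._EMPTY)
        return value

    def __bool__(self):
        return self._buffer is not self._EMPTY

    def peek(self, default=None):
        return default if self._buffer is self._EMPTY else self._buffer


plain = Countdown(2)
hint = operator.length_hint(plain)   # consults __length_hint__
drained = list(plain)
still_truthy = bool(plain)           # bool() ignores the hint

wrapped = LookAhead(Countdown(2))
first = wrapped.peek()               # look at the next item without consuming it
rest = list(wrapped)
empty_is_falsy = not wrapped
```

The cost of this design is that the wrapper consumes one item from the 
underlying iterator as soon as it is constructed, which may matter when 
producing an item has side effects.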


>   def read_newline_separated_json(path):
>       with open(path) as file_handle:      # <-- with block
>           for line in file_handle:
>               yield json.loads(line)

One problem with passing paths around is that it makes the receiving 
function hard to test.  I think functions should at least optionally 
take an iterable of lines, and make the open part optional.  But then 
closing should also be conditional.
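
A rough sketch of what that could look like, assuming the function 
tells a path from an iterable of lines with a plain isinstance(source, 
str) check (that check and the 'source' and 'opened_here' names are my 
assumptions, not part of the quoted code):

```python
import json


def read_newline_separated_json(source):
    """Yield one decoded JSON document per line.

    `source` is either a path (str) or any iterable of lines.  Only a
    file opened here is closed here; a caller-supplied iterable is left
    for the caller to manage.
    """
    if isinstance(source, str):
        lines = open(source)
        opened_here = True
    else:
        lines = source
        opened_here = False
    try:
        for line in lines:
            yield json.loads(line)
    finally:
        if opened_here:
            lines.close()


# In a test, no file is needed: pass a list of lines directly.
docs = list(read_newline_separated_json(['{"a": 1}\n', '[2, 3]\n']))
```

This addresses testability only.  When a path is passed and iteration is 
abandoned early, the 'finally' clause still runs only once the generator 
is closed or collected.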

If the combination of 'with', 'for', and 'yield' does not work, 
then do something else, rather than changing the meaning of 'for'. 
Moving responsibility for closing the file from 'with' to 'for' makes 
'with' pretty useless, while overloading 'for' with something that is 
rarely needed.  This does not strike me as the right solution to the 
problem.

>   for document in read_newline_separated_json(path):  # <-- outer for loop
>       ...

If the outer loop determines when the file should be closed, then why 
not open it there?  What fails with

lines = open(path)  # open before 'try' so 'lines' is always bound in 'finally'
try:
     gen = read_newline_separated_json(lines)
     for doc in gen: do_something(doc)
finally:
     lines.close()
     # and/or gen.throw(...) to stop the generator.
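
For comparison, a self-contained variant of the same idea that can be run 
as is.  It assumes an in-memory io.StringIO in place of a real file and 
uses gen.close() rather than gen.throw(...) to stop the generator; 
parse_json_lines and the 'seen' list are hypothetical names for 
illustration.

```python
import io
import json


def parse_json_lines(lines):
    """Generator that takes an iterable of lines, not a path."""
    for line in lines:
        yield json.loads(line)


# Stand-in for open(path); the caller owns the resource.
lines = io.StringIO('{"n": 1}\n{"n": 2}\n{"n": 3}\n')
seen = []

with lines:                      # the outer scope decides when to close
    gen = parse_json_lines(lines)
    try:
        for doc in gen:
            seen.append(doc["n"])
            if doc["n"] == 2:    # incomplete iteration
                break
    finally:
        gen.close()              # stop the suspended generator explicitly
```

Here the 'with' block closes the resource at a known point even though 
the loop exits early, and the 'for' statement keeps its current meaning.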

-- 
Terry Jan Reedy


