[Python-ideas] Deterministic iterator cleanup

Steven D'Aprano steve at pearwood.info
Fri Oct 21 06:29:01 EDT 2016


On Wed, Oct 19, 2016 at 05:52:34PM -0400, Yury Selivanov wrote:

> IOW I'm not convinced that if we implement your proposal we'll fix 90% 
> (or even 30%) of cases where non-deterministic and postponed cleanup is 
> harmful.

Just because something doesn't solve ALL problems doesn't mean it isn't 
worth doing. Reference counting doesn't solve the problem of cycles, but 
Python worked really well for many years even though cycles weren't 
automatically broken. Then a second GC was added, but it didn't solve 
the problem of cycles with __del__ finalizers. And recently (a year or 
two ago) there was an improvement that made the GC better able to deal 
with such cases -- but I expect that there are still edge cases where 
objects aren't collected.

Had people said "garbage collection doesn't solve all the edge cases, 
therefore it's not worth doing", where would we be?

I don't know how big a problem the current lack of deterministic GC 
of resources opened in generators actually is. I guess that users of 
CPython will have *no idea*, because most of the time the ref counter 
will clean up quite early. But not all Pythons are CPython, and despite 
my earlier post, I've now changed my mind and support this proposal.

One reason for this is that I thought hard about my own code where I use 
the double-for-loop idiom:

for x in iterator:
    if cond: break
    ...

# later
for y in iterator: # same iterator
    ...


and I realised:

(1) I don't do this *that* often;
(2) when I do, it really wouldn't be that big a problem for me to 
    guard against auto-closing (a sketch of one possible protect() 
    wrapper follows point 3):

for x in protect(iterator):
    if cond: break
    ...

(3) if I need to write hybrid code that runs over multiple versions, 
that's easy too:

try:
    from itertools import protect
except ImportError:
    def protect(it):
        return it
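
For what it's worth, here is a rough sketch of what such a protect() 
wrapper could look like if the proposal were accepted. Neither 
itertools.protect nor __iterclose__ exists today; the names are simply 
taken from this discussion:

class protect:
    """Wrap an iterator so a for-loop's implicit close becomes a no-op."""

    def __init__(self, iterator):
        self._it = iter(iterator)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __iterclose__(self):
        # Deliberately don't close the underlying iterator, so a later
        # for-loop can pick up where the first one stopped.
        pass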



> Yes, mainly iterator wrappers.  You'll also will need to educate users 
> to refactor (more on that below) their __del__ methods to 
> __(a)iterclose__ in 3.6.

Couldn't __(a)iterclose__ automatically call __del__ if it exists? Seems 
like a reasonable thing to inherit from object.


> A lot of code that you find on stackoverflow etc will be broken.

"A lot"? Or a little? Are you guessing, or did you actually count it?

If we are worried about code like this:


it = iter([1, 2, 3])
a = list(it)
# currently b will be [], with this proposal it will raise RuntimeError
b = list(it)


we can soften the proposal's recommendation that iterators raise 
RuntimeError when next() is called after they have been closed. I've 
suggested that "whatever exception makes sense" should be the rule. 
Iterators with no resources to close can simply raise StopIteration 
instead, which preserves the current behaviour.

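Concretely, an iterator that holds no external resources might 
implement that softened rule along these lines (only a sketch: 
__iterclose__ is the proposal's hypothetical hook, and the class is 
made up for illustration):

class count_up_to:
    """A toy iterator with nothing to clean up."""

    def __init__(self, n):
        self._i = 0
        self._n = n
        self._closed = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._closed or self._i >= self._n:
            # Nothing to release, so a closed iterator just looks
            # exhausted -- preserving today's behaviour.
            raise StopIteration
        self._i += 1
        return self._i

    def __iterclose__(self):
        self._closed = True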

> Porting 
> code from Python2/<3.6 will be challenging.  People are still struggling 
> to understand 'dict.keys()'-like views in Python 3.

I spend a lot of time on the tutor and python-list mailing lists, and a 
little bit of time on Reddit /python, and I don't think I've ever seen 
anyone struggle with those. I'm sure it happens, but I don't think it 
happens often. After all, for the most common use-case, there's no real 
difference between Python 2 and 3:

    for key, value in mydict.items():
        ...


[...]
> With you proposal, to achieve the same (and make the code compatible 
> with new for-loop semantics), users will have to implement both 
> __iterclose__ and __del__.

As I asked above, couldn't we just inherit a default __(a)iterclose__ 
from object that looks like this?

    def __iterclose__(self):
        finalizer = getattr(type(self), '__del__', None)
        if finalizer:
            finalizer(self)
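
Presumably the async flavour would be the same thing spelled with the 
other dunder (again only a sketch, assuming the name __aiterclose__ 
from the proposal):

    async def __aiterclose__(self):
        finalizer = getattr(type(self), '__del__', None)
        if finalizer:
            finalizer(self)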


I know it looks a bit funny for non-iterables to have an iterclose 
method, but it will never actually be called on them.


[...]
> The __(a)iterclose__ semantics is clear.  What's not clear is how much 
> harm changing the semantics of for-loops will do (and how to quantify 
> the amount of good :))


The "easy" way to find out (easy for those who aren't volunteering to do 
the work) is to fork Python, make the change, and see what breaks. I 
suspect not much, and most of the breakage will be easy to fix.

As for the amount of good, this proposal originally came from PyPy. I 
expect that CPython users won't appreciate it as much as PyPy users 
will, or Jython/IronPython users when those implementations eventually 
support Python 3.x.



-- 
Steve

