[Python-Dev] Fun with ExitStack

Martin Panter vadmium+py at gmail.com
Mon Jul 18 21:57:50 EDT 2016


On 18 July 2016 at 23:40, Barry Warsaw <barry at python.org> wrote:
> I was trying to debug a problem in some work code and I ran into some
> interesting oddities with contextlib.ExitStack and other context managers in
> Python 3.5.
>
> This program creates a temporary directory, and I wanted to give it a --keep
> flag to not automatically delete the tempdir at program exit.  I was using an
> ExitStack to manage a bunch of resources, and the temporary directory is the
> first thing pushed into the ExitStack.  At that point in the program, I check
> the value of --keep and if it's set, I use ExitStack.pop_all() to clear the
> stack, and thus, presumably, prevent the temporary directory from being
> deleted.
>
> There's this relevant quote in the contextlib documentation:
>
> """
> Each instance [of an ExitStack] maintains a stack of registered callbacks that
> are called in reverse order when the instance is closed (either explicitly or
> implicitly at the end of a with statement). Note that callbacks are not
> invoked implicitly when the context stack instance is garbage collected.
> """
>
> However if I didn't save the reference to the pop_all'd ExitStack, the tempdir
> would be immediately deleted.  If I did save a reference to the pop_all'd
> ExitStack, the tempdir would live until the saved reference went out of scope
> and got refcounted away.
>
> As best I can tell this happens because TemporaryDirectory.__init__() creates
> a weakref finalizer which ends up calling the _cleanup() function.  Although
> it's rather difficult to trace, it does appear that when the ExitStack is
> gc'd, this finalizer gets triggered (via weakref.detach()), thus causing the
> _cleanup() method to be called and the tmpdir to get deleted.

This seems to be missing from the documentation, but when you delete a
TemporaryDirectory instance without using a context manager or
cleanup(), it complains, and then cleans up anyway:

>>> d = TemporaryDirectory()
>>> del d
/usr/lib/python3.5/tempfile.py:797: ResourceWarning: Implicitly
cleaning up <TemporaryDirectory '/tmp/tmpfzf4g96l'>
  _warnings.warn(warn_message, ResourceWarning)

Perhaps you will have to use the lower-level mkdtemp() function
instead if you want the option of making the “temporary” directory
long-lived.

> I "fix" this by
> doing:
>
>     def __init__(self):
>         tmpdir = TemporaryDirectory()
>         self._tmpdir = (tmpdir.name if keep
>                         else self.resources.enter_context(tmpdir))
>
> There must be more to the story because when __init__() exits in the --keep
> case, tmpdir should have gotten refcounted away and the directory deleted, but
> it doesn't.  I haven't dug down deep enough to figure that out.
>
> Now, while I was debugging that behavior, I ran across more interesting bits.
> I put this in a file to drive some tests:
>
> ------snip snip-----
> with ExitStack() as resources:
>     print('enter context')
>     tmpdir = resources.enter_context(X())
>     resources.pop_all()
>     print('exit context')
> ------snip snip-----
>
> Let's say X is:
>
> class X:
>     def __enter__(self):
>         print('enter Foo')
>         return self
>
>     def __exit__(self, *args, **kws):
>         print('exit Foo')
>         return False
>
> the output is:
>
> enter context
> enter Foo
> exit context
>
> So far so good.  A fairly standard context manager class doesn't get its
> __exit__() called even when the program exits.  Let's try this:
>
> @contextmanager
> def X():
>     print('enter bar')
>     yield
>     print('exit bar')
>
> still good:
>
> enter context
> enter bar
> exit context
>
> Let's modify X a little bit to be a more common idiom:
>
> @contextmanager
> def X():
>     print('enter foo')
>     try:
>         yield
>     finally:
>         print('exit foo')
>
> enter context
> enter foo
> exit foo
> exit context
>
> Ah, the try-finally changes the behavior!  There's probably some documentation
> somewhere that defines how a generator gets finalized, and that triggers the
> finally clause, whereas in the previous example, nothing after the yield gets
> run.  I just can't find that anything that would describe the observed
> behavior.

I suspect the documentation doesn’t spell everything out, but my
understanding is that garbage collection of a generator instance
effectively calls its close() method, triggering any “finally” and
__exit__() handlers.

IMO, in some cases if a generator would execute these handlers and
gets garbage collected, it is a programming error because the
programmer should have explicitly called generator.close(). In these
cases, it would be nice to emit a ResourceWarning, just like
forgetting to close a file, or delete your temporay directory above.

But maybe there are other cases where there is no valid reason to emit
a warning. I have hesitated in suggesting this change in the past, but
I don’t remember why. One reason is it might be an annoyance with code
that also wants to handle non-generator iterators that don’t have a
close() method. In a world before “yield from”, imagine having to
change:

# Rough equivalent of “yield from g(...)”
for item in g(...):
    yield item

to:

with contextlib.closing(g(...)) as subgen:
    for item in subgen:
        yield item

It would be even worse if you had to support g() returning some other
iterator not supporting close().

> It's all very twisty, and I'm not sure Python is doing anything wrong, but I'm
> also not sure it's *not* doing anything wrong. ;)
>
> In any case, the contextlib documentation quoted above should probably be more
> liberally sprinkled with salty caveats.  Just calling .pop_all() isn't
> necessarily enough to ensure that resources managed by an ExitStack will
> survive its garbage collection.


More information about the Python-Dev mailing list