[Python-ideas] Draft PEP on protecting finally clauses

Paul Colomiets paul at colomiets.name
Fri Apr 6 23:04:28 CEST 2012


Hi,

I've finally made a PEP. Any feedback is appreciated.

-- 
Paul


PEP: XXX
Title: Protecting cleanup statements from interruptions
Version: $Revision$
Last-Modified: $Date$
Author: Paul Colomiets <paul at colomiets.name>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Apr-2012
Python-Version: 3.3


Abstract
========

This PEP proposes a way to protect Python code from being interrupted
inside a ``finally`` clause or a context manager.


Rationale
=========

Python has two nice ways to do cleanup. One is the ``finally`` clause
and the other is a context manager (the ``with`` statement). However,
neither of them is protected from interruption by
``KeyboardInterrupt`` or ``generator.throw()``. For example::

    lock.acquire()
    try:
        print('starting')
        do_something()
    finally:
        print('finished')
        lock.release()

If ``KeyboardInterrupt`` occurs just after the second ``print`` call
is executed, the lock will not be released. Similarly, the following
code using the ``with`` statement is affected::

    from threading import Lock

    class MyLock:

        def __init__(self):
            self._lock_impl = Lock()

        def __enter__(self):
            self._lock_impl.acquire()
            print("LOCKED")

        def __exit__(self, *exc_info):
            print("UNLOCKING")
            self._lock_impl.release()

    lock = MyLock()
    with lock:
        do_something()

If ``KeyboardInterrupt`` occurs near any of the ``print`` calls, the
lock will never be released.
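
The failure mode can be reproduced deterministically without a real
signal. The sketch below is only an illustration, not part of the
proposal: ``interrupted_print`` is a hypothetical stand-in for
``print`` that raises ``KeyboardInterrupt`` itself, which mimics
``Ctrl-C`` arriving during that call:

```python
import threading

lock = threading.Lock()

def interrupted_print(message):
    # Hypothetical stand-in for print(): simulates Ctrl-C arriving
    # while the call is in progress.
    raise KeyboardInterrupt

def critical_section():
    lock.acquire()
    try:
        pass  # the protected work would go here
    finally:
        interrupted_print('finished')  # the "interrupt" happens here
        lock.release()                 # never reached

try:
    critical_section()
except KeyboardInterrupt:
    pass

still_locked = lock.locked()  # the interrupted cleanup leaked the lock
```

The ``finally`` suite did start, but it was abandoned half way, so any
later attempt to acquire ``lock`` would block.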


Coroutine Use Case
------------------

A similar case occurs with coroutines. Coroutine libraries usually
want to interrupt a coroutine on timeout. The ``generator.throw()``
method exists for this use case, but there is no way to know whether
the generator is currently suspended inside a ``finally`` clause.

An example that uses yield-based coroutines follows. The code looks
similar with any of the popular coroutine libraries: Monocle [1]_,
Bluelet [2]_, or Twisted [3]_. ::

    def run_locked():
        yield connection.sendall('LOCK')
        try:
            yield do_something()
            yield do_something_else()
        finally:
            yield connection.sendall('UNLOCK')

    with timeout(5):
        yield run_locked()

In the example above, ``yield something`` means: pause the current
coroutine and execute the coroutine ``something`` until it finishes.
The library therefore keeps the stack of generators itself. The
``connection.sendall`` call waits until the socket is writable and
then does something similar to what ``socket.sendall`` does.

The ``with`` statement ensures that all of that code is executed
within a 5 second timeout. It does so by registering a callback in
the main loop, which calls ``generator.throw()`` on the top-most
frame in the coroutine stack when the timeout expires.

The ``greenlets`` extension works in a similar way, except that it
doesn't need ``yield`` to enter a new stack frame. Otherwise the
considerations are similar.
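
The problem can be shown with a plain generator and no coroutine
library at all. In this minimal sketch the ``Timeout`` exception and
the yielded strings are made up for illustration, and the generator is
driven by hand instead of by a main loop:

```python
class Timeout(Exception):
    """Hypothetical exception a coroutine library might throw on timeout."""

log = []

def run_locked():
    log.append('LOCK')
    try:
        yield 'work'
    finally:
        yield 'send UNLOCK'    # suspended here, inside the finally suite
        log.append('UNLOCK')   # runs only if the generator is resumed normally

gen = run_locked()
next(gen)   # runs up to the first yield, inside the try suite
next(gen)   # leaves the try suite and suspends inside the finally suite

try:
    gen.throw(Timeout())   # the timeout fires while cleanup is in progress
except Timeout:
    pass
```

After the ``throw()`` call ``log`` holds only ``'LOCK'``: the cleanup
was cut short, and the caller had no way to know that the generator
was suspended inside ``finally``.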


Specification
=============

Frame Flag 'f_in_cleanup'
-------------------------

A new flag on the frame object is proposed. It is set to ``True`` if
the frame is currently executing a ``finally`` suite.  Internally it
must be implemented as a counter of the nested ``finally`` suites
currently being executed.

The internal counter is also incremented when entering the
``SETUP_WITH`` and ``WITH_CLEANUP`` bytecodes, and is decremented when
leaving them. This protects the ``__enter__`` and ``__exit__`` methods
too.
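
The reason for a counter rather than a plain boolean can be seen in a
small pure-Python model. ``FrameModel`` and ``cleanup_suite`` are
hypothetical names used only for this sketch; a real implementation
would live in the interpreter loop:

```python
from contextlib import contextmanager

class FrameModel:
    """Pure-Python stand-in for a frame; not a real frame object."""

    def __init__(self):
        self._cleanup_depth = 0   # the internal counter

    @property
    def f_in_cleanup(self):
        # The flag is simply "the counter is non-zero".
        return self._cleanup_depth > 0

    @contextmanager
    def cleanup_suite(self):
        # Models entering and leaving a finally suite (or the
        # bytecode that calls __enter__ / __exit__).
        self._cleanup_depth += 1
        try:
            yield
        finally:
            self._cleanup_depth -= 1

frame = FrameModel()
observed = [frame.f_in_cleanup]
with frame.cleanup_suite():
    with frame.cleanup_suite():          # a nested finally suite
        observed.append(frame.f_in_cleanup)
    observed.append(frame.f_in_cleanup)  # still inside the outer suite
observed.append(frame.f_in_cleanup)
```

With a boolean, leaving the inner suite would clear the flag while the
outer suite is still running; with the counter the third observation
is still ``True``.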


Function 'sys.setcleanuphook'
-----------------------------

A new function for the ``sys`` module is proposed. This function sets
a callback which is executed every time ``f_in_cleanup`` becomes
``False``. The callback gets the ``frame`` as its sole argument, so it
has some evidence of where it is called from.

The setting is thread-local and is stored inside the ``PyThreadState``
structure.
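
The thread-local behaviour can be modelled with ``threading.local``.
This is a sketch under stated assumptions: ``set_cleanup_hook`` and
``flag_became_false`` are hypothetical pure-Python stand-ins for the
proposed ``sys.setcleanuphook`` and for the interpreter's internal
notification, and strings are used in place of frame objects:

```python
import threading

_state = threading.local()   # stands in for the PyThreadState slot

def set_cleanup_hook(hook):
    """Model of the proposed sys.setcleanuphook (hypothetical here)."""
    _state.hook = hook

def flag_became_false(frame):
    """What the interpreter would do when f_in_cleanup becomes False."""
    hook = getattr(_state, 'hook', None)
    if hook is not None:
        hook(frame)

calls = []
set_cleanup_hook(lambda frame: calls.append(frame))

def other_thread():
    # No hook was set in this thread, so nothing is called.
    flag_became_false('frame-in-worker')

worker = threading.Thread(target=other_thread)
worker.start()
worker.join()

flag_became_false('frame-in-current-thread')
```

Only the thread that installed the hook sees it fire, so ``calls``
ends up holding just ``'frame-in-current-thread'``.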


Inspect Module Enhancements
---------------------------

Two new functions are proposed for the ``inspect`` module:
``isframeincleanup`` and ``getcleanupframe``.

``isframeincleanup``, given a ``frame`` or ``generator`` object as its
sole argument, returns the value of the ``f_in_cleanup`` attribute of
the frame itself or of the generator's ``gi_frame``.

``getcleanupframe``, given a ``frame`` object as its sole argument,
returns the innermost frame which has a true value of ``f_in_cleanup``,
or ``None`` if no frame in the stack has the attribute set. It starts
inspecting from the specified frame and walks to outer frames using
``f_back`` pointers, just like ``getouterframes`` does.
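
The described behaviour fits in a few lines. The functions below are
local models, not additions to the real ``inspect`` module, and they
operate on fake frame objects built with ``types.SimpleNamespace``
because real frames have no ``f_in_cleanup`` attribute:

```python
from types import SimpleNamespace

def isframeincleanup(obj):
    # Accept a generator-like object (anything with gi_frame) or a frame.
    frame = getattr(obj, 'gi_frame', obj)
    return frame.f_in_cleanup

def getcleanupframe(frame):
    # Walk outwards through f_back; return the first frame in cleanup.
    while frame is not None:
        if frame.f_in_cleanup:
            return frame
        frame = frame.f_back
    return None

# A fake three-frame stack: outer <- middle (in cleanup) <- inner.
outer = SimpleNamespace(f_in_cleanup=False, f_back=None)
middle = SimpleNamespace(f_in_cleanup=True, f_back=outer)
inner = SimpleNamespace(f_in_cleanup=False, f_back=middle)
fake_generator = SimpleNamespace(gi_frame=middle)

found_from_inner = getcleanupframe(inner)   # the middle frame
found_from_outer = getcleanupframe(outer)   # None
```

Starting from ``inner`` the search finds ``middle``, while starting
from ``outer`` it finds nothing, because the walk only goes towards
outer frames.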


Example
=======

An example implementation of a ``SIGINT`` handler that interrupts
safely might look like::

    import inspect, sys, functools

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(functools.partial(sigint_handler, 0))

A coroutine example is out of scope of this document, because its
implementation depends very much on the trampoline (or main loop) used
by the coroutine library.
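
Since the proposed functions do not exist yet, the intended effect of
the handler can only be simulated. In the sketch below everything is
an assumption made for illustration: ``InterruptDeferral`` is a
hypothetical class whose ``depth`` models the cleanup counter, whose
``handler`` models ``sigint_handler``, and whose ``__exit__`` models
the cleanup hook. The signal is replaced by a direct call:

```python
import threading

class InterruptDeferral:
    """Toy model of deferred interruption; not interpreter support."""

    def __init__(self):
        self.depth = 0        # models the f_in_cleanup counter
        self.pending = False  # an interrupt arrived during cleanup

    def handler(self):
        # Models sigint_handler: raise at once unless cleanup is running.
        if self.depth == 0:
            raise KeyboardInterrupt
        self.pending = True

    def __enter__(self):
        self.depth += 1
        return self

    def __exit__(self, *exc_info):
        # Models the cleanup hook: deliver the deferred interrupt.
        self.depth -= 1
        if self.depth == 0 and self.pending:
            self.pending = False
            raise KeyboardInterrupt
        return False

lock = threading.Lock()
deferral = InterruptDeferral()
events = []

lock.acquire()
try:
    try:
        events.append('working')
    finally:
        with deferral:                # explicit stand-in for the flag
            deferral.handler()        # "Ctrl-C" arrives mid-cleanup
            events.append('finished')
            lock.release()
except KeyboardInterrupt:
    events.append('interrupted')
```

The interrupt is still delivered, but only after the cleanup has run
to completion, so the lock ends up released.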


Unresolved Issues
=================

Interruption Inside With Statement Expression
---------------------------------------------

Given the statement::

    with open(filename):
        do_something()

Python can be interrupted after ``open`` is called, but before the
``SETUP_WITH`` bytecode is executed. There are two possible decisions:

* Protect the expression inside the ``with`` statement. This would
  need another bytecode, since currently there is no delimiter at the
  start of the ``with`` expression.

* Let users write a wrapper if they consider it important for their
  use case. Safe wrapper code might look like the following::

    class FileWrapper(object):

        def __init__(self, filename, mode):
            self.filename = filename
            self.mode = mode

        def __enter__(self):
            self.file = open(self.filename, self.mode)
            return self.file

        def __exit__(self, *exc_info):
            self.file.close()

  Alternatively, it can be written using the ``contextmanager``
  decorator from ``contextlib``::

    from contextlib import contextmanager

    @contextmanager
    def open_wrapper(filename, mode):
        file = open(filename, mode)
        try:
            yield file
        finally:
            file.close()

  This code is safe, as the first part of the generator (before
  ``yield``) is executed inside the ``SETUP_WITH`` bytecode of the
  caller.
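
That the generator body does not start until ``__enter__`` is called
can be checked with today's ``contextlib``. The ``resource`` manager
and the event strings below are made up for this illustration:

```python
from contextlib import contextmanager

events = []

@contextmanager
def resource():
    events.append('acquired')      # runs inside __enter__
    try:
        yield 'handle'
    finally:
        events.append('released')  # runs inside __exit__

manager = resource()               # only creates the generator
events.append('manager created')
with manager as handle:
    events.append('body')
```

Evaluating the ``with`` expression merely creates the manager, so
``'manager created'`` is recorded before ``'acquired'``. The
acquisition itself happens inside the part of the ``with`` statement
that the proposal protects.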


Exception Propagation
---------------------

Sometimes a ``finally`` block or an ``__enter__``/``__exit__`` method
can be exited with an exception. Usually this is not a problem, since
a more important exception like ``KeyboardInterrupt`` or
``SystemExit`` should be thrown instead. But it may be nice to be able
to keep the original exception in the ``__context__`` attribute. So
the cleanup hook signature may grow an exception argument::

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(retry_sigint)

    def retry_sigint(frame, exception=None):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt() from exception

.. note::

    There is no need to have three arguments like in the ``__exit__``
    method, since exceptions have a ``__traceback__`` attribute in
    Python 3.x.

However, this will set ``__cause__`` for the exception, which is not
exactly what's intended. So some hidden interpreter logic may be used
to set the ``__context__`` attribute on every exception raised in a
cleanup hook.
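
The difference between the two attributes can be observed with
existing Python 3 semantics. The helper names and the ``ValueError``
used as the "original" exception are made up for this sketch:

```python
original = ValueError('cleanup failed')

def chain_explicitly():
    # "raise ... from ..." records the original as __cause__.
    try:
        raise KeyboardInterrupt() from original
    except KeyboardInterrupt as interrupt:
        return interrupt

def chain_implicitly():
    # Raising while another exception is being handled records the
    # original as __context__ and leaves __cause__ unset.
    try:
        try:
            raise original
        except ValueError:
            raise KeyboardInterrupt()
    except KeyboardInterrupt as interrupt:
        return interrupt

explicit = chain_explicitly()
implicit = chain_implicitly()
```

Here ``explicit.__cause__`` is the original exception, while
``implicit.__cause__`` is ``None`` and the original is found in
``implicit.__context__``. The second form is what the paragraph above
would like a cleanup hook to produce.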


Interruption Between Acquiring Resource and Try Block
-----------------------------------------------------

The example from the first section is not totally safe. Let's look closer::

    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

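An interruption can arrive after ``lock.acquire()`` returns but before
the ``try`` block is entered, in which case the lock is never
released. There is no way this can be fixed without modifying the
code. The actual fix depends very much on the use case.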

Usually the code can be fixed using a ``with`` statement::

    with lock:
        do_something()

However, for coroutines you usually can't use the ``with`` statement,
because you need to ``yield`` for both the acquire and release
operations. So the code might be rewritten as follows::

    try:
        yield lock.acquire()
        do_something()
    finally:
        yield lock.release()

The actual lock code might need more code to support this use case,
but the implementation is usually trivial: check whether the lock has
been acquired, and unlock it if it has.
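
A minimal sketch of such a lock follows. ``TolerantLock``, ``Timeout``
and ``worker`` are hypothetical names, the yielded strings stand for
whatever the coroutine library expects, and for brevity the release
is done synchronously instead of being yielded:

```python
class Timeout(Exception):
    """Hypothetical exception thrown by the library on timeout."""

class TolerantLock:
    def __init__(self):
        self.held = False

    def acquire(self):
        self.held = True

    def release(self):
        # Releasing a lock that was never taken is a harmless no-op.
        if not self.held:
            return False
        self.held = False
        return True

def worker(lock, released):
    try:
        yield 'wait for lock'   # may be interrupted before acquiring
        lock.acquire()
        yield 'work'
    finally:
        released.append(lock.release())

released = []
lock = TolerantLock()

early = worker(lock, released)
next(early)                     # suspended before the lock is taken
try:
    early.throw(Timeout())
except Timeout:
    pass

late = worker(lock, released)
next(late)
next(late)                      # the lock is held now
try:
    late.throw(Timeout())
except Timeout:
    pass
```

The first interruption arrives before the lock is taken, so
``release`` reports ``False`` and does nothing; the second arrives
while the lock is held, so it is released for real.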


Setting Interruption Context Inside Finally Itself
--------------------------------------------------

Some coroutine libraries may need to set a timeout for the finally
clause itself. For example::

    try:
        do_something()
    finally:
        with timeout(0.5):
            try:
                yield do_slow_cleanup()
            finally:
                yield do_fast_cleanup()

With the current semantics the timeout will either protect the whole
``with`` block or nothing at all, depending on the implementation of
the library. What the author intended is to treat ``do_slow_cleanup``
as ordinary code, and ``do_fast_cleanup`` as cleanup (non-interruptible
code).

A similar case might occur when using greenlets or tasklets.

This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
by calling the cleanup hook on each decrement.  A coroutine library
may then remember the value at timeout start, and compare it on each
hook execution.
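
The comparison might be as simple as the sketch below. ``TimeoutScope``
is a hypothetical piece of library bookkeeping, and the depth values
are assumptions: the counter is taken to be 1 in the body of the
``with timeout(0.5)`` block (inside the outer ``finally``) and 2 inside
the inner ``finally``:

```python
class TimeoutScope:
    """Hypothetical bookkeeping a coroutine library keeps per timeout."""

    def __init__(self, depth_at_start):
        # Value of the cleanup counter when the timeout was armed.
        self.depth_at_start = depth_at_start

    def may_interrupt(self, current_depth):
        # Code at the depth where the timeout was armed is ordinary
        # code for this timeout; anything deeper is cleanup.
        return current_depth <= self.depth_at_start

# The timeout is armed inside the outer finally suite (assumed depth 1).
scope = TimeoutScope(depth_at_start=1)
slow_cleanup_interruptible = scope.may_interrupt(1)   # do_slow_cleanup
fast_cleanup_interruptible = scope.may_interrupt(2)   # do_fast_cleanup
```

Under these assumptions ``do_slow_cleanup`` may be interrupted by the
timeout and ``do_fast_cleanup`` may not.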

But in practice this use case is considered too obscure to take into
account.


Alternative Python Implementations Support
==========================================

We consider ``f_in_cleanup`` an implementation detail. The actual
implementation may pass some fake frame-like object to the signal
handler and the cleanup hook, and return one from ``getcleanupframe``.
The only requirement is that the ``inspect`` module functions work as
expected on those objects. For this reason we also allow passing a
``generator`` object to the ``isframeincleanup`` function, which
removes the need to use the ``gi_frame`` attribute.

It may need to be specified that ``getcleanupframe`` must return the
same object that will be passed to the cleanup hook at its next
invocation.


Alternative Names
=================

The original proposal had an ``f_in_finally`` flag. The original
intention was to protect ``finally`` clauses. But as it grew to
protect ``__enter__`` and ``__exit__`` methods too, the
``f_in_cleanup`` name seems better. Although the ``__enter__`` method
is not a cleanup routine, it at least relates to the cleanup done by
context managers.

``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` could
be spelled more readably as ``set_cleanup_hook``,
``is_frame_in_cleanup`` and ``get_cleanup_frame``, although the
current names follow the conventions of their respective modules.


Alternative Proposals
=====================

Propagating 'f_in_cleanup' Flag Automatically
-----------------------------------------------

This can make ``getcleanupframe`` unnecessary. But for yield-based
coroutines you need to propagate it yourself. Making it writable leads
to somewhat unpredictable behavior of ``setcleanuphook``.


Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
--------------------------------------------

These bytecodes can be used to protect the expression inside the
``with`` statement, as well as making counter increments more explicit
and easy to debug (visible in a disassembly). Some middle ground might
be chosen, such as ``END_FINALLY`` and ``SETUP_WITH`` implicitly
decrementing the counter (``END_FINALLY`` is present at the end of
every ``with`` suite).

However, adding new bytecodes must be considered very carefully.


Expose 'f_in_cleanup' as a Counter
----------------------------------

The original intention was to expose the minimum needed functionality.
However, as we consider the frame flag ``f_in_cleanup`` an
implementation detail, we may expose it as a counter.

Similarly, if we have a counter we may need to have the cleanup hook
called on every counter decrement. This is unlikely to have much
performance impact, as nested ``finally`` clauses are an uncommon
case.


Add Code Object Flag 'CO_CLEANUP'
---------------------------------

As an alternative to setting the flag inside the ``SETUP_WITH`` and
``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
When the interpreter starts to execute code with ``CO_CLEANUP`` set,
it sets ``f_in_cleanup`` for the whole function body.  This flag is
set for the code objects of the ``__enter__`` and ``__exit__`` special
methods. Technically it might be set on any function called
``__enter__`` or ``__exit__``.

This seems to be a less clear solution. It also covers the case where
``__enter__`` and ``__exit__`` are called manually. This may be
accepted either as a feature or as an unnecessary side effect (it is
unlikely to be seen as a bug).

It may also pose a problem when the ``__enter__`` or ``__exit__``
functions are implemented in C, as there is usually no frame on which
to check the ``f_in_cleanup`` flag.


Have Cleanup Callback on Frame Object Itself
----------------------------------------------

The frame may be extended to have an ``f_cleanup_callback`` which is
called when ``f_in_cleanup`` is reset to 0. It would help to register
different callbacks for different coroutines.

Despite its apparent beauty, this solution doesn't add anything, as
there are two primary use cases:

* Setting a callback in a signal handler. The callback is inherently a
  single one in this case.

* Using a single callback per loop for the coroutine use case. In
  almost all cases there is only one loop per thread.


No Cleanup Hook
---------------

The original proposal included no cleanup hook specification, as there
are a few ways to achieve the same using current tools:

* Use ``sys.settrace`` and the ``f_trace`` callback. This may cause
  problems for debugging, and has a big performance impact (although
  interruption doesn't happen very often).

* Sleep a bit more and try again. For a coroutine library it's easy.
  For signals it may be achieved using ``signal.alarm``.

Both methods are considered too impractical, so a way to catch the
exit from a ``finally`` statement is proposed.


References
==========

.. [1] Monocle
   https://github.com/saucelabs/monocle

.. [2] Bluelet
   https://github.com/sampsyo/bluelet

.. [3] Twisted: inlineCallbacks
   http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html

.. [4] Original discussion
   http://mail.python.org/pipermail/python-ideas/2012-April/014705.html


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


