[Python-ideas] Draft PEP on protecting finally clauses

Andrew Svetlov andrew.svetlov at gmail.com
Sat Apr 7 23:09:37 CEST 2012


What about a reference implementation?

On Sun, Apr 8, 2012 at 12:08 AM, Andrew Svetlov
<andrew.svetlov at gmail.com> wrote:
> I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/
> Thank you, Paul.
>
> On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets <paul at colomiets.name> wrote:
>> Hi,
>>
>> I've finally made a PEP. Any feedback is appreciated.
>>
>> --
>> Paul
>>
>>
>> PEP: XXX
>> Title: Protecting cleanup statements from interruptions
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Paul Colomiets <paul at colomiets.name>
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 06-Apr-2012
>> Python-Version: 3.3
>>
>>
>> Abstract
>> ========
>>
>> This PEP proposes a way to protect Python code from being interrupted
>> inside a ``finally`` clause or a context manager.
>>
>>
>> Rationale
>> =========
>>
>> Python has two nice ways to do cleanup. One is the ``finally``
>> statement and the other is a context manager (the ``with``
>> statement). However, neither of them is protected from interruption
>> by ``KeyboardInterrupt`` or ``generator.throw()``. For example::
>>
>>    lock.acquire()
>>    try:
>>        print('starting')
>>        do_something()
>>    finally:
>>        print('finished')
>>        lock.release()
>>
>> If ``KeyboardInterrupt`` occurs just after the second ``print`` call
>> (the one inside the ``finally`` clause), the lock will not be
>> released. Similarly, the following code using the ``with`` statement
>> is affected::
>>
>>    from threading import Lock
>>
>>    class MyLock:
>>
>>        def __init__(self):
>>            self._lock_impl = Lock()
>>
>>        def __enter__(self):
>>            self._lock_impl.acquire()
>>            print("LOCKED")
>>
>>        def __exit__(self, *exc_info):
>>            print("UNLOCKING")
>>            self._lock_impl.release()
>>
>>    lock = MyLock()
>>    with lock:
>>        do_something()
>>
>> If ``KeyboardInterrupt`` occurs near either of the ``print`` calls,
>> the lock will never be released.
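
To make the failure mode concrete, here is a minimal sketch that
simulates the interrupt by raising ``KeyboardInterrupt`` by hand at the
point where the first example calls ``print('finished')``. The helper
``interrupted_print`` and the function ``critical_section`` are
hypothetical names used only for illustration; a real Ctrl-C can arrive
between any two bytecodes, which is harder to reproduce.

```python
import threading

lock = threading.Lock()


def interrupted_print(message):
    # Hypothetical stand-in for print() being interrupted by Ctrl-C.
    raise KeyboardInterrupt(message)


def critical_section():
    lock.acquire()
    try:
        pass  # the protected work would go here
    finally:
        interrupted_print('finished')  # the simulated interrupt lands here
        lock.release()                 # never reached


try:
    critical_section()
except KeyboardInterrupt:
    pass

# The finally suite started but did not complete, so the lock is leaked.
still_locked = lock.locked()
print(still_locked)
```
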
>>
>>
>> Coroutine Use Case
>> ------------------
>>
>> A similar case occurs with coroutines. Coroutine libraries usually
>> want to interrupt a coroutine with a timeout. The
>> ``generator.throw()`` method serves this use case, but there is no
>> way to know whether the generator is currently suspended inside a
>> ``finally`` clause.
>>
>> An example that uses yield-based coroutines follows. The code looks
>> similar with any of the popular coroutine libraries: Monocle [1]_,
>> Bluelet [2]_, or Twisted [3]_. ::
>>
>>    def run_locked():
>>        yield connection.sendall('LOCK')
>>        try:
>>            yield do_something()
>>            yield do_something_else()
>>        finally:
>>            yield connection.sendall('UNLOCK')
>>
>>    with timeout(5):
>>        yield run_locked()
>>
>> In the example above, ``yield something`` means: pause the current
>> coroutine and execute the coroutine ``something`` until it finishes.
>> The library therefore keeps the stack of generators itself. The
>> ``connection.sendall`` call waits until the socket is writable and
>> then does something similar to what ``socket.sendall`` does.
>>
>> The ``with`` statement ensures that all of the code is executed
>> within a 5-second timeout. It does so by registering a callback in
>> the main loop, which calls ``generator.throw()`` on the top-most
>> frame in the coroutine stack when the timeout expires.
>>
>> The ``greenlets`` extension works in a similar way, except that it
>> doesn't need ``yield`` to enter a new stack frame. Otherwise the
>> considerations are similar.
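
The coroutine problem can be reproduced with a plain generator and no
library at all. The sketch below is only an illustration: the
``Timeout`` exception and the ``log`` list are made-up names, and the
driver code plays the role of a coroutine library's main loop. It
throws into a generator that is suspended on a ``yield`` inside its
``finally`` suite, and the rest of the cleanup is skipped.

```python
class Timeout(Exception):
    """Hypothetical exception a coroutine library might throw on timeout."""


log = []


def run_locked():
    log.append('LOCK')
    try:
        yield 'working'
    finally:
        yield 'flushing'      # cleanup step that has to suspend
        log.append('UNLOCK')  # skipped if throw() lands on the yield above


gen = run_locked()
first = next(gen)   # generator is suspended in the try body
second = next(gen)  # generator is now suspended inside the finally suite

try:
    # The "main loop" has no way to know the generator is in cleanup.
    gen.throw(Timeout())
except Timeout:
    pass

print(log)
```

After the ``throw()`` the log contains only the ``LOCK`` entry, which is
the situation the ``f_in_cleanup`` flag is meant to let a library detect.
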
>>
>>
>> Specification
>> =============
>>
>> Frame Flag 'f_in_cleanup'
>> -------------------------
>>
>> A new flag on the frame object is proposed. It is set to ``True`` if
>> this frame is currently executing a ``finally`` suite. Internally it
>> must be implemented as a counter of the nested ``finally`` suites
>> currently being executed.
>>
>> The internal counter is also incremented on entering the
>> ``SETUP_WITH`` and ``WITH_CLEANUP`` bytecodes, and decremented on
>> leaving them. This protects the ``__enter__`` and ``__exit__``
>> methods too.
>>
>>
>> Function 'sys.setcleanuphook'
>> -----------------------------
>>
>> A new function for the ``sys`` module is proposed. This function sets
>> a callback which is executed every time ``f_in_cleanup`` becomes
>> ``False``. The callback gets a ``frame`` as its sole argument, so it
>> can find out where it is called from.
>>
>> The setting is thread-local and is stored inside the
>> ``PyThreadState`` structure.
>>
>>
>> Inspect Module Enhancements
>> ---------------------------
>>
>> Two new functions are proposed for the ``inspect`` module:
>> ``isframeincleanup`` and ``getcleanupframe``.
>>
>> ``isframeincleanup``, given a ``frame`` or ``generator`` object as
>> its sole argument, returns the value of the ``f_in_cleanup``
>> attribute of the frame itself or of the generator's ``gi_frame``.
>>
>> ``getcleanupframe``, given a ``frame`` object as its sole argument,
>> returns the innermost frame that has a true value of
>> ``f_in_cleanup``, or ``None`` if no frame in the stack has the
>> attribute set. It starts at the specified frame and walks to outer
>> frames using ``f_back`` pointers, just like ``getouterframes`` does.
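
Since real frames do not have ``f_in_cleanup`` today, the lookup can
only be sketched against stand-in objects. The ``FakeFrame`` class below
is an assumption made purely for illustration (it models the flag as a
counter, as the specification suggests), and ``getcleanupframe`` here is
a pure-Python guess at the described behaviour, not the proposed
``inspect`` function itself.

```python
class FakeFrame:
    """Stand-in for a frame object; not a real interpreter frame."""

    def __init__(self, name, f_back=None, cleanup_depth=0):
        self.name = name
        self.f_back = f_back
        # Counter of nested cleanup suites currently executing.
        self._cleanup_depth = cleanup_depth

    @property
    def f_in_cleanup(self):
        # The flag is simply "counter is non-zero".
        return self._cleanup_depth > 0


def getcleanupframe(frame):
    # Walk outward via f_back and return the first frame that is in
    # cleanup, which is the innermost one relative to the start frame.
    while frame is not None:
        if frame.f_in_cleanup:
            return frame
        frame = frame.f_back
    return None


outer = FakeFrame('main')
middle = FakeFrame('worker', f_back=outer, cleanup_depth=2)
inner = FakeFrame('helper', f_back=middle)

found = getcleanupframe(inner)
print(found.name)
print(getcleanupframe(outer))
```
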
>>
>>
>> Example
>> =======
>>
>> An example implementation of a ``SIGINT`` handler that interrupts
>> safely might look like::
>>
>>    import inspect, sys, functools
>>
>>    def sigint_handler(sig, frame):
>>        if inspect.getcleanupframe(frame) is None:
>>            raise KeyboardInterrupt()
>>        sys.setcleanuphook(functools.partial(sigint_handler, 0))
>>
>> A coroutine example is out of scope for this document, because its
>> implementation depends very much on the trampoline (or main loop)
>> used by the coroutine library.
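
The deferral logic of the handler can be tried out today with a
pure-Python emulation. In the sketch below ``CleanupTracker`` is a
hypothetical stand-in for both the frame counter and
``sys.setcleanuphook``, the hook is assumed to be one-shot, and the
"signal" is delivered by calling the handler directly, so none of this
reflects how an interpreter-level implementation would work.

```python
class CleanupTracker:
    """Emulates the proposed counter and hook with a context manager."""

    def __init__(self):
        self.depth = 0
        self.hook = None

    def __enter__(self):
        self.depth += 1
        return self

    def __exit__(self, *exc_info):
        self.depth -= 1
        if self.depth == 0 and self.hook is not None:
            # Assumption: the hook fires once, when cleanup fully ends.
            hook, self.hook = self.hook, None
            hook()
        return False


tracker = CleanupTracker()
events = []


def sigint_handler(sig=None, frame=None):
    if tracker.depth == 0:
        raise KeyboardInterrupt()
    # Inside cleanup: postpone the interrupt until the cleanup is done.
    tracker.hook = sigint_handler


try:
    with tracker:
        sigint_handler()  # simulated Ctrl-C arriving during cleanup
        events.append('cleanup finished')
except KeyboardInterrupt:
    events.append('interrupted after cleanup')

print(events)
```
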
>>
>>
>> Unresolved Issues
>> =================
>>
>> Interruption Inside With Statement Expression
>> ---------------------------------------------
>>
>> Given the statement::
>>
>>    with open(filename):
>>        do_something()
>>
>> Python can be interrupted after ``open`` is called, but before the
>> ``SETUP_WITH`` bytecode is executed. There are two possible
>> decisions:
>>
>> * Protect the expression inside the ``with`` statement. This would
>>  need another bytecode, since currently there is no delimiter at the
>>  start of the ``with`` expression
>>
>> * Let the user write a wrapper if they consider it important for
>>  their use case. Safe wrapper code might look like the following::
>>
>>    class FileWrapper(object):
>>
>>        def __init__(self, filename, mode):
>>            self.filename = filename
>>            self.mode = mode
>>
>>        def __enter__(self):
>>            self.file = open(self.filename, self.mode)
>>            return self.file
>>
>>        def __exit__(self, *exc_info):
>>            self.file.close()
>>
>>  Alternatively, it can be written using the ``contextmanager``
>>  decorator::
>>
>>    from contextlib import contextmanager
>>
>>    @contextmanager
>>    def open_wrapper(filename, mode):
>>        file = open(filename, mode)
>>        try:
>>            yield file
>>        finally:
>>            file.close()
>>
>>  This code is safe, as the first part of the generator (before
>>  ``yield``) is executed inside the ``SETUP_WITH`` bytecode of the
>>  caller.
>>
>>
>> Exception Propagation
>> ---------------------
>>
>> Sometimes a ``finally`` block or an ``__enter__``/``__exit__`` method
>> can be exited with an exception. Usually this is not a problem, since
>> a more important exception like ``KeyboardInterrupt`` or
>> ``SystemExit`` should be raised instead. But it may be nice to be
>> able to keep the original exception in the ``__context__`` attribute.
>> So the cleanup hook signature may grow an exception argument::
>>
>>    def sigint_handler(sig, frame):
>>        if inspect.getcleanupframe(frame) is None:
>>            raise KeyboardInterrupt()
>>        sys.setcleanuphook(retry_sigint)
>>
>>    def retry_sigint(frame, exception=None):
>>        if inspect.getcleanupframe(frame) is None:
>>            raise KeyboardInterrupt() from exception
>>
>> .. note::
>>
>>    There is no need for three arguments as in the ``__exit__`` method,
>>    since exceptions have a ``__traceback__`` attribute in Python 3.x
>>
>> However, this will set ``__cause__`` for the exception, which is not
>> exactly what's intended. So some hidden interpreter logic may be used
>> to put a ``__context__`` attribute on every exception raised in the
>> cleanup hook.
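
The difference between ``__cause__`` and ``__context__`` mentioned here
is existing Python 3 behaviour and can be checked directly. The sketch
below uses a ``ValueError`` as a made-up stand-in for "an exception that
escaped a cleanup block"; the function names are illustrative only.

```python
def explicit_chain():
    try:
        raise ValueError('cleanup failed')
    except ValueError as exc:
        # "raise ... from" records the original as the explicit cause.
        raise KeyboardInterrupt() from exc


def implicit_chain():
    try:
        raise ValueError('cleanup failed')
    except ValueError:
        # A plain raise inside the handler only records the context.
        raise KeyboardInterrupt()


def capture(func):
    try:
        func()
    except KeyboardInterrupt as exc:
        return exc


explicit = capture(explicit_chain)
implicit = capture(implicit_chain)

print(type(explicit.__cause__).__name__, explicit.__suppress_context__)
print(implicit.__cause__, type(implicit.__context__).__name__)
```

With ``from``, the traceback reports the original exception as the
direct cause; without it, the original is kept only as context, which
is closer to what the paragraph above asks for.
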
>>
>>
>> Interruption Between Acquiring Resource and Try Block
>> -----------------------------------------------------
>>
>> The example from the first section is not totally safe. Let's look closer::
>>
>>    lock.acquire()
>>    try:
>>        do_something()
>>    finally:
>>        lock.release()
>>
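>> An interruption may arrive after ``lock.acquire()`` returns but
>> before the ``try`` block is entered, so the lock is never released.
>> There is no way to fix this without modifying the code. The actual
>> fix depends very much on the use case.
>>
>> Usually the code can be fixed using a ``with`` statement::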
>>
>>    with lock:
>>        do_something()
>>
>> However, for coroutines you usually can't use the ``with`` statement
>> because you need to ``yield`` for both the acquire and release
>> operations. So the code might be rewritten as follows::
>>
>>    try:
>>        yield lock.acquire()
>>        do_something()
>>    finally:
>>        yield lock.release()
>>
>> The actual lock code might need more code to support this use case,
>> but the implementation is usually trivial: check whether the lock
>> has been acquired and unlock it if so.
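
A lock that tolerates ``release()`` without a prior ``acquire()`` could
look roughly like the sketch below. ``TolerantLock`` and ``guarded`` are
hypothetical names, the example is synchronous rather than yield-based
to keep it short, and the bookkeeping flag is not updated atomically
with the underlying lock, so this shows the idea rather than a
race-free implementation.

```python
import threading


class TolerantLock:
    """Lock whose release() is a no-op when it was never acquired."""

    def __init__(self):
        self._lock = threading.Lock()
        self._held = False

    def acquire(self):
        self._lock.acquire()
        self._held = True

    def release(self):
        if not self._held:
            return False  # nothing to undo
        self._held = False
        self._lock.release()
        return True


results = []


def guarded(lock, interrupt_first):
    try:
        if interrupt_first:
            raise KeyboardInterrupt()  # simulated interrupt before acquire
        lock.acquire()
    finally:
        # Safe in both cases because release() checks the flag first.
        results.append(lock.release())


guarded(TolerantLock(), interrupt_first=False)
try:
    guarded(TolerantLock(), interrupt_first=True)
except KeyboardInterrupt:
    pass

print(results)
```
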
>>
>>
>> Setting Interruption Context Inside Finally Itself
>> --------------------------------------------------
>>
>> Some coroutine libraries may need to set a timeout for the finally
>> clause itself. For example::
>>
>>    try:
>>        do_something()
>>    finally:
>>        with timeout(0.5):
>>            try:
>>                yield do_slow_cleanup()
>>            finally:
>>                yield do_fast_cleanup()
>>
>> With the current semantics, the timeout will either protect the whole
>> ``with`` block or nothing at all, depending on the implementation of
>> the library. What the author intended is to treat
>> ``do_slow_cleanup`` as ordinary code, and ``do_fast_cleanup`` as
>> cleanup (non-interruptible) code.
>>
>> A similar case might occur when using greenlets or tasklets.
>>
>> This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
>> by calling the cleanup hook on each decrement. A coroutine library
>> may then remember the value at timeout start, and compare it on each
>> hook execution.
>>
>> But in practice the example is considered too obscure to take into
>> account.
>>
>>
>> Alternative Python Implementations Support
>> ==========================================
>>
>> We consider ``f_in_cleanup`` an implementation detail. The actual
>> implementation may have some fake frame-like object passed to the
>> signal handler and the cleanup hook, and returned from
>> ``getcleanupframe``. The only requirement is that the ``inspect``
>> module functions work as expected on those objects. For this reason
>> we also allow passing a ``generator`` object to the
>> ``isframeincleanup`` function, which removes the need to use the
>> ``gi_frame`` attribute.
>>
>> It may need to be specified that ``getcleanupframe`` must return the
>> same object that will be passed to the cleanup hook at its next
>> invocation.
>>
>>
>> Alternative Names
>> =================
>>
>> The original proposal had an ``f_in_finally`` flag. The original
>> intention was to protect ``finally`` clauses. But as the proposal
>> grew to protect the ``__enter__`` and ``__exit__`` methods too, the
>> name ``f_in_cleanup`` seems better. Although the ``__enter__`` method
>> is not a cleanup routine, it at least relates to the cleanup done by
>> context managers.
>>
>> ``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` can
>> be unobscured to ``set_cleanup_hook``, ``is_frame_in_cleanup`` and
>> ``get_cleanup_frame``, although the current names follow the
>> conventions of their respective modules.
>>
>>
>> Alternative Proposals
>> =====================
>>
>> Propagating 'f_in_cleanup' Flag Automatically
>> -----------------------------------------------
>>
>> This can make ``getcleanupframe`` unnecessary. But for yield-based
>> coroutines you need to propagate it yourself. Making the flag
>> writable leads to somewhat unpredictable behavior of
>> ``setcleanuphook``.
>>
>>
>> Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
>> --------------------------------------------
>>
>> These bytecodes can be used to protect the expression inside the
>> ``with`` statement, as well as making counter increments more
>> explicit and easy to debug (visible in a disassembly). Some middle
>> ground might be chosen, such as ``END_FINALLY`` and ``SETUP_WITH``
>> implicitly decrementing the counter (``END_FINALLY`` is present at
>> the end of every ``with`` suite).
>>
>> However, adding new bytecodes must be considered very carefully.
>>
>>
>> Expose 'f_in_cleanup' as a Counter
>> ----------------------------------
>>
>> The original intention was to expose the minimum needed
>> functionality. However, as we consider the frame flag
>> ``f_in_cleanup`` an implementation detail, we may expose it as a
>> counter.
>>
>> Similarly, if we have a counter we may need to have the cleanup hook
>> called on every counter decrement. This is unlikely to have much
>> performance impact, as nested ``finally`` clauses are an uncommon
>> case.
>>
>>
>> Add Code Object Flag 'CO_CLEANUP'
>> ---------------------------------
>>
>> As an alternative to setting the flag inside the ``SETUP_WITH`` and
>> ``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
>> When the interpreter starts to execute code with ``CO_CLEANUP`` set,
>> it sets ``f_in_cleanup`` for the whole function body. This flag is
>> set for the code objects of the ``__enter__`` and ``__exit__``
>> special methods. Technically it might be set on any function named
>> ``__enter__`` or ``__exit__``.
>>
>> This seems to be a less clear solution. It also covers the case where
>> ``__enter__`` and ``__exit__`` are called manually. This may be
>> accepted either as a feature or as an unnecessary side effect (it is
>> unlikely to be considered a bug).
>>
>> It may also pose a problem when the ``__enter__`` or ``__exit__``
>> functions are implemented in C, as there is usually no frame to check
>> for the ``f_in_cleanup`` flag.
>>
>>
>> Have Cleanup Callback on Frame Object Itself
>> ----------------------------------------------
>>
>> The frame may be extended to have an ``f_cleanup_callback`` which is
>> called when ``f_in_cleanup`` is reset to 0. It would help to register
>> different callbacks for different coroutines.
>>
>> Despite its apparent beauty, this solution doesn't add anything, as
>> there are two primary use cases:
>>
>> * Set the callback in a signal handler. The callback is inherently a
>>  single one for this case
>>
>> * Use a single callback per loop for the coroutine use case. In
>>  almost all cases there is only one loop per thread
>>
>>
>> No Cleanup Hook
>> ---------------
>>
>> The original proposal included no cleanup hook specification, as
>> there are a few ways to achieve the same using current tools:
>>
>> * Use ``sys.settrace`` and the ``f_trace`` callback. It may pose
>>  some problems for debugging, and has a big performance impact
>>  (although interruption doesn't happen very often)
>>
>> * Sleep a bit more and try again. For a coroutine library this is
>>  easy. For signals it may be achieved using ``signal.alarm``.
>>
>> Both methods are considered too impractical, so a way to catch exit
>> from a ``finally`` statement is proposed.
>>
>>
>> References
>> ==========
>>
>> .. [1] Monocle
>>   https://github.com/saucelabs/monocle
>>
>> .. [2] Bluelet
>>   https://github.com/sampsyo/bluelet
>>
>> .. [3] Twisted: inlineCallbacks
>>   http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html
>>
>> .. [4] Original discussion
>>   http://mail.python.org/pipermail/python-ideas/2012-April/014705.html
>>
>>
>> Copyright
>> =========
>>
>> This document has been placed in the public domain.
>>
>>
>>
>> ..
>>   Local Variables:
>>   mode: indented-text
>>   indent-tabs-mode: nil
>>   sentence-end-double-space: t
>>   fill-column: 70
>>   coding: utf-8
>>   End:
>
>
>
> --
> Thanks,
> Andrew Svetlov



-- 
Thanks,
Andrew Svetlov


