[Python-Dev] Context management patterns

Sat Oct 19 08:38:39 CEST 2013

On 18 October 2013 03:25, Glenn Linderman <v+python at g.nevcal.com> wrote:
> First, thanks for the education. What you wrote is extremely edifying about
> more than just context managers, and I really appreciate the visionary
> understanding you reported from BrisPy and further elucidated on, regarding
> the educational pattern of using things before you learn how they work...
> that applies strongly in arenas other than programming as well:
>
> - you learn how to walk before you understand the musculoskeletal physics
> - you learn how to turn on/off the lights before you understand how
> electricity works
> - you learn how to drive before you learn how/why a vehicle works
> - you learn how to speak before you understand how grammar works
> - you learn how to locate the constellations before you understand
> interplanetary gravitational forces
> - many, many, many, many more things
>
> And of course, many people never reach the understanding of how or why for
> many things they commonly use, do, or observe. That's why some people make
> things happen, some people watch what happens, and some people wonder "What
> happened?"
>
> What it doesn't do, though, is address the dubious part of the whole
> construct, which is composition.

However, it's important to be clear as to whether the composition
problems are specifically with the context manager form *or* if they
also apply to the underlying exception handling pattern.

Barry raised a good point the other day about how context managers
encapsulate exception handling patterns, and that the level shift
happening with contextlib.suppress vs previous standard library usage
is that it actually takes advantage of the fact that the "with"
statement is a control flow construct that supports suppressing raised
exceptions. By contrast, the previously extracted patterns encountered
in the standard library are all about correct resource handling and
use either the try-finally or the try-except-raise patterns that don't
impact control flow.

Here are the clearest patterns I've personally noticed in the time
since the with statement was added:

* deterministic resource management (not a control flow pattern)

    try:
        <do something>
    finally:
        x.close() # Or otherwise clean up

- aside from calling close() methods, generally resource specific
- closing files, sockets, etc
- dropping memoryview buffer references
- contextlib.closing
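
As a minimal sketch of the resource management pattern using contextlib.closing (the Resource class and use_resource helper are hypothetical stand-ins, invented purely for illustration):

```python
from contextlib import closing


class Resource:
    """Hypothetical resource exposing only a close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def use_resource(fail=False):
    resource = Resource()
    try:
        # closing() calls resource.close() on the way out, whether or not
        # the body raised
        with closing(resource):
            if fail:
                raise RuntimeError("work failed")
    except RuntimeError:
        # Caught here only so the caller can inspect the resource afterwards;
        # closing() itself never suppresses the exception
        pass
    return resource
```

In both the success and the failure case the returned resource has been closed, and the exception still propagated out of the with statement before being caught.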

* deterministic state management (not a control flow pattern)

    <set up>
    try:
        <do something>
    finally:
        <tear down>

- specific to the state being managed
- lock acquisition/release pairs
- decimal context management
- monkey patching (including standard stream redirection)
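
A rough sketch of the state management pattern applied to monkey patching, assuming a simple attribute swap is all that's needed (patched_attr is a hypothetical helper name, not an existing API):

```python
from contextlib import contextmanager


@contextmanager
def patched_attr(obj, name, value):
    """Temporarily replace obj.<name> with value (hypothetical helper)."""
    original = getattr(obj, name)
    setattr(obj, name, value)         # <set up>
    try:
        yield
    finally:
        setattr(obj, name, original)  # <tear down>, even if the body raised
```

The try/finally never swallows the exception, which is why this isn't a control flow pattern: the body's outcome is unchanged, only the surrounding state is restored.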

* transaction management (not a control flow pattern)

    <set up>
    try:
        <do something>
    except:
        <revert/rollback/otherwise handle failure case>
        raise
    else:
        <commit/otherwise handle success case>

- database sessions
- conditional resource cleanup (i.e. only in failure case)
- specific to the kind of transaction being managed
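
The transaction pattern might be sketched as follows. FakeSession is a hypothetical stand-in that only records calls; a real database session would have its own commit/rollback API:

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stand-in for a database session; records calls only."""

    def __init__(self):
        self.events = []

    def commit(self):
        self.events.append("commit")

    def rollback(self):
        self.events.append("rollback")


@contextmanager
def transaction(session):
    try:
        yield session
    except:
        # Failure case: revert, then let the exception continue propagating
        session.rollback()
        raise
    else:
        # Success case: make the changes permanent
        session.commit()
```

The bare except is deliberate here, since it is always paired with a re-raise: the rollback should happen no matter what went wrong.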

* logging unhandled exceptions (not a control flow pattern)

    try:
        <do something>
    except Exception as exc:
        # Exception is a good default, but needs to be configurable
        log.exception(exc)
        # Re-raising is important, since *suppressing* the exception
        # is a separate decision
        raise

- specific to a logging framework
- in practice, usually just written out and combined with exception
suppression in long running server processes
- not commonly seen in scripts, since those rely on the unhandled
exception display in the interpreter
- stdlib logging module could possibly offer a
"logging.log_unhandled" context manager, permitting things like:

    with suppress(Exception):
        with log_unhandled():
            <do something>
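
No such helper exists in the logging module today, so the following is only a sketch of what a log_unhandled context manager could look like. The name, the logger and exc_type parameters, and the log message are all assumptions:

```python
import logging
from contextlib import contextmanager


@contextmanager
def log_unhandled(logger=None, exc_type=Exception):
    """Log exceptions escaping the block, then re-raise them (hypothetical)."""
    if logger is None:
        logger = logging.getLogger(__name__)
    try:
        yield
    except exc_type:
        # logger.exception() records the active traceback at ERROR level
        logger.exception("Unhandled exception")
        # Re-raise: whether to suppress is left to an enclosing construct
        raise
```

Keeping the re-raise inside log_unhandled is what lets it compose with suppress(Exception) on the outside: logging and suppression remain two separate decisions.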

* suppressing expected exceptions that aren't errors (control flow pattern!)

    try:
        <do something>
    except <permitted failure>:
        pass

- general pattern, now explicitly named as contextlib.suppress
- indicates exceptions that aren't really "exceptions" in that
specific context, but just a different acceptable result
- once converted to the class based implementation, will be stateless
and reusable
- consider if, instead of accepting things like "ignore_errors" flags
and/or error callbacks, iteration constructs like shutil.rmtree
instead accepted a context manager to wrap around the innermost calls
where exceptions are anticipated.
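
To illustrate that last idea without touching shutil itself, here is a sketch of an iteration helper that accepts a context manager instead of an "ignore_errors" flag. apply_to_each is a hypothetical function, and a dict stands in for the filesystem:

```python
from contextlib import suppress


def apply_to_each(action, items, guard):
    """Call action(item) for each item, wrapping only that call in guard.

    guard must be a reusable context manager, since it is entered once
    per item. (Hypothetical helper, for illustration only.)
    """
    for item in items:
        with guard:
            action(item)


cache = {"a": 1, "b": 2}
# The missing key is skipped quietly and the loop still goes on to remove "b"
apply_to_each(cache.__delitem__, ["a", "missing", "b"], suppress(KeyError))
```

The caller decides exactly which exceptions are acceptable at the innermost call; passing suppress() with no arguments suppresses nothing, so every error propagates.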

There are now two more control flow patterns that I'm considering
adding to contextlib. However, I'd experiment with them in contextlib2
before adding them to the standard library's contextlib module:

* delayed failure handling

    try:
        i = data.index(target)
    except ValueError:
        i = None

    # later
    if i is None:
        # Handle the "not found" case
    else:
        # Do something with the value

- this could be rewritten as (I believe credit is due to RDM for the name):

    with catch(ValueError) as missing:
        i = data.index(target)

    # later
    if missing:
        # Handle the "not found" case
        # The caught exception would be available as missing.exception
    else:
        # Do something with the value

- unittest.TestCase.assertRaises is an existing construct along these lines
- can use a similarly stateless and reusable class-based
implementation to that which will be used for suppress
- such a "catch" context manager could also take care of saving the
exception and calling traceback.clear_frames() on it for easy
introspection without inadvertently keeping vast swathes of local
objects alive in CPython
- unfortunately, warnings.catch_warnings is misnamed - it's really a
state management context manager for the warning filter state, with an
option to record warnings via monkeypatching. Alas, I understand these
concepts far better now than I did back when we extracted that API
from the 'check_warnings' helper in the test suite, so I didn't
realise the name was wrong until long after it had been published. If
we add contextlib.catch in 3.5, it would probably be worth going
through the deprecation dance needed to rename catch_warnings to
something more sensible like "warning_context".
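
A minimal sketch of how such a "catch" construct might be implemented. Nothing like this exists in contextlib at the time of writing, and for brevity this version stores the caught exception on the manager itself, so it is simpler than the reusable design described above:

```python
import traceback


class catch:
    """Record (and suppress) a matching exception raised in the block."""

    def __init__(self, *exceptions):
        self._exceptions = exceptions
        self.exception = None

    def __enter__(self):
        self.exception = None
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is not None and issubclass(exc_type, self._exceptions):
            # Drop references to frame locals so that keeping the exception
            # around doesn't keep those objects alive; frames that are
            # still executing are skipped by clear_frames()
            traceback.clear_frames(tb)
            self.exception = exc_value
            return True   # suppress: execution resumes after the with block
        return False      # anything else propagates normally

    def __bool__(self):
        return self.exception is not None


data = ["a", "b", "c"]
# list.index() raises ValueError when the item is absent
with catch(ValueError) as missing:
    i = data.index("z")
```

After the block, "if missing:" selects the not-found branch, and the original exception is available as missing.exception.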

* constrained jumps

    # Search loop
    for item in data:
        if is_desired_result(item):
            result = item
            break
    else:
        # Handle the "not found" case
    # Do something with "result"

- using a suitable context manager, this can be rewritten as:

    # Search loop
    with exit_label() as found:
        for item in data:
            if is_desired_result(item):
                found.exit(item)

    # Later
    if found:
        # Do something with "found.value"
    else:
        # Handle the "not found" case

- this is the exit_label() idea I posted earlier, but without the
ability to specify the exception type (since I realised that's a
separate pattern, better handled as the distinct "catch" construct)
- rather than replacing exception handling constructs, it replaces
break/else search loops with something that's hopefully easier to
understand
- it's also a generalisation of the SystemExit/GeneratorExit pattern,
so the exception types it uses for internal flow control would inherit
directly from BaseException
- this is the pattern on the list that gets the closest to "goto" like
behaviour, since it permits arbitrary "bail out now" behaviour (by
passing the exit label to other operations), but the fact every label
uses a custom exception type derived directly from BaseException would
mean it still ends up being quite heavily constrained (since the
*only* thing it would be able to catch is the exception thrown by
calling the exit() method on the label).
- this pattern would be stateful and explicitly *not* reusable (acting
as a further constraint on abuse)
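
One way exit_label might be put together, as a sketch only. The attribute names (exited, value), the single-use check, and the find_first helper are assumptions beyond what is described above:

```python
class exit_label:
    """Single-use label that a block can jump out of via label.exit(value)."""

    def __init__(self):
        # Each label gets its own exception type, derived directly from
        # BaseException, so it can only ever catch its own exit() call
        class _Exit(BaseException):
            pass

        self._exit_type = _Exit
        self._used = False
        self.exited = False
        self.value = None

    def __enter__(self):
        if self._used:
            raise RuntimeError("exit_label instances are not reusable")
        self._used = True
        return self

    def exit(self, value=None):
        raise self._exit_type(value)

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is self._exit_type:
            self.exited = True
            self.value = exc_value.args[0]
            return True   # swallow our own jump
        return False      # every other exception propagates

    def __bool__(self):
        return self.exited


def find_first(data, is_desired_result):
    """Search loop written with exit_label instead of break/else."""
    with exit_label() as found:
        for item in data:
            if is_desired_result(item):
                found.exit(item)
    return found
```

Because the private exception type is created per instance, passing the label into another function and calling exit() there still unwinds only to the matching with statement.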

> On 10/17/2013 8:26 AM, Nick Coghlan wrote:
>
> And even a two line version:
>
>     with suppress(FileNotFoundError): os.remove("somefile.tmp")
>     with suppress(FileNotFoundError): os.remove("someotherfile.tmp")
>
>
> The above example, especially if extended beyond two files, begs to be used
> in a loop, like your 5 line version:
>
>
> for name in ("somefile.tmp", "someotherfile.tmp"):
>     with suppress(FileNotFoundError):
>         os.remove(name)
>
> which would be fine, of course.
>
> But to some with less education about the how and why, it is not clear why
> it couldn't be written like:
>
> with suppress(FileNotFoundError):
>     for name in ("somefile.tmp", "someotherfile.tmp"):
>         os.remove(name)
>
> yet to the cognoscenti, it is obvious there are seriously different
> semantics.

However, that's a confusion about exception handling in general, not
about the suppress context manager in particular. The same potential
for conceptual confusion exists between:

    for name in ("somefile.tmp", "someotherfile.tmp"):
        try:
            os.remove(name)
        except FileNotFoundError:
            pass

and:

    try:
        for name in ("somefile.tmp", "someotherfile.tmp"):
            os.remove(name)
    except FileNotFoundError:
        pass

At the syntactic level, when composing compound statements, the order
of nesting *always* matters. The with/for and for/with constructs are
different, just as if/for and for/if are different. If a student makes
it through an introductory Python course without learning that much,
I'd have grave doubts about that course :)
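
The difference in semantics can be made concrete with a small sketch (the two function names are made up for illustration):

```python
import os
from contextlib import suppress


def remove_each(names):
    """for/with: a missing file only skips that one removal."""
    for name in names:
        with suppress(FileNotFoundError):
            os.remove(name)


def remove_until_missing(names):
    """with/for: the first missing file ends the whole loop."""
    with suppress(FileNotFoundError):
        for name in names:
            os.remove(name)
```

Given a list where the first file is missing and the second exists, remove_each still deletes the second file, while remove_until_missing leaves it in place. Exactly the same holds for the two try/except spellings.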

> In my own code, I have a safe_delete function to bundle the exception
> handling and the os.remove, and when factored that way, the temptation to
> nest the loop inside the suppress is gone. With suppress available, though,
> and if used, the temptation to factor it, either correctly or incorrectly,
> appears. How many cut-n-paste programmers will get it right and how many
> will get it wrong, is the serious question here, I think, and while suppress
> is a slightly better term than ignore, it still hides the implications to
> the control flow when an exception is actually raised within the block.

A *lot* of written out exception handling is better abstracted away
into a helper function. However, the body of the try block can't
always be factored out without creating swiss-army functions with more
knobs and dials than subprocess.Popen, and those are the cases where
using a context manager instead really shines.

> I'm still dubious that this simpler construct, while an interesting
> composition of powerful underlying constructs, has sufficient benefit to
> outweigh the naïve user's potential for misusing it (exacerbated by a name
> that doesn't imply control flow), or even the extra cost in performance per
> the microbenchmark someone published.

In the case of contextlib.suppress, my interest is mostly in giving
this particular pattern a name, although it also allows for finer
granularity in deciding exactly *which* exceptions to ignore. For
example, most typical "safe_remove" functions ignore *all* OSErrors,
while the contextlib.suppress example in the docs is deliberately
constrained to only ignore FileNotFoundError (since "I don't care if
it's already gone" is certainly a reasonable thing to say, while "I
don't care if I don't have permission to remove it" is more dubious).

This inherent ability to suppress exceptions means that "with"
statements *are* a control flow construct, and always have been since
we added them in PEP 343. However, this also gets back to Barry's
point about there being a category shift here: contextlib.suppress is
the first *standard library* context manager to actually make use of
the fact that with statements are a control flow construct just as
much as try/except/else/finally statements are (just a more
constrained one at the point of use, which makes them easier to
understand).
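
The mechanism behind that is the return value of __exit__: a true result tells the with statement to swallow the exception and continue after the block. A class-based sketch in the spirit of suppress (suppress_sketch is an illustrative name, not the actual standard library source):

```python
class suppress_sketch:
    """Stateless, reusable context manager that swallows given exceptions."""

    def __init__(self, *exceptions):
        self._exceptions = exceptions

    def __enter__(self):
        return None

    def __exit__(self, exc_type, exc_value, tb):
        # Returning True tells the with statement to suppress the exception;
        # the rest of the block is skipped and execution continues after it
        return exc_type is not None and issubclass(exc_type, self._exceptions)


steps = []
ignore_missing = suppress_sketch(KeyError)
for _ in range(2):            # the same instance is safely reused
    with ignore_missing:
        steps.append("before")
        {}["missing"]         # raises KeyError
        steps.append("skipped")   # never reached
    steps.append("after")
```

The "skipped" entry never appears in steps, which is the control flow effect in question: the remainder of the block is jumped over, but only up to the end of that one with statement.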

> Your more complex examples for future versions may have greater merit
> because they provide a significantly greater reduction in complexity to
> offset the significantly greater learning curve required to use and
> understand them. But even those look like an expensive form of goto (of
> course, goto is considered harmful, and I generally agree with the reasons
> why, but have coded them in situations where they are more useful than
> harmful in languages which support them).
>
> I imagine that everyone on python-dev is aware that most of the control flow
> constructs in structured programming (which is a subset of OO) are to
> control the context of the CPU's "instruction pointer" without the use of
> "goto".
>
> The real problem with "goto" is not that the instruction pointer is changed
> non-sequentially, but that arbitrary changes can easily violate poorly
> documented preconditions of the target location. Hence, structured
> programming is really an attempt to avoid writing documentation, a laudable
> goal as the documentation is seldom sufficient at that level of detail... or
> if sufficient, is repetitive and overwhelming to create, maintain, and
> comprehend. It achieves that by making control flow constructs that are
> "higher level" than goto, that have meanings that can be understood and
> explained in educational texts, which then are implicit documentation for
> those control flow aspects of a particular program. OO builds on structured
> programming to make neat packages of state and control flow, to isolate
> state into understandable chunks so that larger programs can be
> comprehended, as the BrisPy presenter enlightened us, without understanding
> all the details of how each object and function within it works.
>
> Programmers raised on OO and GUI toolkits are building more and more systems
> out of more complex parts, which increases productivity, and that is good,
> although when they fail to fully understand the parts, some "interesting"
> performance characteristics can result.
>
> ignore/suppress seems to me to be a sledge hammer solution for driving a
> tack. The tack may be driven successfully, but the potential for damage to
> the surroundings (by misunderstanding the control flow implications) is
> sufficient to make me dubious regarding its overall value. Adequate
> documentation may help (if it is both provided and read), but the best
> constructs are those that are self-documenting, or well documented in
> existing "programming 101" books. I haven't seen this construct in other
> languages, nor has such a comparison been made in this thread, so I consider
> the potential for misuse large.
>
> My conclusion: suppress considered harmful, hidden goto within :)

I believe your underlying concerns are actually with the non-local
flow control possibilities that are inherent in a language that offers
both exceptions and the ability to suppress them. Since I'm firmly
convinced that offering *more* structured exception handling is a
vastly better solution to that problem than alternatives like Go's
retreat to C-style return codes, I'll be continuing down the path of
trying to extract and formalise particular patterns that constitute
reasonable and common uses of try/except/else/finally (or
break/else loops!) in Python.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia