[Python-ideas] Consistent programming error handling idiom

Fri Apr 8 12:53:21 EDT 2016

On Sat, Apr 9, 2016 at 2:40 AM, Rian Hunter <rian at thelig.ht> wrote:
>
>> From: Chris Angelico <rosuav at gmail.com>
>> This is exactly the idiom used to cope with builtins that may or may
>> not exist. If you want to support Python 2 as well as 3, you might use
>> something like this:
>>
>> try:
>>   input = raw_input
>> except NameError:
>>   raw_input = input
>
> Oops, good call. That was a bad example. Maybe this one is a bit better:
>
>    try:
>        assert no_bugs()
>    except AssertionError:
>        # bugs are okay
>        pass

Not sure I understand this example. It's obviously a toy, because
you'd never put an 'assert' just to catch and discard the exception -
the net result is that you call no_bugs() if you're in debug mode and
don't if you're not, which is more cleanly spelled "if __debug__:
no_bugs()".
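
A minimal sketch of that equivalence, assuming a hypothetical no_bugs() that records each call and reports a bug. Both spellings call it once in debug mode and not at all under "python -O":

```python
calls = []

def no_bugs():
    # Hypothetical check: record the call and report that a bug was found.
    calls.append("checked")
    return False

# The quoted version: assert, then discard the failure.
try:
    assert no_bugs()
except AssertionError:
    pass

# The cleaner spelling: same call pattern, no exception machinery.
if __debug__:
    no_bugs()
```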

>> This is what I'd call a boundary location. You have "outer" code and
>> "inner" code. Any uncaught exception in the inner code should get
>> logged rather than aborting the outer code.
>
> I'm very familiar with this pattern in Python and I've used it myself countless times. Unfortunately, I've seen instances where it can lead to disastrous behavior; I think we all have. It seems to have limited usefulness in production and more usefulness in development. The point of my original email was precisely to put the legitimacy and usefulness of code like that into question.
>

When does this lead to disaster? Is it because someone creates a
boundary that shouldn't exist, or because the code maintains global
state in bad ways? The latter is a problem even without exceptions;
imagine a programming error that doesn't cause an exception, but just
omits some crucial "un-modify global state" call. You have no way of
detecting that in the outer code, and your state is messed up.

>> As to resetting stuff: I wouldn't bother; your functions should
>> already not mess with global state. The only potential mess you should
>> consider dealing with is a database rollback; and actually, my
>> personal recommendation is to do that with a context manager inside
>> the inner code, rather than a reset in the exception handler in the
>> outer code.
>
> So I agree this pattern works if you assume all code is exception-safe (context managers will clean up intermediate state) and there are no programming errors. There are lots of things that good code should do but, as we all well know, good code doesn't always do those things. My point is that except-log loops are dangerous and careless in the face of programming errors.
>

I don't think the except-log loop is the problem here. The problem is
the code that can go in part way, come out again, and leave itself in
a mess.
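
As a rough sketch of inner code that cannot come out half-done, here is the database-rollback case using the standard sqlite3 module, where "with conn:" commits on success and rolls back if the block raises. The accounts table and the transfer() function are invented for illustration:

```python
import logging
import sqlite3

def transfer(conn, fail):
    # Inner code: the connection's context manager commits on success
    # and rolls back if anything in the block raises.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - 10 WHERE name = 'a'")
        if fail:
            raise RuntimeError("simulated programming error")
        conn.execute("UPDATE accounts SET balance = balance + 10 WHERE name = 'b'")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 100)])
conn.commit()

# Outer code: a plain except-log loop, with no reset logic of its own.
for fail in (True, False):
    try:
        transfer(conn, fail)
    except Exception:
        logging.exception("transfer failed; continuing")

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The failed first call leaves both balances untouched, so the second call starts from clean state even though the outer loop did nothing but log.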

> Programming errors are unavoidable. In a large system they are a daily fact of life. When there is a bug in production it's very dangerous to re-enter a code block that has demonstrated itself to be buggy, else you risk corrupting data. For example:
>
>    def random_code(state):
>        assert is_valid(state.data)
>
>    def user_code_a(state):
>        state.data = "bad data"
>        # the following throws
>        random_code(state)
>
>    def user_code_b(state):
>        state.db.write(state.data)
>
>    def main_loop():
>        state = State()
>        loop = [user_code_a,
>                user_code_b]
>        for fn in loop:
>            try:
>                fn(state)
>            except Exception:
>                log_exception()
>
> This code allows user_code_b() to execute and corrupt data even though random_code() was lucky enough to be called and detect bad state early on.

Can you give a non-toy example that has this kind of mutable state at
top level? I suspect it's bad design. If it's truly necessary, use a
context manager to guarantee the reset:

def user_code_a(state):
    with state.set_data("bad data"):
        random_code(state)
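
set_data() isn't defined anywhere in this thread, so here is one way it might look, as a sketch only. State, is_valid(), and the "written" list standing in for state.db are all assumptions made up for the example:

```python
import logging
from contextlib import contextmanager

class State:
    """Hypothetical stand-in for the State object in the quoted example."""

    def __init__(self, data=None):
        self.data = data
        self.written = []  # stands in for state.db

    @contextmanager
    def set_data(self, value):
        previous = self.data
        self.data = value
        try:
            yield self
        finally:
            # Runs on normal exit and on any exception alike.
            self.data = previous

def is_valid(data):
    return data != "bad data"

def random_code(state):
    assert is_valid(state.data)

def user_code_a(state):
    with state.set_data("bad data"):
        random_code(state)  # raises AssertionError in debug mode

def user_code_b(state):
    state.written.append(state.data)

def main_loop(state):
    for fn in (user_code_a, user_code_b):
        try:
            fn(state)
        except Exception:
            logging.exception("%s failed", fn.__name__)

state = State(data="good data")
main_loop(state)
```

With this, user_code_b() still runs after the failure, but it sees the restored "good data" rather than whatever user_code_a() left behind.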

> You may say the fix is to assert correct data before writing to the database and, yes, that would fix the problem for future executions in this instance. That's not the point; the point is that incorrect buggy code is running in production today and it's imperative to have multiple safeguards to limit its damage. For example, you wouldn't have an except-log loop in an airplane control system.
>

Actually, yes I would. The alternative that you're suggesting is to
have any error immediately shut down the whole system. Is that really
better? To have the entire control system disabled?

> Sometimes an error is just an error but sometimes an error signifies the running system itself is in a bad state. It would be nice to distinguish between the two in a consistent way across all Python event loops. Halting on any escaped exception is inconvenient, but continuing after any escaped exception is dangerous.
>

There's no way for Python to be able to fix this for you. The tools
exist - most notably context managers - so the solution is to use
them.

I don't think we're in python-ideas territory here. What I see here is
a perfect subject for a blog post or other scholarly article on
"Python exception handling best practice", plus possibly an internal
style guide for your company/organization. The blog post I would
definitely read with interest; the style guide ought to be stating the
obvious (as most style guides should).

ChrisA