[Python-ideas] Consistent programming error handling idiom

rian at thelig.ht rian at thelig.ht
Fri Apr 8 14:24:15 EDT 2016


>> Oops, good call. That was a bad example. Maybe this one is a bit 
>> better:
>> 
>>    try:
>>        assert no_bugs()
>>    except AssertionError:
>>        # bugs are okay
>>        pass
> 
> Not sure I understand this example. It's obviously a toy, because
> you'd never put an 'assert' just to catch and discard the exception -
> the net result is that you call no_bugs() if you're in debug mode and
> don't if you're not, which is more cleanly spelled "if __debug__:
> no_bugs()".

The example is removed from its original context. The point was to show 
that there is no reason to locally catch AssertionError (contrast with 
AttributeError, ValueError, TypeError, etc.). AssertionError is *always* 
indicative of a bug (unlike AttributeError which only may be indicative 
of a bug).

>>> This is what I'd call a boundary location. You have "outer" code and
>>> "inner" code. Any uncaught exception in the inner code should get
>>> logged rather than aborting the outer code.
>> 
>> I'm very familiar with this pattern in Python and I've used it myself 
>> countless times. Unfortunately I've seen instances where it can lead 
>> to disastrous behavior, I think we all have. It seems to have limited 
>> usefulness in production and more usefulness in development. The point 
>> of my original email was precisely to put the legitimacy and 
>> usefulness of code like that into question.
>> 
> 
> When does this lead to disaster? Is it because someone creates a
> boundary that shouldn't exist, or because the code maintains global
> state in bad ways? The latter is a problem even without exceptions;
> imagine a programming error that doesn't cause an exception, but just
> omits some crucial "un-modify global state" call. You have no way of
> detecting that in the outer code, and your state is messed up.

This leads to disaster when the code is buggy, and to a greater disaster 
when the boundary allows that code to continue running. It's not black 
and white: there is no 100% reliable way to detect buggy code, but my 
argument is that you shouldn't ignore an assert when you're lucky enough 
to get one.

>>> As to resetting stuff: I wouldn't bother; your functions should
>>> already not mess with global state. The only potential mess you should
>>> consider dealing with is a database rollback; and actually, my
>>> personal recommendation is to do that with a context manager inside
>>> the inner code, rather than a reset in the exception handler in the
>>> outer code.
>> 
>> So I agree this pattern works if you assume all code is exception-safe 
>> (context managers will clean up intermediate state) and there are no 
>> programming errors. There are lots of things that good code should do 
>> but as we all well know good code doesn't always do those things. My 
>> point is that except-log loops are dangerous and careless in the face 
>> of programming errors.
>> 
> 
> I don't think the except-log loop is the problem here. The problem is
> the code that can go in part way, come out again, and leave itself in
> a mess.

I'm not saying except-log is to blame for buggy code. I'm saying that 
when an exception occurs, there's no way to tell whether it was caused 
by an internal bug or an external error. Except-log should normally be 
limited to exceptions caused by external errors in production code.
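
A minimal sketch of that restriction, assuming the application can name 
the exception types it treats as external. The EXTERNAL_ERRORS tuple and 
the run_callbacks helper are hypothetical names, and the choice of 
OSError is only an illustration:

```python
import logging

# Assumption: OSError (which covers ConnectionError, TimeoutError,
# FileNotFoundError, ...) stands in for "external" failures here.
# A real application would have to choose its own list.
EXTERNAL_ERRORS = (OSError,)


def run_callbacks(callbacks):
    """Run each callback, logging external errors only.

    Any other exception (AssertionError, TypeError, ...) is not caught
    and propagates to the caller. Returns the callbacks that completed.
    """
    completed = []
    for cb in callbacks:
        try:
            cb()
        except EXTERNAL_ERRORS:
            logging.exception("External error in %r", cb)
        else:
            completed.append(cb)
    return completed
```

With this shape, a failed network call is logged and the loop moves on, 
while a failed assert stops the loop instead of being swallowed.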

>> Programming errors are unavoidable. In a large system they are a daily 
>> fact of life. When there is a bug in production it's very dangerous to 
>> re-enter a code block that has demonstrated itself to be buggy, else 
>> you risk corrupting data. For example:
>> 
>>    def random_code(state):
>>        assert is_valid(state.data)
>> 
>>    def user_code_a(state):
>>        state.data = "bad data"
>>        # the following throws
>>        random_code(state)
>> 
>>    def user_code_b(state):
>>        state.db.write(state.data)
>> 
>>    def main_loop():
>>        state = State()
>>        loop = [user_code_a,
>>                user_code_b]
>>        for fn in loop:
>>            try:
>>                fn(state)
>>            except Exception:
>>                log_exception()
>> 
>> This code allows user_code_b() to execute and corrupt data even though 
>> random_code() was lucky enough to be called and detect bad state early 
>> on.
> 
> Can you give a non-toy example that has this kind of mutable state at
> top level? I suspect it's bad design. If it's truly necessary, use a
> context manager to guarantee the reset:
> 
> def user_code_a(state):
>     with state.set_data("bad data"):
>         random_code(state)
> 

Yes, it may be bad design, or the code may have bugs, but that's 
precisely the point. By the time the exception hits your catch-all, 
there's no universal way of determining whether the exception was due to 
an internal bug (or bad design or whatever) or an external error.
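
For reference, a runnable version of the quoted toy example. The State 
and FakeDB classes and the is_valid check are hypothetical stand-ins that 
were not part of the original snippet:

```python
import logging


class FakeDB:
    """Hypothetical in-memory stand-in for a database."""

    def __init__(self):
        self.rows = []

    def write(self, value):
        self.rows.append(value)


class State:
    """Hypothetical shared state passed to every callback."""

    def __init__(self):
        self.data = "good data"
        self.db = FakeDB()


def is_valid(data):
    # Assumption: only this one value counts as valid.
    return data == "good data"


def random_code(state):
    assert is_valid(state.data)


def user_code_a(state):
    state.data = "bad data"
    # Raises AssertionError (unless Python runs with -O).
    random_code(state)


def user_code_b(state):
    state.db.write(state.data)


def main_loop():
    state = State()
    for fn in (user_code_a, user_code_b):
        try:
            fn(state)
        except Exception:
            # The assert is logged, then the loop carries on.
            logging.exception("In main loop")
    return state
```

After main_loop() returns, the fake database holds "bad data": the 
catch-all logged the AssertionError and user_code_b() ran anyway.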

>> You may say the fix is to assert correct data before writing to the 
>> database and, yes, that would fix the problem for future executions in 
>> this instance. That's not the point, the point is that incorrect buggy 
>> code is running in production today and it's imperative to have 
>> multiple safeguards to limit its damage. For example, you wouldn't 
>> have an except-log loop in an airplane control system.
>> 
> 
> Actually, yes I would. The alternative that you're suggesting is to
> have any error immediately shut down the whole system. Is that really
> better? To have the entire control system disabled?

The alternative I'm suggesting is to reset that flight computer and 
switch to the backup one (potentially written by a different team).

>> Sometimes an error is just an error but sometimes an error signifies 
>> the running system itself is in a bad state. It would be nice to 
>> distinguish between the two in a consistent way across all Python 
>> event loops. Halting on any escaped exception is inconvenient, but 
>> continuing after any escaped exception is dangerous.
>> 
> 
> There's no way for Python to be able to fix this for you. The tools
> exist - most notably context managers - so the solution is to use
> them.

Context managers don't allow me to determine whether an exception is 
caused by an internal bug or an external error, so they are not a 
solution to this problem. I want to live in a world where I can do this:

     while cbs:
         cb = cbs.pop()
         try:
             cb()
         except Exception as e:
             logging.exception("In main loop")
             if is_a_bug(e):
                 raise SystemExit() from e

Python may be able to do something here. One possibility is a new 
exception hierarchy; there may be other solutions. This may be 
sufficient, but I doubt it:

     def is_a_bug(e):
         return isinstance(e, AssertionError)
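
One way the hierarchy idea could look at the application level, as a 
sketch only. ExternalError and UpstreamTimeout are hypothetical classes, 
and treating OSError as external is an assumption. Nothing like this 
exists in the standard library:

```python
class ExternalError(Exception):
    """Hypothetical base class for failures caused by the outside world."""


class UpstreamTimeout(ExternalError):
    """Hypothetical example of an external failure."""


def is_a_bug(e):
    # Inverts the test above: anything not explicitly marked as
    # external is treated as a bug. OSError is included on the
    # assumption that I/O failures are external.
    return not isinstance(e, (ExternalError, OSError))
```

Under this sketch AssertionError, KeyError and TypeError all count as 
bugs, which will misclassify code that raises them for bad external 
input. That is the same ambiguity described above for AttributeError.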

The reason I am polling python-ideas is that I can't be the only one who 
has ever encountered this deficiency in Python.

Rian

