[Tutor] properly propagate problems
Cameron Simpson
cs at cskk.id.au
Sat Mar 23 19:03:18 EDT 2019
On 23Mar2019 11:04, ingo janssen <ingoogni at gmail.com> wrote:
>One thing I often struggle with is how to deal with exceptions,
>especially when I have a chain of functions that use each others
>output and/or long running processes. As the answer will probably be
>"it depends"
Oh yes!
The core rule of thumb is "don't catch an exception which you don't know
how to handle", but that is for truly unexpected errors not envisaged by
the programmer. Then your programme aborts with a debugging stack trace.
Your situation below is more nuanced. Discussion below.
>take for example this program flow:
>
>open a file and read into BytesIO buffer
>get a FTP connection from pool
>send buffer to plantuml.jar in memory FTP server
>render file to image
>get image from FTP server
>push the image onto CherryPy bus
>push (SSE) the image to web browser
>
>def read_file(input_file):
>    try:
>        with open(input_file, 'rb') as f:
>            buffer = io.BytesIO(f.read())
>    except FileNotFoundError as e:
>        print(e)
>        ....
>    return buffer
>
>assume the file is not found, I cannot just kill the whole process.
>Catching the exception is one thing, but how to deal with it properly,
>I have to inform the client somehow what went wrong.
Given a function like that, I would be inclined to do one of 2 things:
A) don't make a policy decision (catching the exception) this close to
the failure, instead let the exception out and let the caller handle it:
  def read_file(input_file):
      with open(input_file, 'rb') as f:
          return io.BytesIO(f.read())

  filename = "foo"
  try:
      buffer = read_file(filename)
  except OSError as e:
      error("could not load %r: %s", filename, e)
      ... failure action, maybe return from the function ...
  ... proceed with buffer ...
This leaves the policy decision with the calling code, which may have a
better idea about what is suitable. For example, you might pass some
useful response to your web client here. The low level function
read_file() doesn't know that it is part of a web service.
The handy thing about exceptions is that you can push that policy
decision quite a long way out. Provided the outer layer where you decide
to catch the exception knows that this involved accessing a file you can
put that try/except quite a long way out and still produce a sensible
looking error response.
Also, the further out the policy try/except lives, the simpler the inner
functions can be because they don't need to handle failure - they can be
written for success provided that failures raise exceptions, making them
_much_ simpler and easier to maintain. And with far fewer policy
decisions!
The flip side to this is that there is a limit to how far out in the
call chain this try/except can sensibly happen: if you're far enough out
that the catching code _doesn't_ know that there was a file read
involved, the error message becomes more vague (although you still have
the exception instance itself with the low level detail).
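To make that layering concrete, here is a small sketch of option (A). The names load_image and handle_request are hypothetical, just to stand in for intermediate and outer layers; the point is that only the outermost function contains any error handling:

```python
import io

def read_file(input_file):
    # Mechanism: written purely for success; any failure raises OSError.
    with open(input_file, 'rb') as f:
        return io.BytesIO(f.read())

def load_image(filename):
    # Intermediate layer: no try/except at all, exceptions pass through.
    return read_file(filename)

def handle_request(filename):
    # Policy layer, far from the open(): decide here what failure means.
    try:
        buffer = load_image(filename)
    except OSError as e:
        return "could not load %r: %s" % (filename, e)
    return "loaded %d bytes" % len(buffer.getvalue())
```

Note that handle_request still knows a file read was involved, so it can produce a sensible message while the inner functions stay trivially simple.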
B) to return None on failure:
  def read_file(input_file):
      try:
          with open(input_file, 'rb') as f:
              return io.BytesIO(f.read())
      except OSError as e:
          error(
              "read_file(%r): could not read input file: %s",
              input_file, e)
          return None
None is a useful sentinel value for failure. Note that sometimes you
will want something else if None is a meaningful return value in
ordinary circumstances. Then your calling code can handle this without
exceptions:
  buffer = read_file("foo")
  if buffer is None:
      ... return nice message to web client ...
  else:
      ... process the image ...
However, it does mean that this handling has to happen right at the call
to read_file. That can be fine, but might be inconvenient.
Finally, some related points:
I find it useful to distinguish "mechanism" and "policy". In my ideal
world a programme is at least 90% mechanism with a thin layer of policy
outside it. Here "policy" is what might be termed "business logic" or
"application logic" in some circumstances: what to do to achieve the
high level goal. The high level is where you decide how to behave in
various circumstances.
This has a few advantages: almost all low level code is mechanism: it
has a well defined, usually simple, purpose. By having almost all
failures raise an exception you can make the low level functions very
simple: do A then B then C until success, where you return the result;
raise exceptions when things go wrong (failure to open files, invalid
input parameters, what have you). This produces what I tend to call
"white list" code: code which only returns a result when all the
required operations succeed.
This is option (A) above, and makes for very simple inner functions.
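A toy illustration of that "white list" shape (the function and its checks are invented for the example): validate, then go straight for the result, with every failure raising instead of returning something half-meaningful.

```python
def scaled_pixel_count(width, height, scale):
    # "White list" code: only the fully successful path returns a
    # value; invalid input raises immediately.
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if scale <= 0:
        raise ValueError("scale must be positive")
    return int(width * scale) * int(height * scale)
```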
For option (B) "return None on failure", this is where we decide that
specific failures are in fact valid execution paths, and None is a valid
function return, indicating some kind of null result. You might still
raise exceptions of various types for invalid input in this case; the
None is only for a well defined expected non-answer.
Regarding uncaught exceptions:
As you say, you don't want your whole app to abort. So while you may
catch specific exception types at some inner layer, you might want to
catch _all_ exceptions at the very outermost layer and log them (with a
stack trace), but not abort. So:
  try:
      ... process client request ...
  except Exception as e:
      # log exception and stack trace to the application log
      error("handler failed: %s", e, exc_info=True)
      ... return 500 series web response to client here ...
This is one of those situations where you might use the normally reviled
"catch all exceptions" anti-pattern: at the outermost layer of some kind
of service programme such as a daemon or web app handling requests:
report the exception and carry on with the application. Remember the
Zen: errors should not pass silently. Always log something when you
catch an exception.
Note that a primary reason to hate "catch all" is that such code often
then proceeds to do more work with the bogus results. In a daemon or a
web app, you're aborting _that request_. Any further work is shiny and
new from a new request, not continuing with nonsensical data left around
by a catch-all.
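A rough sketch of that outermost loop for a hypothetical service (the serve and handle names are made up for illustration): each request gets its own try/except, a failure is logged with its stack trace, and the loop then moves on to the next request with fresh state.

```python
import logging

def serve(requests, handle):
    # Outermost layer of a long running service: catch everything,
    # log it with a stack trace, abort only _that request_.
    results = []
    for request in requests:
        try:
            results.append(handle(request))
        except Exception:
            # logging.exception records the current stack trace.
            logging.exception("request %r failed", request)
            results.append(None)  # stand-in for a 500-style response
    return results
```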
Fortunately web frameworks like Flask or CherryPy usually embed such a
catch-everything in their handler logic, outside your own code (after
all, what if your own catch-everything was buggy?) So you don't normally
need to write one of these things yourself. Which is good really, most
of the time - they are a recipe for accidentally hiding errors. Let the
framework do that one - it has been debugged for you.
Another issue is the distinction between what to log and what to show
the client. You usually DO NOT want to let the nitty gritty of the
exception get to the end user: that way lies accidental leaking of
credentials or private implementation details. So log details, but
return fairly bland information to the client. Try to write your code so
that this is the default behaviour. Again, web frameworks generally do
just this in their outermost catch-all handler: only if you turn on some
kind of DEBUG mode does it splurge private stuff over the web page for
ease of debugging in development.
Finally, I'm sure you've thought to yourself: if I catch an exception a
long way from where it happened, won't the exception message lack all
sorts of useful context about what happened? How useful is a log entry
like this (from the outermost "OCR the document" level):
  error("OCR failed: %s", e)
producing:
  OCR failed: permission denied
because of a permission issue on a specific (but here, unnamed) file?
My own solution to this issue is my cs.pfx module (you can install this
with "pip install cs.pfx").
This provides a context manager named Pfx which adorns exceptions with
call stack information, totally under your control. It also has various
.error and .warning etc methods which produce prefixed log messages.
Example:
  from cs.pfx import Pfx

  def read_file(input_file):
      with Pfx("read_file(%r)", input_file):
          with open(input_file, 'rb') as f:
              return io.BytesIO(f.read())
and outer calls might look like:
  def produce_image(image_name):
      with Pfx("produce_image(%r)", image_name):
          filename = path_to_image_file(image_name)
          buffer = read_file(filename)
          ... do stuff with the buffer ...
If the inner open fails, the exception message, which is originally like
this:
  [Errno 2] No such file or directory: 'fffff'
becomes:
  produce_image('image_name'): read_file("/path/to/image_name.png"): [Errno 2] No such file or directory: '/path/to/image_name.png'
How much context you get depends on where you put the "with Pfx(...):"
statements.
It also furthers simple code, because you no longer need to pepper your
own exceptions with annoying repetitive context, just the core message:
  def read_file(input_file):
      with Pfx("read_file(%r)", input_file):
          if not input_file.startswith('/'):
              raise ValueError("must be an absolute path")
          with open(input_file, 'rb') as f:
              return io.BytesIO(f.read())
Because of the Pfx the ValueError gets the input_file value in question
prefixed automatically, so you don't need to include it in your raise
statement.
Hoping all this helps.
Short takeaway: decide what's mechanism and what is policy, and try to
put policy further out in higher level code.
Cheers,
Cameron Simpson <cs at cskk.id.au>
Go not to the elves for counsel, for they will say both no and yes.
- Frodo, The Fellowship of the Ring