Exception as the primary error handling mechanism?

Mon Jan 4 16:34:34 EST 2010

On Jan 4, 1:30 pm, Steven D'Aprano
<ste... at REMOVE.THIS.cybersource.com.au> wrote:
>
> This is very true, but good APIs often trade-off increased usability and
> reduced defect rate against machine efficiency too. In fact, I would
> argue that this is a general design principle of programming languages:
> since correctness and programmer productivity are almost always more
> important than machine efficiency, the long-term trend across virtually
> all languages is to increase correctness and productivity even if doing
> so costs some extra CPU cycles.

Yes, I agree with that in general. Correctness and productivity are
more important, as a rule, and should be given priority.

> > (For example, wrapper APIs often require additional
> > memory allocations and/or data copies.) Incorrect use of exceptions also
> > incurs an efficiency penalty.
>
> And? *Correct* use of exceptions also incur a penalty. So does the use of
> functions. Does this imply that putting code in functions is a poor API?
> Certainly not.

It does imply that incorrect use of exceptions incurs an unnecessary
performance penalty, no more, no less, just as incorrect use of
wrappers incurs an unnecessary performance penalty.

> But no matter how much more expensive, there will always be a cut-off
> point where it is cheaper on average to suffer the cost of handling an
> exception than it is to make unnecessary tests.
>
> In Python, for dictionary key access, that cut-off is approximately at
> one failure per ten or twenty attempts. So unless you expect more than
> one in ten attempts to lead to a failure, testing first is actually a
> pessimation, not an optimization.

What this really comes down to is how frequently or infrequently a
particular condition arises before that condition should be considered
an exceptional condition rather than a normal one. It also relates to
how the set of conditions partitions into "normal" conditions and
"abnormal" conditions. The difficulty for the API designer is to make
these choices correctly.
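
For reference, the two styles being compared for dictionary access look
roughly like this in Python. This is only a sketch: the function names are
made up for illustration, and it makes no claim about where the cut-off
actually lies, since that depends on how often the key is missing.

```python
def lookup_test_first(table, key, default=None):
    # Test first: pays for the membership test on every call,
    # whether or not the key is present.
    if key in table:
        return table[key]
    return default


def lookup_try_first(table, key, default=None):
    # Try first: no extra test when the key is present, but pays for
    # raising and catching KeyError whenever it is missing.
    try:
        return table[key]
    except KeyError:
        return default
```

Both functions return the same results; which one is cheaper on average
depends on whether a missing key is the normal case or the rare one for
that particular caller.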

> In some, limited, cases you might be able to use the magic return value
> strategy, but this invariably leads to lost programmer productivity, more
> complex code, lowered readability and usability, and more defects,
> because programmers will invariably neglect to test for the special value:

I disagree here, to the extent that whether something is an error or
not can very much depend on the circumstances in which the API is
used. The collection case is a very typical example. Whether failing
to locate a value in a collection is an error very much depends on
what the collection is used for. In some cases, it's a hard error
(because it might, for example, imply that internal program state has
been corrupted); in other cases, not finding a value is perfectly
normal.

For the API designer, the problem is that an API that throws an
exception when it should not sucks just as much as an API that doesn't
throw an exception when it should. For general-purpose APIs, such as a
collection API, I usually cannot know which case applies. As I said
elsewhere in the article, general-purpose APIs should be policy-free,
and special-purpose APIs should be policy-rich. As the designer, the
more I know about the circumstances in which the API will be used, the
more fascist I can be in the design and bolt down the API more in
terms of static and run-time safety.

Wanting to ignore a return value from a function is perfectly normal
and legitimate in many cases. However, if a function throws instead of
returning a value, ignoring that value becomes more difficult for the
caller and can exact a performance penalty that may be unacceptable
to the caller. The problem really is that, at the time the API is
designed, there often is no way to tell whether this will actually be
the case; in turn, no matter whether I choose to throw an exception or
return an error code, it will be wrong for some people some of the
time.
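
Python's built-in set type happens to illustrate the difference, because
it offers both behaviours for the same operation. A small sketch (the
variable name and values are arbitrary):

```python
seen = {"alpha", "beta"}

# remove() treats a missing element as exceptional, so a caller who
# does not care still has to write a handler just to ignore it.
try:
    seen.remove("gamma")
except KeyError:
    pass

# discard() leaves the decision to the caller: a missing element is
# simply not an event, and there is nothing to ignore.
seen.discard("gamma")
```

Neither method is wrong; which one fits depends on whether a missing
element means something to the code that is calling it.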

> This is a classic example of premature optimization. Unless such
> inefficiency can be demonstrated to actually matter, then you do nobody
> any favours by preferring the API that leads to more defects on the basis
> of *assumed* efficiency.

I agree with the concern about premature optimisation. However, I
don't agree with a blanket statement that special return values always
and unconditionally lead to more defects. Returning to the .NET non-
blocking I/O example, the fact that the API throws an exception when
it shouldn't very much complicates the code and introduces a lot of
extra control logic that is much more likely to be wrong than a simple
if-then-else statement. As I said, throwing an exception when none
should be thrown can be just as harmful as the opposite case.

> It doesn't matter whether it is an error or not. They are called
> EXCEPTIONS, not ERRORS. What matters is that it is an exceptional case.
> Whether that exceptional case is an error condition or not is dependent
> on the application.

Exactly. To me, that implies that making something an exception that,
to the caller, shouldn't be is just as inconvenient as the other way
around.

> >  * Is it appropriate to force the caller to deal with the condition in
> > a catch-handler?
>
> >  * If the caller fails to explicitly deal with the condition, is it
> > appropriate to terminate the program?
>
> > Only if the answer to these questions is "yes" is it appropriate to
> > throw an exception. Note the third question, which is often forgotten.
> > By throwing an exception, I not only force the caller to handle the
> > exception with a catch-handler (as opposed to leaving the choice to the
> > caller), I also force the caller to *always* handle the exception: if
> > the caller wants to ignore the condition, he/she still has to write a
> > catch-handler and failure to do so terminates the program.
>
> That's a feature of exceptions, not a problem.

Yes, and I didn't say that it is a problem. However, making the wrong
choice for the use of the feature is a problem, just as making the
wrong choice for not using the feature is.

> > Apart from the potential performance penalty, throwing exceptions for
> > expected outcomes is bad also because it forces a try-catch block on the
> > caller.
>
> But it's okay to force a `if (result==MagicValue)` test instead?

Yes, in some cases it is. For example:

int numBytes;
int fd = open(...);
while ((numBytes = read(fd, ...)) > 0)
{
    // process data...
}

Would you prefer to see EOF indicated by an exception rather than a
zero return value? I wouldn't.

> Look, the caller has to deal with exceptional cases (which may include
> error conditions) one way or the other. If you don't deal with them at
> all, your code will core dump, or behave incorrectly, or something. If
> the caller fails to deal with the exceptional case, it is better to cause
> an exception that terminates the application immediately than it is to
> allow the application to generate incorrect results.

I agree that failing to deal with exceptional cases causes problems. I
also agree that exceptions, in general, are better than error codes
because they are less likely to go unnoticed. But, as I said, it
really depends on the caller whether something should be an exception
or not.

The core problem isn't whether exceptions are good or bad in a
particular case, but that most APIs make this an either-or choice. For
example, if I had an API that allowed me to choose at run time whether
an exception will be thrown for a particular condition, I could adapt
that API to my needs, instead of being stuck with whatever the
designer came up with.

There are many ways this could be done. For example, I could have a
find() operation on a collection that throws if a value isn't found,
and I could have findNoThrow() if I want a sentinel value returned.
Or, the API could offer a callback hook that decides at run time
whether to throw or not. (There are many other possible ways to do
this, such as setting the behaviour at construction time, or by having
different collection types with different behaviours.)
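
To make the first two options concrete, here is a minimal sketch in
Python. Everything in it is hypothetical: the Registry class, the
find_no_throw() spelling of findNoThrow(), and the on_missing hook are
invented for illustration and are not taken from any real library.

```python
class Registry:
    """Hypothetical collection; all names here are illustrative only."""

    def __init__(self, items=None, on_missing=None):
        self._items = dict(items or {})
        # Optional hook, called with the missing key. It may raise its
        # own exception or return a substitute value.
        self._on_missing = on_missing

    def find(self, key):
        # Throwing variant: a missing key is exceptional unless the
        # caller installed a hook that says otherwise.
        if key in self._items:
            return self._items[key]
        if self._on_missing is not None:
            return self._on_missing(key)
        raise KeyError(key)

    def find_no_throw(self, key, sentinel=None):
        # Non-throwing variant: a missing key yields the sentinel.
        return self._items.get(key, sentinel)


strict = Registry({"a": 1})
lenient = Registry({"a": 1}, on_missing=lambda key: 0)
```

With these definitions, strict.find("b") raises KeyError,
strict.find_no_throw("b") returns None, and lenient.find("b") returns 0.
Python's own dict already provides the first pair: d[key] raises KeyError,
while d.get(key, default) returns the default.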

The point is that a more flexible API is likely to be more useful than
one that sets a single exception policy for everyone.

> > As the API
> > creator, if I indicate errors with exceptions, I make a policy decision
> > about what is an error and what is not. It behooves me to be
> > conservative in that policy: I should throw exceptions only for
> > conditions that are unlikely to arise during routine and normal use of
> > the API.
>
> But lost connections *are* routine and normal. Hopefully they are rare.

In the context of my example, they are not. The range of behaviours
naturally falls into these categories:

* No data ready
* Data ready
* EOF
* Socket error

The first three cases are the "normal" ones; they operate on the same
program state and they are completely expected: while reading a
message off the wire, the program will almost certainly encounter the
first two conditions and, if there is no error, it will always
encounter the EOF condition. The fourth case is the unexpected one, in
the sense that this case will often not arise at all. That's not to
say that lost connections aren't routine; they are. But, when a
connection is lost, the program has to do different things and operate
on different state than when the connection stays up. This strongly
suggests that the first three conditions should be dealt with by
return values and/or out parameters, and the fourth condition should
be dealt with as an exception.
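
As a rough sketch of that split in Python: socket.recv() on a
non-blocking socket raises BlockingIOError when no data is ready and
returns an empty bytes object at EOF. The wrapper below (try_read and
ReadStatus are hypothetical names, not part of the socket module) turns
the first three conditions into return values and lets any other socket
error propagate as an exception.

```python
import enum
import socket


class ReadStatus(enum.Enum):
    NO_DATA = "no data ready"
    DATA = "data ready"
    EOF = "end of stream"


def try_read(sock, max_bytes=4096):
    """Hypothetical wrapper; expects a socket set to non-blocking mode."""
    try:
        data = sock.recv(max_bytes)
    except BlockingIOError:
        # Normal condition: nothing has arrived yet.
        return ReadStatus.NO_DATA, b""
    if not data:
        # Normal condition: the peer closed the connection cleanly.
        return ReadStatus.EOF, b""
    # Normal condition: some bytes are available.
    return ReadStatus.DATA, data
    # Anything else (for example ConnectionResetError) is deliberately
    # not caught here and reaches the caller as an exception.
```

The caller can then handle the three expected outcomes with an ordinary
if-elif chain and reserve its except clause for the lost-connection case.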

Cheers,

Michi.