Exception as the primary error handling mechanism?

Sun Jan 3 22:30:42 EST 2010

On Sun, 03 Jan 2010 13:44:29 -0800, Michi wrote:

> The quoted sentence appears in a section of the article that deals with
> efficiency. I point out in that section that bad APIs often have a price
> not just in terms of usability and defect rate, but that they are often
> inefficient as well.

This is very true, but good APIs often trade off increased usability and 
reduced defect rate against machine efficiency too. In fact, I would 
argue that this is a general design principle of programming languages: 
since correctness and programmer productivity are almost always more 
important than machine efficiency, the long-term trend across virtually 
all languages is to increase correctness and productivity even if doing 
so costs some extra CPU cycles.

> (For example, wrapper APIs often require additional
> memory allocations and/or data copies.) Incorrect use of exceptions also
> incurs an efficiency penalty.

And? *Correct* use of exceptions also incurs a penalty. So does the use of 
functions. Does this imply that putting code in functions is a poor API? 
Certainly not.

> In many language implementations, exception handling is expensive;
> significantly more expensive than testing a return value.

And in some it is less expensive.

But no matter how much more expensive, there will always be a cut-off 
point where it is cheaper on average to suffer the cost of handling an 
exception than it is to make unnecessary tests.

In Python, for dictionary key access, that cut-off is approximately at 
one failure per ten or twenty attempts. So unless you expect more than 
one in ten attempts to lead to a failure, testing first is actually a 
pessimation, not an optimization.
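
One rough way to see where that cut-off falls on a particular machine is to 
time both styles with the standard timeit module. The following is only a 
sketch: the helper names (lookup_lbyl, lookup_eafp, make_keys, 
time_strategy), the dictionary size and the miss rates are assumptions made 
for illustration, and the timings will differ between Python versions and 
hardware.

```python
import timeit

DATA = {i: i for i in range(1000)}  # keys 0..999 are present

def lookup_lbyl(d, key, default=None):
    # Test first, then fetch: every call pays for the membership test.
    if key in d:
        return d[key]
    return default

def lookup_eafp(d, key, default=None):
    # Fetch directly and handle the miss only when it happens.
    try:
        return d[key]
    except KeyError:
        return default

def make_keys(misses_per_hundred):
    # Hypothetical workload: of every 100 keys, the first
    # `misses_per_hundred` are negative and therefore absent from DATA.
    return [-1 - i if i < misses_per_hundred else i for i in range(100)]

def time_strategy(func, keys, repeat=500):
    # Total seconds for `repeat` passes over the whole key list.
    return timeit.timeit(lambda: [func(DATA, k) for k in keys], number=repeat)

if __name__ == "__main__":
    for misses in (0, 1, 5, 10, 20, 50):
        keys = make_keys(misses)
        print(misses,
              time_strategy(lookup_lbyl, keys),
              time_strategy(lookup_eafp, keys))
```

Each output row shows the misses per hundred lookups followed by the two 
timings; the cut-off is wherever the second timing overtakes the first.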

> Consider the following:
> 
> int x;
> try {
>     x = func();
> } catch (SomeException) {
>    doSomething();
>    return;
> }
> doSomethingElse();
> 
> Here is the alternative without exceptions. (func() returns SpecialValue
> instead of throwing.)
> 
> int x;
> x = func();
> if (x == SpecialValue) {
>     doSomething();
>     return;
> }
> doSomethingElse();

In some limited cases you might be able to use the magic return value 
strategy, but this leads to lost programmer productivity, more complex 
code, lowered readability and usability, and more defects, because 
programmers will invariably neglect to test for the special value:

int x;
x = func();
doSomething(x);
return;

Or worse, they will write doSomething() so that it too needs to know 
about SpecialValue, and so do all the functions it calls. Instead of 
dealing with the failure in one place, you can end up having to deal with 
it in a dozen places.

But even worse is the common case that SpecialValue is a legal value when 
passed to doSomething, and you end up with the error propagating deep 
into the application before being found. Or even worse, it is never found 
at all, and the application simply does the wrong thing.
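
Python's own string methods illustrate this. str.find returns -1 when the 
substring is missing, and -1 is also a legal index, whereas str.index raises 
ValueError. The two wrapper functions below are hypothetical names used only 
to put the two behaviours side by side.

```python
def char_at_find(text, needle):
    # str.find returns -1 on a miss. Since -1 is a valid index, the
    # failure silently becomes "the last character of text".
    return text[text.find(needle)]

def char_at_index(text, needle):
    # str.index raises ValueError on a miss, so the failure cannot be
    # mistaken for a result.
    return text[text.index(needle)]

if __name__ == "__main__":
    print(char_at_find("spam", "z"))    # prints "m": wrong, and no complaint
    try:
        char_at_index("spam", "z")
    except ValueError as err:
        print("caught:", err)
```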

> In many language implementations, the second version is considerably
> faster, especially when the exception may be thrown from deep in the
> bowels of func(), possibly many frames down the call tree.

This is a classic example of premature optimization. Unless such 
inefficiency can be demonstrated to actually matter, you do nobody any 
favours by preferring the API that leads to more defects on the basis of 
*assumed* efficiency.

If your test for a special value is 100 times faster than handling the 
exception, and exceptions occur only one time in 1000, then using a 
strategy of testing for a special value is actually ten times slower on 
average than catching an exception.
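
The arithmetic can be checked with a small cost model. This is a sketch 
under stated assumptions: costs are in arbitrary units, entering a try block 
is treated as free, and the function names are made up for the example.

```python
from fractions import Fraction

def average_cost_testing(test_cost):
    # Every call pays for the test, whether or not anything went wrong.
    return Fraction(test_cost)

def average_cost_exceptions(handle_cost, failure_rate):
    # Only failing calls pay (assumes entering a try block costs nothing).
    return Fraction(handle_cost) * failure_rate

test_cost = 1                       # arbitrary unit
handle_cost = 100 * test_cost       # handling is 100 times more expensive
failure_rate = Fraction(1, 1000)    # one failure per 1000 calls

ratio = (average_cost_testing(test_cost)
         / average_cost_exceptions(handle_cost, failure_rate))
print(ratio)  # 10
```

Under the same assumptions the two strategies break even when the failure 
rate equals test_cost / handle_cost, which is 1 in 100 for these figures.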

> If func() throws an exception for something that routinely occurs in the
> normal use of the API, the extra cost can be noticeable. 

"Can be". But it also might not be noticeable at all.

[...]
> Here is an example of this:
> 
> KeyType k = ...;
> ValueType v;
> 
> try {
>    v = collection.lookup(k);
> } catch (NotFoundException) {
>    collection.add(k, defaultValue);
>    v = defaultValue;
> }
> doSomethingWithValue(v);
> 
> The same code if collection doesn't throw when I look up something that
> isn't there:
> 
> KeyType k = ...;
> ValueType v;
> 
> v = collection.lookup(k);
> if (v == null) {
>     collection.add(k, defaultValue);
>     v = defaultValue;
> }
> doSomethingWithValue(v);
> 
> The problem is that, if I do something like this in a loop, and the loop
> is performance-critical, the exception version can cause a significant
> penalty.

No, the real problems are:

(1) The caller has to remember to check the return result for the magic 
value. Failure to do so leads to bugs, in some cases serious and 
hard-to-find bugs.

(2) If missing keys are rare enough, the cost of all those unnecessary 
tests will outweigh the saving from not having to catch the exception. 
"Rare enough" may still be very common: in the case of Python, the 
cross-over point is approximately 1 time in 15.

(3) Your collection now cannot use the magic value as a legitimate value.

This last one can be *very* problematic. In the early 1990s, I was 
programming using a callback API that could only return an integer. The 
standard way of indicating an error was to return -1. But what happens if 
-1 is a legitimate return value, e.g. for a maths function? The solution 
used was to have the function create a global variable holding a flag:

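result = function(args)
if result == -1:
    # -1 is ambiguous, so consult the global flag to tell a real error
    # apart from a legitimate result of -1.
    if globalErrorState == -1:
        print("An error occurred")
        raise SystemExit(1)
doSomething(result)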

That is simply horrible.
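
The same collision is easy to reproduce with Python's dict.get, which 
returns None for a missing key unless told otherwise. The settings 
dictionary and the lookup helper below are hypothetical, chosen only to show 
the ambiguity and how raising removes it.

```python
settings = {"timeout": None}   # None is a legitimate stored value here

# Magic-value style: a present key holding None and a missing key
# look exactly the same to the caller.
print(settings.get("timeout") is None)   # True
print(settings.get("retries") is None)   # True

def lookup(mapping, key):
    # Exception style: a miss raises KeyError, so every value,
    # including None, remains usable as data.
    return mapping[key]

print(lookup(settings, "timeout") is None)   # True, and known to be stored
try:
    lookup(settings, "retries")
except KeyError as err:
    print("missing key:", err)
```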

> As the API designer, when I make the choice between returning a special
> value to indicate some condition, or throwing an exception, I should
> consider the following questions:
> 
>  * Is the special condition such that, under most conceivable
> circumstances, the caller will treat the condition as an unexpected
> error?

Wrong.

It doesn't matter whether it is an error or not. They are called 
EXCEPTIONS, not ERRORS. What matters is that it is an exceptional case. 
Whether that exceptional case is an error condition or not is dependent 
on the application.
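
Python's iterator protocol is a convenient illustration: next() raises 
StopIteration when an iterator is exhausted. In the sketch below, 
count_items and first_item are hypothetical helpers; the first treats the 
exception as the normal end of data, the second treats the very same 
exception as an error.

```python
def count_items(iterable):
    # Here exhaustion is exceptional (it happens once per iterator)
    # but it is not an error: it just means the data has run out.
    iterator = iter(iterable)
    count = 0
    while True:
        try:
            next(iterator)
        except StopIteration:
            return count
        count += 1

def first_item(iterable):
    # Here the same exception *is* an error, because this caller
    # requires at least one item.
    try:
        return next(iter(iterable))
    except StopIteration:
        raise ValueError("expected at least one item") from None

print(count_items("abc"))    # 3
print(first_item("abc"))     # a
```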

>  * Is it appropriate to force the caller to deal with the condition in
> a catch-handler?
> 
>  * If the caller fails to explicitly deal with the condition, is it
> appropriate to terminate the program?
> 
> Only if the answer to these questions is "yes" is it appropriate to
> throw an exception. Note the third question, which is often forgotten.
> By throwing an exception, I not only force the caller to handle the
> exception with a catch-handler (as opposed to leaving the choice to the
> caller), I also force the caller to *always* handle the exception: if
> the caller wants to ignore the condition, he/she still has to write a
> catch-handler and failure to do so terminates the program.

That's a feature of exceptions, not a problem.

> Apart from the potential performance penalty, throwing exceptions for
> expected outcomes is bad also because it forces a try-catch block on the
> caller. 

But it's okay to force an `if (result==MagicValue)` test instead?

Look, the caller has to deal with exceptional cases (which may include 
error conditions) one way or the other. If you don't deal with them at 
all, your code will core dump, or behave incorrectly, or something. If 
the caller fails to deal with the exceptional case, it is better to cause 
an exception that terminates the application immediately than it is to 
allow the application to generate incorrect results.

> One example of this is the .NET socket API: if I do non-
> blocking I/O on a socket, I get an exception if no data is ready for
> reading (which is the common and expected case), and I get a zero return
> value if the connection was lost (which is the uncommon and unexpected
> case).
> 
> In other words, the .NET API gets this completely the wrong way round.

Well, we can agree on that!

> If the API did what it should, namely, throw an exception when the
> connection is lost, and not throw when I do a read (whether data was
> ready or not), the code would be far simpler and far more maintainable.
> 
> At no point did I ever advocate not to use exception handling.
> Exceptions are the correct mechanism to handle errors. However, what is
> considered an error is very much in the eye of the beholder. As the API
> creator, if I indicate errors with exceptions, I make a policy decision
> about what is an error and what is not. It behooves me to be
> conservative in that policy: I should throw exceptions only for
> conditions that are unlikely to arise during routine and normal use of
> the API.

But lost connections *are* routine and normal. Hopefully they are rare.

-- 
Steven