Exception as the primary error handling mechanism?

Mon Jan 4 18:44:09 EST 2010

On Mon, 04 Jan 2010 13:34:34 -0800, Michi wrote:

> On Jan 4, 1:30 pm, Steven D'Aprano
> <ste... at REMOVE.THIS.cybersource.com.au> wrote:
>>
>> This is very true, but good APIs often trade-off increased usability
>> and reduced defect rate against machine efficiency too. In fact, I
>> would argue that this is a general design principle of programming
>> languages: since correctness and programmer productivity are almost
>> always more important than machine efficiency, the long-term trend
>> across virtually all languages is to increase correctness and
>> productivity even if doing so costs some extra CPU cycles.
> 
> Yes, I agree with that in general. Correctness and productivity are more
> important, as a rule, and should be given priority.

I'm glad we agree on that, but I wonder why your previous post emphasised 
machine efficiency so much, and correctness almost not at all?

>> > (For example, wrapper APIs often require additional memory
>> > allocations and/or data copies.) Incorrect use of exceptions also
>> > incurs an efficiency penalty.
>>
>> And? *Correct* use of exceptions also incur a penalty. So does the use
>> of functions. Does this imply that putting code in functions is a poor
>> API? Certainly not.
> 
> It does imply that incorrect use of exceptions incurs an unnecessary
> performance penalty, no more, no less, just as incorrect use of wrappers
> incurs an unnecessary performance penalty.

If your whole argument is that we shouldn't write crappy APIs, then I 
agree with you completely. The .NET example you gave previously is a good 
example of an API that is simply poor: using exceptions isn't a panacea 
that magically makes code better. So I can't disagree that using 
exceptions badly incurs an unnecessary performance penalty, but it also 
incurs an unnecessary penalty against correctness and programmer 
productivity.

>> But no matter how much more expensive, there will always be a cut-off
>> point where it is cheaper on average to suffer the cost of handling an
>> exception than it is to make unnecessary tests.
>>
>> In Python, for dictionary key access, that cut-off is approximately at
>> one failure per ten or twenty attempts. So unless you expect more than
>> one in ten attempts to lead to a failure, testing first is actually a
>> pessimation, not an optimization.
> 
> What this really comes down to is how frequently or infrequently a
> particular condition arises before that condition should be considered
> an exceptional condition rather than a normal one. It also relates to
> how the set of conditions partitions into "normal" conditions and
> "abnormal" conditions. The difficulty for the API designer is to make
> these choices correctly.

The first case is impossible for the API designer to predict, although 
she may be able to make some educated estimates based on experience. For 
instance I know that when I search a string for a substring, "on average" 
I expect to find the substring present more often than not. I've put "on 
average" in scare-quotes because it's not a statistical average at all, 
but a human expectation -- a prejudice in fact. I *expect* to have 
searching succeed more often than fail, not because I actually know how 
many searches succeed and fail, but because I think of searching for an 
item to "naturally" find the item. But if I actually profiled my code in 
use on real data, who knows what ratio of success/failure I would find?

In the second case, the decision of what counts as "ordinary" and what 
counts as "exceptional" should, in general, be rather obvious. (That's 
not to discount the possibility of unobvious cases, but those are probably 
a sign that the function is too complex and tries to do too much.) Take 
the simplest description of what the function is supposed to do (e.g. 
"find the offset of a substring in a source string"). That's the ordinary 
case, and its result should be returned. Is there anything else that the 
function may do (e.g. fail to find the substring because it isn't there)? 
Then that's an exceptional case.

(There may be other exceptional cases, which is another reason to prefer 
exceptions to magic return values. In general, it's much easier to deal 
with multiple exception types than it is to test for multiple magic 
return values. Consider a function that returns a pointer. You can return 
null to indicate an error. What if you want to distinguish between two 
different error states? What about ten error states?)
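As a rough sketch of that point (parse_percentage and the exception classes below are names invented for this example, not part of any real library), three error states get three exception types, and the caller picks how finely to distinguish them:

```python
class ParseError(Exception):
    """Base class for the hypothetical errors below."""

class EmptyInput(ParseError):
    pass

class NotANumber(ParseError):
    pass

class OutOfRange(ParseError):
    pass

def parse_percentage(text):
    """Return text as an int in 0..100, or raise a specific ParseError."""
    stripped = text.strip()
    if not stripped:
        raise EmptyInput("no input given")
    try:
        value = int(stripped)
    except ValueError:
        raise NotANumber("not an integer: %r" % stripped)
    if not 0 <= value <= 100:
        raise OutOfRange("expected 0..100, got %d" % value)
    return value

def describe(text):
    # The caller singles out one error state and lumps the rest together.
    try:
        return "ok: %d" % parse_percentage(text)
    except EmptyInput:
        return "please type something"
    except ParseError as err:
        return "bad input (%s)" % err
```

Doing the same with a single magic return value would need either several reserved values or a second channel for the error code.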

I argue that as designers, we should default to raising an exception and 
only choose otherwise if there is a good reason not to. As we agreed 
earlier, exceptions (in general) are better for correctness and 
productivity, which in turn are (in general) more important than machine 
efficiency. The implication of this is that in general, we should prefer 
exceptions, and only avoid them when necessary. Your argument seems to be 
that we should avoid exceptions by default, and only use them if 
unavoidable. I think that is backwards.

>> In some, limited, cases you might be able to use the magic return value
>> strategy, but this invariably leads to lost programmer productivity,
>> more complex code, lowered readability and usability, and more defects,
>> because programmers will invariably neglect to test for the special
>> value:
> 
> I disagree here, to the extent that, whether something is an error or
> not can very much depend on the circumstances in which the API is used.

That's certainly true: a missing key (for example) may be an error, or a 
present key may be an error, or neither may be an error, just different 
branches of an algorithm. That's an application-specific decision. But I 
don't see how that relates to my claim that magic return values are less 
robust and usable than exceptions. Whether it is an error or not, it 
still needs to be handled. If the caller neglects to handle the special 
case, an exception-based strategy will almost certainly lead to the 
application halting (hopefully leading to a harmless bug report rather 
than the crash of a billion-dollar space probe), but a magic return value 
will very often lead to the application silently generating invalid 
results.
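A small illustration using Python's own str.find (which returns -1 on failure) and str.index (which raises ValueError). The two helper functions are hypothetical; both "forget" to handle the missing-separator case:

```python
def key_with_find(line):
    # str.find returns -1 when ":" is missing; the caller forgot to check,
    # so the slice quietly becomes line[:-1].
    return line[:line.find(":")]

def key_with_index(line):
    # str.index raises ValueError when ":" is missing, so the same
    # oversight cannot go unnoticed.
    return line[:line.index(":")]

print(key_with_find("name: Steven"))   # name
print(key_with_find("no separator"))   # no separato  (silently wrong)
```

Calling key_with_index("no separator") raises ValueError instead of returning a plausible-looking but wrong string.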

[...]
> Wanting to ignore a return value from a function is perfectly normal and
> legitimate in many cases. 

I wouldn't say that's normal. If you don't care about the function's 
result, why are you calling it? For the side-effects? In languages that 
support procedures, such mutator functions should be written as 
procedures that don't return anything. For languages that don't, like 
Python, they should be written as de-facto procedures, always return 
None, and allow the user to pretend that nothing was returned.

That is to say, ignoring the return value is acceptable as a work-around 
for the lack of true procedures. But even there, procedures necessarily 
operate by side-effect, and side-effects should be avoided as much as 
possible. So I would say, ideally, wanting to ignore the return value 
should be exceptionally rare.
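For instance, list.sort() is such a de-facto procedure, while sorted() is the function form. The append_total helper is a made-up example of writing one yourself:

```python
def append_total(rows):
    """Hypothetical mutator: changes rows in place and returns None."""
    rows.append(sum(rows))

data = [3, 1, 2]
result = data.sort()        # list.sort() works by side-effect...
assert result is None       # ...and returns None, like a procedure
append_total(data)
assert data == [1, 2, 3, 6]

# The function form leaves its argument alone and returns a new list.
assert sorted([3, 1, 2]) == [1, 2, 3]
```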

> However, if a function throws instead of
> returning a value, ignoring that value becomes more difficult for the
> caller and can extract a performance penalty that may be unacceptable to
> the caller. 

There's that premature micro-optimization again.

> The problem really is that, at the time the API is designed,
> there often is no way to tell whether this will actually be the case; in
> turn, no matter whether I choose to throw an exception or return an
> error code, it will be wrong for some people some of the time.

I've been wondering when you would reach the conclusion that an API 
should offer both forms. For example, Python offers both key-lookup that 
raises exceptions (dict[key]) and key-lookup that doesn't (dict.get(key)).

The danger of this is that it complicates the API, leads to a more 
complex implementation, and may result in duplicated code (if the two 
functions have independent implementations). But if you don't duplicate 
the code, then the assumed performance benefit of magic return values 
over exceptions might very well be completely negated:

def get(self, key):
    # This is not the real Python dict.get implementation!
    # This is merely an illustration of how it *could* be.
    try:
        return self[key]
    except KeyError:
        return None

This just emphasises the importance of not optimising code by assumption. 
If you haven't *measured* the speed of a function you don't know whether 
it will be faster or slower than catching an exception.

You will note that the above has nothing to do with the API, but is 
entirely an implementation decision. This to me demonstrates that the 
question of machine efficiency is irrelevant to API design. 
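On the point about measuring rather than assuming, here is a rough sketch using the standard timeit module. The helper names, the dictionary size and the one-in-twenty miss rate are all arbitrary choices for illustration; the numbers you get will depend on your Python version and machine, so none are quoted here.

```python
import timeit

def lookup_test_first(d, key):
    # "Look before you leap": test, then fetch.
    if key in d:
        return d[key]
    return None

def lookup_try(d, key):
    # "Easier to ask forgiveness": fetch, and catch the failure.
    try:
        return d[key]
    except KeyError:
        return None

def time_lookups(func, d, keys, repeat=5, number=100):
    """Best-of-`repeat` time for looking up every key `number` times."""
    timer = timeit.Timer(lambda: [func(d, k) for k in keys])
    return min(timer.repeat(repeat=repeat, number=number))

d = dict.fromkeys(range(1000), "x")
# One miss in twenty lookups (keys >= 1000 are absent); adjust to taste.
keys = [i if i % 20 else i + 1000 for i in range(1000)]

for func in (lookup_test_first, lookup_try, dict.get):
    print(func.__name__, time_lookups(func, d, keys))
```

Changing the miss rate in `keys` is the interesting experiment: it shows where the cut-off between the two strategies falls on your own system.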

>> This is a classic example of premature optimization. Unless such
>> inefficiency can be demonstrated to actually matter, then you do nobody
>> any favours by preferring the API that leads to more defects on the
>> basis of *assumed* efficiency.
> 
> I agree with the concern about premature optimisation. However, I don't
> agree with a blanket statement that special return values always and
> unconditionally lead to more defects. 

I can't say that they *always* lead to more defects, since that also 
depends on the competence of the caller, but I will say that as a general 
principle, they should be *expected* to lead to more defects.

> Returning to the .NET non-
> blocking I/O example, the fact that the API throws an exception when it
> shouldn't very much complicates the code and introduces a lot of extra
> control logic that is much more likely to be wrong than a simple
> if-then-else statement. As I said, throwing an exception when none
> should be thrown can be just as harmful as the opposite case.

In this case, it's worse than that -- they use a special return value 
when there should be an exception, and an exception when there should be 
an ordinary, non-special value (an empty string, if I recall correctly).

>> It doesn't matter whether it is an error or not. They are called
>> EXCEPTIONS, not ERRORS. What matters is that it is an exceptional case.
>> Whether that exceptional case is an error condition or not is dependent
>> on the application.
> 
> Exactly. To me, that implies that making something an exception that, to
> the caller, shouldn't be is just as inconvenient as the other way
> around.

Well, obviously I agree that you should only make things be an exception 
if they actually should be an exception. I don't quite see where the 
implication is -- I find myself in the curious position of agreeing with 
your conclusion while questioning your reasoning, as if you had said 
something like:

All cats have four legs, therefore cats are mammals.

>> > Apart from the potential performance penalty, throwing exceptions for
>> > expected outcomes is bad also because it forces a try-catch block on
>> > the caller.
>>
>> But it's okay to force a `if (result==MagicValue)` test instead?
> 
> Yes, in some cases it is. For example:
> 
> int numBytes;
> int fd = open(...);
> while ((numBytes = read(fd, …)) > 0) {
>     // process data...
> }
> 
> Would you prefer to see EOF indicated by an exception rather than a zero
> return value? I wouldn't.

Why not? Assuming this is a blocking read, once you hit EOF you will 
never recover from it. Is this about the micro-optimisation again? Disk 
I/O is almost certainly a thousand times slower than any exception you 
could catch here.

In Python, we *do* use exceptions for file reads, just not for explicit 
reads. An explicit read returns an empty string at EOF, and we might write:

f = open(filename)
while 1:
    block = f.read(buffersize)
    if not block:
        f.close()
        break
    process(block)

This would arguably be easier to write and read, and demonstrates the 
intent of the while loop better:

f = open(filename)
try:
    while 1:
        process(f.read(buffersize))
except EOFError:
    f.close()

(But the above doesn't work, because an explicit read doesn't raise an 
exception.)
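If you wanted that style anyway, a thin wrapper would do. This is only a sketch: read_or_raise and read_blocks are hypothetical helpers, not anything in the standard library, and io.StringIO stands in for a real file.

```python
import io

def read_or_raise(f, size):
    """Hypothetical helper: like f.read(size), but raise EOFError at EOF."""
    block = f.read(size)
    if not block:
        raise EOFError("end of file")
    return block

def read_blocks(f, size):
    """Collect blocks until EOFError ends the loop, then close the file."""
    blocks = []
    try:
        while True:
            blocks.append(read_or_raise(f, size))
    except EOFError:
        f.close()
    return blocks

print(read_blocks(io.StringIO("abcdefg"), 3))   # ['abc', 'def', 'g']
```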

However, there's another idiom for reading a file which does use an 
exception: line-by-line reading.

f = open(filename)
for line in f:
    process(line)
f.close()

Because iterating over the file raises StopIteration when EOF is 
reached, the for loop automatically breaks. If you wanted to handle that 
by hand, something like this should work (but is unnecessary, because 
Python already does it for you):

f = open(filename)
try:
    while 1:
        process(next(f))  # next(f) rather than f.next(), which is Python 2 only
except StopIteration:
    f.close()

[...]
> The core problem isn't whether exceptions are good or bad in a
> particular case, but that most APIs make this an either-or choice. For
> example, if I had an API that allowed me to choose at run time whether
> an exception will be thrown for a particular condition, I could adapt
> that API to my needs, instead of being stuck with whatever the designer
> came up with.
> 
> There are many ways this could be done. For example, I could have a
> find() operation on a collection that throws if a value isn't found, and
> I could have findNoThrow() if I want a sentinel value returned. Or, the
> API could offer a callback hook that decides at run time whether to
> throw or not. (There are many other possible ways to do this, such as
> setting the behaviour at construction time, or by having different
> collection types with different behaviours.)
> 
> The point is that a more flexible API is likely to be more useful than
> one that sets a single exception policy for everyone.

This has costs of its own. There is the cost of developer education: 
learning about, memorising, and deciding between such multiple APIs does 
not come for free. There is the cost of developing and maintaining the 
multiple functions, the risk of duplicated code in the implementation, 
and the cost of writing documentation. A bloated API is not free of costs.

>> > As the API
>> > creator, if I indicate errors with exceptions, I make a policy
>> > decision about what is an error and what is not. It behooves me to be
>> > conservative in that policy: I should throw exceptions only for
>> > conditions that are unlikely to arise during routine and normal use
>> > of the API.
>>
>> But lost connections *are* routine and normal. Hopefully they are rare.
> 
> In the context of my example, they are not. The range of behaviours
> naturally falls into these categories:
> 
> * No data ready
> * Data ready
> * EOF
> * Socket error

Right -- that fourth example is one of the NATURAL categories that any 
half-way decent developer needs to be aware of. When you say something 
isn't natural, and then immediately contradict yourself, that's a sign 
you need to think about what you really mean :)

> The first three cases are the "normal" ones; they operate on the same
> program state and they are completely expected: while reading a message
> off the wire, the program will almost certainly encounter the first two
> conditions and, if there is no error, it will always encounter the EOF
> condition. 

I would call these the ordinary cases, as opposed to the exceptional 
cases.

> The fourth case is the unexpected one, in the sense that this
> case will often not arise at all.

But it is still expected -- you have to expect that you might get a 
socket error, and code accordingly.

> That's not to say that lost connections aren't routine; they are. 

Right -- we actually agree on this, we just disagree on the terminology. 
I believe that talking about "normal" and "errors" is misleading. Better 
is to talk about "ordinary" and "exceptional".

> But, when a connection is lost,
> the program has to do different things and operate on different state
> than when the connection stays up. This strongly suggests that the first
> three conditions should be dealt with by return values and/or out
> parameters, and the fourth condition should be dealt with as an
> exception.

Agreed.
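To make that split concrete, here is a rough sketch. Everything in it is hypothetical: FakeSocket is a scripted stand-in rather than a real socket API, and NO_DATA and ConnectionLost are names made up for the example. The three ordinary cases come back as return values; the fourth is raised.

```python
class ConnectionLost(Exception):
    """Hypothetical exception for the fourth, exceptional, case."""

NO_DATA = object()   # sentinel: nothing ready yet, try again later

class FakeSocket:
    """Stand-in for a non-blocking socket, scripted with a list of events."""
    def __init__(self, events):
        self._events = list(events)

    def read(self):
        """Return NO_DATA, a non-empty bytes chunk, or b"" at EOF.

        Raise the scripted exception if the connection drops.
        """
        event = self._events.pop(0)
        if isinstance(event, Exception):
            raise event
        if event is None:
            return NO_DATA
        return event

def read_message(sock):
    """Read until EOF and return the whole message as bytes."""
    chunks = []
    while True:
        data = sock.read()
        if data is NO_DATA:
            continue          # a real program would wait or do other work
        if not data:
            return b"".join(chunks)   # EOF: the ordinary end of a message
        chunks.append(data)

print(read_message(FakeSocket([b"he", None, b"llo", b""])))   # b'hello'
```

A lost connection, scripted here as FakeSocket([b"he", ConnectionLost("reset")]), propagates out of read_message as ConnectionLost, so the caller handles it on a separate path from the three ordinary cases.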

-- 
Steven