[Python-Dev] PEP 344: Exception Chaining and Embedded Tracebacks

Fri May 20 20:43:37 CEST 2005

[Guido]
> > Here's a bunch of commentary:

[Ping]
> Thanks.  Sorry it's taken me a couple of days to get back to this.
> I think i'm caught up on the mail now.

No problem!

<snip>

> > Also, in that same example, according to your specs, the TypeError
> > raised by bar() has the ZeroDivisionError raised in foo() as its
> > context.  Do we really want this?
> 
> I don't think it's absolutely necessary, though it doesn't seem to
> hurt.  We agree that if the TypeError makes it up to foo's frame,
> it should have the ZeroDivisionError as its __context__, right?

Yes.

> If so, do i understand correctly that you want the __context__ to
> depend on where the exception is caught as well as where it is raised?

I think so, but that's not how I think about it.  IMO the only time
when the context becomes *relevant* is when a finally/except clause is
left with a different exception than it was entered.
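
A minimal sketch of the case we agree on, runnable on current Python 3,
where __context__ is set implicitly.  The foo() and bar() bodies here are
hypothetical stand-ins, not the PEP's exact example:

```python
def bar():
    # Fails while foo()'s handler is still running.
    raise TypeError("bar failed")


def foo():
    try:
        1 / 0
    except ZeroDivisionError:
        # The except clause is entered with ZeroDivisionError
        # and left with TypeError.
        bar()


try:
    foo()
except TypeError as exc:
    context = exc.__context__

print(type(context).__name__)
```

The TypeError leaves foo's except clause as a different exception than the
one the clause was entered with, so this is the point where the context
matters.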

> In your thinking, is this mainly a performance or a cleanliness issue?

Hard to say; the two are often difficult to separate for me.  The
performance in this case bothers me because it means unnecessary churn
when exceptions are raised and caught.  I know I've said in the past I
don't care about the performance of exceptions, but that's not *quite*
true, given that they are quite frequently used for control flow
(e.g. StopIteration).  I don't know how to quantify the performance
effect though (unless it means that exceptions would have to be
*instantiated* sooner than currently; in those cases where exception
instantiation is put off, it is put off for one reason, and that
reason is performance!).

The cleanliness issue is also important to me: when I have some
isolated code that raises and successfully catches an exception, and
that code happens to be used by a logging operation that is invoked
from an exception handler, why would the context of the exception have
to include something that happened way deeper on the stack and that is
totally irrelevant to the code that catches *my* exception?
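
To make the scenario concrete, here is a small sketch, assuming the
behavior of current Python 3 (which attaches the context at raise time).
The log() helper is hypothetical:

```python
def log(message):
    # Isolated code: raises and completely handles its own exception.
    try:
        {}["missing"]
    except KeyError as inner:
        # Report what was attached to *my* exception.
        return inner.__context__


try:
    1 / 0
except ZeroDivisionError as outer:
    attached = log("division failed")
    # Under raise-time chaining, the unrelated outer exception
    # shows up as the context of the KeyError inside log().
    same = attached is outer

print(same)
```

The KeyError never leaves log(), yet it carries the ZeroDivisionError from
the caller's handler, which is the kind of irrelevant context described
above.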

> Basically i was looking for the simplest description that would
> guarantee ending up with all the relevant tracebacks reported in
> chronological order.  I thought it would be more complicated if we
> had to keep modifying the traceback on the way up, but now that
> i've re-learned how tracebacks are constructed, it's moot -- we're
> already extending the traceback on the way through each frame.
> 
> I have a proposal for the implicit chaining semantics that i'll post
> in a separate message so it isn't buried in the middle of this one.

OK, looking forward to it.

> > Do we really need new syntax to set __cause__?  Java does this without
> > syntax by having a standard API initCause() (as well as constructors
> > taking a cause as argument; I understand why you don't want to rely on
> > that -- neither does Java).  That seems more general because it can be
> > used outside the context of a raise statement.
> 
> I went back and forth on this.  An earlier version of the PEP actually
> proposes a 'setcause' method.  I eventually settled on a few reasons
> for the "raise ... from" syntax:
> 
>     1.  (major) No possibility of method override; no susceptibility
>         to manipulation of __dict__ or __getattr__; no possibility of
>         another exception happening while trying to set the cause.

Hm; the inability to override the method actually sounds like a major
disadvantage.  I'm not sure what you mean by __dict__ or __getattr__
except other ways to tweak the attribute assignment; maybe you forgot
about descriptors?  Another exception could *still* happen and I think
the __context__ setting mechanism will take care of it just fine.

>     2.  (moderate) There is a clear, distinct idiom for exception
>         replacement requiring that the cause and effect must be
>         identified together at the point of raising.

Well, nothing stops me from adding a setCause() method to my own
exception class and using that instead of the from syntax, right?  I'm
not sure why it is so important to have a distinct idiom, and even if
we do, I think that a method call will do just fine:

  except EnvironmentError, err:
      raise MyApplicationError("boo hoo").setCause(err)

>     3.  (minor) No method namespace pollution.
> 
>     4.  (minor) Less typing, less punctuation.
> 
> The main thing is that handling exceptions is a delicate matter, so it's
> nice to have guarantees that the things you're doing aren't going to
> suddenly raise more exceptions.

I don't see that there's all that much that can go wrong in
setCause().  After all, it's the setCause() of the *new* exception
(which you can know and trust) that could cause trouble; a boobytrap
in the exception you just caught could not possibly be set off by
simply using it as a cause.  (It can cause the traceback printing to
fail, of course, but that's not a new issue.)
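
A rough sketch of such a method, assuming it stores the cause in a
__cause__ attribute and returns self so it can be used inside the raise
statement.  MyApplicationError, risky() and wrapped() are hypothetical
names:

```python
class MyApplicationError(Exception):
    def setCause(self, cause):
        # Only the *new* exception's code runs here; the caught
        # exception is merely stored, never called into.
        self.__cause__ = cause
        return self


def risky():
    raise EnvironmentError("disk on fire")


def wrapped():
    try:
        risky()
    except EnvironmentError as err:
        raise MyApplicationError("boo hoo").setCause(err)


try:
    wrapped()
except MyApplicationError as exc:
    message = str(exc)
    cause = exc.__cause__

print(message, "caused by", repr(cause))
```

Because setCause() is an ordinary method, a subclass can override it,
which is the flexibility a dedicated syntax would rule out.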

> > Why insert a blank line between chained tracebacks?
> 
> Just to make them easier to read.  The last line of each traceback
> is important because it identifies the exception type, and that will
> be a lot easier to find if it isn't buried in an uninterrupted stream
> of lines.

Yeah, but those lines follow an easy-to-recognize pattern of
alternating "File" lines and source code lines, indented 2 and 4
spaces respectively; anything different is easily found.

> > I might want to add an extra line at the very end (and
> > perhaps at each chaining point) warning the user that the exception
> > has a chained counterpart that was printed earlier.
> 
> How about if the line says how many exceptions there were?  Like:
> 
>     [This is the last of 5 exceptions; see above for the others.]

Something like that would be very helpful indeed.
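
One way such a trailer could be produced, as a sketch only: it assumes
exceptions expose __cause__, __context__ and __traceback__ as in current
Python 3, and chain() and format_chain() are hypothetical helpers, not
part of the traceback module:

```python
import traceback


def chain(exc):
    """Return the exceptions linked to exc, oldest first."""
    seen = []
    while exc is not None and all(exc is not s for s in seen):
        seen.append(exc)
        exc = exc.__cause__ or exc.__context__
    seen.reverse()
    return seen


def format_chain(exc):
    """Format every traceback in the chain, then add the trailer."""
    excs = chain(exc)
    lines = []
    for e in excs:
        # chain=False: format each exception on its own.
        lines.extend(traceback.format_exception(
            type(e), e, e.__traceback__, chain=False))
    if len(excs) > 1:
        lines.append("[This is the last of %d exceptions; "
                     "see above for the others.]\n" % len(excs))
    return "".join(lines)


try:
    try:
        1 / 0
    except ZeroDivisionError:
        int("x")
except ValueError as exc:
    count = len(chain(exc))
    report = format_chain(exc)

print(report)
```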

> > Why should the C level APIs not automatically set __context__?  (There
> > may be an obvious reason but it doesn't hurt stating it.)
> 
> Because:
> 
>     (a) you indicated some discomfort with the idea, perhaps because
>         it would make the interpreter do unnecessary work;
>     (b) no one seems to be asking for it;
>     (c) it seems potentially complicated.

None of these seem very good reasons if we were to go with your
original design. :-)

> However, if we go for the semantics you want, PyErr_Set* wouldn't set
> __context__ at the moment of raising anyway.  If __context__ is set
> during unwinding, then i expect it would get set on exceptions raised
> from C too, since the interpreter wouldn't know the difference.

Probably true.  It's worth looking at the implementation in detail (I
don't have it in my head, and I don't have time to look it up myself).

> > I was surprised to learn that yield clears the exception state; I
> > wonder if this isn't a bug in the generator implementation?  IMO
> > better semantics would be for the exception state to survive across
> > yield.
> 
> I agree -- i just didn't want to tackle that issue in this PEP.
> It could be considered a separate enhancement/bugfix.

Perhaps.  You might look into why this is -- I wonder if it isn't an
accident of the generator implementation (like ceval() might be
clearing the exception info upon entry even if it's resuming a frame
-- no idea if that's the case without doing more research than I have
time for).

> > I don't like the example (in "Open Issues") of applications wrapping
> > arbitrary exceptions in ApplicationError.  I consider this bad style,
> > even if the chaining makes it not quite as bad as it used to be.
> 
> Isn't it a reasonable possibility that, as part of its contract, a
> library will want to guarantee that it only raises exceptions of
> certain types?

Yeah, but if you apply that recursively, it should only have to
*catch* exceptions of certain types when it is calling other code.
E.g. it's fine to catch EnvironmentError (which is basically IOError +
os.error) when you're manipulating files; but it's not okay to catch
all exceptions and wrap them.  I've seen too many ApplicationErrors
that were either hiding bugs in the application, or trapping
KeyboardInterrupt and similar ones.  MemoryError, SystemExit and
SystemError also shouldn't be wrapped.

I don't think it's a good idea to try to guarantee "this application
only ever raises ApplicationError".  A better guarantee is "errors
that this application *detects* will always be reported as
ApplicationError".  IMO bugs in the application should *never* be
wrapped in ApplicationError.
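
A sketch of the narrower guarantee, written for current Python 3 and using
its "raise ... from" spelling only to record the cause.  ApplicationError
and read_text() are hypothetical names:

```python
import os
import tempfile


class ApplicationError(Exception):
    """Errors that the application itself detects."""


def read_text(path):
    try:
        with open(path) as f:
            return f.read()
    except EnvironmentError as err:
        # An anticipated failure of the file operation: wrap it.
        raise ApplicationError("cannot read %r" % (path,)) from err
    # Anything else (TypeError, KeyboardInterrupt, MemoryError, ...)
    # is deliberately left to propagate unwrapped.


tmpdir = tempfile.mkdtemp()
missing = os.path.join(tmpdir, "no-such-file")
try:
    read_text(missing)
except ApplicationError as exc:
    detected = exc.__cause__
os.rmdir(tmpdir)

try:
    read_text(None)  # a caller bug: open(None) raises TypeError
except TypeError as exc:
    bug = exc

print(type(detected).__name__, type(bug).__name__)
```

The missing file is reported as ApplicationError, while the bad argument
surfaces as a plain TypeError, so the bug is not hidden.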

> > I don't see the need for "except *, exc" -- assuming all exceptions
> > derive from a single base class, we can just write the name of that
> > base class.
> 
> If we get there, yes.  But at the moment, i don't believe there's any
> way to catch an arbitrary string exception or an exception of a
> non-Exception instance other than "except:".

Sure.  But since we all seem to be agreeing on eventually making all
exceptions derive from a common base class, we shouldn't be inventing
new syntax that will later become redundant.

> > I don't like having sys.exception; instead, the only way to access the
> > "current" exception ought to be to use an except clause with a
> > variable.  (sys.last_exception is fine.)
> 
> That would be nice, again once we have all exceptions derive from
> Exception.  It seems to me we'd have to do these changes in this order:
> 
>     (a) ban string exceptions
>     (b) require all exceptions to derive from Exception
>     (c) ban bare "except:"
>     (d) eliminate sys.exc_*
> 
> Or do them all at once in Python 3000.  (Well, i guess all that is just
> repeating what Brett has talked about putting in his exception PEP.)

I guess it's a separate topic.  I'd like your PEP to focus on the
ideal Python 3000 semantics first; once we agree on that we can
discuss how to get there from here.

> > I hope that this can be accepted together with the son-of-PEP-343 (PEP
> > 343 plus generator exception injection and finalization) so __exit__
> > can take a single exception argument from the start.  (But what should
> > it receive if a string exception is being caught?  A triple perhaps?)
> 
> Dare i suggest... a string subclass with a __traceback__ attribute?
> 
> A string subclass (that also subclasses Exception) might be a migration
> path to eliminating string exceptions.

Hardly.  With string exceptions, the *identity* of the string object
decides the identity of the exception caught (matching uses 'is' not
'==').

I'd rather just leave string exceptions alone and say certain features
don't work if you use them -- that'll be an encouragement for people
to rip them out.  If you want to do *anything* about them, you might
create a family of Exception subclasses, one subclass per string
object.  Implementing this would be a nice exercise using metaclasses.
Suppose you have a class factory named StringExceptionClass that takes
a string object and returns the appropriate StringException subclass
(maintaining a global dict mapped by object identity so it returns the
same class if the same object is passed).  Then

  raise "abc", XYZ

could be translated into

  raise StringExceptionClass("abc")(XYZ)

and

  except "abc", err:

could be equivalent to

  except StringExceptionClass("abc"), err:

The remaining incompatibility would be that err would hold a
StringException instance rather than just the value of XYZ.  Or the
VM's except handling code could pull the value out of the exception and
store it in err for full compatibility.
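
A rough sketch of the class factory, under some assumptions: it calls
type() directly rather than defining a custom metaclass, it keys the
global dict on id() while keeping a reference to the string so the id
stays unique, and the StringException base class is hypothetical.  It
uses current Python 3 syntax, since string exceptions themselves can no
longer be raised there:

```python
class StringException(Exception):
    """Common base for the generated per-string classes."""


# id(string) -> (string, class); holding the string keeps it alive,
# so its id cannot be reused by another object.
_registry = {}


def StringExceptionClass(s):
    key = id(s)
    entry = _registry.get(key)
    if entry is None:
        cls = type("StringException_%d" % key,
                   (StringException,), {"string": s})
        entry = _registry[key] = (s, cls)
    return entry[1]


abc = "abc"
lookalike = "".join(["a", "b", "c"])  # equal value, different object

try:
    raise StringExceptionClass(abc)("XYZ")
except StringExceptionClass(lookalike):
    matched = "lookalike"
except StringExceptionClass(abc) as err:
    matched = "abc"
    value = err.args[0]  # the original XYZ value

print(matched, value)
```

Matching follows object identity, as with real string exceptions: the
equal-but-distinct string does not catch the exception.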

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)