[Python-ideas] Use `isinstance` check during exception handling

Thu Nov 5 17:03:14 EST 2015

On Nov 5, 2015, at 12:18, Sjoerd Job Postmus <sjoerdjob at sjec.nl> wrote:
> 
>> On 5 Nov 2015, at 21:06, Andrew Barnert <abarnert at yahoo.com> wrote:
>> 
>> On Nov 5, 2015, at 07:00, sjoerdjob at sjec.nl wrote:
>> 
>>>>> On Nov 4, 2015, at 14:02, sjoerdjob at sjec.nl wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> TL;DR: Change exception checking logic from
>>>>> err.__class__ in exc.__mro__
>>>>> to
>>>>> isinstance(err, exc)
>>>>> Because it is more powerful, more consistent and allows for cleaner
>>>>> code.
>>>> 
>>>> Correct me if I'm wrong, but doesn't Python allow you to raise a class and
>>>> handle it without ever creating an instance? If an instance is needed
>>>> (e.g., because the exception is caught by a handler with an as clause),
>>>> one is created by calling exc(), but if it's not needed, that doesn't
>>>> happen.
>>>> 
>>>> Assuming I'm right, your proposal would mean the instance is _always_
>>>> needed, so if you raise a type, exc() is always called. So, for example:
>>>> 
>>>>  class MyException(Exception):
>>>>      def __init__(self, thingy, *a, **kw):
>>>>          super().__init__(*a, **kw)
>>>>          self.thingy = thingy
>>>> 
>>>>  try:
>>>>      raise MyException
>>>>  except MyException:
>>>>      print('whoops')
>>>> 
>>>> Instead of printing "whoops", it would now (I think) abort with a
>>>> TypeError about trying to call MyException with the wrong number of
>>>> arguments.
>>> 
>>> Yes, the exception is instantiated as soon as it is raised (not when it's
>>> matched). This happens in Python/ceval.c. So you actually have to catch
>>> `TypeError` in this case.
>> 
>> Then what's the point of the code at Python/errors.c#186 that deals with the fact that the given exception (err) can be either an instance or a type? If it must always be an instance, surely those two lines are unnecessary and misleading.
>> 
>> In fact, looking into it a bit more: the normal way to check for an exception from C is with PyErr_ExceptionMatches, which calls PyErr_GivenExceptionMatches with the type, not the value, of the exception. So, even if an instance has been created, you don't usually have it inside PyErr_GivenExceptionMatches anyway, so the proposed change wouldn't work without further changes. At the very least you'd need to change PyErr_ExceptionMatches to pass the value instead of the type. But presumably there's a reason it passes the type in the first place (maybe in CPython it's not possible to raise a type without an instance from pure Python code, but it is from C API code?), which means there's probably other code that has to change as well. 
>> 
>> And that's assuming there's no other code than calls PyErr_GivenExceptionMatches with the value of PyErr_Occurred() (or tstate->curexc_type or any other way of getting an exception type) anywhere in CPython or in any extension module anyone's ever written. Since there's nothing about the docs that implies that this is invalid, and in fact the docs for PyErr_ExceptionMatches and the source code for PyErr_GivenExceptionMatches both strongly imply that passing a type is the normal way to check, I doubt that assumption is true.
> 
> I added an assert before the except, asserting that no instance is passed, and surprisingly that assert triggers.
> 
> Also: in some cases, tstate->cur_value is actually a PyString object instead of an exception subclass (for instance raised by PyErr_FormatV (or something like that, don't have the code in front of me right now).
> 
> So yes, it does appear that a lot of code might need changing. But it also seems odd to me that exceptions get raised with strings for values instead of exception instances.
> 
> It might seem troublesome to change these locations to raise actual instances instead of strings, but those might actually just be hidden bugs just the same.
> 
> Do you think anyone would object if I'd go over the code looking for places where type(tstate->cur_value) != tstate->cur_type and propose patches for these?

It would be really nice if someone remembered _why_ the code is this way. It may be a performance issue—if the exception is raised in C code and handled in C code and never displayed, the exception object never gets used, so why waste time (and complicate the reference graph) creating one? Or it may just be a holdover from Python 1.x's string exceptions. Or there may be some other reason neither of us can guess. Without knowing which it is, it's hard to know what to do about it.

But meanwhile, even if you patch all of CPython, that doesn't change the fact that there are probably extension libraries out there setting exceptions with a null or string value, or checking the exception by using PyErr_ExceptionOccurred or other means that ignore the value (without going through PyErr_GivenExceptionMatches or anything else you could change), which are perfectly valid things to do according to the docs, and have worked for decades. How are you going to fix all of those?

Also, even if it's true that pure Python code can't create an exception without a value, only C code can, the way the language reference is written, that seems to be something specific to CPython, not guaranteed by the language. So, other implementations might be doing something different, and you'd have to fix all of them as well. (That may not be a big deal—are there any other major 3.x implementations besides PyPy, which is still on 3.2?)

The best thing I can think of is to add a new API around exception values instead of types (a replacement for ExceptionOccurred that returns the value instead of the type, a replacement for ExceptionMatches that requires an instance instead of taking an instance or type, etc.), 
change GivenExceptionMatches (and the except clause implementation) to use that API, change SetNone and friends to always create a type, deprecate ExceptionOccurred and friends, change the documentation to make it clear than raising a type in Python is the same as raising Type(), wait a couple versions for the C API deprecation, and only then can you assume that ExceptionMatches always gets an instance (and decide what you do if not), so you can make your proposed change.