[Python-Dev] Py3K: indirect coupling between raise and exception handler

Fri, 10 Mar 2000 09:28:13 -0600

Consider the following snippet of code from MySQLdb.py:

    try:
        self._query(query % escape_row(args, qc))
    except TypeError:
        self._query(query % escape_dict(args, qc))

It's not quite right.  There are at least four reasons I can think of why
the % operator might raise a TypeError:

    1. query has not enough format specifiers
    2. query has too many format specifiers
    3. argument type mismatch between individual format specifier and
       corresponding argument
    4. query expects dict-style interpolation

The except clause only handles the last case.  That leaves the other three
cases mishandled.  The above construct pretends that all TypeErrors possible
are handled by calling escape_dict() instead of escape_row().
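
The four cases above can be reproduced with a small sketch.  It is written
in present-day Python 3 syntax rather than the syntax used elsewhere in this
message, and format_error() is a hypothetical helper, not part of MySQLdb.
The exact message text differs between Python versions, so the sketch prints
whatever the running interpreter produces instead of assuming a wording.

```python
def format_error(fmt, args):
    """Return the TypeError message raised by fmt % args, or None on success.

    Hypothetical helper, used only to illustrate the four cases.
    """
    try:
        fmt % args
    except TypeError as exc:
        return str(exc)
    return None


# One (format string, argument tuple) pair per case in the list above.
cases = {
    1: ("%s", ("a", "b")),        # not enough format specifiers
    2: ("%s %s %s", ("a", "b")),  # too many format specifiers
    3: ("%d", ("a",)),            # specifier/argument type mismatch
    4: ("%(name)s", ("a",)),      # dict-style format given a tuple
}

for number, (fmt, args) in cases.items():
    # Every case surfaces as the same exception class, TypeError.
    print(number, format_error(fmt, args))
```

All four pairs raise the same exception class, which is why a bare
"except TypeError" cannot tell them apart.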

I stumbled on case 2 yesterday and got a fairly useless error message when
the code in the except clause also bombed.  Took me a few minutes of head
scratching to see that I had an extra %s in my format string.  A note to
Andy Dustman, MySQLdb's author, yielded the following modified version:

    try:
        self._query(query % escape_row(args, qc))
    except TypeError, m:
        if m.args[0] == "not enough arguments for format string": raise
        if m.args[0] == "not all arguments converted": raise
        self._query(query % escape_dict(args, qc))

This will do the trick for me for the time being.  Note, however, that the
only way for Andy to decide which of the four cases occurred is to compare
the string value of the message.  (Case 3 still isn't handled above, but
should occur very rarely in MySQLdb since it only uses the more
accommodating %s as a format specifier.)
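
To make the fragility concrete, here is a minimal sketch of that string
comparison, again in present-day Python 3 syntax.  The should_reraise()
helper and the reworded message are hypothetical; the reworded text is
invented purely for illustration and is not a message Python produces.

```python
# The two message strings the modified handler compares against.
RERAISE_MESSAGES = (
    "not enough arguments for format string",
    "not all arguments converted",
)


def should_reraise(exc):
    """Mimic the handler's test: re-raise only on an exact message match."""
    return bool(exc.args) and exc.args[0] in RERAISE_MESSAGES


current = TypeError("not all arguments converted")
# Hypothetical rewording that adds argument counts to the message.
reworded = TypeError("not all arguments converted (2 given, 1 used)")

print(should_reraise(current))
print(should_reraise(reworded))
```

The exception class and the underlying mistake are the same in both cases,
but the reworded message no longer matches, so the handler would fall
through to the escape_dict() call.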

This indirect coupling via the error message text between the exception being
raised (in C code, in this case) and the place where it's caught seems bad
to me and encourages authors to either not recover from errors or to recover
from them in the crudest fashion.  If Guido decides to tweak the TypeError
message in any fashion, perhaps to include the count of arguments in the
format string and argument tuple, this code will break.  It makes me wonder
if there's not a better mechanism waiting to be discovered.  Would it be
possible to publish an interface of some sort via the exceptions module that
would allow symbolic names or dictionary references to be used to decide
which case is being handled?  I envision something like the following in
exceptions.py:

    UNKNOWN_ERROR_CATEGORY = 0
    TYP_SHORT_FORMAT = 1
    TYP_LONG_FORMAT = 2
    ...
    IND_BAD_RANGE = 1

    message_map = {
        # leave
        (TypeError, ("not enough arguments for format string",)):
            TYP_SHORT_FORMAT,
        (TypeError, ("not all arguments converted",)):
            TYP_LONG_FORMAT,
        ...
        (IndexError, ("list index out of range",)): IND_BAD_RANGE,
        ...
    }

This would isolate the raw text of exception strings to just a single place
(well, just one place on the exception handling side of things).  It would
be used something like

    try:
        self._query(query % escape_row(args, qc))
    except TypeError, m:
        from exceptions import *
        exc_case = message_map.get((TypeError, m.args), UNKNOWN_ERROR_CATEGORY)
        if exc_case in [UNKNOWN_ERROR_CATEGORY, TYP_SHORT_FORMAT,
                        TYP_LONG_FORMAT]: raise
        self._query(query % escape_dict(args, qc))

This could be added to exceptions.py without breaking existing code.
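
As a rough, self-contained sketch of the lookup (present-day Python 3
syntax; no such table exists in the standard library, and classify() is a
hypothetical helper), the mapping might be exercised like this.  The
exceptions are constructed by hand with the message text quoted above so
the result does not depend on the interpreter's current wording.

```python
UNKNOWN_ERROR_CATEGORY = 0
TYP_SHORT_FORMAT = 1
TYP_LONG_FORMAT = 2

# Keys pair an exception class with the args tuple it is raised with.
message_map = {
    (TypeError, ("not enough arguments for format string",)):
        TYP_SHORT_FORMAT,
    (TypeError, ("not all arguments converted",)):
        TYP_LONG_FORMAT,
}


def classify(exc):
    """Map an exception instance to a symbolic category (hypothetical)."""
    return message_map.get((type(exc), exc.args), UNKNOWN_ERROR_CATEGORY)


print(classify(TypeError("not enough arguments for format string")))
print(classify(TypeError("not all arguments converted")))
print(classify(TypeError("some message the table does not know")))
```

An unrecognized message maps to UNKNOWN_ERROR_CATEGORY, which the handler
above re-raises rather than passing on to escape_dict().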

Does this (or something like it) seem like a reasonable enhancement for
Py3K?  If we can narrow things down to an implementable solution I'll create
a patch.

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/