[Python-Dev] Exception.__unicode__ and tp_unicode

Simon Cross hodgestar+pythondev at gmail.com
Tue Jun 10 18:31:13 CEST 2008


Originally Python exceptions had no __unicode__ method. In Python 2.5
__unicode__ was added. This led to "unicode(Exception)" failing and so
the addition of __unicode__ was reverted [1].

This leaves Python 2.6 in a position where calls to
"unicode(Exception(u'\xe1'))" fail as they are equivalent to
"unicode(str(Exception(u'\xe1')))" which cannot convert the non-ASCII
character to ASCII (or other default encoding) [2].
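
As a rough illustration of the step that fails, the sketch below performs
the ASCII encoding explicitly. The helper name to_default_encoding is
hypothetical and stands in for the implicit conversion; this is not the
interpreter's own code.

```python
def to_default_encoding(text, encoding='ascii'):
    """Hypothetical helper: mimic the implicit text -> byte-string step."""
    return text.encode(encoding)


message = u'\xe1'  # a single non-ASCII character (a with acute accent)

try:
    to_default_encoding(message)
    ascii_ok = True
except UnicodeEncodeError:
    # This is the kind of error the implicit conversion runs into.
    ascii_ok = False

# With an encoding that can represent the character, the step succeeds.
utf8_bytes = to_default_encoding(message, 'utf-8')
```

The encoding only succeeds when the target encoding can represent the
character, which an exception's creator cannot know in advance.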

From here there are 3 options:

 1) Leave things as they are.
 2) Add back __unicode__ and have "unicode(Exception)" fail.
 3) Add a tp_unicode slot to Python objects and have everything work
(at the cost of adding the slot).
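
Option 3 amounts to resolving the conversion hook on the object's type
rather than on the object itself. A rough pure-Python sketch of that idea
follows. The actual proposal is a C-level slot; the names Message and
text_via_type are hypothetical, and the fallback to repr() is only an
assumption made for illustration.

```python
class Message(object):
    """Hypothetical class that provides a __unicode__ hook."""

    def __init__(self, text):
        self.text = text

    def __unicode__(self):
        return self.text


def text_via_type(obj):
    """Look the hook up on type(obj), roughly as a type slot would.

    Falls back to repr() purely for illustration when the type
    defines no __unicode__.
    """
    hook = getattr(type(obj), '__unicode__', None)
    if hook is not None:
        return hook(obj)
    return repr(obj)


# An instance resolves to Message.__unicode__.
instance_text = text_via_type(Message(u'\xe1'))

# The class itself is an instance of type, which has no such hook,
# so the call falls back instead of failing.
class_text = text_via_type(Message)
```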

Each option has its drawbacks.

Ideally I'd like to see 3) implemented (there are already two
volunteers and some initial stabs at implementing it) but a change
to Object is going to need an okay from someone quite high up. Also,
if you know of any code this would break, now is the time to let me
know.

If we can't have 3) I'd like to see us fall back to option 2). Passing
unicode exceptions back is useful in a number of common situations
(non-English exception messages, database errors, pretty much anywhere
that something goes wrong while dealing with potentially non-ASCII
text) and encoding to some specific format is usually not an option
since there is no way to know where the exception will eventually be
caught. Also, unicode(ClassA) already fails for any class that
implements __unicode__ so even without this affecting Exception it's
already not safe to do u"%s" % SomeClass. Also, there is a readily
available workaround by doing u"%s" % str(SomeClass).
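
The failure with classes can be reproduced in miniature. This sketch
assumes the conversion looks __unicode__ up on the object it is handed, so
a class hands back a method with no instance to bind to. SomeClass and
call_hook are hypothetical names.

```python
class SomeClass(object):
    def __unicode__(self):
        return u'some instance'


def call_hook(obj):
    """Hypothetical stand-in for a lookup done on the object itself."""
    return obj.__unicode__()


# On an instance the method is bound, so the call works.
instance_result = call_hook(SomeClass())

try:
    call_hook(SomeClass)  # no instance is supplied for self
    class_failed = False
except TypeError:
    class_failed = True

# The workaround from the text: convert the class with str() first.
safe_text = u"%s" % str(SomeClass)
```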

I'm opposed to 1) because a full workaround means doing something like:

  def unicode_exception(e):
    # Build the text from e.args directly so that no implicit
    # encoding through str(e) takes place.
    if len(e.args) == 0:
      return u""
    elif len(e.args) == 1:
      return unicode(e.args[0])
    else:
      return unicode(e.args)

and then using unicode_exception(...) instead of unicode(...) whenever
one needs to get a unicode value for an exception.

The issue doesn't affect Python 3.0 where unicode(...) is replaced by str(...).

[1] http://bugs.python.org/issue1551432
[2] http://bugs.python.org/issue2517

Schiavo
Simon

