unicode in exception traceback

WaterWalk toolmaster at 163.com
Thu Apr 3 10:33:33 EDT 2008


On Apr 3, 5:56 pm, Peter Otten <__pete... at web.de> wrote:
> WaterWalk wrote:
> > Hello. I just found that on Windows, when an exception is raised and
> > traceback info is printed on STDERR, all the characters printed are
> > just plain ASCII. Take the unicode character u'\u4e00' for example. If
> > I write:
>
> > print u'\u4e00'
>
> > If the system locale is "PRC China", then this statement will print
> > this character as a single Chinese character.
>
> > But if I write: assert u'\u4e00' == 1
>
> > An AssertionError will be raised and traceback info will be put to
> > STDERR, while this time, u'\u4e00' will simply be printed just as
> > u'\u4e00', several ASCII characters instead of one single Chinese
> > character. I use the coding directive comment (# -*- coding: utf-8 -*-)
> > on the first line of the Python source file and also save it in utf-8
> > format, but the problem remains.
>
> > What's worse, if I write Chinese characters directly in a unicode
> > string, they appear unreadable when the traceback info is printed;
> > that is, they show up as something else. It's like printing DBCS
> > characters when the locale is incorrect.
>
> > I think this problem isn't unique. The same problem may recur with
> > other East Asian characters.
>
> > Is there any workaround to it?
>
> Pass a byte string but make some effort to use the right encoding:
>
> >>> import sys
> >>> assert False, u"\u4e00".encode(sys.stdout.encoding or "ascii", "xmlcharrefreplace")
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AssertionError: 一
>
> You might be able to do this in the except hook:
>
> $ cat unicode_exception_message.py
> import sys
>
> def eh(etype, exc, tb, original_excepthook=sys.excepthook):
>     message = exc.args[0]
>     if isinstance(message, unicode):
>         exc.args = (message.encode(sys.stderr.encoding or "ascii", "xmlcharrefreplace"),) + exc.args[1:]
>     return original_excepthook(etype, exc, tb)
>
> sys.excepthook = eh
>
> assert False, u"\u4e00"
>
> $ python unicode_exception_message.py
> Traceback (most recent call last):
>   File "unicode_exception_message.py", line 11, in <module>
>     assert False, u"\u4e00"
> AssertionError: 一
>
> If python cannot figure out the encoding this falls back to ascii with
> xml charrefs:
>
> $ python unicode_exception_message.py 2>tmp.txt
> $ cat tmp.txt
> Traceback (most recent call last):
>   File "unicode_exception_message.py", line 11, in <module>
>     assert False, u"\u4e00"
> AssertionError: &#19968;
>
> Note that I've not done any tests; e.g. if there are exceptions with
> immutable .args the except hook itself will fail.
>
> Peter
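
The caveat above about .args could be handled along these lines. This is only a rough sketch, not a tested replacement for the hook quoted above: it skips exceptions raised without arguments and ignores any failure to rewrite .args, so the hook itself should not raise. The names encode_first_arg and safe_excepthook are made up for illustration, and the version check is an assumption that the rewrite is only wanted on Python 2, where the problem in this thread occurs.

```python
import sys

text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def encode_first_arg(args, encoding):
    """Return a copy of args whose text first element is encoded to bytes."""
    if not args or not isinstance(args[0], text_type):
        return args  # no arguments, or the first one is not text
    encoded = args[0].encode(encoding or "ascii", "xmlcharrefreplace")
    return (encoded,) + tuple(args[1:])


def safe_excepthook(etype, exc, tb, original_excepthook=sys.excepthook):
    try:
        encoding = getattr(sys.stderr, "encoding", None)
        exc.args = encode_first_arg(exc.args, encoding)
    except Exception:
        pass  # leave the exception untouched rather than fail inside the hook
    return original_excepthook(etype, exc, tb)


# Assumption: only install the hook on Python 2, where tracebacks
# cannot display unicode messages directly.
if sys.version_info[0] == 2:
    sys.excepthook = safe_excepthook
```

Keeping the encoding step in its own function makes it possible to check the empty-args and non-text cases without raising a real exception.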

Thanks. My brief test indicates that it works. I'll try it in more
situations.
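
For reference, a small sketch of what the "xmlcharrefreplace" handler used in the hook does when the target encoding is ASCII, next to two other built-in error handlers. The variable names are only illustrative.

```python
text = u"\u4e00"  # the character from this thread

# Encode to ASCII with each built-in handler that substitutes
# characters the target encoding cannot represent.
results = dict(
    (handler, text.encode("ascii", handler))
    for handler in ("replace", "backslashreplace", "xmlcharrefreplace")
)

# results["replace"]           == b"?"
# results["backslashreplace"]  == b"\\u4e00"
# results["xmlcharrefreplace"] == b"&#19968;"
```

Unlike "replace", the character reference keeps enough information to recover the original character from the redirected output.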


