unicode in exception traceback

WaterWalk toolmaster at 163.com
Thu Apr 3 04:45:00 EDT 2008


Hello. I just found that on Windows, when an exception is raised and
the traceback is printed to STDERR, all the characters printed are
plain ASCII. Take the unicode character u'\u4e00' for example. If
I write:

print u'\u4e00'

If the system locale is "PRC China", then this statement will print
this character as a single Chinese character.

But if I write:

assert u'\u4e00' == 1

An AssertionError is raised and the traceback is written to STDERR,
but this time u'\u4e00' is printed literally as u'\u4e00', several
ASCII characters instead of one single Chinese character. I put the
coding directive comment (# -*- coding: utf-8 -*-) on the first line
of the Python source file and also saved the file as UTF-8, but the
problem remains.
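
What I see looks consistent with the traceback echoing the source line
as it was typed, rather than the value of the string. Here is a minimal
sketch to check that, assuming a reasonably recent Python; the helper
name traceback_for and the temporary file are only for illustration and
are not part of my real script:

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
import traceback

# The failing statement exactly as typed in the file: the escape
# sequence is six ASCII characters here, not one Chinese character.
SOURCE = u"assert u'\\u4e00' == 1\n"


def traceback_for(source):
    """Run `source` from a temporary file and return the traceback text.

    Hypothetical helper: a real file is used so that the traceback
    module can look up the source line it reports.
    """
    fd, path = tempfile.mkstemp(suffix='.py')
    os.close(fd)
    try:
        with io.open(path, 'w', encoding='utf-8') as f:
            f.write(source)
        try:
            exec(compile(source, path, 'exec'), {})
        except AssertionError:
            # Format while the file still exists on disk.
            return traceback.format_exc()
    finally:
        os.remove(path)


report = traceback_for(SOURCE)
print(report)
```

If this reading is right, the report contains the escape sequence as
written in the file, and the Chinese character itself never appears.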

What's worse, if I write Chinese characters directly in a unicode
string, they become unreadable when the traceback is printed, that
is, they show up as different characters. It's like printing DBCS
characters when the locale is incorrect.
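
My guess, and it is only a guess, is that the UTF-8 bytes of the source
line reach the console unchanged and are then interpreted in the
console's double-byte code page. A small sketch of that effect, using
Python's 'gbk' codec as an assumed stand-in for the "PRC China" code
page:

```python
# -*- coding: utf-8 -*-
original = u'\u4e00'

# The bytes as they would be stored in a source file saved as UTF-8.
utf8_bytes = original.encode('utf-8')

# What a reader expecting a double-byte encoding makes of those bytes.
# 'gbk' is an assumption standing in for the console code page;
# 'replace' keeps the sketch from raising on bytes GBK cannot map.
misread = utf8_bytes.decode('gbk', 'replace')

print(repr(utf8_bytes))
print(repr(misread))
```

The misread text is no longer the original character, which matches
the kind of garbage I get in the traceback.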

I think this problem isn't unique to Chinese. The same thing may
happen with other East Asian characters.

Is there any workaround for it?


