Chinese language support of Python?

Sun Jul 7 08:26:08 EDT 2002

I installed ChineseCodecs1.2.0 into lib/encodings. It converts GB2312
(simplified Chinese) to Unicode, so I can now write this:

root.title(unicode('中文',"eucgb2312_cn"))
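
For readers on a current interpreter, here is a minimal sketch of the same decoding step. It assumes Python 3, where `bytes.decode` takes the place of the Python 2 `unicode(...)` call, and it uses the standard-library codec name `gb2312` rather than the `eucgb2312_cn` name from the ChineseCodecs package. The Tkinter call is left out so the snippet runs without a display.

```python
# Assumes Python 3; 'gb2312' is the standard-library codec name,
# not the 'eucgb2312_cn' name provided by ChineseCodecs.
raw = b'\xd6\xd0\xce\xc4'      # GB2312 bytes for the two characters of the title
title = raw.decode('gb2312')   # Python 3 counterpart of unicode(raw, 'gb2312')

# The decoded text is the same value as the \u escape form.
print(title == '\u4e2d\u6587')
print(len(title))
```

The resulting `title` is ordinary text and could be passed to `root.title(...)` as in the line above.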

Great! I can now put raw Chinese strings in my source!
Before that, I also tried ChinesePython, a Chinese-translated version
of Python 2.1, which even allowed this:

root.title('中文') #directly put Chinese in normal string! The Best!!

Unfortunately, it translates all prompts and error messages into BIG5
(Traditional Chinese), which I cannot read in my GB Windows
environment, and no GB version is available yet. I had to uninstall
it and go back to the first solution.

Worse, IDLE, the "Python GUI" utility, cannot handle Chinese
strings in source files (bit 7 seems to be stripped). This is true of
both the Python 2.2.1 from python.org and the ChinesePython version
above. If I open my source in IDLE and save it, every Chinese string
is changed, which means I cannot use it even for editing. So how can I
debug the script in a GUI?
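
To illustrate the suspected cause, here is a small sketch of what clearing bit 7 would do to GB2312 text. It assumes Python 3, and `strip_high_bit` is a hypothetical helper written for this illustration. It is not IDLE code, and whether IDLE actually did this is only the guess stated above.

```python
# Assumes Python 3. strip_high_bit is a hypothetical helper for illustration;
# it is not taken from IDLE.
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a 7-bit-only tool would."""
    return bytes(b & 0x7F for b in data)

original = '\u4e2d\u6587'.encode('gb2312')   # two Chinese characters, 4 bytes
damaged = strip_high_bit(original)

print(original)   # b'\xd6\xd0\xce\xc4'
print(damaged)    # b'VPND'
```

Every GB2312 byte has its high bit set, so after stripping, the text is indistinguishable from ordinary ASCII letters. That would match the symptom of every Chinese string changing on save.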

Thanks!
Leon Wang

Boudewijn Rempt <boud at valdyas.org> wrote in message news:<3d27de14$0$94898$e4fe514c at dreader3.news.xs4all.nl>...
> Leon Wang wrote:
> 
> > Hi, I got Chinese displayed correctly in the window title without
> > changing the default encoding in site.py, by using:
> > 
> > root.title(u'\u4e2d\u6587')
> > 
> > But I still cannot put Chinese directly into the source as a string.
> > I cannot live with so many \u... escapes for a whole Chinese sentence
> > or paragraph; they are impossible to read and edit :(
> > However, I can print a Chinese string (a normal string, without the
> > u prefix or \u codes) in the console with the command-line
> > python.exe. How can I get Tkinter to accept that?
> 
> I don't think that's going to work (caveat: I use PyQt which has different
> conventions). If you absolutely want to have Chinese characters in your
> source files*, you can do something like the following**:
> 
> root.title(unicode('伱好?', 'utf-8'))
> 
> Note that you _will_ have to construct a unicode object, not an ordinary
> string, if you want the system to understand what you mean: ordinary
> strings are just containers for bytes, one character per byte.
> 
> You can find out which encodings are available by inspecting the 
> python/lib/encodings directory (or, python\lib\encodings): you can use
> any encoding instead of the 'utf-8'. Of course, the string must then
> be in the right encoding, too.
> 
> There are some errors in my handling of this topic in my book, but it might 
> still be useful to you:
> 
> http://www.opendocspublishing.com/pyqt/index.lxp?lxpwrap=c2029%2ehtm
> 
> errata:
> 
> http://www.valdyas.org/python/book.html
> 
> The paper version has nice pictures that are quite useful in this chapter.
> 
> * Actually I still think it would be great to be able to have sourcefiles
> in utf-8, not limited to unicode strings. I want to type:
> 
> def 印刷():
>     pass
> 
> That this would make my source code unreadable for a lot of other people, tant
> pis, I still would like the power. Just as I want the power to do a quick
> sys.setAppDefaultEncoding('utf-8') to make sure this application sees all
> its strings as encoded in utf-8.
> 
> ** Note that this posting is encoded in utf-8. If you see gibberish instead
> of a friendly greeting, then either the message is mangled, or your 
> newsreader can't handle the encoding, or you don't have the fonts to show
> Chinese.
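
The reply's points about byte strings versus unicode objects, and about matching the codec to the bytes, can be sketched as follows. This assumes Python 3, where `str` is text and `bytes` is raw data, so it is not the Python 2 code from the thread. Listing codecs through `pkgutil` is one possible stand-in for browsing the lib/encodings directory by hand.

```python
# Assumes Python 3, where str holds text and bytes holds raw data.
import encodings
import pkgutil

text = '\u4e2d\u6587'             # two characters, written with \u escapes

as_utf8 = text.encode('utf-8')    # the same text as UTF-8 bytes
as_gb = text.encode('gb2312')     # the same text as GB2312 bytes

# Characters and bytes are counted differently.
print(len(text), len(as_utf8), len(as_gb))   # 2 6 4

# Decoding round-trips only when the codec matches the bytes.
round_trip_ok = (as_utf8.decode('utf-8') == text
                 and as_gb.decode('gb2312') == text)

# Decoding GB2312 bytes as UTF-8 fails for this input.
try:
    as_gb.decode('utf-8')
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

# Rough equivalent of looking into the lib/encodings directory.
codec_modules = sorted(m.name for m in pkgutil.iter_modules(encodings.__path__))
print('gb2312' in codec_modules, 'utf_8' in codec_modules)
```

The byte counts show why "one character per byte" breaks down for Chinese text: the same two characters take six bytes in UTF-8 and four in GB2312.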