I like Unicode more than I used to...

SUZUKI Hisao suzuki611 at oki.com
Wed Feb 26 03:20:07 EST 2003


In message <200302251213.52956.aleaxit at yahoo.com>, Alex Martelli wrote:
[...]
> There will always be "specificity", depending on the way the
> strings are sourced or built.  I've snipped the rest of your mail,
> on which we agree, about literals using specified encodings in
> your sources (by whatever means).  But the crux of our abiding
> disagreement is: I do not think relying on the default encoding
> on sites you do not control is ever appropriate.  And there is
> no case of "universality" which would make it appropriate.

If you don't mind my saying so, your argument has been largely
abstract and not concrete enough.

In Japan, various encodings are in everyday use.  The (default)
encodings of the most popular terminal programs are:
   euc-jp for kterm on X11
   utf-8 for xterm -u8 on X11 (*1)
   utf-8 for Terminal on Mac OS X Jaguar
   utf-8 for MuTerminal on BeOS
   cp932 (a variant of shift_jis) for command prompt on Windows
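
To make the difference concrete, here is a small sketch that encodes
the same three-character string with each of the encodings listed
above.  It is written for a current Python 3 interpreter, not the
Python of this thread, and the names SAMPLE, ENCODINGS and encoded are
only illustrative.

```python
# Sketch only: Python 3 syntax; all names here are illustrative.
SAMPLE = '\u65e5\u672c\u8a9e'   # the word "nihongo" (Japanese language)
ENCODINGS = ('euc-jp', 'utf-8', 'cp932')

# Encode the same text once per terminal encoding.
encoded = {name: SAMPLE.encode(name) for name in ENCODINGS}

for name in ENCODINGS:
    # Each terminal expects a different byte sequence for the same text.
    print(name, encoded[name])
```

Bytes written for one of these terminals are not readable on the
others, so a script that hard-codes one encoding is tied to one
platform.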

It is handy to write a script generically if you are going to
use it across various platforms (often in the same room).
"Universality" is not a Holy Grail, just a daily need.

For example, you can read iso-2022-jp text (Japanese e-mail is
written in iso-2022-jp almost without exception) in any Japanese
terminal with the simple script below.  The bare encode() call uses
the default encoding, which each site sets to match its terminal:
------------------------------------------------------------
#!/bin/sh -
"exec" "python" "-O" "$0" "$@"

import fileinput
for line in fileinput.input():
    print line.decode('iso-2022-jp').encode(),
------------------------------------------------------------
N.B. BeOS does not have /usr/bin/env; it has /bin/env, so a
     "#!/usr/bin/env python" line would fail there.  The sh/exec
     header above works on both BeOS and Unix.
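
For readers on a current Python 3 interpreter, where encode() with no
argument always means UTF-8 and there is no site-configurable default,
a rough equivalent of the script above might look like the sketch
below.  This is an assumption-laden adaptation, not the original
method: the helper names recode_line, terminal_encoding and main are
made up for illustration, and taking the target encoding from
sys.stdout or the locale is my stand-in for the site default.

```python
import locale
import sys


def recode_line(raw, target_encoding):
    """Decode one iso-2022-jp line (bytes) and re-encode it for the terminal."""
    text = raw.decode('iso-2022-jp')
    # Characters the terminal encoding cannot represent become '?'.
    return text.encode(target_encoding, errors='replace')


def terminal_encoding():
    # Assumption: stdout's encoding stands in for the site default;
    # fall back to the locale's preferred encoding if it is unset.
    return getattr(sys.stdout, 'encoding', None) or locale.getpreferredencoding(False)


def main(paths):
    enc = terminal_encoding()
    for path in paths:
        # Read bytes, since the input is not in the terminal's encoding.
        with open(path, 'rb') as f:
            for raw in f:
                sys.stdout.buffer.write(recode_line(raw, enc))


if __name__ == '__main__':
    main(sys.argv[1:])
```

The script itself still names no terminal encoding; only the source
of the "default" has changed.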


(*1) If you have a recent version of X11 (for example, the one that
comes with Cygwin), you can display utf-8 text that includes
Japanese characters.  Put the following in your ~/.Xdefaults:

XTerm*utf8: 1
XTerm*font: -misc-fixed-medium-r-normal--18-120-100-100-c-90-iso10646-1

-- SUZUKI Hisao




