[Python-3000] Workaround for py3k build problem on CJK MacOS X?

Hye-Shik Chang hyeshik at gmail.com
Fri Aug 15 15:41:53 CEST 2008


Currently, Python 3.0 fails to build on MacOS X with CJK locales
because it has no codecs for the preferred encodings, such as
MacJapanese or MacKorean, returned by locale.getpreferredencoding().
We have a patch in issue #1276 that implements those codecs and
resolves the problem, but it's too late to put the new code into the
upcoming release. So I propose a few temporary workarounds that
could be incorporated in 3.0.

1) Add temporary encoding aliases.

Every Macintosh CJK encoding is based on a legacy encoding that
Python already supports. They are virtually identical to those
legacy encodings except for a few Apple extension code points, which
are rarely used. Just adding aliases would resolve the problem
immediately, although it's a little bit incorrect.
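
A minimal sketch of the alias idea at the Python level, not the actual
patch. It registers a codec search function instead of editing the
alias table. The Mac encoding names and their pairing with base codecs
are assumptions made for illustration.

```python
import codecs

# Assumed Mac encoding names mapped to the legacy codecs they are based on.
# Keys use underscores; the search function normalizes hyphens itself.
MAC_CJK_BASE = {
    "x_mac_japanese": "shift_jis",
    "x_mac_korean": "euc_kr",
    "x_mac_simp_chinese": "gb2312",
    "x_mac_trad_chinese": "big5",
}


def mac_cjk_search(name):
    """Codec search function: resolve a Mac CJK name to its base codec."""
    base = MAC_CJK_BASE.get(name.lower().replace("-", "_"))
    if base is None:
        return None  # not one of ours; let other search functions try
    return codecs.lookup(base)


codecs.register(mac_cjk_search)

# Apple extension code points are not handled; ordinary text simply
# goes through the base codec.
print(codecs.lookup("x-mac-korean").name)
```

Because the alias resolves straight to the base codec, any text that
uses an Apple extension code point would be rejected or converted
wrongly, which is the "little bit incorrect" part.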


2) Force a non-CJK encoding in the build process.

The preferred encoding comes from an environment variable,
__CF_USER_TEXT_ENCODING. We can easily change the encoding to U.S.
English by setting the variable in the Makefile. But
locale.getpreferredencoding() would still return an unsupported
encoding in the installed runtime on MacOS X with CJK locales. That
problem has been reported several times by users of applications
that consult the preferred encoding, such as Django.
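
As a rough illustration of what the override amounts to, the sketch
below builds an environment with the variable set and runs a child
process under it. The value layout (hex uid, then script code, then
region code, with 0 taken to mean the U.S. English default) is an
assumption, and the helper name is hypothetical.

```python
import os
import subprocess
import sys


def build_env_with_text_encoding(base_env, uid, script_code=0, region_code=0):
    """Return a copy of base_env with __CF_USER_TEXT_ENCODING overridden.

    The "uid:script:region" layout is assumed, not taken from the post.
    """
    env = dict(base_env)  # copy, so the caller's environment is untouched
    env["__CF_USER_TEXT_ENCODING"] = f"{uid:#x}:{script_code}:{region_code}"
    return env


if __name__ == "__main__":
    uid = getattr(os, "getuid", lambda: 0)()  # os.getuid is POSIX-only
    env = build_env_with_text_encoding(os.environ, uid)
    # A build step would be launched like this child, with env=env.
    child = "import os; print(os.environ['__CF_USER_TEXT_ENCODING'])"
    subprocess.run([sys.executable, "-c", child], env=env, check=True)
```

This only affects processes started with the modified environment, so
it fixes the build step but not an installed interpreter.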


3) Add a Mac-script-to-legacy-encoding mapping in _locale.

There's a function in _localemodule.c, mac_getscript(), that converts
a Macintosh encoding name to a Python codec name. We could add
temporary mappings from the Macintosh encodings to their legacy base
encodings. Like 1), this is not quite correct, but it should cause no
problems in practical use.
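
The real change would be in C, but the mapping can be modeled in
Python as a rough sketch. The function name and the Mac encoding names
below are hypothetical; only the idea of translating a Mac name to its
legacy base codec comes from the proposal.

```python
import codecs

# Hypothetical model of the proposed table; the names are assumptions.
MAC_SCRIPT_TO_LEGACY = {
    "x-mac-japanese": "shift_jis",
    "x-mac-korean": "euc_kr",
    "x-mac-simp-chinese": "gb2312",
    "x-mac-trad-chinese": "big5",
}


def legacy_codec_for(mac_name):
    """Translate a Mac encoding name to its legacy base codec name.

    Names that are not in the table are returned unchanged.
    """
    return MAC_SCRIPT_TO_LEGACY.get(mac_name.lower(), mac_name)


# The translated name is one Python can already load.
print(codecs.lookup(legacy_codec_for("x-mac-japanese")).name)
```

Unlike 1), the translation happens before the name reaches the codec
machinery, so no new codec names become visible to user code.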


4) Add a FAQ entry to the documentation.

As an alternative to code changes, we can just add a FAQ entry to the
documentation about the problem on MacOS X with CJK locales.


What do you think?



Hye-Shik

