[Python-Dev] Hindsight on Py_UNICODE_WIDE?

"Martin v. Löwis" martin at v.loewis.de
Sat Mar 24 11:45:54 CET 2007


> 1.  In hindsight, what do you think about PEP 261, the Py_UNICODE_WIDE
> build option?  On balance, has this been good, bad, or indifferent?
> What's good/bad about it?

Unlike MAL, I think it was a good choice, primarily for political reasons.
People kept complaining that Python doesn't "really" support Unicode,
and they have gone silent since this change was made. These days, Linux
distributions always ship Python in UCS-4 mode (triggered by the fact
that wchar_t is also UCS-4 on Linux); Windows distributions always
choose UCS-2 (for the same reason: wchar_t is two bytes there).
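
For anyone who wants to check which mode a given interpreter was built
in, here is a minimal sketch. It relies only on sys.maxunicode; the
helper name unicode_build_width is made up for illustration. On an
interpreter that has no narrow/wide build option at all, it will
simply always report UCS-4.

```python
import sys


def unicode_build_width():
    """Report the build mode (hypothetical helper, for illustration only)."""
    # A narrow (UCS-2) build cannot represent code points above U+FFFF
    # in a single unit, so sys.maxunicode is 0xFFFF there.
    return "UCS-4" if sys.maxunicode > 0xFFFF else "UCS-2"


print(unicode_build_width())
```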

If it weren't for Windows, I would suggest that always using UCS-4
is the simplest solution.

FWIW, Smalltalk implementations (VisualWorks and Squeak in particular)
seem to go the "multiple internal representations" route, even though
they have mutable strings. If you put a wider character into a smaller
string, the smaller string transparently becomes wider.
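
To make the idea concrete, here is a rough Python sketch of such a
mutable string. It is only an illustration of the technique, not how
VisualWorks or Squeak actually implement it; the class name
WideningString and both helper functions are invented for this example.

```python
from array import array


def _typecode_for(width):
    # Pick the smallest unsigned array typecode with at least `width` bytes.
    for code in "BHIL":
        if array(code).itemsize >= width:
            return code
    raise ValueError(width)


def _width_needed(code_point):
    # Logical storage width (in bytes) required for one code point.
    if code_point <= 0xFF:
        return 1
    if code_point <= 0xFFFF:
        return 2
    return 4


class WideningString:
    """Hypothetical mutable string that widens its storage on demand."""

    def __init__(self, text=""):
        self.width = 1
        self._units = array(_typecode_for(1))
        for ch in text:
            self.append(ch)

    def _widen(self, width):
        # Re-encode the existing contents only when a wider unit is needed.
        if width > self.width:
            self._units = array(_typecode_for(width), self._units.tolist())
            self.width = width

    def append(self, ch):
        self._widen(_width_needed(ord(ch)))
        self._units.append(ord(ch))

    def __setitem__(self, index, ch):
        self._widen(_width_needed(ord(ch)))
        self._units[index] = ord(ch)

    def __str__(self):
        return "".join(map(chr, self._units))


s = WideningString("abc")       # stored with 1-byte units
s[1] = "\u20ac"                 # storage widens to 2-byte units
s.append("\U0001F600")          # storage widens to 4-byte units
```

The caller never sees the change of representation; only the width
attribute reveals that the storage was re-encoded.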

HTH,
Martin

