Multibyte Character Support for Python

Stephen J. Turnbull stephen at
Sat May 11 08:31:19 EDT 2002

>>>>> "Martin" == Martin v Löwis <loewis at> writes:

    >> 1. In Python 3.0, the input character set is Unicode - either
    >> UTF-16 or UTF-8 (I'm not prepared to make a solid argument one
    >> way or the other at this time.)

    Martin> Actually, PEP 263 gives a much wider choice; consider this
    Martin> aspect solved.

Some of us consider the wider choice to be a severe defect of PEP 263.

That doesn't mean we think that Python should prohibit writing
programs in arbitrary user-specified encodings.  Only that the
facility for transforming a non-Unicode program into Unicode should be
provided as a standard library facility, rather than part of the
language.  The lexical properties of the language would be specified
in terms of Unicode.
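
To make the division of labor concrete, here is a minimal sketch in
present-day Python. The helper name compile_encoded_source is
hypothetical (it is not part of PEP 263 or the standard library), and
the choice of EUC-JP is only an example. The library layer decodes the
bytes; the compiler itself only ever sees Unicode text.

```python
def compile_encoded_source(raw_bytes, encoding, filename="<encoded>"):
    """Hypothetical library helper: decode source bytes, then compile.

    The encoding is handled here, outside the language proper; compile()
    receives a Unicode string and never needs to know the original encoding.
    """
    text = raw_bytes.decode(encoding)      # library step: bytes -> Unicode
    return compile(text, filename, "exec")  # language step: Unicode only


# Example: a one-line program stored as EUC-JP bytes.
# The escapes below spell the Japanese word "konnichiwa" in hiragana.
source_bytes = 'greeting = "\u3053\u3093\u306b\u3061\u306f"\n'.encode("euc_jp")

namespace = {}
exec(compile_encoded_source(source_bytes, "euc_jp"), namespace)
print(namespace["greeting"])
```

The point of the sketch is only that the decode step can live in a
library call that the user makes explicitly, so the lexer's rules need
to be stated for Unicode alone.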

Institute of Policy and Planning Sciences
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
 My nostalgia for Icon makes me forget about any of the bad things.  I don't
have much nostalgia for Perl, so its faults I remember.  Scott Gilbert
