Multibyte Character Support for Python
Stephen J. Turnbull
stephen at xemacs.org
Thu May 9 06:05:37 EDT 2002
>>>>> "Martin" == Martin v Loewis <martin at v.loewis.de> writes:
Martin> It would break introspective tools who suddenly find
Martin> Unicode objects in attribute dictionaries.
What Unicode objects? They find ordinary strings that are mandated to
be encoded in UTF-8. The tools only need to be 8-bit clean, and not
do anything that involves the assumption that #characters == #octets.
And _that_ only affects people using non-ASCII identifiers, which
might be OK since it is an extension.
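
To make the "#characters == #octets" point concrete, here is a minimal
sketch. It is written for present-day Python 3, where bytes stands in for
the 8-bit strings I mean above; the identifier names are made up for
illustration only.

```python
# Hypothetical identifier names, chosen only for illustration.
ascii_name = "spam"
non_ascii_name = "größe"

for name in (ascii_name, non_ascii_name):
    octets = name.encode("utf-8")
    # For ASCII names the two counts agree; for non-ASCII names they differ.
    print(repr(name), "characters:", len(name), "octets:", len(octets))

# A tool that is merely 8-bit clean (stores and returns the octets
# untouched) round-trips the UTF-8 name without knowing about Unicode.
attrs = {non_ascii_name.encode("utf-8"): 1}
key = next(iter(attrs))
round_tripped = key.decode("utf-8")
```

Only code that counts, slices, or pads by length has to care about the
difference; code that just passes the octets along does not.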
We do the migration to Unicode objects later, at the same time that
you would have done it anyway. In the meantime, this fits right in
with the kind of "backwards compatibility" that PEP 263 is all about.
Except that people who want to look at non-ASCII identifiers must
work in UTF-8 environments and not their local encoding. But that is
OK because non-ASCII identifiers are an extension.
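
A rough sketch of what that restriction looks like in practice, again in
present-day Python 3. Latin-1 here is only an assumed example of a "local
encoding", and the identifier is hypothetical.

```python
# Hypothetical identifier, stored as UTF-8 octets as proposed above.
stored = "café".encode("utf-8")

# Viewed in a UTF-8 environment the name comes out as written.
shown_utf8 = stored.decode("utf-8")

# Viewed through an assumed Latin-1 local encoding, the same octets
# are misread: each octet of the two-octet sequence becomes its own
# character.
shown_locally = stored.decode("latin-1")

print(shown_utf8)
print(shown_locally)
```

Nothing is lost in the second case, since the octets are unchanged; the
name is just unreadable until it is viewed as UTF-8.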
--
Institute of Policy and Planning Sciences http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
My nostalgia for Icon makes me forget about any of the bad things. I don't
have much nostalgia for Perl, so its faults I remember. Scott Gilbert c.l.py