[Python-Dev] PEP 393 Summer of Code Project

Guido van Rossum guido at python.org
Fri Aug 26 00:54:03 CEST 2011


On Wed, Aug 24, 2011 at 3:06 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> Excuse me for believing the fine 3.2 manual that says
> "Strings contain Unicode characters." (And to a naive reader, that implies
> that string iteration and indexing should produce Unicode characters.)

The naive reader also doesn't know the difference between characters,
code points and code units. It's the advanced, Unicode-aware reader
who is confused by this phrase in the docs. It should say code units;
or perhaps code units for narrow builds and code points for wide
builds. With PEP 393 we can unconditionally say code points, which is
much better. We should try to remove our use of "characters" -- or
else we should *define* our use of the term "characters" as "what the
Unicode standard calls code points".
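
To make the distinction concrete, here is a minimal sketch, assuming
Python 3.3 or later (where PEP 393 applies). The helper names
utf16_code_units and code_points are made up for illustration and are
not part of any API. The UTF-16 count approximates what a narrow build
would have reported for len().

```python
def utf16_code_units(s):
    # Each UTF-16 code unit is 2 bytes, so halve the encoded length.
    # (Hypothetical helper: mimics len() on a narrow build.)
    return len(s.encode("utf-16-le")) // 2


def code_points(s):
    # With PEP 393, len() counts code points unconditionally.
    return len(s)


clef = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, outside the BMP
accented = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

# One code point, but two UTF-16 code units (a surrogate pair).
print(code_points(clef), utf16_code_units(clef))

# Two code points, even though a reader sees a single "character".
print(code_points(accented), utf16_code_units(accented))
```

The second case is why "characters" is a risky word: even with code
point indexing, one user-perceived character can span several code
points.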

-- 
--Guido van Rossum (python.org/~guido)

