[Python-3000] PEP 3131 roundup

Stephen J. Turnbull stephen at xemacs.org
Wed Jun 6 07:28:36 CEST 2007


Steve Howell writes:

 > So I'm +1 on the unquoted third option, that canonically
 > equivalent, but differently encoded, Unicode characters are allowed
 > yet treated as different.
 > 
 > Am I stretching the analogy too far?

Yes.  By definition, that is nonconformant to the Unicode standard.
Canonically equivalent sequences are *identical characters* in
Unicode.  The difference you are talking about is equivalent to the
differences among "7", "07", and "0x7" as C numeric literals.  They
look different, but they have identical semantics in the program.
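
To make the point concrete, here is a minimal Python sketch using the
standard unicodedata module.  The helper name canonically_equivalent
and the sample character are assumptions chosen for illustration, not
anything taken from the PEP.  It compares two encodings of the same
character: as raw code point sequences they differ, but they match
once both are brought to a common normalization form.

```python
import unicodedata

# Two encodings of "e with acute accent" (sample character chosen for illustration).
precomposed = "\u00e9"   # single code point, as in NFC
decomposed = "e\u0301"   # "e" followed by a combining acute accent, as in NFD


def canonically_equivalent(a, b):
    """Hypothetical helper: compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# A raw comparison sees two different code point sequences.
print(precomposed == decomposed)                         # False
# After normalization they are the same character.
print(canonically_equivalent(precomposed, decomposed))   # True
```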

Pragmatically, if you have one editor that normally produces NFD and
another that normally produces NFC, programs written with them will
not be link-compatible under your proposal, yet both editors will
present the user with identical displays.
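
A rough sketch of that failure mode, assuming a toy symbol table
implemented as a Python dict.  The functions define and lookup, the
normalize flag, and the identifier used are all hypothetical; this is
not how any real compiler or linker is implemented, only an
illustration of what happens when names are compared without
normalization.

```python
import unicodedata


def define(table, name, value, normalize=False):
    """Hypothetical: record a name in a toy symbol table."""
    key = unicodedata.normalize("NFC", name) if normalize else name
    table[key] = value


def lookup(table, name, normalize=False):
    """Hypothetical: resolve a name, returning None if it is unknown."""
    key = unicodedata.normalize("NFC", name) if normalize else name
    return table.get(key)


# The same identifier as saved by two editors (illustrative sample name).
name_from_nfc_editor = "caf\u00e9"
name_from_nfd_editor = "cafe\u0301"

# Names treated as different: the definition and the reference do not match.
raw_table = {}
define(raw_table, name_from_nfc_editor, 1)
print(lookup(raw_table, name_from_nfd_editor))                         # None

# Names normalized before comparison: the reference resolves.
normalized_table = {}
define(normalized_table, name_from_nfc_editor, 1, normalize=True)
print(lookup(normalized_table, name_from_nfd_editor, normalize=True))  # 1
```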


