[Python-Dev] PEP 393 Summer of Code Project

Terry Reedy tjreedy at udel.edu
Sat Aug 27 05:51:30 CEST 2011



On 8/26/2011 8:42 PM, Guido van Rossum wrote:
> On Fri, Aug 26, 2011 at 3:57 PM, Terry Reedy<tjreedy at udel.edu>  wrote:

>> My impression is that a UTF-16 implementation, to be properly called such,
>> must do len and [] in terms of code points, which is why Python's narrow
>> builds are called UCS-2 and not UTF-16.
>
> I don't think anyone else has that impression. Please cite chapter and
> verse if you really think this is important. IIUC, UCS-2 does not
> allow surrogate pairs, whereas Python (and Java, and .NET, and
> Windows) 16-bit strings all do support surrogate pairs. And they all

For that reason, I think UTF-16 is a better term than UCS-2 for narrow
builds (whether or not the above impression is true).
But Marc Lemburg disagrees.
http://mail.python.org/pipermail/python-dev/2010-November/105751.html
The 2.7 docs still refer to ucs2 builds, as is his wish.
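
To make the code unit versus code point distinction concrete, here is a
minimal sketch. It assumes Python 3 on a wide or PEP 393 build, where len()
counts code points; the helper names utf16_code_units and surrogate_pair are
made up for illustration and are not part of any Python API. A narrow build
would report the code unit count from len() directly.

```python
import sys


def utf16_code_units(s):
    """Count the 16-bit code units needed to encode s as UTF-16."""
    # utf-16-le emits no byte order mark, so every 2 bytes is one code unit.
    return len(s.encode("utf-16-le")) // 2


def surrogate_pair(cp):
    """Split a non-BMP code point into its (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = cp - 0x10000
    # Top 10 bits go into the high surrogate, bottom 10 into the low one.
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


if __name__ == "__main__":
    s = "a\U00010000"  # one BMP character plus one non-BMP character
    print("sys.maxunicode:", hex(sys.maxunicode))
    print("len(s):", len(s))
    print("UTF-16 code units:", utf16_code_units(s))
    print("surrogates:", [hex(u) for u in surrogate_pair(0x10000)])
```

The string above needs three UTF-16 code units for two code points, which is
the gap between what a narrow build counts and what a code point based len
would count.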

---
Terry Jan Reedy

