[Python-Dev] len(chr(i)) = 2?

Raymond Hettinger raymond.hettinger at gmail.com
Mon Nov 22 19:13:30 CET 2010


On Nov 22, 2010, at 2:48 AM, Stephen J. Turnbull wrote:

> Raymond Hettinger writes:
> 
>> Neither UTF-16 nor UCS-2 is exactly correct anyway.
> 
> From a standards lawyer point of view, UCS-2 is exactly correct, 

You're twisting yourself into definitional knots.

Any explanation we give users needs to let them know two things:
* that we cover the entire range of Unicode, not just the BMP
* that sometimes len(chr(i)) is one and sometimes two
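
A rough sketch of the second point, not a statement of how the interpreter is implemented: it assumes a narrow build, where a string is stored as 16-bit code units, and emulates that by counting UTF-16 code units. The helper name utf16_code_units is made up for this illustration. On builds that store full code points, len(chr(i)) is simply one for every i.

```python
def utf16_code_units(i):
    """Return how many 16-bit code units code point i needs in UTF-16.

    Hypothetical helper: this mirrors what len(chr(i)) reports on a
    narrow build, but it is computed by encoding, not by inspecting
    the interpreter's internal string storage.
    """
    # utf-16-le emits no byte order mark, so every 2 bytes is one code unit.
    return len(chr(i).encode("utf-16-le")) // 2


# Code points inside the BMP need one unit; those above U+FFFF need two.
for i in (0x41, 0xFFFF, 0x10000, 0x10FFFF):
    print(hex(i), utf16_code_units(i))
```

Code points in the surrogate range (U+D800 to U+DFFF) are left out on purpose, since they cannot be encoded as UTF-16 on their own.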

The term UCS-2 is a complete communications failure
in that regard.  If someone looks up the term, they will
immediately see something like the Wikipedia entry, which says,
"UCS-2 cannot represent code points outside the BMP".
How is that helpful?


Raymond
