[Python-Dev] 2.2 Unicode questions
Guido van Rossum
guido@digicool.com
Thu, 19 Jul 2001 10:58:23 -0400
> > But isn't the whole point of UTF-16 to fool code that believes it's
> > manipulating UCS-2 into a false sense of security? :-)
>
> Well, sort of. More like fooling into a true sense of insecurity. :)
Same difference. :-)
> Anyway, the Standard sez that a conforming UCS-2 application will
> not use characters in the surrogates area. Future versions of ISO10646
> and the Unicode Standard will probably require UTF-16 instead of UCS-2.
So the proper way to code *libraries* that use 16-bit data would be
not to commit either way on the issue: don't generate surrogates on your own
account, but also don't actively reject them, instead passing them
through transparently. This should conform to both UCS-2 and UTF-16.
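
The pass-through policy can be sketched in Python roughly as follows. This is only an illustration, not code from this thread: the name swap_ascii_case and the choice to model 16-bit data as a list of integers are assumptions made for the example. The routine changes only the code units it understands (ASCII letters) and copies everything else, including values in the surrogate range 0xD800-0xDFFF, without inspecting or rejecting them.

```python
def swap_ascii_case(units):
    """Swap the case of ASCII letters in a sequence of 16-bit code units.

    Hypothetical helper for illustration. It never generates surrogates
    of its own and never rejects surrogates it is handed.
    """
    out = []
    for u in units:
        # Only the 16-bit range is checked; this is a data-width check,
        # not a judgement about UCS-2 versus UTF-16.
        if not 0 <= u <= 0xFFFF:
            raise ValueError("not a 16-bit code unit: %r" % (u,))
        if 0x41 <= u <= 0x5A:        # 'A'..'Z' -> 'a'..'z'
            out.append(u + 0x20)
        elif 0x61 <= u <= 0x7A:      # 'a'..'z' -> 'A'..'Z'
            out.append(u - 0x20)
        else:
            # Everything else, surrogates (0xD800-0xDFFF) included,
            # is copied through unchanged, paired or not.
            out.append(u)
    return out


# 'H', 'i', a high/low surrogate pair, '!'
data = [0x48, 0x69, 0xD800, 0xDC00, 0x21]
result = swap_ascii_case(data)
```

Since surrogates are neither created nor rejected here, a UCS-2 caller never gets back a surrogate it didn't supply, and a UTF-16 caller gets its pairs back intact.
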
--Guido van Rossum (home page: http://www.python.org/~guido/)