[I18n-sig] How does Python Unicode treat surrogates?

Florian Weimer fw@deneb.enyo.de
22 Feb 2001 17:38:26 +0100


"M.-A. Lemburg" <mal@lemburg.com> writes:

> Note that UTF-16 surrogates are only needed to reach Unicode
> code points beyond BMP. AFAIK, there are plans to fill this
> area in the next Unicode version, but the designers are very
> well aware of the issues this imposes on the existing implementations:
> Windows and Java are Unicode 2.0 based which is not capable of
> handling character points outside BMP.

And so is Ada.

However, a few useful extensions are planned for the next Unicode
revisions: several mathematical alphabets and language tags come to
mind immediately.  It's certainly no longer true that non-BMP
characters are going to be used only by scholars (as it seemed a few
years ago).
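
For illustration, here is a minimal sketch of the surrogate arithmetic
the quoted text refers to: how a code point beyond the BMP is split
into a UTF-16 high/low surrogate pair and put back together.  The
function names are made up for this example and are not part of any
Python API; the sample code point U+1D400 is simply one that lies
outside the BMP.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a UTF-16 (high, low) pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point.

    Hypothetical helper for illustration only.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1D400)
print("U+1D400 -> %04X %04X" % (high, low))
print("round trip -> U+%X" % from_surrogate_pair(high, low))
```

An implementation that stores 16-bit code units and treats each unit
as one character would see two separate values here, which is the
compatibility problem for Unicode 2.0 based systems mentioned above.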