[Python-Dev] UTF-16 code point comparison

Tim Peters tim_one@email.msn.com
Fri, 28 Jul 2000 01:24:11 -0400


[Tim]
> ... Don't know how long it will take this half of the world to
> realize it, but UCS-4 is inevitable.

[Bill Tutt]
> On new systems perhaps, but important existing systems (Win32,
> and probably Java) are stuck with that bad decision and have to
> use UTF-16 for backward compatibility purposes.

Somehow that doesn't strike me as a good reason for Python to mimic them
<wink>.

> Surrogates aren't as far out as you might think. (The next rev of
> the Unicode spec)

But indeed, that's the *point*:  they exhausted their 64K space in just a
few years.  Now the same experts say that adding 4 bits to the range will
suffice for all time.  I don't buy it:  they picked 4 bits because that's
what the surrogate mechanism, defined earlier, was able to support.
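
For concreteness, here is a rough Python sketch of the surrogate arithmetic
behind the "4 bits" remark.  The helper names are invented for illustration
and are not from any Python API; the constants are the standard UTF-16
surrogate ranges.  It shows that 1024 high surrogates times 1024 low
surrogates yields 16 extra planes of 64K each, and nothing more.

```python
# Sketch only: helper names are hypothetical, constants are standard UTF-16.
HIGH_BASE = 0xD800   # first high (lead) surrogate
LOW_BASE = 0xDC00    # first low (trail) surrogate
BMP_SIZE = 0x10000   # the original 64K space

def to_surrogates(cp):
    """Split a code point above the 64K space into a (high, low) pair."""
    if not BMP_SIZE <= cp <= 0x10FFFF:
        raise ValueError("code point not representable as a surrogate pair")
    offset = cp - BMP_SIZE  # fits in 20 bits
    return HIGH_BASE + (offset >> 10), LOW_BASE + (offset & 0x3FF)

def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a code point."""
    return BMP_SIZE + ((high - HIGH_BASE) << 10) + (low - LOW_BASE)

# 1024 * 1024 = 2**20 extra code points, i.e. 16 (= 2**4) planes of 64K.
EXTRA_CODE_POINTS = 1024 * 1024
EXTRA_PLANES = EXTRA_CODE_POINTS // BMP_SIZE
MAX_CODE_POINT = BMP_SIZE + EXTRA_CODE_POINTS - 1
```

Anything past MAX_CODE_POINT simply has no surrogate pair, so the ceiling
comes from the encoding rather than from any count of characters that
might someday need a number.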

> That's certainly sooner than Win32 going away.  :)

I hope it stays around forever -- it's a great object lesson in what
optimizing for yesterday's hardware can buy you <wink>.