[Python-Dev] UTF-16 code point comparison

Bill Tutt billtut@microsoft.com
Thu, 27 Jul 2000 07:49:06 -0700


Fredrik:
> sorry, but you're being silly.  using variable-width encoding for
> internal storage is difficult, slow, and just plain stupid on modern
> hardware.

So use UCS-4 internal storage now. UTF-16 just seems like a handy internal
storage mechanism to pick since Win32 and Java use it for their native
string processing.
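
To put rough numbers on the storage trade-off, here is a minimal sketch
that counts code units per encoding. It assumes a modern Python 3
interpreter and its standard codecs; the sample string and the
code_units helper are made up for illustration.

```python
# Hypothetical sample: one ASCII, one Latin-1, one other BMP, one non-BMP character.
text = "A\u00e9\u20ac\U00010400"

def code_units(s, encoding, unit_size):
    # Encode without a BOM (hence the explicit -le codecs) and
    # count how many fixed-size code units the result occupies.
    return len(s.encode(encoding)) // unit_size

utf8_units = code_units(text, "utf-8", 1)       # 1 to 4 units per character
utf16_units = code_units(text, "utf-16-le", 2)  # 1 or 2 units per character
utf32_units = code_units(text, "utf-32-le", 4)  # always 1 unit per character
```

Only the UCS-4/UTF-32 form gives one unit per character; UTF-16 needs a
second unit only for characters outside the BMP.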

> (image processing people stopped doing stupid things like that
> ages ago, and trust me -- a typical image contains many more
> pixels than a typical text ;-)

> after all, if variable-width internal storage had been easy to deal
> with, we could have used UTF-8 from the start...  (and just like
> the Tcl folks, we would have ended up rewriting the whole thing
> in the next release ;-)

Oh please, UTF-16 is substantially simpler to deal with than UTF-8.
I would go nuts if our internal storage mechanism were UTF-8.
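
A rough sketch of why, assuming modern Python 3: reading the next code
point from UTF-16 has two cases (a lone unit or a surrogate pair), while
UTF-8 has four possible lengths. The names next_utf16 and next_utf8 are
hypothetical, and neither function does real validation.

```python
def next_utf16(units, i):
    """Return (code_point, next_index) from a sequence of 16-bit code units."""
    u = units[i]
    if 0xD800 <= u <= 0xDBFF and i + 1 < len(units):
        low = units[i + 1]
        if 0xDC00 <= low <= 0xDFFF:
            # High + low surrogate: combine into one non-BMP code point.
            return 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00), i + 2
    # BMP character (an unpaired surrogate is passed through unchanged).
    return u, i + 1

def next_utf8(data, i):
    """Return (code_point, next_index) from a bytes object; no validation."""
    b = data[i]
    if b < 0x80:
        return b, i + 1          # 1-byte sequence (ASCII)
    if b < 0xE0:
        n, cp = 1, b & 0x1F      # 2-byte sequence
    elif b < 0xF0:
        n, cp = 2, b & 0x0F      # 3-byte sequence
    else:
        n, cp = 3, b & 0x07      # 4-byte sequence
    for k in range(1, n + 1):
        # Each continuation byte contributes its low 6 bits.
        cp = (cp << 6) | (data[i + k] & 0x3F)
    return cp, i + n + 1
```

A production UTF-8 decoder would also have to reject overlong forms and
stray continuation bytes, which widens the gap further.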

Bill