[Python-Dev] Internationalization Toolkit

Greg Stein gstein@lyra.org
Fri, 12 Nov 1999 02:30:04 -0800 (PST)


On Fri, 12 Nov 1999, Tim Peters wrote:
>...
> Using UTF-8 internally is also reasonable, and if it's being rejected on the
> grounds of supposed slowness

No... my main point was interaction with the underlying OS. I made a SWAG
(Scientific Wild Ass Guess :-) and stated that UTF-8 is probably slower
for various types of operations. As always, your infernal meddling has
dashed that hypothesis, so I must retreat...

>...
> I expect either would work well.  It's at least curious that Perl and Tcl
> both went with UTF-8 -- does anyone think they know *why*?  I don't.  The
> people here saying UCS-2 is the obviously better choice are all from the
> Microsoft camp <wink>.  It's not obvious to me, but then neither do I claim
> that UTF-8 is obviously better.

Probably for the exact reason that you stated in your messages: many 8-bit
(7-bit?) functions continue to work quite well when given a UTF-8-encoded
string, i.e. they didn't have to rewrite the entire Perl/Tcl interpreter
to deal with a new string type.
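
A minimal sketch of that point, assuming present-day Python 3 `bytes`
objects as a stand-in for an 8-bit string type (the sample text and
variable names are made up for illustration). Byte-oriented split and
search keep working on UTF-8 data because bytes below 0x80 never appear
inside a multi-byte sequence; length is the kind of operation that does
not carry over unchanged:

```python
# Sample text with 2-byte and 3-byte UTF-8 characters (hypothetical data):
# "naive cafe, Tokyo" written with i-diaeresis, e-acute and two kanji.
text = "na\u00efve caf\u00e9, \u6771\u4eac"
data = text.encode("utf-8")

# Splitting on an ASCII delimiter is safe at the byte level, since no
# byte of a multi-byte UTF-8 sequence can be mistaken for "," or " ".
parts = data.split(b", ")
decoded = [p.decode("utf-8") for p in parts]

# A byte-wise substring search lands on a character boundary, so the
# slice starting at the match still decodes cleanly.
needle = "caf\u00e9".encode("utf-8")
idx = data.find(needle)
tail = data[idx:].decode("utf-8")

# What changes: an 8-bit length function counts bytes, not characters.
char_len = len(text)
byte_len = len(data)

print(decoded, idx, char_len, byte_len)
```

Note that `idx` is a byte offset, not a character index, so code that
mixes the two would still need care.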

I'd guess it is a helluva lot easier for us to add a Python type than for
Perl or Tcl to whack around with new string types (since they use strings
so heavily).

Cheers,
-g

--
Greg Stein, http://www.lyra.org/