[I18n-sig] Random thoughts on Unicode and Python

Tom Emerson tree@basistech.com
Sat, 10 Feb 2001 17:45:25 -0500


M.-A. Lemburg writes:
> How does Ruby (which seems to be the direct Python-competitor
> in Japan) deal with the difference between binary data and
> text data ?

Strings are strings. How the bytes in a string are interpreted depends
on the setting of the built-in $KCODE variable.
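
To illustrate the general point (this is a Python sketch of the idea,
not Ruby code, and it does not model $KCODE itself): a byte string
carries no record of its encoding, so the same bytes yield different
text depending on which interpretation is applied. The sample text and
the two codecs chosen here are my own assumptions for the example.

```python
# Sample text: three kanji, written as escapes to keep this message ASCII-only.
text = "\u65e5\u672c\u8a9e"

# Encode once as EUC-JP. The resulting bytes do not record that choice.
raw = text.encode("euc_jp")

# Interpret the very same bytes under two different encodings.
as_euc = raw.decode("euc_jp")      # round-trips to the original text
as_latin1 = raw.decode("latin-1")  # one character per byte, i.e. mojibake

print(len(raw), len(as_euc), len(as_latin1))
print(as_euc == text, as_latin1 == text)
```

The byte count and the character count only agree under the Latin-1
reading, which is the wrong one for this data. Nothing in `raw` can tell
you that; the knowledge has to come from outside the string.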

> I think that much concern about these proposals lies in a
> misunderstanding of the general idea behind the proposed move to
> Unicode for text data:

Agreed.

> The model which we are currently talking about can be outlined
> as follows:
> 
>                   binary data string *)
>                          |
>                          |
>                   text data string 
>                     |           |
>                     |           |
>          Unicode string      encoded 8-bit string (with encoding 
>            *)                                      information !)
> 
> *) these are implemented in Python 1.6-2.1.
> 
> How does this compare to e.g. Ruby ?

As I said, Ruby has a String type, and an override for
Japanese-encoded strings.

The above is much more similar to the model used by Dylan.
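
For concreteness, here is a rough Python sketch of the three kinds of
string in the quoted outline. The EncodedText class and its method
names are hypothetical, invented for this example; they are not part
of Python or of any proposal in this thread. The sketch also does not
reproduce the inheritance relationship in the diagram, only the
distinction between the three kinds of data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodedText:
    """Hypothetical 'encoded 8-bit string': raw bytes plus their encoding name."""

    data: bytes
    encoding: str

    def to_unicode(self) -> str:
        # The stored encoding name says how to interpret the bytes.
        return self.data.decode(self.encoding)

    @classmethod
    def from_unicode(cls, text: str, encoding: str) -> "EncodedText":
        return cls(text.encode(encoding), encoding)


binary = b"\x89PNG\r\n"   # binary data: bytes with no text interpretation
text = "caf\u00e9"        # Unicode string
tagged = EncodedText.from_unicode(text, "latin-1")  # 8-bit text, encoding attached

print(tagged.data, tagged.encoding, tagged.to_unicode() == text)
```

The point of carrying the encoding name with the bytes is that the
conversion back to Unicode needs no outside knowledge, unlike a bare
byte string.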

    -tree

-- 
Tom Emerson                                          Basis Technology Corp.
Stringologist                                      http://www.basistech.com
  "Beware the lollipop of mediocrity: lick it once and you suck forever"