Pure python implementation of string-like class

Ross Ridge rridge at csclub.uwaterloo.ca
Sun Feb 26 12:46:16 EST 2006


Ross Ridge wrote:
> Akihiro Kayama in his original post made it clear that he wanted to use
> a character set larger than the entire Unicode code space.

Xavier Morel wrote:
> He implies that ...

He explicitly said that the character set he wanted to use wouldn't fit
in UTF-16.

>... but in later messages he
> 1. Implies that he wants to use the Unicode private spaces, which are in
> the Unicode code space

He explicitly said that he wanted to use the "U+60000000...U+7FFFFFFF"
range, which is outside of the Unicode code space (it ends at U+10FFFF),
despite his mistakenly calling them Unicode characters.

> 2. Says explicitly that  his needs concern Kanji encoding...

I have no clue whether he really needs such a large character set, but
if he does then it makes sense for him to want to use an encoding
that's wider than UTF-16.

As for the problem he actually posed, I'd suggest using tuples rather
than lists, since tuples are immutable like strings.  That would make
it easier for the class to be used as a key in a dictionary.  Hmm...
thinking about it, it might actually make sense to use strings as the
internal representation, as a lot of operations can be implemented by
using the standard string operations but multiplying the offsets and
lengths by 4.
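
A rough sketch of the string-backed idea, not a finished implementation.
It assumes current Python, where the byte-string type is bytes, and it
assumes each character is packed as a 4-byte big-endian unsigned value.
The class name WideString and its helper _from_bytes are made up for
illustration.

```python
import struct


class WideString:
    """Immutable sequence of 32-bit code values (hypothetical class)."""

    WIDTH = 4  # assumed storage size: 4 bytes per character

    def __init__(self, codes=()):
        codes = tuple(codes)
        # Pack every code value as a big-endian unsigned 32-bit integer.
        self._data = struct.pack(">%dI" % len(codes), *codes)

    @classmethod
    def _from_bytes(cls, data):
        # Wrap an already packed byte string without repacking it.
        obj = cls.__new__(cls)
        obj._data = data
        return obj

    def __len__(self):
        return len(self._data) // self.WIDTH

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("this sketch only supports step 1 slices")
            # Character offsets become byte offsets: multiply by 4.
            return self._from_bytes(
                self._data[start * self.WIDTH:stop * self.WIDTH])
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("WideString index out of range")
        return struct.unpack_from(">I", self._data, index * self.WIDTH)[0]

    def __add__(self, other):
        return self._from_bytes(self._data + other._data)

    def find(self, sub, start=0):
        pos = self._data.find(sub._data, start * self.WIDTH)
        # A byte-level match that straddles two characters is not a real
        # match, so keep searching until the offset is a multiple of 4.
        while pos != -1 and pos % self.WIDTH:
            pos = self._data.find(sub._data, pos + 1)
        return -1 if pos == -1 else pos // self.WIDTH

    # The backing byte string is immutable, so equality and hashing can
    # simply delegate to it, which makes instances usable as dict keys.
    def __eq__(self, other):
        return isinstance(other, WideString) and self._data == other._data

    def __hash__(self):
        return hash(self._data)
```

The one catch with delegating to the byte-string methods is alignment:
a search can match in the middle of a character, which is why find()
above rejects offsets that are not multiples of 4.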

                           Ross Ridge



