[Python-Dev] UTF-8 is no fun...
Fredrik Lundh
fredrik@pythonware.com
Wed, 12 Apr 2000 11:39:03 +0200
Andy Robinson <andy@reportlab.com> wrote:
> I've spent a fair bit of time converting strings and files the
> last few days, and I'd add that what we have now seems both rock solid
> and very easy to use.
I'm not worried about the core string types or the conversion
machinery; what disturbs me is mostly the use of automagic
conversions to UTF-8, which breaks the fundamental assumption
that a string is a sequence of len(string) characters.
"The items of a string are characters. There is no
separate character type; a character is represented
by a string of one item"
(from the language reference)
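A minimal sketch of the mismatch, assuming a present-day Python 3
interpreter, where str holds Unicode characters and a bytes object
stands in for an 8-bit string carrying UTF-8. The names lengths and
sample are made up for illustration and do not come from the proposal
under discussion:

```python
def lengths(text):
    """Return (number of characters, number of UTF-8 bytes) for text."""
    encoded = text.encode("utf-8")
    return len(text), len(encoded)


# 'smorgas' spelled with o-umlaut and a-ring: seven characters.
sample = "sm\u00f6rg\u00e5s"

chars, utf8_bytes = lengths(sample)
print(chars, utf8_bytes)  # 7 9

# Items of the encoded form are bytes, not characters: the third
# item is only the first half of the two-byte sequence for o-umlaut.
encoded = sample.encode("utf-8")
print(sample[2:3] == "\u00f6")   # True
print(encoded[2:3] == b"\xc3")   # True
```

Once non-ASCII text is held as UTF-8 in an 8-bit string, len() and
indexing count bytes, so the "string of one item" rule quoted above
no longer describes a character.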
I still think the "all strings are sequences of unicode characters"
strawman I posted earlier would simplify things for everyone
involved (programmers, users, and the interpreter itself).
more on this later. gotta ship some code first.
</F>