"%s" vs unicode

Gerd Woetzel woetzel at gmd.de
Thu Jan 9 06:26:51 EST 2003


What I have written after some beer:

>><imho>
>> Unfortunately the "general principle" is wrong.
>> There is a canonical embedding of Unicode strings into byte strings (which
>> is UTF-8) but no canonical embedding of byte strings into Unicode strings.
>> Hence it should be vice versa.
>></imho>
>>It's a real shame that I have no access to the bvd's time machine :-)
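
To make the asymmetry concrete, here is a minimal sketch. It is written in Python 3 syntax, which postdates this thread, and the helper name utf8_round_trip and the sample strings are made up for illustration. It only shows that ordinary text survives a round trip through UTF-8, while the same bytes read differently (or not at all) depending on which codec is assumed.

```python
# Illustrative sketch only (Python 3 syntax; names and samples are hypothetical).

def utf8_round_trip(text: str) -> bool:
    """Return True if text survives encoding to UTF-8 and decoding back."""
    return text.encode("utf-8").decode("utf-8") == text

# Unicode -> bytes: UTF-8 handles each of these without loss.
samples = ["plain ascii", "L\u00f6wis", "\u20ac 10", "\U0001f40d"]
all_round_trip = all(utf8_round_trip(s) for s in samples)

# bytes -> Unicode: the result depends on the codec you pick.
raw = b"L\xf6wis"
as_latin9 = raw.decode("iso-8859-15")  # 0xF6 maps to o-umlaut here
try:
    as_utf8 = raw.decode("utf-8")
except UnicodeDecodeError:
    as_utf8 = None  # 0xF6 is not a valid UTF-8 start byte
```

Nothing in the byte string itself says which of the two readings was intended, which is the point of the "no canonical embedding" remark.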

Robin Becker <robin at jessikat.fsnet.co.uk> writes:

>I'm fairly sure I agree with you, but time travel will make criminals of
>us all

martin at v.loewis.de (Martin v. Löwis) writes:

>[...]
>Then people requested that byte-string-unicode-string conversion
>should use other encodings, and it was pointed out that UTF-8 is maybe
>confusing for existing applications. So the default encoding is now
>administrator-settable, and defaults to ASCII.

>With ASCII being the default encoding, there is *no* canonical
>embedding of Unicode strings into byte strings: some Unicode strings
>("most") cannot be converted to a byte string automatically.
>[...]
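
A small sketch of what that means in practice, again in Python 3 syntax rather than the 2003 interpreter under discussion, with the codec named explicitly instead of relying on any site-wide default. The helper can_encode is a hypothetical name used only for this example.

```python
# Illustrative sketch only (Python 3 syntax; can_encode is a hypothetical helper).

def can_encode(text: str, encoding: str) -> bool:
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

ascii_plain = can_encode("spam", "ascii")          # pure ASCII text
ascii_umlaut = can_encode("L\u00f6wis", "ascii")   # contains U+00F6
utf8_umlaut = can_encode("L\u00f6wis", "utf-8")    # same text, UTF-8
```

With ASCII as the codec, the string containing U+00F6 cannot be converted, whereas UTF-8 accepts it; that is the sense in which an ASCII default leaves most Unicode strings without an automatic byte-string form.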

I agree with you, but time travel will definitely make a criminal of me!

Cheers, Gerd



