international coding - SOLUTION

Martin v. Löwis loewis at informatik.hu-berlin.de
Fri May 3 10:53:29 EDT 2002


Jarosław Zabiełło <webmaster at apologetyka.com.pl (delete .PL)> writes:

> utf8Txt = u"""a\u0105 c=\u0107 e\u0119 l\u0142 n\u0144 o\xf3 s\u015b
> z\u017c x\u017a"""

This is a terminology problem - the text is not "utf-8"; it is a
"Unicode object". "UTF-8" is an encoding, just like cp1250, or mac-latin2.

To go from a Unicode object to UTF-8, do

really_utf8Txt = utf8Txt.encode("utf-8")

To go back from UTF-8 to a Unicode object, do

utf8Txt = unicode(really_utf8Txt, "utf-8")
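
As a rough illustration of the distinction (a sketch only, not part of the
original question): the variable names below are made up, and
.decode("utf-8") is used as the method form of unicode(..., "utf-8") so
that the snippet also runs on newer Python versions, where the unicode()
builtin no longer exists.

```python
# Hypothetical names, chosen only for this illustration.
text = u"\u0105\u0107"  # a Unicode object: two characters (a-ogonek, c-acute)

# Encoding turns the Unicode object into a byte string in a given encoding.
utf8_bytes = text.encode("utf-8")      # each of these characters takes 2 bytes
cp1250_bytes = text.encode("cp1250")   # each of these characters takes 1 byte

# Decoding goes the other way; you must name the encoding the bytes are in.
from_utf8 = utf8_bytes.decode("utf-8")
from_cp1250 = cp1250_bytes.decode("cp1250")

print(len(text), len(utf8_bytes), len(cp1250_bytes))
print(from_utf8 == text, from_cp1250 == text)
```

The same Unicode object produces different byte strings depending on the
encoding, which is why the Unicode object itself is neither "utf-8" nor
"cp1250".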

Regards,
Martin


