A Unicode problem -HELP

"Martin v. Löwis" martin at v.loewis.de
Wed May 17 18:16:52 EDT 2006


manstey wrote:
> Thanks very much. Your def comma_separated_utf8(items): approach raises
> an exception in codecs.py, so I tried  = u", ".join(word_info + parse +
> gloss), which works perfectly. So I want to understand exactly why this
> works. word_info and parse and gloss are all tuples. does str convert
> the three into an ascii string?

Correct: str() converts a tuple into a string of the form (contents),
where contents is the comma-separated repr() of each tuple element.
repr(a_unicode_string) yields an ASCII string in which non-ASCII
characters appear as \x or \u escapes.
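
A rough illustration of that tuple conversion, as a sketch only. It is
written for Python 3, where str is already Unicode and repr() no longer
escapes; there, ascii() gives the same kind of \x / \u escapes that
repr() gave for unicode objects in Python 2. The sample words are made
up for the example.

```python
# Hypothetical sample data, not taken from the original program.
word_info = ("L\u00f6wis", "\u0161a")

# str() on a tuple wraps the comma-separated repr() of each element
# in parentheses.
as_text = str(word_info)
rebuilt = "(" + ", ".join(repr(w) for w in word_info) + ")"
assert as_text == rebuilt

# ascii() escapes non-ASCII characters: \x for code points below 256,
# \u for larger ones in the Basic Multilingual Plane.
escaped_x = ascii(word_info[0])
escaped_u = ascii(word_info[1])
print(escaped_x)
print(escaped_u)
```

So passing the tuples through str() gives you the escaped
representation of the elements, not the text itself.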

> but the join method retains their unicode status.

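Correct. The result is a Unicode string if the joiner is a Unicode
string and all tuple elements are Unicode strings. If an element is a
byte string instead, a conversion to Unicode is attempted using the
default encoding (normally ASCII), which fails for non-ASCII bytes.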
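
A minimal sketch of the join behaviour, again for Python 3 and with
made-up tuple contents. Note one difference from the Python 2 of this
thread: Python 3 refuses to mix str and bytes in join() instead of
attempting a conversion, so the sketch decodes byte strings explicitly.
The helper to_text and its utf-8 default are assumptions for the
example, not part of your code.

```python
# Hypothetical sample tuples standing in for word_info, parse and gloss.
word_info = ("k\u0101l",)
parse = ("N", "sg")
gloss = ("\u6642",)

# Concatenating the tuples and joining keeps everything as text.
line = ", ".join(word_info + parse + gloss)

def to_text(item, encoding="utf-8"):
    """Decode byte strings explicitly; pass text through unchanged."""
    return item.decode(encoding) if isinstance(item, bytes) else item

# A tuple that mixes text and a byte string.
mixed = word_info + (b"sg",)

# Joining the mixed tuple directly raises TypeError in Python 3.
try:
    ", ".join(mixed)
    join_failed = False
except TypeError:
    join_failed = True

# Decoding first makes the join succeed.
line2 = ", ".join(to_text(x) for x in mixed)
```

Decoding explicitly with a known encoding avoids depending on whatever
the default encoding happens to be.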

> In the text file, the unicode characters appear perfectly, so I'm very
> happy.

Glad it works.

Regards,
Martin


