Unicode lists and join (python 2.2.3)
"Martin v. Löwis"
martin at v.loewis.de
Sun May 25 16:06:16 EDT 2008
> x = [u"\xeeabc2:xyz", u"abc3:123"]
> u = "\xe7abc"
u is a byte string, not a Unicode string: the literal has no u prefix.
> x.append("%s:%s" % ("xfasfs", u))
so what you append is not a Unicode string, either.
> x.append(u"Hello:afddfdsfa")
>
> y = u'\n'.join(x)
As a consequence, u'\n'.join tries to convert the byte string to
a Unicode string using the default ASCII encoding, and fails,
because the byte string contains non-ASCII bytes.
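A minimal sketch of the usual way around this: decode the byte string
explicitly before it goes into the list. The Latin-1 encoding is an
assumption (the original post does not say how the bytes were encoded),
and the b"" prefix needs Python 2.6 or later, so this will not run
unchanged on 2.2.3. The explicit decode("ascii") call stands in for the
implicit conversion that join performs in Python 2.

```python
raw = b"\xe7abc"  # byte string containing one non-ASCII byte

# The implicit conversion is equivalent to decoding with the ASCII codec.
try:
    raw.decode("ascii")
    decoded_as_ascii = True
except UnicodeError:
    decoded_as_ascii = False

# Assumption: the bytes are Latin-1. Decode once, then use Unicode throughout.
text = raw.decode("latin-1")

x = [u"\xeeabc2:xyz", u"abc3:123"]
x.append(u"%s:%s" % (u"xfasfs", text))
x.append(u"Hello:afddfdsfa")

y = u"\n".join(x)  # every item is Unicode, so no implicit conversion occurs
```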
> Why does this work with no exceptions
>
> x=[]
> u = "\xe7abc"
> x.append("%s:%s" % ("xfasfs", u))
% here is applied to a byte string, with all arguments also byte
strings. The result is a byte string.
>
> and this doesn't
> x=[]
> u = "\xe7abc"
> x.append("%s:%s" % (u"xfasfs", u))
% is applied to a byte string, with one argument being a Unicode
string. The result is a Unicode string, where all byte strings
get converted to Unicode. Converting u fails, as it has non-ASCII
bytes in it.
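One way to keep the two kinds of string from mixing inside % is to
convert every argument to Unicode first. The helper below is only a
sketch: to_text is a hypothetical name, the Latin-1 default is an
assumption about the data, and the bytes name requires Python 2.6 or
later.

```python
def to_text(value, encoding="latin-1"):
    """Return value as a Unicode string, decoding byte strings explicitly."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


u = b"\xe7abc"  # byte string with a non-ASCII byte

# Template and arguments are all Unicode, so nothing is converted implicitly.
line = u"%s:%s" % (to_text(u"xfasfs"), to_text(u))
```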
Regards,
Martin