Unicode lists and join (python 2.2.3)

"Martin v. Löwis" martin at v.loewis.de
Sun May 25 16:06:16 EDT 2008


>     x = [u"\xeeabc2:xyz", u"abc3:123"]
>     u = "\xe7abc"

u is not a Unicode string: the literal has no u prefix, so it is
a byte string.

>     x.append("%s:%s" % ("xfasfs", u))

so what you append is not a Unicode string, either.

>     x.append(u"Hello:afddfdsfa")
> 
>     y = u'\n'.join(x)

As a consequence, .join tries to convert the byte string to
a Unicode string using the default (ASCII) codec, and fails with
a UnicodeDecodeError, because the byte string contains non-ASCII
bytes.
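
To make that implicit step visible, here is a minimal sketch written
for a current Python 3 interpreter, where the conversion has to be
spelled out by hand. The helper name join_like_py2 is hypothetical,
and latin-1 in the fix is only an assumed encoding; the original post
does not say which encoding the bytes are actually in.

```python
def join_like_py2(sep, items):
    """Mimic Python 2's u'...'.join(): byte strings are decoded as ASCII."""
    decoded = []
    for item in items:
        if isinstance(item, bytes):
            # Python 2 did this step implicitly with the default (ASCII)
            # codec; non-ASCII bytes raise UnicodeDecodeError.
            item = item.decode("ascii")
        decoded.append(item)
    return sep.join(decoded)


x = [u"\xeeabc2:xyz", u"abc3:123"]
u = b"\xe7abc"                    # a byte string, as in the post
x.append(b"xfasfs:" + u)          # the appended value is still a byte string
x.append(u"Hello:afddfdsfa")

try:
    join_like_py2(u"\n", x)
    failed = False
except UnicodeDecodeError:
    failed = True                 # the non-ASCII byte \xe7 cannot be decoded

# One possible fix: decode byte strings explicitly before joining.
# latin-1 is an assumption made for this example only.
fixed = [i.decode("latin-1") if isinstance(i, bytes) else i for i in x]
y = u"\n".join(fixed)
```

The design point is that the decoding happens once, at a place you
control, with an encoding you name, rather than inside .join.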

> Why does this work with no exceptions
> 
>     x=[]
>     u = "\xe7abc"
>     x.append("%s:%s" % ("xfasfs", u))

% here is applied to a byte string, with all arguments also byte
strings. The result is a byte string, so nothing has to be converted
and nothing can fail.

>
> and this doesn't
>     x=[]
>     u = "\xe7abc"
>     x.append("%s:%s" % (u"xfasfs", u))

% is applied to a byte string, with one argument being a Unicode
string. The result is therefore a Unicode string, and all byte
strings involved get converted to Unicode using the default (ASCII)
codec. Converting u fails, as it has a non-ASCII byte (\xe7) in it.
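
The two cases can be put side by side. This is a sketch for a current
Python 3 interpreter (3.5 or later, where bytes objects support %),
with the decoding that Python 2 did implicitly written out by hand.
format_like_py2 is a hypothetical helper for illustration, not
anything in the standard library.

```python
def format_like_py2(template, args):
    """Mimic Python 2's byte-string % when some arguments may be Unicode."""
    if not any(isinstance(a, str) for a in args):
        # All byte strings: the result stays a byte string, nothing is decoded.
        return template % args
    # At least one Unicode argument: the result becomes Unicode, so the
    # template and every byte-string argument are decoded as ASCII.
    decoded = tuple(
        a.decode("ascii") if isinstance(a, bytes) else a for a in args
    )
    return template.decode("ascii") % decoded


u = b"\xe7abc"

# Case 1: byte string % byte strings -> byte string, no exception.
same_type = format_like_py2(b"%s:%s", (b"xfasfs", u))

# Case 2: one Unicode argument forces u to be decoded, which fails.
try:
    format_like_py2(b"%s:%s", (u"xfasfs", u))
    mixed_failed = False
except UnicodeDecodeError:
    mixed_failed = True

# Mixing is harmless only while the byte strings are pure ASCII.
ascii_only = format_like_py2(b"%s:%s", (u"xfasfs", b"abc"))
```

The last call shows why such code can appear to work for a long time:
it only breaks once a non-ASCII byte shows up in the data.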

Regards,
Martin


