Problems with joining Unicode strings

John Machin sjmachin at lexicon.net
Sun Mar 23 17:56:37 EDT 2008


On Mar 24, 7:58 am, Ulysse <maxim... at gmail.com> wrote:
> Hello,
>
> I have problems with joining strings.
>
> My program gets web page fragments, then joins them into one single web
> page. I get an error when I try to join these fragments:
> "UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
> 208: ordinal not in range(128)"
>
> Here is my code :
[snip]

Contrary to your message subject, you are *not* joining "Unicode
strings". Your resume_news contains at least one str (8-bit string)
object [probably all of them are str!]. You need to decode each str
object into a unicode object, using whatever encoding is appropriate
to that str object. When you do
u''.join(sequence_including_str_objects), Python attempts to decode
each str object using the default 'ascii' encoding ... this of course
fails if the str object contains a non-ASCII byte, such as the 0xe9
in your traceback.
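
A minimal sketch of the decode-then-join approach, not your actual code:
the fragment contents, the join_fragments helper and the 'latin-1'
encoding are all assumptions made up for illustration; use whatever
encoding each page really declares. The b'' prefix needs Python 2.6 or
later. The first part spells out the 'ascii' decode that Python 2
attempts implicitly (Python 3 refuses the mixed join with a TypeError
instead).

```python
# Hypothetical fragments: byte strings as they might arrive from the web.
# The bytes and the latin-1 encoding are assumptions for illustration only.
fragments = [b'<p>R\xe9sum\xe9</p>', b'<p>news</p>']

# This is the decode that the implicit conversion amounts to.
# It fails as soon as it meets a byte outside range(128), here 0xe9.
try:
    fragments[0].decode('ascii')
    implicit_ok = True
except UnicodeDecodeError:
    implicit_ok = False


def join_fragments(byte_fragments, encoding):
    """Decode each byte string explicitly, then join the unicode results."""
    return u''.join(frag.decode(encoding) for frag in byte_fragments)


# Decoding first means the join only ever sees unicode objects.
page = join_fragments(fragments, 'latin-1')
```

If the fragments come from pages with different encodings, decode each
one with its own encoding before joining rather than passing a single
encoding for the whole list.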

This may help: www.amk.ca/python/howto/unicode

Cheers,
John


