unicode string problems
Brian Quinlan
brian at sweetapp.com
Mon Apr 1 17:57:19 EST 2002
Gonçalo Rodrigues wrote:
> f.write("Março 2002" + march.Name())
>
> where march.Name() returns a unicode string I get a unicode error. I
> tried converting both unicodes to strings via str but obviously I got an
> error (in the first string the culprit is the "ç" character).
>
> Can someone help me out here and show me the way to write these strings
> to the file?
Maybe there should be a FAQ for this...
The problem that you are having is due to the fact that there are many
possible string encodings for the same Unicode string (and vice versa).
For example:
>>> u'ç'.encode('iso-8859-1')
'\xe7'
>>> u'ç'.encode('utf-8')
'\xc3\xa7'
>>> u'ç'.encode('utf-16-le')
'\xe7\x00'
So, when you add a string and a Unicode object together, Python attempts
to convert the string into a Unicode object. But it refuses to guess
what encoding you mean: it assumes ASCII and raises a UnicodeError on any
non-ASCII character.
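To make that conversion step concrete, here is a minimal sketch. It is written in Python 3 syntax (an assumption on my part, so that it runs on a current interpreter), where bytes plays the role of the plain string above. The helper name implicit_ascii_decode is made up for illustration; it only imitates what the interpreter does behind the scenes when the two types are mixed.

```python
# Sketch only: Python 3 syntax, bytes standing in for the old byte string.
# "\u00e7" is the c-cedilla, escaped so the source file encoding is irrelevant.
raw = "Mar\u00e7o 2002".encode("latin-1")   # the bytes b'Mar\xe7o 2002'


def implicit_ascii_decode(data):
    """Hypothetical helper: decode as ASCII, as the implicit conversion does."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte that is not ASCII.
        return "refused: byte 0x%02x at position %d" % (data[exc.start], exc.start)


result_bad = implicit_ascii_decode(raw)            # non-ASCII byte, refused
result_ok = implicit_ascii_decode(b"March 2002")   # pure ASCII, accepted
explicit = raw.decode("latin-1")                   # naming the encoding works
```

Pure ASCII data goes through untouched; anything else is refused until you name the encoding yourself, which is what the explicit decode on the last line does.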
Here is a simple solution:
f.write("Março 2002" + march.Name().encode('latin-1'))
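This will convert the Unicode name into a string object using the
Latin-1 encoding. It assumes that the "ç" in your string literal is also
stored as Latin-1, so that both halves of what gets written to the file
use the same encoding.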
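An alternative, if you would rather not encode each piece by hand, is to keep everything as Unicode and let the file object do the encoding. The sketch below uses io.open, which is available in Python 2.6 and later and in Python 3, so it may not apply to older interpreters. The variable name stands in for the result of march.Name(), and the temporary path is only there to make the example self-contained; both are assumptions.

```python
import io
import os
import tempfile

# Hypothetical stand-in for the Unicode value returned by march.Name().
name = u"Mar\u00e7o"

# Build the whole line as Unicode first, so nothing is converted implicitly.
text = u"Mar\u00e7o 2002 " + name

# Throwaway location for the example file.
path = os.path.join(tempfile.mkdtemp(), "march.txt")

# The file object encodes to Latin-1 on write.
with io.open(path, "w", encoding="latin-1") as f:
    f.write(text)

# Read the raw bytes back to see what actually landed on disk.
with io.open(path, "rb") as f:
    raw = f.read()
```

Each "ç" ends up as the single byte 0xE7 in the file, the same result as the explicit .encode('latin-1') call, but the encoding is stated once, where the file is opened.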
Cheers,
Brian