strange behaviour of str()

Uwe Schmitt schmitt at num.uni-sb.de
Wed Aug 31 06:58:32 EDT 2005


> 
> Hello,
> 
> I'm wondering about the following behaviour of str() with strings 
> containing non-ASCII characters:
> 
> str(u'foo') returns 'foo' as expected.
> 
> str('lää') returns 'lää' as expected.
> 
> str(u'lää') raises UnicodeEncodeError
> 

This does not work because converting unicode to str requires an
encoding. str() cannot know which one you want, so it falls back to
the default encoding (normally ASCII), which cannot represent "ä".
There are many ways to encode a unicode string into a classic
byte-based string.
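
The failure can be reproduced by naming the codec explicitly. This is
only a small sketch; it assumes that the implicit conversion uses the
ASCII codec, which is the usual default but can differ between setups.
The string is written with escapes so the source file encoding does
not matter.

```python
s = u"l\xe4\xe4"  # u"lää"

# Encoding with ASCII fails, because "ä" has no ASCII representation.
try:
    s.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(error)
```

The exception names the codec and the position of the first character
that could not be encoded.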

You have to proceed as follows:

   >>> s=u"äää"
   >>> print s.encode("latin-1")
   äää

Try "utf-8" and "utf-16" instead of "latin-1" to see how the
result differs.
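
To make the difference between the codecs visible, one can compare the
encoded bytes directly rather than printing them to a terminal. The
following is an illustrative sketch, not part of the original session;
the variable names are made up for the example.

```python
s = u"\xe4\xe4\xe4"  # u"äää", written with escapes

latin1 = s.encode("latin-1")  # one byte per character
utf8 = s.encode("utf-8")      # two bytes for each "ä"
utf16 = s.encode("utf-16")    # 2-byte byte order mark, then two bytes per character

for name, data in [("latin-1", latin1), ("utf-8", utf8), ("utf-16", utf16)]:
    print("%-8s %d bytes  %r" % (name, len(data), data))

# Decoding with the same codec gives back the original unicode string.
roundtrip_ok = latin1.decode("latin-1") == s
```

The same three characters take 3, 6 and 8 bytes respectively, which is
why the codec always has to be chosen explicitly.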

Greetings, Uwe.






