char 128? no... 256

Tim Peters tim_one at email.msn.com
Wed Feb 12 00:21:43 EST 2003


[Afanasiy]
> I obviously need to be able to decode the Unicode object.
> The sad part is, this doesn't have to be a Unicode
> object but Python decides to make it one. So I guess the
> solution is to stop using Python for this project.

Another solution is to learn how Unicode works.

> >>> s = u'ö'
> >>> s
> u'\x94'

You explicitly asked for a Unicode object there, so I hope this isn't what
you mean by "Python decides to make it one" -- in that example, you told
Python to create a Unicode string, and that's what it did.

> >>> unicode(s,'latin-1')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: decoding Unicode is not supported
> >>>

The message is telling you that you can't pass a Unicode string to the
unicode() function along with an encoding.  The purpose of that call is to
convert an 8-bit string (which u'\x94' is not) to a Unicode string via the
specified encoding.
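
To make the direction of that conversion concrete, here is a minimal sketch.
It is written in the later Python 3 spelling, where bytes plays the role of
the 8-bit string and str plays the role of the Unicode string, so str(raw,
encoding) stands in for unicode(raw, encoding).  The choice of cp437 is an
assumption on my part: it is one DOS-style code page in which the byte 0x94
happens to mean ö, and I don't know what the original console was using.

```python
# Python 3 spelling: bytes ~ the 8-bit string, str ~ the Unicode string.
raw = b'\x94'  # a single 8-bit byte, the same value seen in u'\x94'

# Decoding needs an encoding; the same byte means different characters.
as_cp437 = str(raw, 'cp437')       # assumption: a DOS code page was in use
as_latin1 = raw.decode('latin-1')  # Latin-1 maps byte N to code point N

print(ascii(as_cp437))   # '\xf6', i.e. LATIN SMALL LETTER O WITH DIAERESIS
print(ascii(as_latin1))  # '\x94', a control character, not a letter

# Decoding something that is already Unicode is refused, as in the
# traceback quoted above.
try:
    str(as_cp437, 'latin-1')
except TypeError as exc:
    decode_error = str(exc)

print(decode_error)
```

The point of the sketch is only that decoding goes from bytes to Unicode and
always depends on which encoding you name; it does not show what Python
itself did with the original source line.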

It's unfortunate that Python *let* you write s = u'ö' to begin with, because
I think it's giving you a wrong idea of how things work.  If you tried that
line in Python 2.3, you'd see this instead:

DeprecationWarning: Non-ASCII character '\x94', but no declared encoding





