char 128? no... 256

Tue Feb 11 23:53:52 EST 2003

On Tue, 11 Feb 2003 20:47:37 -0800, Erik Max Francis <max at alcyone.com>
wrote:

>Afanasiy wrote:
>
>[lots of Python interpreter traceback snipped]
>
>Again, it's not clear to me what you're actually trying to do.  If you
>want to go from an 8-bit string to a Unicode string or back, you need to
>specify the encoding, because an 8-bit string is meaningless without it.
>
>>>> s = '\xc3'
>>>> s
>'\xc3'
>>>> u = unicode(s, 'latin-1')
>>>> u
>u'\xc3'
>>>> ss = u.encode('latin-1')
>>>> ss
>'\xc3'
>>>> s == ss
>1
>
>If these don't work for you, then it's likely your system simply doesn't
>have Unicode support for some reason.
>
>You're saying they're just "extended ASCII foreign characters," but this
>doesn't mean anything without specifying the encoding.

I obviously need to be able to decode the Unicode object.
The sad part is, this doesn't have to be a Unicode
object but Python decides to make it one. So I guess the
solution is to stop using Python for this project.

>>> s = u'ö'
>>> s
u'\x94'
>>> unicode(s,'latin-1')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: decoding Unicode is not supported
>>>
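
A possible reading of the session above, offered as a sketch rather than a diagnosis: u'\x94' is what would result if the console delivered 'ö' as the single byte 0x94 (its position in code page 437) and the interpreter took that byte as Latin-1. The object is then already Unicode, so it can be encoded but not decoded, which is what the TypeError says. The example below uses Python 3 names (bytes for the 8-bit string, str for the Unicode string), and the cp437 console encoding is an assumption, not something stated in this thread.

```python
# Python 3 sketch: bytes is the 8-bit string, str is the Unicode string.
# ASSUMPTION: the console sent 'ö' as the cp437 byte 0x94.
raw = b'\x94'

# bytes -> str: decoding only gives the right character with the right encoding.
text = raw.decode('cp437')

# str -> bytes: encoding goes the other way, here to Latin-1.
latin1 = text.encode('latin-1')

# Decoding something that is already text fails, as in the traceback above.
try:
    str(text, 'latin-1')
    decode_error = None
except TypeError as exc:
    decode_error = exc

print(repr(text), repr(latin1), decode_error)
```

In the Python 2 terms used in this thread, the matching calls would be '\x94'.decode('cp437') to get the Unicode object and u.encode('latin-1') to get back to an 8-bit string.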