char 128? no... 256

Erik Max Francis max at alcyone.com
Tue Feb 11 23:47:37 EST 2003


Afanasiy wrote:

[lots of Python interpreter traceback snipped]

Again, it's not clear to me what you're actually trying to do.  If you
want to go from an 8-bit string to a Unicode string or back, you need to
specify the encoding, because an 8-bit string is meaningless without it.

>>> s = '\xc3'
>>> s
'\xc3'
>>> u = unicode(s, 'latin-1')
>>> u
u'\xc3'
>>> ss = u.encode('latin-1')
>>> ss
'\xc3'
>>> s == ss
1

If this session doesn't work for you, then it's likely your Python build
simply doesn't have Unicode support for some reason.

You're saying they're just "extended ASCII foreign characters," but this
doesn't mean anything without specifying the encoding.
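
To make that concrete, here is a rough sketch of the same byte decoded under a few different encodings. It is written for Python 3 (an assumption on my part; the session above is Python 2, where you would call unicode(s, encoding) instead), and the helper name decode_with is made up for illustration.

```python
# Python 3 sketch: a byte string has no character meaning until it is
# decoded with an explicit encoding.
raw = b'\xc3'  # one byte, value 195


def decode_with(data, encodings):
    """Hypothetical helper: map each encoding name to the decoded text,
    or to None if the bytes are not valid in that encoding."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


decoded = decode_with(raw, ['latin-1', 'cp437', 'utf-8'])
for name, text in decoded.items():
    # ascii() keeps the output readable on any terminal
    print(name, ascii(text))

# The round trip only gives back the original bytes when the same
# encoding is used in both directions.
assert raw.decode('latin-1').encode('latin-1') == raw
```

The same byte comes out as a different character under latin-1 and cp437, and on its own it is not valid UTF-8 at all, which is why "extended ASCII" alone doesn't tell you what the character is.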

-- 
 Erik Max Francis / max at alcyone.com / http://www.alcyone.com/max/
 __ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE
/  \ The multitude of books is making us ignorant.
\__/ Voltaire
    EmPy / http://www.alcyone.com/pyos/empy/
 A templating system for Python.



