How to turn a string into a list of integers?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Sep 7 13:02:59 EDT 2014


MRAB wrote:

> I don't think you should be saying that it stores the string in Latin-1
> or UTF-16 because that might suggest that they are encoded. They aren't.

Of course they are encoded. Memory consists of bytes, not Unicode code
points. Code points are abstract numbers representing characters (and
other things). You can't store "ξ" (U+03BE) in memory; you can only store
a particular representation of that "ξ" in bytes, and that representation
is called an encoding. Of course you can create whatever representation
you like, or you can use an established encoding rather than re-invent
the wheel. Here are four established encodings which support that code
point, and the bytes that are used:

py> u'ξ'.encode('iso-8859-7')
'\xee'
py> u'ξ'.encode('utf-8')
'\xce\xbe'
py> u'ξ'.encode('utf-16be')
'\x03\xbe'
py> u'ξ'.encode('utf-32be')
'\x00\x00\x03\xbe'
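
The session above looks like Python 2, where encode() returns a byte
str. As a rough sketch of the same idea in Python 3 (my assumption, not
something the session itself uses), the variable names ch, encodings and
encoded below are purely illustrative. Calling list() on a bytes object
gives the integer value of each byte, which is one way to get from a
string to a list of integers:

```python
# Python 3 sketch: one abstract code point, four concrete byte sequences.
ch = "\u03be"  # GREEK SMALL LETTER XI, the same character as above

# The same four codecs as in the session above, under their Python names.
encodings = ["iso-8859-7", "utf-8", "utf-16-be", "utf-32-be"]
encoded = {name: ch.encode(name) for name in encodings}

for name, raw in encoded.items():
    # list(bytes) yields the integer value of each byte.
    print(name, list(raw))
    # Decoding with the same codec recovers the original code point.
    assert raw.decode(name) == ch

# The code point itself is a single abstract number, independent of
# any of the byte representations printed above.
print(ord(ch))  # 958, i.e. 0x3BE
```

The byte lists differ from codec to codec, while ord() gives the same
number regardless, which is the distinction between a code point and its
encodings.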



-- 
Steven



