How to turn a string into a list of integers?

Rustom Mody rustompmody at gmail.com
Sun Sep 7 13:53:33 EDT 2014


On Sunday, September 7, 2014 10:33:26 PM UTC+5:30, Steven D'Aprano wrote:
> MRAB wrote:

> > I don't think you should be saying that it stores the string in Latin-1
> > or UTF-16 because that might suggest that they are encoded. They aren't.

> Of course they are encoded. Memory consists of bytes, not Unicode code
> points, which are abstract numbers representing characters (and other
> things). You can't store "ξ" (U+03BE) in memory, you can only store a
> particular representation of that "ξ" in bytes, and that representation is
> called an encoding. Of course you can create whatever representation you
> like, or you can use an established encoding rather than re-invent the
> wheel. Here are four established encodings which support that code point,
> and the bytes that are used:

> py> u'ξ'.encode('iso-8859-7')
> '\xee'
> py> u'ξ'.encode('utf-8')
> '\xce\xbe'
> py> u'ξ'.encode('utf-16be')
> '\x03\xbe'
> py> u'ξ'.encode('utf-32be')
> '\x00\x00\x03\xbe'


Dunno about philosophical questions -- especially unicode :-)
What I can see in Python 3, which I guess is what MRAB was pointing out,
is that str only has encode and bytes only has decode:

>>> "".encode
<built-in method encode of str object at 0x7f3955da3848>

>>> "".decode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'

>>> b"".decode
<built-in method decode of bytes object at 0x7f39549fda08>

>>> b"".encode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'bytes' object has no attribute 'encode'
>>> 
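
For what it's worth, here is a minimal Python 3 sketch of the same point,
reusing the four encodings from Steven's example. The names xi, encodings
and encoded are just illustrative; nothing here says how CPython stores
the string internally, only what the str/bytes API exposes.

```python
# Python 3: text (str) and encoded data (bytes) are separate types.
xi = "\u03be"  # GREEK SMALL LETTER XI, the character from the quoted example

# The four encodings quoted above.
encodings = ["iso-8859-7", "utf-8", "utf-16be", "utf-32be"]

# str.encode() goes from text to bytes, one result per encoding.
encoded = {name: xi.encode(name) for name in encodings}

for name, raw in encoded.items():
    # bytes.decode() goes back from bytes to text.
    print(name, raw, raw.decode(name) == xi)

# The methods are one-way: str has no decode, bytes has no encode.
print(hasattr(xi, "decode"), hasattr(b"", "encode"))
```

So in Python 3 the byte values are the same as in the Python 2 session,
but they come back as bytes objects (b'...'), not as strings.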



