How to turn a string into a list of integers?

Chris “Kwpolska” Warrick kwpolska at gmail.com
Fri Sep 5 14:25:16 EDT 2014


On Sep 5, 2014 7:57 PM, "Kurt Mueller" <kurt.alfred.mueller at gmail.com>
wrote:
> Could someone please explain the following behavior to me:
> Python 2.7.7, MacOS 10.9 Mavericks
>
> >>> import sys
> >>> sys.getdefaultencoding()
> 'ascii'
> >>> [ord(c) for c in 'AÄ']
> [65, 195, 132]
> >>> [ord(c) for c in u'AÄ']
> [65, 196]
>
> My obviously wrong understanding:
> 'AÄ' in 'ascii' are two characters
>      one with ord A=65 and
>      one with ord Ä=196 ISO8859-1 <depends on code table>
>      --> why [65, 195, 132]
> u'AÄ' is a Unicode string
>      --> why [65, 196]
>
> It is just the other way round as I would expect.

Basically, the first string is just a bunch of bytes, as provided by your
terminal, which sounds like UTF-8 (perfectly logical in 2014). In UTF-8, Ä
is encoded as the two bytes 0xC3 0x84, i.e. 195 and 132. The second one is
decoded into a real Unicode string, so you get one code point per
character. The code point for Ä is U+00C4 (196 decimal). It matches the
latin1 (aka ISO 8859-1) value only because Unicode starts with all 256
latin1 code points. Please kindly forget encodings other than UTF-8.
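
A minimal sketch of the bytes-versus-code-points difference. It assumes
Python 3, where str is always Unicode and bytes is a separate type, so it is
not the Python 2.7 session quoted above. The variable names are only
illustrative:

```python
# Python 3: str holds code points, bytes holds encoded bytes.
# 'A\u00c4' is 'AÄ', written with an escape so the source file's own
# encoding does not matter.
text = 'A\u00c4'

code_points = [ord(c) for c in text]         # one integer per character
utf8_bytes = list(text.encode('utf-8'))      # Ä becomes two bytes in UTF-8
latin1_bytes = list(text.encode('latin-1'))  # Ä is a single byte in latin1

print(code_points)   # [65, 196]
print(utf8_bytes)    # [65, 195, 132]
print(latin1_bytes)  # [65, 196]

# Decoding the UTF-8 bytes gives back the original two-character string.
roundtrip = bytes(utf8_bytes).decode('utf-8')
print(roundtrip == text)  # True
```

In Python 2.7 terms, the plain 'AÄ' literal behaves like utf8_bytes here and
the u'AÄ' literal behaves like text.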

BTW: ASCII covers only the first 128 byte values (0-127), so Ä is not in
ASCII at all.
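
To illustrate the ASCII limit, here is a small sketch, again assuming
Python 3 rather than the original 2.7 session. The flag names are made up
for the example:

```python
# ASCII only defines values 0-127, so anything above that fails both ways.
text = 'A\u00c4'           # 'AÄ'
raw = b'A\xc3\x84'         # the UTF-8 bytes for 'AÄ'

try:
    text.encode('ascii')   # Ä (196) has no ASCII representation
    encode_ok = True
except UnicodeEncodeError:
    encode_ok = False

try:
    raw.decode('ascii')    # bytes 195 and 132 are outside 0-127
    decode_ok = True
except UnicodeDecodeError:
    decode_ok = False

print(encode_ok, decode_ok)  # False False
print('A'.encode('ascii'))   # b'A'
```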

--
Chris “Kwpolska” Warrick <http://chriswarrick.com/>
Sent from my Galaxy S3.