Python 3.2 has some deadly infection

Marko Rauhamaa marko at pacujo.net
Fri Jun 6 13:02:47 EDT 2014


Chris Angelico <rosuav at gmail.com>:

> "ASCII" means two things: Firstly, it's a mapping from the letter A to
> the number 65, from the exclamation mark to 33, from the backslash to
> 92, and so on. And secondly, it's an encoding of those numbers into
> the lowest seven bits of a byte, with the high bit left clear.
> Between those two, you get a means of representing the letter 'A' as
> the byte 0x41, and one of them is an encoding.

   The American Standard Code for Information Interchange [...] is a
   character-encoding scheme [...] <URL:
   http://en.wikipedia.org/wiki/ASCII>
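
For concreteness, here is a minimal sketch of the two senses of "ASCII"
described in the quote, assuming Python 3. The variable names are
illustrative only; ord() and str.encode() are the standard built-ins.

```python
# Sense 1: the mapping from characters to numbers (the "code").
code_A = ord("A")            # 65
code_bang = ord("!")         # 33
code_backslash = ord("\\")   # 92

# Sense 2: storing such a number in the low seven bits of a byte.
encoded = "A".encode("ascii")        # a bytes object of length 1
byte_value = encoded[0]              # 0x41
high_bit_clear = (byte_value & 0x80) == 0

print(code_A, hex(byte_value), high_bit_clear)
```

In practice the two steps are so closely tied for ASCII that the one name
covers both.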

> "Unicode", on the other hand, is only the first part. It maps all the
> same characters to the same numbers that ASCII does, and then adds a
> few more... a few followed by a few, followed by... okay, quite a lot
> more. Unicode specifies that the character OK HAND SIGN, which looks
> like 👌 if you have the right font, is number 1F44C in hex (128076
> decimal). This is the "Universal Character Set" or UCS.

   Unicode is a computing industry standard for the consistent encoding,
   representation and handling of text [...] <URL:
   http://en.wikipedia.org/wiki/Unicode>
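
As a rough illustration of the number assignment described in the quote,
again assuming Python 3: the character gets one number, and the byte
layout depends on which encoding is chosen afterwards. The choice of
UTF-8, UTF-16 and UTF-32 here is mine, not something from the quoted
message.

```python
import unicodedata

ok = "\U0001F44C"                  # OK HAND SIGN
code_point = ord(ok)               # 0x1F44C, i.e. 128076 decimal
name = unicodedata.name(ok)

# Unicode keeps the ASCII numbers for the ASCII characters.
same_as_ascii = ord("A") == 65 and chr(92) == "\\"

# One number, several possible byte representations.
utf8 = ok.encode("utf-8")          # four bytes
utf16 = ok.encode("utf-16-be")     # a surrogate pair, four bytes
utf32 = ok.encode("utf-32-be")     # the number itself, four bytes

print(hex(code_point), name, utf8, utf16, utf32)
```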

Each standard assigns numbers to letters and other symbols. In a word,
each is a code. That's what their names say, too.


Marko


