Coding systems are political (was Extended ASCII and code pages)
Terry Reedy
tjreedy at udel.edu
Sun May 29 14:46:29 EDT 2016
On 5/29/2016 2:12 AM, Rustom Mody wrote:
> In short that a € costs more than a $ is a combination of the factors
> - a natural cause -- there are a million chars to encode (let's assume that the
> million of Unicode is somehow God-given AS A SET)
> - an artificial political one -- out of the million-factorial permutations of
> that million, the one that the Unicode consortium chose is towards satisfying the
> equation: Keep ASCII users undisturbed and happy
From a Python developer's viewpoint, Unicode might as well be a fact of
nature. I also note that in English text, a (phoneme) char conveys
about 6 bits of information, while in Chinese text, a (word) char
conveys perhaps 15 bits of information. So I argue that the FSR
(flexible string representation, PEP 393) in Python 3.3+ is being fair
in using 1 byte per char for the first and, most often, 2 bytes per
char for the second.
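
A rough way to check the storage claim, as a sketch only: it assumes
CPython 3.3 or later, where sys.getsizeof() reports the in-memory size
of a str. The helper name bytes_per_char is made up for illustration,
and the 6-bit and 15-bit figures are just the estimates given above,
not measurements.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate FSR storage per character for a str made only of ch.

    Hypothetical helper; assumes CPython 3.3+ (PEP 393).
    Subtracting two sizes cancels the fixed per-object header.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# In-memory width under the FSR: ASCII gets 1 byte, other BMP chars get 2.
for ch in "$a€中":
    print(repr(ch), bytes_per_char(ch), "byte(s) in memory,",
          len(ch.encode("utf-8")), "byte(s) in UTF-8")

# Information density using the estimates from this post (assumed figures).
english_density = 6 / (8 * bytes_per_char("a"))    # bits conveyed per bit stored
chinese_density = 15 / (8 * bytes_per_char("中"))
print(english_density, chinese_density)
```

Note that the € vs $ cost in the quoted text concerns an encoding such as
UTF-8 (3 bytes vs 1), while the FSR figure is about how CPython stores a
str internally; the two numbers need not agree.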
--
Terry Jan Reedy