A few questions about encoding

Dave Angel davea at davea.name
Wed Jun 12 08:43:05 EDT 2013


On 06/12/2013 05:24 AM, Steven D'Aprano wrote:
> On Wed, 12 Jun 2013 09:09:05 +0000, Νικόλαος Κούρας wrote:
>
>> Isn't 14 bits way too many to store a character?
>
> No.
>
> There are 1114111 possible characters in Unicode. (And in Japan, they
> sometimes use TRON instead of Unicode, which has even more.)
>
> If you list out all the combinations of 14 bits:
>
> 0000 0000 0000 00
> 0000 0000 0000 01
> 0000 0000 0000 10
> 0000 0000 0000 11
> [...]
> 1111 1111 1111 10
> 1111 1111 1111 11
>
> you will see that there are only 32767 (2**15-1) such values. You can't
> fit 1114111 characters with just 32767 values.
>
>

Actually, it's worse.  There are only 16384 such values (2**14), and that
count already includes the all-zero pattern (null), which you did in your
list.
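
The arithmetic is easy to check at the interpreter. This is only a rough
sketch: bit_patterns is a name made up for illustration, and it assumes
Python 3.3 or later, where sys.maxunicode is 0x10FFFF (older narrow builds
report 0xFFFF instead). Code points run from 0 through 1114111, so counting
zero there are 1114112 of them.

```python
import sys

def bit_patterns(n):
    """Number of distinct values n bits can hold, counting all-zeros."""
    return 2 ** n

# Code points run from 0 through sys.maxunicode inclusive.
code_points = sys.maxunicode + 1

# Smallest bit width that can represent the highest code point.
bits_needed = sys.maxunicode.bit_length()

print(bit_patterns(14))   # 16384
print(code_points)        # 1114112 on Python 3.3+
print(bits_needed)        # 21 on Python 3.3+
```

So 14 bits falls short by a wide margin; a fixed-width scheme would need at
least 21 bits per character to cover the whole range.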

-- 
DaveA


