A few questions about encoding

Chris Angelico rosuav at gmail.com
Thu Jun 20 12:37:48 EDT 2013


On Fri, Jun 21, 2013 at 2:27 AM,  <wxjmfauth at gmail.com> wrote:
> And all these coding schemes have something in common:
> they all work with a unique set of code points, more
> precisely a unique set of encoded code points (not
> the set of implemented code points (bytes)).
>
> That is just what the flexible string representation is not
> doing: it artificially divides Unicode into subsets and tries
> to handle each subset differently.
>


UTF-16 divides Unicode into two subsets: BMP characters (encoded using
one 16-bit unit) and astral characters (encoded using two 16-bit units
in the D800::/5 netblock, i.e. the surrogate range U+D800..U+DFFF).
Your beloved narrow builds are guilty of exactly the same crime as the
hated 3.3.
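
A minimal sketch of that split, assuming Python 3 and the standard
"utf-16-be" codec. The helper name utf16_units is made up for
illustration, and the two sample characters are arbitrary picks:

```python
import struct

def utf16_units(ch):
    """Return the 16-bit code units UTF-16 uses for one character."""
    data = ch.encode("utf-16-be")  # big-endian, no BOM
    return list(struct.unpack(">%dH" % (len(data) // 2), data))

bmp = utf16_units("\u20ac")         # EURO SIGN, inside the BMP
astral = utf16_units("\U0001F600")  # an astral (non-BMP) character

print([hex(u) for u in bmp])     # one code unit
print([hex(u) for u in astral])  # two code units, both surrogates

# Surrogates occupy 0xD800..0xDFFF, so their top five bits are 11011.
print(all(u >> 11 == 0b11011 for u in astral))
```

The BMP character comes out as a single unit and the astral one as a
surrogate pair, so UTF-16 itself treats the two subsets differently.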

ChrisA


