Flexible string representation, unicode, typography, ...

Ian Kelly ian.g.kelly at gmail.com
Mon Aug 27 16:14:07 EDT 2012


On Mon, Aug 27, 2012 at 1:16 PM,  <wxjmfauth at gmail.com> wrote:
> - Why int32 and not uint32? No idea, I tried to find an
> answer without asking.

UCS-4 is technically only a 31-bit encoding. The sign bit is not used,
so the choice of int32 vs. uint32 is inconsequential.

(In fact, since Unicode was deliberately limited to the range 0
through 0x0010FFFF, a code point needs only 21 bits, so one might even
point out that the *entire high-order byte* as well as 3 bits of the
next byte are irrelevant.  Truly, UTF-32 is not designed for memory
efficiency.)
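
The bit counting can be checked with a quick sketch in Python 3. The
variable names are only illustrative, and the big-endian codec is chosen
so that the high-order byte comes first in the output:

```python
# Highest code point Unicode allows.
MAX_CODE_POINT = 0x10FFFF

# Number of bits needed to represent it, and how many of the 32 are left over.
bits_needed = MAX_CODE_POINT.bit_length()
unused_bits = 32 - bits_needed

# The sign bit of a 32-bit integer is never touched.
fits_in_int32 = MAX_CODE_POINT < 2**31

# Encode the highest code point as big-endian UTF-32 (no BOM),
# so byte 0 is the high-order byte.
encoded = chr(MAX_CODE_POINT).encode("utf-32-be")

print(bits_needed, unused_bits, fits_in_int32)
print(encoded.hex())
```

The leftover bits are the whole first byte plus the top 3 bits of the
second one, which is where the "high-order byte and 3 bits" figure
comes from.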


