Flexible string representation, unicode, typography, ...

wxjmfauth at gmail.com wxjmfauth at gmail.com
Wed Aug 29 07:38:21 EDT 2012


On Monday, 27 August 2012 22:37:03 UTC+2, (unknown) wrote:
> On Monday, 27 August 2012 22:14:07 UTC+2, Ian wrote:
> > On Mon, Aug 27, 2012 at 1:16 PM,  <wxjmfauth at gmail.com> wrote:
> > > - Why int32 and not uint32? No idea, I tried to find an
> > > answer without asking.
> >
> > UCS-4 is technically only a 31-bit encoding. The sign bit is not used,
> > so the choice of int32 vs. uint32 is inconsequential.
> >
> > (In fact, since they made the decision to limit Unicode to the range 0
> > - 0x0010FFFF, one might even point out that the *entire high-order
> > byte* as well as 3 bits of the next byte are irrelevant.  Truly,
> > UTF-32 is not designed for memory efficiency.)
>
> I know all this. The question is more, why not a uint32 knowing
> there are only positive code points. It seems to me more "natural".

Answer found. In short: a signed type makes negative ints
available, and using them simplifies internal tasks.
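
For anyone who wants to check the arithmetic quoted above, here is a
minimal sketch. It only illustrates the bit counts and the int32/uint32
difference; NOT_A_CHAR and packs_as are hypothetical names made up for
this example, and nothing here is taken from CPython's source.

```python
import struct

MAX_CODE_POINT = 0x10FFFF   # upper limit of the Unicode range quoted above
NOT_A_CHAR = -1             # hypothetical sentinel, for illustration only


def packs_as(fmt, value):
    """Return True if value can be packed with the given struct format."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False


# Bits needed for the largest code point, and bits left over in 32.
bits_needed = MAX_CODE_POINT.bit_length()
unused_bits = 32 - bits_needed          # the high-order byte plus 3 bits

# For every valid code point the int32 and uint32 byte patterns agree,
# which is why the choice is inconsequential for the data itself.
same_bytes = struct.pack("<i", MAX_CODE_POINT) == struct.pack("<I", MAX_CODE_POINT)

# A negative value fits only in the signed type.
signed_ok = packs_as("<i", NOT_A_CHAR)
unsigned_ok = packs_as("<I", NOT_A_CHAR)

print(bits_needed, unused_bits, same_bytes, signed_ok, unsigned_ok)
```

With a signed type, a negative value such as the sentinel above can never
collide with a real code point, which is one plausible reading of
"simplifies internal tasks".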


