[Python-Dev] RE: [Patches] [Patch #100745] Fix PR #384, fixes UTF-8 en/decode

Greg Stein gstein@lyra.org
Thu, 6 Jul 2000 15:00:42 -0700


On Thu, Jul 06, 2000 at 11:42:08PM +0200, M.-A. Lemburg wrote:
> Bill Tutt wrote:
> > 
> > On Thu, 6 Jul 2000, Guido van Rossum wrote:
> > 
> > > > In any event, having the typedef is still useful since it clarifies the
> > > > meaning behind the code.
> > >
> > 
> > How about this:
> > /*
> >  * Use this typedef when you need to represent a UTF-16 surrogate pair
> >  * as a single unsigned integer.
> >  */
> > #if SIZEOF_INT >= 4
> > typedef unsigned int Py_UCS4;
> > #else
> > #if SIZEOF_LONG >= 4
> > typedef unsigned long Py_UCS4;
> > #else
> > #error "can't find integral type that can contain 32 bits"
> > #endif /* SIZEOF_LONG */
> > #endif /* SIZEOF_INT */
> 
> I like the name... Py_UCS4 is indeed what we're talking about
> here.
> 
> What I don't understand is why you raise a compile error; AFAIK,
> unsigned long is at least 32 bits on all platforms and that's
> what the Unicode implementation would need to support UCS4 -- more
> bits don't do any harm since the storage type is fixed at
> 16-bit UTF-16 values.

Agreed. Some logic is still desirable (picking "int" over "long" is
goodness), but the error is not needed. Python simply does not work if a
long is not at least 32 bits. Period. No reason for an error.
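
For illustration only, here is a rough sketch of the typedef with the #error
branch dropped, falling back to "unsigned long" (which C guarantees to hold at
least 32 bits). SIZEOF_INT normally comes from the configure-generated header;
deriving it from limits.h below is an assumption made so the fragment compiles
on its own, and join_surrogates() is a hypothetical helper, not something from
the patch.

```c
#include <limits.h>

/* Assumption: SIZEOF_INT is normally provided by the configure-generated
 * header. It is approximated here only so the sketch stands alone. */
#ifndef SIZEOF_INT
#if UINT_MAX >= 0xFFFFFFFFUL
#define SIZEOF_INT 4   /* int holds at least 32 bits */
#else
#define SIZEOF_INT 2   /* 16-bit int */
#endif
#endif

/*
 * Use this typedef when you need to represent a UTF-16 surrogate pair
 * as a single unsigned integer.
 */
#if SIZEOF_INT >= 4
typedef unsigned int Py_UCS4;
#else
typedef unsigned long Py_UCS4;  /* at least 32 bits per the C standard */
#endif

/* Hypothetical helper: combine a high (0xD800..0xDBFF) and a low
 * (0xDC00..0xDFFF) surrogate into one code point. No range checking. */
static Py_UCS4 join_surrogates(unsigned int high, unsigned int low)
{
    return 0x10000UL + (((Py_UCS4)(high - 0xD800) << 10)
                        | (Py_UCS4)(low - 0xDC00));
}
```

Preferring "int" keeps the type at 32 bits on platforms where "long" is 64
bits wide; the fallback only matters where int is 16 bits.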

> Ideal would be combining the above with the C9X typedefs,
> e.g. typedef uint4_t Py_UCS4;

Actually, uint_fast32_t
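
A minimal sketch of what that could look like, assuming a compiler that ships
the C9X <stdint.h> header (its type names count bits, hence "32" rather than
"4"). ucs4_needs_surrogates() is a hypothetical helper added only to exercise
the type.

```c
#include <stdint.h>

/* uint_fast32_t: the fastest unsigned type with at least 32 bits.
 * It may be wider than 32 bits on some platforms. */
typedef uint_fast32_t Py_UCS4;

/* Hypothetical helper: nonzero if the code point lies outside the BMP
 * and so needs a surrogate pair when stored as UTF-16. */
static int ucs4_needs_surrogates(Py_UCS4 ch)
{
    return ch >= 0x10000UL && ch <= 0x10FFFFUL;
}
```

Since the type may be wider than 32 bits, code should not rely on values
wrapping at 2**32.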

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/