[Python-Dev] RE: [Patches] [Patch #100745] Fix PR #384, fixes UTF-8 en/decode

M.-A. Lemburg mal@lemburg.com
Thu, 06 Jul 2000 23:42:08 +0200


Bill Tutt wrote:
> 
> On Thu, 6 Jul 2000, Guido van Rossum wrote:
> 
> > > In any event, having the typedef is still useful since it clarifies the
> > > meaning behind the code.
> >
> 
> How about this:
> /*
>  * Use this typedef when you need to represent a UTF-16 surrogate pair
>  * as a single unsigned integer.
>  */
> #if SIZEOF_INT >= 4
> typedef unsigned int Py_UCS4;
> #else
> #if SIZEOF_LONG >= 4
> typedef unsigned long Py_UCS4;
> #else
> #error "can't find integral type that can contain 32 bits"
> #endif /* SIZEOF_LONG */
> #endif /* SIZEOF_INT */

I like the name... Py_UCS4 is indeed what we're talking about
here.

What I don't understand is why you raise a compile error; AFAIK,
unsigned long is at least 32 bits on all platforms and that's
what the Unicode implementation would need to support UCS4 -- more
bits don't do any harm since the storage type is fixed at
16-bit UTF-16 values.
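
To illustrate the point about unsigned long: ISO C requires ULONG_MAX
to be at least 4294967295, so a width check on it should never fire
on a conforming compiler. A minimal sketch, not code from the patch;
join_surrogates() is a hypothetical helper name used here only to
show what the wider type is for.

```c
#include <assert.h>
#include <limits.h>

/* ISO C guarantees ULONG_MAX >= 4294967295 (2^32 - 1), so this check
 * cannot fail on a conforming compiler. */
#if ULONG_MAX < 0xFFFFFFFFUL
#error "unsigned long is narrower than 32 bits"
#endif

typedef unsigned long Py_UCS4;

/* Hypothetical helper (not part of the patch): combine a UTF-16
 * surrogate pair into a single UCS-4 code point. */
static Py_UCS4 join_surrogates(unsigned int high, unsigned int low)
{
    /* high must be a lead surrogate, low a trail surrogate */
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);
    return 0x10000UL + (((Py_UCS4)(high - 0xD800) << 10)
                        | (Py_UCS4)(low - 0xDC00));
}
```

The result is only ever held in a local Py_UCS4; the string buffer
itself keeps storing the two 16-bit units, which is why extra bits in
the typedef cost nothing.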

Ideally, the above would be combined with the C9X typedefs,
e.g. typedef uint32_t Py_UCS4;
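
One possible shape for that combination, assuming the C9X type meant
here is the exact-width 32-bit one from <stdint.h> (spelled uint32_t
in the final C99 standard). This is a sketch only: it tests the
standard macros UINT32_MAX and UINT_MAX rather than the SIZEOF_INT and
SIZEOF_LONG configure macros used in the proposal above.

```c
#include <limits.h>

/* <stdint.h> is only guaranteed to exist from C99 onwards. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>
#endif

#if defined(UINT32_MAX)
/* C99 defines UINT32_MAX exactly when uint32_t is available. */
typedef uint32_t Py_UCS4;
#elif UINT_MAX >= 0xFFFFFFFFUL
/* Pre-C99 compiler, but unsigned int is already wide enough. */
typedef unsigned int Py_UCS4;
#else
/* unsigned long is at least 32 bits on every conforming compiler,
 * so no #error branch is needed. */
typedef unsigned long Py_UCS4;
#endif
```

The last branch replaces the #error: since unsigned long always
qualifies, the chain cannot fall through without a typedef.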

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/