[Python-Dev] Re: [Python-checkins] python/dist/src/Objects unicodeobject.c, 2.197, 2.198

Tim Peters tim.one at comcast.net
Wed Sep 17 22:55:35 EDT 2003


[Jeremy Hylton]
> I was a little confused by the various UNICODE macros.  (Is there a
> comment block somewhere that explains what they are for?)

Not that I've found.  If someone writes one, don't forget the intended
difference between PY_UNICODE_TYPE and Py_UNICODE (hint:  there isn't a
difference <wink>).

> gcc -E tells me:
>
> typedef unsigned int Py_UCS4;
> typedef wchar_t Py_UNICODE;
> typedef long int wchar_t;
>
> (not necessarily in that order)
>
> I got Py_UCS4 and Py_UNICODE confused.  The detailed output confirms
> that Py_UNICODE is a signed long int.

So that puts an end to the claim that it's unlikely wchar_t will resolve to
a signed type.  Strangely, while char is a signed type under MSVC, wchar_t
is an unsigned type.  I expect both differ under gcc, then.  At least it's
consistent <wink>.

Anyway, every place where the code may be doing

    a_Py_UNICODE  comparison  a_(signed)_int

is now doing something unintended on your box.  "The rules" for
mixed-signedness comparison are pretty much a nightmare, especially when
you're not sure how many bits are involved on both sides:

    http://yarchive.net/comp/ansic_broken_unsigned.html
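
To make the hazard concrete, here is a minimal sketch of how the same
source-level test can change meaning with the type on the left.  The typedef
and function names are hypothetical stand-ins, not anything in
unicodeobject.c, and the unsigned short case assumes int is wider than short,
as it is on the usual 32-bit boxes:

```c
/* Hypothetical stand-ins for the ways PY_UNICODE_TYPE might resolve. */
typedef long           signed_unicode_t;    /* like wchar_t in the gcc -E output */
typedef unsigned int   unsigned_unicode_t;  /* same-rank unsigned type, like Py_UCS4 */
typedef unsigned short narrow_unicode_t;    /* a 16-bit unsigned wchar_t */

/* Each function spells the same test: ch < limit, with limit a signed int. */

int less_than_signed(signed_unicode_t ch, int limit)
{
    /* limit is converted to long: an ordinary signed comparison. */
    return ch < limit;
}

int less_than_unsigned(unsigned_unicode_t ch, int limit)
{
    /* limit is converted to unsigned int, so -1 becomes UINT_MAX. */
    return ch < limit;
}

int less_than_narrow(narrow_unicode_t ch, int limit)
{
    /* Assuming int is wider than short, ch is promoted to int:
       a signed comparison again, despite the unsigned type. */
    return ch < limit;
}
```

With ch = 65 and limit = -1, the signed and narrow versions return 0 while
the unsigned int version returns 1.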

MAL's idea of forcing PY_UNICODE_TYPE to resolve to an unsigned type may be
the easiest way out.
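
For what it's worth, a build could check which way a type resolved with
something like the sketch below.  TYPE_IS_UNSIGNED and wchar_is_unsigned are
hypothetical names made up for illustration; this is not how Python's
configure or unicodeobject.h does it:

```c
#include <stddef.h>   /* wchar_t */

/* Hypothetical helper: nonzero if integer type T is unsigned.
   Casting -1 to an unsigned type yields its maximum value, which is
   greater than 0; for a signed type it stays -1. */
#define TYPE_IS_UNSIGNED(T) ((T)-1 > (T)0)

/* Reports how wchar_t resolved on the box this is compiled on. */
int wchar_is_unsigned(void)
{
    return TYPE_IS_UNSIGNED(wchar_t);
}
```

If wchar_is_unsigned() returns 0, as it would given the gcc -E output quoted
above, PY_UNICODE_TYPE would need to be pointed at some other, unsigned type
of suitable width.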



