[I18n-sig] UCS-4 configuration

Tim Peters tim.one@home.com
Tue, 26 Jun 2001 21:38:34 -0400


[/F]
> looks like your patch doesn't support sizeof(short) > 2 (e.g. cray).
> except for that, it's not too different from what I was working on.

[Martin v. Loewis]
> Indeed it doesn't. How are you going to solve this? Generating
> UCS-2/UTF-16 when you have no two-byte type is not easy, unless you
> plan to do all byte operations yourself.

As opposed to what, having elves do them for us while we sleep <wink>?  You
need at least 16 bits, but it should be no problem if you have more than
that -- all it takes is a tiny bit of care, and standard C (not even C99)
does not guarantee that any integral type has exactly 2 bytes (or 4, or 8).
All C guarantees is minimal sizes, and they refused to make stronger
guarantees than that because the real world wouldn't let them.
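
To make the minimum-size point concrete, here is a small sketch. It is not
code from the patch under discussion, and the names ushort_bits and
check_unit_type are made up for illustration. It counts how many bits
unsigned short really has on the build platform and checks only the lower
bound that C promises, rather than demanding exactly 16.

```c
#include <assert.h>
#include <limits.h>

/* Count the value bits of unsigned short by shifting USHRT_MAX down to 0.
   (Hypothetical helper, not part of any Python source file.) */
int ushort_bits(void)
{
    unsigned short max = USHRT_MAX;
    int bits = 0;
    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* The portable check: at least 16 bits. Asking for exactly 16 would
   reject platforms where short is wider. */
void check_unit_type(void)
{
    assert(ushort_bits() >= 16);
}
```

On a typical desktop compiler ushort_bits() reports 16, but nothing in the
sketch depends on that.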

I have decades of experience with this, so either trust me on it or point me
at code you think is a problem.  The saving grace is that any sequence of
16-bit operations involving +, -, *, &, |, ^ and << yields exactly the same
result if you do it with any number of bits >= 16, then take the last 16
bits at the end.  /, ~ and >> *may* require a little thought.  Note that MAL
made a similar argument in the Cray T3E bug report.  I asked him to point me
at some troublesome code, and it turned out that the code didn't need *any*
changes to work correctly when sizeof(Py_UNICODE)==4 (or 8, or 10000000000
on the next Cray <wink>).
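
A rough sketch of that claim, with made-up names (LOW16, mix16_stepwise,
mix16_wide, shr1_naive, shr1_careful) and an arbitrary sequence of operations
chosen only for illustration: one function truncates to 16 bits after every
step, as a true 16-bit type would, and the other runs the same steps in an
unsigned long and masks once at the end.  The two right-shift helpers show
why >> is one of the operators that needs the mask applied first.

```c
#include <assert.h>

/* Keep only the low 16 bits of a wider unsigned value. */
#define LOW16(x) ((unsigned long)(x) & 0xFFFFUL)

/* Emulate a real 16-bit type: truncate after every operation. */
unsigned long mix16_stepwise(unsigned long a, unsigned long b)
{
    unsigned long x = LOW16(a);
    unsigned long y = LOW16(b);
    unsigned long t = LOW16(x * 31UL);
    t = LOW16(t + y);
    t = LOW16(t ^ LOW16(x << 3));
    t = LOW16(t - LOW16(y & 0x00FFUL));
    t = LOW16(t | 0x0001UL);
    return t;
}

/* Same steps in full width (at least 32 bits), one mask at the end. */
unsigned long mix16_wide(unsigned long a, unsigned long b)
{
    unsigned long t = a * 31UL;
    t = t + b;
    t = t ^ (a << 3);
    t = t - (b & 0x00FFUL);
    t = t | 0x0001UL;
    return LOW16(t);
}

/* >> pulls high bits down into the low 16, so masking last is wrong. */
unsigned long shr1_naive(unsigned long x)   { return LOW16(x >> 1); }
unsigned long shr1_careful(unsigned long x) { return LOW16(x) >> 1; }

/* Compare the two mixers over a spread of inputs wider than 16 bits. */
int count_mismatches(void)
{
    unsigned long a, b;
    int bad = 0;
    for (a = 0; a < 0x40000UL; a += 257UL)
        for (b = 0; b < 0x40000UL; b += 4099UL)
            if (mix16_stepwise(a, b) != mix16_wide(a, b))
                bad++;
    return bad;
}

void self_check(void)
{
    assert(count_mismatches() == 0);
}
```

With the input 0x10000, shr1_naive yields 0x8000 while shr1_careful yields 0,
which is what a genuine 16-bit shift of the truncated value would produce.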

> Anyway, at the moment, it is a compile time error if short is not two
> bytes.

Yes, I discovered that when the Windows build fell on its face <wink>.  Just
ribbing you there -- 'twas a trivial fix.