[I18n-sig] UCS-4 configuration

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Wed, 27 Jun 2001 00:50:24 +0200


> ouch.  duplicate effort here.

Sorry about this. When I noticed you had some code committed, I
thought "release early, release often".

> go ahead and check it in.

Done. Some clean-up could still be applied, such as defining only one
of USE_UCS4_STORAGE and Py_UNICODE_SIZE, but I'll leave that to your
judgement (i.e. I won't attempt any further changes at the moment
unless asked).

> looks like your patch doesn't support sizeof(short) > 2 (e.g. cray).
> except for that, it's not too different from what I was working on.

Indeed it doesn't. How are you going to solve this? Generating
UCS-2/UTF-16 when you have no two-byte type is not easy, unless you
plan to do all byte operations yourself.
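
To make "doing all byte operations yourself" concrete, here is a
minimal sketch of a UTF-16 encoder that never touches a two-byte
type: it builds each 16-bit code unit as a plain integer and emits
its two bytes by shifting and masking. It is written in Python for
brevity and is only an illustration; the name encode_utf16_le is
hypothetical and is not part of the patch, and the real codec would
do the same arithmetic in C.

```python
def encode_utf16_le(code_points):
    """Encode an iterable of integer code points as UTF-16-LE bytes.

    Hypothetical helper for illustration only. Every 16-bit code unit
    is held in an ordinary integer and written out one byte at a time,
    so no two-byte storage type is required.
    """
    out = bytearray()
    for cp in code_points:
        # Reject values outside the Unicode range and lone surrogates.
        if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("not a Unicode scalar value: %#x" % cp)
        if cp < 0x10000:
            # BMP character: a single code unit.
            units = (cp,)
        else:
            # Supplementary character: split into a surrogate pair.
            v = cp - 0x10000
            units = (0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF))
        for unit in units:
            out.append(unit & 0xFF)         # low byte first (little endian)
            out.append((unit >> 8) & 0xFF)  # then the high byte
    return bytes(out)
```

The cost of this approach is that every read and write of a code unit
goes through shifts and masks instead of a plain array access.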

Anyway, at the moment, it is a compile-time error if short is not two
bytes. I hope I found all places where Py_UCS2 should be used.

Regards,
Martin

P.S. This patch makes the test suite fail in four-byte mode when
trying to check the output of u'\ud800\udc02'.encode('utf-8'). IMO,
all literals denoting surrogates should be replaced with \U
literals in test_unicode; this is not done yet.
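
To illustrate the P.S.: the pair \ud800\udc02 stands for the single
code point U+10002, which a \U literal names directly regardless of
the storage width. The sketch below assumes a present-day Python 3
interpreter (not the 2001 build under discussion), and
combine_surrogates is a hypothetical helper, not a function from the
test suite.

```python
def combine_surrogates(high, low):
    """Return the code point encoded by a UTF-16 surrogate pair.

    Hypothetical helper for illustration only.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The surrogate pair from the failing test maps to U+10002.
code_point = combine_surrogates(0xD800, 0xDC02)

# The equivalent \U literal denotes that code point directly, so it
# does not depend on how wide the internal storage is.
literal = "\U00010002"

# UTF-8 encodes U+10002 as a single four-byte sequence.
encoded = literal.encode("utf-8")
print(hex(code_point), encoded)  # 0x10002 b'\xf0\x90\x80\x82'
```

With \U literals the expected UTF-8 output is the same four bytes in
both two-byte and four-byte mode, which is what the test wants to
check.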