[Python-Dev] 2.2 Unicode questions

Martin von Loewis loewis@informatik.hu-berlin.de
Thu, 19 Jul 2001 17:37:42 +0200 (MEST)


> The impression I got from the discussion around this was that ISO
> 10646 now *also* promises to limit itself to 0x110000 characters
> forever.  MvL or MAL can corroborate.

It appears that the current state is still that of resolution M38.6,
as reported in

http://209.109.201.97/unicode/reports/tr19/tr19-7.html

# WG2 accepts the proposal in document N2175 towards removing the
# provision for Private Use Groups and Planes beyond Plane 16 in
# ISO/IEC 10646, to ensure internal consistency in the standard
# between UCS-4, UTF-8 and UTF-16 encoding formats, and instructs its
# project editor [to] prepare suitable text for processing as a future
# Technical Corrigendum or an Amendment to 10646-1:2000.

The original proposal can be found in

http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2175.htm

It appears that the promised amendment is PDAM 1 to ISO 10646-1:2000,
in

http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2308.pdf

which, in 9.1, reserves planes 11 to FF of group 0, and all other
groups, for future use, and removes the private use planes E0 to FF
of group 0 as well as the private use groups 60-7F. In addition, it
adds the note

# To ensure continued interoperability between the UTF-16 form and
# other coded representations of the UCS, it is intended that no other
# characters will ever be allocated to code positions above 0010FFFF.
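
A minimal sketch of why the UTF-16 form tops out at 0010FFFF, and how
that maps onto the group/plane layout described above. The helper
names (decode_surrogate_pair, group_and_plane) are made up for
illustration and do not come from the draft or from Python itself;
the surrogate ranges D800-DBFF and DC00-DFFF are the standard UTF-16
ones.

```python
def decode_surrogate_pair(high, low):
    """Combine a UTF-16 high/low surrogate pair into a code position."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: %#x" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: %#x" % low)
    # Each surrogate carries 10 bits; the pair addresses positions
    # starting at 0x10000, just past plane 00 (the BMP).
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def group_and_plane(code_position):
    """Split a UCS-4 code position into its (group, plane) octets."""
    return (code_position >> 24) & 0x7F, (code_position >> 16) & 0xFF


# The largest possible pair gives the highest position UTF-16 can reach.
MAX_UTF16 = decode_surrogate_pair(0xDBFF, 0xDFFF)

# Number of 65536-position planes covered by 0 .. MAX_UTF16.
PLANE_COUNT = (MAX_UTF16 + 1) // 0x10000

print("%08X" % MAX_UTF16)                # highest reachable position
print(PLANE_COUNT)                       # planes 00 through 10
print(group_and_plane(MAX_UTF16))        # last usable plane of group 0
print(group_and_plane(MAX_UTF16 + 1))    # first plane reserved by 9.1
```

Under these assumptions the last position reachable from UTF-16 falls
in plane 10 of group 0, and the next position is the start of plane
11, the first of the planes that 9.1 reserves.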

However, this amendment is still in the draft stage, with comments
in

http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n2355.pdf

Since voting in ISO usually takes a while, it may be several more
months until ISO 10646 is officially restricted to 17 planes, but it
is very likely to happen.

Regards,
Martin