[issue10542] Py_UNICODE_NEXT and other macros for surrogates

Thu Dec 30 03:38:41 CET 2010

Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

On Wed, Dec 29, 2010 at 8:02 PM, Martin v. Löwis <report at bugs.python.org> wrote:
..
>
> I plan to propose a complete redesign of the representation of Unicode
> strings, which may well make this entire set of changes obsolete.
>

Are you serious?  This sounds like a py4k idea.  Can you give us a
hint on what the new representation will be?  Meanwhile, what is your
recommendation for application developers?  Should they attempt to fix
the code that assumes len(chr(i)) == 1?  Should text processing
applications designed to run on a narrow build simply reject non-BMP
text? Should application writers avoid using str.isxyz() methods?
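
To make the len(chr(i)) == 1 assumption concrete, here is a minimal
sketch of what a narrow build sees for a non-BMP character.  The helper
names utf16_units and iter_code_points are hypothetical, and the
UTF-16 encoding step is only an assumption used to emulate narrow-build
storage on any interpreter; this is not the Py_UNICODE_NEXT macro itself.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s, emulating narrow-build storage."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def iter_code_points(units):
    """Yield code points from UTF-16 code units, joining surrogate pairs."""
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        # A high surrogate followed by a low surrogate encodes one
        # code point outside the BMP.
        if (0xD800 <= u <= 0xDBFF and i + 1 < n
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            yield 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP character, or a lone surrogate passed through unchanged.
            yield u
            i += 1


units = utf16_units(chr(0x10400))
print(len(units))                     # number of code units, not characters
print([hex(cp) for cp in iter_code_points(units)])
```

Code that indexes or measures the raw units treats the pair as two
characters; code that walks the string with something like
iter_code_points sees one.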

> As for language definition: I think the definition is quite clear
> and unambiguous. It may be that Python 3.2 doesn't fully implement it.
>

Given that until recently (r87433) the PEP and the reference manual
disagreed on the definition, I have to ask what definition you refer
to.  What Python 3.2 (or rather 3.1) implements, however, is important
because it has been declared to be *the* definition of the Python
language regardless of what PEPs and docs have to say.

> IOW: relax.

This is the easy part. :-)

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10542>
_______________________________________