[issue10542] Py_UNICODE_NEXT and other macros for surrogates

Alexander Belopolsky report at bugs.python.org
Fri Dec 3 20:27:08 CET 2010


Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

On Sat, Nov 27, 2010 at 6:38 PM, Raymond Hettinger
<report at bugs.python.org> wrote:
..
> I suggest Py_UNICODE_ADVANCE() to avoid false suggestion that the iterator protocol is being used.
>

As a data point, ICU defines U16_NEXT() for a similar purpose.  I also
like ICU's terminology for surrogates ("lead" and "trail") better than
the backward "high" and "low".  ICU's U16_APPEND() suggests
Py_UNICODE_APPEND instead of PUT_NEXT (which has the added virtue of
not having "next" in the name).  I still like NEXT better than
ADVANCE because it is shorter and has an obvious PREV counterpart that
we may want to add later.

Note that ICU uses the U16_ prefix for these macros even when they
operate on 32-bit characters.
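
For readers unfamiliar with the ICU-style macros, here is a minimal
sketch of what NEXT and APPEND could look like over 16-bit code units.
The SKETCH_* names and the helper function are hypothetical and chosen
only for illustration; they are neither ICU's definitions nor the API
proposed for CPython, and the sketch assumes unpaired surrogates are
simply passed through unchanged.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, loosely modeled on the ICU naming discussed above. */
#define SKETCH_IS_LEAD(c)  (((c) & 0xFFFFFC00u) == 0xD800u)  /* 0xD800..0xDBFF */
#define SKETCH_IS_TRAIL(c) (((c) & 0xFFFFFC00u) == 0xDC00u)  /* 0xDC00..0xDFFF */

/* Combine a lead/trail surrogate pair into a single code point. */
#define SKETCH_JOIN(lead, trail) \
    ((((uint32_t)(lead) - 0xD800u) << 10) + \
     ((uint32_t)(trail) - 0xDC00u) + 0x10000u)

/* Read the code point starting at s[i] into c and advance i past it.
   An unpaired surrogate is returned as-is (an assumption of this sketch).
   c should be a uint32_t lvalue; i and length should be size_t. */
#define SKETCH_NEXT(s, i, length, c) do {                       \
        (c) = (s)[(i)++];                                       \
        if (SKETCH_IS_LEAD(c) && (i) < (length) &&              \
            SKETCH_IS_TRAIL((s)[(i)])) {                        \
            (c) = SKETCH_JOIN((c), (s)[(i)]);                   \
            (i)++;                                              \
        }                                                       \
    } while (0)

/* Write code point c at s[i] and advance i by one or two units.
   No bounds or validity checking: the caller must ensure there is room
   and that c is at most 0x10FFFF. */
#define SKETCH_APPEND(s, i, c) do {                                       \
        if ((uint32_t)(c) <= 0xFFFFu) {                                   \
            (s)[(i)++] = (uint16_t)(c);                                   \
        } else {                                                          \
            (s)[(i)++] = (uint16_t)((((uint32_t)(c) - 0x10000u) >> 10)    \
                                    + 0xD800u);                           \
            (s)[(i)++] = (uint16_t)((((uint32_t)(c) - 0x10000u) & 0x3FFu) \
                                    + 0xDC00u);                           \
        }                                                                 \
    } while (0)

/* Count code points (not code units) in a UTF-16 buffer. */
static size_t sketch_count_code_points(const uint16_t *s, size_t length)
{
    size_t i = 0, n = 0;
    uint32_t c;
    while (i < length) {
        SKETCH_NEXT(s, i, length, c);
        (void)c;  /* only the position matters here */
        n++;
    }
    return n;
}
```

A PREV counterpart would mirror SKETCH_NEXT by stepping the index
backward and checking for a trail surrogate preceded by a lead.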

More at

http://icu-project.org/apiref/icu4c/utf16_8h.html
http://userguide.icu-project.org/strings

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10542>
_______________________________________

