[issue9200] Make str methods work with non-BMP chars on narrow builds

Ezio Melotti report at bugs.python.org
Fri Aug 19 17:55:03 CEST 2011


Ezio Melotti <ezio.melotti at gmail.com> added the comment:

Here's a new version of the patch.
I decided to leave the prefix anyway, for consistency with what I'll commit to 3.3 and because without the prefix NEXT() looks ambiguous (and it's not entirely clear if it's private or not).
I rewrote the macro as Victor suggested and tested that it still works (I also added a test with surrogates).
The macros are now called _Py_UNICODE_IS_{LOW|HIGH}_SURROGATE, with '_'s.  I also tried the implementation proposed in #12751 and benchmarked with:
$ ./python -m timeit -s 's = "\uD800\uD8000\uDFFF\uDFFF\uDFFF"*1000' 's.islower()'
and got "1000 loops, best of 3: 345 usec per loop" with both implementations, so I kept the old version because I think it's more readable.
Finally, I rewrote the comment about the macro, adding a note about its side effects.
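
For readers without the diff at hand, here is a rough Python sketch of the logic the surrogate checks and a NEXT()-style step are built on. It is not the code from the patch: the names is_high_surrogate, is_low_surrogate and iter_code_points are made up for illustration, and the input is assumed to be a list of UTF-16 code units, as a narrow build stores them. The side effect of such a step is presumably that it moves the read position forward by one or two units; the sketch makes that explicit with an index.

```python
# Illustrative sketch only; these helper names are hypothetical and do not
# come from the patch attached to the issue.

def is_high_surrogate(unit):
    """True if the UTF-16 code unit is a high (lead) surrogate."""
    return 0xD800 <= unit <= 0xDBFF


def is_low_surrogate(unit):
    """True if the UTF-16 code unit is a low (trail) surrogate."""
    return 0xDC00 <= unit <= 0xDFFF


def iter_code_points(units):
    """Yield code points from a sequence of UTF-16 code units.

    A high surrogate followed by a low surrogate is joined into one
    non-BMP code point. Any other unit, including a lone surrogate,
    is yielded unchanged.
    """
    i = 0
    n = len(units)
    while i < n:
        unit = units[i]
        i += 1  # the read position always advances by at least one unit
        if is_high_surrogate(unit) and i < n and is_low_surrogate(units[i]):
            # Valid pair: combine both halves and consume the second unit too.
            yield 0x10000 + ((unit - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        else:
            yield unit


if __name__ == "__main__":
    # U+1D552 is stored as the pair D835 DD52 on a narrow build.
    print([hex(cp) for cp in iter_code_points([0x61, 0xD835, 0xDD52])])
```

A method such as islower() can then look at each yielded code point instead of each code unit, which is the general idea behind making the str methods work with non-BMP characters on narrow builds.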

----------
stage: patch review -> commit review
versions: +Python 2.7, Python 3.3
Added file: http://bugs.python.org/file22947/issue9200-2.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9200>
_______________________________________

