Grapheme clusters, a.k.a.real characters

Steve D'Aprano steve+python at pearwood.info
Thu Jul 13 20:35:41 EDT 2017


>From time to time, people discover that Python's string algorithms work on code
points rather than "real characters", which can lead to anomalies like the
following:

s = 'xäex'
s = unicodedata.normalize('NFD', s)
print(s)
print(s[::-1])


which results in:

xäex
xëax


If you're interested in this issue, there's an issue on the bug tracker about
it, which is seeing some activity.

http://bugs.python.org/issue30717



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.




More information about the Python-list mailing list