Glyphs and graphemes [was Re: Cult-like behaviour]

Marko Rauhamaa marko at pacujo.net
Tue Jul 17 08:22:33 EDT 2018


Antoon Pardon <antoon.pardon at vub.be>:

> On 17-07-18 10:27, Marko Rauhamaa wrote:
>> Also, Python2's strings do as good a job at delivering codepoints as
>> Python3.
>
> No, they don't. The programs that I work on need to be able to handle
> at least German, French, Dutch and English text. My experience is that
> in Python 3 it is much easier to do things right, especially if you
> are working with regular expressions.

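If you assume that NFC normalizes every letter to a single codepoint
(and carefully use NFC everywhere), you are right. But that assumption
does not hold in general: NFC composes a base letter and its combining
marks only where Unicode defines a precomposed character. So you may
just as easily be setting yourself up for a surprise.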
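
A minimal sketch of the kind of surprise I mean, assuming Python 3 and
only the standard library. The helper name nfc_len and the two sample
strings are made up for illustration. The Dutch stressed "j" (j plus a
combining acute accent) has no precomposed form in Unicode, so NFC
leaves it as two codepoints, and a regular expression that expects one
character per letter no longer matches:

```python
import re
import unicodedata


def nfc_len(text):
    """Return the number of codepoints in text after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))


# "e" + COMBINING ACUTE ACCENT: NFC composes this to U+00E9.
e_acute = "e\u0301"
# "j" + COMBINING ACUTE ACCENT: Unicode has no precomposed form,
# so NFC leaves the pair as it is.
j_acute = "j\u0301"

print(nfc_len(e_acute))
print(nfc_len(j_acute))

# "." matches exactly one codepoint, not one user-perceived letter.
one_char = re.compile(".")
print(bool(one_char.fullmatch(unicodedata.normalize("NFC", e_acute))))
print(bool(one_char.fullmatch(unicodedata.normalize("NFC", j_acute))))
```

The first letter comes out as one codepoint and matches; the second
stays at two and does not, even though both look like a single letter
on screen.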


Marko


