Glyphs and graphemes [was Re: Cult-like behaviour]

Marko Rauhamaa marko at pacujo.net
Wed Jul 18 04:07:54 EDT 2018


Antoon Pardon <antoon.pardon at vub.be>:

> On 17-07-18 14:22, Marko Rauhamaa wrote:
>> If you assume that NFC normalizes every letter to a single codepoint
>> (and carefully use NFC everywhere), you are right. But equally likely
>> you may inadvertently be setting yourself up for a surprise.
>
> You are moving the goal post. I didn't claim there were no surprises.
> I only claim that in the end combining regular expressions and working
> with multiple languages ended up being far easier with python3 strings
> than with python2 strings.

Fair enough.

> Sure there were some surprises or gotcha's, but the result was still
> better than doing it in python2 and they were easier to deal with than
> in python2.

BTW, for those needs, even Python 2 has Unicode strings and the
unicodedata module at your disposal.
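
To make the NFC point above concrete, here is a minimal sketch using
only unicodedata.normalize from the standard library. The helper name
nfc_codepoints and the two sample strings are my own illustration, not
anything from the earlier messages. The u"" prefix is there so the same
source should run on Python 2 and on Python 3.3 or later.

```python
import unicodedata

def nfc_codepoints(text):
    """Return the NFC form of text together with its code point count."""
    normalized = unicodedata.normalize("NFC", text)
    return normalized, len(normalized)

# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9),
# so NFC folds the pair into one code point.
composed, n_composed = nfc_codepoints(u"e\u0301")

# "q" + COMBINING DOT BELOW has no precomposed form in Unicode,
# so NFC leaves the base letter and the combining mark separate.
uncomposed, n_uncomposed = nfc_codepoints(u"q\u0323")

print(n_composed, n_uncomposed)
```

The second string is still one letter to a reader, yet it takes two
code points even after NFC, so a regular expression "." or a len()
call will not treat it as a single unit.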


Marko


