Glyphs and graphemes [was Re: Cult-like behaviour]

Antoon Pardon antoon.pardon at vub.be
Wed Jul 18 03:31:30 EDT 2018


On 17-07-18 14:22, Marko Rauhamaa wrote:
> Antoon Pardon <antoon.pardon at vub.be>:
>
>> On 17-07-18 10:27, Marko Rauhamaa wrote:
>>> Also, Python2's strings do as good a job at delivering codepoints as
>>> Python3.
>> No they don't. The programs that I work on, need to be able to treat
>> at least german, french, dutch and english text. My experience is that
>> in python3 it is way easier to do things right. Especially if you are
>> working with regular expressions.
> If you assume that NFC normalizes every letter to a single codepoint
> (and carefully use NFC everywhere), you are right. But equally likely
> you may inadvertently be setting yourself up for a surprise.

You are moving the goalposts. I didn't claim there were no surprises. I only claimed
that combining regular expressions with text in multiple languages ended up being
far easier with python3 strings than with python2 strings.

Sure, there were some surprises and gotchas, but they were easier to deal with than
in python2, and the result was still better than what I got doing it in python2.
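
To make that concrete, here is a minimal sketch of the kind of thing I mean. It is
not code from my actual programs: the sample words are made up, and it assumes
nothing beyond the standard re and unicodedata modules in python3. It shows \w
handling accented letters out of the box, and the decomposed-form gotcha you are
pointing at.

```python
import re
import unicodedata

# Hypothetical sample text (German, French, Dutch, English), written with
# escapes so the literal is guaranteed to hold precomposed code points.
text = "Stra\u00dfe caf\u00e9 \u00e9\u00e9n na\u00efve plain"

# In Python 3, \w is Unicode-aware for str patterns, so accented letters
# count as word characters without any extra flags.
words = re.findall(r"\w+", text)

# The gotcha: in NFD each "\u00e9" becomes "e" + U+0301 (combining acute
# accent). The combining mark is not matched by \w, so the word falls apart.
decomposed = unicodedata.normalize("NFD", "\u00e9\u00e9n")
split_words = re.findall(r"\w+", decomposed)

# Normalizing to NFC first recombines each pair into a single code point.
recomposed = re.findall(r"\w+", unicodedata.normalize("NFC", decomposed))

print(words)        # the five words, accents intact
print(split_words)  # the Dutch word broken into pieces
print(recomposed)   # the Dutch word whole again
```

So yes, you have to remember to normalize, and NFC will not help for letters that
have no precomposed form. But that is one known step at the input boundary, which
I find far easier to live with than what I had to do in python2.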

-- 
Antoon.



