Grapheme clusters, a.k.a. real characters

Chris Angelico rosuav at gmail.com
Tue Jul 18 16:51:42 EDT 2017


On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V <mikhailwas at gmail.com> wrote:
> On 2017-07-18, Steve D'Aprano <steve+python at pearwood.info> wrote:
>
>> That's neither better nor worse than the system used by English and French,
>> where letters with diacritics are not distinct letters, but guides to
>> pronunciation.
>
>>_Neither system is right or wrong, or better than the other._
>
>
> If that is said just "not to hurt anybody" then it's ok.
> Though this statement is pretty absurd, not so many
> (intelligent) people will buy this out today.

Let me give you one concrete example: the letter "ö". In English, it
is (very occasionally) used to indicate diaeresis, where two adjacent
vowels are pronounced separately - for example, "coöperate". (You can
also hyphenate, "co-operate".) In German, it is the letter "o" with a
pronunciation mark (umlaut), and is considered the same letter as "o".
In Swedish, it is a distinct letter, alphabetized last (following z,
å, and ä, in that order). But in all these languages, it's represented
the exact same way.
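
To make "represented the exact same way" concrete, here is a minimal
sketch using only Python's standard unicodedata module. It assumes the
two usual Unicode spellings of the character: the single precomposed
code point, and a plain "o" followed by a combining diaeresis. The
variable names are purely illustrative.

```python
import unicodedata

# Two spellings of the same visible character.
precomposed = "\u00f6"   # one code point
decomposed = "o\u0308"   # "o" followed by a combining mark

# Unicode names the character without reference to any language.
print(unicodedata.name(precomposed))    # LATIN SMALL LETTER O WITH DIAERESIS
print(unicodedata.name(decomposed[1]))  # COMBINING DIAERESIS

# The two spellings differ as code point sequences...
print(len(precomposed), len(decomposed))  # 1 2
print(precomposed == decomposed)          # False

# ...but normalization maps each one onto the other.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True
```

Nothing in either spelling records whether the writer meant an English
diaeresis, a German umlaut, or a Swedish letter; that interpretation
lives outside the text.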

Steven is pointing out that there's nothing fundamentally wrong about
using "ö" as a unique letter, nor is there anything fundamentally
wrong about using it as "o" with a pronunciation mark. Which I agree
with.
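
As a rough illustration of both treatments working on the same text,
here is a sketch with two sort keys. Both helpers (swedish_key and
german_key) and the sample words are made up for this example, and
both are heavy simplifications: real collation rules are more involved,
and a locale-aware collation library would be the right tool in
practice.

```python
import unicodedata

# Swedish alphabet order: å, ä, ö are distinct letters after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"

def swedish_key(word):
    # Hypothetical helper: rank each letter by its position in the
    # Swedish alphabet. Raises ValueError for any other character.
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]

def german_key(word):
    # Hypothetical helper: decompose, then drop combining marks, so
    # that "ö" sorts together with "o".
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

words = ["\u00f6l", "ost", "zon"]

print(sorted(words, key=swedish_key))  # ['ost', 'zon', 'öl']
print(sorted(words, key=german_key))   # ['öl', 'ost', 'zon']
```

Same strings, same code points, two different but internally
consistent orderings.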

ChrisA


