Grapheme clusters, a.k.a. real characters

Mikhail V mikhailwas at gmail.com
Mon Jul 17 18:01:29 EDT 2017


ChrisA wrote:

>Yep! Nobody would take any notice of the fact that you just put dots
>on all those letters. It's not like it's going to make any difference
>to anything. We're not dealing with matters of life and death here.

>Oh wait.

>https://www.theinquirer.net/inquirer/news/1017243/cellphone-localisation-glitch

>I'll leave you with that thought.


For Turkish and Slavic languages there is actually
a demand for at least one Yeru letter, to distinguish
the common i from Yeru. In Cyrillic it is "ы".
It should be romanized as "y".
And the Yot /j/ should be romanized as "j".
I.e. for Turkish:
yazım - should be: jazym
For Russian:
ярлык - should be: jarlyk
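
As a rough illustration of the two examples above, here is a minimal Python
sketch. The tables RUSSIAN and TURKISH and the function romanize are
hypothetical names made up for this post, and the tables cover only the
letters needed for these two words; they are not a complete romanization
scheme.

```python
# Hypothetical, deliberately partial tables: only the letters that occur
# in the two example words are mapped.
RUSSIAN = {"я": "ja", "р": "r", "л": "l", "ы": "y", "к": "k"}
TURKISH = {"y": "j", "ı": "y"}


def romanize(word, table):
    """Map each character through the table; unmapped characters pass through."""
    # A single pass over the input, so "y" -> "j" and "ı" -> "y" cannot
    # interfere with each other.
    return "".join(table.get(ch, ch) for ch in word)


russian = romanize("ярлык", RUSSIAN)
turkish = romanize("yazım", TURKISH)
print(russian, turkish)
print(russian.isascii(), turkish.isascii())
```

Both results use only ASCII letters, one letter per sound, with no
combining marks involved.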

Simple, ASCII input, no ambiguity.
How many exercises in futility could be avoided...

And just in case it is still not clear: this is not
solved by adding dirt around the letter. If the phoneme
distinction is significant enough, then one should add
a distinct letter to the writing system in question.

And not like: well, it is not so significant, so we'll
add a bit of dirt; it is more significant, so we add some more dirt.
That is not how a textual representation is made efficient.
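
To make the difference between "dirt" and a distinct letter concrete in
Unicode terms, here is a small sketch using the standard unicodedata
module. The helper name code_points is made up for this post. It shows
what canonical decomposition (NFD) does to a letter with a diacritic
compared with a letter that has its own code point.

```python
import unicodedata


def code_points(text):
    """Return the code points of text after canonical decomposition (NFD)."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", text)]


# A letter with a diacritic splits into base letter + combining mark.
print(code_points("ö"))
# Distinct letters stay as a single code point.
print(code_points("ы"))
print(code_points("ı"))
```

The first case becomes two code points (the base letter plus a combining
diaeresis), while "ы" and the Turkish dotless "ı" each remain one.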


Mikhail



More information about the Python-list mailing list