Grapheme clusters, a.k.a. real characters

Mikhail V mikhailwas at gmail.com
Sun Jul 16 22:28:00 EDT 2017


 ChrisA wrote:

>On Sun, Jul 16, 2017 at 2:33 PM, Rustom Mody <rustompmody at gmail.com> wrote:
>> Right now in an adjacent mailing list (debian) I see someone signed off with a
>>
>> grüß
>>
>> I guess the third character is a u with some ‘dirt’
>> Whats the fourth?

>It's a "sharp S".

Or "Eszett". It is a merger of two glyphs that were used in old German
texts: an "f"-like glyph and an "s" glyph, i.e. a sort of ligature.
Or simply, ß is a symbol that looks quite similar to "B".

I would just write: gruss
since it is simpler to type and has a cleaner look.
"ß" is sort of deprecated and often substituted with "ss". If I am not mistaken,
this substitution is officially allowed in many regions (how liberal!).
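
For what it is worth, Python's own string methods already perform the
"ß" to "ss" substitution in some operations. A minimal sketch using only
built-in str methods; the variable names are just for illustration:

```python
# Built-in str methods only; the variable names are illustrative.
word = "gr\u00fc\u00df"      # "grüß", written with escapes to pin the code points

# Case folding (meant for caseless comparison) expands "ß" to "ss".
folded = word.casefold()

# Uppercasing also expands "ß", so the result is one character longer.
shouted = word.upper()

print(folded)    # grüss
print(shouted)   # GRÜSS
```

Note that neither call touches the "ü"; getting all the way to plain
"gruss" would need a separate transliteration step, which is not shown here.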

>>Heck even in the English that I learnt in school we had
>>ægis, homœopath etc

Similar to the above: historical symbols. These are (or should be)
deprecated due to legibility issues, roughly speaking. OTOH they are
good for freaking people out.
Like: "I was in Ægypt", and the reader goes: aaaeeeegypt


 ChrisA wrote:
>Tell me, is "å" an a with some 'dirt', or is it a separate character?

From the way you are asking, it seems that you are planning some tricky
business again... I hope not to argue about terminology again. å simply
makes the text flow inconsistent; such things are parasitic for
readability, regardless of whether someone proclaims it a separate
character or not. In a reader-oriented medium it should be used only as
a last resort.

It looks like an "a" with a circle above, so yes, an "a" with a good deal of dirt.
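
As far as Unicode is concerned, both readings are available: "å" exists as
a single code point and also as "a" followed by a combining mark. A small
sketch with the standard unicodedata module (variable names are
illustrative only):

```python
import unicodedata

precomposed = "\u00e5"   # "å" as a single code point
decomposed = unicodedata.normalize("NFD", precomposed)

# NFD splits it into the base letter plus a combining mark.
names = [unicodedata.name(ch) for ch in decomposed]
print(names)
# ['LATIN SMALL LETTER A', 'COMBINING RING ABOVE']

# The two spellings differ as code point sequences...
print(precomposed == decomposed)                                # False
# ...but normalizing back to NFC recovers the single code point.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```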

>Is "i" an ı with some dirt, or a separate letter? Oh wait, you
>probably think that "i" is a letter, and "ı" is the same letter but
>with some dirt missing.

"i" is a letter; you can't just remove the dot. So there is plain dirt,
and there is 'dirt' which is in fact a natural part of the letter, like
a serif, for example.
I am not expecting you to accept these statements;
I am just telling what follows from my long experience with the topic.
Still, you can try to replace "i" with "ı" globally in a text, and
chances are you will notice something. Then you can try the same with å.
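
The replacement experiment is easy to try in Python. A rough sketch; the
sample sentence is made up, and the last two checks only show how Python's
default, locale-independent string handling sees the two letters:

```python
import unicodedata

sample = "this is a minimal piece of text"   # made-up sample string

# Global replacement of "i" (U+0069) with dotless "ı" (U+0131).
dotless = sample.replace("i", "\u0131")
print(dotless)

# "i" has no canonical decomposition: Unicode does not treat its dot
# as a separate combining mark.
print(repr(unicodedata.decomposition("i")))   # ''

# The default case mappings send both "i" and "ı" to "I", so the
# distinction is lost after upper().
print("i".upper() == "\u0131".upper() == "I")   # True
```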


>What about "p"? Is that just "d" written the
>wrong way up?

Sort of. The early designers did not find a better solution than taking
a rotated version of an existing glyph. Are you curious about all the
other letters? Then you should probably start trying to design a legible
typeface. Ideally you should try to design one from scratch, say some
20 glyphs: not just a Latin-based variation, but truly from scratch.
Then some questions should become clearer; words are too weak for
conveying these kinds of things.

>At what point does something merit being called a
>different letter?

To count as truly different, the structural difference must be
significant, i.e. much more significant than the difference
between "ı" and "i". Yes, in Turkish both are used.
And what can I say, it's unfortunate for the users: suboptimal for
legibility, plus non-ASCII typing.
But it could be much worse; look at Vietnamese writing.



Mikhail


