Grapheme clusters, a.k.a. real characters

Gregory Ewing greg.ewing at canterbury.ac.nz
Sat Jul 15 19:50:06 EDT 2017


Chris Angelico wrote:
> Hold on, let me just grab my MUD
> client, which is already using a fixed width font...
> 
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> 忘掉那 無形鎖
> الثلج لا يشعرني بإكتئاب
> הקור לא מפריע לי, לא חודר
> U+1680 is " "
> U+200B is ""
> U+180E is "᠎"
> 다 잊어 다 잊어
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

I suspect that different lines in that example are actually
being rendered in different fonts. Characters within the *same*
monospaced font should have the same width (otherwise it's not
really a monospaced font!), but there are no guarantees between
different fonts.

Perhaps the meta-problem here is that Unicode being so big has
made it impractical to have a single font that encompasses all
the characters you might ever want to render, so you often have
to make do with a hodgepodge of fonts that don't play well
together.
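
As a rough illustration of why those lines may end up in different
fonts, here is a minimal sketch that guesses which script each line
draws on. It assumes the first word of a character's Unicode name is a
usable script label, which is only a heuristic (the standard library
does not expose the real Script property), and the helper names
rough_script and scripts_in are made up for this example. It says
nothing about which font a given client actually picks.

```python
import unicodedata


def rough_script(ch):
    """Guess a script label from the first word of the Unicode name.

    Heuristic only: this is not the Unicode Script property.
    """
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Code points without a name (e.g. control characters).
        return "UNKNOWN"


def scripts_in(text):
    """Return the set of rough script labels for the letters in text."""
    return {rough_script(ch) for ch in text if ch.isalpha()}


if __name__ == "__main__":
    samples = [
        "忘掉那 無形鎖",
        "הקור לא מפריע לי",
        "다 잊어 다 잊어",
        "U+1680 is",
    ]
    for line in samples:
        # Each line uses a different script, so a renderer without one
        # all-encompassing font may well fall back to a different font
        # for each of them.
        print(sorted(scripts_in(line)), line)
```

If every line reports a different label, that is at least consistent
with the lines being drawn from different fonts, though only the
renderer itself can confirm it.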

-- 
Greg


