Python Unicode handling wins again -- mostly

Ned Batchelder ned at nedbatchelder.com
Sat Nov 30 18:07:36 EST 2013


On 11/30/13 5:37 PM, Gregory Ewing wrote:
> wxjmfauth at gmail.com wrote:
>> And do you know the origin of this typographical feature?
>> Because, mechanically, the dot of the "i" broke too often.
>>
>> In my opinion, a very plausible explanation.
>
> It doesn't sound very plausible to me, because there
> are a lot more stand-alone 'i's in English text than
> there are ones following an f. What is there to stop
> them from breaking?
>
> It's more likely to be simply a kerning issue. You
> want to get the stems of the f and the i close together,
> and the only practical way to do that with mechanical
> type is to merge them into one piece of metal.
>
> Which makes it even sillier to have an 'ffi' character
> in this day and age, when you can simply space the
> characters so that they overlap.
>

The fi ligature was created because visually, an f and i wouldn't work
well together: the crossbar of the f was near, but not connected to, the
serif of the i, and the terminal bulb of the f was close to, but not
coincident with, the dot of the i.

This article goes into great detail, and has a good illustration of how 
an f and i can clash, and how an fi ligature can fix the problem: 
http://opentype.info/blog/2012/11/20/whats-a-ligature/ . Note the second 
fi illustration, which demonstrates using a ligature to make the letters 
appear *less* connected than they would individually!

This is also why "simply spacing the characters" isn't a solution: a 
specially designed ligature looks better than a separate f and i, no 
matter how minutely kerned.

It's unfortunate that Unicode includes presentation alternatives like
the fi (and ff, fl, ffi, and ffl) ligatures.  They were included so that
Unicode would be a superset of existing encodings.
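
As a rough illustration of that compatibility status, here is a minimal
Python 3 sketch using the standard unicodedata module.  The helper name
expand_ligatures is made up for this example.  Note that NFKC
normalization folds every compatibility character, not just ligatures,
so it is a blunt tool if all you want is to expand the five f-ligatures.

```python
import unicodedata

# The five Latin f-ligatures, U+FB00 through U+FB04.
LIGATURES = "\ufb00\ufb01\ufb02\ufb03\ufb04"


def expand_ligatures(text):
    """Hypothetical helper: apply NFKC compatibility normalization.

    This replaces the ligature code points with plain letters, but it
    also rewrites other compatibility characters (for example,
    full-width forms), which may or may not be what you want.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    for ch in LIGATURES:
        # Show each ligature's code point, official name, and expansion.
        print("U+%04X %s -> %r" % (ord(ch), unicodedata.name(ch), expand_ligatures(ch)))
```

With this, a string such as "o\ufb03ce" compares equal to "office"
after expansion, which is usually what you want for searching and
comparison, even though the ligature glyph may look better on the page.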

Many typefaces have other non-encoded ligatures as well, especially
display faces, which also have alternate glyphs.  Unicode is a funny mix
in that it includes some forms of alternates, but can't include all of
them.  So we have to put up with both an ad-hoc Unicode that includes
some presentational variants, and also some other way to specify the
variants it leaves out.

--Ned.



