Python Unicode handling wins again -- mostly

Mark Lawrence breamoreboy at yahoo.co.uk
Sat Nov 30 03:07:38 EST 2013


On 30/11/2013 02:08, Roy Smith wrote:
> In article <529934dc$0$29993$c3e8da3$5496439d at news.astraweb.com>,
>   Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
>
>> (8) What's the uppercase of "baffle" spelled with an ffl ligature?
>>
>> Like most other languages, Python 3.2 fails:
>>
> >> py> 'baﬄe'.upper()
> >> 'BAﬄE'
>>
>> but Python 3.3 passes:
>>
> >> py> 'baﬄe'.upper()
>> 'BAFFLE'
>
> I disagree.
>
> The whole idea of ligatures like fi is purely typographic.  The crossbar
> on the "f" (at least in some fonts) runs into the dot on the "i".
> Likewise, the top curl on an "f" runs into the serif on top of the "l"
> (and similarly for ffl).
>
> There is no such thing as a "FFL" ligature, because the upper case
> letterforms don't run into each other like the lower case ones do.
> Thus, I would argue that it's wrong to say that calling upper() on an
> ffl ligature should yield FFL.
>
> I would certainly expect x.lower() == x.upper().lower() to be True for
> all values of x over the set of valid Unicode code points.  Having
> u"\uFB04".upper() ==> "FFL" breaks that.  I would also expect len(x) ==
> len(x.upper()) to be True.
>

http://bugs.python.org/issue19819 talks about these beasties.  Please 
don't come back to me as I haven't got a clue!!!
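
For anyone who wants to poke at Roy's two expectations directly, here is a
minimal sketch, assuming Python 3.3 or later (where upper() applies the full
Unicode case mappings).  The helper names are my own, made up for
illustration; nothing here comes from the bug report.

```python
import sys

LIGATURE_FFL = "\uFB04"  # LATIN SMALL LIGATURE FFL


def roundtrip_holds(s):
    """Roy's first expectation: s.lower() == s.upper().lower()."""
    return s.lower() == s.upper().lower()


def length_preserved(s):
    """Roy's second expectation: len(s) == len(s.upper())."""
    return len(s) == len(s.upper())


def find_violations(check):
    """Return every code point whose one-character string fails `check`."""
    failures = []
    for cp in range(sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip lone surrogates; they are not scalar values
        if not check(chr(cp)):
            failures.append(cp)
    return failures


if __name__ == "__main__":
    # Show what upper() does to the ligature on this interpreter.
    print(repr(LIGATURE_FFL.upper()))
    print(roundtrip_holds(LIGATURE_FFL), length_preserved(LIGATURE_FFL))
    # Count how many single code points break each expectation.
    print(len(find_violations(roundtrip_holds)), "code points break the round trip")
    print(len(find_violations(length_preserved)), "code points change length")
```

The scan only covers single code points, and the counts it prints depend on
which Unicode version the interpreter was built against, so I have not quoted
any numbers.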

-- 
Python is the second best programming language in the world.
But the best has yet to be invented.  Christian Tismer

Mark Lawrence



