[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation
Tom Christiansen
report at bugs.python.org
Sun Aug 28 23:01:50 CEST 2011
Tom Christiansen <tchrist at perl.com> added the comment:
Antoine Pitrou <report at bugs.python.org> wrote on Sat, 27 Aug 2011 20:04:56 -0000:
>> Neither am I. Even in "old-style" English with ae and oe, one wrote
>> ÆGYPT and ÆSIR all caps but Ægypt and Æsir in titlecase, not *Aegypt or
>> *Aesir. Similarly with ŒNOLOGY / Œnology / œnology, never *Oenology.
> Trying to disprove you a bit:
> http://ecx.images-amazon.com/images/I/51G6CH9XFFL._SL500_AA300_.jpg
> http://ecx.images-amazon.com/images/I/51k7TmosPdL._SL500_AA300_.jpg
> http://ecx.images-amazon.com/images/I/518UzMeLFCL._SL500_AA300_.jpg
> but classical typographies seem to write either the uppercase Œ or the
> lowercase œ.
That's what I meant: one only ever sees œufs or ŒUFS, never OEUFS.
French doesn't fit into ISO 8859-1, which lacks œ, Œ, and Ÿ. Adding them
is one of the changes in ISO-8859-15 compared with ISO-8859-1 (whose byte
values coincide with Unicode code points):
    iso-8859-1   A4 ⇔ U+00A4 < ¤ > \N{CURRENCY SIGN}
    iso-8859-15  A4 ⇒ U+20AC < € > \N{EURO SIGN}
    iso-8859-1   A6 ⇔ U+00A6 < ¦ > \N{BROKEN BAR}
    iso-8859-15  A6 ⇒ U+0160 < Š > \N{LATIN CAPITAL LETTER S WITH CARON}
    iso-8859-1   A8 ⇔ U+00A8 < ¨ > \N{DIAERESIS}
    iso-8859-15  A8 ⇒ U+0161 < š > \N{LATIN SMALL LETTER S WITH CARON}
    iso-8859-1   B4 ⇔ U+00B4 < ´ > \N{ACUTE ACCENT}
    iso-8859-15  B4 ⇒ U+017D < Ž > \N{LATIN CAPITAL LETTER Z WITH CARON}
    iso-8859-1   B8 ⇔ U+00B8 < ¸ > \N{CEDILLA}
    iso-8859-15  B8 ⇒ U+017E < ž > \N{LATIN SMALL LETTER Z WITH CARON}
    iso-8859-1   BC ⇔ U+00BC < ¼ > \N{VULGAR FRACTION ONE QUARTER}
    iso-8859-15  BC ⇒ U+0152 < Œ > \N{LATIN CAPITAL LIGATURE OE}
    iso-8859-1   BD ⇔ U+00BD < ½ > \N{VULGAR FRACTION ONE HALF}
    iso-8859-15  BD ⇒ U+0153 < œ > \N{LATIN SMALL LIGATURE OE}
    iso-8859-1   BE ⇔ U+00BE < ¾ > \N{VULGAR FRACTION THREE QUARTERS}
    iso-8859-15  BE ⇒ U+0178 < Ÿ > \N{LATIN CAPITAL LETTER Y WITH DIAERESIS}
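
The table above can be reproduced from Python's own codecs. What follows
is only a minimal sketch: the helper name charset_differences is made up
for illustration, and it assumes the stock "iso-8859-1" and "iso-8859-15"
codecs that ship with CPython.

```python
import unicodedata

def charset_differences(a="iso-8859-1", b="iso-8859-15"):
    """Return (byte, char_in_a, char_in_b) for each byte the two codecs decode differently.

    Hypothetical helper, not part of any library. Only the upper half
    (0xA0..0xFF) is scanned, since that is where the two charsets diverge.
    """
    diffs = []
    for byte in range(0xA0, 0x100):
        raw = bytes([byte])
        ch_a = raw.decode(a)
        ch_b = raw.decode(b)
        if ch_a != ch_b:
            diffs.append((byte, ch_a, ch_b))
    return diffs

# Print one line per differing byte, in roughly the layout used above.
for byte, ch_a, ch_b in charset_differences():
    print(f"{byte:02X}  U+{ord(ch_a):04X} {unicodedata.name(ch_a)}"
          f"  =>  U+{ord(ch_b):04X} {unicodedata.name(ch_b)}")
```

Going the other way, "œufs".encode("iso-8859-15") succeeds, whereas
"œufs".encode("iso-8859-1") raises UnicodeEncodeError.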
> That said, I wonder why Unicode even includes ligatures like ff. Sounds
> like mission creep to me (and horrible annoyances for people like us).
I'm pretty sure that typographic ligatures are there for roundtripping
with legacy encodings. I believe that œ and Œ are the only code points
with LIGATURE in their names that you're "supposed" to still use, and
that all others should be figured out by modern fonting software.
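
One way to see that distinction in the Unicode data itself is to look at
compatibility decompositions. This is just a sketch: the helper name
is_compat_ligature is invented here, and the .upper() result for ff
assumes Python 3.3 or later, where str uses the full casemaps.

```python
import unicodedata

def is_compat_ligature(ch):
    """True if ch has a <compat> decomposition, i.e. NFKC replaces it with plain letters.

    Hypothetical helper for illustration only.
    """
    return unicodedata.decomposition(ch).startswith("<compat>")

FF = "\N{LATIN SMALL LIGATURE FF}"   # typographic ligature
OE = "\N{LATIN SMALL LIGATURE OE}"   # a letter in its own right

# The typographic ligature folds away under NFKC; the oe ligature does not.
print(is_compat_ligature(FF), unicodedata.normalize("NFKC", FF))
print(is_compat_ligature(OE), unicodedata.normalize("NFKC", OE))

# Casing: oe maps one-to-one, while ff needs a full (multi-character) casemap.
print((OE + "ufs").upper())
print(FF.upper())   # two characters on Python 3.3+
```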
--tom
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12736>
_______________________________________