Unicode normalisation [was Re: [beginner] What's wrong?]

Chris Angelico rosuav at gmail.com
Fri Apr 8 13:50:16 EDT 2016


On Sat, Apr 9, 2016 at 3:44 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
> Unicode heroically and definitively solved the problems ASCII had posed
> but introduced a bag of new, trickier problems.
>
> (As for ligatures, I understand that there might be quite a bit of
> legacy software that dedicated code points and code pages for ligatures.
> Translating that legacy software to Unicode was made more
> straightforward by introducing analogous codepoints to Unicode. Unicode
> has quite many such codepoints: µ, K, Ω etc.)

More specifically, Unicode solved the problems that *codepages* had
posed. And one of the principles of its design was that every
character in every legacy encoding had a direct representation as a
Unicode codepoint, allowing bidirectional transcoding for
compatibility. Perhaps if Unicode had existed from the dawn of
computing, we'd have fewer characters; but backward compatibility is
way too important to let a narrow purity argument sway it.
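
To make that concrete, here's a rough sketch using nothing but the
standard library. My choice of CP437 as the legacy code page and the
helper name codepoints() are illustrative assumptions, not anything
from the discussion above. It round-trips every CP437 byte through
Unicode, then shows what normalisation does with three of the
compatibility codepoints mentioned (MICRO SIGN, KELVIN SIGN, OHM SIGN).

```python
import unicodedata

# Round trip through a legacy code page: each CP437 byte decodes to its
# own Unicode codepoint, and encoding gives back the original byte.
all_bytes = bytes(range(256))
round_tripped = all_bytes.decode("cp437").encode("cp437")
print("CP437 round trip intact:", round_tripped == all_bytes)

# Byte 0xE6 in CP437 is the micro sign. Unicode gives it a dedicated
# codepoint (U+00B5), distinct from GREEK SMALL LETTER MU (U+03BC).
micro = b"\xe6".decode("cp437")
print(f"U+{ord(micro):04X}", unicodedata.name(micro))


def codepoints(text):
    """Hypothetical helper: render a string as a list of U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Compare canonical (NFC) and compatibility (NFKC) normalisation.
samples = {
    "MICRO SIGN": "\u00b5",
    "KELVIN SIGN": "\u212a",
    "OHM SIGN": "\u2126",
}
normalised = {}
for label, ch in samples.items():
    nfc = unicodedata.normalize("NFC", ch)
    nfkc = unicodedata.normalize("NFKC", ch)
    normalised[label] = (nfc, nfkc)
    print(label, codepoints(ch), "NFC:", codepoints(nfc), "NFKC:", codepoints(nfkc))
```

The micro sign survives NFC and only folds to Greek mu under NFKC,
whereas the Kelvin and Ohm signs already fold to plain K and Greek
capital omega under NFC. So if you need to get a normalised string
back into the legacy encoding, you may have to pick your normalisation
form with some care.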

ChrisA


