Sorting a list of Unicode strings?

thebjorn BjornSteinarFjeldPettersen at gmail.com
Sun Aug 19 21:47:05 EDT 2007


On Aug 19, 8:09 pm, al... at mac.com (Alex Martelli) wrote:
[...]
> In both Swedish and Danish, I believe, A-with-ring sorts AFTER the
> letter Z in the alphabet; so, having Aaland (where I'm using Aa for
> A-with-ring, since this newsreader has some problem in letting me enter
> non-ascii characters;-) sort "right at the bottom", while it "doesn't
> look right" to YOU (maybe an English-speaker?) may look right to the
> inhabitants of that locality (be they Danes or Swedes -- but I believe
> Norwegian may also work similarly in terms of sorting).

You're absolutely correct: the Norwegian and Danish alphabets end
with ..xyzæøå, while the Swedish alphabet ends with ..xyzåäö, and sort
order follows placement. Indeed, my first reaction to the OP was:
where else would Åland be but at the end? One perhaps interesting
tidbit is that Åland "belongs" to Finland (it's an autonomous,
demilitarized, monolingually Swedish-speaking administrative province
of Finland). The Finnish alphabet is identical to the Swedish
alphabet, including sort order (at least in this case).
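
To make "sort order follows placement" concrete, here is a minimal
sketch that sorts by position in an explicitly spelled-out alphabet.
The names make_sort_key, SWEDISH and NORWEGIAN_DANISH are made up for
this illustration, and the sketch ignores everything a real collation
has to handle (digits, punctuation, aa-as-å, and so on). Where the
right locale is installed, locale.strxfrm is likely the better tool.

```python
import unicodedata

# Alphabets as described above; the constant names are hypothetical.
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"
NORWEGIAN_DANISH = "abcdefghijklmnopqrstuvwxyzæøå"


def make_sort_key(alphabet):
    """Return a key function that ranks letters by their place in `alphabet`."""
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        # NFC makes sure Å is one code point, not A + combining ring.
        word = unicodedata.normalize("NFC", word).lower()
        # Characters outside the alphabet sort after every known letter,
        # with the character itself as a tie-breaker.
        return [(rank.get(ch, len(alphabet)), ch) for ch in word]

    return key


swedish_places = ["Åland", "Zimbabwe", "Österrike", "Albania", "Ärla"]
swedish_sorted = sorted(swedish_places, key=make_sort_key(SWEDISH))

norwegian_places = ["Ålesund", "Zanzibar", "Ørland", "Ærø", "Bergen"]
norwegian_sorted = sorted(norwegian_places, key=make_sort_key(NORWEGIAN_DANISH))

print(swedish_sorted)
print(norwegian_sorted)
```

With the Swedish alphabet Åland lands right after Zimbabwe, while with
the Norwegian/Danish alphabet Ålesund comes last of all.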

For the ASCII-speakers out there, the key point to remember is that
the letter Å (pronounced like the au in British autumn) is not an
ASCII A with a ring on top. The ring-on-top is an intrinsic part of
the letter, in the same way the tail on the letter Q isn't a
decoration of the letter O.
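
A small sketch of what goes wrong when code treats the ring as a
decoration. The helper strip_marks is hypothetical; it uses the common
"decompose and drop combining marks" trick, which turns Å into a plain
A and so files Åland among the A's.

```python
import unicodedata


def strip_marks(s):
    """Hypothetical helper: drop combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


places = ["Zimbabwe", "Åland", "Albania", "Andorra"]

# Treating Å as "A with a ring": Åland is compared as "Aland".
folded = sorted(places, key=strip_marks)

# Plain code point order: Å (U+00C5) happens to come after Z here.
by_code_point = sorted(places)

print(folded)
print(by_code_point)
```

Note that code point order only gets this list right by accident: ä
(U+00E4) has a lower code point than å (U+00E5), so it would not
reproduce the Swedish å, ä, ö order in general.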

-- bjorn



