case-insensitive and internationalized sort

Fredrik Lundh fredrik at pythonware.com
Thu Dec 19 16:49:58 EST 2002


Martin v. Löwis wrote:

> Define "correctly". Different languages sort accented characters in
> different ways: either put them after all other letters (Swedish, I
> believe

not quite: the swedish alphabet has 29 characters: a-z is followed
by åäö, which are sorted as separate characters, not accented a's
and o's.

"ü" is usually sorted as if it was a "y".

"é" and other accented characters are less common, but are usually
sorted as if they didn't have an accent.
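
a minimal sketch of the three rules above as a python sort key. the
names ALPHABET, RANK and swedish_key are made up for this example, and
the handling of characters outside the alphabet is an assumption; this
is not a complete collation algorithm.

```python
import unicodedata

# swedish alphabet: a-z followed by å, ä, ö as separate letters
ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def swedish_key(word):
    """hypothetical sort key; not a full collation algorithm."""
    key = []
    # NFC so that "a" + combining ring is seen as a single "å"
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch == "ü":
            ch = "y"  # "ü" sorts as if it was a "y"
        elif ch not in RANK:
            # drop the accent: "é" -> "e" (first code point of the NFD form)
            ch = unicodedata.normalize("NFD", ch)[0]
        # assumption: anything still unknown goes after the alphabet,
        # in code point order
        key.append((RANK.get(ch, len(ALPHABET)), ch))
    return key

words = ["öl", "zebra", "ål", "apa", "ärta"]
print(sorted(words))                   # plain code point order
print(sorted(words, key=swedish_key))  # a-z first, then å, ä, ö
```

the key lowercases its input, so the sort is also case-insensitive.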

names are sometimes sorted based on pronunciation, not strict
alphabetical order (e.g. carlson and karlsson are two spellings of
the same name, lundh is sorted before lundgren, etc).

etc.

there is no such thing as a single correct sort order.

</F>




