Sorting strings containing special characters (German 'Umlaute')

Jussi Salmela tiedon_jano at hotmail.com
Sun Mar 4 02:55:19 EST 2007


Robin Becker wrote:
> 
> Björn, in one of our projects we are sorting in javascript in several 
> languages English, German, Scandinavian languages, Japanese; from 
> somewhere (I cannot actually remember) we got this sort spelling 
> function for scandic languages
> 
> a
> .replace(/\u00C4/g,'A~') //A umlaut
> .replace(/\u00e4/g,'a~') //a umlaut
> .replace(/\u00D6/g,'O~') //O umlaut
> .replace(/\u00f6/g,'o~') //o umlaut
> .replace(/\u00DC/g,'U~') //U umlaut
> .replace(/\u00fc/g,'u~') //u umlaut
> .replace(/\u00C5/g,'A~~') //A ring
> .replace(/\u00e5/g,'a~~'); //a ring
> 
> does this actually make sense?

I think this order is not correct for Finnish, which is one of the 
Scandinavian languages. The Finnish alphabet in alphabetical order is:

	a-z, å, ä, ö

If I understand correctly, your replacements put the last three 
characters in the order

	ä, å, ö

which is wrong.
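
To make the difference concrete, here is a minimal sketch, not a tested 
production routine. The names quotedKey, finnishKey and sortBy are made up 
for this illustration. quotedKey repeats only the lowercase replacements 
from the quoted function. finnishKey is one possible fix, assuming the 
words contain nothing but a-z plus precomposed å, ä, ö: it maps those three 
letters to the ASCII characters that come directly after 'z'.

```javascript
// Lowercase subset of the quoted replacements (illustration only).
function quotedKey(s) {
  return s
    .replace(/\u00e4/g, 'a~')   // a umlaut
    .replace(/\u00f6/g, 'o~')   // o umlaut
    .replace(/\u00fc/g, 'u~')   // u umlaut
    .replace(/\u00e5/g, 'a~~'); // a ring
}

// Hypothetical alternative for Finnish: a-z, then å, ä, ö.
// Assumes input limited to a-z, å, ä, ö; case is folded away.
// '{', '|', '}' are the three ASCII characters right after 'z'.
function finnishKey(s) {
  return s
    .toLowerCase()
    .replace(/\u00e5/g, '{')  // a ring
    .replace(/\u00e4/g, '|')  // a umlaut
    .replace(/\u00f6/g, '}'); // o umlaut
}

// Sort a copy of words by comparing the keys produced by keyFn.
function sortBy(words, keyFn) {
  return words.slice().sort(function (x, y) {
    var kx = keyFn(x), ky = keyFn(y);
    return kx < ky ? -1 : kx > ky ? 1 : 0;
  });
}

// The quoted keys put a umlaut before a ring:
console.log(sortBy(['\u00f6', '\u00e5', '\u00e4'], quotedKey).join(', '));
// The alternative keys put a ring first, and all three after z:
console.log(sortBy(['\u00f6', '\u00e4', 'z', '\u00e5', 'a'], finnishKey).join(', '));
```

Where the runtime supports it, a locale-aware comparison such as 
localeCompare with the 'fi' locale is probably a better choice than 
hand-written replacements, but the result then depends on the locale data 
that runtime ships.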

HTH,
Jussi


