Sorting strings containing special characters (german 'Umlaute')

Robin Becker robin at reportlab.com
Fri Mar 2 09:20:52 EST 2007


DierkErdmann at mail.com wrote:
> Hi !
> 
> I know that this topic has been discussed in the past, but I could not
> find a working solution for my problem: sorting (lists of) strings
> containing special characters like "ä", "ü",... (German umlauts).
> Consider the following list:
> l = ["Aber", "Beere", "Ärger"]
> 
> For sorting, the letter "Ä" is supposed to be treated like "Ae",
> therefore sorting this list should yield
> l = ["Aber", "Ärger", "Beere"]
> 
> I know about the module locale and its method strcoll(string1,
> string2), but currently this does not work correctly for me. Consider
>      >>> locale.strcoll("Ärger", "Beere")
>      1
> 
> Therefore "Ärger" is sorted after "Beere", which is not correct IMO.
> Can someone help?
> 
> Btw: I'm using WinXP (German) and
>      >>> locale.getdefaultlocale()
> prints
>    ('de_DE', 'cp1252')
> 
> TIA.
> 
>   Dierk
> 
We tried this approach in a JavaScript version and it seems to work: spell out
each umlaut as two letters before comparing. Apologies for the possibly rough
translation to Python.


# coding: cp1252
def _deSpell(a):
	# Decode the cp1252 byte string, then spell out each special letter
	u = a.decode('cp1252')
	return (u.replace(u'\u00C4', 'Ae').replace(u'\u00e4', 'ae')
		.replace(u'\u00D6', 'Oe').replace(u'\u00f6', 'oe')
		.replace(u'\u00DC', 'Ue').replace(u'\u00fc', 'ue')
		.replace(u'\u00C5', 'Ao').replace(u'\u00e5', 'ao'))

def deSort(a, b):
	# Python 2 cmp-style comparison on the spelled-out forms
	return cmp(_deSpell(a), _deSpell(b))

l = ["Aber", "Beere", "Ärger"]
l.sort(deSort)
print l
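
For readers on Python 3, where cmp() and list.sort(cmpfunc) no longer exist, the same idea can be sketched with a sort key. This is only a rough adaptation, not the code from the JavaScript version: the names _DE_EXPANSIONS and de_sort_key are made up for illustration, and the input is assumed to be already-decoded str objects rather than cp1252 byte strings.

```python
# Hypothetical Python 3 variant of the expansion trick.
# Assumes the words are str (already decoded), not cp1252 bytes.
_DE_EXPANSIONS = str.maketrans({
    "\u00c4": "Ae", "\u00e4": "ae",   # Ä, ä
    "\u00d6": "Oe", "\u00f6": "oe",   # Ö, ö
    "\u00dc": "Ue", "\u00fc": "ue",   # Ü, ü
    "\u00c5": "Ao", "\u00e5": "ao",   # Å, å (same mapping as the post)
})

def de_sort_key(s):
    """Return s with each special letter spelled out as two letters."""
    return s.translate(_DE_EXPANSIONS)

words = ["Aber", "Beere", "\u00c4rger"]
result = sorted(words, key=de_sort_key)
print(result)  # ['Aber', 'Ärger', 'Beere']
```

A key function is computed once per element, so it should also be cheaper than a comparison function on long lists. Like the version above, the comparison stays case-sensitive.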


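On the locale.strcoll result in the question: one possible explanation is that a Python program starts in the "C" locale until locale.setlocale() is called, so strcoll compares raw character codes. A minimal sketch of the locale-based route follows, written for Python 3. The helper name sort_with_locale is hypothetical, locale names are platform-specific (something like "de_DE.UTF-8" on many Linux systems or "German_Germany.1252" on Windows, both assumptions here), and the exact ordering depends on the platform's collation tables.

```python
import locale

def sort_with_locale(words, locale_name=""):
    """Sort words by the collation rules of locale_name.

    locale_name is platform-specific and an assumption of this sketch;
    "" selects the user's default locale. If the locale cannot be set,
    fall back to plain code-point order.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return sorted(words)
    try:
        # strxfrm turns a string into a key that compares like strcoll
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore old setting
```

Calling sort_with_locale(["Aber", "Beere", "Ärger"], "de_DE.UTF-8") would be the intended use on a system where that locale is installed. Note that setlocale changes process-wide state, which is why the sketch restores the previous setting afterwards.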

-- 
Robin Becker



