[Tutor] regexp

Peter Otten __peter__ at web.de
Mon Nov 7 11:48:55 CET 2011


Albert-Jan Roskam wrote:

> Nice solution indeed! Will it also work with accented characters? And how
> should one incorporate the collating sequence into the solution? By
> explicitly setting the locale? It might be nice if the outcome is always
> the same, wherever you are in the world.

This is probably easier to achieve with sorted() than with regular 
expressions:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, "")
'de_DE.UTF-8'
>>> words = [line.strip() for line in open("/usr/share/dict/ngerman")
...          if len(line) > 4]
>>> [w for w in words if "".join(sorted(w, key=locale.strxfrm)) == w]
['Abel', 'Abels', 'Abgott', 'Abort', 'Achim', 'Achims', 'Adel', 'Adelns', 
'Adels', 'Ader', 'Agio', 'Agios', 'Akku', 'Alls', 'Amor', 'Amors', 'BGHSt', 
'BIOS', 'Beet', 'Beginns', 'Behr', 'Behrs', 'Beil', 'Beils', 'Bein', 
'Beins', 'Bens', 'Benz', 'Beos', 'Bert', 'Bett', 'Betty', 'Bill', 'Bills', 
'Billy', 'Boot', 'Boss', 'Cello', 'Cellos', 'Cent', 'Chintz', 'Chip', 
'Chips', 'Chlor', 'Chlors', 'Chor', 'Chors', 'City', 'Clou', 'Cmos', 'Cruz', 
'Dekor', 'Dekors', 'Dell', 'Dells', 'Delors', 'Demo', 'Demos', 'Depp', 
'Depps', 'Dill', 'Dills', 'Egos', 'Film', 'Films', 'Filz', 'First', 'Flop', 
'Floß', 'Flöz', 'Forst', 'Gips', 'Gott', 'Hinz', 'Horst', 'Hort', 'Inst', 
'Klops', 'Klos', 'Klotz', 'Kloß', 'Knox', 'Kost', 'Löss', 'Moor', 'Moors', 
'Moos', 'Mopp', 'Mopps', 'Mops', 'Most', 'aalst', 'aalt', 'abbeißt', 'aber', 
'abesst', 'abfloss', 'abflosst', 'abhörst', 'abhört', 'ablöst', 'acht', 
'adeln', 'adelst', 'adelt', 'agil', 'ahmst', 'ahmt', 'ahnst', 'ahnt', 
'anno', 'beehrst', 'beehrt', 'beeilst', 'beeilt', 'beginn', 'beginnst', 
'beginnt', 'begoss', 'begosst', 'beim', 'beirrst', 'beirrt', 'beißt', 
'bellst', 'bellt', 'biss', 'bisst', 'bist', 'bloß', 'dehnst', 'dehnt', 
'dein', 'denn', 'dimm', 'dimmst', 'dimmt', 'dorrst', 'dorrt', 'dort', 
'dörrst', 'dörrt', 'döst', 'ehrst', 'ehrt', 'eilst', 'eilt', 'eins', 
'einst', 'eint', 'erst', 'esst', 'filmst', 'filmt', 'floss', 'flosst', 
'flott', 'flößt', 'foppst', 'foppt', 'fort', 'gilt', 'goss', 'gosst', 
'hisst', 'hopst', 'hörst', 'hört', 'irrst', 'irrt', 'isst', 'lost', 'löst', 
'Äffin', 'äffst', 'äfft']
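
Passing "" to setlocale() picks up whatever the user's environment says, so
the result can differ from machine to machine. To get the same outcome
everywhere, one option is to name the locale explicitly. Below is a minimal
sketch of that idea. The helper names is_sorted_word() and collation_key()
are made up for illustration, and the fallback to the "C" locale when
de_DE.UTF-8 is not installed is an assumption, not something from the
session above:

```python
import locale


def is_sorted_word(word, key=None):
    """Return True if the characters of `word` are already in sorted order.

    With key=None the characters are compared by code point; pass
    locale.strxfrm (after setting a locale) to use its collating sequence.
    """
    return "".join(sorted(word, key=key)) == word


def collation_key(locale_name="de_DE.UTF-8"):
    """Set LC_COLLATE to `locale_name` and return locale.strxfrm.

    Assumption: if the named locale is not installed, fall back to "C"
    instead of failing. Note that setlocale() changes process-wide state.
    """
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        locale.setlocale(locale.LC_COLLATE, "C")
    return locale.strxfrm


if __name__ == "__main__":
    key = collation_key()
    for word in ["beginnt", "regexp", "Flöz"]:
        print(word, is_sorted_word(word, key=key))
```

With key=None the comparison is by code point, so 'Flöz' is rejected because
'ö' comes after 'z' there. Under a German collating sequence it is accepted,
as the list above shows.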



