sorting slovak utf

Serge Orlov sombDELETE at pobox.ru
Wed Dec 10 00:36:10 EST 2003


"Stano Paska" <paska at kios.sk> wrote in message news:mailman.256.1070955895.16879.python-list at python.org...
> I had an imagination, that there is some easy way
> to work with slovak, russian, english and german text in one application.
Depends on what you mean by "work". Upcase? Split words? Sort? Spell check?
Translate? Display?

> I only change locale from sk_SK.utf-8 to ru_RU.utf-8, ... and system works.
> Input and output are in utf-8.
>
> Is this a fantasy?
If you mean sorting, yes. Python does not have handy functions to do that.
The good news is that the solution is only 10-15 lines away from you. You've
been given all the information in this thread. Let me summarize it:
1. Convert your input to unicode.
2. Use the locale named 'Slovak' (see my previous post for why).
3. Use the DSU (decorate-sort-undecorate) trick to sort the words. Here's
   the (untested) D part of it:
import locale

def decorate(seq, locale_encoding):
    # Pair each word with its strxfrm sort key, computed from the locale-encoded bytes.
    return [(locale.strxfrm(s.encode(locale_encoding, 'replace')), s)
            for s in seq]

It's not as scary as the name strxfrm implies.
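
For completeness, the whole decorate-sort-undecorate round trip might look
roughly like the sketch below. It is an illustration under assumptions, not
tested against a real Slovak locale: it assumes Python 3, where strings are
already unicode and locale.strxfrm accepts them directly (so the encode step
drops out), and the names try_setlocale and dsu_sort are made up for this
example. Which locale names exist depends on the platform.

```python
import locale

def try_setlocale(names):
    """Activate the first collation locale from `names` that this system has."""
    for name in names:
        try:
            locale.setlocale(locale.LC_COLLATE, name)
            return name
        except locale.Error:
            continue
    return None  # nothing matched; the current locale stays in effect

def dsu_sort(words, key=locale.strxfrm):
    """Sort words by key(word) using decorate-sort-undecorate."""
    # Decorate: (sort key, original position, word). The position breaks
    # ties, so words with equal keys keep their input order.
    decorated = [(key(w), i, w) for i, w in enumerate(words)]
    # Sort: plain tuple comparison now orders by the sort key.
    decorated.sort()
    # Undecorate: drop the keys and positions, keep the words.
    return [w for _, _, w in decorated]
```

You would call try_setlocale(['sk_SK.UTF-8', 'Slovak']) once, check that it
did not return None, and then pass your list of words to dsu_sort. The key
is a parameter only so the sorting logic can be tried without any particular
locale installed.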
-- Serge.







