Sorting a list of Unicode strings?

Steve Holden steve at holdenweb.com
Sun Aug 19 20:45:02 EDT 2007


Alex Martelli wrote:
> oliver at obeattie.com <oliver at obeattie.com> wrote:
>    ...
>>>> Maybe I'm missing something fundamental here, but if I have a list of
>>>> Unicode strings, and I want to sort these alphabetically, then it
>>>> places those that begin with unicode characters at the bottom.
>    ...
>> Anyway, I know _why_ it does this, but I really do need it to sort
>> them correctly based on how humans would look at it.
> 
> Depending on the nationality of those humans, you may need very
> different sorting criteria; indeed, in some countries, different sorting
> criteria apply to different use cases (such as sorting surnames versus
> sorting book titles, etc; sorry, I don't recall specific examples, but
> if you delve into sites about i18n issues you'll find some).
> 
Just one example from my own experience. When sorting names in Scotland
(and technically in the rest of the UK too, in deference to Scotland,
though this is often ignored), names beginning with "Mc" have to be
sorted /as though/ they began with "Mac". Since the two prefixes are
indistinguishable phonetically, it would otherwise mean twice as much
work to look up one of those names.
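
As a rough sketch of that rule in Python (the function name
scottish_sort_key and the sample names are made up for illustration,
and this handles only the "Mc" prefix, not accented characters or any
other locale-specific convention), one could pass a key function to
sorted() that rewrites a leading "Mc" as "Mac" before comparing:

```python
def scottish_sort_key(name):
    """Return a sort key that files a leading "Mc" as though it were "Mac".

    Hypothetical helper for illustration only; real directory rules
    may have further special cases.
    """
    # Case-fold so that "MacAlister" and "Macdonald" compare without
    # regard to capitalisation.
    folded = name.casefold()
    if folded.startswith("mc"):
        folded = "mac" + folded[2:]
    return folded


names = ["Macdonald", "McAdam", "Mabbott", "MacAlister", "Madden", "McBride"]

# Plain code-point ordering puts every "Mc" name after every "Ma" name.
plain_sorted = sorted(names)

# With the key function, "Mc" names interleave with the "Mac" names.
sorted_names = sorted(names, key=scottish_sort_key)

print(plain_sorted)
print(sorted_names)
```

The key function leaves the original strings untouched; only the
comparison sees the "Mac" spelling, so the list still displays each
name as its owner writes it.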

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden



