[I18n-sig] python implementation of unicode collation algorithm

James Tauber jtauber at jtauber.com
Tue Jan 24 14:18:03 CET 2006


On 23/01/2006, at 4:49 AM, M.-A. Lemburg wrote:

> James Tauber wrote:
>> I've made a start on a pure python implementation of the Unicode
>> Collation Algorithm (UTS #10) but I thought I'd best check with this
>> SIG whether such a thing already exists.
>
> Not that I'm aware of.
>
> Note that given the sizes of the collation tables, it's probably
> better to have them defined in a C module, rather than a Python
> data structure.

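Yes, this is certainly true of the DUCET (the Default Unicode Collation
Element Table), although for language-specific collation element tables,
which are typically much smaller, a Python data structure would be more
manageable.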

I'll probably start with a pure Python implementation and take it from
there (or let someone with more C extension experience optimize it).
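
As a rough sketch of what a small table held in a plain Python data
structure might look like: the names TOY_TABLE and sort_key and all of the
weights below are made up for illustration, not taken from the DUCET, and
the sketch ignores normalization, contractions, variable weighting and
implicit weights, all of which UTS #10 requires.

```python
# Hypothetical, hand-made collation element table: each character maps to a
# list of (primary, secondary, tertiary) weights. The weights are invented
# for illustration and are NOT taken from the DUCET.
TOY_TABLE = {
    "a": [(1, 1, 1)],
    "A": [(1, 1, 2)],       # same primary/secondary as "a", differs at tertiary
    "\u00e1": [(1, 2, 1)],  # a with acute: same primary as "a", differs at secondary
    "b": [(2, 1, 1)],
    "B": [(2, 1, 2)],
    "c": [(3, 1, 1)],
    "C": [(3, 1, 2)],
}


def sort_key(text, table=TOY_TABLE):
    """Build a three-level key: all primaries, then secondaries, then tertiaries."""
    elements = []
    for ch in text:
        # Characters missing from the table raise KeyError in this sketch;
        # a real implementation would derive implicit weights instead.
        elements.extend(table[ch])
    key = []
    for level in range(3):
        if level:
            key.append(0)  # level separator, lower than any real weight
        key.extend(e[level] for e in elements if e[level] != 0)
    return tuple(key)


words = ["b", "A", "\u00e1", "a", "ab", "B"]
print(sorted(words, key=sort_key))
```

Because the key is an ordinary tuple, it can be passed straight to sorted()
or list.sort(); differences in accent or case only matter when the base
letters are otherwise equal.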

James


