ICU wrapper for Python?

Fredrik Juhlin laz at strakt.com
Tue Mar 5 04:23:09 EST 2002


On Fri, Mar 01, 2002 at 10:00:43PM +0100, Martin v. Loewis wrote:
> Fredrik Juhlin <laz at strakt.com> writes:
> 
> > I was fiddling around with a Python wrapper for the ICU libraries this
> > weekend, when it occurred to me to check if anyone else is/has been working
> > on the same thing. I looked around a bit on the net but couldn't find
> > anything, so I figured I'd drop a question here too :)
> 
> Did you make any progress on that? I'd be most interested in exposing
> the codecs.
Yes, I did, actually. My main interest was in collation, so that's what
I've done so far. Now I'm at the point where I need to write some docs and
such.

However, I'm relying on the fact that Python uses UCS-2 and ICU uses UTF-16
as their internal formats, so any Python unicode string can be used directly
as an ICU unicode string. That means I don't need any conversion between the
two for collation. To expose the codecs, though, one would have to convert
the resulting strings from UTF-16 to UCS-2, since ICU may hand back
surrogate pairs.
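
To make the UCS-2/UTF-16 point concrete, here is a rough sketch in
present-day Python 3 (not the wrapper code itself; the helper names
utf16_length and fits_in_ucs2 are made up for illustration). It shows that
a string restricted to the Basic Multilingual Plane takes exactly one
16-bit unit per character, while anything above U+FFFF needs a surrogate
pair and so has no UCS-2 form:

```python
def utf16_length(text: str) -> int:
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every code point lies in the Basic Multilingual Plane."""
    return all(ord(ch) <= 0xFFFF for ch in text)


bmp_text = "r\u00e4ksm\u00f6rg\u00e5s"   # BMP-only sample string
astral_text = "\U0001D11E"               # MUSICAL SYMBOL G CLEF, above U+FFFF

# BMP-only text: one code unit per character, so UCS-2 and UTF-16 agree.
print(fits_in_ucs2(bmp_text), utf16_length(bmp_text) == len(bmp_text))

# Non-BMP text: needs a surrogate pair, which UCS-2 cannot represent.
print(fits_in_ucs2(astral_text), utf16_length(astral_text))
```

As long as fits_in_ucs2 holds for the data, the two formats are
interchangeable unit for unit, which is the assumption the collation code
leans on.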

I have functions to convert between Python unicode strings and their ICU
equivalents, and while they do work, they still need to go through the
"make it right" and "make it fast" stages.
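
For anyone wondering what such a conversion amounts to, the following is
only an illustration in present-day Python 3 using the standard library,
not the C-level functions mentioned above. The names to_utf16_units and
from_utf16_units are hypothetical, and the sketch assumes that the array
type code "H" is 16 bits wide on the platform (it checks this):

```python
import sys
from array import array

# Pick the UTF-16 codec matching the machine's byte order, so that each
# array element holds one native-endian 16-bit code unit.
_CODEC = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"


def to_utf16_units(text: str) -> array:
    """Return the UTF-16 code units of text as an array of unsigned shorts."""
    units = array("H")
    assert units.itemsize == 2, "sketch assumes 16-bit unsigned short"
    units.frombytes(text.encode(_CODEC))
    return units


def from_utf16_units(units: array) -> str:
    """Rebuild a Python string from an array of UTF-16 code units."""
    return units.tobytes().decode(_CODEC)


sample = "a\U0001D11E"                 # 'a' followed by a non-BMP character
units = to_utf16_units(sample)
print([hex(u) for u in units])         # 'a' plus a high/low surrogate pair
print(from_utf16_units(units) == sample)
```

A buffer of 16-bit units like this is the shape of data a UTF-16 based
library works with; the round trip back to a string is where surrogate
pairs get recombined.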

Also, so far I've only tested this on my own (x86 Linux) system.

If you're interested in what I have so far, I'll stick it on a web server
for downloading. That'll have to wait until tonight though, since my
latest version is at home.

Any feedback would certainly be welcome!

//FJ



