unicode, C++, python 2.2

Trond Eivind Glomsrød teg at scali.com
Fri Sep 9 05:36:02 EDT 2005


I am currently writing a Python interface to a C++ library.  Some of the 
functions in this library take Unicode strings (mostly UTF-8) as arguments.

However, when reading these strings I run into a problem on Python 2.2 
(RHEL3): while the data is clean UCS4 in 2.3, in 2.2 it appears to be 
UTF-8 layered on top of UCS4.  That is, the UTF-8 bytes are stored one 
per UCS4 character, so three bytes of each UCS4 character are 0 and the 
first one holds a single byte of the string's UTF-8 encoding.
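
To make the symptom concrete, here is a minimal Python sketch that 
simulates what I seem to be getting. It does not touch the actual 
library; the sample string and variable names are made up for 
illustration. The Latin-1 round trip at the end is only a possible 
workaround, and it assumes every character in the mangled string has a 
value below 256.

```python
# Illustration only: simulates the described symptom, not the real library.
text = u"Glomsr\u00f8d"              # intended string, one code point per character

utf8_bytes = text.encode("utf-8")    # the o-slash (U+00F8) becomes two bytes: C3 B8

# Symptom: every UTF-8 byte is widened to its own code point, which is
# the same as decoding the UTF-8 bytes as Latin-1.
mangled = utf8_bytes.decode("latin-1")

# Possible workaround (assumes all code points in `mangled` are < 256):
# narrow each character back to one byte, then decode the bytes as UTF-8.
repaired = mangled.encode("latin-1").decode("utf-8")
```

The mangled string is one character longer than the original here, 
because the single non-ASCII character was split into two.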

Is there a trick to get Python 2.2 to produce clean UCS4?

-- 
Trond Eivind Glomsrød
Senior Software Engineer
Scali - www.scali.com
Scaling the Linux Datacenter



