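On Mon, Mar 30, 2009 at 3:36 AM, spir <denis.spir at free.fr> wrote:
> Everything is in the title ;-)
> (Is it kind of integers representing the code point?)

Yes, more or less. On a narrow build of Python, a unicode string is stored as a sequence of 16-bit integers; a wide build uses 32-bit integers instead, and sys.maxunicode tells you which one you have. On a narrow build, characters outside the BMP are stored as surrogate pairs, so each one counts as two items for len() and indexing rather than as a single character.

Kent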
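
P.S. Here is a rough sketch for poking at this yourself. The helper names code_points and utf16_code_units are made up for illustration and are not part of Python. The second one only shows what UTF-16 would look like for a given string; it does not inspect the interpreter's actual internal buffer.

```python
import struct
import sys


def code_points(text):
    """Return the integer that ord() reports for each item in text."""
    return [ord(ch) for ch in text]


def utf16_code_units(text):
    """Return the 16-bit code units of text when encoded as UTF-16."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    count = len(data) // 2
    return list(struct.unpack(">%dH" % count, data))


bmp = u"\u00e9"          # LATIN SMALL LETTER E WITH ACUTE, inside the BMP
astral = u"\U0001D11E"   # MUSICAL SYMBOL G CLEF, outside the BMP

# 0xFFFF on a narrow build, 0x10FFFF on a wide build (and on Python 3.3+).
print(hex(sys.maxunicode))

print(code_points(bmp))                          # [233]
print([hex(u) for u in utf16_code_units(bmp)])     # one code unit
print([hex(u) for u in utf16_code_units(astral)])  # a surrogate pair: 0xd834, 0xdd1e
```

If len(astral) gives 2 on your interpreter, you are on a narrow build and are seeing the two halves of the surrogate pair; if it gives 1, the string is stored with one item per code point.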