Is the PyCodec_Encode API able to change encoding from UCS-2 to UCS-4?

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Mon Apr 27 15:26:49 EDT 2009


On Mon, 27 Apr 2009 08:56:44 -0300, rahul <rahul03535 at gmail.com> wrote:

> Can this generic API be used to convert UCS-2 to UCS-4?
> PyObject *  PyCodec_Encode(
>        PyObject *object,
>        const char *encoding,
>        const char *errors
>        );
>
> If yes, what is the name used to denote UCS-4? Every time I try this
> I get a segmentation fault. I used "ucs-4-le" on little-endian
> systems and "ucs-4-be" on big-endian systems to select the UCS-4
> encoding.

The PyCodec_XXX functions seem to be undocumented - I don't know if this  
is on purpose or not. Anyway, I'd use the str/unicode methods:

PyObject* u = PyString_AsDecodedObject(some_string_in_utf16, "utf-16", NULL);
// don't forget to check for errors
PyObject* s = PyUnicode_AsEncodedString(u, "utf-32", NULL);
// don't forget to check for errors and decref u
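
The same two steps can be tried from Python itself before wiring them into C, which is a cheap way to check which codec names are accepted. This is only a sketch of the Python-level counterpart, not the C API; the helper names utf16_to_utf32 and codec_known are made up for illustration, and the input is assumed to be UTF-16 with a BOM.

```python
import codecs


def codec_known(name):
    """Return True if `name` resolves to a registered codec (hypothetical helper)."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


def utf16_to_utf32(data, byteorder="le"):
    """Re-encode UTF-16 bytes as UTF-32 with an explicit byte order (hypothetical helper)."""
    # Step 1: decode to a unicode object; this mirrors the
    # PyString_AsDecodedObject(..., "utf-16", NULL) call in the C snippet.
    text = data.decode("utf-16")
    # Step 2: encode to the target codec; this mirrors
    # PyUnicode_AsEncodedString(u, ..., NULL). The "-le"/"-be" suffix
    # fixes the byte order and suppresses the BOM.
    return text.encode("utf-32-" + byteorder)


# Build some UTF-16 input (BOM followed by native-order code units).
sample = u"h\u00e9llo".encode("utf-16")
converted = utf16_to_utf32(sample, "le")
```

Calling codec_known("ucs-4-le") on a given installation shows whether that spelling is registered there; "utf-32-le" and "utf-32-be" are the names used in the sketch.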

Python 2.6 provides some convenience functions, such as
PyUnicode_DecodeUTF16 and PyUnicode_EncodeUTF32.

-- 
Gabriel Genellina



