Writing python module in C: wchar_t or Py_UNICODE?

Fri Mar 16 10:13:20 EDT 2007

On Fri, 2007-03-16 at 04:04 -0700, Yury wrote:
> I am new to python and programming generally, but someday it is time
> to start :)
> I am writing a python module in C and have a question about multibyte
> character strings in python<=>C.
> I want a C function which takes a string as argument from python
> script:
> 
> static PyObject *
> connect_to_server(PyObject *self, PyObject *authinfo)
> {
>     wchar_t *login;   /* Must support unicode */
>     char *serveraddr;
>     int port;         /* "i" stores into an int, so this is not a pointer */
> 
>     if (!PyArg_ParseTuple(authinfo, "siu", &serveraddr, &port, &login))
>         return NULL;
> 
>     ...
> 
> Will that code work?
> Or should I use the Py_UNICODE * data type? Will it be compatible with
> standard C string comparison/concatenation functions?

You should familiarize yourself with the Python/C API documentation. It
contains the answers to all the above questions.

http://docs.python.org/api/arg-parsing.html says this about the "u"
format character: "a pointer to the existing Unicode data is stored into
the Py_UNICODE pointer variable whose address you pass."

http://docs.python.org/api/unicodeObjects.html says this about
Py_UNICODE: "On platforms where wchar_t is available and compatible with
the chosen Python Unicode build variant, Py_UNICODE is a typedef alias
for wchar_t to enhance native platform compatibility."

The first quote says that, to be strictly correct, "login" should be a
"Py_UNICODE*", but the second quote says that under the right
circumstances, Py_UNICODE is the same as wchar_t. It's up to you to
determine whether your platform qualifies, that is, whether its wchar_t
matches the code unit width (2 or 4 bytes) of your Python's Unicode
build.
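
As for the standard C string functions: if Py_UNICODE does turn out to
be wchar_t on your platform, then "login" is a wide string, and the
narrow functions (strcmp, strcat) do not apply to it. The <wchar.h>
counterparts (wcscmp, wcscat) would be the ones to use. Below is a
minimal sketch in plain C, with no Python headers involved. The helper
names login_equals and join_wide are made up for illustration and are
not part of any API.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if the two wide strings are equal.
   wcscmp is the wide-character counterpart of strcmp. */
static int login_equals(const wchar_t *a, const wchar_t *b)
{
    return wcscmp(a, b) == 0;
}

/* Hypothetical helper: writes prefix followed by login into out.
   out_len is the capacity of out in wide characters, including the
   terminating L'\0'. Returns 0 on success, -1 if the result would
   not fit (out is left untouched in that case). */
static int join_wide(wchar_t *out, size_t out_len,
                     const wchar_t *prefix, const wchar_t *login)
{
    assert(out != NULL && prefix != NULL && login != NULL);

    if (wcslen(prefix) + wcslen(login) + 1 > out_len)
        return -1;

    wcscpy(out, prefix);   /* wide counterpart of strcpy */
    wcscat(out, login);    /* wide counterpart of strcat */
    return 0;
}
```

With a buffer declared as "wchar_t buf[16]", a call such as
join_wide(buf, 16, L"user:", login) fills buf with the joined string
when it fits. This only holds where Py_UNICODE really is wchar_t. Where
the two types differ, the Unicode object API has PyUnicode_AsWideChar
for copying the data into a wchar_t buffer first.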

Hope this helps,

Carsten.