Creating custom types from C code

Stefan Behnel stefan_ml at behnel.de
Sat Dec 18 01:18:35 EST 2010


Eric Frederich, 17.12.2010 23:58:
> I have an extension module for a 3rd party library in which I am
> wrapping some structures.
> My initial attempt worked okay on Windows but failed on Linux.
> I was doing it in two parts.
> The first part on the C side of things I was turning the entire
> structure into a char array.
> The second part in Python code I would unpack the structure.
>
> Anyway, I decided I should be doing things a little cleaner so I read
> up on "Defining New Types"
> http://docs.python.org/extending/newtypes.html
>
> I got it to work but I'm not sure how to create these new objects from C.

You may want to take a look at Cython. It makes writing C extensions easy. 
For one, it will do all sorts of type conversions for you, and do them 
efficiently and safely (you get an exception on overflow, for example). 
It's basically Python, so creating classes and instantiating them is trivial.

Also note that it's generally not too much work to rewrite an existing C 
wrapper in Cython, and it's almost always worth it. You immediately get 
more maintainable code that's much easier to extend and work on. It's also 
often faster than hand-written code.

http://cython.org


> My setup is almost exactly like the example on that page except
> instead of 2 strings and an integer I have 5 unsigned ints.
>
> I do not expect to ever be creating these objects in Python.  They
> will only be created as return values from my wrapper functions to the
> 3rd party library.

In Cython 0.14, you can declare classes as "final" and "internal" using a 
decorator, meaning that they cannot be subtyped from Python and do not show 
up in the module dict. However, note that there is no way to prevent users 
from getting their hands on the type once you give them an instance, since 
type(instance) always returns it.


> I could return a tuple from those functions but I want to use dot
> notation (e.g. somestruct.var1).

Then __getattr__ or properties are your friends.
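
A rough sketch of both options in plain Python, assuming a wrapper with two 
fields. The class names and the var1/var2 fields are made up for the 
example; a Cython class would look much the same.

```python
class PropertyStruct(object):
    """Dot access through read-only properties."""

    def __init__(self, var1, var2):
        self._var1 = var1
        self._var2 = var2

    @property
    def var1(self):
        return self._var1

    @property
    def var2(self):
        return self._var2


class GetattrStruct(object):
    """Dot access through __getattr__ and a table of field names."""

    _fields = ("var1", "var2")

    def __init__(self, *values):
        self._values = dict(zip(self._fields, values))

    def __getattr__(self, name):
        # Only called when the normal attribute lookup has failed.
        try:
            return self.__dict__["_values"][name]
        except KeyError:
            raise AttributeError(name)


p = PropertyStruct(1, 2)
g = GetattrStruct(3, 4)
print(p.var1, g.var2)
```

Properties are explicit and read-only by default; __getattr__ saves typing 
when there are many fields.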


> So, question number 1:
>      Is defining my own type like that overkill just to have an object
> to access using dots?

Creating wrapper objects is totally normal.

Also note that Python 2.6 and later have named tuples 
(collections.namedtuple), BTW.
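
For example, a named tuple with five fields could look like this (the type 
name and the field names are assumed from the var1 ... var5 naming used in 
this thread):

```python
from collections import namedtuple

# Field names follow the var1 ... var5 naming used in the thread.
SomeStruct = namedtuple("SomeStruct", "var1 var2 var3 var4 var5")

s = SomeStruct(1, 2, 3, 4, 5)
print(s.var1)   # access by name
print(s[4])     # still an ordinary tuple, so indexing works
```

From C, such a type can be called like any other Python callable.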


>      I'll never create those objects from Python.
>      Is there a shortcut to creating objects and setting attributes
> from within C?

The Cython code for instantiating classes is identical to Python.


> In any case, I was able to create my own custom object from C code like so...
>
>      PyObject *foo(SomeCStruct bar){
>          PyObject *ret;
>          ret = _PyObject_New(&mymodule_SomeStructType);
>          PyObject_SetAttrString(ret, "var1" , Py_BuildValue("I", bar.var1 ));
>          PyObject_SetAttrString(ret, "var2" , Py_BuildValue("I", bar.var2 ));
>          PyObject_SetAttrString(ret, "var3" , Py_BuildValue("I", bar.var3 ));
>          PyObject_SetAttrString(ret, "var4" , Py_BuildValue("I", bar.var4 ));
>          PyObject_SetAttrString(ret, "var5" , Py_BuildValue("I", bar.var5 ));
>          return ret;
>      }
>
> When using _PyObject_New I notice that neither my new or init function
> are ever called.
> I verified that they are getting called when creating the object from
> Python

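Things often work a little differently in Python and C. Directly calling 
_PyObject_New() does a lot less than what Python does internally: it only 
allocates the object and sets up its type pointer and reference count, 
without calling the type's new or init functions. The canonical way is to 
PyObject_Call() the type (or to use one of the other call functions, 
depending on what your arguments are).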
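
The difference can be seen at the Python level, too. This is only an 
analogy, with a made-up SomeStruct class: object.__new__() stands in for a 
bare allocation, whereas calling the type runs the whole protocol. (In C, 
_PyObject_New() bypasses the type's new function as well.)

```python
class SomeStruct(object):
    def __init__(self, var1, var2):
        self.var1 = var1
        self.var2 = var2


# Calling the type runs __new__ and then __init__.
complete = SomeStruct(1, 2)

# Allocating without calling the type skips __init__ entirely.
bare = object.__new__(SomeStruct)

print(hasattr(complete, "var1"))
print(hasattr(bare, "var1"))
```

The second object exists but was never initialised, which is the situation 
described in the quoted text above.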


> (which I would never do anyway).

Your users could do it, though, so you should make sure that doing so 
cannot crash the interpreter by leaving internal data fields uninitialised.


> Question number 2:
>      Do I need to be calling PyObject_SetAttrString or is there a way
> to set the unsigned ints on the structure directly?
>      It seems overkill to create a Python object for an unsigned int
> just to set it as an attribute on a custom defined type.

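You will have to do it at some point, though, either at instantiation time 
(storing Python objects as attributes) or at Python access time (keeping 
the C values in the object and converting them on each attribute access). 
Depending on the expected usage, either of the two can be more wasteful.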
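
To make the trade-off concrete, here is a small sketch that uses ctypes 
purely as an illustration; it is not what the original wrapper does. A 
ctypes.Structure keeps the raw C values and builds a Python int on each 
attribute access, while the hypothetical EagerStruct class converts every 
field once when it is created.

```python
import ctypes


class SomeCStruct(ctypes.Structure):
    # Five unsigned ints, as described in the thread; the field names
    # are assumptions.
    _fields_ = [("var%d" % i, ctypes.c_uint) for i in range(1, 6)]


class EagerStruct(object):
    """Converts all fields to Python ints once, at instantiation time."""

    def __init__(self, c_struct):
        for name, _ctype in c_struct._fields_:
            setattr(self, name, getattr(c_struct, name))


raw = SomeCStruct(1, 2, 3, 4, 5)

# Conversion at access time: each read of raw.var3 builds a Python int.
print(raw.var3)

# Conversion at instantiation time: five ints are built up front,
# whether or not they are ever read.
eager = EagerStruct(raw)
print(eager.var3)
```

If most fields are never read, converting on access is cheaper; if they 
are read many times, converting once up front wins.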


> Question number 3:
>      In the above code, is there a memory leak?  Should I be
> Py_DECREF'ing the return value from Py_BuildValue after I'm done using
> it.

You can look that up in the C-API docs. If a function doesn't say that it 
"steals" a reference, you still own the reference when it returns and have 
to manually decref it (again, a thing that you won't usually have to care 
about in Cython). PyObject_SetAttrString() does not steal the reference to 
the value, so, yes, the above leaks one reference for each call to 
Py_BuildValue().
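
The effect can be observed from Python on CPython, as a rough sketch: 
setattr() is the Python-level counterpart of PyObject_SetAttrString(), and 
sys.getrefcount() shows that the object takes its own reference to the 
value. The SomeStruct class and the stand-in value are hypothetical.

```python
import sys


class SomeStruct(object):
    pass


ret = SomeStruct()
value = object()   # stands in for the result of Py_BuildValue()

before = sys.getrefcount(value)
setattr(ret, "var1", value)   # the attribute takes a new reference
after = sys.getrefcount(value)

# Difference in reference count caused by setting the attribute.
print(after - before)
```

Since the attribute holds its own reference, the C code still owns the one 
returned by Py_BuildValue() and has to Py_DECREF() it after setting the 
attribute.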

Stefan



