[Python-Dev] Let's update CObject API so it is safe and regular!

Lisandro Dalcin dalcinl at gmail.com
Thu Apr 2 05:36:32 CEST 2009


On Wed, Apr 1, 2009 at 11:58 PM, Larry Hastings <larry at hastings.org> wrote:
>
> Guido van Rossum wrote:
>>
>> Yeah, any two CAPI objects can be used to play this trick, as long as
>> you have some place that calls them. :-(
>
> FWIW, I can't take credit for this observation.  Neal Norwitz threw me at
> this class of problem at the Py3k sprints in August 2007 at Google Mountain
> View, specifically with curses, though the approach he suggested then was
> removing the CObjects.
>

IMHO, removing them would be a really bad idea... PyCObjects are the
documented, recommended way for extension modules to export their
APIs, and that works pretty well in practice, and even better now
with your approach.

>
>> So what's your solution? If it was me I'd change the API to put the
>> full module name and variable name of the object inside the object and
>> have the IMPORT call check that. Then you can only have crashes if
>> some extension module cheats, and surely there are many other ways
>> that C extensions can cheat, so that doesn't bother me. :)
>
> My proposed API requires that the creator of the CObject pass in a "type"
> string, which must be of nonzero length, and the caller must pass in a
> matching string.  I figured that was easy to get right and sufficient for
> "consenting adults".

Just for reference, I'll describe how Cython uses this. First, Cython
exports its API on a function-by-function basis (instead of a single
pointer to a C struct with function pointers, as in e.g. cStringIO, or
an array of function pointers, as in e.g. NumPy). All of these are
cached in a "private" module global (a dict) named "__pyx_capi__". See
the link below, for example:

http://mpi4py.scipy.org/docs/api/mpi4py.MPI-module.html#__pyx_capi__

So the dict keys are the exported function names. Moreover, each
PyCObject's "desc" is a C string with the function signature. Cython
retrieves a function by name from the dict and checks that the
expected signature matches. BTW, I now believe Cython should also use
the function name for the "desc" :-)
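
A minimal sketch of the name-plus-signature lookup described above,
in plain C rather than the real Python C API. The names api_entry,
exported_api, import_function and add_ints are hypothetical, and the
static table only stands in for the export dict; this is not the code
Cython generates.

```c
#include <string.h>

/* Generic function pointer type used to store any exported function. */
typedef void (*generic_fn)(void);

/* Hypothetical stand-in for one item of the export dict:
   key = function name, value = function pointer plus signature string. */
typedef struct {
    const char *name;       /* dict key: exported function name */
    const char *signature;  /* plays the role of the CObject "desc" */
    generic_fn  func;       /* the exported function itself */
} api_entry;

/* Example function exported by the (hypothetical) provider module. */
static int add_ints(int a, int b) { return a + b; }

static const api_entry exported_api[] = {
    { "add_ints", "int (int, int)", (generic_fn)add_ints },
};

/* Look a function up by name and return it only if the signature the
   consumer expects matches the one recorded by the provider.
   Returns NULL on an unknown name or a signature mismatch. */
static generic_fn import_function(const char *name, const char *signature)
{
    size_t i, n = sizeof(exported_api) / sizeof(exported_api[0]);
    for (i = 0; i < n; i++) {
        if (strcmp(exported_api[i].name, name) != 0)
            continue;
        if (strcmp(exported_api[i].signature, signature) != 0)
            return NULL;  /* right name, wrong signature: refuse */
        return exported_api[i].func;
    }
    return NULL;  /* no such exported function */
}
```

A consumer would call import_function("add_ints", "int (int, int)")
and cast the result back to int (*)(int, int) before calling it; a
request with any other signature string gets NULL instead of a
pointer that would crash when called.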

The only issue with this approach for Cython is that PyCObject
currently stores "void*" (i.e., pointers to data), but does not have
room for "void(*)(void)" (i.e. pointers to functions, aka code).
Recently I had to write some hackery using type-punning with unions to
avoid the illegal conversion problem between pointers to data and
functions.
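
For illustration, the union trick can look roughly like the sketch
below. The names ptr_pun, func_to_data, data_to_func and answer are
hypothetical, and this is not the actual Cython code. It assumes that
data pointers and function pointers have the same size and
representation, which holds on mainstream platforms but is not
guaranteed by ISO C.

```c
#include <assert.h>

/* Generic function pointer type, i.e. "void(*)(void)". */
typedef void (*generic_fn)(void);

/* Union holding either a data pointer or a function pointer, so the
   bits of one can be read back as the other without a direct cast. */
typedef union {
    void      *data;
    generic_fn func;
} ptr_pun;

/* Pack a function pointer into a void* so it fits a "void*" slot. */
static void *func_to_data(generic_fn f)
{
    ptr_pun p;
    /* Assumption: both pointer kinds have the same size. */
    assert(sizeof(void *) == sizeof(generic_fn));
    p.func = f;
    return p.data;
}

/* Recover the function pointer from the stored void*. */
static generic_fn data_to_func(void *d)
{
    ptr_pun p;
    assert(sizeof(void *) == sizeof(generic_fn));
    p.data = d;
    return p.func;
}

/* Example function to round-trip through the void* slot. */
static int answer(void) { return 42; }
```

The provider would store func_to_data((generic_fn)answer) in the
CObject, and the consumer would apply data_to_func to the retrieved
"void*" and cast the result back to the real function type before
calling it.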

Larry, I did not understand your comments in the tracker about this.
Why do you see the above approach as a misuse of the API? All this
works extremely well in practice... A Cython-implemented extension
module can export its API, and then you can consume it from Cython,
and moreover from a hand-written C extension (and then you can easily
write SWIG typemaps).  And as the functions are exported one by one,
you can even add stuff to a module's API and the consumers will not
notice (with API tables implemented as a pointer to a C struct or an
array of function pointers, you need to be more careful to keep the
exported API backward compatible).


-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

