[Numpy-discussion] Getting C-function pointers from Python to C

Tue Apr 10 06:37:34 EDT 2012

On Tue, Apr 10, 2012 at 1:57 AM, Travis Oliphant <travis at continuum.io> wrote:
> On Apr 9, 2012, at 7:21 PM, Nathaniel Smith wrote:
>
> ...isn't this an operation that will be performed once per compiled
> function? Is the overhead of the easy, robust method (calling ctypes.cast)
> actually measurable as compared to, you know, running an optimizing
> compiler?
>
> Yes, there can be significant overhead.   The compiler is run once and
> creates the function.   This function is then potentially used many, many
> times.    Also, it is entirely conceivable that the "build" step happens at
> a separate "compilation" time, and Numba actually loads a pre-compiled
> version of the function from disk which it then uses at run-time.
>
> I have been playing with a version of this using scipy.integrate and
> unfortunately the overhead of ctypes.cast is rather significant --- to the
> point of making the code path that uses these function pointers useless,
> even though without the ctypes.cast overhead the speed-up is 3-5x.

Ah, I was assuming that you'd do the cast once outside of the inner
loop (at the same time you did type compatibility checking and so
forth).
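
A minimal sketch of doing the cast once, outside the inner loop. The names
here (PROTO, c_square, integrate) are made up for illustration, and a ctypes
callback stands in for a compiled function; real code would hand the raw
address to C rather than call it back from Python.

```python
import ctypes

# Hypothetical prototype: double f(double)
PROTO = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

def _square(x):
    return x * x

# Stand-in for a compiled function; must stay alive while its address is used.
c_square = PROTO(_square)

def integrate(cfunc, a, b, n=1000):
    """Midpoint-rule integral of cfunc over [a, b]."""
    # The cast (and any type checking) happens once, before the loop.
    addr = ctypes.cast(cfunc, ctypes.c_void_p).value
    f = PROTO(addr)  # rebuild a callable from the raw address
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        # No ctypes.cast in here: only the call itself is paid per iteration.
        total += f(a + (i + 0.5) * h)
    return total * h

area = integrate(c_square, 0.0, 1.0)
```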

> In general, I think NumPy will need its own simple function-pointer object
> to use when handing over raw-function pointers between Python and C.   SciPy
> can then re-use this object which also has a useful C-API for things like
> signature checking.    I have seen that ctypes is nice but very slow and
> without a compelling C-API.

Sounds reasonable to me. Probably nicer than violating ctypes's
abstraction boundary, and with no real downsides.

> The kind of new C-level cfuncptr object I imagine has attributes:
>
> void *func_ptr;
> char *signature;  /* something like 'dd->d' to indicate a function
>                      that takes two doubles and returns a double */

This looks like it's setting us up for trouble later. We already have
a robust mechanism for describing types -- dtypes. We should use that
instead of inventing Yet Another baby type system. We'll need to
convert between this representation and dtypes anyway if you want to
use these pointers for ufunc loops... and if we just use dtypes from
the start, we'll avoid having to break the API the first time someone
wants to pass a struct or array or something.
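
To make the comparison concrete, here is a rough sketch of how a 'dd->d'
style string maps onto dtypes, and of a signature the one-character form
cannot express. parse_signature and the (argument dtypes, return dtype) tuple
layout are assumptions for illustration, not an existing NumPy API.

```python
import numpy as np

def parse_signature(sig):
    """Split a 'dd->d' style string into (argument dtypes, return dtype).

    Assumes one type character per argument and a single return character.
    """
    args, ret = sig.split("->")
    return tuple(np.dtype(c) for c in args), np.dtype(ret)

# The string form converts to dtypes easily...
arg_dtypes, ret_dtype = parse_signature("dd->d")

# ...but a struct argument has no one-character spelling, while a dtype
# describes it directly.
point = np.dtype([("x", np.float64), ("y", np.float64)])
struct_signature = ((point,), np.dtype(np.float64))
```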

> methods would be:
>
> from_ctypes  (classmethod)
> to_ctypes
> and simple inline functions to get the function pointer and the signature.

The other approach would be to define an interface, something like:

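class MyFuncWrapper:
    def func_pointer(self, requested_rettype, requested_argtypes):
        # Return the address of a C function with the requested prototype,
        # as a plain integer (an_integer is a placeholder here).
        return an_integer

# wrapper is an instance of some class implementing the interface above
fp = wrapper.func_pointer(float, (float, float))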

This would be trivial to implement for ctypes functions, cython
functions, and numba. For ctypes or cython you'd probably just check
that the requested prototype matched the prototype for the wrapped
function and otherwise raise an error. For numba you'd check if you've
already compiled the function for the given type signature, and if not
then you could compile it on the fly. It'd also let you wrap an entire
family of ufunc loop functions at once (maybe np.add.c_func is an
object that implements the above interface to return any registered
add loop).
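
For the ctypes case, the prototype check could look roughly like the
following. CtypesFuncWrapper, and the choice to describe prototypes with
Python types such as float, are assumptions made for this sketch; only
ctypes.CFUNCTYPE and ctypes.cast are real ctypes calls.

```python
import ctypes

class CtypesFuncWrapper:
    """Wrap one ctypes function pointer with a fixed, known prototype."""

    def __init__(self, cfunc, rettype, argtypes):
        self._cfunc = cfunc  # keep the ctypes object alive
        self._rettype = rettype
        self._argtypes = tuple(argtypes)

    def func_pointer(self, requested_rettype, requested_argtypes):
        # Only one prototype is available, so anything else is an error.
        if (requested_rettype != self._rettype
                or tuple(requested_argtypes) != self._argtypes):
            raise TypeError("wrapped function does not match requested prototype")
        # The address is returned as a plain Python integer.
        return ctypes.cast(self._cfunc, ctypes.c_void_p).value

# A ctypes callback stands in for a real C function: double add(double, double)
PROTO = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double, ctypes.c_double)
c_add = PROTO(lambda a, b: a + b)

wrapper = CtypesFuncWrapper(c_add, float, (float, float))
fp = wrapper.func_pointer(float, (float, float))
```

A numba-style wrapper would differ only inside func_pointer: instead of
raising on an unknown prototype, it could compile for it and cache the result.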

OTOH maybe there are places where the code that *calls* the "c
function object" should be adapting to its signature, rather than the
other way around -- in that case you'd want some way for the "c
function object" to advertise what signature(s) it supports. I'm not
sure which way the flexibility goes for the cases you're thinking of.
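
A sketch of that second direction, where the object advertises its signatures
and the caller adapts. AdvertisingFunc, signatures() and pick_signature are
hypothetical names, and the addresses are made-up placeholder integers rather
than real function pointers.

```python
class AdvertisingFunc:
    """Hypothetical 'c function object' that lists the signatures it has."""

    def __init__(self, pointers):
        # Maps (rettype, argtypes) -> address of the matching C function.
        self._pointers = dict(pointers)

    def signatures(self):
        return list(self._pointers)

    def func_pointer(self, rettype, argtypes):
        return self._pointers[(rettype, tuple(argtypes))]

def pick_signature(func, preferred):
    """Return the first signature in the caller's preference order that func supports."""
    available = func.signatures()
    for sig in preferred:
        if sig in available:
            return sig
    raise TypeError("no mutually supported signature")

# Placeholder addresses, for illustration only.
add_like = AdvertisingFunc({
    (float, (float, float)): 0x1000,
    (int, (int, int)): 0x2000,
})

# The caller would rather have a complex loop but can fall back to float.
sig = pick_signature(add_like, [(complex, (complex, complex)),
                                (float, (float, float))])
fp = add_like.func_pointer(*sig)
```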

I feel like I may not be putting my finger on what you're asking,
though, so hopefully these random thoughts are helpful.

-- Nathaniel