[pypy-dev] C++ library bindings

Geoffrey Irving irving at naml.us
Fri Oct 17 05:50:36 CEST 2008


Hello,

I posted a response to your blog post on C++ library bindings, and
wanted to continue the discussion via email if anyone's interested.
I just signed up for the mailing list, so apologies if I
missed a lot of previous discussion.  I'll say up front that it's
unlikely that I'll be able to devote any actual coding effort to this,
so feel free to tell me to get lost if you have plenty of ideas and
not enough manpower. :)

I started out writing C++ bindings using Boost.Python, and was very
happy with it for a long time.  Its strongest point is the ability to
wrap libraries that were never designed with python in mind,
specifically code with poor and inflexible ownership semantics.
Internally, this means that C++ objects are exposed indirectly through
a holder object containing either an inline copy of the C++ object or
any type of pointer holding the object.  Every access to the object
has to go through runtime dispatch in order to work with any possible
holder type.  The holder also contains the logic for ownership and
finalization.  For example, Boost.Python can return a reference to a
field inside another object, in which case the holder will keep a
reference to the parent object to keep it alive as long as the field
reference lives.
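
To make that keepalive idea concrete, here is a minimal python
sketch (the names are mine, not Boost.Python's):

    class Holder:
        # Box a wrapped object together with whatever keeps its
        # memory alive: the parent object, a smart pointer, etc.
        def __init__(self, value, keepalive=None):
            self.value = value
            self.keepalive = keepalive

    def field_reference(parent, get_field):
        # Returning a reference to a field: the new holder points at
        # the field but also pins the parent so it outlives the ref.
        return Holder(get_field(parent.value), keepalive=parent)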

The problem with this generality is that it produces a huge amount of
object code (wrapping a single function in Boost.Python can add 10k to
the object file), and adds a lot of runtime indirection.

Assuming that one is writing C++ bindings because of speed issues,
it'd be nice if this extra layer of memory indirection and runtime
dispatch were exposed to the (eventual) JIT.  In order to do that,
pypy would have to be capable of handling pointers to raw memory
containing non-python objects (is that already true thanks to the
ctypes work?), with separate information about type and ownership.
For example, if you have bindings for a C++ vector class and a C++
array containing the vectors, a "reference" to an individual vector
in the array is really three different pieces (sketched after the
list):

1. The actual pointer to the vector.
2. A type structure containing functions to be called with the pointer
(1) as an argument.
3. A list of references to other objects that need to stay alive while
this reference lives.
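
As a rough illustration, boxing those three pieces in python might
look like this (all names hypothetical; the pointer would be e.g. a
ctypes.c_void_p into raw memory):

    class CppType:
        # Piece 2: shared type structure -- the C functions that
        # know how to operate on a raw pointer of this type.
        def __init__(self, name, methods):
            self.name = name
            self.methods = methods  # dict: method name -> C function

    class CppRef:
        # Pieces 1 and 3: the raw pointer plus keepalive info.
        def __init__(self, ptr, cpp_type, keepalive=()):
            self.ptr = ptr              # 1. pointer into raw memory
            self.type = cpp_type        # 2. type structure
            self.keepalive = keepalive  # 3. e.g. the owning array

        def call(self, method, *args):
            return self.type.methods[method](self.ptr, *args)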

If pypy and the JIT end up able to treat these pieces separately,
it'd be a significant performance win over libraries wrapped for
CPython.

The other main source of slowness and complexity in Boost.Python is
overloading support, but I think that part is fairly straightforward
to handle at the python level.  All Boost.Python does internally is
loop over the set of functions registered for a given name and, for
each one, loop over the arguments, calling into its converter
registry to see if each python object can be converted to the
corresponding C++ type.
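
In pseudocode (this mirrors the structure, not Boost.Python's actual
API; registry.find is a hypothetical lookup returning a converter or
None):

    def resolve_overload(overloads, registry, args):
        # Try each registered C++ signature in turn; the first one
        # whose parameters can all be converted from the python
        # arguments wins.
        for signature, cpp_func in overloads:
            if len(signature) != len(args):
                continue
            convs = [registry.find(type(a), t)
                     for a, t in zip(args, signature)]
            if all(convs):
                return cpp_func(*[c(a) for c, a in zip(convs, args)])
        raise TypeError("no matching C++ overload")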

As I mentioned in the blog comment, a lot of these issues come up in
contexts outside C++, like numpy.  Internally numpy represents
operations like addition as a big list of optimized routines to call
depending on the stored data type.  Functions in these tables are
called on raw pointers to memory, which is fundamental since numpy
arrays can refer to memory inside objects from C++, Fortran, mmap,
etc.  It'd be really awesome if the type dispatch step could be
written in python but still call into optimized C code for the final
arithmetic.
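
A toy version of that split, with the dispatch in python and the
inner loop standing in for an optimized C routine (in the real thing
the table would hold C function pointers running over raw memory):

    from collections import namedtuple

    Array = namedtuple('Array', 'dtype data size')  # toy ndarray

    def _add_float64(dst, a, b, n):
        # Stand-in for an optimized C loop over raw float64 buffers.
        for i in range(n):
            dst[i] = a[i] + b[i]

    ADD_LOOPS = {'float64': _add_float64}

    def add(a, b, out):
        loop = ADD_LOOPS[a.dtype]               # python-level dispatch
        loop(out.data, a.data, b.data, a.size)  # "C" does the work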

The other major issue is safety: if a lot of overloading and dispatch
code is going to be written in python, it'd be nice to shield that
code from segfaults.  I think you can get a long way there just by
having a consistent scheme for boxing the three components above
(pointer, type, and reference info), a way to label C function
pointers with type information, and a small RPython layer that does
simple type-checked calls (with no support for overloading or type
conversion).  I just wrote a C++ analogue to this last part as a
minimal replacement for Boost.Python, so I could try to formulate what
I mean in pseudocode if there's interest.  There'd be some amount of
duplicate type checking if higher level layers such as overload
resolution were written in application-level python, but that
duplication should be amenable to elimination by the JIT.
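
Roughly, I'm imagining something like this (hypothetical pseudocode,
reusing the CppRef box from the sketch above):

    class TypedFunc:
        # A C function pointer labeled with argument/result types.
        def __init__(self, cfunc, argtypes, restype):
            self.cfunc = cfunc
            self.argtypes = argtypes
            self.restype = restype

    def safe_call(func, args):
        # The small trusted layer: check every boxed argument's type
        # tag before touching any raw pointer, so the python-level
        # code above this can't segfault the interpreter.
        if len(args) != len(func.argtypes):
            raise TypeError("wrong number of arguments")
        for arg, expected in zip(args, func.argtypes):
            if arg.type is not expected:
                raise TypeError("expected %s, got %s"
                                % (expected.name, arg.type.name))
        raw = func.cfunc(*[a.ptr for a in args])
        return CppRef(raw, func.restype)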

That's enough for now.  I'll look forward to the discussion.  Most of
my uses of python revolve heavily around C++ bindings, so it's
exciting to see that you're starting to think about it even if it's a
long way off.

Geoffrey


