[C++-sig] Redesign for to_python/from_python converters

Sun Dec 22 21:52:22 CET 2002

Some background for this thread is in the following messages:

http://article.gmane.org/gmane.comp.python.c++/638 describes the
global Boost.Python to/from-python converter registry which associates
conversion code with the C++ type(s) being converted.  On re-reading
this message, I realize it's confusingly worded.  Don't be discouraged
if it doesn't all sink in.  The point is that when a function
accepting a C++ argument of type Foo is wrapped, it needs a procedure
to extract a Foo from the Python object.  The registry associates the
code for that procedure with the type Foo.
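
To make the registry idea concrete, here is a minimal sketch. It is not
Boost.Python's actual registry code: FakeObject is a stand-in for
PyObject*, and extractor, registry(), register_extractor, extract and
foo_from_int are all hypothetical names chosen for illustration.  It
only shows the association between a C++ type and the procedure that
extracts a value of that type.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical stand-in for a Python object; real code would use PyObject*.
struct FakeObject {
    std::string type_name;  // e.g. "int" or "str"
    long int_value;
    std::string str_value;
};

// An extractor writes the converted value into *out and reports success.
typedef bool (*extractor)(FakeObject const& source, void* out);

// The registry: one extraction procedure per C++ type.
inline std::map<std::type_index, extractor>& registry() {
    static std::map<std::type_index, extractor> r;
    return r;
}

// Associate an extraction procedure with the C++ type T.
template <class T>
void register_extractor(extractor f) {
    registry()[std::type_index(typeid(T))] = f;
}

// What a function wrapper would do for an argument of type T:
// look up the procedure registered for T and run it.
template <class T>
bool extract(FakeObject const& source, T& out) {
    std::map<std::type_index, extractor>::const_iterator it =
        registry().find(std::type_index(typeid(T)));
    if (it == registry().end()) return false;  // nothing registered for T
    return it->second(source, &out);
}

struct Foo { long value; };

// Conversion code for Foo: accepts only "int" objects in this sketch.
bool foo_from_int(FakeObject const& source, void* out) {
    assert(out != 0);
    if (source.type_name != "int") return false;
    static_cast<Foo*>(out)->value = source.int_value;
    return true;
}
```

After register_extractor<Foo>(&foo_from_int), a call to extract(obj, foo)
finds the procedure through typeid(Foo) alone; the caller never names
foo_from_int.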

http://article.gmane.org/gmane.comp.python.c++/2044 describes a
proposed scheme for dealing with the fact that currently, any given
C++ type can have only one to-python conversion in the global
registry.  This is a problem because one module may use
class_<vector<string> >, while another one registers a custom
conversion from vector<string> to a Python tuple of strings.

http://article.gmane.org/gmane.comp.python.c%2B%2B/2161 describes the
general procedure for defining an rvalue from_python converter, and
shows why I haven't yet documented how to make user-defined
conversions: it's just too hairy a procedure in v2 at the moment.

The purpose of this thread is to discuss the design of a new interface
for users to interact with to/from-python converters.

First, I'd like to discuss some things which I think are desirable.
It would be worth knowing which of these are important to the
community:

1. Users should have the option to say that certain to/from-python
   converters shall only have a local effect in a single extension
   module.

2. It should be possible to optimize away the cost of registry lookups
   in some cases where it is known that the conversion is defined in
   the local extension module.  This is a completely separate issue
   from #1.  We can have either one without the other.

3. It should be easy for users to explicitly define new conversions.

Let's deal with #3 first.  You might ask, "what was wrong with the old
Boost.Python v1 approach? It sure was simple!"  

   PyObject* to_python(SomeUDT const& x) { ... }
   SomeUDT from_python(PyObject* p, boost::type<SomeUDT>) { ... }

It was simple, but it was basically not legal C++.  It worked in so
many places because it exploited a very common (and somewhat subtle)
bug in C++ compilers, but if you tried to use more conforming
compilers (e.g. CodeWarrior >= 8 or recent EDGs), it would fail to
compile.  The reasons have to do with the rules for looking up these
functions from within templates.  Such a scheme can be made to work
legally, but you have to resort to weird tricks like asking users to
define all their converters before #including Boost.Python headers or
adding dummy arguments to the functions for the sake of
argument-dependent lookup.
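
As an illustration of the "dummy argument" trick, the sketch below shows
a library template finding a converter that is declared after it.  The
namespaces lib and user, the object struct (a stand-in for PyObject) and
the tag template lib::type are assumptions made for this example, not
Boost.Python code.  The tag carries T as a template argument, so T's
namespace becomes an associated namespace for argument-dependent lookup.

```cpp
#include <string>

namespace lib {
    // Hypothetical stand-in for PyObject.
    struct object { std::string text; };

    // Dummy tag type; it carries T so that T's namespace takes part in ADL.
    template <class T> struct type {};

    // Library template written before any user converter exists.
    template <class T>
    T extract(object const& o) {
        // Unqualified call with a dependent argument: it is resolved at
        // instantiation time, and ADL searches the namespaces associated
        // with type<T>, which include the namespace of T.
        return from_python(o, type<T>());
    }
}

namespace user {
    struct Name { std::string value; };

    // Declared after lib::extract, yet found via ADL on lib::type<Name>.
    Name from_python(lib::object const& o, lib::type<Name>) {
        Name n;
        n.value = o.text;
        return n;
    }
}
```

Here lib::extract<user::Name>(o) compiles on a conforming compiler
because user is an associated namespace of lib::type<user::Name>.  The
same trick does not help for a type with no associated namespace of its
own, such as int, which is part of why the scheme gets awkward.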

There is another problem with the v1 scheme.  In particular, it was an
important design goal of Boost.Python v2 to eliminate the use of C++
exceptions as part of the process of resolving overloaded C++
functions (in v1, we would throw a special exception to indicate a
Python argument couldn't be converted to the corresponding C++
argument type).  That means that from_python conversion has to be a
2-phase process: first, determine whether a conversion is possible,
then if all arguments can be converted, do all the conversions and
call the C++ function.  Any user-defined from_python conversion needs
to be able to report convertibility separately from actually doing the
conversion.
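
The two phases can be sketched as follows.  This is an illustration
under assumptions, not the v2 implementation: FakeObject stands in for a
Python object, and int_converter, string_converter, try_call, add, join
and dispatch are hypothetical names.  Each converter answers "is a
conversion possible?" separately from doing the conversion, so an
overload that doesn't match is skipped without throwing.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a Python object.
struct FakeObject {
    bool is_int;
    long int_value;
    std::string str_value;
};

// Hypothetical two-phase converters: convertible() only answers whether
// a conversion is possible; convert() does the actual work.
struct int_converter {
    static bool convertible(FakeObject const& o) { return o.is_int; }
    static long convert(FakeObject const& o) { return o.int_value; }
};

struct string_converter {
    static bool convertible(FakeObject const& o) { return !o.is_int; }
    static std::string convert(FakeObject const& o) { return o.str_value; }
};

// Try one two-argument overload; returns false if it does not match.
template <class C0, class C1, class F>
bool try_call(std::vector<FakeObject> const& args, F f, std::string& result) {
    if (args.size() != 2) return false;
    // Phase 1: check every argument before converting any of them.
    if (!C0::convertible(args[0]) || !C1::convertible(args[1])) return false;
    // Phase 2: all checks passed, so convert and call the C++ function.
    result = f(C0::convert(args[0]), C1::convert(args[1]));
    return true;
}

// Two C++ functions standing in for an overload set being wrapped.
std::string add(long a, long b) { return std::to_string(a + b); }
std::string join(std::string const& a, std::string const& b) { return a + b; }

// Overload resolution without exceptions: the first overload whose
// arguments are all convertible gets called.
bool dispatch(std::vector<FakeObject> const& args, std::string& result) {
    return try_call<int_converter, int_converter>(args, &add, result)
        || try_call<string_converter, string_converter>(args, &join, result);
}
```

With two integer objects dispatch calls add; with two string objects it
falls through to join; with one of each it returns false, which is where
a real wrapper would report a Python TypeError.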

There were also some ways in which the v1 interface was hard to use:
the 2nd argument to the from_python converter had to exactly match the
argument type to any C++ functions being wrapped, so you might need:

   SomeUDT from_python(PyObject* p, boost::type<SomeUDT>) { ... }
   SomeUDT& from_python(PyObject* p, boost::type<SomeUDT&>) { ... }
   SomeUDT const& from_python(PyObject* p, boost::type<SomeUDT const&>) { ... }
   SomeUDT* from_python(PyObject* p, boost::type<SomeUDT*>) { ... }
   SomeUDT const* from_python(PyObject* p, boost::type<SomeUDT const*>) { ... }

and a few others.

So what should the interface look like?  Let's first examine the
constraints that the C++ language imposes.  User-defined converters
can be viewed as behavioral customizations of templates in the
Boost.Python library.

We basically have two approaches available to us for customizing
behaviors:

1. Runtime dispatching through virtual functions or function pointers.
   This is the approach currently taken in Boost.Python v2 for most
   converters.  The converter registry contains pointers to functions
   which implement the conversions.  Some runtime dispatching is
   always needed for cross-module conversion support, unless you want
   to repeat the conversion code in every module which needs it (and
   nobody wants that).  There are other reasons to do this having to
   do with the way dynamic_cast and RTTI are implemented in most
   compilers.

2. Compile-time customization.  This was the default approach in
   Boost.Python v1: the compiler would look up the appropriate
   to_/from_python function and insert a call in the function wrapper.
   It could even be completely inlined.  The disadvantage of this
   approach is that it is heavily dependent on code visibility: the
   customizations have to be visible in every place that wants to take
   advantage of them.  Boost.Python v2 uses compile-time customization
   only for to-python conversions of "builtin" C++ types for which
   there can be only one reasonable Python interpretation.  For
   example, const char* and std::string both are converted to Python
   strings.
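
For contrast with the registry approach, compile-time customization by
template specialization might look roughly like the sketch below.  The
names FakeObject (a stand-in for a Python value), builtin_to_python and
to_python are assumptions for this example and are not the library's
actual templates.  The point is that the compiler selects the conversion,
so no runtime lookup is involved, but the specialization has to be
visible wherever the conversion is used.

```cpp
#include <string>

// Hypothetical stand-in for a Python value.
struct FakeObject {
    std::string python_type;
    std::string repr;
};

// Primary template left undefined: using a type with no specialization
// is a compile-time error rather than a runtime lookup failure.
template <class T> struct builtin_to_python;

template <> struct builtin_to_python<std::string> {
    static FakeObject convert(std::string const& s) {
        FakeObject o = {"str", s};
        return o;
    }
};

// const char* gets the same Python interpretation as std::string.
template <> struct builtin_to_python<char const*> {
    static FakeObject convert(char const* s) {
        return builtin_to_python<std::string>::convert(s);
    }
};

template <> struct builtin_to_python<long> {
    static FakeObject convert(long x) {
        FakeObject o = {"int", std::to_string(x)};
        return o;
    }
};

// What a function wrapper would do with a wrapped function's result;
// the specialization is chosen by the compiler and can be inlined.
template <class T>
FakeObject to_python(T const& x) {
    return builtin_to_python<T>::convert(x);
}
```

Calling to_python on a char const* or a std::string yields a "str"
object in both cases, while a type without a specialization fails to
compile at the point of use.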

Ideally, the user could select compile-time customization for
to_python conversion of selected types, and additionally choose to
export their converters (e.g. to the global registry).  I'm really
unsure about the value of compile-time customization of from_python
converters.  The biggest problem with it is that it limits the whole
extension module to *one* conversion method for a given C++ type.
Normally, a from_python converter is used to convert one Python type
to a given C++ type.  If another extension module has exported
conversions for other Python types to the same C++ type, do you want
to be able to use them?

There are basically two viable techniques for compile-time
customization in C++: defining functions to be found by
argument-dependent lookup, and template specialization.  For various
reasons I won't bore you with, I feel that template specialization is
the only recourse for Boost.Python.  I'll begin discussion of some
possible interfaces in a follow-on message.

-- 
                       David Abrahams
   dave at boost-consulting.com * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution