[C++-SIG] (Still) confused about wrapping C++ types with CXX

Fri Jun 2 00:04:06 CEST 2000

I've spent almost a week trying to figure out how to ask my
question(s) and am still having difficulty finding a concise way to
share the issues I'm facing. Here goes:

I'm trying to prove that I can write non-intrusive Python wrappers
around non-Python-aware C++ types. In order to start in on this
exercise, I created a hypothetical C++ type called "some_pod":

==================================================
#ifndef POD_HH
#define POD_HH

#include <string>
#include <list>
#include <map>

class some_pod
{
public:
    some_pod(const std::string&);
    ~some_pod();

    some_pod(const some_pod&);
    some_pod& operator=(const some_pod&);

    void do_it() const;
    const std::list<int>& get_list() const;
    const std::map<std::string, int>& get_map() const;

private:
    std::string m_str;
    std::list<int> m_list;
    std::map<std::string, int> m_map;
};

#endif  // POD_HH
==================================================

some_pod is not exactly POD in the rigorous sense, but it embodies a
pattern common to the classes I'll need to wrap with Python: it does
some stuff, and exposes collections of other stuff. Sometimes these
collections contain items that are themselves collections. I didn't
show any "setter" methods here, or collection "getters" that expose
non-const references. Eventually I'll need to tackle those as well.

One of the first things I did was to write some support classes to
enable me to write a Py::PythonExtension-derived wrapper around a live
instance of some_pod (or any other type used as a template
argument). That is, my Py::PythonExtension-derived class does not
contain a some_pod instance; it contains a reference to one. The
assumption there is that the some_pod instance would likely be "live"
in the hosting application, and that application will want to call
into some user-provided Python code, offering up the some_pod instance
to user query and modification.

Now, for some types, there's no way it would make sense to create an
instance directly in Python (such as, say, the "application
context"). For others, maybe you can create one in Python with the
right constructor arguments. In the first case, you could just use my
'py_bound_pod' Py::PythonExtension-derived class, constructing it around
a reference to the live some_pod instance. In the second case, you can
use one of the "owning_wrapper" templates I've written that bind
together a C++ instance and the Py::PythonExtension-derived type that
wraps it. The owning_wrapper template type derives publicly from the
Py::PythonExtension-derived type. Here's part of that facility:

==================================================
template <class Owned>
class concrete_owner_base
{
public:
	typedef Owned owned_type;

	owned_type& get_owned() throw()
	{ return m_owned; }

	const owned_type& get_owned() const throw()
	{ return m_owned; }

protected:
	concrete_owner_base()
	{}

	concrete_owner_base(const owned_type& owned)
		: m_owned( owned )
	{}

private:	// TODO: should this just be protected?
	owned_type m_owned;
};

// Assumptions:
// 1. class Base is constructible from an Owned&.

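template <class Base, class Owned>
class concrete_owning_wrapper : public concrete_owner_base<Owned>,
								public Base
{
public:
	typedef Base base_type;
	typedef typename concrete_owner_base<Owned>::owned_type owned_type;

	// get_owned() lives in a dependent base class, so it must be
	// qualified with this-> to be found during template name lookup.
	concrete_owning_wrapper()
		: base_type( this->get_owned() )
	{}

	concrete_owning_wrapper(const owned_type& owned)
		: concrete_owner_base<owned_type>( owned ),
		  base_type( this->get_owned() )
	{}
};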
==================================================

There's also a 'ptr_owning_wrapper' that holds a std::auto_ptr to a
heap-allocated C++ instance of the "Owned" type. By putting the owner
base first in the list of base classes, we ensure that the "Owned"
instance is constructed before the wrapper around it and destroyed
after it.
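
The lifetime argument rests on the C++ rule that base classes are
constructed in declaration order and destroyed in reverse. Here is a
minimal sketch of that rule, assuming stand-in types that compile
without CXX (payload, owner_base, ref_wrapper, owning_wrapper and the
event log are all hypothetical names for illustration):

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used only to observe construction order.
inline std::vector<std::string>& lifetime_log()
{
    static std::vector<std::string> log;
    return log;
}

// Stand-in for the wrapped C++ type.
struct payload
{
    payload()  { lifetime_log().push_back("payload ctor"); }
    ~payload() { lifetime_log().push_back("payload dtor"); }
};

// Plays the role of concrete_owner_base: holds the owned instance by value.
struct owner_base
{
    payload m_owned;
};

// Plays the role of the Py::PythonExtension-derived class: it keeps
// only a reference to the wrapped instance.
struct ref_wrapper
{
    explicit ref_wrapper(payload& p) : m_ref(p)
    { lifetime_log().push_back("wrapper ctor"); }
    ~ref_wrapper() { lifetime_log().push_back("wrapper dtor"); }
    payload& m_ref;
};

// owner_base is listed first, so it is constructed first and destroyed last.
struct owning_wrapper : owner_base, ref_wrapper
{
    owning_wrapper() : ref_wrapper(m_owned) {}
};

// Creates and destroys one owning_wrapper and returns the recorded events.
inline std::vector<std::string> run_lifetime_demo()
{
    lifetime_log().clear();
    { owning_wrapper w; }
    return lifetime_log();
}
```

Swapping the two base classes in owning_wrapper would hand the wrapper
a reference to a payload that has not been constructed yet.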

With all of this in place, I can create and use a wrapper around
some_pod more or less with CXX's existing, simple facilities. We can
also permit construction of a some_pod/py_bound_pod pair from Python,
using the owning_wrapper templates to unite their lifetimes. Things
get much more complicated with the nested list and map.

Next, I tried to write a Py::PythonExtension-derived class that wraps
a const std::list<>&. Wrapping a non-const reference will warrant more
methods being present, so I held off on that for now. It took a while
to get to where I am, but I still feel that I'm missing some big
points.

Take, for example, sequence_concat(). Should it be able to join with any
kind of sequence, or just another instance of our same type? What type
of sequence should it return? If we're wrapping a std::list, should we
bother creating a new std::list to underlie another wrapper around it,
or should we just stuff the merged contents into a Py::List? Here, you
can see me fighting with each choice:

==================================================
Py::Object sequence_concat(const Py::Object& j)
{
    // Should we check if this is the same type as we are?
    const Py::Sequence seq( j );

    // Should we return a new instance of our own type...
/*  typedef concrete_owning_wrapper<this_type, wrapped_type> owner_type;

    std::auto_ptr<owner_type> powner( new owner_type( m_wrapped ) );

    wrapped_type& wrapped = powner->get_owned();
    wrapped.insert( wrapped.end(),
                    seq.begin(), seq.end() );

    return Py::asObject( powner );
*/

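    // ... or a Py::List instance? Start with an empty list: constructing
    // Py::List with a size would create that many placeholder entries,
    // and append() would then add the real items after them.
    Py::List listConcat;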
    for ( typename wrapped_type::const_iterator
          it = m_wrapped.begin(),
          itEnd = m_wrapped.end();
          it != itEnd; ++it )
        listConcat.append( Py::make_object( *it ) );

    std::copy( seq.begin(), seq.end(),
               std::back_inserter( listConcat ) );

    return listConcat;
}
==================================================

Here, I've used some extensions to CXX: an overload of the existing
Py::asObject function that takes an auto_ptr<> and hands it over
safely, and methods like push_back() added to class Py::List so that
it works with std::back_inserter. There's also that Py::make_object()
function that I'll get to in a minute.

All of this gets even more complicated if you consider that the values
in the std::list may not be something as simple as an integer. What if
it's a type that deserves Python's "reference semantics" around it?
The elements of a list resulting from sequence_concat should be
references back to the same objects "referred to" in the source
lists. I can't figure out how to do this correctly.

Part of the problem comes from the fact that many of the "native"
Python types, as wrapped in CXX, are derived from Py::Object. You can
create them, copy them, and all the reference counting works
properly. Py::PythonExtension-derived types, however, do not behave
as simply as Py::Objects. Yes, you can create an instance and build
several Py::ExtensionObject instances around it. It's the creation of
that original instance that seems weird to me.

Consider an even simpler example: the sequence_item() method. Here's
part of my implementation:

==================================================
Py::Object sequence_item(int i)
{
    py_std_cont::seek_item_by_cached_index( m_wrapped, i,
                                            m_index, m_it );

    return Py::make_object( *m_it );
}
==================================================

It does some fancy iterator caching (with m_index and m_it) to avoid
starting at begin() every time it needs to get to an iterator by
index. The trouble is, what should it return?
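
For reference, the caching idea can be sketched as follows. This is
only one possible shape for seek_item_by_cached_index, not the actual
implementation; it assumes a non-negative index, assumes the cached
iterator has not been invalidated, and uses a hypothetical
cached_list_reader class in place of the real wrapper:

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <stdexcept>

// Moves `it` (currently at position `cached`) to position `target` in `c`
// and updates `cached`. Walks from begin() or from the cached position,
// whichever is closer. Assumes `it` is still a valid iterator into `c`.
template <class Container>
void seek_item_by_cached_index(const Container& c,
                               std::size_t target,
                               std::size_t& cached,
                               typename Container::const_iterator& it)
{
    if (target >= c.size())
        throw std::out_of_range("index out of range");

    const std::ptrdiff_t from_cache =
        static_cast<std::ptrdiff_t>(target) -
        static_cast<std::ptrdiff_t>(cached);
    const std::size_t dist_cache =
        static_cast<std::size_t>(from_cache < 0 ? -from_cache : from_cache);

    if (target < dist_cache) {
        // begin() is closer than the cached position.
        it = c.begin();
        std::advance(it, static_cast<std::ptrdiff_t>(target));
    } else {
        // std::list iterators are bidirectional, so a negative step is fine.
        std::advance(it, from_cache);
    }
    cached = target;
}

// Hypothetical holder for the cache state (m_index and m_it in the post).
class cached_list_reader
{
public:
    explicit cached_list_reader(const std::list<int>& l)
        : m_wrapped(l), m_index(0), m_it(l.begin()) {}

    int item(std::size_t i)
    {
        seek_item_by_cached_index(m_wrapped, i, m_index, m_it);
        return *m_it;
    }

private:
    const std::list<int>& m_wrapped;
    std::size_t m_index;
    std::list<int>::const_iterator m_it;
};
```

A fuller version could also consider walking backward from end(), and
would have to reset the cache whenever the list is modified.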

Yes, we know it should return a Py::Object, presumably wrapped around
either a native Python type instance (e.g. Py::Int) or, in the more
complicated case, a Py::PythonExtension-derived instance. It works
fine when the element type of the std::list is something like int,
which we can return as a Py::Int instance.

Imagine, though, if we had std::list<some_pod> as the list type. We
have class py_bound_pod that can be constructed around a reference to
a some_pod instance. But py_bound_pod can't be returned as an instance
of Py::Object. We'd have to create a py_bound_pod on the heap, give it
to a Py::Object to own, and return that, like this:

==================================================
template <class T>
inline typename py_traits<T>::py_type make_object(const T& val)
{
    typedef typename py_traits<T>::py_type py_type;
    return Py::asObject( std::auto_ptr<py_type>( new py_type( val ) ) );
}
==================================================

I also have make_object overloads, keyed on py_traits specializations,
for most of the built-in types; these return things like Py::Int from
int directly:

==================================================
#define DECL_MAKEPY_BY_VAL(ctype) \
inline py_traits<ctype>::py_type make_object(ctype v) \
{ return py_traits<ctype>::py_type( v ); }

#define DECL_MAKEPY_BY_REF(ctype) \
inline py_traits<ctype>::py_type make_object(const ctype& v) \
{ return py_traits<ctype>::py_type( v ); }

DECL_MAKEPY_BY_VAL( int )
DECL_MAKEPY_BY_VAL( long )
DECL_MAKEPY_BY_VAL( float )
DECL_MAKEPY_BY_VAL( double )
DECL_MAKEPY_BY_VAL( char )
DECL_MAKEPY_BY_VAL( const char* )
DECL_MAKEPY_BY_REF( std::string )
==================================================

Granted, it's not complete, but hopefully you can see the problem I'm
trying to solve.
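
To make the traits idea concrete without CXX, here is a simplified
sketch. The fake_py_int and fake_py_string structs are hypothetical
stand-ins for Py::Int and Py::String, and this make_object covers only
the by-value case, not the heap-allocated wrapper case shown earlier:

```cpp
#include <string>

// Stand-ins for Py::Int and Py::String so the sketch compiles without CXX.
struct fake_py_int    { long value; };
struct fake_py_string { std::string value; };

// Primary template left undefined: an unmapped C++ type fails at compile
// time instead of silently picking a wrong Python type.
template <class T> struct py_traits;

template <> struct py_traits<int>         { typedef fake_py_int    py_type; };
template <> struct py_traits<long>        { typedef fake_py_int    py_type; };
template <> struct py_traits<std::string> { typedef fake_py_string py_type; };

// One function template stands in for the per-type macro expansions, for
// types whose Python counterpart can be built directly from the C++ value.
template <class T>
typename py_traits<T>::py_type make_object(const T& v)
{
    typename py_traits<T>::py_type result = { v };
    return result;
}
```

With this shape, adding a new value type means adding one py_traits
specialization rather than another macro line.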

In the first case above, heap-allocating a new
Py::PythonExtension-derived wrapper around the underlying C++ type *on
each call* to sequence_item() produces a bad situation: each read of
the item creates a new Python object. Read that carefully. We'd _like_
to have each read return a new Py::Object instance pointing to the
_same_ underlying reference-counted object. Instead, we get a new
Py::Object instance pointing to a _different_ reference-counted
object, which happens to eventually point to the same wrapped C++
instance. But from Python's point of view, the following
(hypothetical) code would not behave properly:

>>> list = something.get_list_of_pods()
>>> r1 = list[0]
>>> r2 = list[0]
>>> r1 == r2
<?> - should be 1, right?
>>> r1 is r2
<?> - should be 1, right?

Each call to list[0] would create a new Py::PythonExtension-derived
instance. r1 and r2 would not be managing the same reference-counted
object. The _value_ of those objects would probably be the same since
each points to the same C++ instance under the covers, but we'd still
be violating Python's built-in mechanics.

One solution I can see is to create a parallel list that holds the
"one true wrapper" for each of the underlying list elements, and then
return Py::Object instances that point to our "one true wrapper"
instances. That gets a little uglier to manage, and I haven't tried
that yet because I fear I may already be too far off-base.
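
A rough sketch of the "one true wrapper" idea, untested against CXX
itself: std::shared_ptr stands in for Python's reference counting, a
map keyed by element address stands in for the parallel list, and
pod, pod_wrapper and wrapper_cache are hypothetical names:

```cpp
#include <cstddef>
#include <list>
#include <map>
#include <memory>
#include <string>

// Stand-in for a wrapped C++ element type such as some_pod.
struct pod { std::string name; };

// Stand-in for a Py::PythonExtension-derived wrapper; holds a reference only.
struct pod_wrapper
{
    explicit pod_wrapper(const pod& p) : m_ref(p) {}
    const pod& m_ref;
};

// Hands out one shared wrapper per element, keyed by the element's address.
// std::list never relocates its elements, so an address stays valid until
// that element is erased; an erase would also have to drop the cache entry.
class wrapper_cache
{
public:
    typedef std::shared_ptr<pod_wrapper> handle;

    handle wrapper_for(const pod& p)
    {
        handle& slot = m_wrappers[&p];   // creates an empty slot on first use
        if (!slot)
            slot = std::make_shared<pod_wrapper>(p);
        return slot;
    }

    std::size_t size() const { return m_wrappers.size(); }

private:
    std::map<const pod*, handle> m_wrappers;
};
```

Two reads of the same element now yield handles to the same wrapper,
which is what "r1 is r2" needs. The cost is that the cache keeps every
wrapper alive for as long as the cache itself lives.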

This is a long post and it only begins to cover what's plaguing my CXX
progress. I'd be happy to post more complete code if necessary. Any
corrections and clarifications of my (erroneous) assumptions would be
most welcome. Is anyone else trying to do this sort of thing?

-- 
Steven E. Harris
Primus Knowledge Solutions, Inc.
http://www.primus.com