[C++-sig] Tips for exposing classes with own memory management model

Alex Leach beamesleach at gmail.com
Fri Apr 19 14:26:20 CEST 2013


Hi again,

Whilst hoping for a reply, I thought I'd add some further insights I've
learnt about the current PyTypeObject scheme.

On Thu, 18 Apr 2013 17:14:05 +0100, Jim Bosch <jbosch at astro.princeton.edu>
wrote:

>  I was originally thinking that maybe you could get away with  
> essentially wrapping your own classes just using the Python C API  
> directly (i.e. following the approach in the "extending and embedding"  
> tutorial in the official Python docs), but then use Boost.Python to wrap  
> all of your functions and handle type conversion.  But even that seems  
> like it's pretty difficult.

Well, this is basically what I did... I started playing around with
noddy_NoddyType[1], to see what I could learn about Boost.Python's current
way of doing things. I don't want to break my code just yet, nor fall any
further victim to the premature optimisation problem, but here we go anyway...
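
For reference, and in case it saves anyone a trip to [1], the relevant bits
of the noddy.h that gets #included below look roughly like this (paraphrased:
the tutorial actually defines everything inside noddy.c, so I pulled the
struct and the type declaration out into a header myself):

/////////////////////////////////////////////////////////////////
//
// File: noddy.h  (roughly; adapted from [1])

#include <Python.h>

typedef struct {
    PyObject_HEAD
    // No type-specific fields: the simplest possible extension object.
} noddy_NoddyObject;

// Forward declaration of the type object, NOT declared static, so that
// it can be named from other translation units (see the note below).
PyAPI_DATA(PyTypeObject) noddy_NoddyType;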


Exposing a simple type
----------------------

Wrapping noddy_NoddyObject is about as simple as it gets:

/////////////////////////////////////////////////////////////////
//
// File: noddy.hpp

#include "noddy.h" // (code from [1])

// Thin C++ wrapper around the C struct, so that class_<> has a
// default-constructible, destructible type to work with.
struct NoddyClass
       : noddy_NoddyObject
{
       NoddyClass()
           : noddy_NoddyObject() {}
       ~NoddyClass() {}
};


// Register converter for classes derived from NoddyClass.
// i.e. Tell Boost.Python to use noddy_NoddyType members.
//
// N.B. For this to compile, noddy_NoddyType must be forward
// declared, like PyList_Type is.
//
// i.e.
//   PyAPI_DATA(PyTypeObject) noddy_NoddyType;
//
// Then, don't(!) declare noddy_NoddyType as static.

namespace boost { namespace python {

       namespace converter
       {
          template <>
          struct object_manager_traits<NoddyClass>
              : pytype_object_manager_traits<&noddy_NoddyType, NoddyClass>
          {
          };
       }

}   }

/////////////////////////////////////////////////////////////////
//
// File: noddy.cpp

#include "noddy.hpp"
#include <boost/python/module.hpp>
#include <boost/python/class.hpp>

BOOST_PYTHON_MODULE(noddy)
{
       boost::python::class_<NoddyClass, boost::noncopyable>
           ("Noddy", "Empty Python object using custom PyTypeObject",
            boost::python::init<>()[  ...CallPolicies...   ])
           // ...
           ;
}

-----------------

Analysis
--------

Now, I'm not exactly experienced when it comes to disassembling and the
like, but an easy observation to make, from the Python side, is to use
sys.getsizeof() :-

>>> import sys, noddy
>>> nod = noddy.Noddy()
>>> print sys.getsizeof(nod)
80

Okay, 80 bytes apparently. And a plain old Python object?

>>> obj = object()
>>> print sys.getsizeof(obj)
16


If I use only the C-Python API, compiling the code exactly as listed in
[1], then:-

>>> print sys.getsizeof(nod)
16


That's quite a sizeable difference (x5), so I thought I'd look for an
explanation. I found one in a couple of places: the Python wiki [2] and
instance.hpp [3].
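
As a quick cross-check from the C++ side, it's instructive to compare
sizeof(PyObject) with the size of Boost.Python's instance<> template. This
is only a sketch -- the exact figures depend on the platform and Python
build, and the real allocation is bigger still, because the instance_holder
for the wrapped C++ object lives in the same block:

// File: sizes.cpp  (sketch only)
#include <boost/python/object/instance.hpp>
#include <iostream>

int main()
{
    // A plain PyObject is just a refcount and a type pointer.
    std::cout << "sizeof(PyObject)   = " << sizeof(PyObject) << '\n';

    // Boost.Python's per-instance header [3], with its default (char)
    // storage parameter; see the next section for what's inside it.
    std::cout << "sizeof(instance<>) = "
              << sizeof(boost::python::objects::instance<>) << '\n';
    return 0;
}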



Sequence objects
----------------

As mentioned on the wiki, a wrapped object's size includes, amongst other
things, "the extra size required to allow variable-length data in the
instance".

This comes from instance.hpp, where Boost uses the 'PyObject_VAR_HEAD'
macro at the top of the instance template, rather than the plain
'PyObject_HEAD' macro used in simple objects (the difference between the
two is an extra ob_size member).
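
For anyone who doesn't fancy opening [3], the template looks roughly like
this (paraphrased from the 1.53 header, so treat it as a sketch rather than
gospel):

// Paraphrased from boost/python/object/instance.hpp [3]:
template <class Data = char>
struct instance
{
    PyObject_VAR_HEAD         // refcount, type pointer *and* ob_size
    PyObject* dict;           // per-instance __dict__
    PyObject* weakrefs;       // weak-reference list
    instance_holder* objects; // chain of holders for the wrapped C++ object(s)

    // Raw, suitably-aligned storage in which the first holder is
    // constructed in-place (align_t comes from type_with_alignment<>).
    union
    {
        align_t align;
        char bytes[sizeof(Data)];
    } storage;
};

Between the variable-length head, the dict and weakref slots, the holder
pointer and the in-place storage, the 80 bytes start to look a lot less
mysterious.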

A fairer test then, might be to compare a Python list() to a
Boost.Python-wrapped NoddyObject:-

>>> print sys.getsizeof(list())
72

The relative size difference is now much smaller (x10/9), but I can't
account for the remaining 8 bytes. There are apparently "zero or more bytes
of padding required to ensure that the instanceholder is properly aligned."
I don't really understand why this is necessary; isn't the compiler supposed
to decide how to align objects? Well, apparently numpy arrays do the same
thing [4].
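
My best guess is that the padding exists because the holder is constructed
in-place in that raw char buffer, so the alignment has to be arranged by
hand rather than left to the compiler, which only aligns properly-typed
members. A toy illustration of the idea -- nothing below is Boost code, and
all the names are made up:

// Toy illustration only; 'Holder' stands in for Boost's instance_holder
// subclasses, and none of these names exist in Boost.Python.
struct Holder { double d; void* p; };   // suppose this wants 8-byte alignment

struct WithoutUnion
{
    char header[20];                    // some odd-sized header
    char bytes[sizeof(Holder)];         // raw storage, only 1-byte aligned
};

struct WithUnion
{
    char header[20];
    union
    {
        double align;                   // forces 8-byte alignment of the buffer,
        char bytes[sizeof(Holder)];     // which may insert padding before it
    } storage;
};

// sizeof(WithUnion) comes out a few bytes larger than sizeof(WithoutUnion);
// those are the wiki's "zero or more bytes of padding".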


NumPy
-----

So, what about numpy arrays?

>>> print sys.getsizeof(numpy.array([]))
80

Oh, that's impressive! It's identical. With that in mind, I'm inclined to
believe the instance is about as good as it gets for sequence types. But
what about simple numpy datatypes?

>>> print sys.getsizeof( numpy.float64() )
24

Ah, now there's a noticeable difference (x10/3). But I think Boost.Python
objects could be made to match in this respect, given some time and
dedication...
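
My guess, from the figures alone (I haven't gone digging through NumPy's
source to confirm it), is that the scalar is laid out as little more than a
header plus the payload:

// NOT NumPy's actual declaration -- just the arithmetic behind my guess:
typedef struct {
    PyObject_HEAD             // refcount + type pointer: 2 * 8 = 16 bytes
    double obval;             // the value itself:                 8 bytes
} hypothetical_float64;       // 24 bytes: no ob_size, no dict, no weakrefs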


------------------

It doesn't appear that anyone has ever taken issue with this design, but it
seems to me that there is considerable room for improvement in memory
efficiency when it comes to simple datatypes that only ever manage one
thing at a time.

Notes
-----

* I can't see any effect from the object_manager_traits<> specialisation.
It doesn't seem to matter whether I use it or not, but that's probably
because I haven't done anything special with noddy_NoddyType.

* I tried some different CallPolicies on the class_<> init method: the
default; return_internal_reference; and copy_const_reference. None of these
affect the object size (see the sketch after these notes).

* A couple of open-source extension modules where I've seen PyTypeObjects
used directly: the Numpy C-API docs [4] and the PythonQt source code [5].
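
For the second note above, this is roughly what I mean by trying different
CallPolicies on init<> -- a sketch, using default_call_policies as the
example (swapping in the others made no difference to sys.getsizeof()):

#include "noddy.hpp"
#include <boost/python/module.hpp>
#include <boost/python/class.hpp>
#include <boost/python/default_call_policies.hpp>

BOOST_PYTHON_MODULE(noddy)
{
    // Same class_<> as before, but with one of the policies spelled out.
    boost::python::class_<NoddyClass, boost::noncopyable>
        ("Noddy", "Empty Python object using custom PyTypeObject",
         boost::python::init<>()[ boost::python::default_call_policies() ])
        ;
}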

------------------

Now, I'm guessing that there are no plans to improve the memory usage of
bp::object instances. From what Jim has said, and from what I've seen in
'class.hpp', 'object_base.hpp' and other header files, it would probably
require some dramatic modifications. But could it be as simple as adding a
template to instance.hpp? Probably not...

It's not currently within my capacity to rewrite large portions of
Boost.Python; I'm far too new to Boost, and to C++ in general, to really
attempt this, and I've got a thesis to write at the moment, too...


Back to previous use case
-------------------------

From what I've seen in my extension library (and using the classes from my
last email), every ManagedObj could be exposed as a simple datatype. There
are (far fewer) dedicated containers for managing multiple instances of
them; those would still need the current instance system, but most of my
exposed classes would benefit from a smaller Python instance.
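
To make that concrete, the sort of per-object layout I'd be hoping for is
something like the following -- entirely hypothetical, just the shape of the
thing (ManagedObj is the class from my last email):

#include <Python.h>

struct ManagedObj;             // from my last email; details don't matter here

// Hypothetical slim instance: nothing like this exists in Boost.Python today.
typedef struct {
    PyObject_HEAD              // 16 bytes on a 64-bit build
    ManagedObj* obj;           //  8 bytes: the single thing being managed
} SlimManagedObjObject;        // 24 bytes, i.e. numpy.float64 territory,
                               // rather than ~80 for a full instance<>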

If anyone has any thoughts or ideas on how to squeeze extra performance out
of simple PyTypeObjects, please do share!

Kind regards,
Alex

[1] - http://docs.python.org/2/extending/newtypes.html
[2] - http://wiki.python.org/moin/boost.python/InternalDataStructures
[3] - http://www.boost.org/doc/libs/1_53_0/boost/python/object/instance.hpp
[4] - http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html
[5] - http://pythonqt.sourceforge.net/PythonQtClassWrapper_8h_source.html

