[C++-sig] Re: C++/Boost vs. Python object memory footprint

David Abrahams dave at boost-consulting.com
Wed Dec 11 14:22:06 CET 2002


Stephen,

Was my answer helpful?

David Abrahams <dave at boost-consulting.com> writes:

> Stephen Davies <chalky at ieee.org> writes:
>
>> Hi Dave,
>>
>> I was wondering what the memory usage of Synopsis would be if I
>> converted the in-memory AST to be C++ objects with Boost.Python
>> wrappers.
>>
>> I had 350 MB of AST in memory yesterday and my system didn't cope too
>> well. I figure each object has at least one dictionary, which can't be
>> cheap memory-wise. The question is, will I still have proxy objects
>> floating around if I use Boost.Python, or will the Python code directly
>> use the C++ objects without creating the instance dictionary et al.?
>
> All the code for implementing C++ object wrappers is in
> libs/python/src/object/class.cpp.  Instance dictionaries are created
> only "on demand", the first time the instance's __dict__ attribute is
> accessed (see instance_get_dict), but I have no idea whether that
> tends to happen almost always or almost never.
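As a rough pure-Python analogy for the dictionary cost raised in the question (this is not Boost.Python code, and the names `WithDict`, `WithSlots` and `shallow_footprint` are made up for the illustration), the sketch below compares an ordinary instance, whose attributes live in a per-instance `__dict__`, with a `__slots__` instance that has none. Exact byte counts depend on the CPython version and platform, so none are quoted here.

```python
import sys


class WithDict:
    """Ordinary class: attributes are stored in a per-instance __dict__."""

    def __init__(self, name, line):
        self.name = name
        self.line = line


class WithSlots:
    """Slotted class: fixed attribute storage, no per-instance __dict__."""

    __slots__ = ("name", "line")

    def __init__(self, name, line):
        self.name = name
        self.line = line


def shallow_footprint(obj):
    """Size of the object itself plus its instance dictionary, if it has one."""
    size = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        size += sys.getsizeof(obj.__dict__)
    return size


with_dict = WithDict("node", 42)
with_slots = WithSlots("node", 42)
print("with __dict__: ", shallow_footprint(with_dict), "bytes")
print("with __slots__:", shallow_footprint(with_slots), "bytes")
```

The measurement is shallow: it ignores the attribute values themselves, which are the same in both cases.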
>
> In general, a wrapped C++ object with a corresponding Python object is
> the size of a new-style class (derived from 'object' in Python)
> instance, plus the extra size required to allow variable-length data in
> the instance, plus the size of the C++ object, plus the size of a
> vtable pointer, plus a pointer to the C++ object's instance_holder,
> plus zero or more bytes of padding required to ensure that the
> instance_holder is properly aligned.
>
> You can see this in boost/python/object/instance.hpp. Most Python
> objects are represented by instance<value_holder<T> >, for some C++
> class T.
>
> I'm not sure what you mean by your question "will I still have proxy
> objects floating around...?" 
>
> If your C++ data structure contains pointers or smart pointers, you
> can arrange for Python objects to be created which only embed those
> pointers (instance<pointer_holder<Ptr> >). These Python objects will
> be in existence only as long as your Python code holds a reference to
> them. So, for example, it should be possible for Python code to walk
> over your C++ AST with only O(log(N)) Python objects (about the depth
> of a reasonably balanced tree) in existence at any given time for
> those N C++ objects.
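The short-lived wrapper idea can be sketched in plain Python. Everything here is an assumption made for illustration: `Tree` stands in for a C++-owned AST (a complete binary tree addressed by index), and `NodeProxy` plays the role of a small Python object that embeds only a pointer. The live-proxy counter relies on CPython's reference counting running `__del__` promptly; other interpreters may report a higher peak.

```python
class Tree:
    """Stand-in for a C++-owned AST: a complete binary tree addressed by index."""

    def __init__(self, size):
        self.size = size

    def children(self, index):
        # Children of node i live at 2i+1 and 2i+2, if those indices exist.
        return [c for c in (2 * index + 1, 2 * index + 2) if c < self.size]


class NodeProxy:
    """Small wrapper holding only a reference to the tree and a node index."""

    __slots__ = ("tree", "index")
    live = 0  # proxies currently alive
    peak = 0  # highest value 'live' has reached

    def __init__(self, tree, index):
        self.tree = tree
        self.index = index
        NodeProxy.live += 1
        NodeProxy.peak = max(NodeProxy.peak, NodeProxy.live)

    def __del__(self):
        NodeProxy.live -= 1

    def children(self):
        # Proxies are created on demand and dropped once the caller lets go.
        for child_index in self.tree.children(self.index):
            yield NodeProxy(self.tree, child_index)


def count_nodes(proxy):
    """Depth-first walk that visits every node exactly once."""
    total = 1
    for child in proxy.children():
        total += count_nodes(child)
    return total


tree = Tree(2**10 - 1)  # 1023 nodes, 10 levels
root = NodeProxy(tree, 0)
total = count_nodes(root)
print(total, "nodes visited; peak live proxies:", NodeProxy.peak)
```

The peak tracks the depth of the tree rather than the node count, because only the proxies on the current root-to-leaf path are referenced at any moment.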
>
>> A related question is what the speed is like calling the C++ objects vs.
>> normal Python objects. I can't imagine it would be slower. 
>
> I haven't done any tests, but it certainly could be slower if used
> poorly.  There is some overhead at the Python/C++ boundary associated
> with looking for eligible type converters, overloading, etc.  However,
> I imagine that ends up being negligible in most cases.  The best use
> of a Boost.Python C++ binding puts a large amount of computation on
> the C++ side of the language boundary.  One way I could imagine
> slowing down a Python program would be to translate a very large
> number of trivial functions to C++.  C++ function wrappers will
> certainly occupy more memory than the corresponding Python code would,
> and this could eventually affect cache locality.
>
>> The pickling/unpickling speed is also of some concern
>> currently. Would this be affected?
>
> Probably, one way or the other ;-) I think that if you treat your tree
> as one monolithic Python object from the standpoint of pickling, you
> could probably achieve much better speed than you currently have by
> writing some C++ serialization/deserialization code and pickling it as
> one long string.  Whether or not that's practical for you, of course,
> I can't say.
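A minimal Python sketch of pickling a whole tree as one byte string follows. The record layout, `PackedTree` and `_rebuild_tree` are hypothetical, and `struct` packing stands in for the C++ serialization code; with a real Boost.Python binding the blob would be produced on the C++ side.

```python
import pickle
import struct

# One fixed-size record per node: parent index (int32) and payload (int64).
_RECORD = struct.Struct("<iq")


class PackedTree:
    """Hypothetical tree stored as a list of (parent_index, value) pairs."""

    def __init__(self, nodes):
        self.nodes = [tuple(node) for node in nodes]

    def to_bytes(self):
        # Serialize every node into one contiguous byte string.
        return b"".join(_RECORD.pack(parent, value) for parent, value in self.nodes)

    @classmethod
    def from_bytes(cls, blob):
        return cls(_RECORD.iter_unpack(blob))

    def __reduce__(self):
        # Tell pickle to store a single blob instead of one object per node.
        return (_rebuild_tree, (self.to_bytes(),))


def _rebuild_tree(blob):
    """Module-level helper so pickle can locate it by name when loading."""
    return PackedTree.from_bytes(blob)


tree = PackedTree([(-1, 10), (0, 20), (0, 30), (1, 40)])
payload = pickle.dumps(tree)
restored = pickle.loads(payload)
print(len(payload), "bytes pickled;", len(restored.nodes), "nodes restored")
```

The design choice is that pickle sees one opaque object, so its per-object bookkeeping is paid once for the tree rather than once per node.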
>
>> Perhaps you can add these to the Boost.Python FAQ :)
>
> If you'll help me edit them into a suitable form, I'd be most happy
> to!
>
> -- 
>                        David Abrahams
>    dave at boost-consulting.com * http://www.boost-consulting.com
> Boost support, enhancements, training, and commercial distribution





