[C++-sig] Implementation of proper overload resolution

Troy D. Straszheim troy at resophonic.com
Sun Dec 20 00:59:13 CET 2009


Neal Becker <ndbecker2 at gmail.com> writes:

> I am concerned that this doesn't introduce too much overhead for the common 
> case where there is no ambiguity.  I suppose this has been optimized?
>

For the following module:

  #include <boost/python.hpp>
  using namespace boost::python;
  int f(int x, int y, int z) { return x*100 + y*10 + z; }
  BOOST_PYTHON_MODULE(m) { def("f", &f); }

and the following script:

  from m import f
  for i in range(10000000):
      f(i, i, i)

With all optimizations turned on, Boost.Python from the 1.41.0 release
runs this in 4.25 seconds (averaged over several runs) and the new
implementation in 4.38 seconds, an increase of about 3% in wall-clock
time.  This test case is designed to make the performance hit look as
bad as possible: if f() did anything substantial, the relative slowdown
would of course be smaller.
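As a rough sketch of how such a timing comparison might be scripted:
the helper names time_calls and relative_slowdown are made up for
illustration, and the pure-Python f is only a stand-in for the wrapped
function (the real test does 'from m import f').  It assumes Python 3
for time.perf_counter.

```python
import time

def f(x, y, z):
    # Pure-Python stand-in for the wrapped C++ function; the real test
    # would use `from m import f` instead.
    return x * 100 + y * 10 + z

def time_calls(func, n):
    """Return wall-clock seconds taken to call func(i, i, i) n times."""
    start = time.perf_counter()
    for i in range(n):
        func(i, i, i)
    return time.perf_counter() - start

def relative_slowdown(old_seconds, new_seconds):
    """Fractional increase of new_seconds over old_seconds."""
    return (new_seconds - old_seconds) / old_seconds

# Timings quoted above: 4.25 s for 1.41.0, 4.38 s for the new implementation.
slowdown = relative_slowdown(4.25, 4.38)
print("slowdown: %.1f%%" % (slowdown * 100))
```

To compare two real builds one would run time_calls once per build,
several times each, and feed the averages to relative_slowdown.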

To get an idea of exactly how many additional CPU ticks are involved,
I ran a script that just does

   from m import f

under 'valgrind --tool=lackey' and compared the number of 'guest
instructions', a rough measure of how much work the virtual CPU is doing:

  1.41.0:  guest instrs:  26,559,194

  New:     guest instrs:  26,864,330

So a total of about 305k additional instructions are required to create
and destroy the module. If you then add a single call to f() to the script:

   from m import f
   f(1,1,1)

you get:

  1.41.0: guest instrs:  26,593,334

  New:    guest instrs:  26,899,095

So 1.41.0 requires 34,140 of these CPU 'ticks' to call f(), and the new
version requires 34,765 of them: 625 more instructions, or about 2%.
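The subtraction behind those per-call figures can be checked with a few
lines of Python.  The counts are copied from the lackey output above;
the variable names are just illustrative.

```python
# Guest-instruction counts reported by lackey, copied from the figures above.
import_only = {"1.41.0": 26559194, "New": 26864330}
import_and_call = {"1.41.0": 26593334, "New": 26899095}

# Cost of one call = (import + one call) - (import only), per version.
per_call = {v: import_and_call[v] - import_only[v] for v in import_only}

# Extra instructions in the new version: per call, and for import alone.
extra_per_call = per_call["New"] - per_call["1.41.0"]
extra_import = import_only["New"] - import_only["1.41.0"]

print(per_call)
print(extra_per_call, extra_import)
print("%.1f%%" % (100.0 * extra_per_call / per_call["1.41.0"]))
```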

Note that this implementation is also 'fusionized', i.e. where function
calling is concerned, much of the Boost.Preprocessor guts have been
removed and replaced with Boost.Fusion, so it is a little hard to track
where this 2% is actually going.  On the other hand, polymorphic function
objects and Phoenix expressions can be passed to def().

The fusionization also increases library size somewhat, as the
test-module sizes below show.  Presumably the difference would be
smaller if inlining were turned off.

Test Module                             1.41.0    New
------------------------------------    ------    ----
andreas_beyer_ext.so                    120K      120K
args_ext.so                             208K      264K
auto_ptr_ext.so                         132K      156K
back_reference_ext.so                   148K      176K
ben_scott1_ext.so                       112K      132K
bienstman1_ext.so                       72K       72K
bienstman2_ext.so                       80K       96K
bienstman3_ext.so                       44K       60K
builtin_converters_ext.so               428K      544K 
callbacks_ext.so                        200K      240K
const_argument_ext.so                   28K       28K
crossmod_exception_a.so                 24K       28K
crossmod_exception_b.so                 24K       28K
crossmod_opaque_a.so                    28K       36K
crossmod_opaque_b.so                    28K       36K
data_members_ext.so                     320K      372K
defaults_ext.so                         304K      392K
dict_ext.so                             80K       96K
docstring_ext.so                        104K      116K
enum_ext.so                             84K       92K
exception_translator_ext.so             28K       36K
extract_ext.so                          192K      220K
implicit_ext.so                         100K      116K
injected_ext.so                         100K      124K
input_iterator.so                       124K      140K
iterator_ext.so                         324K      364K
list_ext.so                             176K      204K
long_ext.so                             100K      116K
m1.so                                   292K      352K
m2.so                                   128K      156K
map_indexing_suite_ext.so               736K      876K
minimal_ext.so                          20K       20K
multi_arg_constructor_ext.so            48K       60K
nested_ext.so                           112K      124K
object_ext.so                           332K      392K
opaque_ext.so                           80K       96K
operators_ext.so                        228K      268K
pickle1_ext.so                          92K       104K
pickle2_ext.so                          104K      124K
pickle3_ext.so                          116K      136K
pickle4_ext.so                          60K       68K
pointer_vector_ext.so                   220K      252K
polymorphism2_auto_ptr_ext.so           200K      268K
polymorphism_ext.so                     216K      264K
properties_ext.so                       164K      188K
raw_ctor_ext.so                         104K      120K
return_arg_ext.so                       112K      132K
shared_ptr_ext.so                       432K      672K
slice_ext.so                            96K       112K
staticmethod_ext.so                     76K       96K
stl_iterator_ext.so                     116K      128K
str_ext.so                              68K       76K
tuple_ext.so                            68K       80K
vector_indexing_suite_ext.so            612K      720K
virtual_functions_ext.so                176K      216K
voidptr_ext.so                          48K       60K
wrapper_held_type_ext.so                92K       112K
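
To put the size differences in relative terms, here is a small sketch
that computes percentage growth for a handful of rows copied from the
table above.  The helper growth_percent is a made-up name, and the
selection of rows is arbitrary.

```python
# (1.41.0 size, new size) in kilobytes, for a few rows from the table.
sizes_kb = {
    "minimal_ext.so": (20, 20),
    "args_ext.so": (208, 264),
    "builtin_converters_ext.so": (428, 544),
    "shared_ptr_ext.so": (432, 672),
    "map_indexing_suite_ext.so": (736, 876),
}

def growth_percent(old_kb, new_kb):
    """Percentage increase of new_kb over old_kb."""
    return 100.0 * (new_kb - old_kb) / old_kb

# Print one line per module, sorted by name.
for name, (old_kb, new_kb) in sorted(sizes_kb.items()):
    print("%-32s %5.1f%%" % (name, growth_percent(old_kb, new_kb)))
```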

-t






