[C++-sig] Implementation of proper overload resolution
Troy D. Straszheim
troy at resophonic.com
Sun Dec 20 00:59:13 CET 2009
Neal Becker <ndbecker2 at gmail.com> writes:
> I am concerned that this doesn't introduce too much overhead for the common
> case where there is no ambiguity. I suppose this has been optimized?
>
For the following module:
    #include <boost/python.hpp>
    using namespace boost::python;

    int f(int x, int y, int z) { return x*100 + y*10 + z; }
    BOOST_PYTHON_MODULE(m) { def("f", &f); }
and the following script:
    from m import f
    for i in range(10000000):
        f(i, i, i)
With all optimizations turned on, Boost.Python from the 1.41.0 release
runs this in 4.25 seconds (averaged over several runs), and this new
implementation in 4.38 seconds, an increase of about 3% in wall-clock
time. This test case is designed to make the performance hit look as
bad as possible: if f() did anything substantial, the relative
slowdown would of course be smaller.
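
The 3% figure is just the ratio of the two averages. A minimal sketch of
that arithmetic, together with a timing loop of the same shape as the
script above. The names relative_overhead, time_calls and py_f are made
up for illustration, and py_f is a pure-Python stand-in for the wrapped
f(), so the timings it produces say nothing about Boost.Python itself:

```python
import time

def relative_overhead(baseline_s, new_s):
    """Fractional increase of new_s over baseline_s."""
    return (new_s - baseline_s) / baseline_s

def time_calls(func, n):
    """Wall-clock seconds for n calls of func(i, i, i) (hypothetical helper)."""
    start = time.perf_counter()
    for i in range(n):
        func(i, i, i)
    return time.perf_counter() - start

def py_f(x, y, z):
    # Stand-in with the same body as the C++ f(); pass m.f instead
    # to time the real extension module.
    return x * 100 + y * 10 + z

# Averages quoted above: 4.25 s for 1.41.0, 4.38 s for the new code.
overhead = relative_overhead(4.25, 4.38)
print("relative overhead: {:.2%}".format(overhead))

# Small n here so the sketch finishes quickly.
elapsed = time_calls(py_f, 1000)
print("stand-in loop took {:.6f} s".format(elapsed))
```

To compare two builds for real, one would run time_calls(m.f, 10000000)
several times against each build and average, as was done for the
numbers above.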
In an attempt to get an idea of exactly how many additional CPU ticks
are involved, I ran a script that just does
    from m import f
under 'valgrind --tool=lackey' and compared the number of 'guest
instructions', a rough measure of how much work the virtual CPU is doing:
    1.41.0: guest instrs: 26,559,194
    New:    guest instrs: 26,864,330
So an additional 305k instructions in total are required to create and
destroy the module. If you then add a single call to f() to the script:
    from m import f
    f(1,1,1)
you get:
    1.41.0: guest instrs: 26,593,334
    New:    guest instrs: 26,899,095
So 1.41.0 requires 34,140 of these CPU 'ticks' to call f(), and the new
version requires 34,765 of them: 625 more instructions, or about 2%.
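
The per-call figures come from subtracting the import-only totals from
the import-plus-call totals. A small sketch of that subtraction, using
only the four counts quoted above (parse_count and the dictionary names
are illustrative, not part of any tool):

```python
def parse_count(text):
    """Convert a comma-grouped count such as '26,559,194' to an int."""
    return int(text.replace(",", ""))

# Guest-instruction totals as reported above.
import_only = {
    "1.41.0": parse_count("26,559,194"),
    "new": parse_count("26,864,330"),
}
import_and_call = {
    "1.41.0": parse_count("26,593,334"),
    "new": parse_count("26,899,095"),
}

# Cost of one call to f() = (import + call) - (import only).
per_call = {k: import_and_call[k] - import_only[k] for k in import_only}

# Extra instructions for module creation/destruction, and per call.
module_extra = import_only["new"] - import_only["1.41.0"]
call_extra = per_call["new"] - per_call["1.41.0"]
call_extra_pct = 100.0 * call_extra / per_call["1.41.0"]

print("per call:", per_call)
print("module extra:", module_extra)
print("call extra: {} ({:.2f}%)".format(call_extra, call_extra_pct))
```

This reproduces 34,140 and 34,765 instructions per call, a difference of
625 (roughly 1.8%, rounded to 2% above), and 305,136 extra instructions
for module setup and teardown.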
Note that this implementation is also 'fusionized', i.e. where function
calling is concerned, much of the Boost.Preprocessor guts have been
removed and replaced with Boost.Fusion, so it is a little hard to track
where this 2% is actually going. OTOH polymorphic function objects and
Phoenix expressions can be passed to def().
The fusionization also involves some increase in library size.
Presumably the difference would be less if inlining were turned off.
Test Module                           1.41.0   New
------------------------------------  ------  ----
andreas_beyer_ext.so                    120K  120K
args_ext.so                             208K  264K
auto_ptr_ext.so                         132K  156K
back_reference_ext.so                   148K  176K
ben_scott1_ext.so                       112K  132K
bienstman1_ext.so                        72K   72K
bienstman2_ext.so                        80K   96K
bienstman3_ext.so                        44K   60K
builtin_converters_ext.so               428K  544K
callbacks_ext.so                        200K  240K
const_argument_ext.so                    28K   28K
crossmod_exception_a.so                  24K   28K
crossmod_exception_b.so                  24K   28K
crossmod_opaque_a.so                     28K   36K
crossmod_opaque_b.so                     28K   36K
data_members_ext.so                     320K  372K
defaults_ext.so                         304K  392K
dict_ext.so                              80K   96K
docstring_ext.so                        104K  116K
enum_ext.so                              84K   92K
exception_translator_ext.so              28K   36K
extract_ext.so                          192K  220K
implicit_ext.so                         100K  116K
injected_ext.so                         100K  124K
input_iterator.so                       124K  140K
iterator_ext.so                         324K  364K
list_ext.so                             176K  204K
long_ext.so                             100K  116K
m1.so                                   292K  352K
m2.so                                   128K  156K
map_indexing_suite_ext.so               736K  876K
minimal_ext.so                           20K   20K
multi_arg_constructor_ext.so             48K   60K
nested_ext.so                           112K  124K
object_ext.so                           332K  392K
opaque_ext.so                            80K   96K
operators_ext.so                        228K  268K
pickle1_ext.so                           92K  104K
pickle2_ext.so                          104K  124K
pickle3_ext.so                          116K  136K
pickle4_ext.so                           60K   68K
pointer_vector_ext.so                   220K  252K
polymorphism2_auto_ptr_ext.so           200K  268K
polymorphism_ext.so                     216K  264K
properties_ext.so                       164K  188K
raw_ctor_ext.so                         104K  120K
return_arg_ext.so                       112K  132K
shared_ptr_ext.so                       432K  672K
slice_ext.so                             96K  112K
staticmethod_ext.so                      76K   96K
stl_iterator_ext.so                     116K  128K
str_ext.so                               68K   76K
tuple_ext.so                             68K   80K
vector_indexing_suite_ext.so            612K  720K
virtual_functions_ext.so                176K  216K
voidptr_ext.so                           48K   60K
wrapper_held_type_ext.so                 92K  112K
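
To read the size increase in relative terms, here is a rough sketch that
computes percentage growth for a handful of rows copied from the table.
The helper growth_pct and the choice of rows are only for illustration;
pasting in the remaining rows works the same way:

```python
# A few rows copied verbatim from the table: name, 1.41.0 size, new size.
rows = """
minimal_ext.so 20K 20K
builtin_converters_ext.so 428K 544K
shared_ptr_ext.so 432K 672K
map_indexing_suite_ext.so 736K 876K
"""

def growth_pct(old_kb, new_kb):
    """Percentage increase from old_kb to new_kb."""
    return 100.0 * (new_kb - old_kb) / old_kb

results = {}
for line in rows.strip().splitlines():
    name, old, new = line.split()
    # Sizes are given in kilobytes with a trailing 'K'.
    old_kb, new_kb = int(old.rstrip("K")), int(new.rstrip("K"))
    results[name] = growth_pct(old_kb, new_kb)
    print("{:<28}{:6.1f}%".format(name, results[name]))
```

Among these rows, minimal_ext.so does not grow at all, while
shared_ptr_ext.so grows by a bit over half.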
-t