[capi-sig] running C method, benchmarks

Wed Dec 24 22:30:53 CET 2008

Stefan Behnel wrote:
> Campbell Barton wrote:
>> Discussing if it's worth moving Py/C functions from METH_VARARGS to
>> METH_O when they only receive 1 argument on the PyGame mailing list.
>>
>> tested different ways to evaluate args to see how much speed
>> difference there was
>> * 10,000,000 tests, python 2.6 on 32bit arch linux
>> * included a pass and NOARGS method to see the difference in overhead
>> of the loop and parsing an arg compared to running a method with no
>> args.
>>
>> ---- output
>> pass 1.85659885406
>> METH_NOARGS 3.24079704285
>> METH_O 3.66321516037
>> METH_VARARGS 6.09881997108
>> METH_KEYWORDS 6.037307024
>> METH_KEYWORDS (as keyword) 10.9263861179
> 
> I tried doing something similar in Cython, but it's not directly
> comparable. Cython uses optimised code instead of a generic call to
> ParseTupleAndKeywords and it will always give you a METH_O function when
> you only use one argument. Anyway, here are the numbers. I used the latest
> Cython developer version with gcc 4.1.3 on Linux.
> 
> Benchmarked code:
> 
> -----------------
> def f0(): pass             # METH_NOARGS
> def f1(a): pass            # METH_O
> def f1opt(a=1): pass       # METH_VARARGS|METH_KEYWORDS
> def f2(a,b): pass          # METH_VARARGS|METH_KEYWORDS
> def f2opt(a=1,b=2): pass   # METH_VARARGS|METH_KEYWORDS
> -----------------
> 
> Benchmarks:
> 
> $ python2.5 -m timeit -s '...; from calltest import f0' 'f0()'
> 10000000 loops, best of 3: 0.126 usec per loop
> $ python2.5 -m timeit -s '...; from calltest import f1opt' 'f1opt()'
> 10000000 loops, best of 3: 0.14 usec per loop
> $ python2.5 -m timeit -s '...; from calltest import f2opt' 'f2opt()'
> 10000000 loops, best of 3: 0.141 usec per loop
> $ python2.5 -m timeit -s '...; from calltest import f1' 'f1(1)'
> 10000000 loops, best of 3: 0.145 usec per loop
> $ python2.5 -m timeit -s '...; from calltest import f2' 'f2(1,2)'
> 1000000 loops, best of 3: 0.225 usec per loop
> $ python2.5 -m timeit -s '...; from calltest import f2' 'f2(1,b=2)'
> 1000000 loops, best of 3: 0.489 usec per loop

I noticed that I forgot to run one interesting test, which is calling
f2opt() with a single argument in comparison to calling f1() with one argument:

$ python2.5 -m timeit -s '...; from calltest import f1' 'f1(1)'
10000000 loops, best of 3: 0.145 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f2opt' 'f2opt(1)'
1000000 loops, best of 3: 0.204 usec per loop
$ python2.5 -m timeit -s '...; from calltest import f1opt' 'f1opt(1)'
1000000 loops, best of 3: 0.204 usec per loop

So, yes, calling f2opt(1) or f1opt(1) actually is slower than calling the
METH_O function f1(1). It's not 66% as in your example (METH_VARARGS versus
METH_O), more like 40% (0.204 versus 0.145 usec), but it definitely is a lot
slower. So I would say that the overhead of calling a
METH_VARARGS|METH_KEYWORDS function versus a METH_O function is somewhere on
the order of 40% when only positional arguments are involved.
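
For anyone who wants to check the percentages, here is a small sketch of the
arithmetic. The timings are copied from the numbers quoted in this thread; the
helper name overhead_percent is just made up for illustration, and neither
ratio subtracts the loop overhead.

```python
# Timings copied from the figures quoted in this thread.
cython_usec = {"f1(1)": 0.145, "f2opt(1)": 0.204}          # usec per loop
capi_sec = {"METH_O": 3.66321516037,                        # seconds for
            "METH_VARARGS": 6.09881997108}                  # 10,000,000 calls


def overhead_percent(slow, fast):
    """Return how much slower `slow` is than `fast`, in percent.

    Hypothetical helper, not part of any library.
    """
    return (slow / fast - 1.0) * 100.0


# Cython: METH_VARARGS|METH_KEYWORDS (f2opt) versus METH_O (f1)
cython_overhead = overhead_percent(cython_usec["f2opt(1)"],
                                   cython_usec["f1(1)"])

# Hand-written C-API functions: METH_VARARGS versus METH_O
capi_overhead = overhead_percent(capi_sec["METH_VARARGS"],
                                 capi_sec["METH_O"])

print("Cython overhead: %.1f%%" % cython_overhead)
print("C-API overhead:  %.1f%%" % capi_overhead)
```

Both ratios are taken on raw timings, so the cost of the timing loop itself is
included on both sides of each division.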

Stefan