Python speed vs csharp

Mon Aug 4 04:25:32 EDT 2003

Siegfried Gonzi wrote:
   ...
> I am sure you know what you are after, but  Python for numerical
> computations is more or less crap. Not only has Python a crippled

...Note for the unwary...:
It's interesting that "gonzi", in Italian, means "gullible" (masculine
plural adjective, also masculine plural noun "gullible individuals") --
this may or may not give a hint to the purpose of this troll/flamewar:-).

Anyway, a challenge is a challenge, so:

> times. Also included a C program. The timings on a simple Celeron
> laptop are as follows:
> 
> g++ -O3 erf.c : 0.5 seconds
> bigloo -Obench erf.scm: 1.1 seconds

Never having been as silly as to purchase such a deliberately crippled
chip as a Celeron, I cannot, unfortunately, reproduce these timings --
I will, however, offer my own timings on a somewhat decent CPU, an old
but reasonably well-configured early Athlon.  No bigloo here, so I
started with the C program:

> ;;; C ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> #include <stdio.h>
> #include <math.h>
> 
> 
> double erfc( double x )
> {
>     double p, a1, a2, a3, a4, a5;
>     double t, erfcx;
> 
>     p  =  0.3275911;
>     a1 =  0.254829592;
>     a2 = -0.284496736;
>     a3 =  1.421413741;
>     a4 = -1.453152027;
>     a5 =  1.061405429;
> 
>     t = 1.0 / (1.0 + p*x);
>     erfcx = ( (a1 + (a2 + (a3 +
> (a4 + a5*t)*t)*t)*t)*t ) * exp(-pow(x,2.0));
> 
>     return erfcx;
> }
> 
> int main()
> {
>     double erg=0.0;
>     int i;
> 
>     for(i=0; i<1000000; i++)
>     {
>        erg += erfc(0.456);
>     }
>     printf("%f",erg);
> 
>     return 1;
> }
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

I saved this to gonzi.c, compiled it, and ran/timed it:

[alex@lancelot swig_wrappers]$ gcc -O -o gonzi gonzi.c -lm
[alex@lancelot swig_wrappers]$ time ./gonzi
519003.933831Command exited with non-zero status 1
0.37user 0.01system 0:00.39elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (107major+13minor)pagefaults 0swaps

(that silly "return 1" must be connected to some in-joke or other...
why's the program unconditionally returning an error-indication!?-)

Then I "converted" it to Python, a trivial exercise to be sure:

import math

def erfc(x):
    exp = math.exp

    p  =  0.3275911
    a1 =  0.254829592
    a2 = -0.284496736
    a3 =  1.421413741
    a4 = -1.453152027
    a5 =  1.061405429

    t = 1.0 / (1.0 + p*x)
    erfcx = ( (a1 + (a2 + (a3 +
                          (a4 + a5*t)*t)*t)*t)*t ) * exp(-x*x)
    return erfcx

def main():
    erg = 0.0

    for i in xrange(1000000):
        erg += erfc(0.456)

    print "%f" % erg

if __name__ == '__main__':
    main()

and ran that:

[alex@lancelot swig_wrappers]$ time python gonzi.py
519003.933831
5.32user 0.05system 0:05.44elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (473major+270minor)pagefaults 0swaps
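
As a sanity check on the port, here is a minimal sketch in Python 3 (not
the Python 2 used for the timings in this post).  The constants look like
those of the Abramowitz and Stegun polynomial approximation 7.1.26, whose
stated error bound is 1.5e-7; that attribution is an assumption on my
part, not something the original code says.  The result is compared
against math.erfc from the standard library.  The name erfc_approx is
hypothetical, chosen only to avoid shadowing math.erfc.

```python
import math

def erfc_approx(x):
    """Polynomial approximation of erfc(x) for x >= 0, same constants as gonzi.py."""
    p = 0.3275911
    a1 = 0.254829592
    a2 = -0.284496736
    a3 = 1.421413741
    a4 = -1.453152027
    a5 = 1.061405429

    t = 1.0 / (1.0 + p * x)
    # Horner-style evaluation of the degree-5 polynomial in t
    poly = (a1 + (a2 + (a3 + (a4 + a5 * t) * t) * t) * t) * t
    return poly * math.exp(-x * x)

if __name__ == "__main__":
    x = 0.456
    print("approx:", erfc_approx(x))
    print("exact: ", math.erfc(x))
    print("diff:  ", abs(erfc_approx(x) - math.erfc(x)))
```

The difference should stay well below 1e-6 for this argument, which is
consistent with the total printed by the one-million-iteration loop.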

then I put a huge effort into optimization -- specifically, I inserted
after the "import math" the two full lines:

import psyco
psyco.full()

[must have taken me AT LEAST 3 seconds of work, maybe as much as 4] and
ran it again:

[alex@lancelot swig_wrappers]$ time python gonzi.py
519003.933831
0.15user 0.02system 0:00.24elapsed 69%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (547major+319minor)pagefaults 0swaps

There -- optimized Python over twice as fast as optimized C in terms of
user-mode CPU consumption (0.15 vs 0.37 seconds), and about 1.6 times as
fast in terms of elapsed time (0.24 vs 0.39 seconds).  (Of course, using
programs as tiny and fast as this one for such purposes is not clever,
since they run so fast that their speed is hard to measure, but "oh
well").  No need to declare types, of course, since psyco can easily
infer them.

Of course, there IS a "subtle" trick here -- I write x*x rather than
pow(x,2.0).  Anybody whose instincts fail to rebel against "pow(x,2.0)"
(or x**2, for that matter) doesn't _deserve_ to do numeric computation...
it's the first trick in the book, routinely taught to every freshman
programmer before they're entrusted to punch in the first card of their
first Fortran program (or at least, so it was in my time in university,
both as a student and as a professor).
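
To see the effect of the trick from within Python itself, a rough sketch
using the standard timeit module follows (Python 3 syntax).  The helper
names square_variants and time_variants are hypothetical, and the default
iteration counts are arbitrary.  How large the gap is, or whether there is
one at all, depends on the interpreter and machine, so no timings are
claimed here.

```python
import math
import timeit

def square_variants(x):
    """Return x squared computed three ways; the values should agree."""
    return x * x, x ** 2, math.pow(x, 2.0)

def time_variants(x=0.456, number=200000, repeat=3):
    """Best-of-`repeat` timings in seconds for each way of squaring x."""
    env = {"x": x, "pow": math.pow}
    stmts = {
        "x*x": "x*x",
        "x**2": "x**2",
        "pow(x,2.0)": "pow(x, 2.0)",
    }
    # Taking the minimum of several repeats reduces noise from other processes.
    return {
        label: min(timeit.repeat(stmt, globals=env, number=number, repeat=repeat))
        for label, stmt in stmts.items()
    }

if __name__ == "__main__":
    for label, seconds in time_variants().items():
        print("%-12s %.4f s" % (label, seconds))
```

Repeating the measurement and keeping the best run also addresses the
point that very short programs are hard to time reliably.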

Correcting this idiocy in the C code (and replacing the return 1 with
return 0 while we're at it -- the only sensible choice, of course:-) we
turn C's performance from the previous:

[alex@lancelot swig_wrappers]$ time ./gonzi
519003.933831Command exited with non-zero status 1
0.37user 0.01system 0:00.39elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (107major+13minor)pagefaults 0swaps

to a more respectable:

[alex@lancelot swig_wrappers]$ time ./gonzi
519003.9338310.20user 0.00system 0:00.19elapsed 103%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (105major+13minor)pagefaults 0swaps

(the "103%" of CPU being a funny artefact of my Linux kernel, no doubt).

So, about HALF the elapsed time was being squandered in that absurd call
to pow -- so much for Mr Gonzi's abilities as a numerical programmer:-).

With the program having been made decent, C is faster than optimized Python
again, not in CPU time (0.20 all usermode for C, 0.15 usermode + 0.02
systemmode for Python) but in elapsed time (0.19 for C vs 0.24 for
Python) -- the inversion being accounted for by psyco's lavish use of
memory and the resulting paging (547 major page faults vs 105 for C).
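
The distinction between CPU time and elapsed time can also be observed
from inside a Python 3 program.  The sketch below uses
time.process_time() (user plus system CPU time of the current process)
and time.perf_counter() (wall-clock time).  The names measure and
busy_sum are hypothetical, and unlike the shell's time command this does
not include interpreter startup or module loading.

```python
import time

def measure(func, *args):
    """Run func(*args) once; return (result, cpu_seconds, elapsed_seconds)."""
    cpu_start = time.process_time()    # user + system CPU time of this process
    wall_start = time.perf_counter()   # wall-clock time
    result = func(*args)
    elapsed = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, cpu, elapsed

def busy_sum(n):
    """Hypothetical CPU-bound workload, used only for illustration."""
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total

if __name__ == "__main__":
    result, cpu, elapsed = measure(busy_sum, 1000000)
    print("cpu %.2fs, elapsed %.2fs" % (cpu, elapsed))
```

When elapsed time noticeably exceeds CPU time, the process was waiting
on something other than computation, such as paging or I/O.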

Still, the performance ratio of psyco-optimized Python vs C on my
machine is much better than that of bigloo vs C on Mr Gonzi's Celeron,
and with no need for type declarations either.  So much, then, for
Mr Gonzi's claims about Python being "more or less crap"!-)
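
For the record, the arithmetic behind that comparison can be sketched as
follows (Python 3).  The numbers are copied from the timings quoted in
this post; the dictionary keys and the slowdown helper are hypothetical
names.  Note that the two ratios come from different machines and
different compiler flags, so this is only a rough comparison.

```python
# Timings in seconds, copied from the runs quoted in this post.
timings = {
    "bigloo_vs_c_celeron": (1.1, 0.5),      # bigloo -Obench vs g++ -O3
    "psyco_vs_c_elapsed": (0.24, 0.19),     # elapsed time, C using x*x
    "psyco_vs_c_cpu": (0.15 + 0.02, 0.20),  # user + system CPU time
}

def slowdown(pair):
    """Ratio of the first timing to the C timing (1.0 means equal speed)."""
    other, c = pair
    return other / c

ratios = {name: slowdown(pair) for name, pair in timings.items()}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print("%-22s %.2f" % (name, ratio))
```

This gives a factor of about 2.2 for bigloo against C, versus about 1.26
in elapsed time (and below 1 in CPU time) for psyco against C.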

Alex