Side by side comparison - CPython, nuitka, PyPy

Mon Dec 24 09:24:11 EST 2018

Anthony Flury via Python-list wrote on 21.12.18 at 09:06:
> I thought I would look at a side by side comparison of CPython, nuitka and
> PyPy

Interesting choice. Why nuitka?

> *The functionality under test*
> 
> I have a library (called primelib) which implements a Sieve of Eratosthenes
> in pure Python - it was originally written as part of my Project Euler attempts
> 
> Not only does it build a sieve to test primality, it also builds an
> iterable list of primes, and has functionality to calculate the prime
> factors (and exponents) and also calculate all divisors of a given integer
> (up to the size of the sieve).
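
For readers who have not seen one, a Sieve of Eratosthenes can be sketched
in a few lines of pure Python. This is only an illustration and not the
actual primelib code, which is not shown in this thread; the names
build_sieve and primes_up_to are made up for the example.

```python
def build_sieve(limit):
    """Return a list where is_prime[n] is True exactly when n is prime.

    Hypothetical helper for illustration; not taken from primelib.
    """
    is_prime = [True] * (limit + 1)
    # 0 and 1 are not prime (the min() guards against very small limits).
    is_prime[0:2] = [False] * min(2, limit + 1)
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Smaller multiples of n were already crossed off by smaller
            # primes, so start at n * n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return is_prime


def primes_up_to(limit):
    """Return the sorted list of all primes <= limit."""
    return [n for n, flag in enumerate(build_sieve(limit)) if flag]
```

The list returned by build_sieve answers primality queries by index, and
primes_up_to gives the iterable list of primes mentioned above.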
> 
> To test the primelib there is a simple harness which:
> 
>  * Builds a sieve for integers from 2 to 104729 (104729 is the 10,000th
>    prime number)
>  * Using a pre-built list from primes.utm.edu -
>      o For every integer from 2 to 104729, confirm that the prime sieve
>        and the pre-built list agree on its primality or non-primality
>      o Confirm that the list of ALL primes identified by the sieve is
>        the same as the pre-built list
>      o For every integer from 2 to 104729, get primelib to generate the
>        prime factors and exponents, and confirm that they multiply up
>        to the expected integer
>      o For every integer from 2 to 104729, get primelib to generate the
>        divisors of the integer, and confirm that each divisor divides
>        cleanly into the integer
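
The factor and divisor checks in that list could look roughly like the
sketch below. Note the assumptions: it uses plain trial division rather
than a sieve-based lookup, so it is not how primelib does it, and the
function names (prime_factors, divisors, check_range) are invented for
the example.

```python
def prime_factors(n):
    """Return [(prime, exponent), ...] for n >= 2, by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:
        # Whatever is left after dividing out small primes is itself prime.
        factors.append((n, 1))
    return factors


def divisors(n):
    """Return all positive divisors of n, built from its factorisation."""
    result = [1]
    for prime, exponent in prime_factors(n):
        powers = [prime ** k for k in range(exponent + 1)]
        result = [d * q for d in result for q in powers]
    return sorted(result)


def check_range(limit):
    """Run both consistency checks for every integer from 2 to limit."""
    for n in range(2, limit + 1):
        product = 1
        for prime, exponent in prime_factors(n):
            product *= prime ** exponent
        assert product == n, "factors of %d do not multiply back" % n
        for d in divisors(n):
            assert n % d == 0, "%d does not divide %d" % (d, n)
    return True
```

Calling check_range(104729) would cover the same range as the harness
described above.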
> 
> The sieve is rebuilt between tests; there is no caching of data between
> test cases, so the test harness forces a lot of recalculations.
> 
> I have yet to convert primelib to be Python 3 compatible.
> 
> Exactly the same test harness was run in all 3 cases:
> 
>  * Under CPython 2.7.15, the test harness took around 75 seconds
>    over 5 runs - fastest 73, slowest 78.
>  * Under Nuitka 0.6, the test harness after compilation took around
>    85 seconds over 5 runs - fastest 84, slowest 86.
>  * Under PyPy, the test harness took 4.9 seconds on average over
>    5 runs - fastest 4.79, slowest 5.2.
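
Fastest, slowest and average figures like these can be collected with a
small timing helper. The sketch below is an assumption about how one
might do it, not the method used for the numbers above; time_runs is a
made-up name, and time.perf_counter needs Python 3.3 or later (on
Python 2.7, time.time would be the nearest substitute).

```python
import time


def time_runs(func, runs=5):
    """Call func() `runs` times; return (fastest, slowest, mean) in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return min(durations), max(durations), sum(durations) / len(durations)
```

For example, time_runs(run_harness) would time five runs of a
hypothetical run_harness function inside the same process. For JIT-based
runtimes such as PyPy, later runs in one process may benefit from
warm-up, so timing separate process invocations can give different
numbers.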
> 
> I was very impressed at the execution time improvement under PyPy, and a
> little surprised about the lack of improvement under Nuitka.
> 
> I know Nuitka is a work in progress, but given that Nuitka compiles Python
> to C code I would have expected some level of gain, especially in a
> maths-heavy implementation.

It compiles to C, yes, but that by itself doesn't make the code run faster.
Remember that CPython is also written in C, so why should a simple static
translation from Python code to C run faster than the same code in CPython?

Cython [1], on the other hand, is an optimising Python-to-C compiler, which
aims to generate fast code and allow users to manually tune it. That's when
you start getting real speedups that are relevant for real-world code.

> This comparison is provided for information only, and is not intended as
> any form of formal benchmark. I don't claim that primelib is as efficient
> as it could be - although every effort was made to try to make it as fast
> as I could.

I understand that it came to life as an exercise, and you probably won't
make production use of it. Actually, I doubt that there is a shortage of
prime detection libraries. ;) Still, thanks for the writeup. It's helpful
to see comparisons of "code how people write it" under different runtimes
from time to time.

Stefan

[1] http://cython.org/