How to make Python run as fast (or faster) than Julia

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Feb 23 21:46:02 EST 2018


On Fri, 23 Feb 2018 12:43:06 -0600, Python wrote:

> Even if testing optimized
> code is the point, as the article claims, it utterly fails to do that. 
> Bad science.

You've used that statement two or three times now.

*This isn't science*.

There's nothing scientific, or even objective, about writing benchmarks. 
It is subjective choices through and through, given a paper-thin patina 
of objectivity because the results include numbers.

But those numbers depend on the precise implementation of the benchmark. 
They depend on the machine you run them on, sometimes strongly enough 
that the order of which language is faster can swap. I remember a bug in 
Python's urllib module, I think it was, that made code using it literally 
hundreds of times slower on Windows than Linux or OS X.

The choice of algorithms used is not objective, or fair. Most of it is 
tradition: the famous "Whetstone" benchmark apparently measures something 
which has little or no connection to anything software developers should 
care about. It, like the Dhrystone variant, was invented to benchmark 
CPU performance. The relevance to comparing languages is virtually zero.

    "As this data reveals, Dhrystone is not a particularly 
    representative sample of the kinds of instruction sequences
    that are typical of today's applications. The majority of 
    embedded applications make little use of the C libraries
    for example, and even desktop applications are unlikely to
    have such a high weighting of a very small number of
    specific library calls."

http://dell.docjava.com/courses/cr346/data/papers/DhrystoneMIPS-CriticismbyARM.pdf


Take the Fibonacci double-recursion benchmark. Okay, it tests how well 
your language does at making millions of function calls. Why? How often 
do you make millions of function calls? For most application code, 
executing the function is far more costly than the overhead of calling 
it, and the call overhead is dwarfed by the rest of the application.

For many, many applications, the *entire* program run could take orders 
of magnitude fewer function calls than a single call to fib(38).
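
To put a rough number on that, here is a minimal sketch. The helper 
names (fib_counting, fib_iter, naive_call_count) are made up for 
illustration and are not taken from the benchmark under discussion. It 
tallies the calls made by a naive double-recursive Fibonacci for a small 
argument, then uses the identity calls(n) = 2*fib(n+1) - 1 to get the 
figure for fib(38) without actually running it.

```python
def fib_counting(n, counter):
    """Naive double-recursive Fibonacci that tallies every call made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_counting(n - 1, counter) + fib_counting(n - 2, counter)


def fib_iter(n):
    """Iterative Fibonacci, used only to evaluate the call-count formula."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def naive_call_count(n):
    """Number of calls the naive version makes: 2*fib(n+1) - 1."""
    return 2 * fib_iter(n + 1) - 1


counter = [0]
result = fib_counting(20, counter)
print(result, counter[0])      # 6765 21891
print(naive_call_count(38))    # 126491971
```

So a single fib(38), written the naive way, makes roughly 126 million 
function calls, which is the number to hold up against the call count 
of an ordinary application run.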

If you have a language with tail recursion elimination, you can bet 
that its benchmarks will include examples of tail recursion and tail 
recursion will be a favoured idiom in that language. If it doesn't, it 
won't.
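
As an illustration of the "if it doesn't" case, CPython does not 
eliminate tail calls, so the sketch below (function names are 
hypothetical, and it assumes the interpreter's recursion limit is left 
at or near its default) shows a tail-recursive sum running out of stack 
where the equivalent loop does not.

```python
def total_tail(n, acc=0):
    """Sum 0..n, written so the recursive call is in tail position."""
    if n == 0:
        return acc
    return total_tail(n - 1, acc + n)


def total_loop(n):
    """The same sum as a plain loop, the idiom Python code tends to use."""
    acc = 0
    while n:
        acc += n
        n -= 1
    return acc


def overflows(n):
    """Return True if total_tail(n) exhausts the recursion limit."""
    try:
        total_tail(n)
    except RecursionError:
        return True
    return False


print(total_tail(100))        # 5050
print(overflows(100_000))     # True with the default recursion limit
print(total_loop(100_000))    # 5000050000
```

A benchmark built around the first form would flatter a language that 
optimises it and penalise one that doesn't, regardless of how either 
language is normally written.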

I'm going to end with a quote:

    "And of course, the very success of a benchmark program is
    a danger in that people may tune their compilers and/or
    hardware to it, and with this action make it less useful."

    Reinhold P. Weicker, Siemens AG, April 1989
    Author of the Dhrystone Benchmark



-- 
Steve



