How to make Python run as fast (or faster) than Julia

Steven D'Aprano steve+comp.lang.python at pearwood.info
Mon Feb 26 05:36:49 EST 2018


On Sun, 25 Feb 2018 19:26:12 -0800, Rick Johnson wrote:

> On Friday, February 23, 2018 at 8:48:55 PM UTC-6, Steven D'Aprano wrote:
> [...]
>> Take the Fibonacci double-recursion benchmark. Okay, it tests how well
>> your language does at making millions of function calls. Why?
> 
> Because making "millions of function calls" is what software *DOES*!

You're right -- I worded that badly, and ended up saying something I 
didn't really mean. Mea culpa.

Of course, over the lifetime of a run, most non-trivial 
programs will end up making millions (if not billions) of function 
calls *in total*. But they're not all happening *together*.

What I was trying to get across is: how often do we need millions of 
function calls *on the stack at once* to compute a single value? By 
default, Python will only allow one thousand function calls on the 
stack before raising RecursionError.

py> def bad():
...     return bad()
...
py> bad()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in bad
  [ previous line repeats 998 times ]
RecursionError: maximum recursion depth exceeded


There's nothing unique about recursion that will trigger this error. Any 
call stack of more than 1000 function calls will do it. So forget about 
code getting a million functions deep. The average is probably more like 
a dozen, perhaps a hundred tops.
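You can inspect the limit on your own interpreter. A minimal sketch 
(the exact numbers are implementation- and version-dependent; CPython's 
default is 1000):

```python
import sys

# CPython ships with a default recursion limit of 1000 frames.
print(sys.getrecursionlimit())

def depth(n=0):
    """Count how deep we can actually recurse before the limit bites."""
    try:
        return depth(n + 1)
    except RecursionError:
        return n

# Somewhat less than the limit, because the frames already on the
# stack (the REPL, the initial call to depth) count against it.
print(depth())
```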


[...]
>> For most application code, executing the function is far more costly
>> than the overhead of calling it,
> 
> What? Your functions are only called _once_ during the lifetime of your
> code execution? Enough with the leg pulling.

Let me explain what I intended to say.

Here is a really rough estimate of the overhead of calling a function on 
my computer, measured in IPython.

In [50]: def call():
   ....:     pass
   ....:
In [51]: %timeit call()
1000000 loops, best of 3: 717 ns per loop

So, on my computer, in IPython, it takes 717 ns to call a function that 
does nothing. How long does it take to do the tiniest smidgen of useful 
work?

In [52]: def do_something():
   ....:     a = 1
   ....:     b = a+1
   ....:     return b
   ....:

In [53]: %timeit do_something()
1000000 loops, best of 3: 923 ns per loop

Only an extra 200 ns! Wow, function calls are expensive!!!
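(You can reproduce the same comparison without IPython, using the 
standard library's timeit module. The absolute numbers are machine-
dependent, so treat this as a sketch of the method, not the results.)

```python
from timeit import timeit

def call():
    pass

def do_something():
    a = 1
    b = a + 1
    return b

n = 100_000
# Average cost per call, converted to nanoseconds.
overhead = timeit(call, number=n) / n * 1e9
work = timeit(do_something, number=n) / n * 1e9
print(f"bare call: {overhead:.0f} ns, with a little work: {work:.0f} ns")
```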

(Well, yes, I knew that.) However, let's do some *real* work, not a 
Mickey Mouse function like do_something().

In [54]: import pyprimes

In [55]: %timeit pyprimes.nth_prime(100000)
1 loops, best of 3: 2.6 s per loop


I happen to know that calling nth_prime ends up making a couple of 
hundred function calls. (Because I wrote it.) Let's call it 10,000 
function calls, to be on the safe side. So 10,000 x 800 nanoseconds is 
0.008 second, let's call it 0.01 second, which is just 0.4% of the total 
cost of calculating the number I was after.

(For the record, that's 1299709.)

So just a trivial fraction of the total cost of the function.
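Spelling out that back-of-the-envelope arithmetic, with the same 
deliberately generous overestimates as above:

```python
calls = 10_000       # generous: the real count is a couple of hundred
per_call = 800e-9    # seconds of overhead per call, rounded up from 717 ns
total = 2.6          # seconds for pyprimes.nth_prime(100000), as timed above

overhead = calls * per_call
print(f"{overhead:.3f} s of call overhead, {overhead / total:.1%} of the total")
# 0.008 s of call overhead, 0.3% of the total
```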

Now I know that this is not always the case. But I maintain that for 
*most* (but not all) practical, real-world code that actually does 
something useful, the overhead of calling the functions is *usually* (but 
not always) small compared to the cost of actually doing whatever it is 
that the functions do.



-- 
Steve



