Proper tail recursion

Sat Jul 10 15:15:57 EDT 2004

Tim Peters wrote:
> WHen fiddling with Python internals, build Python in debug mode.  That
> enables many checks that will save you weeks of debugging.

Thanks! I found the bug -- I was just forgetting to reset the stack 
after a call_function.

I have a preliminary implementation ready, the patch is against 2.4a1: 
http://users.wpi.edu/~squirrel/temp/tailcall.diff.gz

This patch only works for simple functions (i.e. those that can be 
handled by fast_function), but extension to other function types should 
be trivial. I haven't fully tested this with regards to exception 
handling and whatnot, so I want to stick with a simple case for now.

The patch works roughly as follows:
1. When a CALL_FUNCTION opcode is encountered, check if the following 
opcode is RETURN_VALUE. If so, execute the tail call code.
2. The tail call code calls a modified version of call_function that 
sets up and returns a frame object for the function to be called (if 
possible), rather than recursively calling PyEval_EvalFrame. This frame 
is stored in a temporary variable, and a RETURN_VALUE is simulated to 
exit the loop.
3. After cleaning up the current frame, PyEval_EvalFrame loops back up 
to the top, now using the temporarily stored frame as the current frame.

Of course, instead of a loop, gcc's tail-call optimization feature could 
be used, but this would be non-portable to other compilers.

An example of the patch in action:

# Note default arguments aren't supported in the current patch
def fact2(n,v):
     if n:
         return fact2(n-1,v*n)
     else:
         return v

def fact(n):
     return fact2(n,1)

Without the patch:
 >>> fact(10000)
RuntimeError: maximum recursion depth exceeded

With the patch:
 >>> fact(10000)
<really really huge number>

Any feedback would be greatly appreciated!