Tail recursion to while iteration in 2 easy steps

Steven D'Aprano steve+comp.lang.python at pearwood.info
Mon Oct 7 22:36:06 EDT 2013


On Mon, 07 Oct 2013 17:16:35 -0700, Mark Janssen wrote:

> It's like this: there *should* be one-to-one mappings between the
> various high-level constructs to the machine code, varying only between
> different chips (that is the purpose of the compiler after all), yet for
> some operations, in languages like scheme, well... I cannot say what
> happens...  hence my challenge.
> 
>> But even putting that aside, even if somebody wrote such a description,
>> it would be reductionism gone mad. What possible light on the problem
>> would be shined by a long, long list of machine code operations, even
>> if written using assembly mnemonics?
> 
> Only that you've got a consistent, stable (and therefore, formalizable)
> translation from your language to the machine.

You are mistaken to think that there is a single, one-to-one, mapping 
between high-level code and machine code.

In practice, only if you use the exact same source code, the exact same 
compiler, the exact same version of that compiler, with the exact same 
compiler options, and the exact same version of the language and all its 
libraries, then and only then will you will get the same machine code. 
And even that is not guaranteed.

Take, for example, the single high-level operation:

sort(alist)

What machine code will be executed? Obviously that will depend on the 
sort algorithm used. There are *dozens*. Here are just a few:

http://en.wikipedia.org/wiki/Category:Sorting_algorithms

Now sorting is pretty high level, but the same principle applies to even 
simple operations like "multiply two numbers". There are often multiple 
algorithms for performing the operation, and even a single algorithm can 
often be implemented in slightly different ways. Expecting all compilers 
to generate the same machine code is simply naive.

We can use Python to discuss this, since Python includes a compiler. It 
generates byte code, which just means machine code for a virtual machine 
instead of a hardware machine, but the principle is the same. Python even 
has a disassembler, for taking those raw byte codes and turning them into 
human-readable pseudo-assembly for the Python Virtual Machine.

Here is some "machine code" corresponding to the high-level instructions:

x = 23
y = 42
z = x + y
del x, y


py> code = compile('x = 23; y = 42; z = x + y; del x, y', '', 'exec')
py> code.co_code
'd\x00\x00Z\x00\x00d\x01\x00Z\x01\x00e\x00\x00e\x01\x00\x17Z\x02\x00[\x00
\x00[\x01\x00d\x02\x00S'


Translated into "assembly":

py> from dis import dis
py> dis(code)
  1           0 LOAD_CONST               0 (23)
              3 STORE_NAME               0 (x)
              6 LOAD_CONST               1 (42)
              9 STORE_NAME               1 (y)
             12 LOAD_NAME                0 (x)
             15 LOAD_NAME                1 (y)
             18 BINARY_ADD
             19 STORE_NAME               2 (z)
             22 DELETE_NAME              0 (x)
             25 DELETE_NAME              1 (y)
             28 LOAD_CONST               2 (None)
             31 RETURN_VALUE


You should be able to see that there are all sorts of changes that the 
compiler could have made, which would have lead to different "machine 
code" but with the exact same behaviour. This particular compiler emits 
code for x=23; y=42 in the order that they were given, but that's not 
actually required in this case. The compiler might have reversed the 
order, generating something like:

    0 LOAD_CONST               1 (42)
    3 STORE_NAME               1 (y)
    6 LOAD_CONST               0 (23)
    9 STORE_NAME               0 (x)

or even:

    0 LOAD_CONST               1 (42)
    3 LOAD_CONST               0 (23)
    6 STORE_NAME               1 (y)
    9 STORE_NAME               0 (x)

without changing the behaviour of the code. Or it might have even 
optimized the entire thing to:

    0 LOAD_CONST               0 (65)
    3 STORE_NAME               0 (z)


since x and y are temporary variables and a smart compiler could perform 
the computation at compile time and throw them away. (Nitpicks about 
"what if x and y already existed?" aside.) CPython even does this sort of 
optimization, although in a more limited fashion:

py> dis(compile('x = 23 + 42', '', 'exec'))
  1           0 LOAD_CONST               3 (65)
              3 STORE_NAME               0 (x)
              6 LOAD_CONST               2 (None)
              9 RETURN_VALUE

This is called keyhole optimization. It's not beyond possibility for a 
smarter compiler to look beyond the keyhole and make optimizations based 
on the whole program.

So you can see that there is no one-to-one correspondence from high-level 
source code to low-level machine code, even for a single machine. The 
same is even more true for languages such as C, Fortran, Pascal, Lisp, 
Scheme, Haskell, Java, etc. where people have spent years or decades 
working on compiler technology. Compilers differ in the quality and 
efficiency of the machine code they emit and the choices they make about 
translating source code to machine code.


> That's all.  Everything
> else is magic.  Do you know that the Warren Abstraction Engine used to
> power the predicate logic in Prolog into machien code for a VonNeumann
> machine is so complex, no one has understood it

That's nonsense. In your next post, you even supply a link to a book 
describing how the WAM works with detailed step-by-step instructions for 
writing your own. For those reading, here it is:

www.cvc.uab.es/shared/teach/a25002/wambook.pdf


Prolog is not some dark black magic that nobody understands. There are 
multiple implementations of Prolog-like logic languages for Python:

http://pyke.sourceforge.net/
https://github.com/logpy/logpy
https://github.com/enriquepablo/nl/
http://code.google.com/p/fuxi/
https://sites.google.com/site/pydatalog/

to say nothing of similar, more advanced languages like Mercury. Just 
because *you personally* don't understand something, don't think that 
nobody else does.



-- 
Steven



More information about the Python-list mailing list