interpreter vs. compiled

Fri Aug 1 06:24:35 EDT 2008

On 1 Aug, 07:11, castironpi <castiro... at gmail.com> wrote:
>
> Given the restrictions (or rather, freedoms) of Python, does there
> exist code that necessarily cannot translate to machine code?  In
> other words, can you translate all Python code to machine code?

All valid Python code can be executed somehow, and execution takes
place as the processor performs instructions which "it gets from
somewhere". Those instructions can belong to a general interpreter,
to specific code generated for a given user program, or to a
combination of these things, so I think that you have to refine your
question. What you seem to be asking is this: can you translate
Python code to machine code which encodes the behaviour of the user
program in a way nearing the efficiency of code generated from other
programming languages? Rephrased, the question is this: can Python
code be efficiently represented using low-level machine instructions?

I think you've already touched upon this when thinking about integer
operations. The apparently simple case of integer addition in Python
is not completely encoded by a few machine instructions. In other
words...

  a + b # in Python

...is not sufficiently represented by...

  ldr r1, a
  ldr r2, b
  add r3, r1, r2

...in some assembly language (and the resulting machine code), mostly
because the semantics of Python addition are more complicated. Of
course, you can generate code for those semantics, which would lead to
quite a few more machine instructions than those suggested above, but
then it might be interesting to bundle those instructions in some kind
of subroutine, and we could call this subroutine BINARY_ADD. At this
point, you'd almost be back at the stage where you're writing a
bytecode interpreter again.
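
To give an idea of what those semantics involve, here is a rough
sketch in Python of the decisions hidden behind a + b. The name
binary_add is only illustrative, and this is a simplification rather
than CPython's actual implementation: it ignores the C-level type
slots and only models the __add__/__radd__ protocol.

```python
def binary_add(a, b):
    """Simplified sketch of the work hidden behind `a + b`."""
    ta, tb = type(a), type(b)
    add = getattr(ta, "__add__", None)
    # The reflected method is only consulted when the types differ.
    radd = getattr(tb, "__radd__", None) if tb is not ta else None

    # A right operand whose type is a proper subclass of the left
    # operand's type, and which overrides __radd__, gets the first try.
    if (radd is not None and issubclass(tb, ta)
            and radd is not getattr(ta, "__radd__", None)):
        result = radd(b, a)
        if result is not NotImplemented:
            return result
        radd = None  # do not try it a second time

    # The usual case: ask the left operand's type.
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result

    # The left operand declined, so ask the right operand's type.
    if radd is not None:
        result = radd(b, a)
        if result is not NotImplemented:
            return result

    raise TypeError("unsupported operand type(s) for +: %r and %r"
                    % (ta.__name__, tb.__name__))
```

Even in this reduced form there are several lookups, indirect calls
and checks before any actual addition happens, which is why three
machine instructions cannot stand in for it in general.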

It's worth considering something in between these situations (the
verbose expansion of the user program vs. a bytecode interpreter which
examines virtual instructions and jumps to subroutines). There are
apparently a few techniques which make virtual machines more
efficient, for example so that the processor isn't jumping around too
much in the interpreter code. There are also going to be techniques
which permit the simplification of any verbose machine code
representation, most likely by not generating code which is never
going to be executed, due to various properties of the program.
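
As a toy illustration of the two ends and one possible middle ground,
consider the sketch below. Everything in it is hypothetical: the
instruction names, the run function and the guarded fast path are my
own stand-ins, not a description of any particular implementation.
The guard is just one example of the kind of technique I mean: emit
the short code for the case you expect, and fall back to the general
machinery otherwise.

```python
import operator

# Hypothetical bytecode for the expression `a + b`.
PROGRAM = [("LOAD_NAME", "a"), ("LOAD_NAME", "b"), ("BINARY_ADD", None)]

def generic_add(a, b):
    # Stand-in for the full BINARY_ADD subroutine with all its checks.
    return operator.add(a, b)

def run(code, env):
    """Tiny bytecode interpreter: examine each instruction, dispatch."""
    stack = []
    for op, arg in code:
        if op == "LOAD_NAME":
            stack.append(env[arg])
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(generic_add(left, right))
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    return stack.pop()

def specialised_add(env):
    """What a compiler might emit after guessing that a and b are ints."""
    a, b = env["a"], env["b"]
    if type(a) is int and type(b) is int:
        # Guard passed: in generated machine code this branch could be
        # close to a plain add (ignoring overflow into long integers).
        return a + b
    # Guard failed: fall back to the general interpreter.
    return run(PROGRAM, env)
```

Both run(PROGRAM, env) and specialised_add(env) give the same answer
for any operands; the difference is only in how much work is done on
the expected path.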

Obviously, CPython isn't oriented towards investigating these matters
in great depth, but that doesn't mean that other implementations can't
pursue other approaches.

> Similarly, I take it that the decision to make CPython a stack machine
> + VM was a design decision, not a necessity, favoring internal
> simplicity over the extra 5%.

Probably: it simplifies code generation somewhat.
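
To show what I mean by simpler code generation, here is a minimal
sketch of a code generator for a stack machine, using the standard
ast module to parse an expression. The instruction names are
illustrative only (they resemble CPython's but this is not its
compiler), and compile_expr is a made-up helper. The point is that a
post-order walk of the tree is all that is needed: there are no
registers to allocate.

```python
import ast

# Illustrative mapping from AST operator types to instruction names.
OPS = {ast.Add: "BINARY_ADD", ast.Mult: "BINARY_MULTIPLY"}

def compile_expr(node):
    """Emit stack-machine instructions for an expression tree."""
    if isinstance(node, ast.Name):
        return [("LOAD_NAME", node.id)]
    if isinstance(node, ast.Constant):
        return [("LOAD_CONST", node.value)]
    if isinstance(node, ast.BinOp):
        # Operands first, then the operator: both values are simply
        # left on the stack for the operator to consume.
        return (compile_expr(node.left)
                + compile_expr(node.right)
                + [(OPS[type(node.op)], None)])
    raise NotImplementedError(type(node).__name__)

tree = ast.parse("a + b * 2", mode="eval")
code = compile_expr(tree.body)
for instruction in code:
    print(instruction)
```

A register machine would need the generator to decide where each
intermediate result lives; here the stack takes care of that.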

Paul