interpreter vs. compiled

castironpi castironpi at gmail.com
Sat Aug 2 02:33:04 EDT 2008


On Aug 1, 5:24 am, Paul Boddie <p... at boddie.org.uk> wrote:
> On 1 Aug, 07:11, castironpi <castiro... at gmail.com> wrote:
>
>
>
> > Given the restrictions (or rather, freedoms) of Python, does there
> > exist code that necessarily cannot translate to machine code?  In
> > other words, can you translate all Python code to machine code?
>
> Given that all valid Python code can be executed somehow and that
> execution takes place as the processor performs instructions which "it
> gets from somewhere", meaning that those instructions can belong
> either to a general interpreter or to specific code generated for a
> given user program (or a combination of these things), I think that
> you have to refine your question. What you seem to be asking is this:
> can you translate Python code to machine code which encodes the
> behaviour of the user program in a way nearing the efficiency of code
> generated from other programming languages? Rephrased, the question is
> this: can Python code be efficiently represented using low-level
> machine instructions?
>
> I think you've already touched upon this when thinking about integer
> operations. The apparently simple case of integer addition in Python
> is not completely encoded by a few machine instructions. In other
> words...
>
>   a + b # in Python
>
> ...is not sufficiently represented by...
>
>   ldr r1, a
>   ldr r2, b
>   add r3, r1, r2
>
> ...in some assembly language (and the resulting machine code), mostly
> because the semantics of Python addition are more complicated.

No, it is not sufficiently represented.  Python runs checks before
and after the addition: type checks on the operands beforehand, and
an overflow check afterward.

   test safeinteger a
   test safeinteger b
   ldr r1, a
   ldr r2, b
   add r3, r1, r2
   test not overflow

However, given Python's specification, no conforming implementation
can omit these checks in the general case.
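
For reference, CPython 2.x's int fast path in ceval.c does the add
and then tests whether the result's sign disagrees with both
operands' signs.  Here is a standalone sketch of that test; the
name checked_add and the demo are mine, not CPython's:

  #include <limits.h>
  #include <stdio.h>

  /* Sketch of CPython 2.x's int fast path: do the add, then test
     whether the result's sign disagrees with both operands' signs
     (the (i ^ a) < 0 && (i ^ b) < 0 trick). */
  static int checked_add(long a, long b, long *out)
  {
      /* unsigned arithmetic avoids signed-overflow undefined
         behaviour; ceval.c itself just writes i = a + b */
      long i = (long)((unsigned long)a + (unsigned long)b);
      if ((i ^ a) < 0 && (i ^ b) < 0)
          return 0;           /* overflow: fall back to Python longs */
      *out = i;
      return 1;
  }

  int main(void)
  {
      long r;
      if (checked_add(LONG_MAX, 1, &r))
          printf("%ld\n", r);
      else
          printf("overflow: promote to arbitrary precision\n");
      return 0;
  }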

> Of
> course, you can generate code for those semantics, which would lead to
> quite a few more machine instructions than those suggested above, but
> then it might be interesting to bundle those instructions in some kind
> of subroutine, and we could call this subroutine BINARY_ADD. At this
> point, you'd almost be back at the stage where you're writing a
> bytecode interpreter again.

This isn't the bytecode interpreter creeping back in; it's the
run-time type and overflow checking, which I hold is part and parcel
of Python.
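
For concreteness, that structure, one BINARY_ADD routine, checks and
all, serving every + in the program, looks something like this toy
dispatch loop (a minimal sketch of my own, not CPython's actual
code):

  #include <stdio.h>

  /* A two-opcode stack machine: the OP_ADD handler exists once in
     the interpreter's object code, but serves every addition. */
  enum { OP_PUSH, OP_ADD, OP_HALT };

  int main(void)
  {
      /* bytecode for: (1 + 2), (3 + 4) */
      int code[] = { OP_PUSH, 1, OP_PUSH, 2, OP_ADD,
                     OP_PUSH, 3, OP_PUSH, 4, OP_ADD, OP_HALT };
      int stack[16], sp = 0, pc = 0;

      for (;;) {
          switch (code[pc++]) {
          case OP_PUSH: stack[sp++] = code[pc++]; break;
          case OP_ADD:  sp--; stack[sp-1] += stack[sp];
                        printf("%d\n", stack[sp-1]); break;
          case OP_HALT: return 0;
          }
      }
  }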

Another factor: in C, a and b are known to be, and always remain,
integers in a given context.

  int a, b;
  ...
  a + b

The C compilation process outputs:

   ldr r1, a
   ldr r2, b
   add r3, r1, r2

and you are correct.  However, in C++:

  std::string a, b;
  ...
  a + b

performs a concatenation, which is nowhere near that simple.  The
point is that C compilation runs once, and you actually have -ldr,
ldr, add- lying around in a file somewhere, ready to run as three
consecutive instructions on a processor.  In Python, the equivalent
lives in the interpreter: the -test, test, ldr, ldr, add, test-
sequence sits somewhere inside python.exe, specifically wherever the
object code for ceval.c ends up.
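
To make "nowhere near that simple" concrete: plain C has no + for
strings at all, so one concatenation costs something like this
hypothetical helper (an allocation, two copies, and a terminator):

  #include <stdlib.h>
  #include <string.h>

  /* Hypothetical helper: what one '+' on strings amounts to in C. */
  char *concat(const char *a, const char *b)
  {
      size_t la = strlen(a), lb = strlen(b);
      char *r = malloc(la + lb + 1);
      if (r != NULL) {
          memcpy(r, a, la);
          memcpy(r + la, b, lb + 1);   /* copies the trailing '\0' */
      }
      return r;                        /* caller must free */
  }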

Incidentally, compiling a simple C program that merely executes a + b
against the same program with a - b, I find the two 16K binaries
differ by only 2 bytes.  So in this practical case the output is not
a bare three-word sequence (12 bytes, ldr-ldr-add vs. ldr-ldr-sub);
presumably the rest is whatever entry points and startup code the OS
requires, though that part is not clear to me.

> Of course, it's worth considering something in between these
> situations (the verbose expansion of the user program vs. a bytecode
> interpreter which examines virtual instructions and jumps to
> subroutines), and there are apparently a few techniques which make
> virtual machines more efficient (so that the processor isn't jumping
> around too much in the interpreter code, for example), and there are
> also going to be techniques which permit the simplification of any
> verbose machine code representation (most likely by not generating
> code which is never going to be executed, due to various properties of
> the program).

I think it's relevant and fair to consider -consecutive- language
instructions at this point.  Working example:

  int a, b, c, d;
  ...
  a + b
  c + d

The C compilation process outputs:

   ldr r1, a
   ldr r2, b
   add r3, r1, r2
   ldr r1, c
   ldr r2, d
   add r3, r1, r2

CPython, by contrast, produces no such output.  The actual code that
runs on the processor (in summary) is:

  :loop
  ...
  :addition_sign
  test safeinteger a
  test safeinteger b
  ldr r1, a
  ldr r2, b
  add r3, r1, r2
  test not overflow
  :goto loop

as opposed to two duplicate sections being output back-to-back, even
though they are -run- effectively back-to-back.  For a different
type, say list concatenation, the disassembly looks like this:

>>> def f():
...     []+[]
...
>>> f()
>>> dis.dis(f)
  2           0 BUILD_LIST               0
              3 BUILD_LIST               0
              6 BINARY_ADD
              7 POP_TOP
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE

the meaning of which I am not finding in ceval.c.  Anyone?
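
(One hedged guess, from a skim of the 2.x sources: when the operands
are not both ints or strings, the BINARY_ADD case in ceval.c just
does x = PyNumber_Add(v, w), and PyNumber_Add falls back from the
nb_add slot to the sq_concat slot, which for lists is list_concat in
Objects/listobject.c.  So the concatenation itself would live in
listobject.c; ceval.c only dispatches to it.)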

Regardless, the JIT compilation process allocates a new executable
block of memory:

  :loop
  ...
  :addition_sign
  output( 'ldr r1, <left>; ldr r2, <right>; add r3, r1, r2' )
  :goto loop

which in this example reaches the output( ) line twice, once per
addition, yielding

   ldr r1, a
   ldr r2, b
   add r3, r1, r2
   ldr r1, c
   ldr r2, d
   add r3, r1, r2

somewhere in memory, same as C.  Then it runs the block.  It also has
to have already ascertained that 'a' and 'b' are necessarily integers
by the time it executes the output( ) statement.  I remain unclear on
why JIT compilation is not simply called compiling, and why it
doesn't write out an executable binary, unless the only real
differences are practical ones: saving developer time, keeping object
files cross-platform, or mere terminology.
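
The mechanics of that last step are small enough to show.  Here is a
minimal sketch of "allocate executable memory, copy machine code in,
jump to it", assuming Linux on x86-64 and a system that still permits
writable-and-executable pages; the five code bytes encode lea rax,
[rdi+rsi]; ret, i.e. return a + b:

  #include <stdio.h>
  #include <string.h>
  #include <sys/mman.h>

  int main(void)
  {
      /* x86-64 SysV ABI: lea rax, [rdi+rsi]; ret  -- returns a + b */
      unsigned char code[] = { 0x48, 0x8d, 0x04, 0x37, 0xc3 };

      /* one writable, executable page; hardened systems forbid
         PROT_WRITE|PROT_EXEC and need a second mprotect() pass */
      void *buf = mmap(NULL, sizeof code,
                       PROT_READ | PROT_WRITE | PROT_EXEC,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (buf == MAP_FAILED)
          return 1;
      memcpy(buf, code, sizeof code);

      long (*add)(long, long) = (long (*)(long, long))buf;
      printf("%ld\n", add(2, 3));      /* prints 5 */
      return 0;
  }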


