Why Python is not both an interpreter and a compiler?

Steven D'Aprano steve at pearwood.info
Tue Sep 1 11:20:00 EDT 2015


On Tue, 1 Sep 2015 03:48 am, Mahan Marwat wrote:

>> Python programs *could* easily be compiled the same way, but it generally
>> hasn't been considered all that useful.
> 
> If it hasn't been considered all that useful, then why the tools like
> cx_freeze, pytoexe are doing very hard! And if it is really easy, then why
> cx_freeze, pytoexe developer are doing it in such a rubbish way instead of
> creating one (compiler)?

I believe that Marko is wrong. It is not so easy to compile Python to
machine language for real machines. That's why the compiler targets a
virtual machine instead.

Let's take the simple case of adding two numbers:

    x + y

With a real machine, this will probably compile down to a handful of
assembly language statements. But, x will have to be an integer of a
certain type, say, 32 bits, and y likewise will also have to be the same
size. And what happens if the addition overflows? Some machines may
overflow to a negative value, some might return the largest positive value.

Python addition does not do that. With Python, it doesn't matter whether x
and y are the same size, or different. In fact, Python integers are BigNums
which support numbers as big as you like, limited only by the amount of
memory. Or they could be floats, complex numbers, or fractions. Or x might
be a string, and y a list, and the + operator has to safely raise an
exception, not dump core, or worse.

So to compile Python into assembly language for a real CPU would be very
hard. Even simple instructions would require a large, complex chunk of
machine code. If only we could design our own machine with a CPU that
understood commands like "add two values together" without caring that the
values are compatible (real) machine types.

And that's what we do. Python runs in a virtual machine with "assembly
language" commands that are far, far more complex than real assembly for
real CPUs. Those commands, of course, eventually end up executing machine
code in the real CPU, but it usually does so by calling the Python runtime
engine.

There is another reason why frozen Python code is so large. It has to
include a full Python interpreter. The Python language includes a few
commands for compiling and interpreting Python source code:

- eval
- exec
- compile


so the Python runtime has to include the Python compiler and interpreter,
which effectively means that the Python runtime has to be *all* of Python,
or at least nearly all.

There may be projects that will compile Python down to machine code for a
real machine (perhaps Cython?) but doing so isn't easy, which is why most
compilers don't do it.




-- 
Steven




More information about the Python-list mailing list