Why is it impossible to create a compiler that can compile Python to machine code like C?

Chris Angelico rosuav at gmail.com
Mon Mar 4 19:33:47 EST 2013


On Tue, Mar 5, 2013 at 9:55 AM, CM <cmpython at gmail.com> wrote:
>
>> The main issue is that python has dynamic typing.  The type of object
>> that is referenced by a particular name can vary, and there's no way
>> (in general) to know at compile time what the type of object "foo" is.
>>
>> That makes generating object code to manipulate "foo" very difficult.
>
> Could you help me understand this better?  For example, if you
> have this line in the Python program:
>
> foo = 'some text'
> bar = {'apple':'fruit'}
>
> If the interpreter can determine at runtime that foo is a string
> and bar is a dict, why can't the compiler figure that out at
> compile time?  Or is the problem that if later in the program
> you have this line:
>
> foo = 12
>
> now foo is referring to an integer object, not a string, and
> compilers can't have two names referring to two different
> types of objects?  Something like that?
>
> I in no way doubt you that this is not possible, I just don't
> understand enough about how compiling works to yet "get"
> why dynamic typing is a problem for compilers.

Python doesn't have "variables" with "values"; it has names, which may
(or may not) point to objects. Dynamic typing just means that one name
is allowed to point to multiple different types of object at different
times.
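
To make that concrete, here's a minimal sketch of one name being bound
to objects of different types in turn. The helper describe() is made up
for illustration; the point is only that the type travels with the
object, not with the name.

```python
def describe(obj):
    # The type belongs to the object, not to the name that refers to it.
    return type(obj).__name__

foo = 'some text'
first = describe(foo)    # foo currently refers to a str

foo = 12
second = describe(foo)   # the same name now refers to an int

print(first, second)
```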

The problem with dynamic typing is more one of polymorphism. Take this
expression as an example:

foo += bar;

In C, the compiler knows the data types of the two variables, and can
compile that to the appropriate code. If they're both integers,
that'll possibly become a single machine instruction that adds two
registers and stores the result back.

In C++, foo could be an instance of a custom class with an operator+=
function. The compiler still knows which function to call, unless it's
a virtual function; in that case the call goes through a run-time
lookup based on foo's actual (most derived) class, and the function is
called dynamically.

In Python, *everything* is an object (in CPython, a PyObject), and
every function call is effectively virtual. That += operation is backed
by the __iadd__ method of whatever type foo is, falling back to __add__
if the type doesn't define __iadd__. So, at run time, the exact
function is looked up.
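
A rough sketch of what that means in practice: the same line of source
ends up running completely different code depending on what foo refers
to when it executes. The Counter class and add_in_place() are
hypothetical names invented for this example.

```python
class Counter:
    """Hypothetical type that defines its own in-place addition."""

    def __init__(self, total):
        self.total = total

    def __iadd__(self, other):
        self.total += other
        return self


def add_in_place(foo, bar):
    # One line of source; which code runs depends on type(foo) at call time.
    foo += bar
    return foo


int_result = add_in_place(1, 2)                 # int has no __iadd__, so int.__add__ runs
list_result = add_in_place([1], [2])            # list.__iadd__ extends the list in place
str_result = add_in_place('ab', 'cd')           # str has no __iadd__, so str.__add__ runs
counter_result = add_in_place(Counter(10), 5)   # Counter.__iadd__ runs

print(int_result, list_result, str_result, counter_result.total)
```

A compiler looking only at add_in_place() can't tell which of those
four functions to emit a call to, so it has to emit a lookup instead.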

C++ is most definitely a compiled language, at least in most
implementations I've seen. But with virtual functions it has the exact
same issue as Python has: true dynamism requires run-time lookups.
That's really what you're seeing here; it's nothing to do with any sort
of "compiled" vs "interpreted" dichotomy, but with "compile time" vs
"run time" lookups. In C, nearly everything can be resolved at compile
time; in Python, most things are done at run time.
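
You can see this in CPython itself, which does compile Python, just to
bytecode rather than machine code. The sketch below uses the standard
dis module to list the instructions for a += inside a function
(add_in_place is a made-up name). The opcode names are a CPython
implementation detail and differ between versions, so treat this as
illustrative only.

```python
import dis


def add_in_place(foo, bar):
    foo += bar
    return foo


# Collect the opcode names the compiler emitted for the function body.
opnames = [ins.opname for ins in dis.get_instructions(add_in_place)]

# Depending on the CPython version, the += shows up as INPLACE_ADD
# (3.10 and earlier) or as a generic BINARY_OP (3.11 and later).
# Either way, the instruction carries no type information.
generic_add = [name for name in opnames if name in ('INPLACE_ADD', 'BINARY_OP')]

print(generic_add)
```

The compiled code says "add these two things in place" and nothing
more; working out what "add" means is left until the objects are
actually there.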

It's mainly a matter of degree. A more dynamic language needs to do
more work at run time.

ChrisA


