On-topic: alternate Python implementations

Paul Rubin no.email at nospam.invalid
Sat Aug 4 16:43:29 EDT 2012


Stefan Behnel <stefan_ml at behnel.de> writes:
>> Calling CPython hardly counts as compiling Python into C.
> CPython is written in C, though. So anything that CPython does can be
> done in C. It's not like the CPython project used a completely unusual
> way of writing C code.

CPython is a relatively simple interpreter, and executing code by
invoking such an interpreter IMHO doesn't count as "compiling" it in
any meaningful way.

> You will always need some kind of runtime infrastructure when you
> "compile Python into C", so you can just as well use CPython for that
> instead of reimplementing it completely from scratch. 

Maybe there are parts of CPython you can reuse, but having the CPython
interpreter be the execution engine for "compiled" Python generators
again fails the seriousness test of what it means to compile code.  If
you mean something other than that, you might explain more clearly.

> Both Cython and Nuitka do exactly that, 

I didn't know about Nuitka; it looks interesting but (at least after a
few minutes of looking) I don't have much sense of how it works.

> No, you are going to compile only the generator function into a function
> that uses gotos, maybe with an additional in-out struct parameter that
> holds its state.

Yeah, ok, I guess that can work, given that Python generators are
limited to returning through just one stack level.  You might want to
avoid copying locals by just putting everything into a struct that has
to be retained across entries/exits.
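
A rough sketch of that idea, written in Python rather than C to keep it
short.  CountdownState and countdown_resume are made-up names for
illustration, not anything Cython or Nuitka actually generates: the
dataclass stands in for the C struct, and the label field plays the
role of the goto target.

```python
from dataclasses import dataclass


def countdown(n):
    """Ordinary Python generator, used as the reference behaviour."""
    while n > 0:
        yield n
        n -= 1


@dataclass
class CountdownState:
    """Hypothetical stand-in for the C struct: resume point plus all locals."""
    n: int
    label: int = 0  # 0 = not started, 1 = suspended at the yield, 2 = finished


DONE = object()  # sentinel playing the role of StopIteration


def countdown_resume(st):
    """One entry into the 'compiled' generator; returns a value or DONE."""
    if st.label == 1:
        # Re-entering after the yield: run the statement that follows it.
        st.n -= 1
    if st.label != 2 and st.n > 0:
        st.label = 1  # remember where to pick up next time
        return st.n
    st.label = 2
    return DONE
```

Since every local lives in the state object, nothing has to be copied
on entry or exit; each call just picks up from the recorded label.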

> If you don't like that, you can experiment with anything from a dedicated
> GC to transactional memory.

OK, but then CPython is no longer managing the memory.

> Last I heard, PyPy had a couple of GCs to choose from,

PyPy doesn't compile to C, but I guess compiling to C doesn't preclude
precise GC, as long as the generated C code carefully tracks what C
objects can contain GC-able pointers, and follows some constraints about
when the GC can run.  Some other compilers do this so it's not as big a
deal as it sounded like at first.  OK.

>
>>> or provide none at all.
>> You're going to let the program just leak memory until it crashes??
> Well, it's not like CPython leaks memory until it crashes...

I was counting CPython's reference counting as a rudimentary form of GC,
though I guess that's terminology that not everyone agrees on.
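
For concreteness, a small CPython-specific sketch of where reference
counting alone stops being enough.  Node is a made-up class, and the
cycle collector is switched off on purpose so that only the refcounts
are at work until the explicit gc.collect() call.

```python
import gc
import weakref


class Node:
    """Hypothetical object used only to watch when it gets freed."""
    def __init__(self):
        self.other = None


gc.disable()  # switch off the cycle collector; refcounting still runs

a = Node()
probe = weakref.ref(a)
del a
freed_by_refcount = probe() is None    # count hit zero, freed right away

b, c = Node(), Node()
b.other, c.other = c, b                # build a reference cycle
probe = weakref.ref(b)
del b, c
cycle_survived = probe() is not None   # the counts never reach zero

gc.collect()                           # explicit run of the cycle collector
cycle_freed_by_gc = probe() is None
gc.enable()
```

On CPython all three flags should come out True; other implementations
manage memory differently and need not behave the same way.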

> Huh? LuaJIT is a reimplementation of Lua that uses an optimising JIT
> compiler specifically for Lua code. How is that similar to the Jython
> runtime that runs *on top of* the JVM with its generic byte code based
> JIT compiler?

I thought LuaJIT compiles the existing Lua VM code, but I haven't
looked at it closely or used it.

>> It seems very hard to do reasonable optimizations in the presence of
>> standard Python techniques
>
> Sure. Even when targeting the CPython runtime with the generated C
> code (like Cython or Nuitka), you can still do a lot. And sure, static
> code analysis will never be able to infer everything that a JIT
> compiler can see.

I think even a JIT can't avoid a lot of pain and slowdown without
complex whole-program analysis and without requiring the application
to follow some special conventions, like never importing at "runtime".
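
A minimal sketch of the kind of runtime import meant here.  load_codec
and encode are made-up names; the only real API used is
importlib.import_module from the standard library.

```python
import importlib


def load_codec(name):
    """Import a module whose name is only known when the call happens."""
    return importlib.import_module(name)


def encode(fmt, value):
    # fmt could come from a config file or user input, so a compiler
    # cannot tell ahead of time which module's dumps() this will call.
    return load_codec(fmt).dumps(value)
```

Here encode("json", [1, 2]) and encode("pickle", [1, 2]) end up in
completely unrelated code, and nothing in the source says which one a
given run will use.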


