[pypy-dev] Compiling PyPy interpreter without GC

Sat Mar 21 11:43:50 CET 2015

Hi Armin,

Thank you for your interest.

On 21/03/15 19:39, Armin Rigo wrote:
> Hi Kunshan,
> 
> On 20 March 2015 at 04:16, Kunshan Wang <kunshan.wang at anu.edu.au> wrote:
>> (...) Others (including PyPy) build a whole new VM,
>> doing everything from scratch. There are many "high-performance VM"
>> projects like PyPy (LuaJIT, v8, JavaScriptCore, HHVM to name a few), but
>> the most important low-level parts (JIT, GC, ...) are not reused.
> 
> Nice to see a project that is similar to PyPy...  even though you seem
> to have missed that point :-)  PyPy is based on RPython, whose goal is
> also to provide a generic implementation substrate for any
> interpreter-based language, by providing a general GC and a
> meta-tracing JIT.  For that reason I would not put PyPy in the same
> bag as LuaJIT, V8 and HHVM.  There are a number of very different
> languages that already exist on top of the RPython architecture; PyPy
> (the Python implementation) is only the most well-known.

Indeed. The RPython framework has already hosted several languages. In
this sense, RPython is much more reusable.

> 
> Unlike Mu, the RPython toolchain was co-developed with PyPy, and so is
> a bit more specifically suited for languages that share some
> characteristics with Python (for example, multithreading is quite
> limited).  Also, the separation between the two layers does not come
> with (e.g.) a nice intermediate file format produced from the source
> interpreter and fed to the RPython toolchain.

Mu is explicitly minimal and language-neutral, and it operates at a much
lower level, although slightly higher than LLVM in terms of supporting
managed languages. For example, Mu does not understand "interpreter" or
"meta-tracing". We have already realised that it will be very hard to
build a full language implementation directly on top of such a low-level
micro VM, because the implementer will have to generate an SSA-based CFG
with low-level types (though there are still reference types).
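
To give a rough idea of what "generating an SSA-based CFG with low-level
types" means for a client, here is a minimal sketch in plain Python. The
data model (Inst, Block, FunctionBuilder) and the opcode and type strings
are hypothetical illustrations only; they are not Mu's actual IR syntax
or API. It lowers abs(x) into three basic blocks in which every value is
defined exactly once.

```python
from collections import namedtuple

# Hypothetical data model for illustration; NOT Mu's real IR or client API.
Inst = namedtuple("Inst", "result op type args")


class Block(object):
    def __init__(self, name, params=()):
        self.name = name
        self.params = list(params)  # SSA values defined on block entry
        self.insts = []
        self.terminator = None      # the single control-flow exit


class FunctionBuilder(object):
    def __init__(self, name):
        self.name = name
        self.blocks = []
        self._counter = 0

    def block(self, name, params=()):
        b = Block(name, params)
        self.blocks.append(b)
        return b

    def emit(self, block, op, type_, *args):
        # Each instruction defines a fresh name, so no value is reassigned.
        self._counter += 1
        result = "%%v%d" % self._counter
        block.insts.append(Inst(result, op, type_, args))
        return result


def build_abs():
    """Lower abs(x) for a 64-bit integer into a three-block CFG."""
    fb = FunctionBuilder("abs")
    entry = fb.block("entry", ["%x"])
    neg = fb.block("neg")
    done = fb.block("done", ["%r"])  # block parameter merges both paths

    is_neg = fb.emit(entry, "slt", "int<1>", "%x", "0")
    entry.terminator = ("branch2", is_neg, ("neg", []), ("done", ["%x"]))

    negated = fb.emit(neg, "sub", "int<64>", "0", "%x")
    neg.terminator = ("branch", ("done", [negated]))

    done.terminator = ("ret", "%r")
    return fb
```

Even this one-line function needs explicit blocks, typed temporaries and
a merge point, which is the kind of work a higher-level library would
take over from the language implementer.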

We think there should be several kinds of "libraries" for different
kinds of languages. Libraries can be pre-written Mu IR code snippets or
programs running in the client doing various transformations, but
libraries are *not* part of the micro VM (the micro VM is minimal to the
extent that if anything can be done outside the micro VM without
breaking the abstractions, it will be). For example, a library that
supports concurrent languages should provide implementations of
locking/synchronisation primitives written in Mu IR (the Mu IR only
provides atomic memory access and a futex-like waiting mechanism). A
library that supports OOP should provide some abstractions of classes,
inheritance and virtual methods. There can also be libraries that
perform Mu IR to Mu IR optimisations, but the language implementer
should also perform other optimisations at a higher level.
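
As an illustration of the locking example, here is a minimal sketch of a
mutex built from nothing but an atomic word and a futex-like wait/wake
pair. It is written in plain Python rather than Mu IR, and the AtomicInt
class only simulates those two primitives with threading.Condition; its
method names are assumptions made for this sketch, not Mu instructions.
The three-state protocol (0 = unlocked, 1 = locked, 2 = locked with
possible waiters) is the well-known futex mutex design, not something
specific to Mu.

```python
import threading


class AtomicInt(object):
    """Simulated memory word with atomic access and futex-like waiting.

    Stand-in for illustration only; a real library would use the
    primitives of the underlying VM instead of a Python Condition.
    """

    def __init__(self, value=0):
        self._value = value
        self._cond = threading.Condition()

    def compare_exchange(self, expected, new):
        # Atomically store `new` if the word equals `expected`;
        # always return the value seen before the operation.
        with self._cond:
            old = self._value
            if old == expected:
                self._value = new
            return old

    def exchange(self, new):
        with self._cond:
            old = self._value
            self._value = new
            return old

    def futex_wait(self, expected):
        # Sleep only if the word still holds `expected`; may wake spuriously.
        with self._cond:
            if self._value == expected:
                self._cond.wait()

    def futex_wake(self, n=1):
        with self._cond:
            self._cond.notify(n)


class Mutex(object):
    def __init__(self):
        self._state = AtomicInt(0)

    def lock(self):
        c = self._state.compare_exchange(0, 1)  # uncontended fast path
        if c != 0:
            if c != 2:
                c = self._state.exchange(2)     # announce a waiter
            while c != 0:
                self._state.futex_wait(2)
                c = self._state.exchange(2)     # retry; stay conservative

    def unlock(self):
        if self._state.exchange(0) == 2:        # someone may be sleeping
            self._state.futex_wake(1)


def run_demo(n_threads=4, n_iters=500):
    """Increment a shared counter under the mutex from several threads."""
    mutex = Mutex()
    counter = [0]

    def worker():
        for _ in range(n_iters):
            mutex.lock()
            counter[0] += 1
            mutex.unlock()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

The point is that the lock itself lives entirely in library code; the
layer underneath only has to offer the atomic operations and the
wait/wake mechanism.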

From this point of view, RPython can play two roles: as a language
implemented on Mu, and a library to implement other languages.

John is working on getting RPython working on Mu. If this approach
eventually works (unlikely to happen soon since we don't have a lot of
manpower), other languages implemented in RPython will work on Mu, too.

> 
> It is certainly interesting to see what kind of results you'd get by
> feeding the "rtyped but not GC-transformed" instructions from RPython
> to the Mu platform.

This is the first step of our plan. The interpreter written in RPython
will be ahead-of-time compiled to Mu IR (rather than translated to C
as RPython currently does). Then the interpreter will run on Mu like any
other ordinary Mu IR program. After this step, there will be a Python
implementation running as an interpreter on Mu.
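
For readers who have not seen the input side of this step, here is a
minimal sketch of the kind of program the RPython toolchain takes: a
tiny stack interpreter plus the usual target()/entry_point() pair that
RPython target files define. The interpreter and its opcodes are made up
for illustration, and nothing here is specific to Mu; how a Mu back end
would be selected is not shown because that part does not exist yet.

```python
import sys


def run(program):
    """Evaluate a list of (opcode, argument) pairs on a small stack."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError("unknown opcode")
    return stack.pop()


def entry_point(argv):
    # (2 + 3) * 4; the argument of "add" and "mul" is unused.
    program = [("push", 2), ("push", 3), ("add", 0),
               ("push", 4), ("mul", 0)]
    print(run(program))
    return 0


def target(driver, args):
    # Conventional hook of an RPython target file: hand the translator
    # the function at which whole-program analysis starts.
    return entry_point, None


if __name__ == "__main__":
    entry_point(sys.argv)
```

Today the toolchain would annotate and rtype this program and then emit
C; in the plan above, the rtyped flow graphs would be turned into Mu IR
instead.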

The next step is letting the PyPy-level tracing JIT compiler work. The
generated JIT compiler can be ahead-of-time compiled to Mu IR, too,
together with the interpreter. The JIT compiler, instead of generating
native machine code, will generate Mu IR code. (That is "Mu IR code
generating other Mu IR code", something meta-circular.)

In this Mu-based RPython implementation, two things will be different.

1. Mu has its own garbage collector and exception handling. This is why
we work just below the RTyper, before the GC and exception-handling code
is inserted. RPython no longer needs to implement the concrete garbage
collector by itself.

2. Both the RPython back-end compiler and the JIT compiler will
generate Mu IR, instead of C and native machine code respectively. This
reduces some platform dependency, although the foreign function
interface (FFI) will still be platform-dependent.

Some researchers at the University of Massachusetts are planning to add
transactional memory to Mu. This will provide some low-level support for
the STM-based PyPy.

However, we are still far from having a "high-performance" Mu
implementation. Only then will we know exactly how our approach
performs.

> Again, I would recommend to start with smaller
> things than the whole PyPy.  You can go about it in a test-driven way
> by starting from, say, the simple examples in
> rpython/translator/c/test/test_typed.py, and then pick the Prolog
> interpreter at https://bitbucket.org/cfbolz/pyrolog/ as a much-smaller
> first complete example.

Thank you for the advice.

> 
> 
> A bientôt,
> 
> Armin.
> 

Regards,
Kunshan
