[pypy-dev] Question on the future of RPython

Thu Sep 2 11:52:56 CEST 2010

On 2 September 2010 19:00, Saravanan Shanmugham <sarvi at yahoo.com> wrote:
> So far as I can tell from Unladen Swallow and PyPy, it is some of these
> dynamic features of Python, such as dynamic typing, that make it hard to
> compile/optimize and hit C-like speeds.
> Hence the need for RPython in PyPy or Restricted Python in Shedskin?

Hence the need for the JIT, not RPython. RPython is an implementation
detail: it makes it easy to translate to C as well as to CLI and JVM
bytecode, to support translation aspects such as stackless, and to
test on top of a full Python environment. Rewriting things in RPython
for performance is a hack that should stop happening as the JIT
matures.

Dynamic typing means you need to do more work to produce code of the
same performance, but it's not impossible.
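
To make the "more work" concrete, here is a minimal sketch of the kind
of guarded specialisation a dynamic compiler performs. It is only an
illustration in plain Python, not PyPy's JIT; the names add_generic
and add_specialised are made up for this example.

```python
def add_generic(a, b):
    # Slow path: full dynamic dispatch on whatever the operands are.
    return a + b


def add_specialised(a, b):
    # Guard: the fast path was "compiled" under the assumption that
    # both operands are plain ints. Check that assumption first.
    if type(a) is int and type(b) is int:
        # Fast path: the operation is known, no dispatch is needed.
        return int.__add__(a, b)
    # Guard failed: fall back to the generic path.
    return add_generic(a, b)


print(add_specialised(2, 3))
print(add_specialised("a", "b"))
```

The guard is the extra work: the result is the same either way, but
the fast path is only valid while the assumption holds.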

On 2 September 2010 18:56, Paolo Giarrusso <p.giarrusso at gmail.com> wrote:
> On Thu, Sep 2, 2010 at 10:40, William Leslie <william.leslie.ttg at gmail.com>
> wrote:
>>
>> But what makes you think that? A dynamic compiler has more information, so
>> it should be able to produce better code.
>
> Note that he's not arguing about a static compiler for the same code, which
> has no type information, and where you are obviously right. He's arguing
> about a statically typed language, where the type information is already
> there in the source, e.g. C - there is much less information missing.
> Actually, your point can still be made, but it becomes much less obvious.
> For this case, it's much more contended what's best - see the "java faster
> than C" debate. Nobody has yet given a proof convincing enough to close the
> debate.

Sure - having static type guarantees is another case of "more information".

There is a little more room for discussion here, because there are
cases where a dynamic compiler for a safe runtime can do better at
considering certain optimisations, too. We have been talking about our
stock-standard type systems here, which ensure that our object will
have the field or method that we are interested in at runtime, and
perhaps also its offset into the instance or vtable respectively (as
long as it isn't an interface method, which we don't have in RPython
anyway). That makes for a pretty decent optimisation, but type
systems can infer much more than this, including which objects may
escape (via region typing à la Cyclone), which fields may be None, and
which instructions are loop-invariant. The point is that some of these
type systems work fine with separate compilation, and some do
significantly better with runtime or link-time specialisation.
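
As a rough sketch of what the "may be None" and loop-invariance facts
buy you, the two functions below compute the same thing; the second is
what a compiler could emit once it knows that s.factor does not change
inside the loop. The Scale class and both function names are
hypothetical, and the rewrite is done by hand here, assuming xs is a
list and nothing in the loop mutates s.

```python
class Scale(object):
    def __init__(self, factor):
        self.factor = factor


def scale_naive(s, xs):
    out = []
    for x in xs:
        factor = s.factor      # field reloaded on every iteration
        if factor is None:     # None check repeated on every iteration
            raise ValueError("factor is None")
        out.append(x * factor)
    return out


def scale_hoisted(s, xs):
    # The check may only move out of the loop if the loop runs at
    # least once, otherwise an empty input would start raising.
    if not xs:
        return []
    factor = s.factor          # loaded once: loop-invariant
    if factor is None:         # checked once
        raise ValueError("factor is None")
    return [x * factor for x in xs]


print(scale_naive(Scale(3), [1, 2, 3]) == scale_hoisted(Scale(3), [1, 2, 3]))
```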

On 2 September 2010 17:56, Paolo Giarrusso <p.giarrusso at gmail.com> wrote:
> On Thu, Sep 2, 2010 at 09:09, William Leslie
> <william.leslie.ttg at gmail.com> wrote:
>> The other is that type inference is global and changes you make to one
>> function can have far-reaching consequences.
> Is it module-global or is it performed on the whole program?

Rtyping is whole-program.
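
A toy illustration of why that gives changes far-reaching
consequences: if parameter types are inferred from every call site in
the program, one new call anywhere changes what is known about the
callee. This is not the real annotator, just a sketch; the function
infer_param_types and the call-site format are invented for the
example.

```python
def infer_param_types(call_sites):
    """Union the argument types seen at every call site, per function.

    call_sites is a list of (function_name, tuple_of_argument_types).
    """
    inferred = {}
    for name, arg_types in call_sites:
        slots = inferred.setdefault(name, [set() for _ in arg_types])
        for slot, arg_type in zip(slots, arg_types):
            slot.add(arg_type)
    return inferred


sites = [("f", (int,)), ("f", (int,))]
before = infer_param_types(sites)

# A single new call site, possibly in a distant module, is enough to
# change the inferred type of f's parameter for the whole program.
sites.append(("f", (str,)))
after = infer_param_types(sites)

print(before["f"], after["f"])
```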

> Functional languages allow separate compilation - is there any
> RPython-specific problem for that? I've omitted my guesses here.

Many do, yes. To use ML derivatives as an example, you require the
signature of any modules you directly import. I was recently reading
about MLKit's module system, which is quite interesting (it has region
typing, and the way it unifies these types at the module boundary is
cute - carrying region information around in the source text is
fragile, so it must be inferred). Haskell is kind of a special case,
requiring dictionaries to be passed around at runtime to determine
which method of some typeclass to call.
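
Dictionary passing can be sketched in Python roughly as follows. The
dictionaries stand in for typeclass instances and are passed as an
extra argument; all names here (SHOW_INT, SHOW_BOOL, show_all) are
made up, and a real Haskell compiler is of course free to specialise
the dictionary away when the instance is known statically.

```python
# One dictionary per "instance" of a hypothetical Show-like class.
SHOW_INT = {"show": lambda n: str(n)}
SHOW_BOOL = {"show": lambda b: "yes" if b else "no"}


def show_all(show_dict, xs):
    # The caller supplies the dictionary; the callee never inspects
    # the type of x, it just calls the method it was handed.
    return [show_dict["show"](x) for x in xs]


print(show_all(SHOW_INT, [1, 2]))
print(show_all(SHOW_BOOL, [True, False]))
```

Because the method is found through the dictionary argument, show_all
can be compiled separately from any particular instance.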

For OCaml (most MLs are similar) see section 2.5: "Modules and
separate compilation" of
http://pauillac.inria.fr/ocaml/htmlman/manual004.html

On MLKit's module implementation and region inference:
http://www.itu.dk/research/mlkit/index.php/Static_Interpretation

-- 
William Leslie