[pypy-dev] Separate compilation and friends

Tue Feb 15 09:32:50 CET 2011

On Tue, Feb 15, 2011 at 09:17, Dima Tisnek <dimaqq at gmail.com> wrote:
> On 15 February 2011 01:10, Antonio Cuni <anto.cuni at gmail.com> wrote:
>> On 15/02/11 03:41, Dima Tisnek wrote:
>>> On a related note, how hard is it to "freeze" the translator/compiler
>>> state of a given pypy version just before it begins to read extension
>>> modules, and distribute that? It would speed up module development a
>>> lot.
>>> It would be a quick equivalent of distributing headers for a C library.
>>
>> this is hard, because the compilation of the modules is intermixed with the
>> compilation of the rest of the interpreter for each phase: we have (roughly)
>> something like:
>>
>> - annotation of the interpreter
>> - annotation of the modules

How much time would be saved by saving the annotation results?

>> - rtyping of the interpreter
>> - rtyping of the modules
>> - etc. etc.
>>
>> ciao,
>> Anto
>>
>
> Yeah I figured as much, I was wondering if it could be changed like this:
>
> - annotation of the interpreter, save state (1)
> - rtyping of the interpreter, shouldn't depend on modules here, save state (2)
> - etc.
> - annotation of the modules, using state from 1
> - rtyping of the modules, using state from 1,2
> - etc.
>
> I assume here that modules don't introduce dependencies into the interpreter.
> I guess in the long run this ought to be the case, right?

I don't think you can guarantee this. Type inference is global, so
you might need at least one caller of each API to infer its type
precisely. Maybe the uses of an API in test cases are enough to fully
infer its types, but I'd guess not.
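
A toy sketch of what I mean by "global" (this is not PyPy's annotator;
the function names and call sites are made up for illustration): the
type inferred for an API parameter is the union over all its callers,
so a result frozen after looking at the interpreter alone can become
too narrow, or be missing entirely, once a module adds callers.

```python
# Toy model of whole-program type inference; NOT PyPy's actual annotator.
# Each call site contributes the argument type it passes to an API function.

def infer_param_types(call_sites):
    """Map each function name to the set of argument types seen at its call sites."""
    inferred = {}
    for func_name, arg_type in call_sites:
        inferred.setdefault(func_name, set()).add(arg_type)
    return inferred

# Hypothetical call sites found in the interpreter alone.
interpreter_calls = [("wrap_value", "int"), ("list_length", "list")]
# Hypothetical call sites contributed by an extension module.
module_calls = [("wrap_value", "str"), ("new_buffer", "int")]

frozen = infer_param_types(interpreter_calls)               # saved "state (1)"
full = infer_param_types(interpreter_calls + module_calls)  # whole program

# Functions whose frozen result is missing or too narrow once the module is added.
invalidated = sorted(name for name in full if frozen.get(name) != full[name])
print(invalidated)  # ['new_buffer', 'wrap_value']
```

Here only list_length could be reused from the saved state; the other
two would have to be annotated again.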

However, what is true in general is that if less specific types are
inferred, that affects just performance, not correctness (I don't know
if that's true of PyPy, but you ought to be able to pass "object"s
around). Maybe the slowdown is insignificant, maybe it is a huge
problem, maybe a few annotations can save the day.
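
Again as a toy sketch only (a flat lattice with "object" on top, and
two made-up implementations of addition; I am assuming, not claiming,
that PyPy behaves like this): generalizing a type to "object" selects a
slower, dynamically checked path, but the result is the same.

```python
# Toy type lattice: "object" is the top element that every type generalizes to.

def join(a, b):
    """Least upper bound of two types in a flat lattice with top 'object'."""
    return a if a == b else "object"

def add_specialized(x, y):
    # Fast path a compiler could pick when both operands are known to be int.
    return x + y

def add_generic(x, y):
    # Slow path: check types at run time, as needed when only 'object' is known.
    if isinstance(x, int) and isinstance(y, int):
        return x + y
    raise TypeError("unsupported operand types")

def pick_implementation(inferred_type):
    """Choose the specialized version only when inference was precise."""
    return add_specialized if inferred_type == "int" else add_generic

fast = pick_implementation(join("int", "int"))     # precise inference
slow = pick_implementation(join("int", "object"))  # imprecise inference

# Same answer either way; only the amount of run-time checking differs.
assert fast(2, 3) == slow(2, 3) == 5
```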

However, it is still not clear (to me) where previous efforts stopped.
Is it hard to:
1) devise an algorithm like Dima proposed
or to
2) implement it (because of too much code to change and limited manpower)
or to
3) accept a small performance loss?

Per-file separate compilation would likely fall into 3), because too
little type inference would happen, wouldn't it?

> If this is possible, it would be a useful quick hack to separate
> module build from main build.
> If it's still very hard, then something else is in order.
> I'd love to play with this myself, but I don't have enough RAM for a
> full build ;-(

-- 
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/