[pypy-dev] Re: [pypy-sprint] vilnius sprint planning progress

Mon Oct 11 22:26:30 CEST 2004

Hi Armin, 

[Armin Rigo Mon, Oct 11, 2004 at 01:46:39PM +0100]
> A. PyPy -> FlowGraph
> ====================
> 
> See goal/translate_pypy.py; run it and fix things until this script
> processes the whole of PyPy.
> 
> This requires updating things in PyPy when they are not RPythonic enough.
> In particular, there are some more efforts to be done on "caches": lazily
> built objects.  Generally, the code to build such objects is not RPython;
> so for RPython, the objects must be built in advance.  The flow space must
> force these caches to be completely built.  This part can be done
> independently. 
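
To make the "built in advance" point concrete, here is a minimal
sketch.  The class LazyCache and its method names are hypothetical,
invented for illustration; this is not the actual PyPy cache code.
The idea is only that the non-RPython builder runs before
translation, and that a frozen cache refuses to build anything new:

```python
class LazyCache:
    """Hypothetical lazily built table (illustration only, not PyPy code)."""

    def __init__(self, build):
        self.build = build      # arbitrary builder, assumed not RPython
        self.content = {}
        self.frozen = False

    def getorbuild(self, key):
        try:
            return self.content[key]
        except KeyError:
            if self.frozen:
                # After freezing, only plain dictionary lookups are allowed.
                raise KeyError("cache is frozen; %r was not prebuilt" % (key,))
            value = self.content[key] = self.build(key)
            return value

    def freeze(self, keys):
        """Force every entry to exist now, then forbid further building."""
        for key in keys:
            self.getorbuild(key)
        self.frozen = True


cache = LazyCache(lambda key: key * 2)
cache.freeze([1, 2, 3])     # done once, before the flow space sees the code
```

After freeze(), the only code path left in getorbuild() is the
dictionary lookup, which is the part that would have to be RPython.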

Because of the way 'genc.py' currently works, we can mostly
forget about "module completeness" for this goal, right? But
we do need, for example, to get stdout/file interaction working
properly; otherwise we will never get any output from our first
translated PyPy/C version. I am not quite clear on how far the
current genc.py goes in letting us use the CPython runtime.

> The goal would be to get a complete graph, which the
> existing genc.py can (mostly) already translate to C and run, for testing.

Well, I still consider the question of how to do exceptions a
major problem. 

> B. FlowGraph -> Optimized FlowGraph
> ===================================
> 
> Still open to discussion.  What exactly should be done here, and how?
> An idea is that we could provide a set of rules that transform some
> operations according to the type inferred for their arguments.  This would
> introduce new operations that work on individual fields of PyObjects
> instead of just calling PyObject_Xxx() on them.  Global analysis can
> further discover when PyObjects can be inlined into a parent, when they
> don't need reference counters, etc.

Actually, for the latter it sounds to me like there must already
be some algorithms out there.  Maybe we should invite Donald
Knuth to one of our sprints, anyway :-)   More seriously, though, 
it would be helpful to get some gcc or other compiler people to
one of our next sprints to give a lecture and help with flowgraph 
transformations. 
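
As a rough illustration of the "set of rules" idea in B, here is a
minimal sketch.  Everything in it is an assumption made up for the
example: the Op tuple, the RULES table, and operation names such as
"int_add" are hypothetical and not part of PyPy.  A rule maps a
generic operation plus the types inferred for its arguments to a
specialized operation:

```python
from collections import namedtuple

# Hypothetical flow graph operation: a name, argument variables, a result.
Op = namedtuple("Op", "opname args result")

# (generic operation, inferred argument types) -> specialized operation
RULES = {
    ("add", (int, int)): "int_add",
    ("getitem", (list, int)): "list_getitem",
}

def specialize(ops, types):
    """Rewrite each operation for which a type-specific rule exists."""
    out = []
    for op in ops:
        argtypes = tuple(types.get(arg) for arg in op.args)
        newname = RULES.get((op.opname, argtypes), op.opname)
        out.append(op._replace(opname=newname))
    return out


ops = [Op("add", ("x", "y"), "z"), Op("add", ("z", "s"), "w")]
types = {"x": int, "y": int, "z": int, "s": str}
specialized = specialize(ops, types)
```

Here the first "add" matches the (int, int) rule and is renamed,
while the second has no matching rule and stays generic.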

> C. Optimized FlowGraph -> C code
> ================================
> 
> See genc.py.  This not-too-long piece of code translates a regular flow
> graph into C code, and it seems to work fine, mostly.  There are a few
> missing RPython constructions (e.g. exception handling) that A will
> generate.  In parallel, the optimizations introduced in B will produce
> flow graphs with new kinds of operations whose support must then be added
> to genc.py.  So if you'd like to work on genc.py, people working on A and
> B will keep throwing at you new kinds of flow graphs and optimization-related
> data to support.

We also need to think about ways to test this.  Currently this is mostly
done by transparently compiling the generated C file and calling functions
in it to see if they produce the expected result.  However, for the
optimizations we should write tests in a more fine-grained way, feeding
them graphs and checking the resulting graphs.  Otherwise we will over time
get subtle errors and segmentation faults :-) 
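
A fine-grained test of this kind could look roughly like the sketch
below.  The graph representation (a flat list of Op tuples) and the
remove_dead_ops() pass are hypothetical stand-ins, chosen only to
keep the example self-contained; real flow graphs have blocks and
links.  The point is that the test feeds in a graph and compares the
resulting graph, without generating or compiling any C:

```python
from collections import namedtuple

# Hypothetical flow graph operation: a name, argument variables, a result.
Op = namedtuple("Op", "opname args result")

def remove_dead_ops(ops, returnvar):
    """Drop operations whose result is never used.

    Assumes, for the sake of the example, that no operation has side
    effects.
    """
    live = {returnvar}
    kept = []
    for op in reversed(ops):
        if op.result in live:
            kept.append(op)
            live.update(op.args)
    kept.reverse()
    return kept

def test_remove_dead_ops():
    graph = [
        Op("add", ("x", "y"), "a"),
        Op("mul", ("x", "x"), "unused"),
        Op("neg", ("a",), "b"),
    ]
    result = remove_dead_ops(graph, returnvar="b")
    # Check the resulting graph directly instead of running compiled code.
    assert [op.opname for op in result] == ["add", "neg"]
    assert result[0] == graph[0]
```

A failure here points at one transformation, rather than showing up
later as a crash in the compiled C.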

Thanks for the good description already.  I have started a wiki page 

    http://codespeak.net/moin/pypy/moin.cgi/VilniusSprintTasks

where we should try to put the basic tasks, the more
fine-grained the better.  As soon as I know the exact sprint dates 
we should send out a real sprint announcement and finish the 
web pages for it. 

cheers, 

    holger