[pypy-dev] Re: Project suggestions

Michael Hudson mwh at python.net
Tue Sep 27 15:06:19 CEST 2005


Aurélien Campéas <aurelien.campeas at free.fr> writes:

> Michael Hudson wrote:
>> Aurélien Campéas <aurelien.campeas at free.fr> writes:
>> 
>>>Hello, pypy people,
>>>
>>>Armin Rigo wrote:
>>>
>>>>Hi Boria,
>>>>We are indeed starting to think about more focused research areas in
>>>>PyPy.  For example, along these lines, we will need more work on
>>>>compiler optimizations and generation of code for various architectures,
>>>>either statically or just-in-time.  More along the lines of
>>>>interpreters and virtual machines, we could start investigating new
>>>>aspects that would be useful to code into the interpreter or the
>>>>translation process: continuations, persistence (either dumping a
>>>>whole-process image or something more fine-grained), security (running
>>>>code in a sandbox), and much more, all of which is hinted at in some
>>>>documentation on Codespeak.  Finally, there is also the idea of
>>>>supporting dynamic languages other than Python by writing an interpreter
>>>>for them in PyPy.
>>>
>>>I have been looking into pypy for a few days, and trying to understand
>>>how to make the Lisp backend work.
>> Cool!
>> 
>>>I now understand that, due to PyPy's heavily layered nature, it is
>>>possible to use a language like Lisp as a target at different levels:
>>>for instance, as a bytecode interpreter (by providing a proper set of
>>>functions to handle the bytecodes?), or as a compiler (by translating
>>>the function graph into low-level Lisp). Right now, I am sticking with
>>>the second approach, as it was started in gencl.py.
>>>
>>>(BTW, one question that is not clear to me is about the function
>>>graph: does it contain Python or RPython opcodes?)
>> Um, neither.  It roughly goes through three potential stages:
>> unannotated SpaceOperations, annotated SpaceOperations and finally
>> LowLevelOperations.  I think a lisp translator would probably want to
>> work with annotated SpaceOperations; LowLevelOperations would be too
>> low level (probably :).
>> 
>
> Ok, thanks :)
> But then, I still don't completely get the relation between Python,
> RPython, and the three potential stages of the function graph
> (probably I have to reread all the docs on Codespeak).

RPython is a subset of Python, the main constraint being that some
level of static type analysis must be possible (for full Python, the
amount of static analysis you can do is really very small; you can read
Brett Cannon's thesis about this:

http://www.ocf.berkeley.edu/~bac/thesis.pdf).
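
As a tiny illustration (my own, not taken from the PyPy sources) of the
kind of restriction this implies: the annotator needs each variable to
keep a single, inferable type.

    # RPython-friendly: 'i' and 'total' stay ints throughout, so the
    # annotator can infer a type for everything here.
    def triangle(n):
        total = 0
        i = 0
        while i < n:
            total += i
            i += 1
        return total

    # Not RPython: 'x' is an int on one path and a string on the other,
    # so no single static type can be assigned to it.
    def not_rpython(flag):
        if flag:
            x = 1
        else:
            x = "one"
        return x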

So, when translating, the code that implements the interpreter
(roughly interpreter/*.py, objspace/std/*.py) is imported and a flow
graph built from it.  This is then annotated (code in annotator/*.py),
a fairly hairy process.  This (for the C and LLVM backends, at least)
is then turned into a graph containing low level operations (like
"dereference this pointer").

Python is just the language all of the above happens to be implemented
in (and also the interpreter/*.py code is involved in making a flow
graph which includes itself, but this isn't that important -- just
confusing :).
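
To make the stages a bit more concrete, here is a toy sketch of what an
annotated operation might look like and what a Lisp backend could do
with it.  The class names mimic the real ones, but none of this is the
actual PyPy or gencl.py code; it is only meant to show why the
annotations matter to a backend.

    class Variable:
        def __init__(self, name, annotation):
            self.name = name              # e.g. "v0"
            self.annotation = annotation  # inferred by the annotator, e.g. "int"

    class SpaceOperation:
        def __init__(self, opname, args, result):
            self.opname = opname          # e.g. "add", "lt"
            self.args = args
            self.result = result

    def emit_lisp(operations):
        # Walk one block's operations and print Lisp-ish assignments.
        # With annotations the backend can pick a cheap Lisp operator;
        # without them it must fall back on a generic py:* helper.
        for op in operations:
            args = " ".join(a.name for a in op.args)
            if op.opname == "add" and all(a.annotation == "int" for a in op.args):
                print("(setq %s (+ %s))" % (op.result.name, args))
            else:
                print("(setq %s (py:%s %s))" % (op.result.name, op.opname, args))

    # A fragment like "c = a + b; d = c < n" might show up roughly as:
    a = Variable("a", "int"); b = Variable("b", "int"); n = Variable("n", "int")
    c = Variable("c", "int"); d = Variable("d", "bool")
    emit_lisp([SpaceOperation("add", [a, b], c),
               SpaceOperation("lt",  [c, n], d)])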

>>>In fact, the distance from de-sugared python to the opcode is
>>>unknown to me (even though I suspect there is no 1:1 mapping between
>>>the two). Whatever, that could be the start of a strategy to
>>>translate python towards other high-level languages (ruby, or js as
>>>in the other thread) without paying the full price of opcode
>>>interpretation in the target (that is : parts which are semantically
>>>similar could be easily translated, others would be -costily-
>>>emulated).
>> Thing is, I don't know how feasible this is.  It's pretty hard,
>> without some kind of type inference, to translate, say, this Python:
>>     a + b
>> into anything significantly more efficient than this Common Lisp:
>>     (py:add a b)
>
> The mere fact that it will be compiled will make it more efficient I
> guess. I mean, not on CLISP, but with a real lisp compiler.

Not much.  The "interpreter overhead" for Python is usually estimated
at about 20%, though it obviously depends to some extent on what code
you're running.
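
To see where the rest of the time goes: here, in rough and entirely
hypothetical Python (not how PyPy actually spells it), is the kind of
work a generic py:add helper would still have to do at runtime,
compiled or not:

    def py_add(a, b):
        # The type dispatch and the __add__/__radd__ fallback protocol
        # all still happen at runtime; compiling this removes none of it.
        # (Much simplified: the real rules are messier.)
        if isinstance(a, int) and isinstance(b, int):
            return a + b
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            return a + b                  # mixed int/float promotes to float
        if isinstance(a, str) and isinstance(b, str):
            return a + b                  # '+' also means concatenation
        result = NotImplemented
        if hasattr(a, "__add__"):
            result = a.__add__(b)
        if result is NotImplemented and hasattr(b, "__radd__"):
            result = b.__radd__(a)
        if result is NotImplemented:
            raise TypeError("unsupported operand types for +")
        return result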

>> And making type inference possible is what RPython is all about.
>
> Sure, but then, it is a restricted subset of Python, and I like python
> completely unrestricted ;)

Well, so do we all, but then you can't have type inference.  There is
no simple answer to this.

>> You could make #'py:add a generic function and see if a given CLOS
>> implementation is fast enough to give a useful speed (but I think the
>> coercion rules would probably drive you insane first).
>
> In this case, CLOS would add overhead. In fact, the Python add
> operator (and some arithmetic ops) seems close enough to the Lisp one
> that one could be tempted to translate it into the Lisp equivalent
> with only minor adaptation (like the printer appending an "L" to
> bignums, etc.).

Er... no.  #'cl:+ is not that much like python's + (e.g. the latter
operates on strings and you can't overload the former).
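
A couple of lines of Python (my own example, nothing PyPy-specific)
show the difference:

    print("ab" + "cd")        # '+' also concatenates; #'cl:+ signals an error

    class Frac(object):
        def __init__(self, n, d):
            self.n, self.d = n, d
        def __radd__(self, other):    # user code can hook itself into '+'
            return Frac(other * self.d + self.n, self.d)

    f = 1 + Frac(1, 2)        # dispatches to Frac.__radd__
    print((f.n, f.d))         # prints (3, 2)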

> I am not sure I would use CLOS at all, in fact (at least for a first
> attempt at producing a lisp backend).

Fair enough.

> BTW, what's this "insanity with coercion rules" that you mention -
> can you expand a little on this?

Think about things like 2L**135 < 1.0e40 or range(10)*3 or... mixed
type operations are not that simple in Python.
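
For instance, in a Python 2 session (just a quick illustration of the
kind of rules a backend has to reproduce exactly):

    >>> 2L**135 < 1.0e40      # comparing a long against a float
    False
    >>> len(range(10) * 3)    # '*' on a list means repetition, not arithmetic
    30
    >>> 1 + 2.0               # int + float silently promotes to float
    3.0
    >>> "3" + 4               # but str + int is an error, not a coercion
    Traceback (most recent call last):
      ...
    TypeError: cannot concatenate 'str' and 'int' objects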

Cheers,
mwh

-- 
  Considering that this thread is completely on-topic in the way only
  c.l.py threads can be, I think I can say that you should replace
  "Oblivion" with "Gravity", and increase your Radiohead quotient.
                                      -- Ben Wolfson, comp.lang.python



