[pypy-svn] r48297 - pypy/extradoc/talk/roadshow-ibm

arigo at codespeak.net
Sun Nov 4 19:31:41 CET 2007


Author: arigo
Date: Sun Nov  4 19:31:39 2007
New Revision: 48297

Added:
   pypy/extradoc/talk/roadshow-ibm/flowgraph.png   (contents, props changed)
   pypy/extradoc/talk/roadshow-ibm/overview1.png   (contents, props changed)
Modified:
   pypy/extradoc/talk/roadshow-ibm/overview2.png
   pypy/extradoc/talk/roadshow-ibm/talk.txt
Log:
Review the slides.  Slidify pages.  Add a couple of diagrams.


Added: pypy/extradoc/talk/roadshow-ibm/flowgraph.png
==============================================================================
Binary file. No diff available.

Added: pypy/extradoc/talk/roadshow-ibm/overview1.png
==============================================================================
Binary file. No diff available.

Modified: pypy/extradoc/talk/roadshow-ibm/overview2.png
==============================================================================
Binary files. No diff available.

Modified: pypy/extradoc/talk/roadshow-ibm/talk.txt
==============================================================================
--- pypy/extradoc/talk/roadshow-ibm/talk.txt	(original)
+++ pypy/extradoc/talk/roadshow-ibm/talk.txt	Sun Nov  4 19:31:39 2007
@@ -7,36 +7,86 @@
 Automatic generation of VMs for dynamic languages - JIT included
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-:authors: Samuele Pedroni, Armin Rigo, Laura Creighton, Jacob Hallén
+.. raw:: html
+
+   <br>
+   <center>
+   <table border=0>
+   <tr><td>Samuele Pedroni</td><td>&nbsp;&nbsp;&nbsp;</td>
+       <td>Laura Creighton</td></tr>
+   <tr><td>Armin Rigo</td><td></td>
+       <td>Jacob Hallén</td></tr>
+   </table>
+   </center>
+
+
+PyPy
+==================
+
+.. raw:: html
 
+   <br>
+   <br>
+   <center>
+
+**PyPy is a tool-chain for constructing dynamic languages.**
+
+.. raw:: html
+
+   </center>
 
-What is PyPy
+Interpreters
 ==================
 
-PyPy is a tool-chain for constructing dynamic languages.
+...are good to implement dynamic languages:
+
+* Easy to write
+
+* Portable
 
-Interpreters in a relatively high-level language, without low-level
-details are the easiest, most evolvable and portable way to implement
-such languages.
-
-What PyPy does is setup enough infrastructure such that speed is regained
-and features requiring low-level manipulations are (re-)added as aspects
-without cluttering the interpreter.
+* Flexible and easy to evolve, if written in high-level language
+  (without low-level details)
+
+The PyPy Project
+==================
+
+We built enough infrastructure such that:
+
+* speed is regained
+
+* features requiring low-level manipulations are (re-)added as *aspects*
+
+* interpreters are kept simple and uncluttered
 
 Targets as different as C and the industry OO VMs (JVM, CLR) are supported.
 
-What is PyPy
-==============
+PyPy as a project
+===================
+
+We operate both as an open source with production usage aspirations and
+research project.
+
+We focus on the whole system.
+
+We want the tool-chain itself to be as simple as possible
+(but not simpler).
+
+Some of what we do is relatively straight-forward, some is challenging
+(generating dynamic compilers!).
+
+The Origin of PyPy
+=====================
 
 PyPy is a reaction to the frustrations, resource problems and
 duplicated efforts of how mainstream open-source languages (like
 Python) are implemented now.
 
-We want the tool-chain itself to be as simple as possible.
-
 
-Folk Wisdom about Interpreters for Dynamic Languages
+Folk Wisdom
 ====================================================
+
+...about interpreters for Dynamic Languages:
+
 * There are unavoidable tradeoffs between flexibility, maintainability,
   and speed
 
@@ -49,108 +99,175 @@
 
 * are relatively slow
 
-* are not very flexible:
+* are not very flexible
+
+* are harder to maintain than we would like them to be
+
+Not very flexible
+=================
 
 - Low-level decisions permeate the entire code base.
-- One cannot simply plug-in a new garbage collector, or threading model
-  when one desires to experiment.  
+- Not ideal to experiment - cannot simply plug-in a new garbage collector,
+  memory model, or threading model
 - Early decisions come back to haunt you.
 
-What this means in Practice (2)
-==================================
+Hard to maintain
+================
 
-.. XXX too big
+because they are traditionally written in low-level languages:
 
-* are harder to maintain than we would like them to be
+- the community generates experts in the dynamic language but
+  requires experts in C or C++ for its own maintenance
+- every time a new VM is needed, the language's community forks
+  (CPython - Jython - IronPython)
 
-- because they are traditionally written in low-level languages
-- the language community, which generates experts in the dynamic language,
-  can not use this expertise in its own maintenance.  Instead, expertise
-  in C, or C++ is usually needed.
-- every time a new VM is needed, there is a fork in the language community.
-  The maintainers of Jython and IronPython, for instance, are lost to the
-  C Python community.  They have enough to do just to keep up with C Python.
 
-PyPy as a project
-===================
+PyPy Approach
+=============================
 
-We operate both as an open source with production usage aspirations and
-research project.
+.. raw:: html
 
-We focus on the whole system.
+   <br>
 
-Some of what we do is relatively straight-forward, some is challenging
-(generating dynamic compilers!).
+.. image:: overview1.png
+   :align: center
 
+Translation
+==============================================
 
+.. raw:: html
 
-Translation: Going from interpreters to VMs
-==============================================
+   <br>
+
+Going from interpreters to VMs
+------------------------------
 
 In PyPy interpreters are written in RPython:
-a subset of Python amenable to static analysis.
-RPython itself still has garbage collection support
-and rich built-in types.
 
-The tool-chain implements good static compilation
+* A subset of Python amenable to static analysis
+
+* Still fully garbage collected
+
+* Rich built-in types
+
+RPython is still close to Python.
+
+Translation
+==============================================
+
+.. raw:: html
+
+   <br>
+
+Going from interpreters to VMs (2)
+----------------------------------
+
+The translation tool-chain implements good static compilation
 of RPython to multiple targets.
 
-It has pluggable backends, and implements so called
-translation aspects.
+It has pluggable backends, and inserts low-level details
+as needed (*translation aspects*).
 
 Translation details
 =======================
 
-- RPython translation starts from loaded and initialized RPython
-  code as python bytecode in a Python VM.
+.. raw:: html
+
+   <table border=0><tr><td>
 
-- PyPy uses abstract interpretation extensively: to construct flow-graphs,
-  for type inference, to gather information for some optimisations
+- First, load and initialize RPython code inside a normal Python VM
 
-- Obviously in the PE based generated Dynamic Compilers
+- RPython translation starts from the resulting "live" bytecode
 
-- Flow-graph transformation and rewriting is also used 
+- Unified "intermediate code" representation:
+  a forest of *Control Flow Graphs*
+
+.. raw:: html
+
+   </td><td>&nbsp;</td><td><img src="flowgraph.png"></td></tr></table>
+
+Translation details (2)
+=======================
+
+PyPy uses abstract interpretation extensively:
+
+- to construct Flow Graphs
+- for type inference
+- to gather info for some optimisations
+- for Partial Evaluation in the generated Dynamic Compilers...
+
+also uses Flow Graph transformation and rewriting.
 
 Representation choice
 ========================
 
-A complex part of RPython translation is choosing implementations
-and representations of its still rich built-in types that work for
-the target platforms (C/LLVM vs. OO VMs)
+A complex part of RPython translation:
+
+* RPython types are still rich
+
+* we have to choose implementations and representations
+  that work for the target platforms (C vs. OO VMs)
+
+Type Systems
+=========================
+
+We model the different targets through different type systems:
+
+- LL (low-level C-like targets): data and function pointers, structures,
+  arrays...
+
+- OO (object oriented targets): classes and instances
+  with inheritance and dispatching
+
+Type systems (2)
+===========================
 
-We model the classes of targets through different type systems:
+Translation:
 
-- low-level: data and function pointers, structures, ...
-- object oriented: classes, instances, inheritance and dispatching
+* starts from *RPython Flow Graphs*
+
+* turns them into *LL Flow Graphs* or *OO Flow Graphs*
+
+* which are then sent to the backends.
 
 Type systems and helpers
 ===========================
 
-We have emulation of the type systems that can run on top of CPython,
-we use them for testing but also for:
+We have emulation of the type systems that can run on top of CPython
+for testing but also for:
 
-- constructing and representing the static data that our approach involves
-  (we start from live objects) at translation time
+- constructing and representing the prebuilt data that our approach involves
+  (we start from live objects)
 
-- the implementation of built-in type for the targets require helper functions:
-  they are expressed using the emulations which our translation knows about
-  and can translate too
+- helper functions (e.g. implementations of RPython types)
+  use the emulations which our translation knows about too
 
 
 Translation aspects
 ========================
 
-- The interpreters that we write should be free of low-level details
-  (this is also required to target platforms as different as Posix/C
-  and the JVM/.NET)
+The interpreters in RPython are free of low-level details
+(as required to target platforms as different as Posix/C
+and the JVM/.NET).
 
 - Advanced features related to execution should not need wide-spread
   changes to the interpreters
 
-- The interpreters instead should use support offered and inserted by
+- Instead, the interpreters should use support from
   the translation framework
 
-Examples: GC and memory management, stack inspection and manipulation
+Translation aspects (2)
+========================
+
+Examples:
+
+- GC and memory management
+
+- memory layout
+
+- stack inspection and manipulation
+
+- unboxed integers as tagged pointers
 
 Implementation
 ==================
@@ -158,38 +275,52 @@
 - Translation aspects are implemented as transformation of low-level
   graphs
 
-- Calls to library/helper code written in RPython can be inserted
-  too which will also be analyzed and translated
+- Calls to library/helper code can be inserted too
+
+- The helper code is also written in RPython and analyzed and translated
 
 GC Framework
 ===============
 
-- RPython has been extended with allocation and address manipulation
-  primitives that can be used to express GC in RPython directly
+The LL Type System is extended with allocation and address manipulation
+primitives, used to express GC in RPython directly.
 
 - GCs are linked by substituting memory allocation operations with calls
   into them
 
-- Right now bookkeeping code to keep track of reference counting or
-  roots is inserted by the GC framework
+- Transformation inserts bookkeeping code, e.g. to keep track of roots
 
-- Inlining is used to eliminate call overhead for the fast paths of
-  allocation and barriers
+- Inline fast paths of allocation and barriers
 
 .. MMTk reference
 
 Stackless transformation
 =========================
 
-- One translation aspect transformation inserts support code
-  around calls such that the stack can be unwound and functions asked to
-  store and reflect their current activation frame state to the heap
+Inserts support code around calls such that the stack can be unwound.
+
+- Functions can store their current activation frame state to the heap
+
+- Chains of saved activation state can be resumed
+
+We have implemented coroutine switching using this.
+
+A special aspect
+==================================
+
+.. raw:: html
+
+   <br>
+   <br>
+   <center>
 
-- Chains of saved activation state can then be resumed
+**Generating JIT compilers**
 
-- We have implemented coroutine switching using this
+.. raw:: html
+
+   </center>
 
-A special aspect: JIT generation
+JIT motivation
 ==================================
 
 Flexibility vs. Performance:
@@ -277,8 +408,7 @@
 - a few hints in the Python interpreter to guide the JIT
   generator
 - *promotion*
-- lazy allocation of objects - only when they escape
-  ("virtuals")
+- lazy allocation of objects (only on escape)
 - use CPU stack and registers for the contents of the Python frame
 
 ..  ("virtualizables")
@@ -308,9 +438,24 @@
 
 .. demo f1
 
+
 EXTRA MATERIAL
 ==================
 
+* More about the JIT Generation:
+
+  - The *Timeshifting* transformation
+  - *Virtuals* and *Promotion*
+
+* More on the Stackless transformation
+
+  - *Resume points*
+
+* More on any other part that you are interested in
+
+* More demos
+
+
 The transformation
 ==================================
 
@@ -320,7 +465,7 @@
 
 * Guided by a binding time analysis ("color" of the graphs)
 
-*"timeshifting"*
+* *"timeshifting"*
 
 Coloring
 =================
@@ -479,6 +624,8 @@
 Resume points
 ===============
 
+Based on the Stackless Transformation:
+
 - this transformation can also insert code that allows to construct
   artificial chains of activation states corresponding to labeled points in the
   program


