[pypy-svn] r10801 - pypy/dist/pypy/documentation

arigo at codespeak.net
Sun Apr 17 23:41:35 CEST 2005


Author: arigo
Date: Sun Apr 17 23:41:34 2005
New Revision: 10801

Added:
   pypy/dist/pypy/documentation/translation.txt
      - copied, changed from r10793, pypy/dist/pypy/documentation/annotation.txt
Removed:
   pypy/dist/pypy/documentation/annotation.txt
   pypy/dist/pypy/documentation/basicblock.asc
Modified:
   pypy/dist/pypy/documentation/navlist
   pypy/dist/pypy/documentation/redirections
   pypy/dist/pypy/documentation/test_redirections.py
   pypy/dist/pypy/documentation/theory.txt
Log:
The last big txt file is translation.txt.  Draft for now.


Deleted: /pypy/dist/pypy/documentation/annotation.txt
==============================================================================
--- /pypy/dist/pypy/documentation/annotation.txt	Sun Apr 17 23:41:34 2005
+++ (empty file)
@@ -1,306 +0,0 @@
-The annotation pass
-===================
-
-(INCOMPLETE DRAFT)
-
-We describe below how a control flow graph can be "annotated" to 
-discover the types of the objects.  This annotation pass is a form of 
-type inference.  It is done after control flow graphs are built by the 
-FlowObjSpace, but before these graphs are translated into low-level code 
-(e.g. C/Lisp/Pyrex).
-
-
------
-Model
------
-
-The major goal of the annotator is to "annotate" each variable that 
-appears in a flow graph.  An "annotation" describes all the possible 
-Python objects that this variable could contain at run-time, based on a 
-whole-program analysis of all the flow graphs --- one per function.
-
-An "annotation" is an instance of ``SomeObject``.  There are subclasses 
-that are meant to represent specific families of objects.  Note that 
-these classes are all meant to be instantiated; the classes ``SomeXxx`` 
-themselves are not the annotations.
-
-Here is an overview (see ``pypy.annotation.model``):
-
-* ``SomeObject`` is the base class.  An instance ``SomeObject()`` 
-  represents any Python object.  It is used for the case where we don't 
-  have enough information to be more precise.  In practice, the presence 
-  of ``SomeObject()`` means that we have to make the annotated source code 
-  simpler or the annotator smarter.
-
-* ``SomeInteger()`` represents any integer.  
-  ``SomeInteger(nonneg=True)`` represents a non-negative integer (``>=0``).
-
-* ``SomeString()`` represents any string; ``SomeChar()`` a string of 
-  length 1.
-
-* ``SomeTuple([s1,s2,..,sn])`` represents a tuple of length ``n``.  The 
-  elements in this tuple are themselves constrained by the given list of 
-  annotations.  For example, ``SomeTuple([SomeInteger(), SomeString()])`` 
-  represents a tuple with two items: an integer and a string.
-
-There are more complex subclasses of ``SomeObject`` that we describe in 
-more detail below.
-
-All the ``SomeXxx`` instances can optionally have a ``const`` attribute, 
-which means that we know exactly which Python object the Variable will 
-contain.
-
-All the ``SomeXxx`` instances are supposed to be immutable.  The 
-annotator manages a dictionary mapping Variables (which appear in flow 
-graphs) to ``SomeXxx`` instances; if it needs to revise its belief about 
-what a Variable can contain, it does so by updating this dictionary, not 
-the ``SomeXxx`` instance.
-
-
----------
-Annotator
----------
-
-The annotator itself (``pypy.translator.annrpython``) works by 
-propagating the annotations forward in the flow graphs, starting at some 
-entry point function, possibly with explicitly provided annotations 
-about the entry point's input arguments.  It considers each operation in 
-the flow graph in turn.  Each operation takes a few input arguments 
-(Variables and Constants) and produces a single result (a Variable).  
-Depending on the input argument's annotations, an annotation about the 
-operation result is produced.  The exact rules to do this are provided 
-by the whole ``pypy.annotation`` subdirectory, which defines all the 
-cases in detail according to the R-Python semantics.  For example, if 
-the operation is 'v3=add(v1,v2)' and the Variables v1 and v2 are 
-annotated with ``SomeInteger()``, then v3 also receives the annotation 
-``SomeInteger()``.  So for example the function::
-
-    def f(n):
-        return n+1
-
-corresponds to the flow graph::
-
-    start ----------.
-                    |
-                    V 
-           +-------------------+
-           |  v2 = add(v1, 1)  |
-           +-------------------+
-                    |
-                    `---> return block
-
-If the annotator is told that v1 is ``SomeInteger()``, then it will 
-deduce that v2 (and hence the function's return value) is 
-``SomeInteger()``.
-
-This step-by-step annotation phase proceeds through all the operations 
-in a block, and then along the links between the blocks of the flow 
-graph.  If there are loops in the flow graph, then the links will close 
-back to already-seen blocks, as in::
-
-    def g(n):
-        i = 0
-        while n:
-            i = i + n
-            n = n - 1
-
-whose flow graph is::
-
-    start -----.           ,-----------------.
-               | n1 0      | m3 j3           |
-               V           v                 |
-           +-------------------+             |
-           |   input: n2 i2    |             |
-           |  v2 = is_true(n2) |             |
-           +-------------------+             |
-               |             |               |
-               |ifFalse      |ifTrue         |
-    return <---'             | n2 i2         |
-                             V               |
-                    +--------------------+   |
-                    |   input: n3 i3     |   |
-                    |  j3 = add(i3, n3)  |   |
-                    |  m3 = sub(n3, 1)   |---'
-                    +--------------------+
-
-Be sure to follow the variable renaming that occurs systematically 
-across each link in a flow graph.  In the above example the Variables 
-have been given names similar to the name of the original variables in 
-the source code (the FlowObjSpace tries to do this too) but keep in mind 
-that all Variables are different: n1, n2, i2, v2, n3, i3, j3, m3.
-
-Assume that we call the annotator with an input annotation of 
-``SomeInteger()`` for n1.  Following the links from the start, the 
-annotator will first believe that the Variable i2, whose value comes 
-from the constant 0 of the first link, must always be zero.  It will 
-thus use the annotation ``SomeInteger(const=0)`` for i2.  Then it will 
-propagate the annotations through both blocks, and find that v2 is 
-``SomeBool()`` and all other variables are ``SomeInteger()``.  In 
-particular, the annotation of j3 is different from the annotation of the 
-Variable i2 into which it is copied (via the back-link).  More 
-precisely, j3 is ``SomeInteger()`` but i2 is the more specific 
-``SomeInteger(const=0)``.  This means that the assumption that i2 must 
-always be zero is found to be wrong.  At this point, the annotation of 
-i2 is *generalized* to include both the existing and the new annotation.  
-(This is the purpose of ``pypy.annotation.model.unionof()``).  Then 
-these more general annotations must again be propagated forward.
-
-This process of successive generalizations continues until the 
-annotations stabilize.  In the above example, it is sufficient to 
-re-analyse the first block once, but in general it can take several 
-iterations to reach a fixpoint.  Annotations may also be propagated from 
-one flow graph to another and back repeatedly, across ``call`` 
-operations.  The overall model should ensure that this process 
-eventually terminates under reasonable conditions.  Note that as long as 
-the process is not finished, the annotations given to the Variables are 
-wrong, in the sense that they are too specific; at run-time, the 
-Variables will possibly contain Python objects outside the set defined 
-by the annotation, and the annotator doesn't know it yet.
-
-
-----------------------------------
-Description of the available types
-----------------------------------
-
-The reference and the details for the annotation model are found in the 
-module ``pypy.annotation.model``.  We describe below the issues related 
-to the various kinds of annotations.
-
-
-Simple Types
-++++++++++++
-
-``SomeInteger``, ``SomeBool``, ``SomeString``, ``SomeChar`` all stand 
-for the obvious corresponding sets of immutable Python objects.
-
-
-Tuples
-++++++
-
-``SomeTuple`` only considers tuples of known length.  We don't try to 
-handle tuples of varying length (the program should use lists instead).
-
-
-Lists and Dictionaries
-++++++++++++++++++++++
-
-``SomeList`` stands for a list of homogeneous type (i.e. all the elements 
-of the list are represented by a single common ``SomeXxx`` annotation).
-
-``SomeDict`` stands for a homogeneous dictionary (i.e. all keys have the 
-same ``SomeXxx`` annotation, and so have all values).
-
-These types are mutable, which requires special support for the 
-annotator.  The problem is that in code like::
-
-   lst = [42]
-   update_list(lst)
-   value = lst[0]
-
-the annotation given to ``value`` depends on the order in which the 
-annotator progresses.  As ``lst`` is originally considered as a list of 
-``SomeInteger(const=42)``, it is possible that ``value`` becomes 
-``SomeInteger(const=42)`` as well if the analysis of ``update_list()`` 
-is not completed by the time the third operation is first considered.  
-To solve this problem, each ``SomeList`` or ``SomeDict`` is linked to a 
-set of so-called *factories*.  Each creation point, i.e. each 'newlist' 
-or 'newdict' operation, gets its associated factory.  The factory 
-remembers what kind of object it really needs to build.  For example, in 
-code like::
-
-   lst = [42]
-   lst.append(43)
-
-the factory associated with the first line originally builds a list 
-whose items are all constants equal to 42; when the ``append(43)`` call 
-is then found, the factory is updated to build a more general list of 
-integers, and the annotator restarts its analysis from the factory 
-position.  Our model is not sensitive to timing: it doesn't know that 
-the same list object may contain different items at different times.  It 
-only computes how general the items in the list must be to cover all 
-cases.
-
-For initially empty lists, as created by ``lst = []``, we build a list 
-whose items have the annotation ``SomeImpossibleValue``.  This is an 
-annotation that denotes that no Python object at all can possibly appear 
-here at run-time.  It is the least general annotation.  The rationale is 
-that::
-
-   lst = []
-   oups = lst[0]
-
-will give the variable ``oups`` the annotation ``SomeImpossibleValue``, 
-which is reasonable given that no concrete Python object can ever be put 
-in ``oups`` at run-time.  In a more usual example::
-
-   lst = []
-   lst.append(42)
-
-the list is first built with ``SomeImpossibleValue`` items, and then the 
-factory is generalized to produce a list of ``SomeInteger(const=42)``.  
-With this "impossible object" trick we don't have to do anything special 
-about empty lists.
-
-
-User-defined Classes and Instances
-++++++++++++++++++++++++++++++++++
-
-``SomeInstance`` stands for an instance of the given class or any 
-subclass of it.  For each user-defined class seen by the annotator, we 
-maintain a ClassDef (``pypy.annotation.classdef``) describing the 
-attributes of the instances of the class; essentially, a ClassDef gives 
-the set of all class-level and instance-level attributes, and for each 
-one, a corresponding ``SomeXxx`` annotation.
-
-Instance-level attributes are discovered progressively as the annotation 
-progresses.  Assignments like::
-
-   inst.attr = value
-
-update the ClassDef of the given instance to record that the given 
-attribute exists and can be as general as the given value.
-
-For every attribute, the ClassDef also records all the positions where 
-the attribute is *read*.  If, at some later time, we discover an 
-assignment that forces the annotation about the attribute to be 
-generalized, then all the places that read the attribute so far are 
-marked as invalid and the annotator will have to restart its analysis 
-from there.
-
-The distinction between instance-level and class-level attributes is 
-thin; class-level attributes are essentially considered as initial 
-values for instance-level attributes.  Methods are not special in this 
-respect, except that they are bound to the instance (i.e. ``self = 
-SomeInstance(cls)``) when considered as the initial value for the 
-instance.
-
-The inheritance rules are as follows: the union of two ``SomeInstance`` 
-annotations is the ``SomeInstance`` of the most precise common base 
-class.  If an attribute is considered (i.e. read or written) through a 
-``SomeInstance`` of a parent class, then we assume that all subclasses 
-also have the same attribute, and that the same annotation applies to 
-them all (so code like ``return self.x`` in a method of a parent class 
-forces the parent class and all its subclasses to have an attribute 
-``x``, whose annotation is general enough to contain all the values that 
-all the subclasses might want to store in ``x``).  However, distinct 
-subclasses can have attributes of the same names with different, 
-unrelated annotations if they are not used in a general way through the 
-parent class.
-
-
-Prebuilt Constants
-++++++++++++++++++
-
-(to be completed)
-
-
-Built-in functions and methods
-++++++++++++++++++++++++++++++
-
-(to be completed)
-
-
-Others
-++++++
-
-(to be completed)

Deleted: /pypy/dist/pypy/documentation/basicblock.asc
==============================================================================
--- /pypy/dist/pypy/documentation/basicblock.asc	Sun Apr 17 23:41:34 2005
+++ (empty file)
@@ -1,36 +0,0 @@
-the CtlFlowObjSpace is supposed to produce the following object-model
-(currently just a linear list of objects referencing each other, sorry
-no UML diagram :-)
-
-the reason we want this objectmodel as the source for doing the
-translation is that otherwise we get very-hard-to-read low-level/pyrex
-code with lots of gotos etc.pp.
-
-With the CtlFlow-object model we can basically reconstruct some
-source-structure (without examining the bytecode).
-
-class BasicBlock:
- .inputargs = [list-of-input-locals]
- .locals = [list-of-all-locals-including-inputargs]
- .operations = [list-of-operations]
- .branch = <Branch or ConditionalBranch instance>
-
-class Variable:
-  pass
-
-class Constant:
-  .value = ...
-
-class SpaceOperation:
-  .opname = 'add'
-  .args = [list-of-variables]
-  .result = <Variable/Constant instance>
-
-class Branch:
-  .args = [list-of-variables]
-  .target = <BasicBlock instance>
-
-class ConditionalBranch:
-  .condition = <Variable instance>
-  .iftrue = <Branch instance>
-  .iffalse = <Branch instance>

Modified: pypy/dist/pypy/documentation/navlist
==============================================================================
--- pypy/dist/pypy/documentation/navlist	(original)
+++ pypy/dist/pypy/documentation/navlist	Sun Apr 17 23:41:34 2005
@@ -2,7 +2,8 @@
     'getting_started.html', 
     'architecture.html', 
     'coding-style.html', 
-    'objspace.html', 
+    'objspace.html',
+    'translation.html',
     'misc.html', 
     'theory.html', 
 ]

Modified: pypy/dist/pypy/documentation/redirections
==============================================================================
--- pypy/dist/pypy/documentation/redirections	(original)
+++ pypy/dist/pypy/documentation/redirections	Sun Apr 17 23:41:34 2005
@@ -19,6 +19,9 @@
     'goals.html'            : 'misc.html#goals', 
 
     'developers.html'       : 'misc.html#developers', 
-    'cmodules.html'         : 'misc.html#cmodules', 
+    'cmodules.html'         : 'misc.html#cmodules',
+
+    'annotation.html'       : 'translation.html#annotator',
+    'basicblock.asc'        : 'objspace.html#the-flow-model',
 }
 

Modified: pypy/dist/pypy/documentation/test_redirections.py
==============================================================================
--- pypy/dist/pypy/documentation/test_redirections.py	(original)
+++ pypy/dist/pypy/documentation/test_redirections.py	Sun Apr 17 23:41:34 2005
@@ -15,4 +15,6 @@
         yield checkexist, redir.dirpath(newname) 
 
 def test_navlist(): 
-    assert eval(redir.dirpath('navlist').read())
+    navlist = eval(redir.dirpath('navlist').read())
+    for entry in navlist:
+        yield checkexist, redir.dirpath(entry)

Modified: pypy/dist/pypy/documentation/theory.txt
==============================================================================
--- pypy/dist/pypy/documentation/theory.txt	(original)
+++ pypy/dist/pypy/documentation/theory.txt	Sun Apr 17 23:41:34 2005
@@ -62,4 +62,4 @@
 .. _`quite general one`: http://codespeak.net/svn/pypy/dist/pypy/objspace/std/multimethod.py
 .. _StdObjSpace: objspace.html#the-standard-object-space
 .. _`short two-arguments-dispatching one`: http://codespeak.net/svn/pypy/dist/pypy/annotation/pairtype.py
-.. _annotator: annotation.html
+.. _annotator: translation.html#annotator

Copied: pypy/dist/pypy/documentation/translation.txt (from r10793, pypy/dist/pypy/documentation/annotation.txt)
==============================================================================
--- pypy/dist/pypy/documentation/annotation.txt	(original)
+++ pypy/dist/pypy/documentation/translation.txt	Sun Apr 17 23:41:34 2005
@@ -1,3 +1,51 @@
+=====================
+    Translation
+=====================
+
+.. contents::
+.. sectnum::
+
+This document describes the tool chain that we developed to analyze and
+"compile" RPython_ programs (like PyPy itself) to various lower-level
+languages.
+
+.. _RPython: coding-style.html#restricted-python
+
+
+Overview
+========
+
+XXX very preliminary documentation!
+
+The module `translator.py`_ is the common entry point to the various parts
+of the translation process.  It is available as an interactive utility to
+`play around`_.
+
+Here are the steps we follow to translate a given program:
+
+1. The complete program is imported.  If needed, extra initialization is performed.  Once this is done, the program must be present in memory in a form that is "static enough" in the sense of RPython_.
+
+2. The `Flow Object Space`_ processes the input program, turning each function independently into a `control flow graph`_ data structure recording sequences of basic operations in "static single assignment" form.
+
+3. Optionally, the Annotator_ performs global type inference on the control flow graphs.  Each variable gets annotated with an inferred type.
+
+4. One of the Code Generators (XXX not documented yet) takes the optionally annotated flow graphs and produces a source file in a lower-level language: C_, LLVM_, `Common Lisp`_, Pyrex_, Java_, or `Python again`_ (this is used in PyPy to turn sufficiently RPythonic app-level code into interp-level code).
+
+.. _`translator.py`: http://codespeak.net/svn/pypy/dist/pypy/translator/translator.py
+.. _`play around`: getting_started.html#trying-out-the-translator
+.. _`Flow Object Space`: objspace.html#the-flow-object-space
+.. _`control flow graph`: objspace.html#the-flow-model
+.. _C: http://codespeak.net/svn/pypy/dist/pypy/translator/genc/
+.. _LLVM: http://codespeak.net/svn/pypy/dist/pypy/translator/llvm/
+.. _`Common Lisp`: http://codespeak.net/svn/pypy/dist/pypy/translator/gencl.py
+.. _Pyrex: http://codespeak.net/svn/pypy/dist/pypy/translator/genpyrex.py
+.. _Java: http://codespeak.net/svn/pypy/dist/pypy/translator/java/
+.. _`Python again`: http://codespeak.net/svn/pypy/dist/pypy/translator/geninterplevel.py
+
+
+
+.. _Annotator:
+
 The annotation pass
 ===================
 
@@ -10,9 +58,8 @@
 (e.g. C/Lisp/Pyrex).
 
 
------
 Model
------
+------------------------
 
 The major goal of the annotator is to "annotate" each variable that 
 appears in a flow graph.  An "annotation" describes all the possible 
@@ -57,9 +104,8 @@
 the ``SomeXxx`` instance.
 
 
----------
 Annotator
----------
+--------------------------
 
 The annotator itself (``pypy.translator.annrpython``) works by 
 propagating the annotations forward in the flow graphs, starting at some 
@@ -158,9 +204,8 @@
 by the annotation, and the annotator doesn't know it yet.
 
 
-----------------------------------
 Description of the available types
-----------------------------------
+-----------------------------------------------
 
 The reference and the details for the annotation model are found in the 
 module ``pypy.annotation.model``.  We describe below the issues related 
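
The generalization loop described in the annotator text (the ``g(n)`` example, where i2 starts as ``SomeInteger(const=0)`` and is then widened) can be sketched roughly as follows.  This is only a toy model, not the code in ``pypy.annotation.model``: the class names are borrowed from the text, but their definitions here, the ``add`` rule and the ``annotate_g`` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


# Toy stand-ins for the annotation classes named in the text.
# These definitions are hypothetical, not PyPy's real ones.
@dataclass(frozen=True)
class SomeObject:
    """Any Python object at all (the most general annotation)."""


@dataclass(frozen=True)
class SomeImpossibleValue(SomeObject):
    """No object at all (the least general annotation)."""


@dataclass(frozen=True)
class SomeInteger(SomeObject):
    const: Optional[int] = None   # None means "not a known constant"


def unionof(a, b):
    """Return the least general toy annotation covering both a and b."""
    if a == b:
        return a
    if isinstance(a, SomeImpossibleValue):
        return b
    if isinstance(b, SomeImpossibleValue):
        return a
    if isinstance(a, SomeInteger) and isinstance(b, SomeInteger):
        return SomeInteger()      # the constants differ, so forget them
    return SomeObject()


def add(a, b):
    """Toy rule for 'add' and 'sub': integer op integer gives an integer."""
    if isinstance(a, SomeInteger) and isinstance(b, SomeInteger):
        return SomeInteger()
    return SomeObject()


def annotate_g(n1):
    """Propagate annotations around the loop of g(n) until they stabilize.

    Returns the final annotations of the loop-header inputs (n2, i2)
    and the number of passes over the loop that were needed.
    """
    n2, i2 = n1, SomeInteger(const=0)      # first link: n1 and constant 0
    passes = 0
    while True:
        passes += 1
        j3 = add(i2, n2)                   # j3 = add(i3, n3)
        m3 = add(n2, SomeInteger(const=1)) # m3 = sub(n3, 1)
        # The back-link copies m3 into n2 and j3 into i2: generalize.
        new_n2, new_i2 = unionof(n2, m3), unionof(i2, j3)
        if (new_n2, new_i2) == (n2, i2):
            return {"n2": n2, "i2": i2}, passes
        n2, i2 = new_n2, new_i2


if __name__ == "__main__":
    print(annotate_g(SomeInteger()))
```

In this toy model the first pass widens i2 from ``SomeInteger(const=0)`` to ``SomeInteger()`` and a second pass confirms that nothing changes any more, which matches the remark that re-analysing the first block once is sufficient for this example.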


