[pypy-svn] r17937 - pypy/dist/pypy/doc
arigo at codespeak.net
Wed Sep 28 11:55:01 CEST 2005
Author: arigo
Date: Wed Sep 28 11:54:52 2005
New Revision: 17937
Modified:
pypy/dist/pypy/doc/_ref.txt
pypy/dist/pypy/doc/draft-dynamic-language-translation.txt
pypy/dist/pypy/doc/theory.txt
Log:
* More work on the annotator section.
* Added a reference to Wikipedia in theory.txt.
Modified: pypy/dist/pypy/doc/_ref.txt
==============================================================================
--- pypy/dist/pypy/doc/_ref.txt (original)
+++ pypy/dist/pypy/doc/_ref.txt Wed Sep 28 11:54:52 2005
@@ -4,6 +4,7 @@
.. _`annotation/`:
.. _`pypy/annotation`: ../../pypy/annotation
.. _`annotation/binaryop.py`: ../../pypy/annotation/binaryop.py
+.. _`pypy/annotation/model.py`: ../../pypy/annotation/model.py
.. _`doc/`: ../../pypy/doc
.. _`doc/revreport/`: ../../pypy/doc/revreport
.. _`interpreter/`:
Modified: pypy/dist/pypy/doc/draft-dynamic-language-translation.txt
==============================================================================
--- pypy/dist/pypy/doc/draft-dynamic-language-translation.txt (original)
+++ pypy/dist/pypy/doc/draft-dynamic-language-translation.txt Wed Sep 28 11:54:52 2005
@@ -2,6 +2,9 @@
Compiling dynamic language implementations
============================================================
+.. contents::
+.. sectnum::
+
The analysis of dynamic languages
===============================================
@@ -494,6 +497,12 @@
constant-folded away, instead of having the various possible values of
``next_block`` be merged at the beginning of the loop.
+For more information see `The Interplevel Back-End`_ in the reference
+documentation.
+
+.. _`The Interplevel Back-End`: translation.html#the-interplevel-back-end
+
+
Annotator
---------------------------------
@@ -570,10 +579,122 @@
process necessarily terminates, as we will show in the sequel.
+Flow graph model
+~~~~~~~~~~~~~~~~
+
+For the purpose of the sequel, an informal description of the data model
+used to represent flow graphs will suffice (a `precise description`_ can
+be found in the reference documentation).
+
+The flow graphs are in Static Single Information (SSI) form, an
+extension of Static Single Assignment (SSA_): each variable is only used
+in exactly one basic block. All variables that are not dead at the end
+of a basic block are explicitly carried over to the next block and
+renamed. Instead of the traditional phi functions of SSA we use a minor
+variant, parameter-passing style: each block declares a number of *input
+variables* playing the role of input arguments to the block; each link
+going out of a block carries a matching number of variables and
+constants from the previous block into the target block's input
+arguments.
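
The parameter-passing style described above can be illustrated with a minimal Python sketch. The ``Block`` and ``Link`` classes, the ``check_link`` helper and the variable names are simplified, hypothetical stand-ins chosen for illustration; they are not the actual flow model classes. The sketch builds the two blocks of a small counting loop and checks that every link carries as many values as its target block declares input variables.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Input variables play the role of the phi functions of SSA.
    inputargs: list
    operations: list = field(default_factory=list)
    exits: list = field(default_factory=list)  # outgoing Link objects


@dataclass
class Link:
    # Variables or constants of the source block, passed positionally
    # to the input variables of the target block.
    args: list
    target: Block


def check_link(link):
    """A link must carry exactly one value per input variable of its target."""
    return len(link.args) == len(link.target.inputargs)


# Hypothetical loop: the header receives (i1, n1); the body renames them
# to (i2, n2), computes i3 = add(i2, 1) and jumps back to the header.
header = Block(inputargs=["i1", "n1"])
body = Block(inputargs=["i2", "n2"])
body.operations.append(("i3", "add", ("i2", 1)))
header.exits.append(Link(args=["i1", "n1"], target=body))
body.exits.append(Link(args=["i3", "n2"], target=header))

all_links_ok = all(check_link(l) for l in header.exits + body.exits)
```

Note how no variable name is shared between the two blocks: ``n1`` is still live after the header, so it is passed along the link and renamed to ``n2``.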
+
+We use the following notation for an *operation* recorded in a block of
+the flow graph of a function::
+
+ z = opname(x_1, ..., x_n) | z'
+
+where *x_1, ..., x_n* are the arguments of the operation (either
+variables defined earlier in the block, or constants), *z* is the
+variable into which the result is stored (each operation introduces a
+new fresh variable as its result), and *z'* is a fresh extra variable
+which we will use in particular cases (which we omit from the notation
+when it is irrelevant).
+
+Let us assume that we are given a user program, which for the purpose of
+the model we assume to be fully known in advance. Let us define the set
+*V* of all variables as follows:
+
+* *V* contains all the variables that appear in operations, in any flow
+ graph of any function of the program, as described above;
+
+* in addition, for each class ``C`` of the user program and each
+ possible attribute name ``attr``, we add to *V* a variable called
+ *v_C.attr*.
+
+For a function ``f`` of the user program, we call *arg_f_1, ...,
+arg_f_n* the variables bound to the input arguments of ``f`` (which are
+actually the input variables of the first block in the flow graph of
+``f``) and *return_f* the variable bound to the return value of ``f``
+(which is the single input variable of a special empty "return" block
+ending the flow graph).
+
+Note that the complete knowledge of the operations and classes that
+appear in the user program allows us to bound the size of *V*. Indeed,
+the set of possible attribute names can be defined as all names that
+appear in a ``getattr`` or ``setattr`` operation; no other name will
+play a role during annotation.
+
+.. _`precise description`: objspace.html#the-flow-model
+.. _`SSA`: http://en.wikipedia.org/wiki/Static_single_assignment_form
+
+
Annotation model
~~~~~~~~~~~~~~~~
-::
+As in the `formal definition`_ of Abstract Interpretation, the model for
+our annotation forms a lattice_, although we only use its
+`join-semilattice`_ structure.
+
+The set *A* of annotations is defined as the following formal terms:
+
+* Bot, Top -- the minimum and maximum elements (corresponding to
+ "impossible value" and "most general value");
+
+* Int, NonNegInt, Bool -- integers, known-non-negative integers, booleans;
+
+* Str, Char -- strings, characters (which are strings of length 1);
+
+* Inst(*class*) -- instance of *class* or a subclass thereof (there is
+ one such term per *class*);
+
+* List(*v*) -- list; *v* is a variable summarizing the items of the list
+ (there is one such term per variable);
+
+* Callable(*set*) -- where the *set* is a subset of the (finite) set of
+ all functions, all classes, and all pairs of a class and a function
+ (written ``class.f``).
+
+* None -- stands for the singleton ``None`` object of Python.
+
+More details about the annotations will be introduced in due time. In
+addition, some of the annotations have a corresponding "nullable" twin,
+which stands for "either the object described or ``None``". We use it
+to propagate knowledge about which variable, after translation to C,
+could ever contain a NULL pointer. (More precisely, there are
+NullableStr, nullable instances, and nullable callables; all lists
+are implicitly assumed to be nullable.)
+
+Each annotation corresponds to a family of run-time Python objects; the
+ordering of the lattice is essentially the subset order. Formally, it
+is the partial order generated by:
+
+* Bot <= a <= Top -- for any annotation *a*;
+
+* Bool <= NonNegInt <= Int;
+
+* Char <= Str;
+
+* Inst(*subclass*) <= Inst(*class*) -- for any class and subclass;
+
+* Callable(*subset*) <= Callable(*set*);
+
+* a <= b -- for any annotation *a* with a nullable twin *b*;
+
+* None <= b -- for any nullable annotation *b*.
+
+It is left as an exercise to show that this partial order makes *A* a
+lattice.
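
A small part of this partial order can be checked mechanically. The sketch below is only an illustration under simplifying assumptions: it covers the atomic annotations (no Inst, List or Callable terms), represents them as plain strings, and the names ``ATOMS``, ``GENERATORS``, ``build_order`` and ``join`` are hypothetical. It closes the generating pairs under reflexivity and transitivity, then computes the join of two annotations as their least upper bound.

```python
ATOMS = ["Bot", "Bool", "NonNegInt", "Int", "Char", "Str",
         "NullableStr", "None", "Top"]

# Generating pairs (a, b) meaning a <= b, taken from the list above,
# restricted to the atomic annotations.
GENERATORS = [
    ("Bool", "NonNegInt"), ("NonNegInt", "Int"),
    ("Char", "Str"),
    ("Str", "NullableStr"),    # an annotation and its nullable twin
    ("None", "NullableStr"),   # None is below every nullable annotation
]


def build_order():
    """Reflexive-transitive closure of the generating pairs."""
    leq = {(a, a) for a in ATOMS}
    leq |= {("Bot", a) for a in ATOMS}
    leq |= {(a, "Top") for a in ATOMS}
    leq |= set(GENERATORS)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(leq):
            for (c, d) in list(leq):
                if b == c and (a, d) not in leq:
                    leq.add((a, d))
                    changed = True
    return leq


LEQ = build_order()


def join(a, b):
    """Least upper bound of a and b; fails if it is not unique."""
    uppers = [c for c in ATOMS if (a, c) in LEQ and (b, c) in LEQ]
    least = [c for c in uppers if all((c, d) in LEQ for d in uppers)]
    assert len(least) == 1, "not a join-semilattice"
    return least[0]
```

For instance, ``join("Char", "None")`` is ``"NullableStr"``, while ``join("Int", "Str")`` can only be ``"Top"``.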
+
+Graphically::
____________ Top ___________
/ / | \ \
@@ -582,14 +703,16 @@
/ NullableStr | | |
Int / \ | (lists) |
/ Str \ (instances) | (pbcs)
- NonNegInt \ \ \ | |
- \ Char \ \ / /
- Bool \ \ \ / /
- \ \ `----- None -----'
- \ \ /
- \ \ /
- `--------`-- Bottom
+ NonNegInt \ \ | | |
+ \ Char \ |\ / /
+ Bool \ \ | \ / /
+ \ \ `----- None -----/
+ \ \ | / /
+ \ \ | / /
+ `--------`-- Bottom ------'
+Here is the part about instances and nullable instances, assuming a
+simple class hierarchy with only two direct subclasses of ``object``::
Top
|
@@ -614,7 +737,7 @@
\ / /
Bottom
-
+All list terms for all variables are unordered::
__________________ Top __________________
/ / / \ \ \
@@ -626,22 +749,27 @@
\ \ \ / / /
'------------'--- None ----'------------'
+The callables form a classical finite set-of-subsets lattice. In
+practice, we consider ``None`` as a degenerate callable, so the None
+annotation is actually Callable({None}).
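
As an illustration of the set-of-subsets structure, the following sketch models a Callable annotation as a Python ``frozenset`` of names. This encoding and the helper names ``callable_leq`` and ``callable_join`` are assumptions made for the example only: the order is set inclusion and the join is set union.

```python
def callable_leq(a, b):
    """Callable(a) <= Callable(b) exactly when a is a subset of b."""
    return a <= b


def callable_join(a, b):
    """The least upper bound of two Callable annotations is the union."""
    return a | b


# Hypothetical members: two functions and the degenerate callable None.
only_f = frozenset({"f"})
only_g = frozenset({"g"})
none_ann = frozenset({None})

f_or_g = callable_join(only_f, only_g)
f_or_none = callable_join(only_f, none_ann)
```

Here ``f_or_none`` plays the role of a nullable callable: it stands for "either ``f`` or ``None``".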
+
+We should mention (but ignore for the sequel) that all annotations also
+have a variant where they stand for a single known object; this
+information is used in constant propagation. In addition, we have left
+out a number of other annotations that are irrelevant for the basic
+description of the annotator, and straightforward to handle. The
+complete list is defined and documented in `pypy/annotation/model.py`_
+and described in more practical terms in `The Annotation Pass`_ in the
+reference documentation.
+.. _`The Annotation Pass`: translation.html#annotator
- Bot
- Top
-
- Int
-
- NonNegInt
+Draft
+~~~~~
- Bool
+::
- Str
-
- NullableStr
-
Char
Inst(class)
@@ -783,6 +911,22 @@
Classes and instances
~~~~~~~~~~~~~~~~~~~~~
+We assume that the classes in the user program are organized in a single
+inheritance tree rooted at the ``object`` base class. (Python supports
+multiple inheritance, but the annotator is limited to single inheritance
+plus simple mix-ins.)
+
+Remember that Python has no notion of classes declaring attributes and
+methods. Classes are merely hierarchical namespaces: an expression like
+``obj.attr`` means that the ``attr`` attribute is looked up in the class
+that ``obj`` is an instance of at run-time, and all parent classes (a
+``getattr`` operation). Expressions like ``obj.meth()`` that look like
+method calls are actually grouped as ``(obj.meth)()``: they correspond
+to two operations, a ``getattr`` followed by a ``call``.
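
The decomposition of a method call into two operations can be observed directly in Python. In this short sketch the classes ``Base`` and ``Derived`` and the method ``meth`` are made-up names for illustration; the attribute is found in a parent class, and the result of the ``getattr`` is an ordinary object that is called in a second step.

```python
class Base:
    def meth(self):
        return "hello"


class Derived(Base):
    pass  # defines nothing: 'meth' is found by looking in the parent class


obj = Derived()

bound = getattr(obj, "meth")  # first operation: the attribute lookup
result = bound()              # second operation: the call

same_as_direct_call = (result == obj.meth())
```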
+
+So it is down to the annotator to reconstruct a static structure for
+each class in the hierarchy XXX.
+
XXX
Termination
@@ -791,6 +935,22 @@
XXX termination + soundness + most-precise-fixpoint-ness + complexity
+The lattice is finite, although its size depends on the size of the
+program. The List part has the same size as *V*, and the Callable part
+is exponential in the number of callables. However, in this model a
+chain of annotations (where each one is larger than the previous) cannot
+be longer than::
+
+ max(5, number-of-callables + 3, depth-of-class-hierarchy + 3).
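
The bound above is plain arithmetic and can be evaluated for a given program. The function name ``max_chain_length`` and the sample figures in the comments are hypothetical; the two counts are taken as given inputs.

```python
def max_chain_length(number_of_callables, depth_of_class_hierarchy):
    """Upper bound on the length of an ascending chain of annotations."""
    return max(5, number_of_callables + 3, depth_of_class_hierarchy + 3)


# A tiny program: the constant 5 dominates.
small = max_chain_length(number_of_callables=1, depth_of_class_hierarchy=1)

# A program with many callables: the Callable part dominates, but the
# bound grows linearly even though that part of the lattice is
# exponential in size.
large = max_chain_length(number_of_callables=200, depth_of_class_hierarchy=6)
```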
+
+In the extended lattice used in practice it is more difficult to compute
+an upper bound. Such a bound exists -- some considerations can even
+show that a finite subset of the extended lattice suffices -- but it
+does not reflect any practical complexity considerations. It is simpler
+to prove that there is no infinite ascending chain, which is enough to
+guarantee termination.
+
+
Non-static aspects
~~~~~~~~~~~~~~~~~~
@@ -832,6 +992,9 @@
.. _architecture: architecture.html
.. _`Thunk Object Space`: objspace.html#the-thunk-object-space
.. _`abstract interpretation`: theory.html#abstract-interpretation
+.. _`formal definition`: http://en.wikipedia.org/wiki/Abstract_interpretation
+.. _lattice: http://en.wikipedia.org/wiki/Lattice_%28order%29
+.. _`join-semilattice`: http://en.wikipedia.org/wiki/Lattice_%28order%29
.. _`Flow Object Space`: objspace.html#the-flow-object-space
.. _`Standard Object Space`: objspace.html#the-standard-object-space
.. _Psyco: http://psyco.sourceforge.net/
Modified: pypy/dist/pypy/doc/theory.txt
==============================================================================
--- pypy/dist/pypy/doc/theory.txt (original)
+++ pypy/dist/pypy/doc/theory.txt Wed Sep 28 11:54:52 2005
@@ -32,8 +32,18 @@
In PyPy, the FlowObjSpace_ uses the abstract interpretation technique to generate a control flow graph of the functions of RPython_ programs.
+In its `more formal definition`_, Abstract Interpretation typically
+considers abstract objects that are organized in a lattice_: some of
+these objects are more (or less) abstract than others, in the sense that
+they represent less (or more) known information; to say that this forms
+a lattice essentially means that any two abstract objects have
+well-defined unions and intersections (which are again abstract
+objects).
+
.. _FlowObjSpace: objspace.html#the-flow-object-space
.. _RPython: coding-guide.html#restricted-python
+.. _`more formal definition`: http://en.wikipedia.org/wiki/Abstract_interpretation
+.. _lattice: http://en.wikipedia.org/wiki/Lattice_%28order%29
Multimethods