[pypy-commit] pypy default: moved optimizer/overview.rst -> jit/optimizer.rst

plan_rich noreply at buildbot.pypy.org
Mon Mar 16 12:24:55 CET 2015


Author: Richard Plangger <rich at pasra.at>
Branch: 
Changeset: r76399:cea0639cfe6c
Date: 2015-03-16 11:48 +0100
http://bitbucket.org/pypy/pypy/changeset/cea0639cfe6c/

Log:	moved optimizer/overview.rst -> jit/optimizer.rst updated index.rst
	to link to it

diff --git a/rpython/doc/index.rst b/rpython/doc/index.rst
--- a/rpython/doc/index.rst
+++ b/rpython/doc/index.rst
@@ -63,7 +63,7 @@
    translation
    rtyper
    garbage_collection
-   optimizer/overview
+   jit/optimizer
 
 
 Indices and tables
diff --git a/rpython/doc/jit/optimizer.rst b/rpython/doc/jit/optimizer.rst
new file mode 100644
--- /dev/null
+++ b/rpython/doc/jit/optimizer.rst
@@ -0,0 +1,193 @@
+.. _trace_optimizer:
+
+Trace Optimizer
+===============
+
+Traces of user programs are not directly translated into machine code.
+The optimizer module implements several different semantics-preserving
+transformations that either allow operations to be removed from the trace
+or convert them into operations that need less time or space.
+
+When you try to make sense of this module, this page might get you started.
+
+Before some optimizations are explained in more detail, it is essential to
+understand what traces look like.
+The optimizer comes with a test suite. It contains many trace
+examples and you might want to take a look at it
+(in `rpython/jit/metainterp/optimizeopt/test/*.py`).
+The allowed operations can be found in `rpython/jit/metainterp/resoperation.py`.
+Here is an example of a trace::
+
+    [p0,i0,i1]
+    label(p0, i0, i1)
+    i2 = getarrayitem_raw(p0, i0, descr=<Array Signed>)
+    i3 = int_add(i1,i2)
+    i4 = int_add(i0,1)
+    i5 = int_le(i4, 100) # less equal
+    guard_true(i5)
+    jump(p0, i4, i3)
+
+At first it might be hard to read, but it makes sense once you compare it
+to the Python code that constructed the trace::
+
+    from array import array
+    a = array('i',range(101))
+    sum = 0; i = 0
+    while i <= 100: # can be seen as label
+        sum += a[i]
+        i += 1
+        # jumps back to the while header
+
+There are better ways to compute the sum of ``[0..100]``, but this version
+gives a better intuition of how traces are constructed than ``sum(range(101))``.
+Note that the trace syntax is the one used in the test suite. It is also very
+similar to a trace printed at runtime. The first line lists the ``input``
+arguments, the second operation is a ``label``, the last is the backwards ``jump``.
+
+These instructions mentioned earlier are special:
+
+* ``input`` defines the input parameter type and name to enter the trace.
+* ``label`` is the instruction a ``jump`` can target. Label instructions have
+  a ``JitCellToken`` associated that uniquely identifies the label. Every jump
+  targets the token of a label.
+
+The token is saved in a so-called `descriptor` of the instruction. It is
+not written explicitly above because the tests do not write it either. But
+the test suite creates a dummy token for each trace and adds it as descriptor
+to ``label`` and ``jump``. Of course the optimizer does the same at runtime,
+but using real values.
+The sample trace includes a descriptor in ``getarrayitem_raw``: here it
+annotates the type of the array, a signed integer array.
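+
+Written out explicitly, the tokens in the example trace could look roughly
+like this (a sketch; ``token`` stands for the dummy token object the test
+suite would create)::
+
+    [p0,i0,i1]
+    label(p0, i0, i1, descr=token)
+    ...
+    jump(p0, i4, i3, descr=token)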
+
+High level overview
+-------------------
+
+Before the JIT backend transforms any trace into machine code, it tries to
+transform the trace into an equivalent trace that executes faster. The method
+`optimize_trace` in `rpython/jit/metainterp/optimizeopt/__init__.py` is the
+main entry point.
+
+Optimizations are applied in sequence, one after another, and the base
+sequence is as follows::
+
+    intbounds:rewrite:virtualize:string:earlyforce:pure:heap:unroll
+
+Each of the colon-separated names has a class attached that is later
+instantiated as a subclass of `Optimization`. The `Optimizer` class itself
+derives from `Optimization` as well and implements the control logic for
+the optimization. Most of the optimizations only require a single forward pass.
+The trace is 'propagated' into each optimization using the method
+`propagate_forward`. Instruction by instruction, it then flows from the
+first optimization to the last one. The method `emit_operation`
+is called for every operation that is passed on to the next optimization.
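+
+The following toy sketch illustrates the idea of such a chain of forward
+passes (the classes below are simplified stand-ins for illustration, not
+PyPy's real implementation)::
+
+    class Optimization(object):
+        def __init__(self):
+            self.next = None          # the next optimization in the chain
+
+        def propagate_forward(self, op):
+            # default behaviour: do not transform the operation,
+            # just hand it on to the next optimization
+            self.emit_operation(op)
+
+        def emit_operation(self, op):
+            if self.next is not None:
+                self.next.propagate_forward(op)
+
+    def build_chain(optimizations):
+        # link each optimization to its successor
+        for opt, succ in zip(optimizations, optimizations[1:]):
+            opt.next = succ
+        return optimizations[0]
+
+    # 'operations' would be the operations of a trace
+    operations = []
+    first = build_chain([Optimization(), Optimization()])
+    for op in operations:
+        first.propagate_forward(op)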
+
+A frequently encountered pattern
+--------------------------------
+
+To find potential optimization targets it is necessary to know the instruction
+type. A simple solution is to switch on the operation number (= type)::
+
+    for op in operations:
+        if op.getopnum() == rop.INT_ADD:
+            # handle this instruction
+            pass
+        elif op.getopnum() == rop.INT_FLOOR_DIV:
+            pass
+        # and many more
+
+Things get worse if you start to match the arguments as well
+(is argument one constant and argument two a variable, or vice versa?). The
+pattern used to tackle this code bloat is to move each case into a separate
+method using `make_dispatcher_method`. It associates methods with instruction
+types::
+
+    class OptX(Optimization):
+        def prefix_INT_ADD(self, op):
+            pass # emit, transform, ...
+
+    dispatch_opt = make_dispatcher_method(OptX, 'prefix_',
+                                          default=OptX.emit_operation)
+    OptX.propagate_forward = dispatch_opt
+
+    optX = OptX()
+    for op in operations:
+        optX.propagate_forward(op)
+
+``propagate_forward`` searches for a method that is able to handle the
+instruction type. For example, ``INT_ADD`` will invoke `prefix_INT_ADD`. If
+there is no method for the instruction, it is routed to the default
+implementation (``emit_operation`` in this example).
+
+Rewrite optimization
+--------------------
+
+The second optimization is called 'rewrite' and is commonly also known as
+strength reduction. A simple example would be that an integer multiplied
+by 2 is equivalent to the bits shifted to the left once
+(e.g. ``x * 2 == x << 1``). This optimization performs not only strength
+reduction but also boolean and arithmetic simplifications. Other examples
+would be: ``x & 0 == 0``, ``x - 0 == x``.
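+
+In trace syntax such a rewrite could look like this (a schematic
+before/after, reusing the operation names from `resoperation.py`)::
+
+    i2 = int_mul(i1, 2)      # before the rewrite
+    i2 = int_lshift(i1, 1)   # after the rewrite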
+
+Whenever such an operation is encountered (e.g. ``x & 0``), no operation is
+emitted. Instead the result of the operation is made equal to 0
+(= ``make_equal_to(op.result, 0)``). The variables found in a trace are
+instances of Box classes that can be found in
+`rpython/jit/metainterp/history.py`. `OptValue` wraps those variables again
+and maps the boxes to the optimization values in the optimizer. When a
+value is made equal to another, the box in the optimization value is
+replaced, so that any further access yields the new value.
+As a result the optimizer must provide the means to access the ``Box``
+instances. The instance method `make_args_key` returns the boxed value.
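+
+A sketch of how such a rewrite method might look, following the dispatcher
+pattern shown earlier (the body is illustrative and only checks one argument;
+the real code lives in `rpython/jit/metainterp/optimizeopt/rewrite.py`)::
+
+    class OptRewrite(Optimization):
+        def optimize_INT_AND(self, op):
+            v2 = self.getvalue(op.getarg(1))
+            if v2.is_constant() and v2.box.getint() == 0:
+                # x & 0 == 0: emit nothing, alias the result to the constant
+                self.make_equal_to(op.result, v2)
+            else:
+                self.emit_operation(op)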
+
+**NOTE: OptValue is likely to be replaced in the near future.**
+
+Pure optimization
+-----------------
+
+This optimization is interwoven into the basic optimizer. It saves operations,
+together with their results and arguments, that are known to have pure
+semantics.
+
+A pure operation is free of side effects and referentially transparent
+(the operation can be replaced with its value without changing the program
+semantics). The operations marked as ALWAYS_PURE in `resoperation.py` are a
+subset of the SIDEEFFECT-free operations. Operations such as new, new array,
+getfield_(raw/gc) are marked SIDEEFFECT-free but not ALWAYS_PURE.
+
+This can be seen as a memoization technique. Once an operation is proven to
+be 'pure' it is saved and need not be recomputed later.
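+
+For example, if the same pure operation occurs twice with the same arguments,
+the second occurrence does not need to be emitted (schematically)::
+
+    i3 = int_add(i1, i2)
+    ...
+    i7 = int_add(i1, i2)   # not emitted; i7 is made equal to i3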
+
+Unroll optimization
+-------------------
+
+A detailed description can be found in the document
+`Loop-Aware Optimizations in PyPy's Tracing JIT`__.
+
+.. __: http://www2.maths.lth.se/matematiklth/vision/publdb/reports/pdf/ardo-bolz-etal-dls-12.pdf
+
+This optimization does not fall into the traditional scheme of a single
+forward pass. In a nutshell it unrolls the trace *once*, connects the two
+traces (by inserting parameters into the jump and label of the peeled trace)
+and uses the gathered information to remove allocations, propagate constants
+and apply any other optimization currently present in the 'optimizeopt' module.
+
+It is prepended to all other optimizations and thus extends the Optimizer
+class; it unrolls the loop once before the other optimizations proceed.
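+
+Schematically, the unrolled version of the example trace at the top of this
+page is split into a preamble and a peeled loop (variable numbering and the
+token names are illustrative)::
+
+    [p0,i0,i1]
+    label(p0, i0, i1, descr=token1)        # preamble
+    ...
+    label(p0, i4, i3, descr=token2)        # peeled loop
+    ...
+    jump(p0, i8, i7, descr=token2)         # jumps back into the peeled loop
+
+Most of the optimization effort is spent on the peeled loop, since it is the
+part that is assumed to run many times.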
+
+
+What is missing?
+----------------
+
+* Guards are not explained
+* Several optimizations
+
+
+Further references
+------------------
+
+* `Allocation Removal by Partial Evaluation in a Tracing JIT`__
+* `Loop-Aware Optimizations in PyPy's Tracing JIT`__
+
+.. __: http://www.stups.uni-duesseldorf.de/mediawiki/images/b/b0/Pub-BoCuFiLePeRi2011.pdf
+.. __: http://www2.maths.lth.se/matematiklth/vision/publdb/reports/pdf/ardo-bolz-etal-dls-12.pdf
+
+
+