[pypy-svn] r63823 - pypy/extradoc/talk/icooolps2009

Wed Apr 8 13:25:25 CEST 2009

Author: cfbolz
Date: Wed Apr  8 13:25:22 2009
New Revision: 63823

Modified:
   pypy/extradoc/talk/icooolps2009/paper.tex
Log:
a number of fixes


Modified: pypy/extradoc/talk/icooolps2009/paper.tex
==============================================================================

--- pypy/extradoc/talk/icooolps2009/paper.tex	(original)
+++ pypy/extradoc/talk/icooolps2009/paper.tex	Wed Apr  8 13:25:22 2009
@@ -8,7 +8,7 @@
 \usepackage[utf8]{inputenc}
 
 \newboolean{showcomments}
-\setboolean{showcomments}{false}
+\setboolean{showcomments}{true}
 \ifthenelse{\boolean{showcomments}}
   {\newcommand{\nb}[2]{
     \fbox{\bfseries\sffamily\scriptsize#1}
@@ -208,7 +208,7 @@
 Tracing JITs are an idea initially explored by the Dynamo project
 \cite{bala_dynamo:transparent_2000} in the context of dynamic optimization of
 machine code at runtime. The techniques were then successfully applied to Java
-VMs \cite{gal_hotpathvm:effective_2006}. It also turned out that they are a
+VMs \cite{gal_hotpathvm:effective_2006, andreas_gal_incremental_2006}. It also turned out that they are a
 relatively simple way to implement a JIT compiler for a dynamic language
 \cite{mason_chang_efficient_2007}. The technique is now
 being used by both Mozilla's TraceMonkey JavaScript VM
@@ -223,10 +223,10 @@
 
 The basic approach of a tracing JIT is to only generate machine code for the hot
 code paths of commonly executed loops and to interpret the rest of the program.
-The code for those common loops however should be highly optimized, including
+The code for those common loops, however, is highly optimized, including
 aggressive inlining.
 
-Typically, programs executed by a tracing VMs goes through various phases:
+Typically, programs executed by a tracing VM go through various phases:
 \begin{itemize}
 \item Interpretation/profiling
 \item Tracing
@@ -310,8 +310,8 @@
     return result
 \end{verbatim}
 }
-\toon{next sentence is strange} To trace this, a bytecode form of these functions needs to be introduced that
-the tracer understands. The tracer interprets a bytecode that is an encoding of
+
+The tracer interprets these functions in a bytecode that is an encoding of
 the intermediate representation of PyPy's translation toolchain after type
 inference has been performed.
 When the profiler discovers
@@ -409,9 +409,13 @@
 \end{figure}
 
 An example is given in Figure \ref{fig:tlr-basic}. It shows the code of a very
-simple bytecode interpreter with 256 registers and an accumulator. The
+simple bytecode interpreter with 256 registers and an accumulator.  The
 \texttt{bytecode} argument is a string of bytes, all register and the
-accumulator are integers. A program for this interpreter that computes
+accumulator are integers.\footnote{The
+chain of \texttt{if}, \texttt{elif}, ... instructions that check the various
+opcodes is transformed into a \texttt{switch} statement by one of PyPy's
+optimizations. Python does not have a \texttt{switch} statement.}
+A program for this interpreter that computes
 the square of the accumulator is shown in Figure \ref{fig:square}. If the
 tracing interpreter traces the execution of the \texttt{DECR\_A} opcode (whose
 integer value is 7), the trace would look as in Figure \ref{fig:trace-normal}.
@@ -442,7 +446,7 @@
 relevant variables of the language interpreter with the help of a \emph{hint}.
 The tracing interpreter will then effectively add the values of these variables
 to the position key. This means that the loop will only be considered to be
-closed if these variables that are making up program counter at the language
+closed if these variables that are making up the program counter at the language
 interpreter level are the same a second time.  Loops found in this way are, by
 definition, user loops.
 
@@ -535,16 +539,16 @@
 language interpreter, it would still be an improvement if some of these operations could
 be removed.
 
-\toon{very difficult to read (actually so is the whole paragraph; rephrase)}
-The simple insight how to improve the situation is that most of the
-operations in the trace are actually concerned with manipulating the
-bytecode and the program counter. Those are stored in variables that are part of
-the position key (they are ``green''), that means that the tracer checks that they
-are some fixed value at the beginning of the loop (they may well change over the
-course of the loop, though). In the example the check
-would be that the \texttt{bytecode} variable is the bytecode string
-corresponding to the square function and that the \texttt{pc} variable is
-\texttt{4}. Therefore it is possible to constant-fold computations on them away,
+The simple insight for improving the situation is that most of the operations
+in the trace are actually concerned with manipulating the bytecode string and
+the program counter. Those are stored in variables that are ``green'' (i.e. they
+are part of the position key).  This means that the tracer checks that those
+variables have some fixed value at the beginning of the loop (they may well
+change over the course of the loop, though). In the example of Figure
+\ref{fig:trace-no-green-folding} the check would be that at the beginning of the
+trace the \texttt{pc} variable is \texttt{4} and the \texttt{bytecode} variable
+is the bytecode string corresponding to the square function. Therefore it is
+possible to constant-fold computations on them away,
 as long as the operations are side-effect free. Since strings are immutable in
 RPython, it is possible to constant-fold the \texttt{strgetitem} operation. The
 \texttt{int\_add} are additions of the green variable \texttt{pc} and a constant
@@ -595,9 +599,7 @@
 all. It is possible to choose when the language interpreter is translated to C
 whether the JIT should be built in or not. If the JIT is not enabled, all the
 hints that are possibly in the interpreter source are just ignored by the
-translation process. In this way, the result of the translation is identical to
-that when no hints were present in the interpreter at all. \toon{strange
-sentence}
+translation process.
 
 If the JIT is enabled, things are more interesting. At the moment the JIT can
 only be enabled when translating the interpreter to C, but we hope to lift that
@@ -729,13 +731,9 @@
 Python). 
 
 The results show that the tracing JIT speeds up the execution of this Python
-function significantly, even outperforming CPython. \sout{by a bit. The tracer needs to
-trace through quite a bit of dispatching machinery of the Python interpreter to
-achieve this, XXX.}
-\anto{
-To achieve this, the tracer traces through the whole Python dispatching
-machinery, automatically inlining only the relevant fast paths.
-}
+function significantly, even outperforming CPython. To achieve this, the tracer
+traces through the whole Python dispatching machinery, automatically inlining
+the relevant fast paths.
 
 \begin{figure}
 \label{fig:bench-example}