[pypy-svn] r60585 - pypy/extradoc/talk/ecoop2009
antocuni at codespeak.net
Fri Dec 19 11:39:18 CET 2008
Author: antocuni
Date: Fri Dec 19 11:39:16 2008
New Revision: 60585
Modified:
pypy/extradoc/talk/ecoop2009/benchmarks.tex
pypy/extradoc/talk/ecoop2009/clibackend.tex
pypy/extradoc/talk/ecoop2009/jitgen.tex
pypy/extradoc/talk/ecoop2009/rainbow.tex
pypy/extradoc/talk/ecoop2009/tlc.tex
Log:
minor fixes, some comments
Modified: pypy/extradoc/talk/ecoop2009/benchmarks.tex
==============================================================================
--- pypy/extradoc/talk/ecoop2009/benchmarks.tex (original)
+++ pypy/extradoc/talk/ecoop2009/benchmarks.tex Fri Dec 19 11:39:16 2008
@@ -1,8 +1,11 @@
\section{Benchmarks}
-In section \ref{sec:tlc-features}, we saw that TLC provides most of the
+\anto{We should say somewhere that flexswitches are slow but benchmarks are so
+ good because they are not involved in the inner loops}
+
+In section \ref{sec:tlc-properties}, we saw that TLC provides most of the
features that usually make dynamically typed languages slow, such as
-\emph{stack-based VM}, \emph{boxed arithmetic} and \emph{dynamic lookup} of
+a \emph{stack-based interpreter}, \emph{boxed arithmetic} and \emph{dynamic lookup} of
methods and attributes.
In the following sections, we present some benchmarks that show how our
@@ -12,9 +15,9 @@
\begin{enumerate}
\item By plain interpretation, without any jitting.
-\item With the jit enabled: this run includes the time spent by doing the
+\item With the JIT enabled: this run includes the time spent by doing the
compilation itself, plus the time spent by running the produced code.
-\item Again with the jit enabled, but this time the compilation has already
+\item Again with the JIT enabled, but this time the compilation has already
been done, so we are actually measuring how good the code we produced is.
\end{enumerate}
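The three measurement modes above can be sketched as a small timing harness. This is an illustrative sketch only; the names \lstinline{interp_run}, \lstinline{jit_compile_and_run} and \lstinline{jit_run_compiled} are hypothetical placeholders, not the paper's actual benchmark driver:

```python
import time

def measure(interp_run, jit_compile_and_run, jit_run_compiled):
    """Return the three timings described above (hypothetical harness)."""
    t0 = time.perf_counter()
    interp_run()                     # 1. plain interpretation, no jitting
    interp_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    jit_compile_and_run()            # 2. JIT: compilation time + run time
    cold_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    jit_run_compiled()               # 3. JIT warm: compiled code only
    warm_time = time.perf_counter() - t0

    return interp_time, cold_time, warm_time
```

Comparing the second and third timings isolates the one-off cost of compilation from the quality of the produced code.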
@@ -50,7 +53,7 @@
much better. At the first iteration, the classes of the two operands of the
multiplication are promoted; then, the JIT compiler knows that both are
integers, so it can inline the code to compute the result. Moreover, it can
-\emph{virtualize} all the temporary objects, because they never escape from
+\emph{virtualize} (see section \ref{sec:virtuals}) all the temporary objects, because they never escape from
the inner loop. The same remarks apply to the other two operations inside
the loop.
@@ -182,7 +185,7 @@
The computation \emph{per se} is trivial, as it calculates either $-n$ or
$1+2+\dots+(n-1)$, depending on the sign of $n$. The interesting part is the
-polymorphic call to \lstinline{accumulate} inside the loop, because the VM has
+polymorphic call to \lstinline{accumulate} inside the loop, because the interpreter has
no way to know in advance which method to call (unless it does flow analysis,
which could be feasible in this case but not in general). The equivalent C\#
code we wrote uses two classes and a \lstinline{virtual} method call to
@@ -191,7 +194,7 @@
However, our generated JIT does not compile the whole function at
once. Instead, it compiles and executes code chunk by chunk, waiting until it
knows enough information to generate highly efficient code. In particular,
-at the time when it emits the code for the inner loop it exactly knows the
+at the time it emits the code for the inner loop it knows exactly the
type of \lstinline{obj}, thus it can remove the overhead of dynamic dispatch
and inline the method call. Moreover, since \lstinline{obj} never escapes the
function, it is \emph{virtualized} and its field \lstinline{value} is stored
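The structure of this benchmark can be rendered as a plain Python sketch. All class and function names here are hypothetical stand-ins; the paper's actual TLC bytecode and C# versions differ:

```python
class Sum:
    """Accumulates the running total: yields 1+2+...+(n-1)."""
    def __init__(self):
        self.value = 0
    def accumulate(self, i):
        self.value += i

class Count:
    """Counts iterations: yields -n for negative n."""
    def __init__(self):
        self.value = 0
    def accumulate(self, i):
        self.value += 1

def compute(n):
    # The class of obj depends on the sign of n, so the call to
    # accumulate() inside the loop is polymorphic: without flow
    # analysis, the interpreter cannot know statically which
    # method is invoked.
    obj = Count() if n < 0 else Sum()
    for i in range(abs(n)):
        obj.accumulate(i)
    return obj.value
```

Once the class of \lstinline{obj} has been promoted, the call can be inlined and \lstinline{obj} virtualized, so its \lstinline{value} field never touches the heap inside the loop.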
Modified: pypy/extradoc/talk/ecoop2009/clibackend.tex
==============================================================================
--- pypy/extradoc/talk/ecoop2009/clibackend.tex (original)
+++ pypy/extradoc/talk/ecoop2009/clibackend.tex Fri Dec 19 11:39:16 2008
@@ -52,17 +52,13 @@
Since in .NET methods are the basic units of compilation, a possible
solution consists in creating a new method
any time a new case has to be added to a flexswitch.
-\dacom{comment for Antonio: I am not sure this is the best solution. This cannot work for Java where classes are the basic
- units. Closures will be available only with Java Dolphin and I do
- not know how much efficient will be}
In this way, whereas flow graphs without flexswitches are translated
to a single method, the translation of flow graphs which can dynamically grow because of
flexswitches will be scattered over several methods.
Summarizing, the backend behaves in the following way:
\begin{itemize}
\item Each flow graph is translated in a collection of methods which
- can grow dynamically. \dacom{I propose primary/secondary instead of
- the overloaded terms main/child} Each collection contains at least one
+ can grow dynamically. Each collection contains at least one
method, called \emph{primary}, which is the first to be created.
All other methods, called \emph{secondary}, are added dynamically
whenever a new case is added to a flexswitch.
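The primary/secondary scheme can be illustrated by a minimal sketch, with plain Python callables standing in for the .NET methods generated by the backend (all names are hypothetical, and the real flexswitch fallback triggers compilation of a new case rather than returning a default):

```python
class FlexSwitch:
    """Sketch of a flexswitch that grows by registering new callables."""
    def __init__(self, fallback):
        self.cases = {}          # value -> callable (a secondary method)
        self.fallback = fallback # taken when no compiled case matches

    def add_case(self, value, fn):
        # Corresponds to compiling a new secondary method for `value`.
        self.cases[value] = fn

    def execute(self, value, *args):
        fn = self.cases.get(value, self.fallback)
        return fn(*args)
```

The collection starts with only the primary method; each \lstinline{add_case} call corresponds to dynamically emitting one more secondary method.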
@@ -71,12 +67,13 @@
number of blocks, all belonging to the same flow graph. Among these blocks
there always exists an initial block whose input variables are
parameters of the method; the input variables of all other blocks
- are local variables of the method.
+ are local variables of the method. \anto{This is wrong: the signature of the secondary methods is fixed, and input args are passed inside the InputArgs class, not as method parameters}
\end{itemize}
When a new case is added to a flexswitch, new blocks are generated
and translated by the backend in a new single method pointed
-by a delegate which is stored in the code implementing the flexswitch,
+by a delegate\footnote{\emph{Delegates} are the .NET equivalent of function pointers.}
+which is stored in the code implementing the flexswitch,
so that the method can be invoked later.
\subsubsection{Internal and external links}
@@ -88,7 +85,7 @@
Following an internal link is not difficult in IL bytecode: a jump to
the corresponding code fragment in the same method is emitted
to execute the new block, whereas the appropriate local variables are
-used for passing arguments.
+used for passing arguments. \anto{this is wrong for the same reason as above}
Also following an external link whose target is the initial block of a
method is not difficult: the corresponding method has to be invoked
with the appropriate arguments.
@@ -106,7 +103,7 @@
determine which block has to be executed.
This is done by passing to the method a 32-bit number, called
\emph{block id}, which uniquely identifies the next block of the graph to be executed.
-The high word of a block id is the id of the method to which the block
+The high word \anto{a word is 32 bits; block num and method id are 16 bits each} of a block id is the id of the method to which the block
belongs, whereas the low word is a progressive number uniquely identifying
each block implemented by the method.
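The encoding can be sketched as follows; this is a minimal illustration of the 16/16-bit split, not the backend's actual code:

```python
def make_blockid(methodid, blocknum):
    # High 16 bits: method id; low 16 bits: block number within it.
    assert 0 <= methodid < (1 << 16) and 0 <= blocknum < (1 << 16)
    return (methodid << 16) | blocknum

def split_blockid(blockid):
    # Recover (method id, block number) from a packed block id.
    return blockid >> 16, blockid & 0xFFFF
```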
@@ -145,7 +142,7 @@
If the next block to be executed is implemented in the same method
({\small\lstinline{methodid == MY_METHOD_ID}}), then the appropriate
jump to the corresponding code is executed, hence internal links
-can be managed efficiently.
+can be managed efficiently. \anto{wrong: internal links don't go through the dispatcher}
Otherwise, the \lstinline{jump_to_ext}
part of the dispatcher has to be executed.
The code that actually jumps to an external block is contained in
@@ -267,10 +264,10 @@
the link and jumps to the right block by performing a linear search in
array \lstinline{values}.
-Recall that the first argument of delegate \lstinline{FlexSwitchCase}
-is the block id to jump to; since the target of an external jump is
-always the initial block of the method, the first argument will be
-always 0.
+Recall that the first argument of delegate \lstinline{FlexSwitchCase} is the
+block id to jump to. By construction, the target block of a flexswitch is
+always the first in a secondary method, and we use the special value
+\lstinline{0} to signal this.
The value returned by method \lstinline{execute} is the next block id
to be executed;
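One dispatching step on a block id could then look like the following sketch. The names are hypothetical, and, as the annotation above points out, in the real backend internal links are compiled as direct jumps rather than going through the dispatcher:

```python
def dispatch_step(blockid, my_method_id, local_blocks, jump_to_ext):
    """Run a block identified by blockid; return the next block id."""
    methodid = blockid >> 16          # high word: owning method
    blocknum = blockid & 0xFFFF       # low word: block within the method
    if methodid == my_method_id:
        # The block lives in this method: execute it directly.
        return local_blocks[blocknum]()
    # External link: delegate to the jump_to_ext machinery.
    return jump_to_ext(blockid)
```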
Modified: pypy/extradoc/talk/ecoop2009/jitgen.tex
==============================================================================
--- pypy/extradoc/talk/ecoop2009/jitgen.tex (original)
+++ pypy/extradoc/talk/ecoop2009/jitgen.tex Fri Dec 19 11:39:16 2008
@@ -60,7 +60,7 @@
\label{fig:tlc-main}
\begin{center}
\input{tlc-simplified.py}
-\caption{The main loop of the TLC interpreter}
+\caption{The main loop of the TLC interpreter, written in RPython}
\end{center}
\end{figure}
@@ -194,7 +194,7 @@
The binding-time analyzer of our translation tool-chain is using a simple
abstract-interpretation based analysis. It is based on the
same type inference engine that is used on the source RPython program,
-the annotator. In this mode, it is called the \emph{hint-annotator}; it
+the annotator \anto{I'm not sure we should mention the annotator, as it is not referred to anywhere else}. In this mode, it is called the \emph{hint-annotator}; it
operates over input graphs that are already low-level instead of
RPython-level, and propagates annotations that do not track types but
value dependencies and manually-provided binding time hints.
Modified: pypy/extradoc/talk/ecoop2009/rainbow.tex
==============================================================================
--- pypy/extradoc/talk/ecoop2009/rainbow.tex (original)
+++ pypy/extradoc/talk/ecoop2009/rainbow.tex Fri Dec 19 11:39:16 2008
@@ -140,6 +140,7 @@
\section{Automatic Unboxing of Intermediate Results}
+\label{sec:virtuals}
XXX the following section needs a rewriting to be much more high-level and to
compare more directly with classical escape analysis
@@ -151,7 +152,7 @@
residual code as long as possible. The idea is to try to keep new
run-time structures "exploded": instead of a single run-time object allocated on
the heap, the object is "virtualized" as a set
-of fresh variables, one per field. Only when the object can be accessed by from
+of fresh local variables, one per field. Only when the object can be accessed from
somewhere else is it actually allocated on the heap. The effect of this is similar to that of
escape analysis \cite{XXX}, which also prevents allocations of objects that can
be proven to not escape a method or set of methods.
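The effect can be illustrated with a hypothetical before/after pair; \lstinline{IntBox} and both functions are invented for illustration and do not appear in the JIT:

```python
# Before virtualization: a fresh box is heap-allocated each iteration.
class IntBox:
    def __init__(self, value):
        self.value = value

def sum_boxed(n):
    acc = IntBox(0)
    for i in range(n):
        acc = IntBox(acc.value + i)   # temporary object, never escapes
    return acc.value

# After virtualization: the 'value' field is exploded into a plain
# local variable, so no allocation happens inside the loop.
def sum_virtualized(n):
    acc_value = 0
    for i in range(n):
        acc_value = acc_value + i
    return acc_value
```

Both compute the same result, but the second form is what the JIT effectively emits as long as the box does not escape.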
Modified: pypy/extradoc/talk/ecoop2009/tlc.tex
==============================================================================
--- pypy/extradoc/talk/ecoop2009/tlc.tex (original)
+++ pypy/extradoc/talk/ecoop2009/tlc.tex Fri Dec 19 11:39:16 2008
@@ -21,7 +21,8 @@
Objects represent a collection of named attributes (much like JavaScript or
Self) and named methods. At creation time, it is necessary to specify the set
of attributes of the object, as well as its methods. Once the object has been
-created, it is not possible to add/remove attributes and methods.
+created, it is possible to call methods and read or write attributes, but not
+to add or remove them.
The interpreter for the language is stack-based and uses bytecode to represent
the program. It provides the following bytecode instructions:
@@ -47,10 +48,8 @@
the VM needs to do all these checks at runtime; in case one of the checks
fails, the execution is simply aborted.
-\subsection{TLC features}
-\label{sec:tlc-features}
-\cfbolz{calling this sections "features" is a bit obscure, since it is more
-properties of the implementation}
+\subsection{TLC properties}
+\label{sec:tlc-properties}
Despite being very simple and minimalistic, \lstinline{TLC} is a good
candidate as a language to test our JIT generator, as it has some of the