[pypy-commit] extradoc extradoc: merge heads

antocuni noreply at buildbot.pypy.org
Tue Jul 10 11:16:36 CEST 2012

Author: Antonio Cuni <anto.cuni at gmail.com>
Branch: extradoc
Changeset: r4289:e99feb284e66
Date: 2012-07-10 11:16 +0200

Log:	merge heads

diff --git a/.hgignore b/.hgignore
--- a/.hgignore
+++ b/.hgignore
@@ -1,3 +1,11 @@
 syntax: glob
\ No newline at end of file
diff --git a/blog/draft/plans-for-2-years.rst b/blog/draft/plans-for-2-years.rst
new file mode 100644
--- /dev/null
+++ b/blog/draft/plans-for-2-years.rst
@@ -0,0 +1,73 @@
+What we'll be busy with for the foreseeable future
+The PyPy development process has been described as too opaque. In this blog post
+we try to highlight a few projects that are being worked on or are planned for the
+near future. As usual with such lists, don't expect any deadlines;
+it's more "a lot of work that will keep us busy". It also answers the question of
+whether PyPy has already reached its maximum possible performance.
+Here is the list of areas, mostly with open branches. Note that the list is
+not exhaustive - in fact it does not contain all the areas that are covered
+by funding, notably numpy, STM and py3k.
+Iterating in RPython
+Right now code that has a loop in RPython can be surprised by receiving
+an iterable it does not expect. This ends up doing an unnecessary copy
+(or two or three in corner cases), essentially forcing an iterator.
+An example of such code would be::
+  import itertools
+  ''.join(itertools.repeat('ss', 10000000))
+This takes 4s on PyPy and 0.4s on CPython. That's absolutely unacceptable :-)
+More optimized frames and generators
+Right now generator expressions and generators have to have full frames,
+instead of optimized ones like in the case of python functions. This leads
+to inefficiencies. There is a plan to improve the situation on the
+``continuelet-jit-2`` branch. ``-2`` in branch names means it's hard and
+has been already tried unsuccessfully :-)
+A bit by chance, it would also make stackless work with the JIT. Historically, though,
+the idea was to make stackless work with the JIT, and we later figured out that this
+could also be used for generators. Who would have thought :)
+This work should also improve the situation of uninlined functions.
+Dynamic specialized tuples and instances
+PyPy already uses maps. Read our `blog`_ `posts`_ for details. However,
+it's possible to go even further, by storing unboxed integers/floats
+directly into the instance storage instead of having pointers to python
+objects. This should improve memory efficiency and speed for the cases
+where your instances have integer or float fields.
+Tracing speed
+PyPy is probably one of the slowest compilers when it comes to warmup times.
+There is no open branch, but we're definitely thinking about the problem :-)
+Bridge optimizations
+Another "area of interest" is bridge generation. Right now generating a bridge
+from a compiled loop "forgets" some kind of optimization information from the
+GC pinning and I/O performance
+The ``minimark-gc-pinning`` branch tries to improve I/O performance.
+32bit on 64bit
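A minimal sketch (not part of the changeset) of how the ``''.join(itertools.repeat('ss', 10000000))`` example from the blog draft could be timed with the standard ``timeit`` module. It is written for Python 3; the helper name ``time_join`` and the choice of three repeats are assumptions made for illustration, and the 4s and .4s figures quoted in the draft are the authors' measurements, not something this sketch reproduces.

```python
import itertools
import timeit


def time_join(count, repeats=3):
    """Return the best wall-clock time, in seconds, of joining `count` copies of 'ss'."""
    def stmt():
        return ''.join(itertools.repeat('ss', count))
    # number=1: each measurement performs the join exactly once.
    return min(timeit.repeat(stmt, number=1, repeat=repeats))


if __name__ == '__main__':
    print('join took %.3fs' % time_join(10000000))
```

Running the same script under both interpreters gives a like-for-like comparison on one machine.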
diff --git a/talk/dls2012/licm.pdf b/talk/dls2012/licm.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..dd7d2286dbdb2201e2f9e266c9279ce9a9ba2a0d
GIT binary patch


diff --git a/talk/dls2012/paper.tex b/talk/dls2012/paper.tex
--- a/talk/dls2012/paper.tex
+++ b/talk/dls2012/paper.tex
@@ -124,6 +124,8 @@
 One of the nice properties of a tracing JIT is that many of its optimization
 are simple requiring one forward pass only. This is not true for loop-invariant code
 motion which is a very important optimization for code with tight kernels.
+It is especially important for dynamic languages, which typically perform quite a lot of loop invariant
+type checking, boxed value unwrapping and virtual method lookups.
 In this paper we present a scheme for making simple optimizations loop-aware by
 using a simple pre-processing step on the trace and not changing the
 optimizations themselves. The scheme can give performance improvements of a
@@ -141,13 +143,15 @@
-A dynamically typed language needs to do a lot of type
-checking and unwrapping. For tight computationally intensive loops a
+A dynamic language typically needs to do quite a lot of type
+checking, wrapping/unwrapping of boxed values, and virtual method dispatching. 
+For tight computationally intensive loops a
 significant amount of the execution time might be spend on such tasks
-instead of the actual calculations. Moreover, the type checking and
-unwrapping is often loop invariant and performance could be increased
-by moving those operations out of the loop. We propose to design a
-loop-aware tracing JIT to perform such optimization at run time.
+instead of the actual computations. Moreover, the type checking,
+unwrapping and method lookups are often loop invariant and performance could be increased
+by moving those operations out of the loop. We propose a simple scheme
+to make a tracing JIT loop-aware by allowing its existing optimizations to
+perform loop invariant code motion. 
 One of the advantages that tracing JIT compilers have above traditional
@@ -533,7 +537,7 @@
 Each operation in the trace is copied in order.
 To copy an operation $v=\text{op}\left(A_1, A_2, \cdots, A_{|A|}\right)$
-a new variable, $\hat v$ is introduced. The copied operation will
+a new variable, $\hat v$, is introduced. The copied operation will
 return $\hat v$ using
   \hat v = \text{op}\left(m\left(A_1\right), m\left(A_2\right), 
@@ -696,12 +700,12 @@
 By constructing a vector, $H$,  of such variables, the input and jump
 arguments can be updated using
-  \hat J = \left(J_1, J_2, \cdots, J_{|J|}, H_1, H_2, \cdots, H_{|H}\right)
+  \hat J = \left(J_1, J_2, \cdots, J_{|J|}, H_1, H_2, \cdots, H_{|H|}\right)
-  \hat K = \left(K_1, K_2, \cdots, K_{|J|}, m(H_1), m(H_2), \cdots, m(H_{|H})\right)
+  \hat K = \left(K_1, K_2, \cdots, K_{|J|}, m(H_1), m(H_2), \cdots, m(H_{|H|})\right)
@@ -772,7 +776,7 @@
 The arguments of the \lstinline{jump} operation of the peeled loop,
-$K$, is constructed by inlining $\hat J$,
+$K$, is constructed from $\hat J$ using the map $m$,
   \hat K = \left(m\left(\hat J_1\right), m\left(\hat J_2\right), 
                  \cdots, m\left(\hat J_{|\hat J|}\right)\right)
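A minimal sketch (not part of the changeset) of the copying step described in the ``paper.tex`` hunks: every operation is copied in order, its arguments are renamed through the map $m$, and a fresh result variable is introduced. The trace representation, the function name ``peel`` and the ``_hat`` suffix are hypothetical. The sketch also assumes that $m$ starts out mapping each input argument to the matching jump argument and that constants map to themselves; neither detail is visible in the hunks above.

```python
def peel(inputs, operations, jump_args):
    """Copy a trace once, renaming variables through the map m.

    `operations` is a list of (result, opname, args) tuples.
    Returns (peeled_operations, peeled_jump_args).
    """
    # Assumption: m initially sends each input argument to its jump argument.
    m = dict(zip(inputs, jump_args))
    peeled = []
    for result, opname, args in operations:
        new_result = result + '_hat'                   # the new variable \hat v
        new_args = tuple(m.get(a, a) for a in args)    # constants map to themselves
        peeled.append((new_result, opname, new_args))
        m[result] = new_result                         # later uses of v see \hat v
    peeled_jump = tuple(m.get(a, a) for a in jump_args)
    return peeled, peeled_jump


# A one-operation loop: i1 = int_add(i, 1); jump(i1, n)
ops, jump = peel(('i', 'n'), [('i1', 'int_add', ('i', 1))], ('i1', 'n'))
print(ops, jump)
```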
diff --git a/talk/ep2012/stackless/Makefile b/talk/ep2012/stackless/Makefile
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/stackless/Makefile
@@ -0,0 +1,15 @@
+# you can find rst2beamer.py here:
+# http://codespeak.net/svn/user/antocuni/bin/rst2beamer.py
+slp-talk.pdf: slp-talk.rst author.latex title.latex stylesheet.latex
+	rst2beamer.py --stylesheet=stylesheet.latex --documentoptions=14pt slp-talk.rst slp-talk.latex || exit
+	sed 's/\\date{}/\\input{author.latex}/' -i slp-talk.latex || exit
+	sed 's/\\maketitle/\\input{title.latex}/' -i slp-talk.latex || exit
+	sed 's/\\usepackage\[latin1\]{inputenc}/\\usepackage[utf8]{inputenc}/' -i slp-talk.latex || exit
+	pdflatex slp-talk.latex  || exit
+view: slp-talk.pdf
+	evince slp-talk.pdf &
+xpdf: slp-talk.pdf
+	xpdf slp-talk.pdf &
diff --git a/talk/ep2012/stackless/author.latex b/talk/ep2012/stackless/author.latex
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/stackless/author.latex
@@ -0,0 +1,8 @@
+\definecolor{rrblitbackground}{rgb}{0.0, 0.0, 0.0}
+\title[The Story of Stackless Python]{The Story of Stackless Python}
+\author[tismer, nagare]
+{Christian Tismer, Hervé Coatanhay}
+\institute{EuroPython 2012}
+\date{July 4 2012}
diff --git a/talk/ep2012/stackless/beamerdefs.txt b/talk/ep2012/stackless/beamerdefs.txt
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/stackless/beamerdefs.txt
@@ -0,0 +1,108 @@
+.. colors
+.. ===========================
+.. role:: green
+.. role:: red
+.. general useful commands
+.. ===========================
+.. |pause| raw:: latex
+   \pause
+.. |small| raw:: latex
+   {\small
+.. |end_small| raw:: latex
+   }
+.. |scriptsize| raw:: latex
+   {\scriptsize
+.. |end_scriptsize| raw:: latex
+   }
+.. |strike<| raw:: latex
+   \sout{
+.. closed bracket
+.. ===========================
+.. |>| raw:: latex
+   }
+.. example block
+.. ===========================
+.. |example<| raw:: latex
+   \begin{exampleblock}{
+.. |end_example| raw:: latex
+   \end{exampleblock}
+.. alert block
+.. ===========================
+.. |alert<| raw:: latex
+   \begin{alertblock}{
+.. |end_alert| raw:: latex
+   \end{alertblock}
+.. columns
+.. ===========================
+.. |column1| raw:: latex
+   \begin{columns}
+      \begin{column}{0.45\textwidth}
+.. |column2| raw:: latex
+      \end{column}
+      \begin{column}{0.45\textwidth}
+.. |end_columns| raw:: latex
+      \end{column}
+   \end{columns}
+.. |snake| image:: ../../img/py-web-new.png
+           :scale: 15%
+.. nested blocks
+.. ===========================
+.. |nested| raw:: latex
+   \begin{columns}
+      \begin{column}{0.85\textwidth}
+.. |end_nested| raw:: latex
+      \end{column}
+   \end{columns}
diff --git a/talk/ep2012/stackless/demo/pickledtasklet.py b/talk/ep2012/stackless/demo/pickledtasklet.py
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/stackless/demo/pickledtasklet.py
@@ -0,0 +1,25 @@
+import pickle, sys
+import stackless
+ch = stackless.channel()
+def recurs(depth, level=1):
+    print 'enter level %s%d' % (level*'  ', level)
+    if level >= depth:
+        ch.send('hi')
+    if level < depth:
+        recurs(depth, level+1)
+    print 'leave level %s%d' % (level*'  ', level)
+def demo(depth):
+    t = stackless.tasklet(recurs)(depth)
+    print ch.receive()
+    pickle.dump(t, file('tasklet.pickle', 'wb'))
+if __name__ == '__main__':
+    if len(sys.argv) > 1:
+        t = pickle.load(file(sys.argv[1], 'rb'))
+        t.insert()
+    else:
+        t = stackless.tasklet(demo)(9)
+    stackless.run()
diff --git a/talk/ep2012/stackless/eurpython-2012.pptx b/talk/ep2012/stackless/eurpython-2012.pptx
new file mode 100644
index 0000000000000000000000000000000000000000..9b34bb66e92cbe27ce5dc5c3928fe9413abf2cef
GIT binary patch


diff --git a/talk/ep2012/stackless/logo_small.png b/talk/ep2012/stackless/logo_small.png
new file mode 100644
index 0000000000000000000000000000000000000000..acfe083b78f557c394633ca542688a2bfca6a5e8
GIT binary patch


diff --git a/talk/ep2012/stackless/slp-talk.pdf b/talk/ep2012/stackless/slp-talk.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..afcb8c00b73bb83d114dc4e0d9c8ec1157800ef3
GIT binary patch


diff --git a/talk/ep2012/stackless/slp-talk.rst b/talk/ep2012/stackless/slp-talk.rst
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/stackless/slp-talk.rst
@@ -0,0 +1,675 @@
+.. include:: beamerdefs.txt
+The Story of Stackless Python
+About This Talk
+* first talk after a long break
+  - *rst2beamer* for the first time
+guest speaker:
+* Herve Coatanhay about Nagare
+  - PowerPoint (Mac)
+Meanwhile I used
+* Powerpoint (PC)
+* Keynote (Mac)
+* Google Docs
+poll: What is your favorite slide tool?
+What is Stackless?
+* *Stackless is a Python version that does not use the C stack*
+  |pause|
+  - really? naah
+* Stackless is a Python version that does not keep state on the C stack
+  - the stack *is* used but
+  - cleared between function calls
+* Remark:
+  - theoretically. In practice...
+  - ... it is reasonable 90 % of the time
+  - we come back to this!
+What is Stackless about?
+* it is like CPython
+* it can do a little bit more
+* adds a single builtin module
+|example<| |>|
+  .. sourcecode:: python
+    import stackless
+* is like an extension
+  - but, sadly, not really
+  - stackless **must** be builtin  
+  - **but:** there is a solution...
+Now, what is it really about?
+* have tiny little "main" programs
+  - ``tasklet``
+* tasklets communicate via messages
+  - ``channel``
+* tasklets are often called ``microthreads``
+  - but there are no threads at all
+  - only one tasklet runs at any time
+* *but see the PyPy STM* approach
+  - this will apply to tasklets as well
+Cooperative Multitasking ...
+|example<| |>|
+  .. sourcecode:: pycon
+    >>> import stackless
+    >>>
+    >>> channel = stackless.channel()
+  .. sourcecode:: pycon
+    >>> def receiving_tasklet():
+    ...     print "Receiving tasklet started"
+    ...     print channel.receive()
+    ...     print "Receiving tasklet finished"
+  .. sourcecode:: pycon
+    >>> def sending_tasklet():
+    ...     print "Sending tasklet started"
+    ...     channel.send("send from sending_tasklet")
+    ...     print "sending tasklet finished"
+... Cooperative Multitasking ...
+|example<| |>|
+  .. sourcecode:: pycon
+    >>> def another_tasklet():
+    ...     print "Just another tasklet in the scheduler"
+  .. sourcecode:: pycon
+    >>> stackless.tasklet(receiving_tasklet)()
+    <stackless.tasklet object at 0x00A45B30>
+    >>> stackless.tasklet(sending_tasklet)()
+    <stackless.tasklet object at 0x00A45B70>
+    >>> stackless.tasklet(another_tasklet)()
+    <stackless.tasklet object at 0x00A45BF0>
+... Cooperative Multitasking
+|example<| |>|
+  .. sourcecode:: pycon
+    <stackless.tasklet object at 0x00A45B70>
+    >>> stackless.tasklet(another_tasklet)()
+    <stackless.tasklet object at 0x00A45BF0>
+    >>>
+    >>> stackless.run()
+    Receiving tasklet started
+    Sending tasklet started
+    send from sending_tasklet
+    Receiving tasklet finished
+    Just another tasklet in the scheduler
+    sending tasklet finished
+Why not just the *greenlet* ?
+* greenlets are a subset of stackless
+  - can partially emulate stackless
+  - there is no builtin scheduler
+  - technology quite close to Stackless 2.0
+* greenlets are about 10x slower to switch context because
+  they use only hard-switching
+  - but that's ok in most cases
+* greenlets are kind-of perfect
+  - near zero maintenance
+  - minimal interface
+* but the main difference is ...
+Excursus: Hard-Switching
+Sorry ;-)
+Switching program state "the hard way":
+Without notice of the interpreter
+* the machine stack gets hijacked
+  - Brute-Force: replace the stack with another one
+  - like threads
+* stackless, greenlets
+  - stack slicing
+  - semantically same effect
+* switching works fine
+* pickling does not work, opaque data on the stack
+  - this is more sophisticated in PyPy, another story...
+Excursus: Soft-Switching
+Switching program state "the soft way":
+With knowledge of the interpreter
+* most efficient implementation in Stackless 3.1
+* demands the most effort of the developers
+* no opaque data on the stack, pickling does work
+  - again, this is more sophisticated in PyPy
+* now we are at the main difference, as you guessed ...
+Pickling Program State
+|example<| Persistence (p. 1 of 2) |>|
+  .. sourcecode:: python
+    import pickle, sys
+    import stackless
+    ch = stackless.channel()
+    def recurs(depth, level=1):
+        print 'enter level %s%d' % (level*'  ', level)
+        if level >= depth:
+            ch.send('hi')
+        if level < depth:
+            recurs(depth, level+1)
+        print 'leave level %s%d' % (level*'  ', level)
+# *remember to show it interactively*
+Pickling Program State
+|example<| Persistence (p. 2 of 2) |>|
+  .. sourcecode:: python
+    def demo(depth):
+        t = stackless.tasklet(recurs)(depth)
+        print ch.receive()
+        pickle.dump(t, file('tasklet.pickle', 'wb'))
+    if __name__ == '__main__':
+        if len(sys.argv) > 1:
+            t = pickle.load(file(sys.argv[1], 'rb'))
+            t.insert()
+        else:
+            t = stackless.tasklet(demo)(9)
+        stackless.run()
+# *remember to show it interactively*
+Script Output 1
+|example<| |>|
+  .. sourcecode:: pycon
+    $ ~/src/stackless/python.exe demo/pickledtasklet.py
+    enter level   1
+    enter level     2
+    enter level       3
+    enter level         4
+    enter level           5
+    enter level             6
+    enter level               7
+    enter level                 8
+    enter level                   9
+    hi
+    leave level                   9
+    leave level                 8
+    leave level               7
+    leave level             6
+    leave level           5
+    leave level         4
+    leave level       3
+    leave level     2
+    leave level   1
+Script Output 2
+|example<| |>|
+  .. sourcecode:: pycon
+    $ ~/src/stackless/python.exe demo/pickledtasklet.py tasklet.pickle 
+    leave level                   9
+    leave level                 8
+    leave level               7
+    leave level             6
+    leave level           5
+    leave level         4
+    leave level       3
+    leave level     2
+    leave level   1
+Greenlet vs. Stackless
+* Greenlet is a pure extension module
+  - but performance is good enough
+* Stackless can pickle program state
+  - but stays a replacement of Python
+* Greenlet never can, as an extension
+* *easy installation* lets people select greenlet over stackless
+  - see for example the *eventlet* project
+  - *but there is a simple work-around, we'll come to it*
+* *they both have their application domains
+  and they will persist.*
+Why Stackless makes a Difference
+* Microthreads ?
+  - the feature I put the most effort into
+  |pause|
+  - can be emulated: (in decreasing speed order)
+    - generators (incomplete, "half-sided")
+    - greenlet
+    - threads (even ;-)
+* Pickling program state ! ==
+* **persistence**
+Persistence, Cloud Computing
+* freeze your running program
+* let it continue anywhere else
+  - on a different computer
+  - on a different operating system (!)
+  - in a cloud
+* migrate your running program
+* save snapshots, have checkpoints
+  - without doing any extra work
+Software archeology
+* Around since 1998
+  - version 1
+    - using only soft-switching
+    - continuation-based
+    - *please let me skip old design errors :-)*
+* Complete redesign in 2002
+  - version 2
+    - using only hard-switching
+    - birth of tasklets and channels
+* Concept merge in 2004
+  - version 3
+    - **80-20** rule:
+    - soft-switching whenever possible
+    - hard-switching if foreign code is on the stack
+  - these 80 % can be *pickled*  (90?)
+* This stayed as version 3.1
+Status of Stackless Python
+* mature
+* Python 2 and Python 3, all versions
+* maintained by
+  - Richard Tew
+  - Kristjan Valur Jonsson
+  - me  (a bit)
+The New Direction for Stackless
+* ``pip install stackless-python``
+  - will install ``slpython``
+  - or even ``python``     (opinions?)
+* drop-in replacement of CPython
+  *(psssst)*
+* ``pip uninstall stackless-python``
+  - Stackless is a bit cheating, as it replaces the python binary
+  - but the user perception will be perfect
+* *trying stackless made easy!*
+New Direction (cont'd)
+* first prototype yesterday from
+  Anselm Kruis       *(applause)*
+  - works on Windows
+  |pause|
+  - OS X
+    - I'll do that one
+  |pause|
+  - Linux
+    - soon as well
+* being very careful to stay compatible
+  - python 2.7.3 installs stackless for 2.7.3
+  - python 3.2.3 installs stackless for 3.2.3
+  - python 2.7.2 : *please upgrade*
+    - or maybe have an over-ride option?
+Consequences of the Pseudo-Package
+The technical effect is almost nothing.
+The psychological impact is probably huge:
+* stackless is easy to install and uninstall
+* people can simply try if it fits their needs
+* the never ending discussion
+  - "Why is Stackless not included in the Python core?"
+* **has ended**
+  - "Why should we, after all?"
+  |pause|
+  - hey Guido :-)
+  - what a relief, for you and me
+Status of Stackless PyPy
+* was completely implemented before the Jit
+  - together with
+    greenlets
+    coroutines
+  - not Jit compatible
+* was "too complete" with a 30% performance hit
+* new approach is almost ready
+  - with full Jit support
+  - but needs some fixing
+  - this *will* be efficient
+Applications using Stackless Python
+* The Eve Online MMORPG
+  http://www.eveonline.com/
+  - based their games on Stackless since 1998
+* science + computing ag, Anselm Kruis
+  https://ep2012.europython.eu/conference/p/anselm-kruis
+* The Nagare Web Framework
+  http://www.nagare.org/
+  - works because of Stackless Pickling
+* today's majority: persistence
+Thank you
+* the new Stackless Website
+  http://www.stackless.com/
+  - a **great** donation from Alain Pourier, *Nagare*
+* You can hire me as a consultant
+* Questions?
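The slides list generators as one way to emulate microthreads. As a minimal sketch of that idea (not part of the changeset, and not the Stackless API), the cooperative-multitasking example can be approximated in plain Python 3 with generators and a toy scheduler. ``Channel``, ``run`` and the request tuples are hypothetical names invented for this illustration, and the scheduling policy is an assumption of the sketch rather than a description of Stackless internals.

```python
from collections import deque


class Channel:
    """Rendezvous point: holds tasks blocked on a receive or a send."""
    def __init__(self):
        self.receivers = deque()   # tasks waiting for a value
        self.senders = deque()     # (task, value) pairs waiting for a receiver


def run(tasks):
    """Run generator-based tasks until none is runnable."""
    ready = deque((task, None) for task in tasks)
    while ready:
        task, value = ready.popleft()
        try:
            request = task.send(value)      # resume until the next yield
        except StopIteration:
            continue                        # task finished
        kind, chan, *payload = request
        if kind == 'recv':
            if chan.senders:
                sender, message = chan.senders.popleft()
                ready.appendleft((task, message))
                ready.append((sender, None))
            else:
                chan.receivers.append(task)
        elif kind == 'send':
            if chan.receivers:
                receiver = chan.receivers.popleft()
                ready.appendleft((receiver, payload[0]))  # receiver runs next
                ready.append((task, None))                # sender goes to the back
            else:
                chan.senders.append((task, payload[0]))


log = []
channel = Channel()


def receiving_task():
    log.append('receiver started')
    message = yield ('recv', channel)
    log.append('received: ' + message)


def sending_task():
    log.append('sender started')
    yield ('send', channel, 'send from sending_task')
    log.append('sender finished')


def another_task():
    log.append('another task ran')
    return
    yield   # unreachable; only makes this function a generator


run([receiving_task(), sending_task(), another_task()])
print('\n'.join(log))
```

Because the receiver is resumed ahead of the other queued tasks once a value is sent, the recorded order follows the same pattern as the output shown on the slides. Unlike a pickled tasklet, a running generator cannot be pickled in CPython, which is the difference the talk emphasizes.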
diff --git a/talk/ep2012/stackless/stylesheet.latex b/talk/ep2012/stackless/stylesheet.latex
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/stackless/stylesheet.latex
@@ -0,0 +1,11 @@
+\setbeamertemplate{navigation symbols}{}
+\definecolor{darkgreen}{rgb}{0, 0.5, 0.0}
diff --git a/talk/ep2012/stackless/title.latex b/talk/ep2012/stackless/title.latex
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/stackless/title.latex
@@ -0,0 +1,5 @@
diff --git a/talk/ep2012/stm/stmdemo2.py b/talk/ep2012/stm/stmdemo2.py
--- a/talk/ep2012/stm/stmdemo2.py
+++ b/talk/ep2012/stm/stmdemo2.py
@@ -1,33 +1,37 @@
-    def specialize_more_blocks(self):
-        while True:
-            # look for blocks not specialized yet
-            pending = [block for block in self.annotator.annotated
-                             if block not in self.already_seen]
-            if not pending:
-                break
+def specialize_more_blocks(self):
+    while True:
+       # look for blocks not specialized yet
+       pending = [block for block in self.annotator.annotated
+                        if block not in self.already_seen]
+       if not pending:
+           break
-            # specialize all blocks in the 'pending' list
-            for block in pending:
-                self.specialize_block(block)
-                self.already_seen.add(block)
+       # specialize all blocks in the 'pending' list
+       for block in pending:
+           self.specialize_block(block)
+           self.already_seen.add(block)
-    def specialize_more_blocks(self):
-        while True:
-            # look for blocks not specialized yet
-            pending = [block for block in self.annotator.annotated
-                             if block not in self.already_seen]
-            if not pending:
-                break
-            # specialize all blocks in the 'pending' list
-            # *using transactions*
-            for block in pending:
-                transaction.add(self.specialize_block, block)
-            transaction.run()
-            self.already_seen.update(pending)
+def specialize_more_blocks(self):
+    while True:
+       # look for blocks not specialized yet
+       pending = [block for block in self.annotator.annotated
+                        if block not in self.already_seen]
+       if not pending:
+           break
+       # specialize all blocks in the 'pending' list
+       # *using transactions*
+       for block in pending:
+           transaction.add(self.specialize_block, block)
+       transaction.run()
+       self.already_seen.update(pending)
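A minimal sketch (not part of the changeset) of the calling pattern in the second version of ``specialize_more_blocks``. Only ``transaction.add`` and ``transaction.run`` appear in the diff; the ``SequentialTransactions`` class below is a hypothetical, purely sequential stand-in so that the loop can run on any Python 3 interpreter. It says nothing about how the real module schedules or isolates transactions, and the block names are toy data.

```python
class SequentialTransactions:
    """Hypothetical stand-in that runs queued calls one after another."""
    def __init__(self):
        self._pending = []

    def add(self, func, *args, **kwargs):
        """Queue func(*args, **kwargs) for the next run()."""
        self._pending.append((func, args, kwargs))

    def run(self):
        """Execute queued calls, including any queued while running."""
        while self._pending:
            batch, self._pending = self._pending, []
            for func, args, kwargs in batch:
                func(*args, **kwargs)


transaction = SequentialTransactions()

# Toy data standing in for self.annotator.annotated in the diff.
annotated = ['block_a', 'block_b', 'block_c']
already_seen = set()
specialized = []


def specialize_block(block):
    specialized.append(block)


while True:
    # look for blocks not specialized yet
    pending = [block for block in annotated if block not in already_seen]
    if not pending:
        break
    # queue one call per block, then run them all
    for block in pending:
        transaction.add(specialize_block, block)
    transaction.run()
    already_seen.update(pending)

print(specialized)
```

Note that ``already_seen`` is updated once after ``run()`` returns, as in the diff, rather than inside each queued call.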
diff --git a/talk/ep2012/stm/talk.pdf b/talk/ep2012/stm/talk.pdf
index 19067d178980accc5a060fa819059611fcf1acdc..59ba6454817cd0a87accdf48e505190fe99b4924
GIT binary patch


diff --git a/talk/ep2012/stm/talk.rst b/talk/ep2012/stm/talk.rst
--- a/talk/ep2012/stm/talk.rst
+++ b/talk/ep2012/stm/talk.rst
@@ -484,6 +484,8 @@
 * http://pypy.org/
-* You can hire Antonio
+* You can hire Antonio (http://antocuni.eu)
 * Questions?
+* PyPy help desk on Thursday morning
\ No newline at end of file
diff --git a/talk/ep2012/tools/demo.py b/talk/ep2012/tools/demo.py
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/tools/demo.py
@@ -0,0 +1,208 @@
+def simple():
+    for i in range(100000):
+        pass
+def bridge():
+    s = 0
+    for i in range(100000):
+        if i % 2:
+            s += 1
+        else:
+            s += 2
+def bridge_overflow():
+    s = 2
+    for i in range(100000):
+        s += i*i*i*i
+    return s
+def nested_loops():
+    s = 0
+    for i in range(10000):
+        for j in range(100000):
+            s += 1
+def inner1():
+    return 1
+def inlined_call():
+    s = 0
+    for i in range(10000):
+        s += inner1()
+def inner2(a):
+    for i in range(3):
+        a += 1
+    return a
+def inlined_call_loop():
+    s = 0
+    for i in range(100000):
+        s += inner2(i)
+class A(object):
+    def __init__(self, x):
+        if x % 2:
+            self.y = 3
+        self.x = x
+def object_maps():
+    l = [A(i) for i in range(100)]
+    s = 0
+    for i in range(1000000):
+        s += l[i % 100].x
+if __name__ == '__main__':
+    simple()
+    bridge()
+    bridge_overflow()
+    nested_loops()
+    inlined_call()
+    inlined_call_loop()
+    object_maps()
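A minimal sketch (not part of the changeset): the ``A`` class in ``demo.py`` assigns ``y`` only for odd ``x``, so its instances end up with two different attribute layouts. That is presumably what ``object_maps`` is meant to exercise, though the file itself does not say so. The helper ``attribute_layouts`` is a hypothetical name, the code targets Python 3.7 or later (where instance dictionaries keep insertion order), and inspecting ``vars()`` only approximates what a VM-level map would record.

```python
class A(object):
    def __init__(self, x):
        if x % 2:
            self.y = 3
        self.x = x


def attribute_layouts(objects):
    """Return the distinct attribute-name orderings found among `objects`."""
    return {tuple(vars(obj)) for obj in objects}


layouts = attribute_layouts(A(i) for i in range(100))
print(sorted(layouts))
```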
diff --git a/talk/ep2012/tools/talk.html b/talk/ep2012/tools/talk.html
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/tools/talk.html
@@ -0,0 +1,120 @@
+	<meta name="viewport" content="width=1024, user-scalable=no">
+  <link rel="stylesheet" href="/home/fijal/src/deckjs/core/deck.core.css">
+  <link rel="stylesheet" href="web-2.0.css">
+  <link rel="stylesheet" href="/home/fijal/src/deckjs/themes/transition/horizontal-slide.css">
+  <script src="/home/fijal/src/deckjs/modernizr.custom.js"></script>
+  <script src="/home/fijal/src/deckjs/jquery-1.7.min.js"></script>
+  <script src="/home/fijal/src/deckjs/core/deck.core.js"></script>
+  <script>
+    $(function() {
+	  $.deck('.slide');
+    });
+  </script>
+<body class="deck-container">
+  <section class="slide" id="title-slide">
+    <h1>Performance analysis tools for JITted VMs</h1>
+  </section>
+  <section class="slide">
+    <h2>Who am I?</h2>
+    <ul>
+      <li>worked on PyPy for 5+ years</li>
+      <li>often presented with a task "my program runs slow"</li>
+      <li>never completely satisfied with present solutions</li>
+      <li class="slide">I'm not antisocial, just shy</li>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>The talk</h2>
+    <ul>
+      <li>apologies for the lack of advance warning - this is a rant</li>
+      <div class="slide">
+        <li>I'll talk about tools</li>
+        <li>primarily profiling tools</li>
+      </div>
+      <div class="slide">
+        <li>lots of questions</li>
+        <li>not that many answers</li>
+      </div>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>Why ranting?</h2>
+    <ul>
+      <li>the topic at hand is hard</li>
+      <li>the mindset about tools is very much rooted in the static land</li>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>Profiling theory</h2>
+    <ul>
+      <li>you spend 90% of your time in 10% of the functions</li>
+      <li>hence you can start profiling after you're done developing</li>
+      <li>by optimizing few functions</li>
+      <div class="slide">
+        <li>problem - 10% of 600k lines is still 60k lines</li>
+        <li>that might be even 1000s of functions</li>
+      </div>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>Let's talk about profiling</h2>
+    <ul>
+      <li>I'll try profiling!</li>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>JITted landscape</h2>
+    <ul>
+      <li>you have to account for warmup times</li>
+      <li>time spent in functions is very context dependent</li>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>Let's try!</h2>
+  </section>
+  <section class="slide">
+    <h2>High level languages</h2>
+    <ul>
+      <li>in C, the relation C &lt;-&gt; assembler is "trivial"</li>
+      <li>in PyPy, V8 (JS) or luajit (lua), the mapping is far from trivial</li>
+      <div class="slide">
+        <li>multiple versions of the same code</li>
+        <li>bridges even if there is no branch in user code</li>
+      </div>
+      <li class="slide">sometimes I have absolutely no clue</li>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>The problem</h2>
+    <ul>
+      <li>what I've shown is pretty much the state of the art</li>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>Another problem</h2>
+    <ul>
+      <li>often when presented with profiling, it's already too late</li>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>Better tools</h2>
+    <ul>
+      <li>good vm-level instrumentation</li>
+      <li>better visualizations, more code oriented</li>
+      <li>hints at the editor level about your code</li>
+      <li>hints about coverage, tests</li>
+    </ul>
+  </section>
+  <section class="slide">
+    <h2>&lt;/rant&gt;</h2>
+    <ul>
+      <li>good part - there are people working on it</li>
+      <li>questions, suggestions?</li>
+    </ul>
+  </section>
diff --git a/talk/ep2012/tools/web-2.0.css b/talk/ep2012/tools/web-2.0.css
new file mode 100644
--- /dev/null
+++ b/talk/ep2012/tools/web-2.0.css
@@ -0,0 +1,215 @@
+@charset "UTF-8";
+.deck-container {
+  font-family: "Gill Sans", "Gill Sans MT", Calibri, sans-serif;
+  font-size: 2.75em;
+  background: #f4fafe;
+  /* Old browsers */
+  background: -moz-linear-gradient(top, #f4fafe 0%, #ccf0f0 100%);
+  /* FF3.6+ */
+  background: -webkit-gradient(linear, left top, left bottom, color-stop(0%, #f4fafe), color-stop(100%, #ccf0f0));
+  /* Chrome,Safari4+ */
+  background: -webkit-linear-gradient(top, #f4fafe 0%, #ccf0f0 100%);
+  /* Chrome10+,Safari5.1+ */
+  background: -o-linear-gradient(top, #f4fafe 0%, #ccf0f0 100%);
+  /* Opera11.10+ */
+  background: -ms-linear-gradient(top, #f4fafe 0%, #ccf0f0 100%);
+  /* IE10+ */
+  background: linear-gradient(top, #f4fafe 0%, #ccf0f0 100%);
+  /* W3C */
+  background-attachment: fixed;
+}
+.deck-container > .slide {
+  text-shadow: 1px 1px 1px rgba(255, 255, 255, 0.5);
+}
+.deck-container > .slide .deck-before, .deck-container > .slide .deck-previous {
+  opacity: 0.4;
+}
+.deck-container > .slide .deck-before:not(.deck-child-current) .deck-before, .deck-container > .slide .deck-before:not(.deck-child-current) .deck-previous, .deck-container > .slide .deck-previous:not(.deck-child-current) .deck-before, .deck-container > .slide .deck-previous:not(.deck-child-current) .deck-previous {
+  opacity: 1;
+}
+.deck-container > .slide .deck-child-current {
+  opacity: 1;
+}
+.deck-container .slide h1, .deck-container .slide h2, .deck-container .slide h3, .deck-container .slide h4, .deck-container .slide h5, .deck-container .slide h6 {
+  font-family: "Hoefler Text", Constantia, Palatino, "Palatino Linotype", "Book Antiqua", Georgia, serif;
+  font-size: 1.75em;
+}
+.deck-container .slide h1 {
+  color: #08455f;
+}
+.deck-container .slide h2 {
+  color: #0b7495;
+  border-bottom: 0;
+}
+.cssreflections .deck-container .slide h2 {
+  line-height: 1;
+  -webkit-box-reflect: below -0.556em -webkit-gradient(linear, left top, left bottom, from(transparent), color-stop(0.3, transparent), color-stop(0.7, rgba(255, 255, 255, 0.1)), to(transparent));
+  -moz-box-reflect: below -0.556em -moz-linear-gradient(top, transparent 0%, transparent 30%, rgba(255, 255, 255, 0.3) 100%);
+}
+.deck-container .slide h3 {
+  color: #000;
+}
+.deck-container .slide pre {
+  border-color: #cde;
+  background: #fff;
+  position: relative;
+  z-index: auto;
+  /* http://nicolasgallagher.com/css-drop-shadows-without-images/ */
+}
+.borderradius .deck-container .slide pre {
+  -webkit-border-radius: 5px;
+  -moz-border-radius: 5px;
+  border-radius: 5px;
+}
+.csstransforms.boxshadow .deck-container .slide pre > :first-child:before {
+  content: "";
+  position: absolute;
+  z-index: -1;
+  background: #fff;
+  top: 0;
+  bottom: 0;
+  left: 0;
+  right: 0;
+}
+.csstransforms.boxshadow .deck-container .slide pre:before, .csstransforms.boxshadow .deck-container .slide pre:after {
+  content: "";
+  position: absolute;
+  z-index: -2;
+  bottom: 15px;
+  width: 50%;
+  height: 20%;
+  max-width: 300px;
+  -webkit-box-shadow: 0 15px 10px rgba(0, 0, 0, 0.7);
+  -moz-box-shadow: 0 15px 10px rgba(0, 0, 0, 0.7);
+  box-shadow: 0 15px 10px rgba(0, 0, 0, 0.7);
+}
+.csstransforms.boxshadow .deck-container .slide pre:before {
+  left: 10px;
+  -webkit-transform: rotate(-3deg);
+  -moz-transform: rotate(-3deg);
+  -ms-transform: rotate(-3deg);
+  -o-transform: rotate(-3deg);
+  transform: rotate(-3deg);
+}
+.csstransforms.boxshadow .deck-container .slide pre:after {
+  right: 10px;
+  -webkit-transform: rotate(3deg);
+  -moz-transform: rotate(3deg);
+  -ms-transform: rotate(3deg);
+  -o-transform: rotate(3deg);
+  transform: rotate(3deg);
+}
+.deck-container .slide code {
+  color: #789;
+}
+.deck-container .slide blockquote {
+  font-family: "Hoefler Text", Constantia, Palatino, "Palatino Linotype", "Book Antiqua", Georgia, serif;
+  font-size: 2em;
+  padding: 1em 2em .5em 2em;
+  color: #000;
+  background: #fff;
+  position: relative;
+  border: 1px solid #cde;
+  z-index: auto;
+}
+.borderradius .deck-container .slide blockquote {
+  -webkit-border-radius: 5px;
+  -moz-border-radius: 5px;
+  border-radius: 5px;
+}
+.boxshadow .deck-container .slide blockquote > :first-child:before {
+  content: "";
+  position: absolute;
+  z-index: -1;
+  background: #fff;
+  top: 0;
+  bottom: 0;
+  left: 0;
+  right: 0;
+}
+.boxshadow .deck-container .slide blockquote:after {
+  content: "";
+  position: absolute;
+  z-index: -2;
+  top: 10px;
+  bottom: 10px;
+  left: 0;
+  right: 50%;
+  -moz-border-radius: 10px/100px;
+  border-radius: 10px/100px;
+  -webkit-box-shadow: 0 0 15px rgba(0, 0, 0, 0.6);
+  -moz-box-shadow: 0 0 15px rgba(0, 0, 0, 0.6);
+  box-shadow: 0 0 15px rgba(0, 0, 0, 0.6);
+}
+.deck-container .slide blockquote p {
+  margin: 0;
+}
+.deck-container .slide blockquote cite {
+  font-size: .5em;
+  font-style: normal;
+  font-weight: bold;
+  color: #888;
+}
+.deck-container .slide blockquote:before {
+  content: "“";
+  position: absolute;
+  top: 0;
+  left: 0;
+  font-size: 5em;
+  line-height: 1;
+  color: #ccf0f0;
+  z-index: 1;
+}
+.deck-container .slide ::-moz-selection {
+  background: #08455f;
+  color: #fff;
+}
+.deck-container .slide ::selection {
+  background: #08455f;
+  color: #fff;
+}
+.deck-container .slide a, .deck-container .slide a:hover, .deck-container .slide a:focus, .deck-container .slide a:active, .deck-container .slide a:visited {
+  color: #599;
+  text-decoration: none;
+}
+.deck-container .slide a:hover, .deck-container .slide a:focus {
+  text-decoration: underline;
+}
+.deck-container .deck-prev-link, .deck-container .deck-next-link {
+  background: #fff;
+  opacity: 0.5;
+}
+.deck-container .deck-prev-link, .deck-container .deck-prev-link:hover, .deck-container .deck-prev-link:focus, .deck-container .deck-prev-link:active, .deck-container .deck-prev-link:visited, .deck-container .deck-next-link, .deck-container .deck-next-link:hover, .deck-container .deck-next-link:focus, .deck-container .deck-next-link:active, .deck-container .deck-next-link:visited {
+  color: #599;
+}
+.deck-container .deck-prev-link:hover, .deck-container .deck-prev-link:focus, .deck-container .deck-next-link:hover, .deck-container .deck-next-link:focus {
+  opacity: 1;
+  text-decoration: none;
+}
+.deck-container .deck-status {
+  font-size: 0.6666em;
+}
+.deck-container.deck-menu .slide {
+  background: transparent;
+  -webkit-border-radius: 5px;
+  -moz-border-radius: 5px;
+  border-radius: 5px;
+}
+.rgba .deck-container.deck-menu .slide {
+  background: rgba(0, 0, 0, 0.1);
+}
+.deck-container.deck-menu .slide.deck-current, .rgba .deck-container.deck-menu .slide.deck-current, .no-touch .deck-container.deck-menu .slide:hover {
+  background: #fff;
+}
+.deck-container .goto-form {
+  background: #fff;
+  border: 1px solid #cde;
+  -webkit-border-radius: 5px;
+  -moz-border-radius: 5px;
+  border-radius: 5px;
+}
+.boxshadow .deck-container .goto-form {
+  -webkit-box-shadow: 0 15px 10px -10px rgba(0, 0, 0, 0.5), 0 1px 4px rgba(0, 0, 0, 0.3), 0 0 40px rgba(0, 0, 0, 0.1) inset;
+  -moz-box-shadow: 0 15px 10px -10px rgba(0, 0, 0, 0.5), 0 1px 4px rgba(0, 0, 0, 0.3), 0 0 40px rgba(0, 0, 0, 0.1) inset;
+  box-shadow: 0 15px 10px -10px rgba(0, 0, 0, 0.5), 0 1px 4px rgba(0, 0, 0, 0.3), 0 0 40px rgba(0, 0, 0, 0.1) inset;
+}
diff --git a/talk/vmil2012/Makefile b/talk/vmil2012/Makefile
--- a/talk/vmil2012/Makefile
+++ b/talk/vmil2012/Makefile
@@ -6,8 +6,14 @@
 	pdflatex paper
 	mv paper.pdf jit-guards.pdf
+UNAME := $(shell "uname")
 view: jit-guards.pdf
+ifeq ($(UNAME), Linux)
 	evince jit-guards.pdf &
+endif
+ifeq ($(UNAME), Darwin)
+	open jit-guards.pdf &
+endif
 %.tex: %.py
 	pygmentize -l python -o $@ $<
diff --git a/talk/vmil2012/difflogs.py b/talk/vmil2012/difflogs.py
new file mode 100755
--- /dev/null
+++ b/talk/vmil2012/difflogs.py
@@ -0,0 +1,180 @@
+#!/usr/bin/env python
+"""
+Parse and summarize the traces produced by pypy-c-jit when PYPYLOG is set.
+Only works for logs produced with unrolling disabled.
+"""
+import py
+import os
+import sys
+import csv
+import optparse
+from pprint import pprint
+from pypy.tool import logparser
+from pypy.jit.tool.oparser import parse
+from pypy.jit.metainterp.history import ConstInt
+from pypy.rpython.lltypesystem import llmemory, lltype
+categories = {
+    'setfield_gc': 'set',
+    'setarrayitem_gc': 'set',
+    'strsetitem': 'set',
+    'getfield_gc': 'get',
+    'getfield_gc_pure': 'get',
+    'getarrayitem_gc': 'get',
+    'getarrayitem_gc_pure': 'get',
+    'strgetitem': 'get',
+    'new': 'new',
+    'new_array': 'new',
+    'newstr': 'new',
+    'new_with_vtable': 'new',
+    'guard_class': 'guard',
+    'guard_nonnull_class': 'guard',
+}
+all_categories = 'new get set guard numeric rest'.split()
+def extract_opnames(loop):
+    loop = loop.splitlines()
+    for line in loop:
+        if line.startswith('#') or line.startswith("[") or "end of the loop" in line:
+            continue
+        frontpart, paren, _ = line.partition("(")
+        assert paren
+        if " = " in frontpart:
+            yield frontpart.split(" = ", 1)[1]
+        elif ": " in frontpart:
+            yield frontpart.split(": ", 1)[1]
+        else:
+            yield frontpart
+def summarize(loop, adding_insns={}):    # for debugging
+    insns = adding_insns.copy()
+    seen_label = True
+    if "label" in loop:
+        seen_label = False
+    for opname in extract_opnames(loop):
+        if not seen_label:
+            if opname == 'label':
+                seen_label = True
+            else:
+                assert categories.get(opname, "rest") == "get"
+                continue
+        if opname.startswith("int_") or opname.startswith("float_"):
+            opname = "numeric"
+        else:
+            opname = categories.get(opname, 'rest')
+        insns[opname] = insns.get(opname, 0) + 1
+    assert seen_label
+    return insns
+def compute_summary_diff(loopfile, options):
+    print loopfile
+    log = logparser.parse_log_file(loopfile)
+    loops, summary = consider_category(log, options, "jit-log-opt-")
+    # non-optimized loops and summary
+    nloops, nsummary = consider_category(log, options, "jit-log-noopt-")
+    diff = {}
+    keys = set(summary.keys()).union(set(nsummary))
+    for key in keys:
+        before = nsummary[key]
+        after = summary[key]
+        diff[key] = (before-after, before, after)
+    return len(loops), summary, diff
+def main(loopfile, options):
+    _, summary, diff = compute_summary_diff(loopfile, options)
+    print
+    print 'Summary:'
+    print_summary(summary)
+    if options.diff:
+        print_diff(diff)
+def consider_category(log, options, category):
+    loops = logparser.extract_category(log, category)
+    if options.loopnum is None:
+        input_loops = loops
+    else:
+        input_loops = [loops[options.loopnum]]
+    summary = dict.fromkeys(all_categories, 0)
+    # iterate over the selected loops so that --loopnum actually takes effect
+    for loop in input_loops:
+        summary = summarize(loop, summary)
+    return loops, summary
+def print_summary(summary):
+    ops = [(summary[key], key) for key in summary]
+    ops.sort(reverse=True)
+    for n, key in ops:
+        print '%5d' % n, key
+def print_diff(diff):
+    ops = [(key, before, after, d) for key, (d, before, after) in diff.iteritems()]
+    ops.sort(reverse=True)
+    tot_before = 0
+    tot_after = 0
+    print ",",
+    for key, before, after, d in ops:
+        print key, ", ,",
+    print "total"
+    print args[0], ",",
+    for key, before, after, d in ops:
+        tot_before += before
+        tot_after += after
+        print before, ",", after, ",",
+    print tot_before, ",", tot_after
+def mainall(options):
+    logs = os.listdir("logs")
+    all = []
+    for log in logs:
+        parts = log.split(".")
+        if len(parts) != 3:
+            continue
+        l, exe, bench = parts
+        if l != "logbench":
+            continue
+        all.append((exe, bench, log))
+    all.sort()
+    with file("logs/summary.csv", "w") as f:
+        csv_writer = csv.writer(f)
+        row = ["exe", "bench", "number of loops"]
+        for cat in all_categories:
+            row.append(cat + " before")
+            row.append(cat + " after")
+        csv_writer.writerow(row)
+        print row
+        for exe, bench, log in all:
+            num_loops, summary, diff = compute_summary_diff("logs/" + log, options)
+            print diff
+            print exe, bench, summary
+            row = [exe, bench, num_loops]
+            for cat in all_categories:
+                difference, before, after = diff[cat]
+                row.append(before)
+                row.append(after)
+            csv_writer.writerow(row)
+            print row
+if __name__ == '__main__':
+    parser = optparse.OptionParser(usage="%prog loopfile [options]")
+    parser.add_option('-n', '--loopnum', dest='loopnum', default=None, metavar='N', type=int,
+                      help='show the loop number N [default: last]')
+    parser.add_option('-a', '--all', dest='loopnum', action='store_const', const=None,
+                      help='show all loops in the file')
+    parser.add_option('-d', '--diff', dest='diff', action='store_true', default=False,
+                      help='print the difference between non-optimized and optimized operations in the loop(s)')
+    parser.add_option('--diffall', dest='diffall', action='store_true', default=False,
+                      help='diff all the log files around')
+    options, args = parser.parse_args()
+    if options.diffall:
+        mainall(options)
+    elif len(args) != 1:
+        parser.print_help()
+        sys.exit(2)
+    else:
+        main(args[0], options)
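
The counting that summarize() performs in difflogs.py can be tried without a
PyPy checkout. What follows is only a minimal sketch under stated assumptions:
the names opname_of, count_categories and SAMPLE_TRACE are hypothetical, the
category table is a small subset of the one in the script, and the trace text
is invented for illustration rather than copied from real PYPYLOG output. The
handling of the "label" operation is left out.

```python
# Hypothetical, self-contained sketch of the per-category counting in
# difflogs.py. The trace below is made up; it is not real PYPYLOG output.
CATEGORIES = {
    'setfield_gc': 'set',
    'getfield_gc': 'get',
    'new_with_vtable': 'new',
    'guard_class': 'guard',
}


def opname_of(line):
    """Return the operation name of one trace line, e.g. 'int_add'."""
    frontpart = line.partition("(")[0]
    # Strip a leading "result = " or "prefix: " part, in that order.
    for sep in (" = ", ": "):
        if sep in frontpart:
            return frontpart.split(sep, 1)[1]
    return frontpart


def count_categories(trace):
    """Count operations per category for one textual trace."""
    counts = {}
    for line in trace.splitlines():
        # Skip blank lines, comments and the input-argument line.
        if not line or line.startswith(("#", "[")):
            continue
        opname = opname_of(line)
        if opname.startswith(("int_", "float_")):
            category = "numeric"
        else:
            category = CATEGORIES.get(opname, "rest")
        counts[category] = counts.get(category, 0) + 1
    return counts


SAMPLE_TRACE = """\
# made-up loop
[p0, i1]
guard_class(p0, ConstClass(W_IntObject))
i2 = getfield_gc(p0, descr=<intval>)
i3 = int_add(i2, i1)
setfield_gc(p0, i3, descr=<intval>)
jump(p0, i3)
"""

counts = count_categories(SAMPLE_TRACE)
```

Each of the five operations in the sample lands in a different category, and
anything missing from the table (here, jump) is counted as "rest", as in the
script.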
diff --git a/talk/vmil2012/paper.tex b/talk/vmil2012/paper.tex
--- a/talk/vmil2012/paper.tex
+++ b/talk/vmil2012/paper.tex
@@ -104,10 +104,10 @@
 The contributions of this paper are:
- \item 
+ \item
-The paper is structured as follows: 
+The paper is structured as follows:
@@ -116,6 +116,34 @@
+The RPython language and the PyPy project were started in 2002 with the goal of
+creating a Python interpreter written in a high-level language, allowing easy
+language experimentation and extension. PyPy is now a fully compatible
+alternative implementation of the Python language, xxx mention speed. The
+implementation takes advantage of the language features provided by RPython,
+such as the tracing just-in-time compiler described below.
+
+RPython, the language and the toolset originally developed to implement the
+Python interpreter, has developed into a general environment for experimenting
+with and developing fast and maintainable dynamic language implementations. xxx Mention
+the different language impls.
+
+RPython consists of two components: the language and the translation toolchain
+used to transform RPython programs into executable units.  The RPython language
+is a statically typed, object-oriented, high-level language. The language provides
+several features such as automatic memory management (i.e., garbage collection)
+and just-in-time compilation. When writing an interpreter using RPython the
+programmer only has to write the interpreter for the language she is
+implementing.  The second RPython component, the translation toolchain, is used
+to transform the program into a low-level representation suited to be compiled
+and run on one of the supported target platforms, such
+as C, .NET and Java. During the transformation process,
+low-level aspects suited to the target environment are automatically
+added to the program, such as a garbage collector (if needed) and, with some hints
+provided by the author, a just-in-time compiler.
 \subsection{PyPy's Meta-Tracing JIT Compilers}
@@ -134,7 +162,7 @@
 * High level handling of resumedata
    * trade-off fast tracing v/s memory usage
-   * creation in the frontend 
+   * creation in the frontend
    * optimization
    * compression
    * interaction with optimization
