[pypy-svn] pypy default: some thoughts about a jit-aware profiler

Tue May 3 16:12:07 CEST 2011

Author: Antonio Cuni <anto.cuni at gmail.com>
Branch: 
Changeset: r43859:8eed157756c4
Date: 2011-05-03 16:11 +0200
http://bitbucket.org/pypy/pypy/changeset/8eed157756c4/

Log:	some thoughts about a jit-aware profiler

diff --git a/pypy/doc/discussion/jit-profiler.rst b/pypy/doc/discussion/jit-profiler.rst
new file mode 100644
--- /dev/null
+++ b/pypy/doc/discussion/jit-profiler.rst
@@ -0,0 +1,79 @@
+A JIT-aware profiler
+====================
+
+Goal: have a profiler which is aware of the PyPy JIT and which shows what
+percentage of the time has been spent in each loop.
+
+Long term goal: integrate the data collected by the profiler with the
+jitviewer.
+
+The idea is to record an event in the PYPYLOG every time we enter or exit a
+loop or a bridge.
+
+Expected output
+----------------
+
+[100] {jit-profile-enter
+loop1      # e.g. an entry bridge
+[101] jit-profile-enter}
+...
+[200] {jit-profile-enter
+loop0      # JUMP from loop1 to loop0
+[201] jit-profile-enter}
+...
+[500] {jit-profile-exit
+loop0      # e.g. because of a failing guard
+[501] jit-profile-exit}
+
+In this example, the exit from loop1 is implicit because we are entering
+loop0.  So, we spent 200-100=100 ticks in the entry bridge, and 500-200=300
+ticks in the actual loop.
+
+What to do about "inner" bridges?
+----------------------------------
+
+"Inner bridges" are those bridges which jump back to the loop they
+originate from.  There are two possible ways of dealing with them:
+
+  1. we ignore them: we record when we enter the loop, but not when we jump to
+     a compiled inner bridge.  The exit event will be recorded only in case of
+     a non-compiled guard failure or a JUMP to another loop
+
+  2. we record the enter/exit of each inner bridge
+
+The disadvantage of solution (2) is that there are certain loops which take
+bridges at every single iteration.  So, in this case we would record a huge
+number of events, possibly adding a lot of overhead and thus making the
+profiled data useless.
+
+
+Detecting the entry to/exit from a loop
+----------------------------------------
+
+Ways to enter:
+
+    - just after the tracing/compilation
+
+    - from the interpreter, if the loop has already been compiled
+
+    - from another loop, via a JUMP operation
+
+    - from a hot guard failure (which we ignore, in case we choose solution
+      (1) above)
+
+    - XXX: am I missing anything?
+
+Ways to exit:
+
+    - guard failure (entering blackhole)
+
+    - guard failure (jumping to a bridge) (ignored in case of solution (1))
+
+    - jump to another loop
+
+    - XXX: am I missing anything?
+
+
+About call_assembler: I think that at the beginning, we should just ignore
+call_assembler: the time spent inside the call will be accounted to the loop
+calling it.
diff --git a/pypy/module/pypyjit/test_pypy_c/test_pypy_c_new.py b/pypy/module/pypyjit/test_pypy_c/test_pypy_c_new.py
--- a/pypy/module/pypyjit/test_pypy_c/test_pypy_c_new.py
+++ b/pypy/module/pypyjit/test_pypy_c/test_pypy_c_new.py
@@ -1662,3 +1662,20 @@
         assert log.result == 300
         loop, = log.loops_by_filename(self.filepath)
         assert loop.match_by_id('shift', "")  # optimized away
+
+    def test_division_to_rshift(self):
+        def main(b):
+            res = 0
+            a = 0
+            while a < 300:
+                assert a >= 0
+                assert 0 <= b <= 10
+                res = a/b     # ID: div
+                a += 1
+            return res
+        #
+        log = self.run(main, [3], threshold=200)
+        #assert log.result == 149
+        loop, = log.loops_by_filename(self.filepath)
+        import pdb;pdb.set_trace()
+        assert loop.match_by_id('div', "")  # optimized away
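
The tick accounting described under "Expected output" in jit-profiler.rst can be sketched as a small log reader. This is only an illustration under stated assumptions, not part of the changeset: the function name ticks_per_loop is hypothetical, the jit-profile-enter/jit-profile-exit sections are the proposed (not yet existing) format, and the bracketed counters are read as decimal integers to match the arithmetic in the note; a real PYPYLOG may encode timestamps differently. Inner bridges are not modelled, which corresponds to solution (1).

```python
import re
from collections import defaultdict

# The example log from the note, copied verbatim.
SAMPLE_LOG = """\
[100] {jit-profile-enter
loop1      # e.g. an entry bridge
[101] jit-profile-enter}
...
[200] {jit-profile-enter
loop0      # JUMP from loop1 to loop0
[201] jit-profile-enter}
...
[500] {jit-profile-exit
loop0      # e.g. because of a failing guard
[501] jit-profile-exit}
"""

# Matches only the opening line of a section, e.g. "[100] {jit-profile-enter".
SECTION_START = re.compile(r"^\[(\d+)\] \{jit-profile-(enter|exit)$")


def ticks_per_loop(log_text):
    """Return a dict mapping loop name -> ticks spent in that loop."""
    totals = defaultdict(int)
    current = None  # (loop name, tick at which we entered it), or None
    lines = iter(log_text.splitlines())
    for line in lines:
        match = SECTION_START.match(line.strip())
        if match is None:
            continue  # closing lines, "..." and unrelated sections
        tick, kind = int(match.group(1)), match.group(2)
        # The line after the section start names the loop; drop the comment.
        name = next(lines, "").split("#")[0].strip()
        if current is not None:
            # Entering another loop implicitly exits the current one.
            totals[current[0]] += tick - current[1]
        current = (name, tick) if kind == "enter" else None
    return dict(totals)


if __name__ == "__main__":
    for loop_name, ticks in ticks_per_loop(SAMPLE_LOG).items():
        print(loop_name, ticks)
```

Charging the elapsed ticks to the previously entered loop whenever a new enter event shows up is what makes the exit from loop1 implicit, as the note describes.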