[pypy-commit] extradoc extradoc: merge heads

Sat Oct 8 01:04:24 EDT 2016

Author: Armin Rigo <arigo at tunes.org>
Branch: extradoc
Changeset: r5737:3452de724eed
Date: 2016-10-08 07:04 +0200
http://bitbucket.org/pypy/extradoc/changeset/3452de724eed/

Log:	merge heads

diff --git a/blog/draft/jit-leaner-frontend.rst b/blog/draft/jit-leaner-frontend.rst
--- a/blog/draft/jit-leaner-frontend.rst
+++ b/blog/draft/jit-leaner-frontend.rst
@@ -5,18 +5,26 @@
 
 I'm pleased to inform that we've finished another round of
 improvements to the warmup performance of PyPy. Before I go
-into details, I'll recap achievements that we've done since we've started
+into details, I'll recap what we've achieved since we started
 working on the warmup performance. I picked a random PyPy from November 2014
 (which is definitely before we started the warmup work) and compared it with
 a recent one, after 5.0. The exact revisions are respectively ``ffce4c795283``
 and ``cfbb442ae368``. First let's compare `pure warmup benchmarks`_ that
 can be found in our benchmarking suite. Out of those,
-``pypy-graph-alloc-removal`` could have been improved in the meantime by
-doing other work on PyPy, while the rest is purely focused on warmup.
+``pypy-graph-alloc-removal`` numbers should be taken with a grain of salt,
+since other work could have influenced the results.
+The rest of the benchmarks mentioned is bottlenecked purely by warmup times.
 
 You can see how much your program spends in warmup running
 ``PYPYLOG=jit-summary:- pypy your-program.py`` under "tracing" and "backend"
-fields.
+fields (in the first three lines). An example looks like this::
+
+    [e00c145a41] {jit-summary
+    Tracing:        71      0.053645 <- time spent tracing & optimizing
+    Backend:        71      0.028659 <- time spent compiling to assembler
+    TOTAL:                  0.252217 <- total run time of the program
+
+The results of the benchmarks:
 
 +---------------------------+------------+------------+---------+----------------+----------------+
 | benchmark                 | time - old | time - new | speedup | JIT time - old | JIT time - new |
@@ -35,36 +43,41 @@
 As we can see, the overall warmup benchmarks got up to **90% faster** with
 JIT time dropping by up to **2.5x**. We have more optimizations in the pipeline,
 with an idea how to transfer some of the JIT gains into more of a total program
-runtime by jitting earlier and more eager.
+runtime by jitting earlier and more eagerly.
 
 Details of the last round of optimizations
 ------------------------------------------
 
 Now the nitty gritty details - what did we actually do? I covered a lot of
-warmup improvements in the past blog posts so I'm going to focus on
-the last change, jit-leaner-frontend branch. The last change is simple, instead of using
-pointers to store the "operations" object after tracing, we use a compact list of
-16-bit integers (with 16bit pointers in between). On 64bit machine the wins are
-tremendous - it's 4x more efficient to use 16bit pointers than full 64bit pointers.
-.. XXX: I assume you are talking about "memory efficiency": we should be clearer
-Additionally those pointers have a much better defined lifespan, so we don't need to
+warmup improvements in the `past`_ `blog`_ posts so I'm going to focus on
+the last change, the jit-leaner-frontend branch. This last change is simple: instead of using
+pointers to store the "operations" objects created during tracing, we use a compact list of
+16-bit integers (with 16bit pointers in between). On a 64bit machine the memory wins are
+tremendous - storing a 16bit pointer takes 4x less memory than storing a full 64bit pointer.
+Additionally, the smaller representation has much better cache behavior and much less
+pointer chasing in memory. The entries also have a better defined lifespan, so we don't need to
 bother tracking them by the GC, which also saves quite a bit of time.
 
-Now the change sounds simple, but the details in the underlaying data mean that
+.. _`past`: http://morepypy.blogspot.com/2015/10/pypy-memory-and-warmup-improvements-2.html
+.. _`blog`: http://morepypy.blogspot.com/2015/09/pypy-warmup-improvements.html
+
+The change sounds simple, but the details in the underlying data mean that
 everything in the JIT had to be changed which took quite a bit of effort :-)
 
-Going into the future in the JIT front, we have an exciting set of optimizations,
+Going into the future on the JIT front, we have an exciting set of optimizations,
 ranging from faster loops through faster warmup to using better code generation
 techniques and broadening the kind of program that PyPy speeds up. Stay tuned
 for the updates.
 
 We would like to thank our commercial partners for making all of this possible.
-The work has been performed by baroquesoftware.com and would not be possible
+The work has been performed by `baroquesoftware`_ and would not be possible
 without support from people using PyPy in production. If your company uses
 PyPy and want it to do more or does not use PyPy but has performance problems
-with the Python instalation, feel free to get in touch with me, trust me using
+with the Python installation, feel free to get in touch with me, trust me using
 PyPy ends up being a lot cheaper than rewriting everything in go :-)
 
+.. _`baroquesoftware`: http://baroquesoftware.com
+
 Best regards,
 Maciej Fijalkowski
 
diff --git a/planning/py3.5/milestone-1-progress.rst b/planning/py3.5/milestone-1-progress.rst
--- a/planning/py3.5/milestone-1-progress.rst
+++ b/planning/py3.5/milestone-1-progress.rst
@@ -71,7 +71,8 @@
   (DONE, maybe a small optimization left---TYPE_*ASCII*---that
   depends on compact unicode representation)
 
-* enum: Support for enumeration types (PEP 435). (PURELY-APPLEVEL)
+* enum: Support for enumeration types (PEP 435).
+  There is a PyPI package called enum34 that implements it (pure Python maybe?)
 
 * pathlib: Object-oriented filesystem paths (PEP 428). (PURELY-APPLEVEL)
 
diff --git a/talk/pyconza2016/pypy/img/how-jit.png b/talk/pyconza2016/pypy/img/how-jit.png
new file mode 100644
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..ad9ce720343f9b461e202faacd566abc3a331b61
GIT binary patch

[cut]

diff --git a/talk/pyconza2016/pypy/index.html b/talk/pyconza2016/pypy/index.html
--- a/talk/pyconza2016/pypy/index.html
+++ b/talk/pyconza2016/pypy/index.html
@@ -32,16 +32,27 @@
                 </section>
                 <section>
                     <section>
-                        <h1>PyPy is ..</h1>
+                        <h1>More "general" PyPy talk</h1>
+			<p>Goals:</p>
+			<ul>
+				<li>An approach to optimize Python programs</li>
+				<li>Examples</li>
+				<li>How not to start optimizing</li>
+				<li>What is PyPy up to now?</li>
+			</ul>
                     </section>
+                </section>
+                <section>
                     <section>
-			<p>... a software project ... </p>
+                        <h1>PyPy is a ...</h1>
+			<p class="fragment">... <strong>fast virtual machine for Python</strong> </p>
+			<p class="fragment">developed by researchers, freelancers and many contributors.</p>
                     </section>
+                </section>
+                <section>
                     <section>
-			<p>... assembling a <strong>fast virtual machine for Python</strong> ... </p>
-                    </section>
-                    <section>
-			<p>developed by many researchers, freelancers and many contributors.</p>
+                        <p><code>$ python yourprogram.py</code></p>
+                        <p><code>$ pypy yourprogram.py</code></p>
                     </section>
                 </section>
                 <section>
@@ -66,7 +77,7 @@
                     <section>
                         <h1>About me</h1>
 			<p>Working on PyPy (+1,5y)</p>
-			<p>Master degree - Sticked with PyPy</p>
+			<p>Master thesis → GSoC 2015 → PyPy</p>
 			<p>living and working in Austria</p>
                     </section>
                 </section>
@@ -85,17 +96,18 @@
                         <p><strong>Neither</strong></p>
                     </section>
                     <section>
-                        <p>Run you program an measure your criteria</p>
+                        <p>Run your program and measure your <strong>criteria</strong></p>
                     </section>
                     <section>
-                        <h1>Criteria examples?</h1>
+                        <h1>For example?</h1>
 			<ul>
 				<li>CPU time</li>
 				<li>Peak Heap Memory</li>
 				<li>Requests per second</li>
+				<li>Latency</li>
 				<li>...</li>
 			</ul>
-			<p>Dissatisfaction with one attribute of your program!</p>
+			<p>Dissatisfaction with one criterion of your program!</p>
                     </section>
                 </section>
 		<section>
@@ -103,22 +115,23 @@
 			    <h1>Some theory ... </h1>
 			</section>
                 	<section>
-			    <h1>Hot spots</h1>
-			    <p>Loops!</p>
-			    <p>What kind program can you build without loops?</p>
-			</section>
-                	<section>
 			    <h1>Complexity</h1>
-			    <p>Big-O-Notation - Express how many steps a program to complete at most</p>
+			    <p>Big-O-Notation</p>
+			    <p>Classify e.g. a function and its processing time</p>
+			    <p>Increase input size to the function</p>
 			</section>
                 	<section>
 				<ul>
-			    		<li><code>a = 3</code> # runs in O(1)</li>
-			    		<li><code>[x+1 for x in range(n)]</code> # runs in O(n)</li>
-                                        <li><code>[[x+y for x in range(n)] for y in range(m)]</code> # O(n*m)</li>
+			    		<li><code>a = 3</code> # O(1)</li>
+			    		<li><code>[x+1 for x in range(n)]</code> # O(n)</li>
+                                        <li><code>[[x+y for x in range(n)] \ <br> for y in range(m)]</code> # O(n*m), at most O(n**2) if n &gt; m</li>
 				</ul>
 			</section>
                 	<section>
+				Bubble sort vs Quick Sort
+				<p>O(n**2) vs O(n log n)</p>
+			</section>
+                	<section>
 				<h1>Complexity</h1>
 				<p>Yields the most gain, independent from the language</p>
                                 <p>E.g. prefer O(n) over O(n**2)</p>
@@ -144,28 +157,43 @@
 					<li>Written in Python</li>
 					<li>Moved to vmprof.com</li>
 					<li>Log files can easily take up to 40MB uncompressed</li>
-					<li>Takes ~14 seconds to parse with CPython</li>
+					<li>Takes ~10 seconds to parse with CPython</li>
 					<li>Complexity is linear to input size of the log file</li>
 				</ul>
 			</section>
 			<section>
+				<p><h3>Thanks to Python</h3></p>
 				<p class="advantage">+ Little development time</p>
 				<p class="advantage">+ Easy to test</p>
-				<p><h3>Thanks to Python</h3></p>
 			</section>
 			<section>
 				<p class="disadvantage">- Takes too long to parse</p>
-				<p>Our criteria: CPU time to long</p>
+				<p class="disadvantage">- Parsing is done on each request</p>
+				<p>Our criteria: CPU time too long + requests per second</p>
+				<p>(Many objects are allocated)</p>
 			</section>
 			<section>
-				<p class="">Several possible ways</p>
+				<h1>Suggestion</h1>
 				<p>Caching</p>
 				<p>Reduce CPU time</p>
 				<p>Let's have both</p>
 			</section>
 			<section>
-				<p>Caching - Easily done with django caching frame work</p>
-				<p>Reduce CPU time - Look at vmprof</p>
+				<p>Caching - Easily done with your favourite caching framework</p>
+				<p>Reduce CPU time - PyPy seems to be good at that?</p>
+			</section>
+			<section>
+				<h1>Let's run it...</h1>
+				<p><code>$ cpython2.7 parse.py 40mb.log<br>~ 10 seconds</code></p>
+				<p><code>$ pypy2 parse.py 40mb.log<br>~ 2 seconds</code></p>
+			</section>
+			<section>
+				<h1>Caching</h1>
+				<p>Requests really feel instant after the log has been loaded once</p>
+				<p>Precache</p>
+			</section>
+			<section>
+				<h1>The lazy approach of optimizing Python</h1>
 			</section>
 			<section>
 				<h1>VMProf</h1>
@@ -177,14 +205,16 @@
 			</section>
 			<section data-background="img/vmprof-screen-pypy.png">
 			</section>
-			<section>
-				<h1>~4 times faster on PyPy</h1>
-			</section>
                 </section>
 		<section>
 			<section>
 				<h1>Introducing PyPy's JIT</h1>
 			</section>
+                	<section>
+			    <h1>Hot spots</h1>
+			    <p>Loops / Repeat construct!</p>
+			    <p>What kind of program can you build without loops?</p>
+			</section>
 			<section>
 				<h1>A simplified view</h1>
 				<ol>
@@ -193,12 +223,17 @@
 					<li>Optimization stage</li>
 					<li>Machine code generation</li>
 				</ol>
-				<p>Cannot represent control flow as a graph (other than loop jumps)</p>
-				<p>Guards ensure correctness</p>
+				
 			</section>
 			<section>
 				<h1>Beyond the scope of loops</h1>
-				<p>Frequent guard failure trigger recording</p>
+				<p>Guards ensure correctness</p>
+				<p>Frequent guard failure triggers recording</p>
+			</section>
+			<section>
+				<h1>Perception</h1>
+				<img src="img/how-jit.png">
+				<small>http://abstrusegoose.com/secretarchives/under-the-hood - CC BY-NC 3.0 US</small>
 			</section>
 			<section data-background-image="img/jitlog.png">
 				<a href="http://vmprof.com/#/7930e1f54f9eee75084738aafa6cb612/traces">→ link</a>
@@ -209,6 +244,17 @@
 				<p>Helps you to learn and understand PyPy</p>
 				<p>Provided at vmprof.com</p>
 			</section>
+			<section>
+				<h1>Properties & Tricks</h1>
+				<ul>
+					<li>Type specialization</li>
+					<li>Object unboxing</li>
+					<li>GC scheme</li>
+					<li>Dicts</li>
+					<li>Dynamic class creation (Instance maps)</li>
+					<li>Function calls (+ Inlining)</li>
+				</ul>
+			</section>
                 </section>
 		<section>
 			<section>
@@ -218,11 +264,11 @@
 			</section>
 			<section>
 				<h1>Magnetic</h1>
-				<p>marketing tech company</p>
-				<p>switched to PyPy 3 years ago</p>
+				<p>Marketing tech company</p>
+				<p>Switched to PyPy 3 years ago</p>
 			</section>
 			<section>
-				<h1>Q: what does your service do?</h1>
+				<h1>Q: What does your service do?</h1>
 				<p>A: ... allow generally large companies to send targeted marketing (e.g. serve ads) to people based on data we have learned </p>
 			</section>
 			<section>
@@ -242,9 +288,36 @@
 				<p>So it spends lots of time blocking</p>
 			</section>
                 </section>
+		<section>
+			<section>
+				<h1>timeit</h1>
+				<p>why not use perf?</p>
+				<p class="fragment">Try timeit on PyPy</p>
+			</section>
+			<section>
+				<h1>Python 3.5</h1>
+				<p>Progressed quite a bit</p>
+				<p class="fragment">async io</p>
+				<p class="fragment">Many more small details (sprint?)</p>
+			</section>
+			<section>
+				<h1>C-Extensions</h1>
+				<p>NumPy on top of the emulated layer</p>
+				<p>Boils down to managing PyPy & CPython objects</p>
+			</section>
+                </section>
+		<section>
+			<section>
+				<h1>Closing example</h1>
+				<p>How to move from CPU limited to network limited</p>
+				<a href="https://www.reddit.com/r/Python/comments/kt8bx/ask_rpython_whats_your_experience_with_pypy_and/">link</a>
+			</section>
+
+		</section>
                 <section>
                     <h4>Questions?</h4>
                     <a href="morepypy.blogspot.com">morepypy.blogspot.com</a><br>
+		    <a href="">software at vimloc.systems</a><br>
 		    Join on IRC <a href="">#pypy</a>
                 </section>
             </div>
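
Note on the 4x figure in the jit-leaner-frontend draft: the memory claim is
plain arithmetic on slot widths (16 bits versus 64 bits per entry). A minimal
sketch of that arithmetic follows, assuming Python's standard ``array`` module
as a stand-in. The ``entries`` list and its values are hypothetical and are not
PyPy's actual trace encoding, which this changeset does not show.

```python
from array import array

# Hypothetical trace of 1000 operation entries. The values are arbitrary
# stand-ins that fit in 16 bits, not PyPy's real encoding.
entries = [i % 65536 for i in range(1000)]

# Same entries stored in 16-bit slots and in 64-bit (pointer-sized) slots.
compact = array("H", entries)  # unsigned 16-bit integers
wide = array("Q", entries)     # unsigned 64-bit integers

compact_bytes = compact.itemsize * len(compact)
wide_bytes = wide.itemsize * len(wide)
ratio = wide_bytes / compact_bytes

print(compact_bytes, wide_bytes, ratio)
```

This only compares the raw payload sizes. It says nothing about the cache
behavior or GC savings that the draft also mentions.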