[pypy-commit] extradoc extradoc: merge
fijal
noreply at buildbot.pypy.org
Mon Oct 14 17:57:34 CEST 2013
Author: Maciej Fijalkowski <fijall at gmail.com>
Branch: extradoc
Changeset: r5073:8ae7445f165e
Date: 2013-10-14 17:57 +0200
http://bitbucket.org/pypy/extradoc/changeset/8ae7445f165e/
Log: merge
diff --git a/blog/draft/stm-oct2013.rst b/blog/draft/stm-oct2013.rst
new file mode 100644
--- /dev/null
+++ b/blog/draft/stm-oct2013.rst
@@ -0,0 +1,78 @@
+Update on STM
+=============
+
+Hi all,
+
+the sprint in London was a lot of fun and very fruitful. In the last
+update on STM, Armin was working on improving and specializing the
+automatic barrier placement.
+There is still a lot to do in that area, but that work was merged and
+lowered the overhead of STM over non-STM to around **XXX**. The same
+improvement still has to be done in the JIT.
+
+But that is not all. Right after the sprint, we were able to squash
+the last obvious bugs in the STM-JIT combination. However, the performance
+was nowhere near what we want. Since then, we have fixed some of the most
+obvious issues. Many of them come from RPython erring on the side of caution
+and e.g. making a transaction inevitable even if that is not strictly
+necessary, thereby limiting parallelism.
+**XXX any interesting details? transaction breaks maybe? guard counters?**
+There are still many performance issues of various complexity left
+to tackle. So stay tuned or contribute :)
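To make the "inevitable transaction" point above concrete, here is a minimal
sketch. The ``atomic`` context manager is the one pypy-stm exposes in
``__pypy__.thread``; the lock fallback is our own assumption so the snippet
runs on any interpreter (where it simply serializes, GIL-style):

```python
# Hedged sketch: transactions via `with atomic` in pypy-stm.
# On pypy-stm, `atomic` runs the block as a single transaction; doing I/O
# inside it (e.g. a print) would force the transaction to turn inevitable,
# meaning it can no longer abort and other threads must wait for it.
import threading

try:
    from __pypy__.thread import atomic  # available on pypy-stm only
except ImportError:
    atomic = threading.Lock()  # fallback: behaves like one big lock

counter = {"value": 0}

def increment():
    # read-modify-write of shared state, protected by one transaction
    with atomic:
        counter["value"] += 1

threads = [threading.Thread(target=increment) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])  # 8
```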
+
+Now, since the JIT is all about performance, we want to at least
+show you some numbers that are indicative of things to come.
+Unfortunately, our set of STM benchmarks is still very small
+(something you can help us out with), so the results are
+not representative of real-world performance. We tried to
+minimize the effect of JIT warm-up in the benchmark results.
+
+
+**Raytracer** from `stm-benchmarks <https://bitbucket.org/Raemi/stm-benchmarks/src>`_:
+Render times in seconds for a 1024x1024 image:
+
++-------------+----------------------+-------------------+
+| Interpreter | Base time: 1 thread | 8 threads |
++=============+======================+===================+
+| PyPy-2.1 | 2.47 | 2.56 |
++-------------+----------------------+-------------------+
+| CPython | 81.1 | 73.4 |
++-------------+----------------------+-------------------+
+| PyPy-STM | 50.2 | 10.8 |
++-------------+----------------------+-------------------+
+
+For comparison, disabling the JIT gives 148ms on PyPy-2.1 and 87ms on
+PyPy-STM (with 8 threads).
+
+**Richards** from `PyPy repository on the stmgc-c4
+branch <https://bitbucket.org/pypy/pypy/commits/branch/stmgc-c4>`_:
+Average time per iteration in milliseconds using 8 threads:
+
++-------------+----------------------+-------------------+
+| Interpreter | Base time: 1 thread | 8 threads |
++=============+======================+===================+
+| PyPy-2.1 | 15.6 | 15.4 |
++-------------+----------------------+-------------------+
+| CPython | 239 | 237 |
++-------------+----------------------+-------------------+
+| PyPy-STM | 371 | 116 |
++-------------+----------------------+-------------------+
+
+For comparison, disabling the JIT gives 492ms on PyPy-2.1 and 538ms on
+PyPy-STM.
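As a sanity check on the two tables above, the scaling factors work out as
follows (plain arithmetic on the quoted numbers, nothing more):

```python
# (1-thread time, 8-thread time) pairs copied from the tables above
raytracer = {"PyPy-2.1": (2.47, 2.56), "CPython": (81.1, 73.4),
             "PyPy-STM": (50.2, 10.8)}
richards  = {"PyPy-2.1": (15.6, 15.4), "CPython": (239, 237),
             "PyPy-STM": (371, 116)}

def speedup(one_thread, eight_threads):
    # >1 means adding threads actually helped
    return one_thread / eight_threads

print(round(speedup(*raytracer["PyPy-STM"]), 1))  # 4.6: STM scales
print(round(speedup(*raytracer["PyPy-2.1"]), 1))  # 1.0: GIL, no scaling
print(round(speedup(*richards["PyPy-STM"]), 1))   # 3.2
```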
+
+All this can be found in the `PyPy repository on the stmgc-c4
+branch <https://bitbucket.org/pypy/pypy/commits/branch/stmgc-c4>`_.
+Try it for yourself, but keep in mind that this is still experimental
+with a lot of things yet to come.
+
+You can also download a prebuilt binary from here: **XXX**
+
+As a summary, what the numbers tell us is that PyPy-STM is, as expected,
+the only one of the three interpreters where multithreading gives a large
+improvement in speed. What they also tell us is that, obviously, the
+result is not good enough *yet:* it still takes longer on an 8-threaded
+PyPy-STM than on a regular single-threaded PyPy-2.1. As you should know
+by now, we are good at promising speed and delivering it years later.
+It has been two years already since PyPy-STM started, so we're in the
+fast-progressing phase right now :-)
diff --git a/blog/draft/stm-sept2013.rst b/blog/draft/stm-sept2013.rst
deleted file mode 100644
--- a/blog/draft/stm-sept2013.rst
+++ /dev/null
@@ -1,52 +0,0 @@
-Update on STM
-=============
-
-Hi all,
-
-the sprint in London was a lot of fun and very fruitful. In the last
-update on STM, Armin was working on improving and specializing the
-automatic barrier placement.
-There is still a lot to do in that area, but that work was merged and
-lowered the overhead of STM over non-STM to around **XXX**. The same
-improvement has still to be done in the JIT.
-
-But that is not all. Right after the sprint, we were able to squeeze
-the last obvious bugs in the STM-JIT combination. However, the performance
-was nowhere near what we want. So until now, we fixed some of the most
-obvious issues. Many come from RPython erring on the side of caution
-and e.g. making a transaction inevitable even if that is not strictly
-necessary, thereby limiting parallelism.
-**XXX any interesting details?**
-There are still many performance issues of various complexity left
-to tackle. So stay tuned or contribute :)
-
-Now, since the JIT is all about performance, we want to at least
-show you some numbers that are indicative of things to come.
-Our set of STM benchmarks is very small unfortunately
-(something you can help us out with), so this is
-not representative of real-world performance.
-
-**Raytracer** from `stm-benchmarks <https://bitbucket.org/Raemi/stm-benchmarks/src>`_:
-Render times for a 1024x1024 image using 6 threads
-
-+-------------+----------------------+
-| Interpeter | Time (no-JIT / JIT) |
-+=============+======================+
-| PyPy-2.1 | ... / ... |
-+-------------+----------------------+
-| CPython | ... / - |
-+-------------+----------------------+
-| PyPy-STM | ... / ... |
-+-------------+----------------------+
-
-**XXX same for Richards**
-
-
-All this can be found in the `PyPy repository on the stmgc-c4
-branch <https://bitbucket.org/pypy/pypy/commits/branch/stmgc-c4>`_.
-Try it for yourself, but keep in mind that this is still experimental
-with a lot of things yet to come.
-
-You can also download a prebuilt binary frome here: **XXX**
-
-
diff --git a/planning/jit.txt b/planning/jit.txt
--- a/planning/jit.txt
+++ b/planning/jit.txt
@@ -45,9 +45,6 @@
(SETINTERIORFIELD, GETINTERIORFIELD). This is needed for the previous item to
fully work.
-- {}.update({}) is not fully unrolled and constant folded because HeapCache
- loses track of values in virtual-to-virtual ARRAY_COPY calls.
-
- ovfcheck(a << b) will do ``result >> b`` and check that the result is equal
to ``a``, instead of looking at the x86 flags.
diff --git a/talk/pyconza2013/Makefile b/talk/pyconza2013/Makefile
--- a/talk/pyconza2013/Makefile
+++ b/talk/pyconza2013/Makefile
@@ -1,13 +1,13 @@
view: talk.pdf
- xpdf talk.pdf
+ evince talk.pdf
talk.pdf: talk.tex
64bit pdflatex talk.tex
-talk.tex: talk1.tex fix.py
- python fix.py < talk1.tex > talk.tex
+talk.tex: talk.rst
+ rst2beamer --stylesheet=stylesheet.latex --documentoptions=14pt --input-encoding=utf8 --output-encoding=utf8 --overlaybullets=false $< > talk.tex
-talk1.tex: talk.rst
- rst2beamer $< > talk1.tex
+clean:
+ rm -f talk.tex talk.pdf
diff --git a/talk/pyconza2013/stylesheet.latex b/talk/pyconza2013/stylesheet.latex
new file mode 100644
--- /dev/null
+++ b/talk/pyconza2013/stylesheet.latex
@@ -0,0 +1,10 @@
+\usetheme{Warsaw}
+\usecolortheme{whale}
+\setbeamercovered{transparent}
+\definecolor{darkgreen}{rgb}{0, 0.5, 0.0}
+\newcommand{\docutilsrolegreen}[1]{\color{darkgreen}#1\normalcolor}
+\newcommand{\docutilsrolered}[1]{\color{red}#1\normalcolor}
+\addtobeamertemplate{block begin}{}{\setlength{\parskip}{35pt plus 1pt minus 1pt}}
+
+\newcommand{\green}[1]{\color{darkgreen}#1\normalcolor}
+\newcommand{\red}[1]{\color{red}#1\normalcolor}
diff --git a/talk/pyconza2013/talk.pdf b/talk/pyconza2013/talk.pdf
index 6fed83a5c845e1d71cd4c32a98eb6a6b93d07bcf..fec69aacfbd0fc9af5c9c60eb65501eed188fc5a
GIT binary patch
[cut]
diff --git a/talk/pyconza2013/talk.rst b/talk/pyconza2013/talk.rst
--- a/talk/pyconza2013/talk.rst
+++ b/talk/pyconza2013/talk.rst
@@ -1,25 +1,25 @@
.. include:: beamerdefs.txt
-=======================================
-Software Transactional Memory with PyPy
-=======================================
+.. raw:: latex
+ \title{Software Transactional Memory with PyPy}
+ \author[arigo]{Armin Rigo}
-Software Transactional Memory with PyPy
----------------------------------------
+ \institute{PyCon ZA 2013}
+ \date{4th October 2013}
-* PyCon ZA 2013
-
-* talk by Armin Rigo
-
-* sponsored by crowdfunding (thanks!)
+ \maketitle
Introduction
------------
+* me: Armin Rigo
+
* what is PyPy: an alternative implementation of Python
+* very compatible
+
* main focus is on speed
@@ -27,13 +27,21 @@
------------
.. image:: speed.png
- :scale: 65%
+ :scale: 67%
:align: center
SQL by example
--------------
+.. raw:: latex
+
+ %empty
+
+
+SQL by example
+--------------
+
::
BEGIN TRANSACTION;
@@ -58,6 +66,27 @@
::
+ ...
+ obj.value += 1
+ ...
+
+
+Python by example
+-----------------
+
+::
+
+ ...
+ x = obj.value
+ obj.value = x + 1
+ ...
+
+
+Python by example
+-----------------
+
+::
+
begin_transaction()
x = obj.value
obj.value = x + 1
@@ -100,10 +129,10 @@
::
- BEGIN TRANSACTION; BEGIN TRANSACTION; BEGIN..
- SELECT * FROM ...; SELECT * FROM ...; SELEC..
- UPDATE ...; UPDATE ...; UPDAT..
- COMMIT; COMMIT; COMMI..
+ BEGIN TRANSACTION; BEGIN TRANSACTION; BEGIN..
+ SELECT * FROM ...; SELECT * FROM ...; SELEC..
+ UPDATE ...; UPDATE ...; UPDAT..
+ COMMIT; COMMIT; COMMI..
Locks != Transactions
@@ -111,9 +140,9 @@
::
- with the_lock: with the_lock: with ..
- x = obj.val x = obj.val x =..
- obj.val = x + 1 obj.val = x + 1 obj..
+ with the_lock: with the_lock: with ..
+ x = obj.val x = obj.val x =..
+ obj.val = x + 1 obj.val = x + 1 obj..
Locks != Transactions
@@ -121,9 +150,9 @@
::
- with atomic: with atomic: with ..
- x = obj.val x = obj.val x =..
- obj.val = x + 1 obj.val = x + 1 obj..
+ with atomic: with atomic: with ..
+ x = obj.val x = obj.val x =..
+ obj.val = x + 1 obj.val = x + 1 obj..
STM
@@ -134,14 +163,46 @@
* advanced but not magic (same as databases)
-STM versus HTM
---------------
+By the way
+----------
-* Software versus Hardware
+* STM replaces the GIL (Global Interpreter Lock)
-* CPU hardware specially to avoid the high overhead
+* any existing multithreaded program runs on multiple cores
-* too limited for now
+
+By the way
+----------
+
+* the GIL is necessary and very hard to avoid,
+ but if you look at it like a lock around every single
+ subexpression, then it can be replaced with `with atomic` too
+
+
+So...
+-----
+
+* yes, any existing multithreaded program runs on multiple cores
+
+* yes, we solved the GIL
+
+* great
+
+
+So...
+-----
+
+* no, it would be quite hard to implement it in standard CPython
+
+* too bad for now, only in PyPy
+
+* but it would not be completely impossible
+
+
+But...
+------
+
+* but only half of the story in my opinion `:-)`
Example 1
@@ -149,11 +210,13 @@
::
- def apply_interest_rate(self):
+ def apply_interest(self):
self.balance *= 1.05
+
for account in all_accounts:
- account.apply_interest_rate()
+ account.apply_interest()
+ .
Example 1
@@ -161,12 +224,27 @@
::
- def apply_interest_rate(self):
+ def apply_interest(self):
self.balance *= 1.05
+
for account in all_accounts:
- add_task(account.apply_interest_rate)
- run_tasks()
+ account.apply_interest()
+ ^^^ run this loop multithreaded
+
+
+Example 1
+---------
+
+::
+
+ def apply_interest(self):
+ #with atomic: --- automatic
+ self.balance *= 1.05
+
+ for account in all_accounts:
+ add_task(account.apply_interest)
+ run_all_tasks()
Internally
@@ -178,6 +256,8 @@
* uses threads, but internally only
+* very simple, pure Python
+
Example 2
---------
@@ -187,7 +267,7 @@
def next_iteration(all_trains):
for train in all_trains:
start_time = ...
- for othertrain in train.dependencies:
+ for othertrain in train.deps:
if ...:
start_time = ...
train.start_time = start_time
@@ -215,37 +295,29 @@
* but with `objects` instead of `records`
-* the transaction aborts and automatically retries
+* the transaction aborts and retries automatically
Inevitable
----------
-* means "unavoidable"
+* "inevitable" (means "unavoidable")
* handles I/O in a `with atomic`
* cannot abort the transaction any more
-By the way
-----------
-
-* STM replaces the GIL
-
-* any existing multithreaded program runs on multiple cores
-
-
Current status
--------------
* basics work, JIT compiler integration almost done
-* different executable called `pypy-stm`
+* different executable (`pypy-stm` instead of `pypy`)
* slow-down: around 3x (in bad cases up to 10x)
-* speed-ups measured with 4 cores
+* real time speed-ups measured with 4 or 8 cores
* Linux 64-bit only
@@ -258,9 +330,11 @@
::
Detected conflict:
+ File "foo.py", line 58, in wtree
+ walk(root)
File "foo.py", line 17, in walk
if node.left not in seen:
- Transaction aborted, 0.000047 seconds lost
+ Transaction aborted, 0.047 sec lost
User feedback
@@ -273,11 +347,11 @@
Forced inevitable:
File "foo.py", line 19, in walk
print >> log, logentry
- Transaction blocked others for 0.xx seconds
+ Transaction blocked others for XX s
-Async libraries
----------------
+Asynchronous libraries
+----------------------
* future work
@@ -287,11 +361,11 @@
* existing Twisted apps still work, but we need to
look at conflicts/inevitables
-* similar with Tornado, gevent, and so on
+* similar with Tornado, eventlet, and so on
-Async libraries
----------------
+Asynchronous libraries
+----------------------
::
@@ -318,6 +392,16 @@
* reduce slow-down, port to other OS'es
+STM versus HTM
+--------------
+
+* Software versus Hardware
+
+* CPU hardware extended specially to avoid the high overhead (Intel Haswell processor)
+
+* too limited for now
+
+
Under the cover
---------------
@@ -329,8 +413,8 @@
* the most recent version can belong to one thread
-* synchronization only when a thread "steals" another thread's most
- recent version, to make it shared
+* synchronization only at the point where one thread "steals"
+ another thread's most recent version, to make it shared
* integrated with a generational garbage collector, with one
nursery per thread
@@ -345,4 +429,8 @@
* a small change for Python users
+* (and the GIL is gone)
+
+* this work is sponsored by crowdfunding (thanks!)
+
* `Q & A`