[pypy-commit] pypy stm-thread: Complete
arigo
noreply at buildbot.pypy.org
Wed May 9 15:33:30 CEST 2012
Author: Armin Rigo <arigo at tunes.org>
Branch: stm-thread
Changeset: r54985:c33c8f8595e4
Date: 2012-05-09 15:33 +0200
http://bitbucket.org/pypy/pypy/changeset/c33c8f8595e4/
Log: Complete
diff --git a/pypy/doc/stm.rst b/pypy/doc/stm.rst
--- a/pypy/doc/stm.rst
+++ b/pypy/doc/stm.rst
@@ -9,24 +9,33 @@
PyPy can be translated in a special mode based on Software Transactional
Memory (STM). This mode is not compatible with the JIT so far, and moreover
-adds a constant run-time overhead in the range 2x to 5x. The benefit is
-that the resulting ``pypy-stm`` can execute multiple threads of Python code
-in parallel.
+adds a constant run-time overhead, expected to be in the range 2x to 5x.
+(XXX for now it is bigger, but past experience shows it can be reduced.)
+The benefit is that the resulting ``pypy-stm`` can execute multiple
+threads of Python code in parallel.
+
+* ``pypy-stm`` is fully compatible with a GIL-based PyPy; you can use it
+ as a drop-in replacement and multithreaded programs will run on multiple
+ cores.
+
* ``pypy-stm`` adds a low-level API in the ``thread`` module, namely
 ``thread.atomic``, that can be used as described below. This is meant
 to improve existing multithread-based programs, and also to serve as a
 building block for higher-level interfaces.
+
+* A number of higher-level interfaces are planned, using internally
+ threads and ``thread.atomic``. They are meant to be used in
+ non-thread-based programs. Given the higher level, we also recommend
+ using them in new programs instead of structuring your program to use
+ raw threads.
High-level interface
====================
-At the lowest levels, the Global Interpreter Lock (GIL) was just
-replaced with STM techniques. This gives a ``pypy-stm`` that should
-behave identically to a regular GIL-enabled PyPy, but run multithreaded
-programs in a way that scales with the number of cores. The details of
-the implementation are explained below.
-
-However, what we are pushing for is *not writing multithreaded programs*
-at all. It is possible to use higher-level interfaces. The basic one
-is found in the ``transaction`` module (XXX name to change). Minimal
-example of usage::
+The basic high-level interface is planned in the ``transaction`` module
+(XXX name can change). A minimal example of usage will be along the
+lines of::
    for i in range(10):
        transaction.add(do_stuff, i)
@@ -34,9 +43,9 @@
This schedules and runs all ten ``do_stuff(i)``. Each one appears to
run serially, but in random order. It is also possible to ``add()``
-more transactions within each transaction, to schedule additional pieces
-of work. The call to ``run()`` returns when all transactions have
-completed.
+more transactions within each transaction, causing additional pieces of
+work to be scheduled. The call to ``run()`` returns when all
+transactions have completed.
The module is written in pure Python (XXX not written yet, add url).
See the source code to see how it is based on the `low-level interface`_.
@@ -45,20 +54,20 @@
Low-level interface
===================
-``pypy-stm`` offers one additional low-level API: ``thread.atomic``.
-This is a context manager to use in a ``with`` statement. Any code
-running in the ``with thread.atomic`` block is guaranteed to be fully
-serialized with respect to any code run by other threads (so-called
-*strong isolation*).
+Besides replacing the GIL with STM techniques, ``pypy-stm`` offers one
+additional explicit low-level API: ``thread.atomic``. This is a context
+manager to use in a ``with`` statement. Any code running in the ``with
+thread.atomic`` block is guaranteed to be fully serialized with respect
+to any code run by other threads (so-called *strong isolation*).
Note that this is a guarantee of observed behavior: under the conditions
-described below, multiple ``thread.atomic`` blocks can actually run in
-parallel.
+described below, a ``thread.atomic`` block can actually run in parallel
+with other threads, whether they are in a ``thread.atomic`` block or not.
Classical minimal example: in a thread, you want to pop an item from
``list1`` and append it to ``list2``, knowing that both lists can be
-mutated concurrently by other threads. Using ``thread.atomic`` this
-can be done without careful usage of locks::
+mutated concurrently by other threads. Using ``thread.atomic`` this can
+be done without careful usage of locks on any mutation of the lists::
    with thread.atomic:
        x = list1.pop()
        list2.append(x)
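For contrast, achieving the same effect without ``thread.atomic``
requires a lock that covers every mutation of both lists. A minimal
sketch using the standard ``threading`` module (plain CPython, not
pypy-stm):

```python
import threading

lock = threading.Lock()            # one lock guarding *both* lists
list1, list2 = [1, 2, 3], []

def move_one():
    with lock:                     # plays the role of ``with thread.atomic``
        x = list1.pop()
        list2.append(x)

move_one()
print(list1, list2)                # [1, 2] [3]
```

Unlike the STM version, this serializes all accesses through one lock,
and every other piece of code touching the lists must remember to take it.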
@@ -91,10 +100,9 @@
inside ``thread.atomic`` blocks. Writing this kind of code::
with thread.atomic:
- print "hello, the value is:"
- print "\t", value
+ print "hello, the value is:", value
-actually also helps ensuring that the whole line or lines are printed
+actually also helps to ensure that the whole line (or lines) is printed
atomically, instead of being broken up with interleaved output from
other threads.
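The effect can be modeled in plain CPython with a lock standing in for
``thread.atomic``: as long as all writes making up one line happen under
the same lock, no line is ever broken up by output from another thread
(a sketch, not pypy-stm itself):

```python
import io
import threading

out = io.StringIO()
lock = threading.Lock()            # plays the role of ``thread.atomic`` here

def report(value):
    with lock:                     # both writes happen with no interleaving
        out.write("hello, the value is: ")
        out.write("%d\n" % value)

threads = [threading.Thread(target=report, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = out.getvalue().splitlines()
print(all(line.startswith("hello, the value is: ") for line in lines))
```

Without the lock, the two ``write()`` calls from different threads could
interleave, producing broken lines; under the lock each line comes out whole.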
@@ -115,8 +123,8 @@
Each thread is actually running as a sequence of "transactions", which
are separated by "transaction breaks". The execution of the whole
-multithreaded program works as if all transactions were serialized, but
-actually executing the transactions in parallel.
+multithreaded program works as if all transactions were serialized.
+You don't see the transactions actually running in parallel.
This works as long as two principles are respected. The first one is
that the transactions must not *conflict* with each other. The most
@@ -140,17 +148,17 @@
Transaction breaks *never* occur in ``thread.atomic`` mode.
-Every transaction can further be in one of two modes: either "normal" or
-"inevitable". To simplify, a transaction starts in "normal" mode, but
-switches to "inevitable" as soon as it performs input/output. If we
-have an inevitable transaction, all other transactions are paused; this
-effect is similar to the GIL.
+Additionally, every transaction can further be in one of two modes:
+either "normal" or "inevitable". To simplify, a transaction starts in
+"normal" mode, but switches to "inevitable" as soon as it performs
+input/output. If we have an inevitable transaction, all other
+transactions are paused; this effect is similar to the GIL.
In the absence of ``thread.atomic``, inevitable transactions only have a
small effect. Indeed, as soon as the current bytecode finishes, the
interpreter notices that the transaction is inevitable and immediately
introduces a transaction break in order to switch back to a normal-mode
-transaction. It means that inevitable transactions only run for a short
+transaction. It means that inevitable transactions only run for a small
fraction of the time.
With ``thread.atomic`` however you have to be a bit careful, because the
@@ -158,7 +166,15 @@
``with thread.atomic``. Basically, you should organize your code in
such a way that for any ``thread.atomic`` block that runs for a
noticeable time, any I/O is done near the end of it, not when there is
-still a lot of CPU time ahead.
+still a lot of CPU (or I/O) time ahead.
+
+In particular, this means that you should ideally avoid blocking I/O
+operations in ``thread.atomic`` blocks. They work, but because the
+transaction is turned inevitable *before* the I/O is performed, they
+will prevent any parallel work at all. (This may look like
+``thread.atomic`` blocks reverse the usual effects of the GIL: if the
+block is computation-intensive it will nicely be parallelized, but doing
+any long I/O prevents any parallel work.)
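The recommended structure can be sketched as follows.
``expensive_computation`` is a hypothetical stand-in for real CPU work,
and the ``with thread.atomic:`` line is shown only as a comment because
it exists only on pypy-stm:

```python
import io

def expensive_computation(item):
    return "%d\n" % (item * item)     # hypothetical stand-in for CPU work

def process(item, outfile):
    # In pypy-stm, the body below would sit inside ``with thread.atomic:``.
    result = expensive_computation(item)  # CPU-heavy part first: the
                                          # transaction stays "normal"
    outfile.write(result)                 # I/O only at the very end, so
                                          # inevitable mode lasts briefly

buf = io.StringIO()
for i in range(3):
    process(i, buf)
print(buf.getvalue().split())             # ['0', '1', '4']
```

Doing the ``write()`` first would turn the transaction inevitable before
the computation, pausing all other transactions for the whole duration.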
Implementation