[pypy-commit] pypy default: Small rewrites

arigo noreply at buildbot.pypy.org
Fri Mar 20 18:11:41 CET 2015


Author: Armin Rigo <arigo at tunes.org>
Branch: 
Changeset: r76482:81078d224b97
Date: 2015-03-20 18:11 +0100
http://bitbucket.org/pypy/pypy/changeset/81078d224b97/

Log:	Small rewrites

diff --git a/pypy/doc/stm.rst b/pypy/doc/stm.rst
--- a/pypy/doc/stm.rst
+++ b/pypy/doc/stm.rst
@@ -174,6 +174,38 @@
 User Guide
 ==========
 
+How to write multithreaded programs: the 10'000-foot view
+---------------------------------------------------------
+
+PyPy-STM offers two ways to write multithreaded programs:
+
+* the traditional way, using the ``thread`` or ``threading`` modules,
+  described first__.
+
+* using ``TransactionQueue``, described next__, as a way to hide the
+  low-level notion of threads.
+
+.. __: `Drop-in replacement`_
+.. __: `transaction.TransactionQueue`_
+
+The issues with low-level threads are well known (particularly in
+other languages that don't have GIL-based interpreters): memory
+corruption, deadlocks, livelocks, and so on.  There are alternatives
+to dealing directly with threads, like OpenMP_.  These approaches
+typically enforce some structure on your code.  ``TransactionQueue``
+is in part similar: your program needs to have "some chances" of
+parallelization before you can apply it.  But I believe that the scope
+of applicability is much larger with ``TransactionQueue`` than with
+other approaches.  It usually works without forcing a complete
+reorganization of your existing code, and it works on any Python
+program that has *latent* and *imperfect* parallelism.  Ideally,
+it only requires that the end programmer identifies where this
+parallelism is likely to be found, and communicates it to the system
+using a simple API.
+
+.. _OpenMP: http://en.wikipedia.org/wiki/OpenMP
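As a 10'000-foot illustration of the ``TransactionQueue`` style described
above, here is a sketch.  The ``transaction`` module exists only on
pypy-stm, so the sketch falls back to a purely illustrative sequential
stub elsewhere; the stub is not the real implementation:

```python
try:
    from transaction import TransactionQueue   # available on pypy-stm only
except ImportError:
    # Illustrative stand-in so the sketch also runs on CPython:
    # it simply runs the queued calls one after the other.
    class TransactionQueue(object):
        def __init__(self):
            self._queue = []
        def add(self, f, *args):
            self._queue.append((f, args))
        def run(self):
            while self._queue:
                f, args = self._queue.pop(0)
                f(*args)

results = {}

def process(n):
    # Each queued call is one "transaction"; on pypy-stm several of
    # them may run in parallel, but they still *appear* to execute
    # in some serial order.  (Note: on pypy-stm, many transactions
    # writing to the same dict can itself cause conflicts; real code
    # keeps shared writes to a minimum.)
    results[n] = n * n

tq = TransactionQueue()
for i in range(5):
    tq.add(process, i)
tq.run()
print(results)   # with the stub: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
```

The point of the API is that only the ``tq.add()`` calls communicate the
latent parallelism to the system; the rest of the program is unchanged.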
+
+
 Drop-in replacement
 -------------------
 
@@ -196,31 +228,6 @@
 order.
 
 
-How to write multithreaded programs: the 10'000-feet view
----------------------------------------------------------
-
-PyPy-STM offers two ways to write multithreaded programs:
-
-* the traditional way, using the ``thread`` or ``threading`` modules.
-
-* using ``TransactionQueue``, described next__, as a way to hide the
-  low-level notion of threads.
-
-.. __: `transaction.TransactionQueue`_
-
-``TransactionQueue`` hides the hard multithreading-related issues that
-we typically encounter when using low-level threads.  This is not the
-first alternative approach to avoid dealing with low-level threads;
-for example, OpenMP_ is one.  However, it is one of the first ones
-which does not require the code to be organized in a particular
-fashion.  Instead, it works on any Python program which has got
-*latent* and *imperfect* parallelism.  Ideally, it only requires that
-the end programmer identifies where this parallelism is likely to be
-found, and communicates it to the system using a simple API.
-
-.. _OpenMP: http://en.wikipedia.org/wiki/OpenMP
-
-
 transaction.TransactionQueue
 ----------------------------
 
@@ -256,8 +263,9 @@
 behavior did not change because we are using ``TransactionQueue``.
 All the calls still *appear* to execute in some serial order.
 
-However, the performance typically does not increase out of the box.
-In fact, it is likely to be worse at first.  Typically, this is
+A typical usage of ``TransactionQueue`` goes like this: at first,
+the performance does not increase.
+In fact, it is likely to be worse.  Typically, this is
 indicated by the total CPU usage, which remains low (closer to 1 than
 N cores).  First note that it is expected that the CPU usage should
 not go much higher than 1 in the JIT warm-up phase: you must run a
@@ -282,9 +290,9 @@
 because of the reason shown in the two independent single-entry
 tracebacks: one thread ran the line ``someobj.stuff = 5``, whereas
 another thread concurrently ran the line ``someobj.other = 10`` on the
-same object.  Two writes to the same object cause a conflict, which
-aborts one of the two transactions.  In the example above this
-occurred 12412 times.
+same object.  Two concurrent writes to the same object cause a
+conflict, which aborts one of the two transactions.  In the
+example above this occurred 12412 times.
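The write-write pattern described above can be sketched in plain Python
(the names ``someobj``, ``stuff`` and ``other`` follow the example
tracebacks; under pypy-stm, running ``f1`` and ``f2`` as two concurrent
transactions would produce the kind of conflict counted above):

```python
class SomeObj(object):
    pass

someobj = SomeObj()

def f1():
    someobj.stuff = 5    # one transaction writes one attribute...

def f2():
    someobj.other = 10   # ...another transaction writes a different
                         # attribute of the *same* object: two writes
                         # to the same object conflict, and one of the
                         # two transactions is aborted and retried.

# Run sequentially here just to show that the end state is well-defined;
# pypy-stm reaches the same state after resolving the conflict.
f1()
f2()
print(someobj.stuff, someobj.other)   # 5 10
```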
 
 The two other conflict sources are ``STM_CONTENTION_INEVITABLE``,
 which means that two transactions both tried to do an external
@@ -303,7 +311,7 @@
   each transaction starts with sending data to a log file.  You should
   refactor this case so that it occurs either near the end of the
   transaction (which can then mostly run in non-inevitable mode), or
-  even delegate it to a separate thread.
+  is delegated to a separate transaction or even a separate thread.
 
 * Writing to a list or a dictionary conflicts with any read from the
   same list or dictionary, even one done with a different key.  For
@@ -322,7 +330,7 @@
   results is fine, use ``transaction.time()`` or
   ``transaction.clock()``.
 
-* ``transaction.threadlocalproperty`` can be used as class-level::
+* ``transaction.threadlocalproperty`` can be used at class-level::
 
       class Foo(object):     # must be a new-style class!
           x = transaction.threadlocalproperty()
@@ -342,11 +350,11 @@
   threads, each running the transactions one after the other; such
   thread-local properties will have the value last stored in them in
  the same thread, which may come from a random previous transaction.
-  ``threadlocalproperty`` is still useful to avoid conflicts from
-  cache-like data structures.
+  This means that ``threadlocalproperty`` is useful mainly to avoid
+  conflicts from cache-like data structures.
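For readers without a pypy-stm at hand, the behaviour of
``threadlocalproperty`` can be approximated on any Python with a small
descriptor built on ``threading.local()``.  This is an illustrative
emulation only, not the real implementation (the ``default`` argument
is part of the emulation, not necessarily of the real API):

```python
import threading

class threadlocalproperty(object):
    # Illustrative emulation: each (instance, thread) pair sees its
    # own value, mimicking transaction.threadlocalproperty on pypy-stm.
    def __init__(self, default=None):
        self._local = threading.local()
        self._default = default

    def _values(self):
        # One dict per thread, keyed by instance id (a sketch; a real
        # implementation would avoid id() reuse issues).
        try:
            return self._local.values
        except AttributeError:
            self._local.values = {}
            return self._local.values

    def __get__(self, obj, cls=None):
        if obj is None:
            return self
        return self._values().get(id(obj), self._default)

    def __set__(self, obj, value):
        self._values()[id(obj)] = value


class Foo(object):          # must be a new-style class!
    x = threadlocalproperty(0)

foo = Foo()
foo.x = 42                  # visible in the setting thread only

seen = []
t = threading.Thread(target=lambda: seen.append(foo.x))
t.start(); t.join()
print(foo.x, seen[0])       # 42 0 -- the other thread sees the default
```

As the text above explains, each transaction may run in a different
thread, so such a property is suited to cache-like data that can safely
be recomputed per thread.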
 
 Note that Python is a complicated language; there are a number of less
-common cases that may cause conflict (of any type) where we might not
+common cases that may cause conflict (of any kind) where we might not
 expect it a priori.  In many of these cases it could be fixed; please
 report any case that you don't understand.  (For example, so far,
 creating a weakref to an object requires attaching an auxiliary
@@ -395,8 +403,8 @@
 it likely that such a piece of code will eventually block all other
 threads anyway.
 
-Note that if you want to experiment with ``atomic``, you may have to add
-manually a transaction break just before the atomic block.  This is
+Note that if you want to experiment with ``atomic``, you may have to
+manually add a transaction break just before the atomic block.  This is
 because the boundaries of the block are not guaranteed to be the
 boundaries of the transaction: the latter is at least as big as the
 block, but may be bigger.  Therefore, if you run a big atomic block, it
@@ -522,7 +530,7 @@
 where, say, all threads try to increase the same global counter and do
 nothing else).
 
-However, using if the program requires longer transactions, it comes
+However, if the program requires longer transactions, it comes
 with less obvious rules.  The exact details may vary from version to
 version, too, until they are a bit more stabilized.  Here is an
 overview.

