[pypy-dev] Syntax for the 'transaction' module

Armin Rigo arigo at tunes.org
Tue May 1 15:55:58 CEST 2012


Re-hi,

I have now come full circle, and I'm wondering again if we couldn't go
with the single primitive of an "atomic" object.  You use it as "with
atomic:", in a way quite traditional for Transactional Memory.

Indeed, it seems after all possible to have the following model: a
pypy-stm interpreter with the unmodified thread and threading modules,
with just the addition of "atomic" (from some module, maybe "atomic",
or "transaction" again, or "transact", or a better name).  You can use
it to ensure that a piece of code is run atomically; this would mean,
in the sense of CPython's GIL, that the GIL is not released for the
complete duration of the "with" statement.  (On top of CPython, this
cannot be fully achieved without hacking at the interpreter, but to a
limited extent we can emulate it by saying at least the following: two
threads that both want to run "with atomic" are serialized.  Easy to
do with just one lock.)
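
As a rough illustration of that one-lock emulation on top of CPython,
here is a minimal sketch.  The class name _Atomic and the little counter
demo are made up for this example; only the name "atomic" comes from
the proposal.  Note that it only serializes "with atomic" blocks
against each other: code running outside such a block in another thread
is not held back, which is exactly the limitation mentioned above.

```python
import threading

class _Atomic(object):
    """Emulation only (hypothetical helper): two threads that both
    enter 'with atomic' are serialized.  This is NOT real atomicity
    with respect to code running outside such blocks."""

    def __init__(self):
        # Reentrant, so that nested 'with atomic' blocks in the same
        # thread do not deadlock.
        self._lock = threading.RLock()

    def __enter__(self):
        self._lock.acquire()

    def __exit__(self, exc_type, exc_value, tb):
        self._lock.release()
        return False          # do not swallow exceptions

atomic = _Atomic()

# Small demo: four threads update shared state inside 'with atomic'.
counter = [0]

def bump(n):
    for _ in range(n):
        with atomic:
            counter[0] += 1

threads = [threading.Thread(target=bump, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

On a real pypy-stm the "atomic" object would come from the interpreter
instead, and the body of the "with" would run as one transaction.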

On top of that, it is possible to write more nicely usable constructs
in pure Python.  For example, to write the existing
transaction.add()/run(), we would write code that starts N threads,
each polling a Queue and running the items in a "with atomic" block.
Or we can write "schedule.Runner()" or similar.

This approach has several advantages: the thread pooling logic is
written in pure Python and is tweakable, and it is compatible with
normal threads.  This means all the doubts I had about blocking C calls
are already resolved --- if you really need to do blocking C calls, do
them in a separate thread.  For example, in Twisted, the loop calling
select() would be written in a thread --- or likely in the main thread
--- while a pool of extra threads runs the actual logic.  (Not a normal
thread pool, but one which uses "with atomic" to run each item.)
People are already used to this approach --- with the exception that
this gives the additional *huge* benefit of the illusion of serial
execution, so locks & friends are unnecessary.

The new things here, when compared to CPython or PyPy-without-STM, are:

(1) the ability to use multiple cores, by running transactions that go
from one release-the-GIL point to the next one (this already exists as
work in progress on CPython, for Hardware TM);

(2) the "with atomic" construct that I already proposed to CPython
last year (it is certainly not new, but directly inspired by
classical TM).

Does it make any sense to you?


A bientôt,

Armin.

