[pypy-commit] extradoc extradoc: merge
hakanardo
noreply at buildbot.pypy.org
Fri Aug 17 12:21:08 CEST 2012
Author: Hakan Ardo <hakan at debian.org>
Branch: extradoc
Changeset: r4663:04439fef5415
Date: 2012-08-17 12:20 +0200
http://bitbucket.org/pypy/extradoc/changeset/04439fef5415/
Log: merge
diff --git a/talk/stm2012/stmimpl.rst b/talk/stm2012/stmimpl.rst
--- a/talk/stm2012/stmimpl.rst
+++ b/talk/stm2012/stmimpl.rst
@@ -19,7 +19,8 @@
done in a local copy. If this transaction successfully commits, the
original global object is *not* changed --- it is really immutable. But
the copy becomes global, and the old global object's header is updated
-with a pointer to the new global object.
+with a pointer to the new global object. We thus make a chained list
+of global versions.
CPUs model
@@ -31,9 +32,9 @@
be delayed and only show up later in main memory. The delayed stores
are always flushed to main memory in program order.
-Of course if the same CPU loads a value just stored, it will see the
+Of course if the same CPU loads a value it just stored, it will see the
value as modified (self-consistency); but other CPUs might temporarily
-still see the old value.
+see the old value.
The MFENCE instruction waits until all delayed stores from this CPU have
been flushed. (A CPU has no built-in way to wait until *other* CPUs'
@@ -49,12 +50,23 @@
Every object starts with three fields:
- h_global (boolean)
-- h_nonmodified (boolean)
+- h_possibly_outdated (boolean)
+- h_written (boolean)
- h_version (unsigned integer)
The h_version is an unsigned "version number". More about it below.
-The other two fields are flags. (In practice they are just two bits
-of the GC h_tid field.)
+The other fields are flags. (In practice they are just bits inside the
+GC h_tid field.)
+
+- ``h_global`` means that the object is a global object.
+
+- ``h_possibly_outdated`` is used as an optimization: it means that the
+ object is possibly outdated. It is False for all local objects. It
+ is also False if the object is a global object, is the most recent of
+ its chained list of versions, and is known to have no ``global2local``
+ entry in any transaction.
+
+- ``h_written`` is set on local objects that have been written to.
Transaction details
@@ -65,6 +77,8 @@
- start_time
- global2local
+- list_of_read_objects
+- recent_reads_cache
The ``start_time`` is the "time" at which the transaction started. All
reads and writes done so far in the transaction appear consistent with
@@ -74,8 +88,17 @@
``global2local`` is a dictionary-like mapping of global objects to their
corresponding local objects.
+``list_of_read_objects`` is a set of all global objects read from, in
+the version that was used for reading. It is actually implemented as a
+list, but the order or repeated presence of elements in the list is
+irrelevant.
-Pseudo-code during transactions
+``recent_reads_cache`` is a fixed-size cache that remembers recent
+additions to the preceding list, in order to avoid inserting too many
+repeated entries into the list, as well as keep lightweight statistics.
+
+
+Pseudo-code: read/write barriers
---------------------------------------
Variable names:
@@ -87,19 +110,19 @@
* ``R`` is a pointer to an object that was checked for being
*read-ready*: reading its fields is ok.
-* ``L`` is a pointer to a *local* object. Reading its fields is
- always ok, but not necessarily writing.
+* ``L`` is a pointer to a *local* object. We can always read from
+ but not necessarily write to local objects.
-* ``W`` is a pointer to a local object ready to *write*.
+* ``W`` is a pointer to a *writable* local object.
-``W = Allocate(size)`` allocates a local object, and as the name of
-the variable suggests, returns it ready to write::
+``W = Allocate(size)`` allocates a local object::
def Allocate(size):
W = malloc(size)
W->h_global = False
- W->h_nonmodified = False
+ W->h_possibly_outdated = False
+ W->h_written = True
W->h_version = 0
return W
@@ -115,7 +138,7 @@
while (v := R->h_version) & 1: # "has a more recent version"
R = v & ~ 1
if v > start_time: # object too recent?
- validate_fast() # try to move start_time forward
+ ValidateFast() # try to move start_time forward
return LatestGlobalVersion(G) # restart searching from G
PossiblyUpdateChain(G)
return R
@@ -125,107 +148,113 @@
It takes an arbitrary pointer ``P`` and returns a possibly different pointer
``R`` through which we can read the object. The result ``R``
remains valid for read access until either the current transaction ends,
-or until a write into the same object is done.
-
-::
+or until a write into the same object is done. Pseudo-code::
def DirectReadBarrier(P):
if not P->h_global: # fast-path
return P
- R = LatestGlobalVersion(P)
+ if not P->h_possibly_outdated:
+ R = P
+ else:
+ R = LatestGlobalVersion(P)
+ if R->h_possibly_outdated and R in global2local:
+ L = global2local[R]
+ return L
+ R = AddInReadSet(R) # see below
+ return R
+
+
+A simple optimization is possible. If ``R`` was returned by a previous
+call to ``DirectReadBarrier`` and the current transaction is still
+running, but we could have written to ``R`` in the meantime, then we
+need to repeat only part of the logic, because we don't need
+``AddInReadSet`` again. This gives::
+
+ def RepeatReadBarrier(R):
+ if not R->h_possibly_outdated: # fast-path
+ return R
+ # LatestGlobalVersion(R) would either return R or abort
+ # the whole transaction, so omitting it is not wrong
if R in global2local:
L = global2local[R]
return L
- else:
- AddInReadSet(R) # see below
- return R
+ return R
-``L = Localize(R)`` is an operation that takes a read-ready pointer to
-a global object and returns a corresponding pointer to a local object.
-
-::
+``L = Localize(R)`` is an operation that takes a read-ready pointer to a
+global object and returns a corresponding pointer to a local object::
def Localize(R):
- if P in global2local:
- return global2local[P]
+ if R in global2local:
+ return global2local[R]
L = malloc(sizeof R)
- L->h_nonmodified = True
- L->h_version = P
+ L->h_global = False
+ L->h_possibly_outdated = False
+ L->h_written = False
+ L->h_version = R # back-reference to the original
L->objectbody... = R->objectbody...
global2local[R] = L
return L
-``L = LocalizeReadBarrier(P)`` is a different version of the read
-barrier that works by returning a local object.
+``W = WriteBarrier(P)`` and ``W = WriteBarrierFromReadReady(R)`` are
+two versions of the write barrier::
-::
-
- def LocalizeReadBarrier(P):
+ def WriteBarrier(P):
if not P->h_global: # fast-path
return P
- R = LatestGlobalVersion(P)
- L = Localize(R)
- return L
-
-
-``W = WriteBarrier(P)`` is the write barrier.
-
-::
-
- def WriteBarrier(P):
- W = LocalizeReadBarrier(P)
- W->h_nonmodified = False
+ if P->h_possibly_outdated:
+ R = LatestGlobalVersion(P)
+ else:
+ R = P
+ W = Localize(R)
+ W->h_written = True
+ R->h_possibly_outdated = True
return W
+ def WriteBarrierFromReadReady(R):
+ if not R->h_global: # fast-path
+ return R
+ W = Localize(R)
+ W->h_written = True
+ R->h_possibly_outdated = True
+ return W
-``R = AdaptiveReadBarrier(P)`` is the adaptive read barrier. It can use
-the technique of either ``DirectReadBarrier`` or
-``LocalizeReadBarrier``, based on heuristics for better performance::
- def AdaptiveReadBarrier(P):
- if not P->h_global: # fast-path
- return P
- R = LatestGlobalVersion(P)
- if R in global2local:
- return global2local[R]
- if R seen often enough in readset:
- L = Localize(R) # LocalizeReadBarrier
- return L
+Auto-localization of some objects
+----------------------------------------
+
+The "fast-path" markers above are quick checks that are supposed to be
+inlined in the caller, so that we only have to pay for a full call to a
+barrier implementation when the fast-path fails.
+
+However, even the fast-path of ``DirectReadBarrier`` fails repeatedly
+when the ``DirectReadBarrier`` is invoked repeatedly on the same set of
+global objects. This occurs, for example, in code that repeatedly
+traverses the same data structure, visiting the same objects over and
+over again.
+
+If the objects that make up the data structure were local, then we would
+completely avoid triggering the read barrier's implementation. So
+occasionally, it is better to *localize* global objects even when they
+are only read from.
+
+This is done by tweaking ``AddInReadSet``, whose main purpose is to
+record the read object in a set (actually a list)::
+
+ def AddInReadSet(R):
+ if R not in recent_reads_cache:
+ list_of_read_objects.append(R)
+ recent_reads_cache[R] = 1
+ # the cache is fixed-size, so the line above
+ # possibly evicts another older entry
+ return R
else:
- AddInReadSet(R) # DirectReadBarrier
- return R
-
-
-This adaptive localization of read-only objects is useful for example in
-the following situation: we have a pointer ``P1`` to some parent object,
-out of which we repeatedly try to read the same field ``Field`` and use
-the result ``P`` in some call. Because the call may possibly have write
-effects to the parent object, we normally need to redo
-``DirectReadBarrier`` on ``P1`` every time. If instead we do
-``AdaptiveReadBarrier`` then after a few iterations it will localize the
-object and return ``L1``. On ``L1`` no read barrier is needed any more.
-
-Moreover, if we also need to read the subobject ``P``, we also need to
-call a read barrier on it every time. It may return ``L`` after a few
-iterations, but this time we win less, because during the next iteration
-we again read ``P`` out of ``L1``. The trick is that when we read a
-field out of a local object ``L1``, and it is a pointer on which we
-subsequently do a read barrier, then afterwards we can update the
-original pointer directly in ``L1``.
-
-Similarily, if we start with a global ``R1`` and read a pointer ``P``
-which is updated to its latest global version ``R``, then we can update
-the original pointer in-place.
-
-The only case in which it is not permitted xxx
-
-::
-
- def DependentUpdate(R1, Field, R):
- if R1->h_global: # can't modify R1 unless it is local
- return
- R1->Field = R # possibly update the pointer
-
-
+ count = recent_reads_cache[R]
+ count += 1
+ recent_reads_cache[R] = count
+ if count < THRESHOLD:
+ return R
+ else:
+ L = Localize(R)
+ return L
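
The ``AddInReadSet`` pseudo-code added at the end of this diff can be
modelled in plain Python to see when a repeatedly read global object gets
localized. This is only a rough sketch and not the actual implementation:
the ``Obj`` and ``Transaction`` classes, the direct-mapped cache layout
indexed by ``id()``, the dict-shaped object body, and the values of
``CACHE_SIZE`` and ``THRESHOLD`` are all assumptions made for illustration.

```python
# Hypothetical model of AddInReadSet. CACHE_SIZE, THRESHOLD, the Obj and
# Transaction classes and the direct-mapped cache are assumptions made
# for this sketch; they are not taken from the commit.
CACHE_SIZE = 8
THRESHOLD = 3


class Obj:
    """Stand-in for an object: the header fields plus a dict body."""

    def __init__(self, body, h_global=True):
        self.h_global = h_global
        self.h_possibly_outdated = False
        self.h_written = False
        self.h_version = 0
        self.body = body


class Transaction:
    def __init__(self):
        self.global2local = {}          # id(global object) -> local copy
        self.list_of_read_objects = []  # order and duplicates are irrelevant
        # Fixed-size cache: each slot holds (object, count) or None.
        self.recent_reads_cache = [None] * CACHE_SIZE

    def localize(self, R):
        key = id(R)
        if key in self.global2local:
            return self.global2local[key]
        L = Obj(dict(R.body), h_global=False)  # copy the object body
        L.h_version = R                        # back-reference to the original
        self.global2local[key] = L
        return L

    def add_in_read_set(self, R):
        slot = id(R) % CACHE_SIZE
        entry = self.recent_reads_cache[slot]
        if entry is None or entry[0] is not R:
            # Cache miss: record the read; this overwrites (evicts)
            # whatever older entry occupied the slot.
            self.list_of_read_objects.append(R)
            self.recent_reads_cache[slot] = (R, 1)
            return R
        count = entry[1] + 1
        self.recent_reads_cache[slot] = (R, count)
        if count < THRESHOLD:
            return R
        return self.localize(R)


# Usage: read the same global object THRESHOLD times in one transaction.
tx = Transaction()
g = Obj({"x": 1})
results = [tx.add_in_read_set(g) for _ in range(THRESHOLD)]
```

In this model the first reads return the global object itself, and the
read that brings the counter up to ``THRESHOLD`` returns a local copy
instead. The object is appended to ``list_of_read_objects`` only once,
as long as its cache slot is not taken over by another object.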