[pypy-commit] extradoc extradoc: Finish this part, hopefully.

arigo noreply at buildbot.pypy.org
Fri Aug 17 12:55:34 CEST 2012


Author: Armin Rigo <arigo at tunes.org>
Branch: extradoc
Changeset: r4664:bc132a8801f3
Date: 2012-08-17 12:55 +0200
http://bitbucket.org/pypy/extradoc/changeset/bc132a8801f3/

Log:	Finish this part, hopefully.

diff --git a/talk/stm2012/stmimpl.rst b/talk/stm2012/stmimpl.rst
--- a/talk/stm2012/stmimpl.rst
+++ b/talk/stm2012/stmimpl.rst
@@ -63,8 +63,8 @@
 - ``h_possibly_outdated`` is used as an optimization: it means that the
   object is possibly outdated.  It is False for all local objects.  It
   is also False if the object is a global object, is the most recent of
-  its chained list of versions, and is known to have no ``global2local``
-  entry in any transaction.
+  its chained list of versions, and is known to have no
+  ``global_to_local`` entry in any transaction.
 
 - ``h_written`` is set on local objects that have been written to.
 
@@ -76,7 +76,7 @@
 the transaction it has so far.  The following data is transaction-local:
 
 - start_time
-- global2local
+- global_to_local
 - list_of_read_objects
 - recent_reads_cache
 
@@ -85,8 +85,8 @@
 the state at time ``start_time``.  The "time" is a single global number
 that is atomically incremented whenever a transaction commits.
 
-``global2local`` is a dictionary-like mapping of global objects to their
-corresponding local objects.
+``global_to_local`` is a dictionary-like mapping of global objects to
+their corresponding local objects.
 
 ``list_of_read_objects`` is a set of all global objects read from, in
 the version that was used for reading.  It is actually implemented as a
@@ -133,14 +133,14 @@
 ``R`` to see that it was not created after ``start_time``.
 Pseudo-code::
 
-    def LatestGlobalVersion(G):
+    def LatestGlobalVersion(G, ...):
         R = G
         while (v := R->h_version) & 1:    # "has a more recent version"
             R = v & ~ 1
         if v > start_time:                # object too recent?
             ValidateFast()                # try to move start_time forward
             return LatestGlobalVersion(G) # restart searching from G
-        PossiblyUpdateChain(G)
+        PossiblyUpdateChain(G, R, ...)    # see below
         return R
 
 
@@ -150,17 +150,17 @@
 remains valid for read access until either the current transaction ends,
 or until a write into the same object is done.  Pseudo-code::
 
-    def DirectReadBarrier(P):
-        if not P->h_global:         # fast-path
+    def DirectReadBarrier(P, ...):
+        if not P->h_global:                    # fast-path
             return P
         if not P->h_possibly_outdated:
             R = P
         else:
-            R = LatestGlobalVersion(P)
-            if R->h_possibly_outdated and R in global2local:
-                L = global2local[R]
+            R = LatestGlobalVersion(P, ...)
+            if R->h_possibly_outdated and R in global_to_local:
+                L = ReadGlobalToLocal(R, ...)  # see below
                 return L
-        R = AddInReadSet(R)         # see below
+        R = AddInReadSet(R)                    # see below
         return R
 
 
@@ -170,13 +170,13 @@
 need to repeat only part of the logic, because we don't need
 ``AddInReadSet`` again.  It gives this::
 
-    def RepeatReadBarrier(R):
-        if not R->h_possibly_outdated:   # fast-path
+    def RepeatReadBarrier(R, ...):
+        if not R->h_possibly_outdated:       # fast-path
             return R
         # LatestGlobalVersion(R) would either return R or abort
         # the whole transaction, so omitting it is not wrong
-        if R in global2local:
-            L = global2local[R]
+        if R in global_to_local:
+            L = ReadGlobalToLocal(R, ...)    # see below
             return L
         return R
 
@@ -185,15 +185,15 @@
 global object and returns a corresponding pointer to a local object::
 
     def Localize(R):
-        if R in global2local:
-            return global2local[R]
+        if R in global_to_local:
+            return global_to_local[R]
         L = malloc(sizeof R)
         L->h_global = False
         L->h_possibly_outdated = False
         L->h_written = False
         L->h_version = R          # back-reference to the original
         L->objectbody... = R->objectbody...
-        global2local[R] = L
+        global_to_local[R] = L
         return L
 
 
@@ -239,6 +239,11 @@
 occasionally, it is better to *localize* global objects even when they
 are only read from.
 
+The idea of localization is to break the strict rule that, as long as we
+don't write anything, we can only find more global objects starting from
+a global object.  This is relaxed here by occasionally making a local
+copy even though we don't write to the object.
+
 This is done by tweaking ``AddInReadSet``, whose main purpose is to
 record the read object in a set (actually a list)::
 
@@ -258,3 +263,44 @@
             else:
                 L = Localize(R) 
                 return L
+
+
+Note that the localized objects are just copies of the global objects.
+So all the pointers they normally contain are pointers to further global
+objects.  If we have a data structure involving a number of objects,
+when traversing it we are going to fetch global pointers out of
+localized objects, and we still need read barriers to go from the global
+objects to the next local objects.
+
+To get the most out of the optimization above, we also need to "fix"
+local objects to change their pointers to go directly to further
+local objects.
+
+So ``L = ReadGlobalToLocal(R, R_Container, FieldName)`` is called with
+optionally ``R_Container`` and ``FieldName`` referencing some
+container's field out of which ``R`` was read::
+
+    def ReadGlobalToLocal(R, R_Container, FieldName):
+        L = global_to_local[R]
+        if not R_Container->h_global:
+            L_Container = R_Container
+            L_Container->FieldName = L     # fix in-place
+        return L
+
+
+Finally, a similar optimization can be applied in
+``LatestGlobalVersion``.  After it follows the chain of global versions,
+it can "compress" that chain in case it contained several hops, and also
+update the original container's field to point directly to the latest
+version::
+
+    def PossiblyUpdateChain(G, R, R_Container, FieldName):
+        if R != G:
+            # compress the chain
+            while G->h_version != R | 1:
+                G_next = G->h_version & ~ 1
+                G->h_version = R | 1
+                G = G_next
+            # update the original field
+            R_Container->FieldName = R
+


More information about the pypy-commit mailing list