[pypy-commit] stmgc default: hg merge c8-gil-like
arigo
noreply at buildbot.pypy.org
Sun Jun 14 11:52:24 CEST 2015
Author: Armin Rigo <arigo at tunes.org>
Branch:
Changeset: r1830:ab54aa35b24a
Date: 2015-06-14 11:53 +0200
http://bitbucket.org/pypy/stmgc/changeset/ab54aa35b24a/
Log: hg merge c8-gil-like
Fixes the bad timings of a program that does many tiny external
calls. Previously, it would cause many tiny transactions. Now a
single larger inevitable transaction covers the series of calls.
diff too long, truncating to 2000 out of 2448 lines
diff --git a/c8/CALL_RELEASE_GIL b/c8/CALL_RELEASE_GIL
new file mode 100644
--- /dev/null
+++ b/c8/CALL_RELEASE_GIL
@@ -0,0 +1,120 @@
+
+c8-gil-like
+===========
+
+A branch to have "GIL-like" behavior for inevitable transactions: one
+not-too-short inevitable transaction that is passed around multiple
+threads.
+
+The goal is to have good fast-case behavior with the PyPy JIT around
+CALL_RELEASE_GIL. This is how it works in default (with shadowstack):
+
+
+- "rpy_fastgil" is a global variable. The value 0 means the GIL is
+ definitely unlocked; the value 1 means it is probably locked (it is
+ actually locked only if some mutex object is acquired too).
+
+- before CALL_RELEASE_GIL, we know that we have the GIL and we need to
+ release it. So we know that "rpy_fastgil" is 1, and we just write 0
+ there.
+
+- then we do the external call.
+
+- after CALL_RELEASE_GIL, two cases:
+
+ - if "rpy_fastgil" has been changed to 1 by some other thread *or*
+ if the (non-thread-local) shadowstack pointer changed, then we
+ call reacqgil_addr();
+
+ - otherwise, we swap rpy_fastgil back to 1 and we're done.
+
+- if the external call is long enough, a different thread will notice
+ that rpy_fastgil == 0 by regular polling, and grab the GIL for
+ itself by swapping it back to 1. (The changes from 0 to 1 are done
+ with atomic instructions.)
+
+- a different mechanism is used when we voluntarily release the GIL,
+ based on the mutex mentioned above. The mutex is also used by
+ the reacqgil_addr() function if it actually needs to wait.
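The fast-case behavior described in the bullets above can be sketched in a few lines of C. This is a toy model only: it ignores the shadowstack-pointer check and the mutex slow path, and every name here (release_gil_before_call, reacqgil_slow, demo) is an illustrative stand-in, not PyPy's actual generated code:

```c
#include <assert.h>

static long rpy_fastgil = 1;       /* 1: probably locked, 0: unlocked */
static int slow_path_calls = 0;

static void reacqgil_slow(void) {  /* stands in for reacqgil_addr() */
    slow_path_calls++;
    rpy_fastgil = 1;
}

static void release_gil_before_call(void) {
    rpy_fastgil = 0;               /* we hold the GIL: a plain store suffices */
}

static void reacquire_gil_after_call(void) {
    /* atomic 0 -> 1 swap: succeeds only if no other thread grabbed the GIL
       while we were in the external call */
    if (__sync_bool_compare_and_swap(&rpy_fastgil, 0, 1))
        return;                    /* fast path, no mutex involved */
    reacqgil_slow();
}

/* one uncontended external call, then one where another thread stole
   the GIL; returns the number of slow-path reacquisitions */
static int demo(void) {
    release_gil_before_call();
    reacquire_gil_after_call();    /* nobody interfered: fast path */
    release_gil_before_call();
    rpy_fastgil = 1;               /* simulate another thread taking the GIL */
    reacquire_gil_after_call();    /* must go through the slow path */
    return slow_path_calls;
}
```

Note that only the 0-to-1 transitions need an atomic instruction; releasing is a plain store, which is what makes the common case cheap.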
+
+
+Plan for porting this idea to stmgc:
+
+- we add a few macros to stmgc.h which can be used by C code, around
+ external calls; and we also inline these macros manually around
+ CALL_RELEASE_GIL in PyPy's JIT.
+
+- we add the "detached" mode to inevitable transactions: it means that
+ no thread is actively running this inevitable transaction for now,
+ but it was not committed yet. It is meant to be reattached, by the
+ same or a different thread.
+
+- we add a global variable, "stm_detached_inevitable_from_thread". It
+ is equal to the stm_thread_local pointer of the thread that detached
+ the inevitable transaction (like rpy_fastgil == 0), or NULL if there is
+ no detached inevitable transaction (like rpy_fastgil == 1).
+
+- the macro stm_detach_inevitable_transaction() simply writes the
+ current thread's stm_thread_local pointer into the global variable
+ stm_detached_inevitable_from_thread. It can only be used if the
+ current transaction is inevitable (and in particular the inevitable
+ transaction was not detached already, because we're running it).
+ After the macro is called, the current thread is assumed not to be
+ running in a transaction any more (no more object or shadowstack
+ access).
+
+- the macro stm_reattach_transaction() does an atomic swap on
+ stm_detached_inevitable_from_thread to change it to NULL. If the
+ old value was equal to our own stm_thread_local pointer, we are done. If
+ not, we call a helper, _stm_reattach_transaction().
+
+- we also add the macro stm_detach_transaction(). If the current
+ transaction is inevitable, it calls stm_detach_inevitable_transaction().
+ Otherwise it calls a helper, _stm_detach_noninevitable_transaction().
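The detach/reattach pair mirrors the rpy_fastgil trick with a pointer instead of a flag. A minimal model of the two macros, under the assumption that the slow path is an opaque helper (the real stmgc versions additionally fix up the segment state):

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; } stm_thread_local_t;   /* toy stand-in */

static stm_thread_local_t *stm_detached_inevitable_from_thread = NULL;
static int slow_reattach_calls = 0;

static void _stm_reattach_transaction(stm_thread_local_t *old) {
    /* slow path: old == NULL means start a fresh transaction; old != NULL
       means we took over a transaction detached by a different thread */
    (void)old;
    slow_reattach_calls++;
}

static void stm_detach_inevitable_transaction(stm_thread_local_t *tl) {
    /* only valid while running an inevitable transaction */
    stm_detached_inevitable_from_thread = tl;
}

static void stm_reattach_transaction(stm_thread_local_t *tl) {
    /* atomic exchange to NULL, like the rpy_fastgil 0 -> 1 swap */
    stm_thread_local_t *old = __sync_lock_test_and_set(
        &stm_detached_inevitable_from_thread, NULL);
    if (old != tl)
        _stm_reattach_transaction(old);  /* not our own detached one */
}

/* detach then reattach in the same thread: only the fast path runs;
   a second reattach finds nothing detached and must use the helper */
static int demo(void) {
    stm_thread_local_t tl = {1};
    stm_detach_inevitable_transaction(&tl);
    stm_reattach_transaction(&tl);       /* fast path */
    stm_reattach_transaction(&tl);       /* slow path */
    return slow_reattach_calls;
}
```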
+
+- _stm_reattach_transaction(old): called with the old value from
+ stm_detached_inevitable_from_thread (which was swapped to be NULL just
+ now). If old != NULL, this swap had the effect that we took over
+ the inevitable transaction originally detached from a different
+ thread; we need to fix a few things like the stm_thread_local and %gs but
+ then we can continue running this reattached inevitable transaction.
+ If old == NULL, we need to fall back to the current
+ stm_start_transaction(). (A priori, there is no need to wait at
+ this point. The waiting point is later, in the optional
+ stm_become_inevitable()).
+
+- _stm_detach_noninevitable_transaction(): we try to make the
+ transaction inevitable. If it works we can then use
+ stm_detach_inevitable_transaction(). On the other hand, if we can't
+ make it inevitable without waiting, then instead we just commit it
+ and continue. In the latter case,
+ stm_detached_inevitable_from_thread is still NULL.
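The decision in _stm_detach_noninevitable_transaction() can be sketched as below. The flag can_turn_inevitable_nowait is a hypothetical stand-in for "becoming inevitable would succeed without waiting"; in the real code this comes from the non-sleeping variant of _validate_and_turn_inevitable():

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; } stm_thread_local_t;   /* toy stand-in */

static stm_thread_local_t *stm_detached_inevitable_from_thread = NULL;
static int commits = 0;
static int can_turn_inevitable_nowait = 1;       /* assumption, see lead-in */

static void commit_transaction(void) { commits++; }

static void _stm_detach_noninevitable_transaction(stm_thread_local_t *tl) {
    if (can_turn_inevitable_nowait) {
        /* made inevitable: detach instead of committing */
        stm_detached_inevitable_from_thread = tl;
    }
    else {
        /* would have to wait: just commit and continue; the global
           stays NULL, so the later reattach starts a new transaction */
        commit_transaction();
    }
}

static int demo(void) {
    stm_thread_local_t tl = {1};
    _stm_detach_noninevitable_transaction(&tl);   /* detaches */
    int detached_first = (stm_detached_inevitable_from_thread == &tl);
    stm_detached_inevitable_from_thread = NULL;
    can_turn_inevitable_nowait = 0;
    _stm_detach_noninevitable_transaction(&tl);   /* commits instead */
    return detached_first && commits == 1
           && stm_detached_inevitable_from_thread == NULL;
}
```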
+
+- other place to fix: major collections. Maybe simply look inside
+ stm_detached_inevitable_from_thread, and if not NULL, grab the
+ inevitable transaction and commit it now. Or maybe not. The point
+ is that we need to prevent a thread from asynchronously grabbing it
+ by an atomic swap of stm_detached_inevitable_from_thread; instead,
+ the parallel threads that finish their external calls should all
+ find NULL in this variable and call _stm_reattach_transaction()
+ which will wait for the major GC to end.
+
+- stm_become_inevitable(): if it finds a detached inevitable
+ transaction, it should attach and commit it as a way to get rid of
+ it. This is why it might be better to call directly
+ stm_start_inevitable_transaction() when possible: that one is
+ allowed to attach to a detached inevitable transaction and simply
+ return, unlike stm_become_inevitable() which must continue running
+ the existing transaction.
+
+- commit logic of a non-inevitable transaction: we wait if there is
+ an inevitable transaction. Here too, if the inevitable transaction
+ is found to be detached, we could just commit it now. Or, a better
+ approach: if we find a detached inevitable transaction we grab it
+ temporarily, and commit only the *non-inevitable* transaction if it
+ doesn't conflict. The inevitable transaction is then detached
+ again. (Note that the conflict detection is: we don't commit any
+ write to any of the objects in the inevitable transaction's
+ read-set. This relies on inevitable threads maintaining their
+ read-set correctly, which should be the case in PyPy, but needs to
+ be checked.)
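The conflict rule in the last bullet reduces to a set-membership check: refuse to commit any write that touches an object in the detached inevitable transaction's read-set. A sketch with illustrative names (the real read-set is of course not a linear array):

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; } object_t;             /* toy stand-in */

static int in_read_set(object_t **read_set, size_t n, object_t *obj) {
    for (size_t i = 0; i < n; i++)
        if (read_set[i] == obj)
            return 1;
    return 0;
}

/* may the non-inevitable transaction commit while we temporarily hold
   the detached inevitable transaction? */
static int can_commit_under_detached_inev(object_t **writes, size_t nw,
                                          object_t **read_set, size_t nr) {
    for (size_t i = 0; i < nw; i++)
        if (in_read_set(read_set, nr, writes[i]))
            return 0;      /* would invalidate one of its reads */
    return 1;
}

static int demo(void) {
    object_t a = {1}, b = {2}, c = {3};
    object_t *inev_reads[] = {&a, &b};
    object_t *w_ok[]  = {&c};      /* disjoint from the read-set */
    object_t *w_bad[] = {&b};      /* conflicts with a read */
    return can_commit_under_detached_inev(w_ok, 1, inev_reads, 2) * 10
         + can_commit_under_detached_inev(w_bad, 1, inev_reads, 2);
}
```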
diff --git a/c8/demo/demo_random.c b/c8/demo/demo_random.c
--- a/c8/demo/demo_random.c
+++ b/c8/demo/demo_random.c
@@ -8,6 +8,8 @@
#include <sys/wait.h>
#include "stmgc.h"
+#include "stm/fprintcolor.h"
+#include "stm/fprintcolor.c"
#define NUMTHREADS 2
#define STEPS_PER_THREAD 500
@@ -48,8 +50,10 @@
int num_roots;
int num_roots_at_transaction_start;
int steps_left;
+ long globally_unique;
};
__thread struct thread_data td;
+static long progress = 1;
struct thread_data *_get_td(void)
{
@@ -57,9 +61,16 @@
}
+long check_size(long size)
+{
+ assert(size >= sizeof(struct node_s));
+ assert(size <= sizeof(struct node_s) + 4096*70);
+ return size;
+}
+
ssize_t stmcb_size_rounded_up(struct object_s *ob)
{
- return ((struct node_s*)ob)->my_size;
+ return check_size(((struct node_s*)ob)->my_size);
}
void stmcb_trace(struct object_s *obj, void visit(object_t **))
@@ -69,7 +80,8 @@
/* and the same value at the end: */
/* note, ->next may be the same as last_next */
- nodeptr_t *last_next = (nodeptr_t*)((char*)n + n->my_size - sizeof(void*));
+ nodeptr_t *last_next = (nodeptr_t*)((char*)n + check_size(n->my_size)
+ - sizeof(void*));
assert(n->next == *last_next);
@@ -113,36 +125,36 @@
}
}
-void reload_roots()
-{
- int i;
- assert(td.num_roots == td.num_roots_at_transaction_start);
- for (i = td.num_roots_at_transaction_start - 1; i >= 0; i--) {
- if (td.roots[i])
- STM_POP_ROOT(stm_thread_local, td.roots[i]);
- }
-
- for (i = 0; i < td.num_roots_at_transaction_start; i++) {
- if (td.roots[i])
- STM_PUSH_ROOT(stm_thread_local, td.roots[i]);
- }
-}
-
void push_roots()
{
int i;
+ assert(td.num_roots_at_transaction_start <= td.num_roots);
for (i = td.num_roots_at_transaction_start; i < td.num_roots; i++) {
if (td.roots[i])
STM_PUSH_ROOT(stm_thread_local, td.roots[i]);
}
+ STM_SEGMENT->no_safe_point_here = 0;
}
void pop_roots()
{
int i;
- for (i = td.num_roots - 1; i >= td.num_roots_at_transaction_start; i--) {
- if (td.roots[i])
+ STM_SEGMENT->no_safe_point_here = 1;
+
+ assert(td.num_roots_at_transaction_start <= td.num_roots);
+ for (i = td.num_roots - 1; i >= 0; i--) {
+ if (td.roots[i]) {
STM_POP_ROOT(stm_thread_local, td.roots[i]);
+ assert(td.roots[i]);
+ }
+ }
+
+ dprintf(("stm_is_inevitable() = %d\n", (int)stm_is_inevitable()));
+ for (i = 0; i < td.num_roots_at_transaction_start; i++) {
+ if (td.roots[i]) {
+ dprintf(("root %d: %p\n", i, td.roots[i]));
+ STM_PUSH_ROOT(stm_thread_local, td.roots[i]);
+ }
}
}
@@ -150,6 +162,7 @@
{
int i;
assert(idx >= td.num_roots_at_transaction_start);
+ assert(idx < td.num_roots);
for (i = idx; i < td.num_roots - 1; i++)
td.roots[i] = td.roots[i + 1];
@@ -158,6 +171,7 @@
void add_root(objptr_t r)
{
+ assert(td.num_roots_at_transaction_start <= td.num_roots);
if (r && td.num_roots < MAXROOTS) {
td.roots[td.num_roots++] = r;
}
@@ -184,7 +198,8 @@
nodeptr_t n = (nodeptr_t)p;
/* and the same value at the end: */
- nodeptr_t TLPREFIX *last_next = (nodeptr_t TLPREFIX *)((stm_char*)n + n->my_size - sizeof(void*));
+ nodeptr_t TLPREFIX *last_next = (nodeptr_t TLPREFIX *)((stm_char*)n +
+ check_size(n->my_size) - sizeof(void*));
assert(n->next == *last_next);
n->next = (nodeptr_t)v;
*last_next = (nodeptr_t)v;
@@ -196,7 +211,8 @@
nodeptr_t n = (nodeptr_t)p;
/* and the same value at the end: */
- nodeptr_t TLPREFIX *last_next = (nodeptr_t TLPREFIX *)((stm_char*)n + n->my_size - sizeof(void*));
+ nodeptr_t TLPREFIX *last_next = (nodeptr_t TLPREFIX *)((stm_char*)n +
+ check_size(n->my_size) - sizeof(void*));
OPT_ASSERT(n->next == *last_next);
return n->next;
@@ -229,7 +245,7 @@
sizeof(struct node_s) + (get_rand(100000) & ~15),
sizeof(struct node_s) + 4096,
sizeof(struct node_s) + 4096*70};
- size_t size = sizes[get_rand(4)];
+ size_t size = check_size(sizes[get_rand(4)]);
p = stm_allocate(size);
nodeptr_t n = (nodeptr_t)p;
n->sig = SIGNATURE;
@@ -240,7 +256,6 @@
n->next = NULL;
*last_next = NULL;
pop_roots();
- /* reload_roots not necessary, all are old after start_transaction */
break;
case 4: // read and validate 'p'
read_barrier(p);
@@ -288,6 +303,15 @@
return p;
}
+static void end_gut(void)
+{
+ if (td.globally_unique != 0) {
+ fprintf(stderr, "[GUT END]");
+ assert(progress == td.globally_unique);
+ td.globally_unique = 0;
+ stm_resume_all_other_threads();
+ }
+}
objptr_t do_step(objptr_t p)
{
@@ -308,8 +332,14 @@
return NULL;
} else if (get_rand(240) == 1) {
push_roots();
- stm_become_globally_unique_transaction(&stm_thread_local, "really");
- fprintf(stderr, "[GUT/%d]", (int)STM_SEGMENT->segment_num);
+ if (td.globally_unique == 0) {
+ stm_stop_all_other_threads();
+ td.globally_unique = progress;
+ fprintf(stderr, "[GUT/%d]", (int)STM_SEGMENT->segment_num);
+ }
+ else {
+ end_gut();
+ }
pop_roots();
return NULL;
}
@@ -347,37 +377,53 @@
objptr_t p;
- stm_start_transaction(&stm_thread_local);
+ stm_enter_transactional_zone(&stm_thread_local);
assert(td.num_roots >= td.num_roots_at_transaction_start);
td.num_roots = td.num_roots_at_transaction_start;
p = NULL;
pop_roots(); /* does nothing.. */
- reload_roots();
while (td.steps_left-->0) {
if (td.steps_left % 8 == 0)
fprintf(stdout, "#");
- assert(p == NULL || ((nodeptr_t)p)->sig == SIGNATURE);
+ int local_seg = STM_SEGMENT->segment_num;
+ int p_sig = p == NULL ? 0 : ((nodeptr_t)p)->sig;
+
+ assert(p == NULL || p_sig == SIGNATURE);
+ (void)local_seg;
+ (void)p_sig;
+
+ if (!td.globally_unique)
+ ++progress; /* racy, but good enough */
p = do_step(p);
if (p == (objptr_t)-1) {
push_roots();
+ end_gut();
long call_fork = (arg != NULL && *(long *)arg);
if (call_fork == 0) { /* common case */
- stm_commit_transaction();
- td.num_roots_at_transaction_start = td.num_roots;
- if (get_rand(100) < 98) {
- stm_start_transaction(&stm_thread_local);
- } else {
- stm_start_inevitable_transaction(&stm_thread_local);
+ if (get_rand(100) < 50) {
+ stm_leave_transactional_zone(&stm_thread_local);
+ /* Nothing here; it's unlikely that a different thread
+ manages to steal the detached inev transaction.
+ Give them a little chance with a usleep(). */
+ dprintf(("sleep...\n"));
+ usleep(1);
+ dprintf(("sleep done\n"));
+ td.num_roots_at_transaction_start = td.num_roots;
+ stm_enter_transactional_zone(&stm_thread_local);
+ }
+ else {
+ _stm_commit_transaction();
+ td.num_roots_at_transaction_start = td.num_roots;
+ _stm_start_transaction(&stm_thread_local);
}
td.num_roots = td.num_roots_at_transaction_start;
p = NULL;
pop_roots();
- reload_roots();
}
else {
/* run a fork() inside the transaction */
@@ -401,16 +447,17 @@
}
}
push_roots();
- stm_commit_transaction();
+ end_gut();
+ stm_force_transaction_break(&stm_thread_local);
/* even out the shadow stack before leaveframe: */
- stm_start_inevitable_transaction(&stm_thread_local);
+ stm_become_inevitable(&stm_thread_local, "before leaveframe");
while (td.num_roots > 0) {
td.num_roots--;
objptr_t t;
STM_POP_ROOT(stm_thread_local, t);
}
- stm_commit_transaction();
+ stm_leave_transactional_zone(&stm_thread_local);
stm_rewind_jmp_leaveframe(&stm_thread_local, &rjbuf);
stm_unregister_thread_local(&stm_thread_local);
diff --git a/c8/demo/demo_random2.c b/c8/demo/demo_random2.c
--- a/c8/demo/demo_random2.c
+++ b/c8/demo/demo_random2.c
@@ -8,6 +8,8 @@
#include <sys/wait.h>
#include "stmgc.h"
+#include "stm/fprintcolor.h"
+#include "stm/fprintcolor.c"
#define NUMTHREADS 3
#define STEPS_PER_THREAD 50000
@@ -52,8 +54,10 @@
int active_roots_num;
long roots_on_ss;
long roots_on_ss_at_tr_start;
+ long globally_unique;
};
__thread struct thread_data td;
+static long progress = 1;
struct thread_data *_get_td(void)
{
@@ -61,9 +65,16 @@
}
+long check_size(long size)
+{
+ assert(size >= sizeof(struct node_s));
+ assert(size <= sizeof(struct node_s) + 4096*70);
+ return size;
+}
+
ssize_t stmcb_size_rounded_up(struct object_s *ob)
{
- return ((struct node_s*)ob)->my_size;
+ return check_size(((struct node_s*)ob)->my_size);
}
void stmcb_trace(struct object_s *obj, void visit(object_t **))
@@ -73,7 +84,8 @@
/* and the same value at the end: */
/* note, ->next may be the same as last_next */
- nodeptr_t *last_next = (nodeptr_t*)((char*)n + n->my_size - sizeof(void*));
+ nodeptr_t *last_next = (nodeptr_t*)((char*)n + check_size(n->my_size)
+ - sizeof(void*));
assert(n->next == *last_next);
@@ -193,7 +205,8 @@
nodeptr_t n = (nodeptr_t)p;
/* and the same value at the end: */
- nodeptr_t TLPREFIX *last_next = (nodeptr_t TLPREFIX *)((stm_char*)n + n->my_size - sizeof(void*));
+ nodeptr_t TLPREFIX *last_next = (nodeptr_t TLPREFIX *)((stm_char*)n +
+ check_size(n->my_size) - sizeof(void*));
assert(n->next == *last_next);
n->next = (nodeptr_t)v;
*last_next = (nodeptr_t)v;
@@ -205,7 +218,8 @@
nodeptr_t n = (nodeptr_t)p;
/* and the same value at the end: */
- nodeptr_t TLPREFIX *last_next = (nodeptr_t TLPREFIX *)((stm_char*)n + n->my_size - sizeof(void*));
+ nodeptr_t TLPREFIX *last_next = (nodeptr_t TLPREFIX *)((stm_char*)n +
+ check_size(n->my_size) - sizeof(void*));
OPT_ASSERT(n->next == *last_next);
return n->next;
@@ -239,6 +253,7 @@
sizeof(struct node_s)+32, sizeof(struct node_s)+48,
sizeof(struct node_s) + (get_rand(100000) & ~15)};
size_t size = sizes[get_rand(sizeof(sizes) / sizeof(size_t))];
+ size = check_size(size);
p = stm_allocate(size);
nodeptr_t n = (nodeptr_t)p;
n->sig = SIGNATURE;
@@ -296,6 +311,16 @@
return p;
}
+static void end_gut(void)
+{
+ if (td.globally_unique != 0) {
+ fprintf(stderr, "[GUT END]");
+ assert(progress == td.globally_unique);
+ td.globally_unique = 0;
+ stm_resume_all_other_threads();
+ }
+}
+
void frame_loop();
objptr_t do_step(objptr_t p)
{
@@ -309,13 +334,22 @@
p = simple_events(p, _r);
} else if (get_rand(20) == 1) {
long pushed = push_roots();
- stm_commit_transaction();
- td.roots_on_ss_at_tr_start = td.roots_on_ss;
-
- if (get_rand(100) < 98) {
- stm_start_transaction(&stm_thread_local);
- } else {
- stm_start_inevitable_transaction(&stm_thread_local);
+ end_gut();
+ if (get_rand(100) < 95) {
+ stm_leave_transactional_zone(&stm_thread_local);
+ /* Nothing here; it's unlikely that a different thread
+ manages to steal the detached inev transaction.
+ Give them a little chance with a usleep(). */
+ dprintf(("sleep...\n"));
+ usleep(1);
+ dprintf(("sleep done\n"));
+ td.roots_on_ss_at_tr_start = td.roots_on_ss;
+ stm_enter_transactional_zone(&stm_thread_local);
+ }
+ else {
+ _stm_commit_transaction();
+ td.roots_on_ss_at_tr_start = td.roots_on_ss;
+ _stm_start_transaction(&stm_thread_local);
}
td.roots_on_ss = td.roots_on_ss_at_tr_start;
td.active_roots_num = 0;
@@ -331,15 +365,21 @@
} else if (get_rand(20) == 1) {
long pushed = push_roots();
stm_become_inevitable(&stm_thread_local, "please");
- assert(stm_is_inevitable());
+ assert(stm_is_inevitable(&stm_thread_local));
pop_roots(pushed);
p= NULL;
} else if (get_rand(20) == 1) {
p = (objptr_t)-1; // possibly fork
- } else if (get_rand(20) == 1) {
+ } else if (get_rand(100) == 1) {
long pushed = push_roots();
- stm_become_globally_unique_transaction(&stm_thread_local, "really");
- fprintf(stderr, "[GUT/%d]", (int)STM_SEGMENT->segment_num);
+ if (td.globally_unique == 0) {
+ stm_stop_all_other_threads();
+ td.globally_unique = progress;
+ fprintf(stderr, "[GUT/%d]", (int)STM_SEGMENT->segment_num);
+ }
+ else {
+ end_gut();
+ }
pop_roots(pushed);
p = NULL;
}
@@ -364,6 +404,8 @@
p = do_step(p);
+ if (!td.globally_unique)
+ ++progress; /* racy, but good enough */
if (p == (objptr_t)-1) {
p = NULL;
@@ -371,6 +413,7 @@
long call_fork = (thread_may_fork != NULL && *(long *)thread_may_fork);
if (call_fork) { /* common case */
long pushed = push_roots();
+ end_gut();
/* run a fork() inside the transaction */
printf("========== FORK =========\n");
*(long*)thread_may_fork = 0;
@@ -426,7 +469,7 @@
setup_thread();
td.roots_on_ss_at_tr_start = 0;
- stm_start_transaction(&stm_thread_local);
+ stm_enter_transactional_zone(&stm_thread_local);
td.roots_on_ss = td.roots_on_ss_at_tr_start;
td.active_roots_num = 0;
@@ -435,7 +478,8 @@
frame_loop();
}
- stm_commit_transaction();
+ end_gut();
+ stm_leave_transactional_zone(&stm_thread_local);
stm_rewind_jmp_leaveframe(&stm_thread_local, &rjbuf);
stm_unregister_thread_local(&stm_thread_local);
diff --git a/c8/demo/demo_simple.c b/c8/demo/demo_simple.c
--- a/c8/demo/demo_simple.c
+++ b/c8/demo/demo_simple.c
@@ -70,18 +70,20 @@
object_t *tmp;
int i = 0;
+
+ stm_enter_transactional_zone(&stm_thread_local);
while (i < ITERS) {
- stm_start_transaction(&stm_thread_local);
tl_counter++;
if (i % 500 < 250)
STM_PUSH_ROOT(stm_thread_local, stm_allocate(16));//gl_counter++;
else
STM_POP_ROOT(stm_thread_local, tmp);
- stm_commit_transaction();
+ stm_force_transaction_break(&stm_thread_local);
i++;
}
OPT_ASSERT(org == (char *)stm_thread_local.shadowstack);
+ stm_leave_transactional_zone(&stm_thread_local);
stm_rewind_jmp_leaveframe(&stm_thread_local, &rjbuf);
stm_unregister_thread_local(&stm_thread_local);
diff --git a/c8/demo/test_shadowstack.c b/c8/demo/test_shadowstack.c
--- a/c8/demo/test_shadowstack.c
+++ b/c8/demo/test_shadowstack.c
@@ -43,17 +43,16 @@
stm_register_thread_local(&stm_thread_local);
stm_rewind_jmp_enterframe(&stm_thread_local, &rjbuf);
- stm_start_transaction(&stm_thread_local);
+ stm_enter_transactional_zone(&stm_thread_local);
node_t *node = (node_t *)stm_allocate(sizeof(struct node_s));
node->value = 129821;
STM_PUSH_ROOT(stm_thread_local, node);
STM_PUSH_ROOT(stm_thread_local, 333); /* odd value */
- stm_commit_transaction();
/* now in a new transaction, pop the node off the shadowstack, but
then do a major collection. It should still be found by the
tracing logic. */
- stm_start_transaction(&stm_thread_local);
+ stm_force_transaction_break(&stm_thread_local);
STM_POP_ROOT_RET(stm_thread_local);
STM_POP_ROOT(stm_thread_local, node);
assert(node->value == 129821);
diff --git a/c8/stm/atomic.h b/c8/stm/atomic.h
--- a/c8/stm/atomic.h
+++ b/c8/stm/atomic.h
@@ -24,15 +24,21 @@
#if defined(__i386__) || defined(__amd64__)
-# define HAVE_FULL_EXCHANGE_INSN
static inline void spin_loop(void) { asm("pause" : : : "memory"); }
static inline void write_fence(void) { asm("" : : : "memory"); }
+/*# define atomic_exchange(ptr, old, new) do { \
+ (old) = __sync_lock_test_and_set(ptr, new); \
+ } while (0)*/
#else
static inline void spin_loop(void) { asm("" : : : "memory"); }
static inline void write_fence(void) { __sync_synchronize(); }
+/*# define atomic_exchange(ptr, old, new) do { \
+ (old) = *(ptr); \
+ } while (UNLIKELY(!__sync_bool_compare_and_swap(ptr, old, new))); */
+
#endif
diff --git a/c8/stm/core.c b/c8/stm/core.c
--- a/c8/stm/core.c
+++ b/c8/stm/core.c
@@ -324,10 +324,7 @@
/* Don't check this 'cl'. This entry is already checked */
if (STM_PSEGMENT->transaction_state == TS_INEVITABLE) {
- //assert(first_cl->next == INEV_RUNNING);
- /* the above assert may fail when running a major collection
- while the commit of the inevitable transaction is in progress
- and the element is already attached */
+ assert(first_cl->next == INEV_RUNNING);
return true;
}
@@ -496,11 +493,23 @@
static void wait_for_other_inevitable(struct stm_commit_log_entry_s *old)
{
+ intptr_t detached = fetch_detached_transaction();
+ if (detached != 0) {
+ commit_fetched_detached_transaction(detached);
+ return;
+ }
+
timing_event(STM_SEGMENT->running_thread, STM_WAIT_OTHER_INEVITABLE);
while (old->next == INEV_RUNNING && !safe_point_requested()) {
spin_loop();
usleep(10); /* XXXXXX */
+
+ detached = fetch_detached_transaction();
+ if (detached != 0) {
+ commit_fetched_detached_transaction(detached);
+ break;
+ }
}
timing_event(STM_SEGMENT->running_thread, STM_WAIT_DONE);
}
@@ -509,7 +518,8 @@
static void readd_wb_executed_flags(void);
static void check_all_write_barrier_flags(char *segbase, struct list_s *list);
-static void _validate_and_attach(struct stm_commit_log_entry_s *new)
+static bool _validate_and_attach(struct stm_commit_log_entry_s *new,
+ bool can_sleep)
{
struct stm_commit_log_entry_s *old;
@@ -571,6 +581,8 @@
/* XXXXXX for now just sleep. We should really ask to inev
transaction to do the commit for us, and then we can
continue running. */
+ if (!can_sleep)
+ return false;
dprintf(("_validate_and_attach(%p) failed, "
"waiting for inevitable\n", new));
wait_for_other_inevitable(old);
@@ -591,18 +603,17 @@
if (is_commit) {
/* compare with _validate_and_add_to_commit_log */
- STM_PSEGMENT->transaction_state = TS_NONE;
- STM_PSEGMENT->safe_point = SP_NO_TRANSACTION;
-
list_clear(STM_PSEGMENT->modified_old_objects);
STM_PSEGMENT->last_commit_log_entry = new;
release_modification_lock_wr(STM_SEGMENT->segment_num);
}
+ return true;
}
-static void _validate_and_turn_inevitable(void)
+static bool _validate_and_turn_inevitable(bool can_sleep)
{
- _validate_and_attach((struct stm_commit_log_entry_s *)INEV_RUNNING);
+ return _validate_and_attach((struct stm_commit_log_entry_s *)INEV_RUNNING,
+ can_sleep);
}
static void _validate_and_add_to_commit_log(void)
@@ -611,6 +622,8 @@
new = _create_commit_log_entry();
if (STM_PSEGMENT->transaction_state == TS_INEVITABLE) {
+ assert(_stm_detached_inevitable_from_thread == 0); /* running it */
+
old = STM_PSEGMENT->last_commit_log_entry;
new->rev_num = old->rev_num + 1;
OPT_ASSERT(old->next == INEV_RUNNING);
@@ -621,17 +634,18 @@
STM_PSEGMENT->modified_old_objects);
/* compare with _validate_and_attach: */
- STM_PSEGMENT->transaction_state = TS_NONE;
- STM_PSEGMENT->safe_point = SP_NO_TRANSACTION;
+ acquire_modification_lock_wr(STM_SEGMENT->segment_num);
list_clear(STM_PSEGMENT->modified_old_objects);
STM_PSEGMENT->last_commit_log_entry = new;
/* do it: */
bool yes = __sync_bool_compare_and_swap(&old->next, INEV_RUNNING, new);
OPT_ASSERT(yes);
+
+ release_modification_lock_wr(STM_SEGMENT->segment_num);
}
else {
- _validate_and_attach(new);
+ _validate_and_attach(new, /*can_sleep=*/true);
}
}
@@ -1123,7 +1137,7 @@
-static void _stm_start_transaction(stm_thread_local_t *tl)
+static void _do_start_transaction(stm_thread_local_t *tl)
{
assert(!_stm_in_transaction(tl));
@@ -1140,7 +1154,7 @@
#endif
STM_PSEGMENT->shadowstack_at_start_of_transaction = tl->shadowstack;
STM_PSEGMENT->threadlocal_at_start_of_transaction = tl->thread_local_obj;
-
+ STM_PSEGMENT->total_throw_away_nursery = 0;
assert(list_is_empty(STM_PSEGMENT->modified_old_objects));
assert(list_is_empty(STM_PSEGMENT->large_overflow_objects));
@@ -1181,35 +1195,34 @@
stm_validate();
}
-long stm_start_transaction(stm_thread_local_t *tl)
+#ifdef STM_NO_AUTOMATIC_SETJMP
+static int did_abort = 0;
+#endif
+
+long _stm_start_transaction(stm_thread_local_t *tl)
{
s_mutex_lock();
#ifdef STM_NO_AUTOMATIC_SETJMP
- long repeat_count = 0; /* test/support.py */
+ long repeat_count = did_abort; /* test/support.py */
+ did_abort = 0;
#else
long repeat_count = stm_rewind_jmp_setjmp(tl);
#endif
- _stm_start_transaction(tl);
+ _do_start_transaction(tl);
+
+ if (repeat_count == 0) { /* else, 'nursery_mark' was already set
+ in abort_data_structures_from_segment_num() */
+ STM_SEGMENT->nursery_mark = ((stm_char *)_stm_nursery_start +
+ stm_fill_mark_nursery_bytes);
+ }
return repeat_count;
}
-void stm_start_inevitable_transaction(stm_thread_local_t *tl)
-{
- /* used to be more efficient, starting directly an inevitable transaction,
- but there is no real point any more, I believe */
- rewind_jmp_buf rjbuf;
- stm_rewind_jmp_enterframe(tl, &rjbuf);
-
- stm_start_transaction(tl);
- stm_become_inevitable(tl, "start_inevitable_transaction");
-
- stm_rewind_jmp_leaveframe(tl, &rjbuf);
-}
-
#ifdef STM_NO_AUTOMATIC_SETJMP
void _test_run_abort(stm_thread_local_t *tl) __attribute__((noreturn));
-int stm_is_inevitable(void)
+int stm_is_inevitable(stm_thread_local_t *tl)
{
+ assert(STM_SEGMENT->running_thread == tl);
switch (STM_PSEGMENT->transaction_state) {
case TS_REGULAR: return 0;
case TS_INEVITABLE: return 1;
@@ -1224,6 +1237,7 @@
{
stm_thread_local_t *tl = STM_SEGMENT->running_thread;
+ assert(_has_mutex());
STM_PSEGMENT->safe_point = SP_NO_TRANSACTION;
STM_PSEGMENT->transaction_state = TS_NONE;
@@ -1231,7 +1245,15 @@
list_clear(STM_PSEGMENT->objects_pointing_to_nursery);
list_clear(STM_PSEGMENT->old_objects_with_cards_set);
list_clear(STM_PSEGMENT->large_overflow_objects);
- timing_event(tl, event);
+ if (tl != NULL)
+ timing_event(tl, event);
+
+ /* If somebody is waiting for us to reach a safe point, we simply
+ signal it now and leave this transaction. This should be enough
+ for synchronize_all_threads() to retry and notice that we are
+ no longer SP_RUNNING. */
+ if (STM_SEGMENT->nursery_end != NURSERY_END)
+ cond_signal(C_AT_SAFE_POINT);
release_thread_segment(tl);
/* cannot access STM_SEGMENT or STM_PSEGMENT from here ! */
@@ -1280,24 +1302,55 @@
}
-void stm_commit_transaction(void)
+void _stm_commit_transaction(void)
+{
+ assert(STM_PSEGMENT->running_pthread == pthread_self());
+ _core_commit_transaction(/*external=*/ false);
+}
+
+static void _core_commit_transaction(bool external)
{
exec_local_finalizers();
assert(!_has_mutex());
assert(STM_PSEGMENT->safe_point == SP_RUNNING);
- assert(STM_PSEGMENT->running_pthread == pthread_self());
+ assert(STM_PSEGMENT->transaction_state != TS_NONE);
+ if (globally_unique_transaction) {
+ stm_fatalerror("cannot commit between stm_stop_all_other_threads "
+ "and stm_resume_all_other_threads");
+ }
- dprintf(("> stm_commit_transaction()\n"));
- minor_collection(1);
+ dprintf(("> stm_commit_transaction(external=%d)\n", (int)external));
+ minor_collection(/*commit=*/ true, external);
+ if (!external && is_major_collection_requested()) {
+ s_mutex_lock();
+ if (is_major_collection_requested()) { /* if still true */
+ major_collection_with_mutex();
+ }
+ s_mutex_unlock();
+ }
push_large_overflow_objects_to_other_segments();
/* push before validate. otherwise they are reachable too early */
+ if (external) {
+ /* from this point on, unlink the original 'stm_thread_local_t *'
+ from its segment. Better do it as soon as possible, because
+ other threads might be spin-looping, waiting for the -1 to
+ disappear. */
+ STM_SEGMENT->running_thread = NULL;
+ write_fence();
+ assert(_stm_detached_inevitable_from_thread == -1);
+ _stm_detached_inevitable_from_thread = 0;
+ }
+
bool was_inev = STM_PSEGMENT->transaction_state == TS_INEVITABLE;
_validate_and_add_to_commit_log();
- stm_rewind_jmp_forget(STM_SEGMENT->running_thread);
+ if (!was_inev) {
+ assert(!external);
+ stm_rewind_jmp_forget(STM_SEGMENT->running_thread);
+ }
/* XXX do we still need a s_mutex_lock() section here? */
s_mutex_lock();
@@ -1314,23 +1367,9 @@
invoke_and_clear_user_callbacks(0); /* for commit */
- /* >>>>> there may be a FORK() happening in the safepoint below <<<<<*/
- enter_safe_point_if_requested();
- assert(STM_SEGMENT->nursery_end == NURSERY_END);
-
- /* if a major collection is required, do it here */
- if (is_major_collection_requested()) {
- major_collection_with_mutex();
- }
-
- _verify_cards_cleared_in_all_lists(get_priv_segment(STM_SEGMENT->segment_num));
-
- if (globally_unique_transaction && was_inev) {
- committed_globally_unique_transaction();
- }
-
/* done */
stm_thread_local_t *tl = STM_SEGMENT->running_thread;
+ assert(external == (tl == NULL));
_finish_transaction(STM_TRANSACTION_COMMIT);
/* cannot access STM_SEGMENT or STM_PSEGMENT from here ! */
@@ -1338,7 +1377,8 @@
/* between transactions, call finalizers. this will execute
a transaction itself */
- invoke_general_finalizers(tl);
+ if (tl != NULL)
+ invoke_general_finalizers(tl);
}
static void reset_modified_from_backup_copies(int segment_num)
@@ -1399,7 +1439,7 @@
abort_finalizers(pseg);
- long bytes_in_nursery = throw_away_nursery(pseg);
+ throw_away_nursery(pseg);
/* clear CARD_MARKED on objs (don't care about CARD_MARKED_OLD) */
LIST_FOREACH_R(pseg->old_objects_with_cards_set, object_t * /*item*/,
@@ -1433,7 +1473,26 @@
assert(tl->shadowstack == pseg->shadowstack_at_start_of_transaction);
#endif
tl->thread_local_obj = pseg->threadlocal_at_start_of_transaction;
- tl->last_abort__bytes_in_nursery = bytes_in_nursery;
+
+
+ /* Set the next nursery_mark: first compute the value that
+ nursery_mark must have had at the start of the aborted transaction */
+ stm_char *old_mark = pseg->pub.nursery_mark + pseg->total_throw_away_nursery;
+
+ /* This means that the limit, in terms of bytes, was: */
+ uintptr_t old_limit = old_mark - (stm_char *)_stm_nursery_start;
+
+ /* If 'total_throw_away_nursery' is smaller than old_limit, use that */
+ if (pseg->total_throw_away_nursery < old_limit)
+ old_limit = pseg->total_throw_away_nursery;
+
+ /* Now set the new limit to 90% of the old limit */
+ pseg->pub.nursery_mark = ((stm_char *)_stm_nursery_start +
+ (uintptr_t)(old_limit * 0.9));
+
+#ifdef STM_NO_AUTOMATIC_SETJMP
+ did_abort = 1;
+#endif
list_clear(pseg->objects_pointing_to_nursery);
list_clear(pseg->old_objects_with_cards_set);
@@ -1502,36 +1561,40 @@
void _stm_become_inevitable(const char *msg)
{
- if (STM_PSEGMENT->transaction_state == TS_REGULAR) {
+ assert(STM_PSEGMENT->transaction_state == TS_REGULAR);
+ _stm_collectable_safe_point();
+
+ if (msg != MSG_INEV_DONT_SLEEP) {
dprintf(("become_inevitable: %s\n", msg));
- _stm_collectable_safe_point();
timing_become_inevitable();
-
- _validate_and_turn_inevitable();
- STM_PSEGMENT->transaction_state = TS_INEVITABLE;
-
- stm_rewind_jmp_forget(STM_SEGMENT->running_thread);
- invoke_and_clear_user_callbacks(0); /* for commit */
+ _validate_and_turn_inevitable(/*can_sleep=*/true);
}
else {
- assert(STM_PSEGMENT->transaction_state == TS_INEVITABLE);
+ if (!_validate_and_turn_inevitable(/*can_sleep=*/false))
+ return;
+ timing_become_inevitable();
}
+ STM_PSEGMENT->transaction_state = TS_INEVITABLE;
+
+ stm_rewind_jmp_forget(STM_SEGMENT->running_thread);
+ invoke_and_clear_user_callbacks(0); /* for commit */
}
+#if 0
void stm_become_globally_unique_transaction(stm_thread_local_t *tl,
const char *msg)
{
- stm_become_inevitable(tl, msg); /* may still abort */
+ stm_become_inevitable(tl, msg);
s_mutex_lock();
synchronize_all_threads(STOP_OTHERS_AND_BECOME_GLOBALLY_UNIQUE);
s_mutex_unlock();
}
-
+#endif
void stm_stop_all_other_threads(void)
{
- if (!stm_is_inevitable()) /* may still abort */
+ if (!stm_is_inevitable(STM_SEGMENT->running_thread)) /* may still abort */
_stm_become_inevitable("stop_all_other_threads");
s_mutex_lock();
diff --git a/c8/stm/core.h b/c8/stm/core.h
--- a/c8/stm/core.h
+++ b/c8/stm/core.h
@@ -152,6 +152,9 @@
stm_char *sq_fragments[SYNC_QUEUE_SIZE];
int sq_fragsizes[SYNC_QUEUE_SIZE];
int sq_len;
+
+ /* For nursery_mark */
+ uintptr_t total_throw_away_nursery;
};
enum /* safe_point */ {
@@ -170,6 +173,8 @@
TS_INEVITABLE,
};
+#define MSG_INEV_DONT_SLEEP ((const char *)1)
+
#define in_transaction(tl) \
(get_segment((tl)->last_associated_segment_num)->running_thread == (tl))
@@ -297,6 +302,7 @@
static void _signal_handler(int sig, siginfo_t *siginfo, void *context);
static bool _stm_validate(void);
+static void _core_commit_transaction(bool external);
static inline bool was_read_remote(char *base, object_t *obj)
{
diff --git a/c8/stm/detach.c b/c8/stm/detach.c
new file mode 100644
--- /dev/null
+++ b/c8/stm/detach.c
@@ -0,0 +1,175 @@
+#ifndef _STM_CORE_H_
+# error "must be compiled via stmgc.c"
+#endif
+
+#include <errno.h>
+
+
+/* Idea: if stm_leave_transactional_zone() is quickly followed by
+ stm_enter_transactional_zone() in the same thread, then we should
+ simply try to have one inevitable transaction that does both sides.
+ This is useful if there are many such small interruptions.
+
+ stm_leave_transactional_zone() tries to make sure the transaction
+ is inevitable, and then sticks the current 'stm_thread_local_t *'
+ into _stm_detached_inevitable_from_thread.
+ stm_enter_transactional_zone() has a fast-path if the same
+ 'stm_thread_local_t *' is still there.
+
+ If a different thread grabs it, it atomically replaces the value in
+ _stm_detached_inevitable_from_thread with -1, commits the transaction
+ (this part involves, for example, reading the shadowstack of the
+ thread that originally detached), and at the point where we know the
+ original stm_thread_local_t is no longer relevant, we reset
+ _stm_detached_inevitable_from_thread to 0.
+*/
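The compare-and-swap protocol described in this comment can be modeled in miniature. The real variable holds a `stm_thread_local_t *`; this sketch uses plain integers as stand-in thread identifiers, with the same three states: 0 (nothing detached), a thread value (detached), -1 (being fetched). All names below are illustrative:

```c
#include <stdint.h>

/* Minimal model of the detach/reattach protocol.  Not the real stmgc
   code: integers stand in for 'stm_thread_local_t *' values. */
static volatile intptr_t detached_from_thread = 0;

/* leave: publish our thread as owning the detached inevitable
   transaction (the real code first makes the transaction inevitable) */
static void leave_zone(intptr_t tl)
{
    detached_from_thread = tl;
}

/* enter, fast path: succeeds only if the same thread reattaches to
   its own still-detached transaction */
static int enter_zone_fast_path(intptr_t tl)
{
    return __sync_bool_compare_and_swap(&detached_from_thread, tl, 0);
}

/* a different thread fetching the detached transaction marks it -1
   first, so nobody else can fetch it concurrently */
static int grab_detached(intptr_t old)
{
    return __sync_bool_compare_and_swap(&detached_from_thread, old, -1);
}
```

The -1 state is what the busy-loops in `_stm_reattach_transaction()` and `fetch_detached_transaction()` below spin on.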
+
+volatile intptr_t _stm_detached_inevitable_from_thread;
+
+
+static void setup_detach(void)
+{
+ _stm_detached_inevitable_from_thread = 0;
+}
+
+
+void _stm_leave_noninevitable_transactional_zone(void)
+{
+ int saved_errno = errno;
+ dprintf(("leave_noninevitable_transactional_zone\n"));
+ _stm_become_inevitable(MSG_INEV_DONT_SLEEP);
+
+ /* did it work? */
+ if (STM_PSEGMENT->transaction_state == TS_INEVITABLE) { /* yes */
+ dprintf(("leave_noninevitable_transactional_zone: now inevitable\n"));
+ stm_thread_local_t *tl = STM_SEGMENT->running_thread;
+ _stm_detach_inevitable_transaction(tl);
+ }
+ else { /* no */
+ dprintf(("leave_noninevitable_transactional_zone: commit\n"));
+ _stm_commit_transaction();
+ }
+ errno = saved_errno;
+}
+
+static void commit_external_inevitable_transaction(void)
+{
+ assert(STM_PSEGMENT->transaction_state == TS_INEVITABLE); /* can't abort */
+ _core_commit_transaction(/*external=*/ true);
+}
+
+void _stm_reattach_transaction(stm_thread_local_t *tl)
+{
+ intptr_t old;
+ int saved_errno = errno;
+ restart:
+ old = _stm_detached_inevitable_from_thread;
+ if (old != 0) {
+ if (old == -1) {
+ /* busy-loop: wait until _stm_detached_inevitable_from_thread
+ is reset to a value different from -1 */
+ dprintf(("reattach_transaction: busy wait...\n"));
+ while (_stm_detached_inevitable_from_thread == -1)
+ spin_loop();
+
+ /* then retry */
+ goto restart;
+ }
+
+ if (!__sync_bool_compare_and_swap(&_stm_detached_inevitable_from_thread,
+ old, -1))
+ goto restart;
+
+ stm_thread_local_t *old_tl = (stm_thread_local_t *)old;
+ int remote_seg_num = old_tl->last_associated_segment_num;
+ dprintf(("reattach_transaction: commit detached from seg %d\n",
+ remote_seg_num));
+
+ tl->last_associated_segment_num = remote_seg_num;
+ ensure_gs_register(remote_seg_num);
+ commit_external_inevitable_transaction();
+ }
+ dprintf(("reattach_transaction: start a new transaction\n"));
+ _stm_start_transaction(tl);
+ errno = saved_errno;
+}
+
+void stm_force_transaction_break(stm_thread_local_t *tl)
+{
+ dprintf(("> stm_force_transaction_break()\n"));
+ assert(STM_SEGMENT->running_thread == tl);
+ _stm_commit_transaction();
+ _stm_start_transaction(tl);
+}
+
+static intptr_t fetch_detached_transaction(void)
+{
+ intptr_t cur;
+ restart:
+ cur = _stm_detached_inevitable_from_thread;
+ if (cur == 0) { /* fast-path */
+ return 0; /* _stm_detached_inevitable_from_thread not changed */
+ }
+ if (cur == -1) {
+ /* busy-loop: wait until _stm_detached_inevitable_from_thread
+ is reset to a value different from -1 */
+ while (_stm_detached_inevitable_from_thread == -1)
+ spin_loop();
+ goto restart;
+ }
+ if (!__sync_bool_compare_and_swap(&_stm_detached_inevitable_from_thread,
+ cur, -1))
+ goto restart;
+
+ /* this is the only case where we grabbed a detached transaction.
+ _stm_detached_inevitable_from_thread is still -1, until
+ commit_fetched_detached_transaction() is called. */
+ assert(_stm_detached_inevitable_from_thread == -1);
+ return cur;
+}
+
+static void commit_fetched_detached_transaction(intptr_t old)
+{
+ /* Here, 'segnum' is the segment that contains the detached
+ inevitable transaction from fetch_detached_transaction(),
+ probably belonging to an unrelated thread. We fetched it,
+ which means that nobody else can concurrently fetch it now, but
+ everybody will see that there is still a concurrent inevitable
+ transaction. This should guarantee there are no race
+ conditions.
+ */
+ int mysegnum = STM_SEGMENT->segment_num;
+ int segnum = ((stm_thread_local_t *)old)->last_associated_segment_num;
+ dprintf(("commit_fetched_detached_transaction from seg %d\n", segnum));
+ assert(segnum > 0);
+
+ if (segnum != mysegnum) {
+ set_gs_register(get_segment_base(segnum));
+ }
+ commit_external_inevitable_transaction();
+
+ if (segnum != mysegnum) {
+ set_gs_register(get_segment_base(mysegnum));
+ }
+}
+
+static void commit_detached_transaction_if_from(stm_thread_local_t *tl)
+{
+ intptr_t old;
+ restart:
+ old = _stm_detached_inevitable_from_thread;
+ if (old == (intptr_t)tl) {
+ if (!__sync_bool_compare_and_swap(&_stm_detached_inevitable_from_thread,
+ old, -1))
+ goto restart;
+ commit_fetched_detached_transaction(old);
+ return;
+ }
+ if (old == -1) {
+ /* busy-loop: wait until _stm_detached_inevitable_from_thread
+ is reset to a value different from -1 */
+ while (_stm_detached_inevitable_from_thread == -1)
+ spin_loop();
+ goto restart;
+ }
+}
diff --git a/c8/stm/detach.h b/c8/stm/detach.h
new file mode 100644
--- /dev/null
+++ b/c8/stm/detach.h
@@ -0,0 +1,5 @@
+
+static void setup_detach(void);
+static intptr_t fetch_detached_transaction(void);
+static void commit_fetched_detached_transaction(intptr_t old);
+static void commit_detached_transaction_if_from(stm_thread_local_t *tl);
diff --git a/c8/stm/finalizer.c b/c8/stm/finalizer.c
--- a/c8/stm/finalizer.c
+++ b/c8/stm/finalizer.c
@@ -494,11 +494,11 @@
rewind_jmp_buf rjbuf;
stm_rewind_jmp_enterframe(tl, &rjbuf);
- stm_start_transaction(tl);
+ _stm_start_transaction(tl);
_execute_finalizers(&g_finalizers);
- stm_commit_transaction();
+ _stm_commit_transaction();
stm_rewind_jmp_leaveframe(tl, &rjbuf);
__sync_lock_release(&lock);
diff --git a/c8/stm/forksupport.c b/c8/stm/forksupport.c
--- a/c8/stm/forksupport.c
+++ b/c8/stm/forksupport.c
@@ -40,7 +40,7 @@
bool was_in_transaction = _stm_in_transaction(this_tl);
if (!was_in_transaction)
- stm_start_transaction(this_tl);
+ _stm_start_transaction(this_tl);
assert(in_transaction(this_tl));
stm_become_inevitable(this_tl, "fork");
@@ -73,7 +73,7 @@
s_mutex_unlock();
if (!was_in_transaction) {
- stm_commit_transaction();
+ _stm_commit_transaction();
}
dprintf(("forksupport_parent: continuing to run\n"));
@@ -159,7 +159,7 @@
assert(STM_SEGMENT->segment_num == segnum);
if (!fork_was_in_transaction) {
- stm_commit_transaction();
+ _stm_commit_transaction();
}
/* Done */
diff --git a/c8/stm/fprintcolor.h b/c8/stm/fprintcolor.h
--- a/c8/stm/fprintcolor.h
+++ b/c8/stm/fprintcolor.h
@@ -37,5 +37,6 @@
/* ------------------------------------------------------------ */
+__attribute__((unused))
static void stm_fatalerror(const char *format, ...)
__attribute__((format (printf, 1, 2), noreturn));
diff --git a/c8/stm/nursery.c b/c8/stm/nursery.c
--- a/c8/stm/nursery.c
+++ b/c8/stm/nursery.c
@@ -11,8 +11,13 @@
static uintptr_t _stm_nursery_start;
+#define DEFAULT_FILL_MARK_NURSERY_BYTES (NURSERY_SIZE / 4)
+
+uintptr_t stm_fill_mark_nursery_bytes = DEFAULT_FILL_MARK_NURSERY_BYTES;
+
/************************************************************/
+
static void setup_nursery(void)
{
assert(_STM_FAST_ALLOC <= NURSERY_SIZE);
@@ -309,6 +314,7 @@
else
assert(finalbase <= ssbase && ssbase <= current);
+ dprintf(("collect_roots_in_nursery:\n"));
while (current > ssbase) {
--current;
uintptr_t x = (uintptr_t)current->ss;
@@ -320,6 +326,7 @@
else {
/* it is an odd-valued marker, ignore */
}
+ dprintf((" %p: %p -> %p\n", current, (void *)x, current->ss));
}
minor_trace_if_young(&tl->thread_local_obj);
@@ -447,7 +454,7 @@
}
-static size_t throw_away_nursery(struct stm_priv_segment_info_s *pseg)
+static void throw_away_nursery(struct stm_priv_segment_info_s *pseg)
{
#pragma push_macro("STM_PSEGMENT")
#pragma push_macro("STM_SEGMENT")
@@ -480,7 +487,9 @@
# endif
#endif
+ pseg->total_throw_away_nursery += nursery_used;
pseg->pub.nursery_current = (stm_char *)_stm_nursery_start;
+ pseg->pub.nursery_mark -= nursery_used;
/* free any object left from 'young_outside_nursery' */
if (!tree_is_cleared(pseg->young_outside_nursery)) {
@@ -505,8 +514,6 @@
}
tree_clear(pseg->nursery_objects_shadows);
-
- return nursery_used;
#pragma pop_macro("STM_SEGMENT")
#pragma pop_macro("STM_PSEGMENT")
}
@@ -519,6 +526,7 @@
static void _do_minor_collection(bool commit)
{
dprintf(("minor_collection commit=%d\n", (int)commit));
+ assert(!STM_SEGMENT->no_safe_point_here);
STM_PSEGMENT->minor_collect_will_commit_now = commit;
@@ -561,11 +569,12 @@
assert(MINOR_NOTHING_TO_DO(STM_PSEGMENT));
}
-static void minor_collection(bool commit)
+static void minor_collection(bool commit, bool external)
{
assert(!_has_mutex());
- stm_safe_point();
+ if (!external)
+ stm_safe_point();
timing_event(STM_SEGMENT->running_thread, STM_GC_MINOR_START);
@@ -579,7 +588,7 @@
if (level > 0)
force_major_collection_request();
- minor_collection(/*commit=*/ false);
+ minor_collection(/*commit=*/ false, /*external=*/ false);
#ifdef STM_TESTS
/* tests don't want aborts in stm_allocate, thus
diff --git a/c8/stm/nursery.h b/c8/stm/nursery.h
--- a/c8/stm/nursery.h
+++ b/c8/stm/nursery.h
@@ -10,9 +10,9 @@
object_t *obj, uint8_t mark_value,
bool mark_all, bool really_clear);
-static void minor_collection(bool commit);
+static void minor_collection(bool commit, bool external);
static void check_nursery_at_transaction_start(void);
-static size_t throw_away_nursery(struct stm_priv_segment_info_s *pseg);
+static void throw_away_nursery(struct stm_priv_segment_info_s *pseg);
static void major_do_validation_and_minor_collections(void);
static void assert_memset_zero(void *s, size_t n);
diff --git a/c8/stm/setup.c b/c8/stm/setup.c
--- a/c8/stm/setup.c
+++ b/c8/stm/setup.c
@@ -134,8 +134,12 @@
setup_pages();
setup_forksupport();
setup_finalizer();
+ setup_detach();
set_gs_register(get_segment_base(0));
+
+ dprintf(("nursery: %p -> %p\n", (void *)NURSERY_START,
+ (void *)NURSERY_END));
}
void stm_teardown(void)
@@ -229,6 +233,8 @@
{
int num;
s_mutex_lock();
+ tl->self = tl; /* for faster access to &stm_thread_local (and easier
+ from the PyPy JIT, too) */
if (stm_all_thread_locals == NULL) {
stm_all_thread_locals = tl->next = tl->prev = tl;
num = 0;
@@ -263,6 +269,8 @@
void stm_unregister_thread_local(stm_thread_local_t *tl)
{
+ commit_detached_transaction_if_from(tl);
+
s_mutex_lock();
assert(tl->prev != NULL);
assert(tl->next != NULL);
diff --git a/c8/stm/sync.c b/c8/stm/sync.c
--- a/c8/stm/sync.c
+++ b/c8/stm/sync.c
@@ -1,6 +1,7 @@
#include <sys/syscall.h>
#include <sys/prctl.h>
#include <asm/prctl.h>
+#include <time.h>
#ifndef _STM_CORE_H_
# error "must be compiled via stmgc.c"
@@ -21,25 +22,29 @@
static void setup_sync(void)
{
- if (pthread_mutex_init(&sync_ctl.global_mutex, NULL) != 0)
- stm_fatalerror("mutex initialization: %m");
+ int err = pthread_mutex_init(&sync_ctl.global_mutex, NULL);
+ if (err != 0)
+ stm_fatalerror("mutex initialization: %d", err);
long i;
for (i = 0; i < _C_TOTAL; i++) {
- if (pthread_cond_init(&sync_ctl.cond[i], NULL) != 0)
- stm_fatalerror("cond initialization: %m");
+ err = pthread_cond_init(&sync_ctl.cond[i], NULL);
+ if (err != 0)
+ stm_fatalerror("cond initialization: %d", err);
}
}
static void teardown_sync(void)
{
- if (pthread_mutex_destroy(&sync_ctl.global_mutex) != 0)
- stm_fatalerror("mutex destroy: %m");
+ int err = pthread_mutex_destroy(&sync_ctl.global_mutex);
+ if (err != 0)
+ stm_fatalerror("mutex destroy: %d", err);
long i;
for (i = 0; i < _C_TOTAL; i++) {
- if (pthread_cond_destroy(&sync_ctl.cond[i]) != 0)
- stm_fatalerror("cond destroy: %m");
+ err = pthread_cond_destroy(&sync_ctl.cond[i]);
+ if (err != 0)
+ stm_fatalerror("cond destroy: %d", err);
}
memset(&sync_ctl, 0, sizeof(sync_ctl));
@@ -59,19 +64,30 @@
stm_fatalerror("syscall(arch_prctl, ARCH_SET_GS): %m");
}
+static void ensure_gs_register(long segnum)
+{
+ /* XXX use this instead of set_gs_register() in many places */
+ if (STM_SEGMENT->segment_num != segnum) {
+ set_gs_register(get_segment_base(segnum));
+ assert(STM_SEGMENT->segment_num == segnum);
+ }
+}
+
static inline void s_mutex_lock(void)
{
assert(!_has_mutex_here);
- if (UNLIKELY(pthread_mutex_lock(&sync_ctl.global_mutex) != 0))
- stm_fatalerror("pthread_mutex_lock: %m");
+ int err = pthread_mutex_lock(&sync_ctl.global_mutex);
+ if (UNLIKELY(err != 0))
+ stm_fatalerror("pthread_mutex_lock: %d", err);
assert((_has_mutex_here = true, 1));
}
static inline void s_mutex_unlock(void)
{
assert(_has_mutex_here);
- if (UNLIKELY(pthread_mutex_unlock(&sync_ctl.global_mutex) != 0))
- stm_fatalerror("pthread_mutex_unlock: %m");
+ int err = pthread_mutex_unlock(&sync_ctl.global_mutex);
+ if (UNLIKELY(err != 0))
+ stm_fatalerror("pthread_mutex_unlock: %d", err);
assert((_has_mutex_here = false, 1));
}
@@ -83,26 +99,70 @@
#endif
assert(_has_mutex_here);
- if (UNLIKELY(pthread_cond_wait(&sync_ctl.cond[ctype],
- &sync_ctl.global_mutex) != 0))
- stm_fatalerror("pthread_cond_wait/%d: %m", (int)ctype);
+ int err = pthread_cond_wait(&sync_ctl.cond[ctype],
+ &sync_ctl.global_mutex);
+ if (UNLIKELY(err != 0))
+ stm_fatalerror("pthread_cond_wait/%d: %d", (int)ctype, err);
+}
+
+static inline void timespec_delay(struct timespec *t, double incr)
+{
+#ifdef CLOCK_REALTIME
+ clock_gettime(CLOCK_REALTIME, t);
+#else
+ struct timeval tv;
+ RPY_GETTIMEOFDAY(&tv);
+ t->tv_sec = tv.tv_sec;
+ t->tv_nsec = tv.tv_usec * 1000 + 999;
+#endif
+ /* assumes that "incr" is less than 1 second */
+ long nsec = t->tv_nsec + (long)(incr * 1000000000.0);
+ if (nsec >= 1000000000) {
+ t->tv_sec += 1;
+ nsec -= 1000000000;
+ assert(nsec < 1000000000);
+ }
+ t->tv_nsec = nsec;
+}
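The nanosecond carry in `timespec_delay()` above can be isolated as a small pure function for clarity; this is a stand-in mirroring the same arithmetic, assuming the delay is under one second:

```c
#include <time.h>

/* Sketch of the timespec arithmetic used by timespec_delay() above:
   advance 't' by a sub-second delay 'incr' (in seconds), carrying
   nanosecond overflow into tv_sec.  Illustrative stand-in. */
static struct timespec add_delay(struct timespec t, double incr)
{
    long nsec = t.tv_nsec + (long)(incr * 1000000000.0);
    if (nsec >= 1000000000) {
        t.tv_sec += 1;
        nsec -= 1000000000;
    }
    t.tv_nsec = nsec;
    return t;
}
```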
+
+static inline bool cond_wait_timeout(enum cond_type_e ctype, double delay)
+{
+#ifdef STM_NO_COND_WAIT
+ stm_fatalerror("*** cond_wait/%d called!", (int)ctype);
+#endif
+
+ assert(_has_mutex_here);
+
+ struct timespec t;
+ timespec_delay(&t, delay);
+
+ int err = pthread_cond_timedwait(&sync_ctl.cond[ctype],
+ &sync_ctl.global_mutex, &t);
+ if (err == 0)
+ return true; /* success */
+ if (LIKELY(err == ETIMEDOUT))
+ return false; /* timeout */
+ stm_fatalerror("pthread_cond_timedwait/%d: %d", (int)ctype, err);
}
static inline void cond_signal(enum cond_type_e ctype)
{
- if (UNLIKELY(pthread_cond_signal(&sync_ctl.cond[ctype]) != 0))
- stm_fatalerror("pthread_cond_signal/%d: %m", (int)ctype);
+ int err = pthread_cond_signal(&sync_ctl.cond[ctype]);
+ if (UNLIKELY(err != 0))
+ stm_fatalerror("pthread_cond_signal/%d: %d", (int)ctype, err);
}
static inline void cond_broadcast(enum cond_type_e ctype)
{
- if (UNLIKELY(pthread_cond_broadcast(&sync_ctl.cond[ctype]) != 0))
- stm_fatalerror("pthread_cond_broadcast/%d: %m", (int)ctype);
+ int err = pthread_cond_broadcast(&sync_ctl.cond[ctype]);
+ if (UNLIKELY(err != 0))
+ stm_fatalerror("pthread_cond_broadcast/%d: %d", (int)ctype, err);
}
/************************************************************/
+#if 0
void stm_wait_for_current_inevitable_transaction(void)
{
restart:
@@ -125,7 +185,7 @@
}
s_mutex_unlock();
}
-
+#endif
static bool acquire_thread_segment(stm_thread_local_t *tl)
@@ -155,10 +215,12 @@
num = (num+1) % (NB_SEGMENTS-1);
if (sync_ctl.in_use1[num+1] == 0) {
/* we're getting 'num', a different number. */
- dprintf(("acquired different segment: %d->%d\n",
- tl->last_associated_segment_num, num+1));
+ int old_num = tl->last_associated_segment_num;
+ dprintf(("acquired different segment: %d->%d\n", old_num, num+1));
tl->last_associated_segment_num = num+1;
set_gs_register(get_segment_base(num+1));
+ dprintf((" %d->%d\n", old_num, num+1));
+ (void)old_num;
goto got_num;
}
}
@@ -185,18 +247,22 @@
static void release_thread_segment(stm_thread_local_t *tl)
{
+ int segnum;
assert(_has_mutex());
cond_signal(C_SEGMENT_FREE);
assert(STM_SEGMENT->running_thread == tl);
- assert(tl->last_associated_segment_num == STM_SEGMENT->segment_num);
- assert(in_transaction(tl));
- STM_SEGMENT->running_thread = NULL;
- assert(!in_transaction(tl));
+ segnum = STM_SEGMENT->segment_num;
+ if (tl != NULL) {
+ assert(tl->last_associated_segment_num == segnum);
+ assert(in_transaction(tl));
+ STM_SEGMENT->running_thread = NULL;
+ assert(!in_transaction(tl));
+ }
- assert(sync_ctl.in_use1[tl->last_associated_segment_num] == 1);
- sync_ctl.in_use1[tl->last_associated_segment_num] = 0;
+ assert(sync_ctl.in_use1[segnum] == 1);
+ sync_ctl.in_use1[segnum] = 0;
}
__attribute__((unused))
@@ -263,16 +329,19 @@
}
assert(!pause_signalled);
pause_signalled = true;
+ dprintf(("request to pause\n"));
}
static inline long count_other_threads_sp_running(void)
{
/* Return the number of other threads in SP_RUNNING.
- Asserts that SP_RUNNING threads still have the NSE_SIGxxx. */
+ Asserts that SP_RUNNING threads still have the NSE_SIGxxx.
+ (A detached inevitable transaction is still SP_RUNNING.) */
long i;
long result = 0;
- int my_num = STM_SEGMENT->segment_num;
+ int my_num;
+ my_num = STM_SEGMENT->segment_num;
for (i = 1; i < NB_SEGMENTS; i++) {
if (i != my_num && get_priv_segment(i)->safe_point == SP_RUNNING) {
assert(get_segment(i)->nursery_end <= _STM_NSE_SIGNAL_MAX);
@@ -295,6 +364,7 @@
if (get_segment(i)->nursery_end == NSE_SIGPAUSE)
get_segment(i)->nursery_end = NURSERY_END;
}
+ dprintf(("request removed\n"));
cond_broadcast(C_REQUEST_REMOVED);
}
@@ -312,6 +382,8 @@
if (STM_SEGMENT->nursery_end == NURSERY_END)
break; /* no safe point requested */
+ dprintf(("enter safe point\n"));
+ assert(!STM_SEGMENT->no_safe_point_here);
assert(STM_SEGMENT->nursery_end == NSE_SIGPAUSE);
assert(pause_signalled);
@@ -326,11 +398,15 @@
cond_wait(C_REQUEST_REMOVED);
STM_PSEGMENT->safe_point = SP_RUNNING;
timing_event(STM_SEGMENT->running_thread, STM_WAIT_DONE);
+ assert(!STM_SEGMENT->no_safe_point_here);
+ dprintf(("left safe point\n"));
}
}
static void synchronize_all_threads(enum sync_type_e sync_type)
{
+ restart:
+ assert(_has_mutex());
enter_safe_point_if_requested();
/* Only one thread should reach this point concurrently. This is
@@ -349,8 +425,19 @@
/* If some other threads are SP_RUNNING, we cannot proceed now.
Wait until all other threads are suspended. */
while (count_other_threads_sp_running() > 0) {
+
+ intptr_t detached = fetch_detached_transaction();
+ if (detached != 0) {
+ remove_requests_for_safe_point(); /* => C_REQUEST_REMOVED */
+ s_mutex_unlock();
+ commit_fetched_detached_transaction(detached);
+ s_mutex_lock();
+ goto restart;
+ }
+
STM_PSEGMENT->safe_point = SP_WAIT_FOR_C_AT_SAFE_POINT;
- cond_wait(C_AT_SAFE_POINT);
+ cond_wait_timeout(C_AT_SAFE_POINT, 0.00001);
+ /* every 10 microsec, try again fetch_detached_transaction() */
STM_PSEGMENT->safe_point = SP_RUNNING;
if (must_abort()) {
diff --git a/c8/stm/sync.h b/c8/stm/sync.h
--- a/c8/stm/sync.h
+++ b/c8/stm/sync.h
@@ -17,6 +17,7 @@
static bool _has_mutex(void);
#endif
static void set_gs_register(char *value);
+static void ensure_gs_register(long segnum);
/* acquire and release one of the segments for running the given thread
diff --git a/c8/stmgc.c b/c8/stmgc.c
--- a/c8/stmgc.c
+++ b/c8/stmgc.c
@@ -18,6 +18,7 @@
#include "stm/rewind_setjmp.h"
#include "stm/finalizer.h"
#include "stm/locks.h"
+#include "stm/detach.h"
#include "stm/misc.c"
#include "stm/list.c"
@@ -41,3 +42,4 @@
#include "stm/rewind_setjmp.c"
#include "stm/finalizer.c"
#include "stm/hashtable.c"
+#include "stm/detach.c"
diff --git a/c8/stmgc.h b/c8/stmgc.h
--- a/c8/stmgc.h
+++ b/c8/stmgc.h
@@ -13,6 +13,7 @@
#include <limits.h>
#include <unistd.h>
+#include "stm/atomic.h"
#include "stm/rewind_setjmp.h"
#if LONG_MAX == 2147483647
@@ -39,9 +40,11 @@
struct stm_segment_info_s {
uint8_t transaction_read_version;
+ uint8_t no_safe_point_here; /* set from outside, triggers an assert */
int segment_num;
char *segment_base;
stm_char *nursery_current;
+ stm_char *nursery_mark;
uintptr_t nursery_end;
struct stm_thread_local_s *running_thread;
};
@@ -65,13 +68,10 @@
the following raw region of memory is cleared. */
char *mem_clear_on_abort;
size_t mem_bytes_to_clear_on_abort;
- /* after an abort, some details about the abort are stored there.
- (this field is not modified on a successful commit) */
- long last_abort__bytes_in_nursery;
/* the next fields are handled internally by the library */
int last_associated_segment_num; /* always a valid seg num */
int thread_local_counter;
- struct stm_thread_local_s *prev, *next;
+ struct stm_thread_local_s *self, *prev, *next;
void *creating_pthread[2];
} stm_thread_local_t;
@@ -82,6 +82,17 @@
void _stm_write_slowpath_card(object_t *, uintptr_t);
object_t *_stm_allocate_slowpath(ssize_t);
object_t *_stm_allocate_external(ssize_t);
+
+extern volatile intptr_t _stm_detached_inevitable_from_thread;
+long _stm_start_transaction(stm_thread_local_t *tl);
+void _stm_commit_transaction(void);
+void _stm_leave_noninevitable_transactional_zone(void);
+#define _stm_detach_inevitable_transaction(tl) do { \
+ write_fence(); \
+ assert(_stm_detached_inevitable_from_thread == 0); \
+ _stm_detached_inevitable_from_thread = (intptr_t)(tl->self); \
+} while (0)
+void _stm_reattach_transaction(stm_thread_local_t *tl);
void _stm_become_inevitable(const char*);
void _stm_collectable_safe_point(void);
@@ -379,39 +390,92 @@
rewind_jmp_enum_shadowstack(&(tl)->rjthread, callback)
-/* Starting and ending transactions. stm_read(), stm_write() and
- stm_allocate() should only be called from within a transaction.
- The stm_start_transaction() call returns the number of times it
- returned, starting at 0. If it is > 0, then the transaction was
- aborted and restarted this number of times. */
-long stm_start_transaction(stm_thread_local_t *tl);
-void stm_start_inevitable_transaction(stm_thread_local_t *tl);
-void stm_commit_transaction(void);
+#ifdef STM_NO_AUTOMATIC_SETJMP
+int stm_is_inevitable(stm_thread_local_t *tl);
+#else
+static inline int stm_is_inevitable(stm_thread_local_t *tl) {
+ return !rewind_jmp_armed(&tl->rjthread);
+}
+#endif
-/* Temporary fix? Call this outside a transaction. If there is an
- inevitable transaction running somewhere else, wait until it finishes. */
-void stm_wait_for_current_inevitable_transaction(void);
+
+/* Entering and leaving a "transactional code zone": a (typically very
+ large) section in the code where we are running a transaction.
+ This is the STM equivalent to "acquire the GIL" and "release the
+ GIL", respectively. stm_read(), stm_write(), stm_allocate(), and
+ other functions should only be called from within a transaction.
+
+ Note that transactions, in the STM sense, cover _at least_ one
+ transactional code zone. They may be longer; for example, if one
+ thread does a lot of stm_enter_transactional_zone() +
+ stm_become_inevitable() + stm_leave_transactional_zone(), as is
+ typical in a thread that does a lot of C function calls, then we
+ get only a few bigger inevitable transactions that cover the many
+ short transactional zones. This is done by having
+ stm_leave_transactional_zone() turn the current transaction
+ inevitable and detach it from the running thread (if there is no
+ other inevitable transaction running so far). Then
+ stm_enter_transactional_zone() will try to reattach to it. This is
+ far more efficient than constantly starting and committing
+ transactions.
+
+ stm_enter_transactional_zone() and stm_leave_transactional_zone()
+ preserve the value of errno.
+*/
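The benefit described in this comment (many short transactional zones merging into one inevitable transaction) can be modeled with counters. This toy intentionally omits the real commit machinery; all names are illustrative, not stmgc API:

```c
/* Toy model of why detaching helps: a run of leave/enter pairs by the
   same thread extends one inevitable transaction instead of committing
   after every external call.  Counters stand in for real commits. */
static int commits = 0;
static int detached_tl = 0;        /* 0 = nothing detached */

static void leave_zone(int tl)
{
    detached_tl = tl;              /* detach instead of committing */
}

static void enter_zone(int tl)
{
    if (detached_tl == tl) {       /* fast path: reattach, no commit */
        detached_tl = 0;
    }
    else {                         /* slow path: commit the detached
                                      transaction, start a fresh one */
        if (detached_tl != 0) {
            commits++;
            detached_tl = 0;
        }
    }
}

static int run_external_calls(int tl, int n)
{
    commits = 0;
    for (int i = 0; i < n; i++) {
        leave_zone(tl);            /* e.g. around an external C call */
        enter_zone(tl);
    }
    return commits;   /* 0: the whole series shared one transaction */
}
```

This is the fix the changeset log refers to: previously each tiny external call produced its own transaction; now the series stays inside a single inevitable one.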
+#ifdef STM_DEBUGPRINT
+#include <stdio.h>
+#endif
+static inline void stm_enter_transactional_zone(stm_thread_local_t *tl) {
+ if (__sync_bool_compare_and_swap(&_stm_detached_inevitable_from_thread,
+ (intptr_t)tl, 0)) {
+#ifdef STM_DEBUGPRINT
+ fprintf(stderr, "stm_enter_transactional_zone fast path\n");
+#endif
+ }
+ else {
+ _stm_reattach_transaction(tl);
+ /* _stm_detached_inevitable_from_thread should be 0 here, but
+ it can already have been changed from a parallel thread
+ (assuming we're not inevitable ourselves) */
+ }
+}
+static inline void stm_leave_transactional_zone(stm_thread_local_t *tl) {
+ assert(STM_SEGMENT->running_thread == tl);
+ if (stm_is_inevitable(tl)) {
+#ifdef STM_DEBUGPRINT
+ fprintf(stderr, "stm_leave_transactional_zone fast path\n");
+#endif
+ _stm_detach_inevitable_transaction(tl);
+ }
+ else {
+ _stm_leave_noninevitable_transactional_zone();
+ }
+}
+
+/* stm_force_transaction_break() is in theory equivalent to
+ stm_leave_transactional_zone() immediately followed by
+ stm_enter_transactional_zone(); however, it is supposed to be
+ called in CPU-heavy threads that had a transaction run for a while,
+ and so it *always* forces a commit and starts the next transaction.
+ The new transaction is never inevitable. See also
+ stm_should_break_transaction(). */
+void stm_force_transaction_break(stm_thread_local_t *tl);
/* Abort the currently running transaction. This function never
- returns: it jumps back to the stm_start_transaction(). */
+ returns: it jumps back to the start of the transaction (which must
+ not be inevitable). */
void stm_abort_transaction(void) __attribute__((noreturn));
-#ifdef STM_NO_AUTOMATIC_SETJMP
-int stm_is_inevitable(void);
-#else
-static inline int stm_is_inevitable(void) {
- return !rewind_jmp_armed(&STM_SEGMENT->running_thread->rjthread);
-}
-#endif
-
/* Turn the current transaction inevitable.
stm_become_inevitable() itself may still abort the transaction instead
of returning. */
static inline void stm_become_inevitable(stm_thread_local_t *tl,
const char* msg) {
assert(STM_SEGMENT->running_thread == tl);
- if (!stm_is_inevitable())
+ if (!stm_is_inevitable(tl))
_stm_become_inevitable(msg);
+ /* now, we're running the inevitable transaction, so this var should be 0 */
+ assert(_stm_detached_inevitable_from_thread == 0);
}
/* Forces a safe-point if needed. Normally not needed: this is
@@ -425,6 +489,23 @@
void stm_collect(long level);
+/* A way to detect that we've run for a while and should call
+ stm_force_transaction_break() */
+static inline int stm_should_break_transaction(void)
+{
+ return ((intptr_t)STM_SEGMENT->nursery_current >=
+ (intptr_t)STM_SEGMENT->nursery_mark);
+}
+extern uintptr_t stm_fill_mark_nursery_bytes;
+/* ^^^ at the start of a transaction, 'nursery_mark' is initialized to
+ 'stm_fill_mark_nursery_bytes' inside the nursery. This value can
+ be larger than the nursery; every minor collection shifts the
+ current 'nursery_mark' down by one nursery-size. After an abort
+ and restart, 'nursery_mark' is set to ~90% of the value it reached
+ in the last attempt.
+*/
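As the declaration above shows, `stm_should_break_transaction()` is just a pointer comparison of the allocation cursor against `nursery_mark`. A stand-in with plain integers modeling the two pointers:

```c
#include <stdint.h>

/* Stand-in for stm_should_break_transaction(): true once the nursery
   allocation cursor has reached or passed nursery_mark.  Integers
   model the 'stm_char *' pointers; illustrative, not the real API. */
static int should_break(intptr_t nursery_current, intptr_t nursery_mark)
{
    return nursery_current >= nursery_mark;
}
```

A CPU-heavy thread polls this cheaply in its loop and calls `stm_force_transaction_break()` when it returns true.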
+
+
/* Prepare an immortal "prebuilt" object managed by the GC. Takes a
pointer to an 'object_t', which should not actually be a GC-managed
structure but a real static structure. Returns the equivalent
@@ -466,8 +547,8 @@
other threads. A very heavy-handed way to make sure that no other
transaction is running concurrently. Avoid as much as possible.
Other transactions will continue running only after this transaction
- commits. (xxx deprecated and may be removed) */
-void stm_become_globally_unique_transaction(stm_thread_local_t *tl, const char *msg);
+ commits. (deprecated, not working any more according to demo_random2) */
+//void stm_become_globally_unique_transaction(stm_thread_local_t *tl, const char *msg);
/* Moves the transaction forward in time by validating the read and
write set with all commits that happened since the last validation
diff --git a/c8/test/support.py b/c8/test/support.py