diff mbox

[1/3] optabs: ensure mem_thread_fence is a compiler barrier

Message ID alpine.LNX.2.20.13.1708071135520.29517@monopod.intra.ispras.ru
State New
Headers show

Commit Message

Alexander Monakov Aug. 7, 2017, 8:42 a.m. UTC
On Sat, 5 Aug 2017, Richard Sandiford wrote:
> It would be simpler to test whether targetm.gen_mem_thread_fence
> returns NULL.
> 
> This feels a bit hacky though.  Checking whether a generator produced no
> instructions is usually the test for whether the generator FAILed, which
> should normally be treated as though the pattern wasn't defined to start
> with.  This is useful if the FAIL condition is too complex to put in the
> define_expand/insn C condition.
> 
> In those contexts, if a generator wants to succeed and generate no
> instructions, it should emit a note instead.  (Yes, it would be good
> to have a nicer interface...)
> 
> So IMO it's confusing to overload the empty sequence to mean something
> else in this context, and it wouldn't work if a target does emit the kind
> of "I didn't FAIL" note just mentioned.

Thanks for the elucidation, I appreciate it.

> FWIW, I agree with Jeff in the previous thread that (compared to this)
> it would be more robust to make optabs.c emit the compile barrier
> unconditionally and just leave the target to emit a machine barrier.

OK, I've changed the patch accordingly.  I've also factored out the
is_mm_relaxed check in this iteration to simplify resulting code.

	PR target/80640
	* doc/md.texi (mem_thread_fence): Remove mention of mode.  Rewrite.
        * optabs.c (expand_mem_thread_fence): Emit a compiler barrier when
        using targetm.gen_mem_thread_fence.
testsuite/
	* gcc.dg/atomic/pr80640.c: New testcase.

Comments

Jeff Law Aug. 14, 2017, 8:41 p.m. UTC | #1
On 08/07/2017 02:42 AM, Alexander Monakov wrote:
> On Sat, 5 Aug 2017, Richard Sandiford wrote:
>> It would be simpler to test whether targetm.gen_mem_thread_fence
>> returns NULL.
>>
>> This feels a bit hacky though.  Checking whether a generator produced no
>> instructions is usually the test for whether the generator FAILed, which
>> should normally be treated as though the pattern wasn't defined to start
>> with.  This is useful if the FAIL condition is too complex to put in the
>> define_expand/insn C condition.
>>
>> In those contexts, if a generator wants to succeed and generate no
>> instructions, it should emit a note instead.  (Yes, it would be good
>> to have a nicer interface...)
>>
>> So IMO it's confusing to overload the empty sequence to mean something
>> else in this context, and it wouldn't work if a target does emit the kind
>> of "I didn't FAIL" note just mentioned.
> 
> Thanks for the elucidation, I appreciate it.
> 
>> FWIW, I agree with Jeff in the previous thread that (compared to this)
>> it would be more robust to make optabs.c emit the compile barrier
>> unconditionally and just leave the target to emit a machine barrier.
> 
> OK, I've changed the patch accordingly.  I've also factored out the
> is_mm_relaxed check in this iteration to simplify resulting code.
> 
> 	PR target/80640
> 	* doc/md.texi (mem_thread_fence): Remove mention of mode.  Rewrite.
>         * optabs.c (expand_mem_thread_fence): Emit a compiler barrier when
>         using targetm.gen_mem_thread_fence.
> testsuite/
> 	* gcc.dg/atomic/pr80640.c: New testcase.
THanks for your patience.  OK for the trunk.

jeff
diff mbox

Patch

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index ea959208c98..64a137ecad0 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -7044,14 +7044,20 @@  If these patterns are not defined, attempts will be made to use
 counterparts.  If none of these are available a compare-and-swap
 loop will be used.
 
-@cindex @code{mem_thread_fence@var{mode}} instruction pattern
-@item @samp{mem_thread_fence@var{mode}}
+@cindex @code{mem_thread_fence} instruction pattern
+@item @samp{mem_thread_fence}
 This pattern emits code required to implement a thread fence with
 memory model semantics.  Operand 0 is the memory model to be used.
 
-If this pattern is not specified, all memory models except
-@code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize}
-barrier pattern.
+For the @code{__ATOMIC_RELAXED} model no instructions need to be issued
+and this expansion is not invoked.
+
+The compiler always emits a compiler memory barrier regardless of what
+expanding this pattern produced.
+
+If this pattern is not defined, the compiler falls back to expanding the
+@code{memory_barrier} pattern, then to emitting @code{__sync_synchronize}
+library call, and finally to just placing a compiler memory barrier.
 
 @cindex @code{mem_signal_fence@var{mode}} instruction pattern
 @item @samp{mem_signal_fence@var{mode}}
diff --git a/gcc/optabs.c b/gcc/optabs.c
index a9900657a58..71b74dd5feb 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6297,17 +6297,19 @@  expand_asm_memory_barrier (void)
 void
 expand_mem_thread_fence (enum memmodel model)
 {
+  if (is_mm_relaxed (model))
+    return;
   if (targetm.have_mem_thread_fence ())
-    emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
-  else if (!is_mm_relaxed (model))
     {
-      if (targetm.have_memory_barrier ())
-	emit_insn (targetm.gen_memory_barrier ());
-      else if (synchronize_libfunc != NULL_RTX)
-	emit_library_call (synchronize_libfunc, LCT_NORMAL, VOIDmode, 0);
-      else
-	expand_asm_memory_barrier ();
+      emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
+      expand_asm_memory_barrier ();
     }
+  else if (targetm.have_memory_barrier ())
+    emit_insn (targetm.gen_memory_barrier ());
+  else if (synchronize_libfunc != NULL_RTX)
+    emit_library_call (synchronize_libfunc, LCT_NORMAL, VOIDmode, 0);
+  else
+    expand_asm_memory_barrier ();
 }
 
 /* This routine will either emit the mem_signal_fence pattern or issue a 
diff --git a/gcc/testsuite/gcc.dg/atomic/pr80640.c b/gcc/testsuite/gcc.dg/atomic/pr80640.c
new file mode 100644
index 00000000000..fd17978a482
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/pr80640.c
@@ -0,0 +1,34 @@ 
+/* { dg-do run } */
+/* { dg-options "-pthread" } */
+/* { dg-require-effective-target pthread } */
+
+#include <pthread.h>
+
+static volatile int sem1;
+static volatile int sem2;
+
+static void *f(void *va)
+{
+  void **p = va;
+  if (*p) return *p;
+  sem1 = 1;
+  while (!sem2);
+  __atomic_thread_fence(__ATOMIC_ACQUIRE);
+  // GCC used to RTL-CSE this and the first load, causing 0 to be returned
+  return *p;
+}
+
+int main()
+{
+  void *p = 0;
+  pthread_t thr;
+  if (pthread_create(&thr, 0, f, &p))
+    return 2;
+  while (!sem1);
+  __atomic_thread_fence(__ATOMIC_ACQUIRE);
+  p = &p;
+  __atomic_thread_fence(__ATOMIC_RELEASE);
+  sem2 = 1;
+  pthread_join(thr, &p);
+  return !p;
+}