diff mbox

[1/3] optabs: ensure mem_thread_fence is a compiler barrier

Message ID 20170802174548.12344-1-amonakov@ispras.ru
State New
Headers show

Commit Message

Alexander Monakov Aug. 2, 2017, 5:45 p.m. UTC
As recently discussed in the previous thread for PR 80640, some targets have
sufficiently strong hardware memory ordering that implementation of C11 atomic
fences might not need machine barriers.  However, a compiler memory barrier
nevertheless must be present, and at least two targets (x86, s390) got this
wrong in their .md expanders.

Handle it in the middle-end by detecting empty expansion and emitting a
compiler memory barrier.  If the backend produced non-empty RTL, expect that
it's using the same RTX structure as in "memory_barrier" (__sync_synchronize)
expansion, which must be a compiler memory on its own.

A followup refactor is possible in expand_mem_thread_fence to make it start with

  if (is_mm_relaxed (model))
    return;

as it's not useful to pass __ATOMIC_RELAXED model to gen_mem_thread_fence.

	PR target/80640
	* doc/md.texi (mem_thread_fence): Remove mention of mode.  Clarify that
        empty expansion is handled by placing a compiler barrier.
        * optabs.c (expand_mem_thread_fence): Place a compiler barrier if
        targetm.gen_mem_thread_fence produced no instructions.
testsuite/
	* gcc.dg/atomic/pr80640.c: New testcase.

---
 gcc/doc/md.texi                       |  9 +++++++--
 gcc/optabs.c                          |  7 ++++++-
 gcc/testsuite/gcc.dg/atomic/pr80640.c | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/atomic/pr80640.c

Comments

Richard Sandiford Aug. 5, 2017, 9:03 a.m. UTC | #1
Alexander Monakov <amonakov@ispras.ru> writes:
> diff --git a/gcc/optabs.c b/gcc/optabs.c
> index a9900657a58..d568ca376fe 100644
> --- a/gcc/optabs.c
> +++ b/gcc/optabs.c
> @@ -6298,7 +6298,12 @@ void
>  expand_mem_thread_fence (enum memmodel model)
>  {
>    if (targetm.have_mem_thread_fence ())
> -    emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
> +    {
> +      rtx_insn *last = get_last_insn ();
> +      emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
> +      if (last == get_last_insn () && !is_mm_relaxed (model))
> +	expand_asm_memory_barrier ();
> +    }
>    else if (!is_mm_relaxed (model))
>      {
>        if (targetm.have_memory_barrier ())

It would be simpler to test whether targetm.gen_mem_thread_fence
returns NULL.

This feels a bit hacky though.  Checking whether a generator produced no
instructions is usually the test for whether the generator FAILed, which
should normally be treated as though the pattern wasn't defined to start
with.  This is useful if the FAIL condition is too complex to put in the
define_expand/insn C condition.

In those contexts, if a generator wants to succeed and generate no
instructions, it should emit a note instead.  (Yes, it would be good
to have a nicer interface...)

So IMO it's confusing to overload the empty sequence to mean something
else in this context, and it wouldn't work if a target does emit the kind
of "I didn't FAIL" note just mentioned.

FWIW, I agree with Jeff in the previous thread that (compared to this)
it would be more robust to make optabs.c emit the compile barrier
unconditionally and just leave the target to emit a machine barrier.

Your earlier patch to make targets emit at least a compile barrier
seemed good too.  Perhaps that way we can treat an empty sequence as a
FAIL in the normal way, or assert that the pattern doesn't FAIL.

Thanks,
Richard
diff mbox

Patch

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index ea959208c98..0be161a08dc 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -7044,11 +7044,16 @@  If these patterns are not defined, attempts will be made to use
 counterparts.  If none of these are available a compare-and-swap
 loop will be used.
 
-@cindex @code{mem_thread_fence@var{mode}} instruction pattern
-@item @samp{mem_thread_fence@var{mode}}
+@cindex @code{mem_thread_fence} instruction pattern
+@item @samp{mem_thread_fence}
 This pattern emits code required to implement a thread fence with
 memory model semantics.  Operand 0 is the memory model to be used.
 
+Expanding this pattern may produce no instructions if no machine barrier
+is required on the target for given memory model.  In that case, unless
+memory model is @code{__ATOMIC_RELAXED}, a compiler memory barrier is
+emitted automatically.
+
 If this pattern is not specified, all memory models except
 @code{__ATOMIC_RELAXED} will result in issuing a @code{sync_synchronize}
 barrier pattern.
diff --git a/gcc/optabs.c b/gcc/optabs.c
index a9900657a58..d568ca376fe 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6298,7 +6298,12 @@  void
 expand_mem_thread_fence (enum memmodel model)
 {
   if (targetm.have_mem_thread_fence ())
-    emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
+    {
+      rtx_insn *last = get_last_insn ();
+      emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
+      if (last == get_last_insn () && !is_mm_relaxed (model))
+	expand_asm_memory_barrier ();
+    }
   else if (!is_mm_relaxed (model))
     {
       if (targetm.have_memory_barrier ())
diff --git a/gcc/testsuite/gcc.dg/atomic/pr80640.c b/gcc/testsuite/gcc.dg/atomic/pr80640.c
new file mode 100644
index 00000000000..fd17978a482
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/pr80640.c
@@ -0,0 +1,34 @@ 
+/* { dg-do run } */
+/* { dg-options "-pthread" } */
+/* { dg-require-effective-target pthread } */
+
+#include <pthread.h>
+
+static volatile int sem1;
+static volatile int sem2;
+
+static void *f(void *va)
+{
+  void **p = va;
+  if (*p) return *p;
+  sem1 = 1;
+  while (!sem2);
+  __atomic_thread_fence(__ATOMIC_ACQUIRE);
+  // GCC used to RTL-CSE this and the first load, causing 0 to be returned
+  return *p;
+}
+
+int main()
+{
+  void *p = 0;
+  pthread_t thr;
+  if (pthread_create(&thr, 0, f, &p))
+    return 2;
+  while (!sem1);
+  __atomic_thread_fence(__ATOMIC_ACQUIRE);
+  p = &p;
+  __atomic_thread_fence(__ATOMIC_RELEASE);
+  sem2 = 1;
+  pthread_join(thr, &p);
+  return !p;
+}