[gomp4] Preserve NVPTX "reconvergence" points
diff mbox

Message ID 20150603124527.48eb7d6f@octopus
State New
Headers show

Commit Message

Julian Brown June 3, 2015, 11:45 a.m. UTC
On Thu, 28 May 2015 16:37:04 +0200
Richard Biener <richard.guenther@gmail.com> wrote:

> On Thu, May 28, 2015 at 4:06 PM, Julian Brown
> <julian@codesourcery.com> wrote:
> > For NVPTX, it is vitally important that the divergence of threads
> > within a warp can be controlled: in particular we must be able to
> > generate code that we know "reconverges" at a particular point.
> > Unfortunately GCC's middle-end optimisers can cause this property to
> > be violated, which causes problems for the OpenACC execution model
> > we're planning to use for NVPTX.
> 
> Hmm, I don't think adding a new edge flag is good nor necessary.  It
> seems to me that instead the broadcast operation should have abnormal
> control flow and thus basic-blocks should be split either before or
> after it (so either incoming or outgoing edge(s) should be
> abnormal).  I suppose splitting before the broadcast would be best
> (thus handle it similar to setjmp ()).

Here's a version of the patch that uses abnormal edges with semantics
unchanged, splitting the "false"/non-execution edge using a dummy block
to avoid the prohibited case of both EDGE_TRUE/EDGE_FALSE and
EDGE_ABNORMAL on the outgoing edges of a GIMPLE_COND.

So for a fragment like this:

  if (threadIdx.x == 0) /* cond_bb */
  {
    /* work */
    p0 = ...; /* assign */
  }
  pN = broadcast(p0);
  if (pN) goto T; else goto F;

Incoming edges to a broadcast operation have EDGE_ABNORMAL set:

  +--------+
  |cond_bb |--------,
  +--------+        |
      | (true edge) | (false edge)
      v             v
  +--------+     +-------+
  | (work) |     | dummy |
  +--------+     +-------+
  | assign |        |
  +--------+        |
ABNORM|             |ABNORM
      v             |
  +--------+<-------'
  |  bcast |
  +--------+
  |  cond  |
  +--------+
   /     \
  T       F

The abnormal edges actually serve two purposes, I think: as well as
ensuring the broadcast operation takes place when a warp is
non-diverged/coherent, they ensure that p0 is not seen as uninitialised
along the "false" path from cond_bb, possibly leading to the broadcast
operation being optimised away as partially redundant. This feels
somewhat fragile though! We'll have to continue to think about
warp divergence in subsequent patches.

The patch passes libgomp testing (with Bernd's recent worker-single
patch also). OK for gomp4 branch (together with the
previously-mentioned inline thread builtin patch)?

Thanks,

Julian

ChangeLog

    gcc/
    * omp-low.c (make_predication_test): Split false block out of
    cond_bb, making latter edge abnormal.
    (predicate_bb): Set EDGE_ABNORMAL on edges before broadcast
    operations.

Patch
diff mbox

commit 38056ae4a29f93ce54715dfad843a233f3b0fd2a
Author: Julian Brown <julian@codesourcery.com>
Date:   Mon Jun 1 11:12:41 2015 -0700

    Use abnormal edges before broadcast ops

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 7048f9f..310eb72 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -10555,7 +10555,16 @@  make_predication_test (edge true_edge, basic_block skip_dest_bb, int mask)
   gsi_insert_after (&tmp_gsi, cond_stmt, GSI_NEW_STMT);
 
   true_edge->flags = EDGE_TRUE_VALUE;
-  make_edge (cond_bb, skip_dest_bb, EDGE_FALSE_VALUE);
+
+  /* Force an abnormal edge before a broadcast operation that might be present
+     in SKIP_DEST_BB.  This is only done for the non-execution edge (with
+     respect to the predication done by this function) -- the opposite
+     (execution) edge that reaches the broadcast operation must be made
+     abnormal also, e.g. in this function's caller.  */
+  edge e = make_edge (cond_bb, skip_dest_bb, EDGE_FALSE_VALUE);
+  basic_block false_abnorm_bb = split_edge (e);
+  edge abnorm_edge = single_succ_edge (false_abnorm_bb);
+  abnorm_edge->flags |= EDGE_ABNORMAL;
 }
 
 /* Apply OpenACC predication to basic block BB which is in
@@ -10605,6 +10614,7 @@  predicate_bb (basic_block bb, struct omp_region *parent, int mask)
 						   mask);
 
       edge e = split_block (bb, splitpoint);
+      e->flags = EDGE_ABNORMAL;
       skip_dest_bb = e->dest;
 
       gimple_cond_set_condition (as_a <gcond *> (stmt), EQ_EXPR,
@@ -10624,6 +10634,7 @@  predicate_bb (basic_block bb, struct omp_region *parent, int mask)
 						   gsi_asgn, mask);
 
       edge e = split_block (bb, splitpoint);
+      e->flags = EDGE_ABNORMAL;
       skip_dest_bb = e->dest;
 
       gimple_switch_set_index (sstmt, new_var);