
Sink clobbers if EH block contains just clobbers (PR tree-optimization/51117)

Message ID Pine.LNX.4.64.1112121532141.26507@wotan.suse.de

Commit Message

Michael Matz Dec. 12, 2011, 2:46 p.m. UTC
Hello,

On Fri, 9 Dec 2011, Jakub Jelinek wrote:

> I had to tweak the expander conflict checking a little bit, because
> if we have a BB with two incoming EH edges and clobber stmts from both
> sunk into its beginning, then it would consider both variables (a and b
> above) to be live at the same time, while there is no insn at which they
> can actually be live at the same time: the PHIs don't mention either of
> them (and after all, PHIs aren't memory loads), and immediately after
> the PHIs come the clobbers.

The idea is sound; the implementation can be tidied with the observation 
that only the first real instruction (instead of the BB start) is the 
point at which all currently live things need to be conflicted, as in 
the patch below (only the cfgexpand.c part changed).  I.e. moving the 
existing code from add_scope_conflicts_1 a bit is enough.  I'm putting 
this through regstrapping on x86_64-linux and will commit if that 
succeeds, given rth's approval for the other parts.

I wonder how to best test this.


Ciao,
Michael.
	PR tree-optimization/51117
	* tree-eh.c (sink_clobbers): New function.
	(execute_lower_eh_dispatch): Call it for BBs ending with
	internally throwing RESX.
	* cfgexpand.c (add_scope_conflicts_1): Add all conflicts only
	at the first real instruction.
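For illustration, the sinking looks roughly like this at the GIMPLE level (a hand-written sketch, not an actual dump; block numbers are made up):

```
;; before: the EH block contains only clobbers before the
;; internally throwing RESX
<bb 5>:
  a ={v} {CLOBBER};
  b ={v} {CLOBBER};
  resx 1

;; after sink_clobbers: the clobbers have moved into the single
;; EH successor, right after its labels
<bb 5>:
  resx 1

<bb 7>:
  b ={v} {CLOBBER};
  a ={v} {CLOBBER};
  ...
```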

Comments

Jakub Jelinek Dec. 12, 2011, 4:18 p.m. UTC | #1
On Mon, Dec 12, 2011 at 03:46:01PM +0100, Michael Matz wrote:
> > I had to tweak the expander conflict checking a little bit, because
> > if we have a BB with two incoming EH edges and clobber stmts from both
> > sunk into its beginning, then it would consider both variables (a and b
> > above) to be live at the same time, while there is no insn at which they
> > can actually be live at the same time: the PHIs don't mention either of
> > them (and after all, PHIs aren't memory loads), and immediately after
> > the PHIs come the clobbers.
> 
> The idea is sound; the implementation can be tidied with the observation 
> that only the first real instruction (instead of the BB start) is the 
> point at which all currently live things need to be conflicted, as in 
> the patch below (only the cfgexpand.c part changed).  I.e. moving the 
> existing code from add_scope_conflicts_1 a bit is enough.  I'm putting 
> this through regstrapping on x86_64-linux and will commit if that 
> succeeds, given rth's approval for the other parts.

Looks cleaner, yes.
I just wonder:
1) what if a bb contains no real insns (I know, they should be optimized
   out, but do we assert that etc.?) - then the EXECUTE_IF_SET_IN_BITMAP
   loop just wouldn't be run.  Perhaps that is fine: the set would make it
   into the bitmap at the end of the bb, and perhaps a following bb would
   do this loop.
2) the PHIs are then handled always with visit_op instead of visit_conflict,
   I'd guess the needed add_stack_var_conflict calls would then happen
   in that EXECUTE_IF_SET_IN_BITMAP loop, right?

> I wonder how to best test this.

One kind of testing was watching the size of .gcc_except_table going down
with each patch (vanilla -> optimize_clobbers -> optimize_clobbers thinko
fix -> sink_clobbers).

The following test unfortunately succeeds already with current trunk
(i.e. it already tests the optimize_clobbers optimization).  With
throw; instead of return 1; it contains a __cxa_rethrow call even
with the sink optimization; the difference is mainly that
.gcc_except_table is smaller and the __cxa_rethrow block is reachable just
from one spot (recorded in .gcc_except_table), while without sink_clobbers
there is also a jump to it from another spot.  For the
s/return 1/throw/ testcase we could perhaps
scan-tree-dump-not the optimized dump for __builtin_eh_copy_values.

// { dg-do compile }
// { dg-options "-O2 -fexceptions" }

struct A { char buf[64]; };
void bar (A *);

int
foo ()
{
  A c;
  bar (&c);
  try
  {
    {
      A a;
      bar (&a);
      if (a.buf[13])
	throw 1;
      else if (a.buf[52])
	throw 3;
    }
    {
      A b;
      bar (&b);
      if (b.buf[13])
	throw 2;
    }
  }
  catch (...)
  {
    return 1;
  }
  return 0;
}

// { dg-final { scan-assembler-not "__cxa_rethrow" } }


	Jakub
Michael Matz Dec. 12, 2011, 4:29 p.m. UTC | #2
Hi,

On Mon, 12 Dec 2011, Jakub Jelinek wrote:

> Looks cleaner, yes.
> I just wonder:
> 1) what if a bb contains no real insns (I know, they should be optimized
>    out, but do we assert that etc.?) - then the EXECUTE_IF_SET_IN_BITMAP
>    loop just wouldn't be run.  Perhaps that is fine: the set would make it
>    into the bitmap at the end of the bb, and perhaps a following bb would
>    do this loop.

Not only perhaps.  That is exactly what will happen.  If one of the 
successor BBs then has real instructions, _that_ one will cause creation 
of all the necessary conflicts.

> 2) the PHIs are then handled always with visit_op instead of visit_conflict,
>    I'd guess the needed add_stack_var_conflict calls would then happen
>    in that EXECUTE_IF_SET_IN_BITMAP loop, right?

Correct.  The PHIs don't need to create the conflicts; any new mention of 
a DECL name will be noted as active and then creates a conflict at the 
next real instruction (if not cancelled by a clobber before).

> > I wonder how to best test this.
> 
> One kind of testing was watching the size of .gcc_except_table going down
> with each patch (vanilla -> optimize_clobbers -> optimize_clobbers thinko
> fix -> sink_clobbers).

My idea was to somehow check the EH tree in some dump (.ehcleanup2 
perhaps) for being in the expected form, or that the correct number of 
removals happens.  And of course that the sharing still happens, but 
that's even worse to test in the .expand dump :-/
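For the record, the scan-tree-dump check suggested earlier in the thread could be spelled roughly like this (a sketch only; the exact directive set and dump name are assumptions that would need checking against the testsuite):

```
// { dg-do compile }
// { dg-options "-O2 -fexceptions -fdump-tree-optimized" }
// ... the testcase above with s/return 1;/throw;/ ...
// { dg-final { scan-tree-dump-not "__builtin_eh_copy_values" "optimized" } }
```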


Ciao,
Michael.
Jakub Jelinek Dec. 12, 2011, 4:40 p.m. UTC | #3
On Mon, Dec 12, 2011 at 05:29:09PM +0100, Michael Matz wrote:
> On Mon, 12 Dec 2011, Jakub Jelinek wrote:
> > I just wonder:
> > 1) what if a bb contains no real insns (I know, they should be optimized
> >    out, but do we assert that etc.?) - then the EXECUTE_IF_SET_IN_BITMAP
> >    loop just wouldn't be run.  Perhaps that is fine: the set would make it
> >    into the bitmap at the end of the bb, and perhaps a following bb would
> >    do this loop.
> 
> Not only perhaps.  That is exactly what will happen.  If one of the 
> successor BBs then has real instructions, _that_ one will cause creation 
> of all the necessary conflicts.

Ok.

> > 2) the PHIs are then handled always with visit_op instead of visit_conflict,
> >    I'd guess the needed add_stack_var_conflict calls would then happen
> >    in that EXECUTE_IF_SET_IN_BITMAP loop, right?
> 
> Correct.  The PHIs don't need to create the conflicts; any new mention of 
> a DECL name will be noted as active and then creates a conflict at the 
> next real instruction (if not cancelled by a clobber before).

Ok.  So, I'm happy with your changes and rth already acked the tree-eh.c
side, so can we just get an ack on these cfgexpand.c changes?  Thanks.

The testcases can perhaps follow when they are ready (and I plan to submit
at least the one checking for no __cxa_rethrow in assembly).

	Jakub
Michael Matz Dec. 12, 2011, 5:03 p.m. UTC | #4
Hi,

On Mon, 12 Dec 2011, Jakub Jelinek wrote:

> Ok.  So, I'm happy with your changes and rth already acked the tree-eh.c 
> side, so can we just get an ack on these cfgexpand.c changes?  Thanks.

Hmpf, I would have simply committed without a re-approval, but if you 
think it's necessary I'll wait.

FYI, I've actually regstrapped with the patch iterating from i+1 
in the nested EXECUTE_IF_SET_IN_BITMAP to ignore the diagonal.


Ciao,
Michael.
Richard Henderson Dec. 12, 2011, 5:13 p.m. UTC | #5
On 12/12/2011 09:03 AM, Michael Matz wrote:
> Hmpf, I would have simply committed without a re-approval, but if you 
> think it's necessary I'll wait.

The revised patch is ok too.


r~

Patch

Index: cfgexpand.c
===================================================================
--- cfgexpand.c	(revision 182241)
+++ cfgexpand.c	(working copy)
@@ -456,34 +456,14 @@  add_scope_conflicts_1 (basic_block bb, b
   FOR_EACH_EDGE (e, ei, bb->preds)
     bitmap_ior_into (work, (bitmap)e->src->aux);
 
-  if (for_conflict)
-    {
-      /* We need to add conflicts for everything life at the start of
-         this block.  Unlike classical lifeness for named objects we can't
-	 rely on seeing a def/use of the names we're interested in.
-	 There might merely be indirect loads/stores.  We'd not add any
-	 conflicts for such partitions.  */
-      bitmap_iterator bi;
-      unsigned i;
-      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
-	{
-	  unsigned j;
-	  bitmap_iterator bj;
-	  EXECUTE_IF_SET_IN_BITMAP (work, i, j, bj)
-	    add_stack_var_conflict (i, j);
-	}
-      visit = visit_conflict;
-    }
-  else
-    visit = visit_op;
+  visit = visit_op;
 
   for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
-      if (!is_gimple_debug (stmt))
-	walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
+      walk_stmt_load_store_addr_ops (stmt, work, NULL, NULL, visit);
     }
-  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
 
@@ -501,7 +481,29 @@  add_scope_conflicts_1 (basic_block bb, b
 	    bitmap_clear_bit (work, *v);
 	}
       else if (!is_gimple_debug (stmt))
-	walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
+	{
+	  if (for_conflict
+	      && visit == visit_op)
+	    {
+	      /* If this is the first real instruction in this BB we need
+	         to add conflicts for everything life at this point now.
+		 Unlike classical lifeness for named objects we can't
+		 rely on seeing a def/use of the names we're interested in.
+		 There might merely be indirect loads/stores.  We'd not add any
+		 conflicts for such partitions.  */
+	      bitmap_iterator bi;
+	      unsigned i;
+	      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
+		{
+		  unsigned j;
+		  bitmap_iterator bj;
+		  EXECUTE_IF_SET_IN_BITMAP (work, i, j, bj)
+		    add_stack_var_conflict (i, j);
+		}
+	      visit = visit_conflict;
+	    }
+	  walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
+	}
     }
 }
 
Index: tree-eh.c
===================================================================
--- tree-eh.c	(revision 182241)
+++ tree-eh.c	(working copy)
@@ -3194,6 +3194,76 @@  optimize_clobbers (basic_block bb)
     }
 }
 
+/* Try to sink var = {v} {CLOBBER} stmts followed just by
+   internal throw to successor BB.  */
+
+static int
+sink_clobbers (basic_block bb)
+{
+  edge e;
+  edge_iterator ei;
+  gimple_stmt_iterator gsi, dgsi;
+  basic_block succbb;
+  bool any_clobbers = false;
+
+  /* Only optimize if BB has a single EH successor and
+     all predecessor edges are EH too.  */
+  if (!single_succ_p (bb)
+      || (single_succ_edge (bb)->flags & EDGE_EH) == 0)
+    return 0;
+
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      if ((e->flags & EDGE_EH) == 0)
+	return 0;
+    }
+
+  /* And BB contains only CLOBBER stmts before the final
+     RESX.  */
+  gsi = gsi_last_bb (bb);
+  for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	continue;
+      if (gimple_code (stmt) == GIMPLE_LABEL)
+	break;
+      if (!gimple_clobber_p (stmt)
+	  || TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME)
+	return 0;
+      any_clobbers = true;
+    }
+  if (!any_clobbers)
+    return 0;
+
+  succbb = single_succ (bb);
+  dgsi = gsi_after_labels (succbb);
+  gsi = gsi_last_bb (bb);
+  for (gsi_prev (&gsi); !gsi_end_p (gsi); gsi_prev (&gsi))
+    {
+      gimple stmt = gsi_stmt (gsi);
+      tree vdef;
+      if (is_gimple_debug (stmt))
+	continue;
+      if (gimple_code (stmt) == GIMPLE_LABEL)
+	break;
+      unlink_stmt_vdef (stmt);
+      gsi_remove (&gsi, false);
+      vdef = gimple_vdef (stmt);
+      if (vdef && TREE_CODE (vdef) == SSA_NAME)
+	{
+	  vdef = SSA_NAME_VAR (vdef);
+	  mark_sym_for_renaming (vdef);
+	  gimple_set_vdef (stmt, vdef);
+	  gimple_set_vuse (stmt, vdef);
+	}
+      release_defs (stmt);
+      gsi_insert_before (&dgsi, stmt, GSI_SAME_STMT);
+    }
+
+  return TODO_update_ssa_only_virtuals;
+}
+
 /* At the end of inlining, we can lower EH_DISPATCH.  Return true when 
    we have found some duplicate labels and removed some edges.  */
 
@@ -3349,7 +3419,7 @@  static unsigned
 execute_lower_eh_dispatch (void)
 {
   basic_block bb;
-  bool any_rewritten = false;
+  int flags = 0;
   bool redirected = false;
 
   assign_filter_values ();
@@ -3362,16 +3432,20 @@  execute_lower_eh_dispatch (void)
       if (gimple_code (last) == GIMPLE_EH_DISPATCH)
 	{
 	  redirected |= lower_eh_dispatch (bb, last);
-	  any_rewritten = true;
+	  flags |= TODO_update_ssa_only_virtuals;
+	}
+      else if (gimple_code (last) == GIMPLE_RESX)
+	{
+	  if (stmt_can_throw_external (last))
+	    optimize_clobbers (bb);
+	  else
+	    flags |= sink_clobbers (bb);
 	}
-      else if (gimple_code (last) == GIMPLE_RESX
-	       && stmt_can_throw_external (last))
-	optimize_clobbers (bb);
     }
 
   if (redirected)
     delete_unreachable_blocks ();
-  return any_rewritten ? TODO_update_ssa_only_virtuals : 0;
+  return flags;
 }
 
 static bool