Patchwork 0005-Search-all-dominated-blocks-for-expressions-to-hoist.patch

login
register
mail settings
Submitter Maxim Kuvyrkov
Date July 9, 2010, 8:17 p.m.
Message ID <4C3783E3.4030501@codesourcery.com>
Download mbox | patch
Permalink /patch/58431/
State New
Headers show

Comments

Maxim Kuvyrkov - July 9, 2010, 8:17 p.m.
On 7/7/10 8:40 PM, Jeff Law wrote:
...
> Since there's no edge insertions with hoisting, the only potential
> problem I can see is when the hoisted expression itself can trigger
> traversal of an abnormal edge and the block we want to put the
> expression does not have an edge to the handler. In that case we'd need
> to add edges in the cfg and I don't see any compensation code in gcse.c
> to deal with that case.

I agree.

>
> Assuming that's the situation we need to avoid then we really need to
> switch the pre-like code since it detects expressions which can cause
> traversal of the abnormal edge much better. It's a fairly simple patch
> since it just prunes some expressions from the local tables before
> running the dataflow solver.

The first of the attached patches replaces transpout with an additional 
check in determining if an expression is anticipatable.

The second patch implements LCA approach to avoid hoisting expression 
too far up.  As a side effect of implementation, it somewhat simplifies 
control flow of hoist_code.

I really hope this is the last iteration on the one-line change this 
problem initially was :).

Thanks,
Jeff Law - July 14, 2010, 8:58 p.m.
07/09/10 14:17, Maxim Kuvyrkov wrote:
>
> The first of the attached patches replaces transpout with an 
> additional check in determining if an expression is anticipatable.
First, don't forget -p when generating your diffs.  -p includes the 
function's name in the diff output which helps the reviewer understand 
where a hunk was changed. Many reviewers prefer unidiffs (-u) as well.

Anyway, onward to the patches...


I'm not sure why you didn't just factor out the code from 
compute_pre_data and use it for both pre and hoisting.


Pull this fragment from compute_pre_data into its own function and call 
it from compute_pre_data and compute_hoist_data:

   /* Collect expressions which might trap.  */
   trapping_expr = sbitmap_alloc (expr_hash_table.n_elems);
   sbitmap_zero (trapping_expr);
   for (ui = 0; ui < expr_hash_table.size; ui++)
     {
       struct expr *e;
       for (e = expr_hash_table.table[ui]; e != NULL; e = e->next_same_hash)
         if (may_trap_p (e->expr))
           SET_BIT (trapping_expr, e->bitmap_index);
     }

Which gives you a bitmap of potentially trapping expressions.

Then pull this fragment from compute_pre_data into a function:


       FOR_EACH_EDGE (e, ei, bb->preds)
         if (e->flags & EDGE_ABNORMAL)
           {
             sbitmap_difference (antloc[bb->index], antloc[bb->index], 
trapping_expr);
             sbitmap_difference (transp[bb->index], transp[bb->index], 
trapping_expr);
             break;
           }

  compute_pre_data would then look like:

  compute_local_properties ( ... )
  sbitmap_vector_zero ( ... )
  compute_potentially_trapping_expressions ( ...)

  FOR_EACH_BB (bb)
     {
       /* If the current block is the destination of an abnormal edge, we
          kill all trapping expressions because we won't be able to properly
          place the instruction on the edge.  So make them neither
          anticipatable nor transparent.  This is fairly conservative.  */
       prune_antloc & transp (bb, antloc, transp, trapping_expr)

       sbitmap_a_or_b (ae_kill[bb->index], transp[bb->index], 
comp[bb->index]);
       sbitmap_not (ae_kill[bb->index], ae_kill[bb->index]);
     }

And compute_code_hoist_data would look like:

   compute_local_properties ( ... )
   compute_potentially_trapping_expressions ( ... )

   FOR_EACH_BB (bb)
     prune_antloc & transp (bb, antloc, transp, trapping_expr);

Or something very close to that.


That avoids having two hunks of code that are supposed to do the same 
thing, but due to implementation differences actually behave differently.



>
> The second patch implements LCA approach to avoid hoisting expression 
> too far up.  As a side effect of implementation, it somewhat 
> simplifies control flow of hoist_code.
This looks pretty good.  I still see a reference to compute_transpout, 
which presumably will go away once we settle on the final form of the 
first patch.  This patch is OK.

>
> I really hope this is the last iteration on the one-line change this 
> problem initially was :).
:-)  It happens like this sometimes.

Jeff
Maxim Kuvyrkov - July 14, 2010, 9:42 p.m.
On 7/15/10 12:58 AM, Jeff Law wrote:
> /* Collect expressions which might trap. */
> trapping_expr = sbitmap_alloc (expr_hash_table.n_elems);
> sbitmap_zero (trapping_expr);
> for (ui = 0; ui < expr_hash_table.size; ui++)
> {
> struct expr *e;
> for (e = expr_hash_table.table[ui]; e != NULL; e = e->next_same_hash)
> if (may_trap_p (e->expr))
> SET_BIT (trapping_expr, e->bitmap_index);
> }
...
> FOR_EACH_EDGE (e, ei, bb->preds)
> if (e->flags & EDGE_ABNORMAL)
> {
> sbitmap_difference (antloc[bb->index], antloc[bb->index], trapping_expr);
> sbitmap_difference (transp[bb->index], transp[bb->index], trapping_expr);
> break;
> }
>
...
> compute_local_properties ( ... )
> compute_potentially_trapping_expressions ( ... )
>
> FOR_EACH_BB (bb)
> prune_antloc & transp (bb, antloc, transp, trapping_expr);

This will have the effect of disabling hoisting for all potentially 
trapping expressions, even if they don't result in abnormal control 
flow.  E.g., a memory reference will not be eligible for hoisting in 
under these conditions.

[IIUC, it is OK to hoist a potentially trapping expression as long as it 
will trap on every path and provided non-call exceptions are disabled.]

> That avoids having two hunks of code that are supposed to do the same
> thing, but due to implementation differences actually behave differently.

PRE code for handling trapping expressions removes quite a larger set of 
expression from optimization space, and that seems like the right thing 
to do for PRE.  Hoisting, on the other hand can relax condtions on 
trapping expressions by compensating with the VBE requirement.  It seems 
to me that we shouldn't disable hoisting of all potentially trapping 
expressions, but only of those that have an abnormal edge sticking out 
of them.


>> The second patch implements LCA approach to avoid hoisting expression
>> too far up. As a side effect of implementation, it somewhat simplifies
>> control flow of hoist_code.
> This looks pretty good. I still see a reference to compute_transpout,
> which presumably will go away once we settle on the final form of the
> first patch. This patch is OK.

Thanks!
Jeff Law - July 15, 2010, 4:06 p.m.
On 07/14/10 15:42, Maxim Kuvyrkov wrote:
> On 7/15/10 12:58 AM, Jeff Law wrote:
>> /* Collect expressions which might trap. */
>> trapping_expr = sbitmap_alloc (expr_hash_table.n_elems);
>> sbitmap_zero (trapping_expr);
>> for (ui = 0; ui < expr_hash_table.size; ui++)
>> {
>> struct expr *e;
>> for (e = expr_hash_table.table[ui]; e != NULL; e = e->next_same_hash)
>> if (may_trap_p (e->expr))
>> SET_BIT (trapping_expr, e->bitmap_index);
>> }
> ...
>> FOR_EACH_EDGE (e, ei, bb->preds)
>> if (e->flags & EDGE_ABNORMAL)
>> {
>> sbitmap_difference (antloc[bb->index], antloc[bb->index], 
>> trapping_expr);
>> sbitmap_difference (transp[bb->index], transp[bb->index], 
>> trapping_expr);
>> break;
>> }
>>
> ...
>> compute_local_properties ( ... )
>> compute_potentially_trapping_expressions ( ... )
>>
>> FOR_EACH_BB (bb)
>> prune_antloc & transp (bb, antloc, transp, trapping_expr);
>
> This will have the effect of disabling hoisting for all potentially 
> trapping expressions, even if they don't result in abnormal control 
> flow.  E.g., a memory reference will not be eligible for hoisting in 
> under these conditions.
?  See the e->flags & EDGE_ABNORMAL test prior to removing elements of 
trapping_expr from antloc or transp.

> PRE code for handling trapping expressions removes quite a larger set 
> of expression from optimization space, and that seems like the right 
> thing to do for PRE.  Hoisting, on the other hand can relax condtions 
> on trapping expressions by compensating with the VBE requirement.  It 
> seems to me that we shouldn't disable hoisting of all potentially 
> trapping expressions, but only of those that have an abnormal edge 
> sticking out of them.
Partially correct.  If there aren't abnormal edges in the source blocks, 
then hoisting can relax based on the VBE requirement.  If there are 
abnormal edges, then hoisting can not relax because nothing adds the 
abnormal edges to the destination blocks.


Jeff

Patch

diff --git a/gcc/gcse.c b/gcc/gcse.c
index 4f6dc83..9479304 100644
--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -470,7 +470,6 @@  static void mark_oprs_set (rtx);
 static void alloc_cprop_mem (int, int);
 static void free_cprop_mem (void);
 static void compute_transp (const_rtx, int, sbitmap *, int);
-static void compute_transpout (void);
 static void compute_local_properties (sbitmap *, sbitmap *, sbitmap *,
 				      struct hash_table_d *);
 static void compute_cprop_data (void);
@@ -1357,6 +1356,27 @@  gcse_constant_p (const_rtx x)
   return CONSTANT_P (x) && (GET_CODE (x) != CONST || shared_const_p (x));
 }
 
+/* Return true if INSN can cause abnormal control flow.  */
+
+static bool
+causes_abnormal_control_flow_p (const_rtx insn)
+{
+  basic_block bb;
+  edge e;
+  edge_iterator ei;
+
+  bb = BLOCK_FOR_INSN (insn);
+
+  if (BB_END (bb) != insn)
+    return false;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    if (e->flags & EDGE_ABNORMAL)
+      return true;
+
+  return false;
+}
+
 /* Scan pattern PAT of INSN and add an entry to the hash TABLE (set or
    expression one).  */
 
@@ -1424,11 +1444,13 @@  hash_scan_set (rtx pat, rtx insn, struct hash_table_d *table)
 	{
 	  /* An expression is not anticipatable if its operands are
 	     modified before this insn or if this is not the only SET in
-	     this insn.  The latter condition does not have to mean that
+	     this insn.  The latter conditions do not have to mean that
 	     SRC itself is not anticipatable, but we just will not be
-	     able to handle code motion of insns with multiple sets.  */
-	  int antic_p = oprs_anticipatable_p (src, insn)
-			&& !multiple_sets (insn);
+	     able to handle code motion of insns with multiple sets or
+	     abnormal control flow.  */
+	  int antic_p = (oprs_anticipatable_p (src, insn)
+			 && !multiple_sets (insn)
+			 && !causes_abnormal_control_flow_p (insn));
 	  /* An expression is not available if its operands are
 	     subsequently modified, including this insn.  It's also not
 	     available if this is a branch, because we can't insert
@@ -3237,11 +3259,6 @@  bypass_conditional_jumps (void)
 /* Nonzero for expressions that are transparent in the block.  */
 static sbitmap *transp;
 
-/* Nonzero for expressions that are transparent at the end of the block.
-   This is only zero for expressions killed by abnormal critical edge
-   created by a calls.  */
-static sbitmap *transpout;
-
 /* Nonzero for expressions that are computed (available) in the block.  */
 static sbitmap *comp;
 
@@ -4097,52 +4114,6 @@  add_label_notes (rtx x, rtx insn)
     }
 }
 
-/* Compute transparent outgoing information for each block.
-
-   An expression is transparent to an edge unless it is killed by
-   the edge itself.  This can only happen with abnormal control flow,
-   when the edge is traversed through a call.  This happens with
-   non-local labels and exceptions.
-
-   This would not be necessary if we split the edge.  While this is
-   normally impossible for abnormal critical edges, with some effort
-   it should be possible with exception handling, since we still have
-   control over which handler should be invoked.  But due to increased
-   EH table sizes, this may not be worthwhile.  */
-
-static void
-compute_transpout (void)
-{
-  basic_block bb;
-  unsigned int i;
-  struct expr *expr;
-
-  sbitmap_vector_ones (transpout, last_basic_block);
-
-  FOR_EACH_BB (bb)
-    {
-      /* Note that flow inserted a nop at the end of basic blocks that
-	 end in call instructions for reasons other than abnormal
-	 control flow.  */
-      if (! CALL_P (BB_END (bb)))
-	continue;
-
-      for (i = 0; i < expr_hash_table.size; i++)
-	for (expr = expr_hash_table.table[i]; expr ; expr = expr->next_same_hash)
-	  if (MEM_P (expr->expr))
-	    {
-	      if (GET_CODE (XEXP (expr->expr, 0)) == SYMBOL_REF
-		  && CONSTANT_POOL_ADDRESS_P (XEXP (expr->expr, 0)))
-		continue;
-
-	      /* ??? Optimally, we would use interprocedural alias
-		 analysis to determine if this mem is actually killed
-		 by this call.  */
-	      RESET_BIT (transpout[bb->index], expr->bitmap_index);
-	    }
-    }
-}
-
 /* Code Hoisting variables and subroutines.  */
 
 /* Very busy expressions.  */
@@ -4171,7 +4142,6 @@  alloc_code_hoist_mem (int n_blocks, int n_exprs)
   hoist_vbein = sbitmap_vector_alloc (n_blocks, n_exprs);
   hoist_vbeout = sbitmap_vector_alloc (n_blocks, n_exprs);
   hoist_exprs = sbitmap_vector_alloc (n_blocks, n_exprs);
-  transpout = sbitmap_vector_alloc (n_blocks, n_exprs);
 }
 
 /* Free vars used for code hoisting analysis.  */
@@ -4186,7 +4156,6 @@  free_code_hoist_mem (void)
   sbitmap_vector_free (hoist_vbein);
   sbitmap_vector_free (hoist_vbeout);
   sbitmap_vector_free (hoist_exprs);
-  sbitmap_vector_free (transpout);
 
   free_dominance_info (CDI_DOMINATORS);
 }
@@ -4246,7 +4215,6 @@  static void
 compute_code_hoist_data (void)
 {
   compute_local_properties (transp, comp, antloc, &expr_hash_table);
-  compute_transpout ();
   compute_code_hoist_vbeinout ();
   calculate_dominance_info (CDI_DOMINATORS);
   if (dump_file)