diff mbox

Merge switch statements in tree-cfgcleanup

Message ID 96c4b009-0b69-2f7a-1fad-d9ae4331b017@redhat.com
State New
Headers show

Commit Message

Bernd Schmidt July 18, 2016, 4:07 p.m. UTC
The motivating example for this patch was a change that was submitted 
for genattrtab last year, which would have made us generate

switch (type = get_attr_type (insn))
   {
    ... some cases ...
    default:
      switch (type = get_attr_type (insn)))
        {
        ... some other cases ...
        }
   }

The idea was to optimize this by merging the code into a single switch. 
My expectation was that this was most likely to occur in 
machine-generated code, but there are a few instances of this pattern in 
the gcc sources themselves. One case is

    code = gimple_code (stmt);
    switch (code)
      {
      ....
      default:
        if (is_gimple_omp (code))
          {
          }
      }

where is_gimple_omp expands into another switch. More cases exist in the 
compiler as shown by various bootstrap failures along the way; sometimes 
these are exposed after other optimizations. One is in the Ada runtime 
library somewhere, and another (which currently cannot be transformed by 
the patch) is in the Fortran frontend.

In the future we could also look for if statements making another 
comparison of the variable in the default branch, that would be a minor 
extension.

The motivating example currently can't be transformed because 
get_attr_type calls are in the way.

Bootstrapped and tested on x86_64-linux. Ok?


Bernd

Comments

Richard Biener July 19, 2016, 8:07 a.m. UTC | #1
On Mon, Jul 18, 2016 at 6:07 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> The motivating example for this patch was a change that was submitted for
> genattrtab last year, which would have made us generate
>
> switch (type = get_attr_type (insn))
>   {
>    ... some cases ...
>    default:
>      switch (type = get_attr_type (insn)))
>        {
>        ... some other cases ...
>        }
>   }
>
> The idea was to optimize this by merging the code into a single switch. My
> expectation was that this was most likely to occur in machine-generated
> code, but there are a few instances of this pattern in the gcc sources
> themselves. One case is
>
>    code = gimple_code (stmt);
>    switch (code)
>      {
>      ....
>      default:
>        if (is_gimple_omp (code))
>          {
>          }
>      }
>
> where is_gimple_omp expands into another switch. More cases exist in the
> compiler as shown by various bootstrap failures along the way; sometimes
> these are exposed after other optimizations. One is in the Ada runtime
> library somewhere, and another (which currently cannot be transformed by the
> patch) is in the Fortran frontend.
>
> In the future we could also look for if statements making another comparison
> of the variable in the default branch, that would be a minor extension.
>
> The motivating example currently can't be transformed because get_attr_type
> calls are in the way.
>
> Bootstrapped and tested on x86_64-linux. Ok?

This is not appropriate for CFG cleanup due to its complexity not
being O(# bbs + # edges).
I tried hard in the past to make it so (at least when no transform is done).

Please move this transform elsewhere.  I suggest the switch-conversion
pass or if that
is not a good fit then maybe if-combine (whose transforms are remotely related).

Not looking closer at the patch but missing some comments on how it deals with
common cases (you see to handle fallthrus to the default label by
ignoring them?)

Thanks,
Richard.


>
> Bernd
Bernd Schmidt July 19, 2016, 9:52 a.m. UTC | #2
On 07/19/2016 10:07 AM, Richard Biener wrote:
>
> This is not appropriate for CFG cleanup due to its complexity not
> being O(# bbs + # edges).
> I tried hard in the past to make it so (at least when no transform is done).

Why wouldn't it be, if no transform is done? Assuming we visit each bb 
once, we have at worst one walk over the successor edges of the 
predecessor (if we even find two switches on the same variable), and 
then we can decide whether or not to do the transformation.

When performing the transformation I could imagine one could construct a 
testcase where lots of these switches are nested inside each other, but 
I'm not convinved that's really a realistic worry.

> Please move this transform elsewhere.  I suggest the switch-conversion
> pass or if that
> is not a good fit then maybe if-combine (whose transforms are remotely related).

One problem is that this triggers rarely, but when it does, it occurs at 
various stages in the compilation after other optimizations have been 
done. Moving it to any given point is likely to limit the effectiveness.

> Not looking closer at the patch but missing some comments on how it deals with
> common cases (you see to handle fallthrus to the default label by
> ignoring them?)

If you are thinking of

switch (a)
  {
  case n:
  case m:
  default:
    switch (a) { .... }
  }

then the cases for n and m can simply be dropped when merging from the 
second switch into the first one. That's what happens, and there's a 
comment for it. So please elaborate what you mean.


Bernd
Richard Biener July 19, 2016, 10:09 a.m. UTC | #3
On Tue, Jul 19, 2016 at 11:52 AM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 07/19/2016 10:07 AM, Richard Biener wrote:
>>
>>
>> This is not appropriate for CFG cleanup due to its complexity not
>> being O(# bbs + # edges).
>> I tried hard in the past to make it so (at least when no transform is
>> done).
>
>
> Why wouldn't it be, if no transform is done? Assuming we visit each bb once,
> we have at worst one walk over the successor edges of the predecessor (if we
> even find two switches on the same variable), and then we can decide whether
> or not to do the transformation.

I saw walks over stmts of a BB.  IMHO that's a no-go.

That said, CFG cleanup is not the place for this optimization.

The only trivial CFG cleanup transform for switches I can see is transforming
them to a simple if / else in case there is a single non-default
label.  And it's
not even doing that currently.

> When performing the transformation I could imagine one could construct a
> testcase where lots of these switches are nested inside each other, but I'm
> not convinved that's really a realistic worry.
>
>> Please move this transform elsewhere.  I suggest the switch-conversion
>> pass or if that
>> is not a good fit then maybe if-combine (whose transforms are remotely
>> related).
>
>
> One problem is that this triggers rarely, but when it does, it occurs at
> various stages in the compilation after other optimizations have been done.
> Moving it to any given point is likely to limit the effectiveness.

Well, that's true for all optimization passes we have.  The opportunity once it
arises is not likely to be removed by any pass.

>> Not looking closer at the patch but missing some comments on how it deals
>> with
>> common cases (you see to handle fallthrus to the default label by
>> ignoring them?)
>
>
> If you are thinking of
>
> switch (a)
>  {
>  case n:
>  case m:
>  default:
>    switch (a) { .... }
>  }
>
> then the cases for n and m can simply be dropped when merging from the
> second switch into the first one. That's what happens, and there's a comment
> for it. So please elaborate what you mean.

I'm thinking of

  switch (a)
   {
   ...
   case n:
      do-stuff;
   default:
     switch (a)
       {
       case n:
         do-stuff;
       ...
       }
   }

yes, you can simply drop cases when there is no code in the outer switch.

Richard.

>
>
> Bernd
>
Bernd Schmidt July 19, 2016, 10:10 a.m. UTC | #4
On 07/19/2016 12:09 PM, Richard Biener wrote:

> I saw walks over stmts of a BB.  IMHO that's a no-go.

Only to find the first or last nondebug one. Is that unacceptable?

> I'm thinking of
>
>   switch (a)
>    {
>    ...
>    case n:
>       do-stuff;
>    default:
>      switch (a)
>        {
>        case n:
>          do-stuff;
>        ...
>        }
>    }
>
> yes, you can simply drop cases when there is no code in the outer switch.

We check that the second switch has a single predecessor block so this 
case can't happen.


Bernd
Marc Glisse July 19, 2016, 10:22 a.m. UTC | #5
On Tue, 19 Jul 2016, Bernd Schmidt wrote:

> On 07/19/2016 12:09 PM, Richard Biener wrote:
>
>> I saw walks over stmts of a BB.  IMHO that's a no-go.
>
> Only to find the first or last nondebug one. Is that unacceptable?

Does gsi_start_nondebug_after_labels_bb not fit?
Bernd Schmidt July 19, 2016, 10:25 a.m. UTC | #6
On 07/19/2016 12:22 PM, Marc Glisse wrote:
> On Tue, 19 Jul 2016, Bernd Schmidt wrote:
>
>> On 07/19/2016 12:09 PM, Richard Biener wrote:
>>
>>> I saw walks over stmts of a BB.  IMHO that's a no-go.
>>
>> Only to find the first or last nondebug one. Is that unacceptable?
>
> Does gsi_start_nondebug_after_labels_bb not fit?

It might, if one realizes that such a thing exists. Will try.


Bernd
Richard Biener July 19, 2016, 10:35 a.m. UTC | #7
On Tue, Jul 19, 2016 at 12:25 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 07/19/2016 12:22 PM, Marc Glisse wrote:
>>
>> On Tue, 19 Jul 2016, Bernd Schmidt wrote:
>>
>>> On 07/19/2016 12:09 PM, Richard Biener wrote:
>>>
>>>> I saw walks over stmts of a BB.  IMHO that's a no-go.
>>>
>>>
>>> Only to find the first or last nondebug one. Is that unacceptable?
>>
>>
>> Does gsi_start_nondebug_after_labels_bb not fit?
>
>
> It might, if one realizes that such a thing exists. Will try.

I think that start/end_recording_case_labels also merged adjacent labels
via group_case_labels_stmt.  Not sure why you need to stop recording
case labels during the transform.  Is this because you are building a new
switch stmt?

Richard.


> Bernd
>
Bernd Schmidt July 19, 2016, 11:07 a.m. UTC | #8
On 07/19/2016 12:35 PM, Richard Biener wrote:

> I think that start/end_recording_case_labels also merged adjacent labels
> via group_case_labels_stmt.  Not sure why you need to stop recording
> case labels during the transform.  Is this because you are building a new
> switch stmt?

It's because the cached mapping gets invalidated. Look in tree-cfg, it 
has a edge_to_cases map which I think cannot be maintained if you modify 
the structure. I certainly got lots of internal errors until I added 
that pair of calls.


Bernd
Richard Biener July 19, 2016, 11:18 a.m. UTC | #9
On Tue, Jul 19, 2016 at 1:07 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 07/19/2016 12:35 PM, Richard Biener wrote:
>
>> I think that start/end_recording_case_labels also merged adjacent labels
>> via group_case_labels_stmt.  Not sure why you need to stop recording
>> case labels during the transform.  Is this because you are building a new
>> switch stmt?
>
>
> It's because the cached mapping gets invalidated. Look in tree-cfg, it has a
> edge_to_cases map which I think cannot be maintained if you modify the
> structure. I certainly got lots of internal errors until I added that pair
> of calls.

Yeah, I see that.  OTOH cfgcleanup relies on this cache to be efficient and
you (repeatedly) clear it.  Clearing parts of it should be sufficient and if you
used redirect_edge_and_branch instead of redirect_edge_pred it would have
maintained the cache as far as I can see, or you can make sure to maintain
it yourself or just clear the info associated with the edges you redirect from
one switch to another.

Btw,

+  gimple_stmt_iterator gsi1, gsi2;
+  gsi1 = gsi_last_nondebug_bb (pred_bb);
+  if (gsi_end_p (gsi1))
+    return false;
+  gimple *pred_end = gsi_stmt (gsi1);
+  if (gimple_code (pred_end) != GIMPLE_SWITCH)

this is just

   gimple *pred_end = last_stmt (pred_bb);
   if (! pred_end || gimple_code (pred_end) != GIMPLE_SWITCH)
     ...

Richard.


Richard.

>
> Bernd
>
diff mbox

Patch

	* tree-cfgcleanup.c (try_merge_switches): New function.
	(cleanup_tree_cfg_bb): Call it.

	* c-c++-common/merge-switch-1.c: New test.
	* c-c++-common/merge-switch-2.c: New test.
	* c-c++-common/merge-switch-3.c: New test.

Index: gcc/tree-cfgcleanup.c
===================================================================
--- gcc/tree-cfgcleanup.c	(revision 237797)
+++ gcc/tree-cfgcleanup.c	(working copy)
@@ -630,6 +630,242 @@  fixup_noreturn_call (gimple *stmt)
   return changed;
 }
 
+/* Look for situations where we have a switch inside the default case of
+   another, and they switch on the same condition.  We look for the
+   second switch in BB.  If we find such a situation, merge the two
+   switch statements.  */
+
+static bool
+try_merge_switches (basic_block bb)
+{
+  if (!single_pred_p (bb))
+    return false;
+  basic_block pred_bb = single_pred (bb);
+
+  /* Look for a structure with two switch statements on the same value.  */
+  gimple_stmt_iterator gsi1, gsi2;
+  gsi1 = gsi_last_nondebug_bb (pred_bb);
+  if (gsi_end_p (gsi1))
+    return false;
+  gimple *pred_end = gsi_stmt (gsi1);
+  if (gimple_code (pred_end) != GIMPLE_SWITCH)
+    return false;
+
+  gsi2 = gsi_after_labels (bb);
+  if (gsi_end_p (gsi2))
+    return false;
+
+  gimple *stmt = gsi_stmt (gsi2);
+  while (is_gimple_debug (stmt))
+    {
+      gsi_next (&gsi2);
+      if (gsi_end_p (gsi2))
+	return false;
+      stmt = gsi_stmt (gsi2);
+    }
+
+  if (gimple_code (stmt) != GIMPLE_SWITCH)
+    return false;
+  gswitch *sw1 = as_a <gswitch *> (pred_end);
+  gswitch *sw2 = as_a <gswitch *> (stmt);
+  tree idx1 = gimple_switch_index (sw1);
+  tree idx2 = gimple_switch_index (sw2);
+  if (TREE_CODE (idx1) != SSA_NAME || idx1 != idx2)
+    return false;
+  size_t n1 = gimple_switch_num_labels (sw1);
+  size_t n2 = gimple_switch_num_labels (sw2);
+  if (n1 <= 1 || n2 <= 1)
+    return false;
+  tree sw1_default = gimple_switch_default_label (sw1);
+  if (label_to_block (CASE_LABEL (sw1_default)) != bb)
+    return false;
+
+  /* We know we have the basic structure of what we are looking for.  Sort
+     out some special cases regarding phi nodes.  */
+  if (!gsi_end_p (gsi_start_phis (bb)))
+    return false;
+
+  edge e;
+  edge_iterator ei;
+  FOR_EACH_EDGE (e, ei, bb->succs)
+    {
+      basic_block dest = e->dest;
+      if (find_edge (pred_bb, dest))
+	{
+	  /* If a destination block is reached from both switches, any
+	     phi nodes there would become corrupted.  */
+	  gphi_iterator psi = gsi_start_phis (dest);
+	  if (!gsi_end_p (psi))
+	    return false;
+	}
+    }
+
+  /* At this point we know we are making the transformation.
+     Clear the CASE_CHAIN values to avoid inconsistencies.  */
+  end_recording_case_labels ();
+
+  /* Start at 1 to skip default labels.  */
+  size_t i1 = 1;
+  size_t i2 = 1;
+  tree ce1 = gimple_switch_label (sw1, i1);
+  tree ce2 = gimple_switch_label (sw2, i2);
+  auto_vec<tree> new_labels;
+
+  /* Keep track of the blocks that were reachable from the second switch,
+     and whose edges should be redirected to the first.  Sometimes we
+     eliminate cases from the second switch entirely since they are
+     unreachable; in such cases, the bit for the destination block
+     remains clear.  */
+  bitmap_head redirect_bbs;
+  bitmap_initialize (&redirect_bbs, &bitmap_default_obstack);
+
+  /* Merge case labels from sw2 into those of sw1.  */
+  while (i1 < n1 && i2 < n2)
+    {
+      tree min1 = CASE_LOW (ce1);
+      tree min2 = CASE_LOW (ce2);
+      tree max1 = CASE_HIGH (ce1);
+      tree max2 = CASE_HIGH (ce2);
+      if (max1 == NULL_TREE)
+	max1 = min1;
+      if (max2 == NULL_TREE)
+	max2 = min2;
+
+      if (label_to_block (CASE_LABEL (ce1)) == bb)
+	{
+	  /* A non-default case that jumps to the second switch.  Drop it.  */
+	  if (++i1 < n1)
+	    ce1 = gimple_switch_label (sw1, i1);
+	}
+      else if (tree_int_cst_compare (min1, max2) > 0)
+	{
+	  /* CE2 is a range entirely below that of CE1.  */
+	  basic_block other_bb = label_to_block (CASE_LABEL (ce2));
+	  bitmap_set_bit (&redirect_bbs, other_bb->index);
+	  new_labels.safe_push (ce2);
+	  if (++i2 < n2)
+	    ce2 = gimple_switch_label (sw2, i2);
+	}
+      else if (tree_int_cst_compare (min2, max1) > 0)
+	{
+	  /* CE1 is a range entirely below that of CE2.  */
+	  new_labels.safe_push (ce1);
+	  if (++i1 < n1)
+	    ce1 = gimple_switch_label (sw1, i1);
+	}
+      else if (tree_int_cst_compare (min1, min2) <= 0)
+	{
+	  /* The range of CE1 begins at or below the one of CE2.  */
+	  if (tree_int_cst_compare (max1, max2) >= 0)
+	    {
+	      /* CE1 ends after CE2.  Drop CE2.  */
+	      if (++i2 < n2)
+		ce2 = gimple_switch_label (sw2, i2);
+	    }
+	  else
+	    {
+	      /* We have overlap.  Adjust CE2 to being just after CE1.  */
+	      new_labels.safe_push (ce1);
+	      min2 = int_const_binop (PLUS_EXPR, max1, integer_one_node);
+	      CASE_LOW (ce2) = min2;
+	      if (++i1 < n1)
+		ce1 = gimple_switch_label (sw1, i1);
+	    }
+	}
+      else
+	{
+	  /* The range of CE1 begins after CE2, and there is overlap.
+	     Adjust and add a copy of CE2 to cover the range up to the
+	     start of CE1. */
+	  max2 = int_const_binop (MINUS_EXPR, max1, integer_one_node);
+	  tree split = copy_node (ce2);
+	  CASE_HIGH (split) = max2;
+	  CASE_LOW (ce2) = min1;
+	  new_labels.safe_push (split);
+	  basic_block other_bb = label_to_block (CASE_LABEL (ce2));
+	  bitmap_set_bit (&redirect_bbs, other_bb->index);
+	}
+    }
+  /* Append trailing elements of either array.  This will never append
+     from both.  */
+  while (i1 < n1)
+    {
+      new_labels.safe_push (ce1);
+      if (++i1 < n1)
+	ce1 = gimple_switch_label (sw1, i1);
+    }
+  while (i2 < n2)
+    {
+      basic_block other_bb = label_to_block (CASE_LABEL (ce2));
+      bitmap_set_bit (&redirect_bbs, other_bb->index);
+      new_labels.safe_push (ce2);
+      if (++i2 < n2)
+	ce2 = gimple_switch_label (sw2, i2);
+    }
+
+  /* Merge adjacent ranges.  */
+  size_t new_count = new_labels.length ();
+  for (i1 = i2 = 1; i1 < new_count; )
+    {
+      ce1 = new_labels[i1++];
+      tree high = CASE_HIGH (ce1);
+      if (high == NULL_TREE)
+	high = CASE_LOW (ce1);
+      while (i1 < new_count)
+	{
+	  ce2 = new_labels[i1];
+	  if (!integer_onep (int_const_binop (MINUS_EXPR, CASE_LOW (ce2),
+					      high))
+	      || CASE_LABEL (ce1) != CASE_LABEL (ce2))
+	    break;
+
+	  CASE_HIGH (ce1) = CASE_HIGH (ce2);
+	  i1++;
+	}
+      new_labels[i2++] = ce1;
+    }
+  new_labels.truncate (i2);
+
+  /* Create a replacement switch.  */
+  tree sw2_default = gimple_switch_default_label (sw2);
+  basic_block other_bb = label_to_block (CASE_LABEL (sw2_default));
+  bitmap_set_bit (&redirect_bbs, other_bb->index);
+
+  gswitch *new_sw1 = gimple_build_switch (gimple_switch_index (sw1),
+					  sw2_default, new_labels);
+  gsi_replace (&gsi1, new_sw1, false);
+
+  /* Redirect edges.  */
+
+  for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
+    {
+      basic_block dest = e->dest;
+      if (!bitmap_bit_p (&redirect_bbs, dest->index)
+	  || find_edge (pred_bb, dest))
+	{
+	  ei_next (&ei);
+	  continue;
+	}
+      redirect_edge_pred (e, pred_bb);
+      bitmap_set_bit (cfgcleanup_altered_bbs, dest->index);
+    }
+
+  if (dom_info_available_p (CDI_DOMINATORS))
+    {
+      basic_block dombb = get_immediate_dominator (CDI_DOMINATORS, bb);
+      gcc_assert (dombb == pred_bb);
+      vec<basic_block> fix_bbs;
+      fix_bbs = get_all_dominated_blocks (CDI_DOMINATORS, pred_bb);
+      iterate_fix_dominators (CDI_DOMINATORS, fix_bbs, false);
+      fix_bbs.release ();
+    }
+
+  remove_edge_and_dominated_blocks (single_pred_edge (bb));
+  bitmap_clear (&redirect_bbs);
+
+  start_recording_case_labels ();
+  return true;
+}
 
 /* Tries to cleanup cfg in basic block BB.  Returns true if anything
    changes.  */
@@ -641,6 +877,9 @@  cleanup_tree_cfg_bb (basic_block bb)
       && remove_forwarder_block (bb))
     return true;
 
+  if (try_merge_switches (bb))
+    return true;
+
   /* Merging the blocks may create new opportunities for folding
      conditional branches (due to the elimination of single-valued PHI
      nodes).  */
Index: gcc/testsuite/c-c++-common/merge-switch-1.c
===================================================================
--- gcc/testsuite/c-c++-common/merge-switch-1.c	(revision 0)
+++ gcc/testsuite/c-c++-common/merge-switch-1.c	(working copy)
@@ -0,0 +1,30 @@ 
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+__attribute__((noinline,noclone)) int foo (int c)
+{
+  switch (c)
+    {
+    case 1:
+    case 3:
+    case 5:
+    case 7:
+    case 9:
+    case 11:
+      return 5;
+    default:
+      switch (c)
+	{
+	case 0:
+	case 2:
+	case 4:
+	case 6:
+	case 8:
+	case 10:
+	  return 5;
+	default:
+	  return 0;
+	}
+    }
+}
+/* { dg-final { scan-tree-dump-times "switch" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "case 0 \\.\\.\\. 11" 1 "optimized" } } */
Index: gcc/testsuite/c-c++-common/merge-switch-2.c
===================================================================
--- gcc/testsuite/c-c++-common/merge-switch-2.c	(revision 0)
+++ gcc/testsuite/c-c++-common/merge-switch-2.c	(working copy)
@@ -0,0 +1,32 @@ 
+/* { dg-options "-O2 -fdump-tree-dse1" } */
+
+/* We use the dse1 dump since it's before switchconv, but after a cfg
+   cleanup pass which should take care of merging the switches.  */
+int foo (int c)
+{
+  switch (c)
+    {
+    case 1:
+    case 3:
+    case 5:
+    case 7:
+    case 9:
+    case 11:
+      return 4;
+    default:
+      switch (c)
+	{
+	case 0:
+	case 2:
+	case 4:
+	case 6:
+	case 8:
+	case 10:
+	  return 5;
+	default:
+	  return 0;
+	}
+    }
+}
+/* { dg-final { scan-tree-dump-times "switch" 1 "dse1" } } */
+/* { dg-final { scan-tree-dump-not "\\.\\.\\." "dse1" } } */
Index: gcc/testsuite/c-c++-common/merge-switch-3.c
===================================================================
--- gcc/testsuite/c-c++-common/merge-switch-3.c	(revision 0)
+++ gcc/testsuite/c-c++-common/merge-switch-3.c	(working copy)
@@ -0,0 +1,31 @@ 
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* Sanity check that switch merging doesn't happen when switching on
+   different variables.  */
+
+int foo (int c, int d)
+{
+  switch (c)
+    {
+    case 1:
+    case 30:
+    case 51:
+    case 73:
+    case 92:
+    case 115:
+      return 4;
+    default:
+      switch (d)
+	{
+	case 10:
+	case 22:
+	case 42:
+	case 62:
+	case 81:
+	case 101:
+	  return 5;
+	default:
+	  return 0;
+	}
+    }
+}
+/* { dg-final { scan-tree-dump-times "switch" 2 "optimized" } } */