diff mbox

Allow cfgcleanup to remove forwarder loop preheaders and latches

Message ID 001901cf31e8$2218ad50$664a07f0$@arm.com
State New
Headers show

Commit Message

Bin Cheng Feb. 25, 2014, 5:12 a.m. UTC
Hi,
This patch is to fix regression reported in PR60280 by removing forward loop
headers/latches in cfg cleanup if possible.  Several tests are broken by
this change since cfg cleanup is shared by all optimizers.  Some tests has
already been fixed by recent patches, I went through and fixed the others.
One case needs to be clarified is "gcc.dg/tree-prof/update-loopch.c".  When
GCC removing a basic block, it checks profile information by calling
check_bb_profile after redirecting incoming edges of the bb.  This certainly
results in warnings about invalid profile information and causes the case to
fail.  I will send a patch to skip checking profile information for a
removing basic block in stage 1 if it sounds reasonable.  For now I just
twisted the case itself.

Bootstrap and tested on x86_64 and arm_a15.

Is it OK?


2014-02-25  Bin Cheng  <bin.cheng@arm.com>

	PR target/60280
	* tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
	preheaders and latches only if requested.  Fix latch if it
	is removed.
	* tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
	LOOPS_HAVE_PREHEADERS.

gcc/testsuite/ChangeLog
2014-02-25  Bin Cheng  <bin.cheng@arm.com>

	PR target/60280
	* gnat.dg/renaming5.adb: Change to two expected gotos.
	* gcc.dg/tree-ssa/pr21559.c: Change back to three expected
	jump threads.
	* gcc.dg/tree-prof/update-loopch.c: Check two "Invalid sum"
	messages for removed basic block.
	* gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string.
	* gcc.dg/tree-ssa/ivopt_2.c: Ditto.
	* gcc.dg/tree-ssa/ivopt_3.c: Ditto.
	* gcc.dg/tree-ssa/ivopt_4.c: Ditto.

Comments

Richard Biener Feb. 25, 2014, 10:38 a.m. UTC | #1
On Tue, Feb 25, 2014 at 6:12 AM, bin.cheng <bin.cheng@arm.com> wrote:
> Hi,
> This patch is to fix regression reported in PR60280 by removing forward loop
> headers/latches in cfg cleanup if possible.  Several tests are broken by
> this change since cfg cleanup is shared by all optimizers.  Some tests has
> already been fixed by recent patches, I went through and fixed the others.
> One case needs to be clarified is "gcc.dg/tree-prof/update-loopch.c".  When
> GCC removing a basic block, it checks profile information by calling
> check_bb_profile after redirecting incoming edges of the bb.  This certainly
> results in warnings about invalid profile information and causes the case to
> fail.  I will send a patch to skip checking profile information for a
> removing basic block in stage 1 if it sounds reasonable.  For now I just
> twisted the case itself.
>
> Bootstrap and tested on x86_64 and arm_a15.
>
> Is it OK?

Can you document the extra threading we do in pr21559.c?  The
comment still talks about two threadings we should perform.

Also the ivopt_* adjustmens would be better done by matching
"ivtmp.[0-9_]* = PHI" instead of matching ivtmp in one of the PHI
arguments.

@@ -497,6 +507,9 @@ remove_forwarder_block (basic_block bb)
       set_immediate_dominator (CDI_DOMINATORS, dest, dom);
     }

+  if (current_loops && bb->loop_father->latch == bb)
+    bb->loop_father->latch = dest;
+
   /* And kill the forwarder block.  */
   delete_basic_block (bb);

can you add a comment here?  I had

@@ -497,7 +500,12 @@ remove_forwarder_block (basic_block bb)
       set_immediate_dominator (CDI_DOMINATORS, dest, dom);
     }

-  /* And kill the forwarder block.  */
+  /* And kill the forwarder block, but first adjust its parent loop
+     latch info as otherwise the cfg hook has a hard time not to
+     kill the loop.  */
+  if (current_loops
+      && bb->loop_father->latch == bb)
+    bb->loop_father->latch = dest;
   delete_basic_block (bb);

   return true;

in my patch.

Thanks,
Richard.

>
> 2014-02-25  Bin Cheng  <bin.cheng@arm.com>
>
>         PR target/60280
>         * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
>         preheaders and latches only if requested.  Fix latch if it
>         is removed.
>         * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
>         LOOPS_HAVE_PREHEADERS.
>
> gcc/testsuite/ChangeLog
> 2014-02-25  Bin Cheng  <bin.cheng@arm.com>
>
>         PR target/60280
>         * gnat.dg/renaming5.adb: Change to two expected gotos.
>         * gcc.dg/tree-ssa/pr21559.c: Change back to three expected
>         jump threads.
>         * gcc.dg/tree-prof/update-loopch.c: Check two "Invalid sum"
>         messages for removed basic block.
>         * gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string.
>         * gcc.dg/tree-ssa/ivopt_2.c: Ditto.
>         * gcc.dg/tree-ssa/ivopt_3.c: Ditto.
>         * gcc.dg/tree-ssa/ivopt_4.c: Ditto.
Hans-Peter Nilsson Feb. 28, 2014, 1:34 a.m. UTC | #2
On Tue, 25 Feb 2014, bin.cheng wrote:

> Hi,
> This patch is to fix regression reported in PR60280 by removing forward loop
> headers/latches in cfg cleanup if possible.  Several tests are broken by
> this change since cfg cleanup is shared by all optimizers.  Some tests has
> already been fixed by recent patches, I went through and fixed the others.
> One case needs to be clarified is "gcc.dg/tree-prof/update-loopch.c".  When
> GCC removing a basic block, it checks profile information by calling
> check_bb_profile after redirecting incoming edges of the bb.  This certainly
> results in warnings about invalid profile information and causes the case to
> fail.  I will send a patch to skip checking profile information for a
> removing basic block in stage 1 if it sounds reasonable.  For now I just
> twisted the case itself.
>
> Bootstrap and tested on x86_64 and arm_a15.
>
> Is it OK?
>
>
> 2014-02-25  Bin Cheng  <bin.cheng@arm.com>
>
> 	PR target/60280
> 	* tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
> 	preheaders and latches only if requested.  Fix latch if it
> 	is removed.
> 	* tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
> 	LOOPS_HAVE_PREHEADERS.
>
> gcc/testsuite/ChangeLog
> 2014-02-25  Bin Cheng  <bin.cheng@arm.com>
>
> 	PR target/60280
> 	* gnat.dg/renaming5.adb: Change to two expected gotos.
> 	* gcc.dg/tree-ssa/pr21559.c: Change back to three expected
> 	jump threads.
> 	* gcc.dg/tree-prof/update-loopch.c: Check two "Invalid sum"
> 	messages for removed basic block.
> 	* gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string.
> 	* gcc.dg/tree-ssa/ivopt_2.c: Ditto.
> 	* gcc.dg/tree-ssa/ivopt_3.c: Ditto.
> 	* gcc.dg/tree-ssa/ivopt_4.c: Ditto.

Do you need to also update gcc.dg/tree-ssa/ssa-dom-thread-4.c,
at least for logical_op_short_circuit targets?
(There's a nice analysis comment there by Richard S which may
also have to be updated.)

This caused a regression for logical_op_short_circuit targets, I
entered PR60363 for convenience.

brgds, H-P
Bin.Cheng Feb. 28, 2014, 1:58 a.m. UTC | #3
Sorry, I didn't test it against logical_op_short_circuit target.  I
will look into this PR.

Thanks,
bin

On Fri, Feb 28, 2014 at 9:34 AM, Hans-Peter Nilsson <hp@bitrange.com> wrote:
> On Tue, 25 Feb 2014, bin.cheng wrote:
>
>> Hi,
>> This patch is to fix regression reported in PR60280 by removing forward loop
>> headers/latches in cfg cleanup if possible.  Several tests are broken by
>> this change since cfg cleanup is shared by all optimizers.  Some tests has
>> already been fixed by recent patches, I went through and fixed the others.
>> One case needs to be clarified is "gcc.dg/tree-prof/update-loopch.c".  When
>> GCC removing a basic block, it checks profile information by calling
>> check_bb_profile after redirecting incoming edges of the bb.  This certainly
>> results in warnings about invalid profile information and causes the case to
>> fail.  I will send a patch to skip checking profile information for a
>> removing basic block in stage 1 if it sounds reasonable.  For now I just
>> twisted the case itself.
>>
>> Bootstrap and tested on x86_64 and arm_a15.
>>
>> Is it OK?
>>
>>
>> 2014-02-25  Bin Cheng  <bin.cheng@arm.com>
>>
>>       PR target/60280
>>       * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
>>       preheaders and latches only if requested.  Fix latch if it
>>       is removed.
>>       * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
>>       LOOPS_HAVE_PREHEADERS.
>>
>> gcc/testsuite/ChangeLog
>> 2014-02-25  Bin Cheng  <bin.cheng@arm.com>
>>
>>       PR target/60280
>>       * gnat.dg/renaming5.adb: Change to two expected gotos.
>>       * gcc.dg/tree-ssa/pr21559.c: Change back to three expected
>>       jump threads.
>>       * gcc.dg/tree-prof/update-loopch.c: Check two "Invalid sum"
>>       messages for removed basic block.
>>       * gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string.
>>       * gcc.dg/tree-ssa/ivopt_2.c: Ditto.
>>       * gcc.dg/tree-ssa/ivopt_3.c: Ditto.
>>       * gcc.dg/tree-ssa/ivopt_4.c: Ditto.
>
> Do you need to also update gcc.dg/tree-ssa/ssa-dom-thread-4.c,
> at least for logical_op_short_circuit targets?
> (There's a nice analysis comment there by Richard S which may
> also have to be updated.)
>
> This caused a regression for logical_op_short_circuit targets, I
> entered PR60363 for convenience.
>
> brgds, H-P
diff mbox

Patch

Index: gcc/tree-ssa-dom.c
===================================================================
--- gcc/tree-ssa-dom.c	(revision 207938)
+++ gcc/tree-ssa-dom.c	(working copy)
@@ -849,9 +849,15 @@  tree_ssa_dominator_optimize (void)
   /* We need to know loop structures in order to avoid destroying them
      in jump threading.  Note that we still can e.g. thread through loop
      headers to an exit edge, or through loop header to the loop body, assuming
-     that we update the loop info.  */
-  loop_optimizer_init (LOOPS_HAVE_SIMPLE_LATCHES);
+     that we update the loop info.
 
+     TODO: We don't need to set LOOPS_HAVE_PREHEADERS generally, but due
+     to several overly conservative bail-outs in jump threading, case
+     gcc.dg/tree-ssa/pr21417.c can't be threaded if loop preheader is
+     missing.  We should improve jump threading in future then
+     LOOPS_HAVE_PREHEADERS won't be needed here.  */
+  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+
   /* Initialize the value-handle array.  */
   threadedge_initialize_values ();
 
Index: gcc/tree-cfgcleanup.c
===================================================================
--- gcc/tree-cfgcleanup.c	(revision 207938)
+++ gcc/tree-cfgcleanup.c	(working copy)
@@ -308,14 +308,24 @@  tree_forwarder_block_p (basic_block bb, bool phi_w
   if (current_loops)
     {
       basic_block dest;
-      /* Protect loop latches, headers and preheaders.  */
+      /* Protect loop headers.  */
       if (bb->loop_father->header == bb)
 	return false;
+
       dest = EDGE_SUCC (bb, 0)->dest;
+      /* Protect loop preheaders and latches if requested.  */
+      if (dest->loop_father->header == dest)
+	{
+	  if (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)
+	      && bb->loop_father->header != dest)
+	    return false;
 
-      if (dest->loop_father->header == dest)
-	return false;
+	  if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)
+	      && bb->loop_father->header == dest)
+	    return false;
+	}
     }
+
   return true;
 }
 
@@ -497,6 +507,9 @@  remove_forwarder_block (basic_block bb)
       set_immediate_dominator (CDI_DOMINATORS, dest, dom);
     }
 
+  if (current_loops && bb->loop_father->latch == bb)
+    bb->loop_father->latch = dest;
+
   /* And kill the forwarder block.  */
   delete_basic_block (bb);
 
Index: gcc/testsuite/gnat.dg/renaming5.adb
===================================================================
--- gcc/testsuite/gnat.dg/renaming5.adb	(revision 207938)
+++ gcc/testsuite/gnat.dg/renaming5.adb	(working copy)
@@ -26,5 +26,5 @@  package body Renaming5 is
 
 end Renaming5;
 
--- { dg-final { scan-tree-dump-times "goto" 3 "optimized" } }
+-- { dg-final { scan-tree-dump-times "goto" 2 "optimized" } }
 -- { dg-final { cleanup-tree-dump "optimized" } }
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c	(working copy)
@@ -13,5 +13,5 @@  void foo (int i_width, TYPE dst, TYPE src1, TYPE s
        }
 }
 
-/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <.*ivtmp" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr21559.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr21559.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr21559.c	(working copy)
@@ -37,7 +37,7 @@  void foo (void)
 /* Second, we should thread the edge out of the loop via the break
    statement.  We also realize that the final bytes == 0 test is useless,
    and thread over it.  */
-/* { dg-final { scan-tree-dump-times "Threaded jump" 2 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "Threaded jump" 3 "vrp1" } } */
 
 /* { dg-final { cleanup-tree-dump "vrp1" } } */
 
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c	(working copy)
@@ -15,5 +15,5 @@  void foo (int i_width, TYPE dst, TYPE src1, TYPE s
        }
 }
 
-/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <.*ivtmp" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c	(working copy)
@@ -14,5 +14,5 @@  void foo (int i_width, TYPE dst, TYPE src1, TYPE s
 }
 
 
-/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <.*ivtmp" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c	(working copy)
@@ -14,7 +14,7 @@  void foo (int i_width, char* dst, char* src1, char
 	   src1+=sizeof(TYPE);
 	   src2+=sizeof(TYPE);
        }
-} 
+}
 
-/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <.*ivtmp" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-prof/update-loopch.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-prof/update-loopch.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-prof/update-loopch.c	(working copy)
@@ -16,6 +16,7 @@  main ()
    edge.  */
 /* { dg-final-use { scan-ipa-dump "loop depth 1, count 33334" "profile"} } */
 /* { dg-final-use { scan-tree-dump "loop depth 1, count 33332" "optimized"} } */
-/* { dg-final-use { scan-tree-dump-not "Invalid sum" "optimized"} } */
+/* { dg-final-use { scan-tree-dump-times "Removing basic block \[^\r\n\]*\[\\r\\n\]+\[^\r\n\]*\[\\r\\n\]+Invalid sum of\[^\r\n\]*\[\\r\\n\]+Invalid sum of" 1 "optimized"} } */
+/* { dg-final-use { scan-tree-dump-times "Invalid sum of" 2 "optimized"} } */
 /* { dg-final-use { cleanup-ipa-dump "profile" } } */
 /* { dg-final-use { cleanup-tree-dump "optimized" } } */