Allow cfgcleanup to remove forwarder loop preheaders and latches

Message ID 001d01cf321b$8b737860$a25a6920$@arm.com
State New

Commit Message

Bin Cheng Feb. 25, 2014, 11:20 a.m. UTC
Updated per the review comments.

Thanks,
bin

> -----Original Message-----
> From: Richard Biener [mailto:richard.guenther@gmail.com]
> Sent: Tuesday, February 25, 2014 6:38 PM
> To: Bin Cheng
> Cc: GCC Patches
> Subject: Re: [PATCH GCC]Allow cfgcleanup to remove forwarder loop
> preheaders and latches
> 
> On Tue, Feb 25, 2014 at 6:12 AM, bin.cheng <bin.cheng@arm.com> wrote:
> > Hi,
> > This patch fixes the regression reported in PR60280 by removing
> > forwarder loop preheaders/latches in cfg cleanup where possible.
> > Several tests are broken by this change since cfg cleanup is shared
> > by all optimizers.  Some tests have already been fixed by recent
> > patches; I went through and fixed the others.
> > One case that needs clarification is "gcc.dg/tree-prof/update-loopch.c".
> > When GCC removes a basic block, it checks profile information by
> > calling check_bb_profile after redirecting the incoming edges of the bb.
> > This inevitably results in warnings about invalid profile information
> > and causes the test to fail.  I will send a patch in stage 1 to skip
> > checking profile information for a basic block that is being removed,
> > if that sounds reasonable.  For now I just tweaked the test case itself.
> >
> > Bootstrapped and tested on x86_64 and arm_a15.
> >
> > Is it OK?
> 
> Can you document the extra threading we do in pr21559.c?  The comment
> still talks about two threadings we should perform.
> 
> Also the ivopt_* adjustments would be better done by matching
> "ivtmp.[0-9_]* = PHI" instead of matching ivtmp in one of the PHI
> arguments.
> 
> @@ -497,6 +507,9 @@ remove_forwarder_block (basic_block bb)
>        set_immediate_dominator (CDI_DOMINATORS, dest, dom);
>      }
> 
> +  if (current_loops && bb->loop_father->latch == bb)
> +    bb->loop_father->latch = dest;
> +
>    /* And kill the forwarder block.  */
>    delete_basic_block (bb);
> 
> can you add a comment here?  I had
> 
> @@ -497,7 +500,12 @@ remove_forwarder_block (basic_block bb)
>        set_immediate_dominator (CDI_DOMINATORS, dest, dom);
>      }
> 
> -  /* And kill the forwarder block.  */
> +  /* And kill the forwarder block, but first adjust its parent loop
> +     latch info as otherwise the cfg hook has a hard time not to
> +     kill the loop.  */
> +  if (current_loops
> +      && bb->loop_father->latch == bb)
> +    bb->loop_father->latch = dest;
>    delete_basic_block (bb);
> 
>    return true;
> 
> in my patch.
> 
> Thanks,
> Richard.
> 
> >
> > 2014-02-25  Bin Cheng  <bin.cheng@arm.com>
> >
> >         PR target/60280
> >         * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop
> >         preheaders and latches only if requested.  Fix latch if it
> >         is removed.
> >         * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set
> >         LOOPS_HAVE_PREHEADERS.
> >
> > gcc/testsuite/ChangeLog
> > 2014-02-25  Bin Cheng  <bin.cheng@arm.com>
> >
> >         PR target/60280
> >         * gnat.dg/renaming5.adb: Change to two expected gotos.
> >         * gcc.dg/tree-ssa/pr21559.c: Change back to three expected
> >         jump threads.
> >         * gcc.dg/tree-prof/update-loopch.c: Check two "Invalid sum"
> >         messages for removed basic block.
> >         * gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string.
> >         * gcc.dg/tree-ssa/ivopt_2.c: Ditto.
> >         * gcc.dg/tree-ssa/ivopt_3.c: Ditto.
> >         * gcc.dg/tree-ssa/ivopt_4.c: Ditto.
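
For readers skimming the ChangeLog, the new rule in tree_forwarder_block_p can be modelled outside GCC roughly as below.  This is only a sketch under simplifying assumptions: the sketch_* types, flags and function are hypothetical stand-ins, not GCC's basic_block, struct loop or LOOPS_* definitions, and every block is assumed to have exactly one successor.  It shows that loop headers stay protected unconditionally, while preheaders and latches are protected only when the corresponding loop property was requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's loop-state flags; the values are
   arbitrary and only meaningful inside this sketch.  */
enum
{
  SKETCH_HAVE_PREHEADERS = 1 << 0,
  SKETCH_HAVE_SIMPLE_LATCHES = 1 << 1
};

struct sketch_loop;

/* A block with a single successor, which is all a forwarder needs.  */
struct sketch_bb
{
  struct sketch_loop *loop_father; /* Innermost loop containing the block.  */
  struct sketch_bb *succ;          /* The single successor.  */
};

struct sketch_loop
{
  struct sketch_bb *header;
  struct sketch_bb *latch;
};

/* Return true if BB may be removed as a forwarder block, given the loop
   properties in STATE that the current pass asked to be maintained.  */
bool
sketch_may_remove_forwarder (const struct sketch_bb *bb, unsigned state)
{
  assert (bb && bb->succ && bb->loop_father && bb->succ->loop_father);
  const struct sketch_bb *dest = bb->succ;

  /* Loop headers are always protected.  */
  if (bb->loop_father->header == bb)
    return false;

  if (dest->loop_father->header == dest)
    {
      /* BB is outside the loop headed by DEST: it acts as a preheader.  */
      if ((state & SKETCH_HAVE_PREHEADERS)
          && bb->loop_father->header != dest)
        return false;

      /* BB is inside the loop and jumps back to its header: a latch.  */
      if ((state & SKETCH_HAVE_SIMPLE_LATCHES)
          && bb->loop_father->header == dest)
        return false;
    }

  return true;
}
```

With no properties requested, both a preheader and a latch forwarder come back as removable; requesting only one property protects only that kind of block.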

Comments

Richard Biener Feb. 25, 2014, 11:33 a.m. UTC | #1
On Tue, Feb 25, 2014 at 12:20 PM, bin.cheng <bin.cheng@arm.com> wrote:
> Updated as comments.

Ok.

Thanks,
Richard.

Patch

Index: gcc/tree-cfgcleanup.c
===================================================================
--- gcc/tree-cfgcleanup.c	(revision 207938)
+++ gcc/tree-cfgcleanup.c	(working copy)
@@ -308,14 +308,24 @@  tree_forwarder_block_p (basic_block bb, bool phi_w
   if (current_loops)
     {
       basic_block dest;
-      /* Protect loop latches, headers and preheaders.  */
+      /* Protect loop headers.  */
       if (bb->loop_father->header == bb)
 	return false;
+
       dest = EDGE_SUCC (bb, 0)->dest;
+      /* Protect loop preheaders and latches if requested.  */
+      if (dest->loop_father->header == dest)
+	{
+	  if (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS)
+	      && bb->loop_father->header != dest)
+	    return false;
 
-      if (dest->loop_father->header == dest)
-	return false;
+	  if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)
+	      && bb->loop_father->header == dest)
+	    return false;
+	}
     }
+
   return true;
 }
 
@@ -497,6 +507,11 @@  remove_forwarder_block (basic_block bb)
       set_immediate_dominator (CDI_DOMINATORS, dest, dom);
     }
 
+  /* Adjust latch information of BB's parent loop as otherwise
+     the cfg hook has a hard time not to kill the loop.  */
+  if (current_loops && bb->loop_father->latch == bb)
+    bb->loop_father->latch = dest;
+
   /* And kill the forwarder block.  */
   delete_basic_block (bb);
 
Index: gcc/tree-ssa-dom.c
===================================================================
--- gcc/tree-ssa-dom.c	(revision 207938)
+++ gcc/tree-ssa-dom.c	(working copy)
@@ -849,9 +849,15 @@  tree_ssa_dominator_optimize (void)
   /* We need to know loop structures in order to avoid destroying them
      in jump threading.  Note that we still can e.g. thread through loop
      headers to an exit edge, or through loop header to the loop body, assuming
-     that we update the loop info.  */
-  loop_optimizer_init (LOOPS_HAVE_SIMPLE_LATCHES);
+     that we update the loop info.
 
+     TODO: We don't need to set LOOPS_HAVE_PREHEADERS generally, but due
+     to several overly conservative bail-outs in jump threading, case
+     gcc.dg/tree-ssa/pr21417.c can't be threaded if loop preheader is
+     missing.  We should improve jump threading in future then
+     LOOPS_HAVE_PREHEADERS won't be needed here.  */
+  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+
   /* Initialize the value-handle array.  */
   threadedge_initialize_values ();
 
Index: gcc/testsuite/gnat.dg/renaming5.adb
===================================================================
--- gcc/testsuite/gnat.dg/renaming5.adb	(revision 207938)
+++ gcc/testsuite/gnat.dg/renaming5.adb	(working copy)
@@ -26,5 +26,5 @@  package body Renaming5 is
 
 end Renaming5;
 
--- { dg-final { scan-tree-dump-times "goto" 3 "optimized" } }
+-- { dg-final { scan-tree-dump-times "goto" 2 "optimized" } }
 -- { dg-final { cleanup-tree-dump "optimized" } }
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c	(working copy)
@@ -13,5 +13,5 @@  void foo (int i_width, TYPE dst, TYPE src1, TYPE s
        }
 }
 
-/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "ivtmp.\[0-9_\]* = PHI <" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/pr21559.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr21559.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr21559.c	(working copy)
@@ -36,8 +36,9 @@  void foo (void)
 
 /* Second, we should thread the edge out of the loop via the break
    statement.  We also realize that the final bytes == 0 test is useless,
-   and thread over it.  */
-/* { dg-final { scan-tree-dump-times "Threaded jump" 2 "vrp1" } } */
+   and thread over it.  We also know that toread != 0 is useless when
+   entering while loop and thread over it.  */
+/* { dg-final { scan-tree-dump-times "Threaded jump" 3 "vrp1" } } */
 
 /* { dg-final { cleanup-tree-dump "vrp1" } } */
 
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c	(working copy)
@@ -15,5 +15,5 @@  void foo (int i_width, TYPE dst, TYPE src1, TYPE s
        }
 }
 
-/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "ivtmp.\[0-9_\]* = PHI <" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c	(working copy)
@@ -14,5 +14,5 @@  void foo (int i_width, TYPE dst, TYPE src1, TYPE s
 }
 
 
-/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "ivtmp.\[0-9_\]* = PHI <" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c	(working copy)
@@ -14,7 +14,7 @@  void foo (int i_width, char* dst, char* src1, char
 	   src1+=sizeof(TYPE);
 	   src2+=sizeof(TYPE);
        }
-} 
+}
 
-/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "ivtmp.\[0-9_\]* = PHI <" 1 "ivopts"} } */
 /* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-prof/update-loopch.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-prof/update-loopch.c	(revision 207938)
+++ gcc/testsuite/gcc.dg/tree-prof/update-loopch.c	(working copy)
@@ -16,6 +16,7 @@  main ()
    edge.  */
 /* { dg-final-use { scan-ipa-dump "loop depth 1, count 33334" "profile"} } */
 /* { dg-final-use { scan-tree-dump "loop depth 1, count 33332" "optimized"} } */
-/* { dg-final-use { scan-tree-dump-not "Invalid sum" "optimized"} } */
+/* { dg-final-use { scan-tree-dump-times "Removing basic block \[^\r\n\]*\[\\r\\n\]+\[^\r\n\]*\[\\r\\n\]+Invalid sum of\[^\r\n\]*\[\\r\\n\]+Invalid sum of" 1 "optimized"} } */
+/* { dg-final-use { scan-tree-dump-times "Invalid sum of" 2 "optimized"} } */
 /* { dg-final-use { cleanup-ipa-dump "profile" } } */
 /* { dg-final-use { cleanup-tree-dump "optimized" } } */
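
The ivopt_* scan-string change can be illustrated with a small stand-alone check.  This is a sketch, not part of the patch: it uses POSIX regcomp/regexec rather than the Tcl regexps DejaGnu uses, it counts matching lines rather than occurrences, and the dump lines are hypothetical, written only to resemble ivopts dump output.  It assumes, for illustration, that the old string becomes unreliable when ivtmp is no longer the first PHI argument; anchoring on the PHI result does not depend on argument order.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Count how many of the N_LINES strings in LINES match the POSIX
   extended regular expression PATTERN.  Returns -1 if PATTERN does
   not compile.  */
int
count_matching_lines (const char *pattern, const char *const *lines,
                      size_t n_lines)
{
  regex_t re;
  int count = 0;

  assert (pattern && lines);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  for (size_t i = 0; i < n_lines; i++)
    if (regexec (&re, lines[i], 0, NULL, 0) == 0)
      count++;

  regfree (&re);
  return count;
}

/* Hypothetical dump lines, not copied from a real dump.  The two
   variants differ only in the order of the PHI arguments.  */
static const char *const dump_ivtmp_first[] = {
  "  # ivtmp.10_5 = PHI <ivtmp.10_12(4), 0(2)>",
  "  # i_1 = PHI <i_8(4), 0(2)>",
};

static const char *const dump_ivtmp_last[] = {
  "  # ivtmp.10_5 = PHI <0(2), ivtmp.10_12(4)>",
  "  # i_1 = PHI <0(2), i_8(4)>",
};
```

Calling count_matching_lines with "PHI <ivtmp" on the two arrays gives different counts, while "ivtmp.[0-9_]* = PHI <" finds the induction-variable PHI in both.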