Message ID | 001901cf31e8$2218ad50$664a07f0$@arm.com |
---|---|
State | New |
Headers | show |
On Tue, Feb 25, 2014 at 6:12 AM, bin.cheng <bin.cheng@arm.com> wrote: > Hi, > This patch is to fix regression reported in PR60280 by removing forward loop > headers/latches in cfg cleanup if possible. Several tests are broken by > this change since cfg cleanup is shared by all optimizers. Some tests has > already been fixed by recent patches, I went through and fixed the others. > One case needs to be clarified is "gcc.dg/tree-prof/update-loopch.c". When > GCC removing a basic block, it checks profile information by calling > check_bb_profile after redirecting incoming edges of the bb. This certainly > results in warnings about invalid profile information and causes the case to > fail. I will send a patch to skip checking profile information for a > removing basic block in stage 1 if it sounds reasonable. For now I just > twisted the case itself. > > Bootstrap and tested on x86_64 and arm_a15. > > Is it OK? Can you document the extra threading we do in pr21559.c? The comment still talks about two threadings we should perform. Also the ivopt_* adjustmens would be better done by matching "ivtmp.[0-9_]* = PHI" instead of matching ivtmp in one of the PHI arguments. @@ -497,6 +507,9 @@ remove_forwarder_block (basic_block bb) set_immediate_dominator (CDI_DOMINATORS, dest, dom); } + if (current_loops && bb->loop_father->latch == bb) + bb->loop_father->latch = dest; + /* And kill the forwarder block. */ delete_basic_block (bb); can you add a comment here? I had @@ -497,7 +500,12 @@ remove_forwarder_block (basic_block bb) set_immediate_dominator (CDI_DOMINATORS, dest, dom); } - /* And kill the forwarder block. */ + /* And kill the forwarder block, but first adjust its parent loop + latch info as otherwise the cfg hook has a hard time not to + kill the loop. */ + if (current_loops + && bb->loop_father->latch == bb) + bb->loop_father->latch = dest; delete_basic_block (bb); return true; in my patch. Thanks, Richard. > > 2014-02-25 Bin Cheng <bin.cheng@arm.com> > > PR target/60280 > * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop > preheaders and latches only if requested. Fix latch if it > is removed. > * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set > LOOPS_HAVE_PREHEADERS. > > gcc/testsuite/ChangeLog > 2014-02-25 Bin Cheng <bin.cheng@arm.com> > > PR target/60280 > * gnat.dg/renaming5.adb: Change to two expected gotos. > * gcc.dg/tree-ssa/pr21559.c: Change back to three expected > jump threads. > * gcc.dg/tree-prof/update-loopch.c: Check two "Invalid sum" > messages for removed basic block. > * gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string. > * gcc.dg/tree-ssa/ivopt_2.c: Ditto. > * gcc.dg/tree-ssa/ivopt_3.c: Ditto. > * gcc.dg/tree-ssa/ivopt_4.c: Ditto.
On Tue, 25 Feb 2014, bin.cheng wrote: > Hi, > This patch is to fix regression reported in PR60280 by removing forward loop > headers/latches in cfg cleanup if possible. Several tests are broken by > this change since cfg cleanup is shared by all optimizers. Some tests has > already been fixed by recent patches, I went through and fixed the others. > One case needs to be clarified is "gcc.dg/tree-prof/update-loopch.c". When > GCC removing a basic block, it checks profile information by calling > check_bb_profile after redirecting incoming edges of the bb. This certainly > results in warnings about invalid profile information and causes the case to > fail. I will send a patch to skip checking profile information for a > removing basic block in stage 1 if it sounds reasonable. For now I just > twisted the case itself. > > Bootstrap and tested on x86_64 and arm_a15. > > Is it OK? > > > 2014-02-25 Bin Cheng <bin.cheng@arm.com> > > PR target/60280 > * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop > preheaders and latches only if requested. Fix latch if it > is removed. > * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set > LOOPS_HAVE_PREHEADERS. > > gcc/testsuite/ChangeLog > 2014-02-25 Bin Cheng <bin.cheng@arm.com> > > PR target/60280 > * gnat.dg/renaming5.adb: Change to two expected gotos. > * gcc.dg/tree-ssa/pr21559.c: Change back to three expected > jump threads. > * gcc.dg/tree-prof/update-loopch.c: Check two "Invalid sum" > messages for removed basic block. > * gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string. > * gcc.dg/tree-ssa/ivopt_2.c: Ditto. > * gcc.dg/tree-ssa/ivopt_3.c: Ditto. > * gcc.dg/tree-ssa/ivopt_4.c: Ditto. Do you need to also update gcc.dg/tree-ssa/ssa-dom-thread-4.c, at least for logical_op_short_circuit targets? (There's a nice analysis comment there by Richard S which may also have to be updated.) This caused a regression for logical_op_short_circuit targets, I entered PR60363 for convenience. brgds, H-P
Sorry, I didn't test it against logical_op_short_circuit target. I will look into this PR. Thanks, bin On Fri, Feb 28, 2014 at 9:34 AM, Hans-Peter Nilsson <hp@bitrange.com> wrote: > On Tue, 25 Feb 2014, bin.cheng wrote: > >> Hi, >> This patch is to fix regression reported in PR60280 by removing forward loop >> headers/latches in cfg cleanup if possible. Several tests are broken by >> this change since cfg cleanup is shared by all optimizers. Some tests has >> already been fixed by recent patches, I went through and fixed the others. >> One case needs to be clarified is "gcc.dg/tree-prof/update-loopch.c". When >> GCC removing a basic block, it checks profile information by calling >> check_bb_profile after redirecting incoming edges of the bb. This certainly >> results in warnings about invalid profile information and causes the case to >> fail. I will send a patch to skip checking profile information for a >> removing basic block in stage 1 if it sounds reasonable. For now I just >> twisted the case itself. >> >> Bootstrap and tested on x86_64 and arm_a15. >> >> Is it OK? >> >> >> 2014-02-25 Bin Cheng <bin.cheng@arm.com> >> >> PR target/60280 >> * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop >> preheaders and latches only if requested. Fix latch if it >> is removed. >> * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set >> LOOPS_HAVE_PREHEADERS. >> >> gcc/testsuite/ChangeLog >> 2014-02-25 Bin Cheng <bin.cheng@arm.com> >> >> PR target/60280 >> * gnat.dg/renaming5.adb: Change to two expected gotos. >> * gcc.dg/tree-ssa/pr21559.c: Change back to three expected >> jump threads. >> * gcc.dg/tree-prof/update-loopch.c: Check two "Invalid sum" >> messages for removed basic block. >> * gcc.dg/tree-ssa/ivopt_1.c: Fix unreliable scanning string. >> * gcc.dg/tree-ssa/ivopt_2.c: Ditto. >> * gcc.dg/tree-ssa/ivopt_3.c: Ditto. >> * gcc.dg/tree-ssa/ivopt_4.c: Ditto. > > Do you need to also update gcc.dg/tree-ssa/ssa-dom-thread-4.c, > at least for logical_op_short_circuit targets? > (There's a nice analysis comment there by Richard S which may > also have to be updated.) > > This caused a regression for logical_op_short_circuit targets, I > entered PR60363 for convenience. > > brgds, H-P
Index: gcc/tree-ssa-dom.c =================================================================== --- gcc/tree-ssa-dom.c (revision 207938) +++ gcc/tree-ssa-dom.c (working copy) @@ -849,9 +849,15 @@ tree_ssa_dominator_optimize (void) /* We need to know loop structures in order to avoid destroying them in jump threading. Note that we still can e.g. thread through loop headers to an exit edge, or through loop header to the loop body, assuming - that we update the loop info. */ - loop_optimizer_init (LOOPS_HAVE_SIMPLE_LATCHES); + that we update the loop info. + TODO: We don't need to set LOOPS_HAVE_PREHEADERS generally, but due + to several overly conservative bail-outs in jump threading, case + gcc.dg/tree-ssa/pr21417.c can't be threaded if loop preheader is + missing. We should improve jump threading in future then + LOOPS_HAVE_PREHEADERS won't be needed here. */ + loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES); + /* Initialize the value-handle array. */ threadedge_initialize_values (); Index: gcc/tree-cfgcleanup.c =================================================================== --- gcc/tree-cfgcleanup.c (revision 207938) +++ gcc/tree-cfgcleanup.c (working copy) @@ -308,14 +308,24 @@ tree_forwarder_block_p (basic_block bb, bool phi_w if (current_loops) { basic_block dest; - /* Protect loop latches, headers and preheaders. */ + /* Protect loop headers. */ if (bb->loop_father->header == bb) return false; + dest = EDGE_SUCC (bb, 0)->dest; + /* Protect loop preheaders and latches if requested. */ + if (dest->loop_father->header == dest) + { + if (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS) + && bb->loop_father->header != dest) + return false; - if (dest->loop_father->header == dest) - return false; + if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES) + && bb->loop_father->header == dest) + return false; + } } + return true; } @@ -497,6 +507,9 @@ remove_forwarder_block (basic_block bb) set_immediate_dominator (CDI_DOMINATORS, dest, dom); } + if (current_loops && bb->loop_father->latch == bb) + bb->loop_father->latch = dest; + /* And kill the forwarder block. */ delete_basic_block (bb); Index: gcc/testsuite/gnat.dg/renaming5.adb =================================================================== --- gcc/testsuite/gnat.dg/renaming5.adb (revision 207938) +++ gcc/testsuite/gnat.dg/renaming5.adb (working copy) @@ -26,5 +26,5 @@ package body Renaming5 is end Renaming5; --- { dg-final { scan-tree-dump-times "goto" 3 "optimized" } } +-- { dg-final { scan-tree-dump-times "goto" 2 "optimized" } } -- { dg-final { cleanup-tree-dump "optimized" } } Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c (revision 207938) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_2.c (working copy) @@ -13,5 +13,5 @@ void foo (int i_width, TYPE dst, TYPE src1, TYPE s } } -/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "PHI <.*ivtmp" 1 "ivopts"} } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/pr21559.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/pr21559.c (revision 207938) +++ gcc/testsuite/gcc.dg/tree-ssa/pr21559.c (working copy) @@ -37,7 +37,7 @@ void foo (void) /* Second, we should thread the edge out of the loop via the break statement. We also realize that the final bytes == 0 test is useless, and thread over it. */ -/* { dg-final { scan-tree-dump-times "Threaded jump" 2 "vrp1" } } */ +/* { dg-final { scan-tree-dump-times "Threaded jump" 3 "vrp1" } } */ /* { dg-final { cleanup-tree-dump "vrp1" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c (revision 207938) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_4.c (working copy) @@ -15,5 +15,5 @@ void foo (int i_width, TYPE dst, TYPE src1, TYPE s } } -/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "PHI <.*ivtmp" 1 "ivopts"} } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c (revision 207938) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_1.c (working copy) @@ -14,5 +14,5 @@ void foo (int i_width, TYPE dst, TYPE src1, TYPE s } -/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "PHI <.*ivtmp" 1 "ivopts"} } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c (revision 207938) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_3.c (working copy) @@ -14,7 +14,7 @@ void foo (int i_width, char* dst, char* src1, char src1+=sizeof(TYPE); src2+=sizeof(TYPE); } -} +} -/* { dg-final { scan-tree-dump-times "PHI <ivtmp" 1 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "PHI <.*ivtmp" 1 "ivopts"} } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/testsuite/gcc.dg/tree-prof/update-loopch.c =================================================================== --- gcc/testsuite/gcc.dg/tree-prof/update-loopch.c (revision 207938) +++ gcc/testsuite/gcc.dg/tree-prof/update-loopch.c (working copy) @@ -16,6 +16,7 @@ main () edge. */ /* { dg-final-use { scan-ipa-dump "loop depth 1, count 33334" "profile"} } */ /* { dg-final-use { scan-tree-dump "loop depth 1, count 33332" "optimized"} } */ -/* { dg-final-use { scan-tree-dump-not "Invalid sum" "optimized"} } */ +/* { dg-final-use { scan-tree-dump-times "Removing basic block \[^\r\n\]*\[\\r\\n\]+\[^\r\n\]*\[\\r\\n\]+Invalid sum of\[^\r\n\]*\[\\r\\n\]+Invalid sum of" 1 "optimized"} } */ +/* { dg-final-use { scan-tree-dump-times "Invalid sum of" 2 "optimized"} } */ /* { dg-final-use { cleanup-ipa-dump "profile" } } */ /* { dg-final-use { cleanup-tree-dump "optimized" } } */