Message ID | 0623b8a3-f772-1f81-f029-611a186aa302@redhat.com |
---|---|
State | New |
Headers | show |
On Wed, Jan 4, 2017 at 6:31 AM, Jeff Law <law@redhat.com> wrote: > > So as noted in the BZ comments the jump threading code has code that detects > when a jump threading path wants to cross multiple loop headers and > truncates the jump threading path in that case. > > What we should have done instead is invalidate the cached loop information. > > Additionally, this BZ shows that just looking at loop headers is not > sufficient -- we might cross from a reducible to an irreducible region which > is equivalent to crossing into another loop in that we need to invalidate > the cached loop iteration information. > > What's so damn funny here is that eventually we take nested loops and > irreducible regions, thread various edges and end up with a nice natural > loop and no irreducible regions in the end :-) But the cached iteration > information is still bogus. > > Anyway, this patch corrects both issues. It treats moving between an > reducible and irreducible region as crossing a loop header and it > invalidates the cached iteration information rather than truncating the jump > thread path. > > Bootstrapped and regression tested on x86_64-linux-gnu. That compiler was > also used to build all the configurations in config-list.mk. > > Installing on the trunk. I could be convinced to install on the gcc-6 > branch as well since it's affected by the same problem. > > Jeff > > > commit 93e3964a4664350446eefe786e3b73eb41d99036 > Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Wed Jan 4 05:31:23 2017 +0000 > > PR tree-optimizatin/78856 > * tree-ssa-threadupdate.c: Include tree-vectorizer.h. > (mark_threaded_blocks): Remove code to truncate thread paths that > cross multiple loop headers. Instead invalidate the cached loop > iteration information and handle case of a thread path walking > into an irreducible region. > > PR tree-optimization/78856 > * gcc.c-torture/execute/pr78856.c: New test. > > git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244045 > 138bc75d-0d04-0410-961f-82ee72b054a4 > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 3114e02..6b2888f 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,12 @@ > +2017-01-03 Jeff Law <law@redhat.com> > + > + PR tree-optimizatin/78856 > + * tree-ssa-threadupdate.c: Include tree-vectorizer.h. > + (mark_threaded_blocks): Remove code to truncate thread paths that > + cross multiple loop headers. Instead invalidate the cached loop > + iteration information and handle case of a thread path walking > + into an irreducible region. > + > 2016-12-30 Michael Meissner <meissner@linux.vnet.ibm.com> > > PR target/78900 > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index cd2a065..cadfbc9 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,8 @@ > +2017-01-03 Jeff Law <law@redhat.com> > + > + PR tree-optimization/78856 > + * gcc.c-torture/execute/pr78856.c: New test. > + > 2017-01-03 Michael Meissner <meissner@linux.vnet.ibm.com> > > PR target/78953 > diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78856.c > b/gcc/testsuite/gcc.c-torture/execute/pr78856.c > new file mode 100644 > index 0000000..80f2317 > --- /dev/null > +++ b/gcc/testsuite/gcc.c-torture/execute/pr78856.c > @@ -0,0 +1,25 @@ > +extern void exit (int); > + > +int a, b, c, d, e, f[3]; > + > +int main() > +{ > + while (d) > + while (1) > + ; > + int g = 0, h, i = 0; > + for (; g < 21; g += 9) > + { > + int j = 1; > + for (h = 0; h < 3; h++) > + f[h] = 1; > + for (; j < 10; j++) { > + d = i && (b ? 0 : c); > + i = 1; > + if (g) > + a = e; > + } > + } > + exit (0); > +} > + > diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c > index adbb6e0..2da93a8 100644 > --- a/gcc/tree-ssa-threadupdate.c > +++ b/gcc/tree-ssa-threadupdate.c > @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. If not see > #include "cfgloop.h" > #include "dbgcnt.h" > #include "tree-cfg.h" > +#include "tree-vectorizer.h" > > /* Given a block B, update the CFG and SSA graph to reflect redirecting > one or more in-edges to B to instead reach the destination of an > @@ -2084,10 +2085,8 @@ mark_threaded_blocks (bitmap threaded_blocks) > /* Look for jump threading paths which cross multiple loop headers. > > The code to thread through loop headers will change the CFG in ways > - that break assumptions made by the loop optimization code. > - > - We don't want to blindly cancel the requests. We can instead do > better > - by trimming off the end of the jump thread path. */ > + that invalidate the cached loop iteration information. So we must > + detect that case and wipe the cached information. */ > EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi) > { > basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i); > @@ -2102,26 +2101,16 @@ mark_threaded_blocks (bitmap threaded_blocks) > i++) > { > basic_block dest = (*path)[i]->e->dest; > + basic_block src = (*path)[i]->e->src; > crossed_headers += (dest == dest->loop_father->header); > + /* If we step from a block outside an irreducible region > + to a block inside an irreducible region, then we have > + crossed into a loop. */ > + crossed_headers += ((src->flags & BB_IRREDUCIBLE_LOOP) > + != (dest->flags & > BB_IRREDUCIBLE_LOOP)); I'm not sure about this one -- if dest is inside an irreducible region then dest->loop_father will be the containing loop which may be unrelated (just crossed). Either we miss handling of crossed loop headers or whether dest is inside an irreducible region doesn't matter. So, don't we want to check only whether we are src is inside a irreducible region and dest is not? And in this case we've crossed a loop header anyway... > if (crossed_headers > 1) > { > - /* Trim from entry I onwards. */ > - for (unsigned int j = i; j < path->length (); j++) > - delete (*path)[j]; > - path->truncate (i); > - > - /* Now that we've truncated the path, make sure > - what's left is still valid. We need at least > - two edges on the path and the last edge can not > - be a joiner. This should never happen, but let's > - be safe. */ > - if (path->length () < 2 > - || (path->last ()->type > - == EDGE_COPY_SRC_JOINER_BLOCK)) > - { > - delete_jump_thread_path (path); > - e->aux = NULL; > - } > + vect_free_loop_info_assumptions (dest->loop_father); > break; > } > } >
On 01/04/2017 05:25 AM, Richard Biener wrote: > On Wed, Jan 4, 2017 at 6:31 AM, Jeff Law <law@redhat.com> wrote: >> >> So as noted in the BZ comments the jump threading code has code that detects >> when a jump threading path wants to cross multiple loop headers and >> truncates the jump threading path in that case. >> >> What we should have done instead is invalidate the cached loop information. >> >> Additionally, this BZ shows that just looking at loop headers is not >> sufficient -- we might cross from a reducible to an irreducible region which >> is equivalent to crossing into another loop in that we need to invalidate >> the cached loop iteration information. >> >> What's so damn funny here is that eventually we take nested loops and >> irreducible regions, thread various edges and end up with a nice natural >> loop and no irreducible regions in the end :-) But the cached iteration >> information is still bogus. >> >> Anyway, this patch corrects both issues. It treats moving between an >> reducible and irreducible region as crossing a loop header and it >> invalidates the cached iteration information rather than truncating the jump >> thread path. >> >> Bootstrapped and regression tested on x86_64-linux-gnu. That compiler was >> also used to build all the configurations in config-list.mk. >> >> Installing on the trunk. I could be convinced to install on the gcc-6 >> branch as well since it's affected by the same problem. >> >> Jeff >> >> >> commit 93e3964a4664350446eefe786e3b73eb41d99036 >> Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4> >> Date: Wed Jan 4 05:31:23 2017 +0000 >> >> PR tree-optimizatin/78856 >> * tree-ssa-threadupdate.c: Include tree-vectorizer.h. >> (mark_threaded_blocks): Remove code to truncate thread paths that >> cross multiple loop headers. Instead invalidate the cached loop >> iteration information and handle case of a thread path walking >> into an irreducible region. >> >> PR tree-optimization/78856 >> * gcc.c-torture/execute/pr78856.c: New test. >> >> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244045 >> 138bc75d-0d04-0410-961f-82ee72b054a4 >> >> diff --git a/gcc/ChangeLog b/gcc/ChangeLog >> index 3114e02..6b2888f 100644 >> --- a/gcc/ChangeLog >> +++ b/gcc/ChangeLog >> @@ -1,3 +1,12 @@ >> +2017-01-03 Jeff Law <law@redhat.com> >> + >> + PR tree-optimizatin/78856 >> + * tree-ssa-threadupdate.c: Include tree-vectorizer.h. >> + (mark_threaded_blocks): Remove code to truncate thread paths that >> + cross multiple loop headers. Instead invalidate the cached loop >> + iteration information and handle case of a thread path walking >> + into an irreducible region. >> + >> 2016-12-30 Michael Meissner <meissner@linux.vnet.ibm.com> >> >> PR target/78900 >> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog >> index cd2a065..cadfbc9 100644 >> --- a/gcc/testsuite/ChangeLog >> +++ b/gcc/testsuite/ChangeLog >> @@ -1,3 +1,8 @@ >> +2017-01-03 Jeff Law <law@redhat.com> >> + >> + PR tree-optimization/78856 >> + * gcc.c-torture/execute/pr78856.c: New test. >> + >> 2017-01-03 Michael Meissner <meissner@linux.vnet.ibm.com> >> >> PR target/78953 >> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78856.c >> b/gcc/testsuite/gcc.c-torture/execute/pr78856.c >> new file mode 100644 >> index 0000000..80f2317 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.c-torture/execute/pr78856.c >> @@ -0,0 +1,25 @@ >> +extern void exit (int); >> + >> +int a, b, c, d, e, f[3]; >> + >> +int main() >> +{ >> + while (d) >> + while (1) >> + ; >> + int g = 0, h, i = 0; >> + for (; g < 21; g += 9) >> + { >> + int j = 1; >> + for (h = 0; h < 3; h++) >> + f[h] = 1; >> + for (; j < 10; j++) { >> + d = i && (b ? 0 : c); >> + i = 1; >> + if (g) >> + a = e; >> + } >> + } >> + exit (0); >> +} >> + >> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c >> index adbb6e0..2da93a8 100644 >> --- a/gcc/tree-ssa-threadupdate.c >> +++ b/gcc/tree-ssa-threadupdate.c >> @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. If not see >> #include "cfgloop.h" >> #include "dbgcnt.h" >> #include "tree-cfg.h" >> +#include "tree-vectorizer.h" >> >> /* Given a block B, update the CFG and SSA graph to reflect redirecting >> one or more in-edges to B to instead reach the destination of an >> @@ -2084,10 +2085,8 @@ mark_threaded_blocks (bitmap threaded_blocks) >> /* Look for jump threading paths which cross multiple loop headers. >> >> The code to thread through loop headers will change the CFG in ways >> - that break assumptions made by the loop optimization code. >> - >> - We don't want to blindly cancel the requests. We can instead do >> better >> - by trimming off the end of the jump thread path. */ >> + that invalidate the cached loop iteration information. So we must >> + detect that case and wipe the cached information. */ >> EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi) >> { >> basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i); >> @@ -2102,26 +2101,16 @@ mark_threaded_blocks (bitmap threaded_blocks) >> i++) >> { >> basic_block dest = (*path)[i]->e->dest; >> + basic_block src = (*path)[i]->e->src; >> crossed_headers += (dest == dest->loop_father->header); >> + /* If we step from a block outside an irreducible region >> + to a block inside an irreducible region, then we have >> + crossed into a loop. */ >> + crossed_headers += ((src->flags & BB_IRREDUCIBLE_LOOP) >> + != (dest->flags & >> BB_IRREDUCIBLE_LOOP)); > > I'm not sure about this one -- if dest is inside an irreducible region then > dest->loop_father will be the containing loop which may be unrelated > (just crossed). Either we miss handling of crossed loop headers or > whether dest is inside an irreducible region doesn't matter. So imagine where we have an irreducible region that lives within a natural loop. We have a jump thread from outside the natural loop (and thus crosses the loop header) which targets a block inside the irreducible region. As we walk the jump threading path, we first set crossed_headers to one as we walk from outside the loop to the loop header. We then increment it again as we walk into the irreducible region. It may be the case that this can be relaxed, but I'd rather be safe than sorry at this stage in the game. Jeff
On Tue, Jan 10, 2017 at 8:16 PM, Jeff Law <law@redhat.com> wrote: > On 01/04/2017 05:25 AM, Richard Biener wrote: >> >> On Wed, Jan 4, 2017 at 6:31 AM, Jeff Law <law@redhat.com> wrote: >>> >>> >>> So as noted in the BZ comments the jump threading code has code that >>> detects >>> when a jump threading path wants to cross multiple loop headers and >>> truncates the jump threading path in that case. >>> >>> What we should have done instead is invalidate the cached loop >>> information. >>> >>> Additionally, this BZ shows that just looking at loop headers is not >>> sufficient -- we might cross from a reducible to an irreducible region >>> which >>> is equivalent to crossing into another loop in that we need to invalidate >>> the cached loop iteration information. >>> >>> What's so damn funny here is that eventually we take nested loops and >>> irreducible regions, thread various edges and end up with a nice natural >>> loop and no irreducible regions in the end :-) But the cached iteration >>> information is still bogus. >>> >>> Anyway, this patch corrects both issues. It treats moving between an >>> reducible and irreducible region as crossing a loop header and it >>> invalidates the cached iteration information rather than truncating the >>> jump >>> thread path. >>> >>> Bootstrapped and regression tested on x86_64-linux-gnu. That compiler >>> was >>> also used to build all the configurations in config-list.mk. >>> >>> Installing on the trunk. I could be convinced to install on the gcc-6 >>> branch as well since it's affected by the same problem. >>> >>> Jeff >>> >>> >>> commit 93e3964a4664350446eefe786e3b73eb41d99036 >>> Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4> >>> Date: Wed Jan 4 05:31:23 2017 +0000 >>> >>> PR tree-optimizatin/78856 >>> * tree-ssa-threadupdate.c: Include tree-vectorizer.h. >>> (mark_threaded_blocks): Remove code to truncate thread paths that >>> cross multiple loop headers. Instead invalidate the cached loop >>> iteration information and handle case of a thread path walking >>> into an irreducible region. >>> >>> PR tree-optimization/78856 >>> * gcc.c-torture/execute/pr78856.c: New test. >>> >>> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244045 >>> 138bc75d-0d04-0410-961f-82ee72b054a4 >>> >>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog >>> index 3114e02..6b2888f 100644 >>> --- a/gcc/ChangeLog >>> +++ b/gcc/ChangeLog >>> @@ -1,3 +1,12 @@ >>> +2017-01-03 Jeff Law <law@redhat.com> >>> + >>> + PR tree-optimizatin/78856 >>> + * tree-ssa-threadupdate.c: Include tree-vectorizer.h. >>> + (mark_threaded_blocks): Remove code to truncate thread paths that >>> + cross multiple loop headers. Instead invalidate the cached loop >>> + iteration information and handle case of a thread path walking >>> + into an irreducible region. >>> + >>> 2016-12-30 Michael Meissner <meissner@linux.vnet.ibm.com> >>> >>> PR target/78900 >>> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog >>> index cd2a065..cadfbc9 100644 >>> --- a/gcc/testsuite/ChangeLog >>> +++ b/gcc/testsuite/ChangeLog >>> @@ -1,3 +1,8 @@ >>> +2017-01-03 Jeff Law <law@redhat.com> >>> + >>> + PR tree-optimization/78856 >>> + * gcc.c-torture/execute/pr78856.c: New test. >>> + >>> 2017-01-03 Michael Meissner <meissner@linux.vnet.ibm.com> >>> >>> PR target/78953 >>> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78856.c >>> b/gcc/testsuite/gcc.c-torture/execute/pr78856.c >>> new file mode 100644 >>> index 0000000..80f2317 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.c-torture/execute/pr78856.c >>> @@ -0,0 +1,25 @@ >>> +extern void exit (int); >>> + >>> +int a, b, c, d, e, f[3]; >>> + >>> +int main() >>> +{ >>> + while (d) >>> + while (1) >>> + ; >>> + int g = 0, h, i = 0; >>> + for (; g < 21; g += 9) >>> + { >>> + int j = 1; >>> + for (h = 0; h < 3; h++) >>> + f[h] = 1; >>> + for (; j < 10; j++) { >>> + d = i && (b ? 0 : c); >>> + i = 1; >>> + if (g) >>> + a = e; >>> + } >>> + } >>> + exit (0); >>> +} >>> + >>> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c >>> index adbb6e0..2da93a8 100644 >>> --- a/gcc/tree-ssa-threadupdate.c >>> +++ b/gcc/tree-ssa-threadupdate.c >>> @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. If not see >>> #include "cfgloop.h" >>> #include "dbgcnt.h" >>> #include "tree-cfg.h" >>> +#include "tree-vectorizer.h" >>> >>> /* Given a block B, update the CFG and SSA graph to reflect redirecting >>> one or more in-edges to B to instead reach the destination of an >>> @@ -2084,10 +2085,8 @@ mark_threaded_blocks (bitmap threaded_blocks) >>> /* Look for jump threading paths which cross multiple loop headers. >>> >>> The code to thread through loop headers will change the CFG in ways >>> - that break assumptions made by the loop optimization code. >>> - >>> - We don't want to blindly cancel the requests. We can instead do >>> better >>> - by trimming off the end of the jump thread path. */ >>> + that invalidate the cached loop iteration information. So we must >>> + detect that case and wipe the cached information. */ >>> EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi) >>> { >>> basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i); >>> @@ -2102,26 +2101,16 @@ mark_threaded_blocks (bitmap threaded_blocks) >>> i++) >>> { >>> basic_block dest = (*path)[i]->e->dest; >>> + basic_block src = (*path)[i]->e->src; >>> crossed_headers += (dest == dest->loop_father->header); >>> + /* If we step from a block outside an irreducible >>> region >>> + to a block inside an irreducible region, then we >>> have >>> + crossed into a loop. */ >>> + crossed_headers += ((src->flags & BB_IRREDUCIBLE_LOOP) >>> + != (dest->flags & >>> BB_IRREDUCIBLE_LOOP)); >> >> >> I'm not sure about this one -- if dest is inside an irreducible region >> then >> dest->loop_father will be the containing loop which may be unrelated >> (just crossed). Either we miss handling of crossed loop headers or >> whether dest is inside an irreducible region doesn't matter. > > So imagine where we have an irreducible region that lives within a natural > loop. > > We have a jump thread from outside the natural loop (and thus crosses the > loop header) which targets a block inside the irreducible region. > > As we walk the jump threading path, we first set crossed_headers to one as > we walk from outside the loop to the loop header. We then increment it > again as we walk into the irreducible region. > > It may be the case that this can be relaxed, but I'd rather be safe than > sorry at this stage in the game. Sure, I was pointing out inconsistencies. Quoting the updated code: vec<jump_thread_edge *> *path = THREAD_PATH (e); for (unsigned int i = 0, crossed_headers = 0; i < path->length (); i++) { basic_block dest = (*path)[i]->e->dest; basic_block src = (*path)[i]->e->src; crossed_headers += (dest == dest->loop_father->header); /* If we step from a block outside an irreducible region to a block inside an irreducible region, then we have crossed into a loop. */ crossed_headers += ((src->flags & BB_IRREDUCIBLE_LOOP) != (dest->flags & BB_IRREDUCIBLE_LOOP)); if (crossed_headers > 1) { vect_free_loop_info_assumptions (dest->loop_father); break; } } assuming THREAD_PATH covers all edges in a path then assume a thread path threads from outside a two-level nested loop into the innermost loop then the above code will correctly detect we cross the outer loop header and reset its loop assumptions. But it will fail to reset assumptions for the 2nd header crossing into the inner loop. And as we can just reset assumptions for loops (not irreducible regions) I would have expected sth like for (unsigned int i = 0, crossed_headers = 0; i < path->length (); i++) { basic_block dest = (*path)[i]->e->dest; if (dest == dest->loop_father->header) vect_free_loop_info_assumptions (dest->loop_father); } or even walking the path backwards as I think only the path ultimate destination loop father info needs to be scrapped(?). So for (unsigned int i = 0, crossed_headers = 0; i < path->length (); i++) { basic_block dest = (*path)[i]->e->dest; basic_block src = (*path)[i]->e->src; crossed_headers += (dest == dest->loop_father->header); /* If we step from a block outside an irreducible region to a block inside an irreducible region, then we have crossed into a loop. */ crossed_headers += ((src->flags & BB_IRREDUCIBLE_LOOP) != (dest->flags & BB_IRREDUCIBLE_LOOP)); if (crossed_headers > 1) { vect_free_loop_info_assumptions ((*path)[path->length () - 1]->e->dest->loop_father); break; } } or walking the path backwards? Thanks, Richard. > Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 3114e02..6b2888f 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,12 @@ +2017-01-03 Jeff Law <law@redhat.com> + + PR tree-optimizatin/78856 + * tree-ssa-threadupdate.c: Include tree-vectorizer.h. + (mark_threaded_blocks): Remove code to truncate thread paths that + cross multiple loop headers. Instead invalidate the cached loop + iteration information and handle case of a thread path walking + into an irreducible region. + 2016-12-30 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/78900 diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index cd2a065..cadfbc9 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2017-01-03 Jeff Law <law@redhat.com> + + PR tree-optimization/78856 + * gcc.c-torture/execute/pr78856.c: New test. + 2017-01-03 Michael Meissner <meissner@linux.vnet.ibm.com> PR target/78953 diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78856.c b/gcc/testsuite/gcc.c-torture/execute/pr78856.c new file mode 100644 index 0000000..80f2317 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/pr78856.c @@ -0,0 +1,25 @@ +extern void exit (int); + +int a, b, c, d, e, f[3]; + +int main() +{ + while (d) + while (1) + ; + int g = 0, h, i = 0; + for (; g < 21; g += 9) + { + int j = 1; + for (h = 0; h < 3; h++) + f[h] = 1; + for (; j < 10; j++) { + d = i && (b ? 0 : c); + i = 1; + if (g) + a = e; + } + } + exit (0); +} + diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c index adbb6e0..2da93a8 100644 --- a/gcc/tree-ssa-threadupdate.c +++ b/gcc/tree-ssa-threadupdate.c @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3. If not see #include "cfgloop.h" #include "dbgcnt.h" #include "tree-cfg.h" +#include "tree-vectorizer.h" /* Given a block B, update the CFG and SSA graph to reflect redirecting one or more in-edges to B to instead reach the destination of an @@ -2084,10 +2085,8 @@ mark_threaded_blocks (bitmap threaded_blocks) /* Look for jump threading paths which cross multiple loop headers. The code to thread through loop headers will change the CFG in ways - that break assumptions made by the loop optimization code. - - We don't want to blindly cancel the requests. We can instead do better - by trimming off the end of the jump thread path. */ + that invalidate the cached loop iteration information. So we must + detect that case and wipe the cached information. */ EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi) { basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i); @@ -2102,26 +2101,16 @@ mark_threaded_blocks (bitmap threaded_blocks) i++) { basic_block dest = (*path)[i]->e->dest; + basic_block src = (*path)[i]->e->src; crossed_headers += (dest == dest->loop_father->header); + /* If we step from a block outside an irreducible region + to a block inside an irreducible region, then we have + crossed into a loop. */ + crossed_headers += ((src->flags & BB_IRREDUCIBLE_LOOP) + != (dest->flags & BB_IRREDUCIBLE_LOOP)); if (crossed_headers > 1) { - /* Trim from entry I onwards. */ - for (unsigned int j = i; j < path->length (); j++) - delete (*path)[j]; - path->truncate (i); - - /* Now that we've truncated the path, make sure - what's left is still valid. We need at least - two edges on the path and the last edge can not - be a joiner. This should never happen, but let's - be safe. */ - if (path->length () < 2 - || (path->last ()->type - == EDGE_COPY_SRC_JOINER_BLOCK)) - { - delete_jump_thread_path (path); - e->aux = NULL; - } + vect_free_loop_info_assumptions (dest->loop_father); break; } }