diff mbox series

Mark irreducible regions in the backwards threader

Message ID s45n2713-41p7-7s17-4042-r5n26ppsq3@fhfr.qr
State New
Headers show
Series Mark irreducible regions in the backwards threader | expand

Commit Message

Richard Biener June 14, 2021, 2:04 p.m. UTC
The same issue that arises in PR100934 can happen in the backward
threader now (Aldy?) - we eventually run into mark_irreducible_loops
for non-FSM thread paths checking for paths crossing irreducible
region boundaries.

The following thus initializes irreducible regions and makes
irreducible region discovery handle multiple latches.

Bootstrap & regtest pending.

2021-06-14  Richard Biener  <rguenther@suse.de>

	* cfgloopanal.c (mark_irreducible_loops): Use a dominance
	check to identify loop latches.
	* cfgloop.c (verify_loop_structure): Likewise.
	* loop-init.c (apply_loop_flags): Allow marked irreducible
	regions even with multiple latches.
	* predict.c (rebuild_frequencies): Simplify.
	* tree-ssa-threadbackward.c (pass_thread_jumps::execute):
	Mark irreducible regions.
	(pass_early_thread_jumps::execute): Likewise.
	* tree-ssa-threadupdate.c
	(jump_thread_path_registry::mark_threaded_blocks): Assert
	we have irreducible regions marked.
---
 gcc/cfgloop.c                 | 14 ++++++++++----
 gcc/cfgloopanal.c             |  2 +-
 gcc/loop-init.c               |  3 ++-
 gcc/predict.c                 |  3 +--
 gcc/tree-ssa-threadbackward.c |  6 ++++--
 gcc/tree-ssa-threadupdate.c   |  2 ++
 6 files changed, 20 insertions(+), 10 deletions(-)

Comments

Aldy Hernandez June 15, 2021, 5:31 a.m. UTC | #1
On 6/14/21 4:04 PM, Richard Biener wrote:
> The same issue that arises in PR100934 can happen in the backward
> threader now (Aldy?) - we eventually run into mark_irreducible_loops
> for non-FSM thread paths checking for paths crossing irreducible
> region boundaries.

I haven't been following the above PR, so I may be missing something, 
but it seems like you would need to twiddle all users of the path 
registry, not just the backwards threader.

>   
>     /* Try to thread each block with more than one successor.  */
>     thread_jumps threader (/*speed_p=*/false);
> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
> index a86302be18e..387994fd882 100644
> --- a/gcc/tree-ssa-threadupdate.c
> +++ b/gcc/tree-ssa-threadupdate.c
> @@ -1917,6 +1917,8 @@ jump_thread_path_registry::mark_threaded_blocks (bitmap threaded_blocks)
>     edge e;
>     edge_iterator ei;
>   
> +  gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS));
> +
>     /* It is possible to have jump threads in which one is a subpath
>        of the other.  ie, (A, B), (B, C), (C, D) where B is a joiner
>        block and (B, C), (C, D) where no joiner block exists.
> 

If you're asserting this, don't you need to initialize DOM and VRP with 
LOOPS_MAY_HAVE_MARKED_IRREDUCIBLE_REGIONS as well?  They may use the 
registry via their jump_threader.

Hmmm...tree-vrp seems fine:

   loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);

because LOOPS_NORMAL includes LOOPS_MAY_HAVE_MARKED_IRREDUCIBLE_REGIONS.

But you may need some massaging in DOM:

   loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);

Aldy
Richard Biener June 15, 2021, 6:04 a.m. UTC | #2
On Tue, 15 Jun 2021, Aldy Hernandez wrote:

> 
> 
> On 6/14/21 4:04 PM, Richard Biener wrote:
> > The same issue that arises in PR100934 can happen in the backward
> > threader now (Aldy?) - we eventually run into mark_irreducible_loops
> > for non-FSM thread paths checking for paths crossing irreducible
> > region boundaries.
> 
> I haven't been following the above PR, so I may be missing something, but it
> seems like you would need to twiddle all users of the path registry, not just
> the backwards threader.

Yes, VRP already uses LOOPS_NORMAL, I fixed DOM with the patch for
PR100934.  That leaves the backwards threader, so after this change
all consumers should be fixed.

The question was more like if we'd ever run into the code in
mark_threaded_blocks that reads

  /* Look for jump threading paths which cross multiple loop headers.

     The code to thread through loop headers will change the CFG in ways
     that invalidate the cached loop iteration information.  So we must
     detect that case and wipe the cached information.  */
  EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
...

we're definitely executing mark_threaded_blocks but we've executed
and pruned all FSM threading paths before.

I suppose I should simply place the assert there and see what happens
when I don't fixup the backwards threader ...

Richard.

> >   
> >     /* Try to thread each block with more than one successor.  */
> >     thread_jumps threader (/*speed_p=*/false);
> > diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
> > index a86302be18e..387994fd882 100644
> > --- a/gcc/tree-ssa-threadupdate.c
> > +++ b/gcc/tree-ssa-threadupdate.c
> > @@ -1917,6 +1917,8 @@ jump_thread_path_registry::mark_threaded_blocks
> > (bitmap threaded_blocks)
> >     edge e;
> >     edge_iterator ei;
> >   +  gcc_assert (loops_state_satisfies_p
> > (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS));
> > +
> >     /* It is possible to have jump threads in which one is a subpath
> >        of the other.  ie, (A, B), (B, C), (C, D) where B is a joiner
> >        block and (B, C), (C, D) where no joiner block exists.
> > 
> 
> If you're asserting this, don't you need to initialize DOM and VRP with
> LOOPS_MAY_HAVE_MARKED_IRREDUCIBLE_REGIONS as well?  They may use the registry
> via their jump_threader.
> 
> Hmmm...tree-vrp seems fine:
> 
>   loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
> 
> because LOOPS_NORMAL includes LOOPS_MAY_HAVE_MARKED_IRREDUCIBLE_REGIONS.
> 
> But you may need some massaging in DOM:
> 
>   loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
> 
> Aldy
> 
> 
>
Richard Biener June 15, 2021, 8:27 a.m. UTC | #3
On Tue, 15 Jun 2021, Richard Biener wrote:

> On Tue, 15 Jun 2021, Aldy Hernandez wrote:
> 
> > 
> > 
> > On 6/14/21 4:04 PM, Richard Biener wrote:
> > > The same issue that arises in PR100934 can happen in the backward
> > > threader now (Aldy?) - we eventually run into mark_irreducible_loops
> > > for non-FSM thread paths checking for paths crossing irreducible
> > > region boundaries.
> > 
> > I haven't been following the above PR, so I may be missing something, but it
> > seems like you would need to twiddle all users of the path registry, not just
> > the backwards threader.
> 
> Yes, VRP already uses LOOPS_NORMAL, I fixed DOM with the patch for
> PR100934.  That leaves the backwards threader, so after this change
> all consumers should be fixed.
> 
> The question was more like if we'd ever run into the code in
> mark_threaded_blocks that reads
> 
>   /* Look for jump threading paths which cross multiple loop headers.
> 
>      The code to thread through loop headers will change the CFG in ways
>      that invalidate the cached loop iteration information.  So we must
>      detect that case and wipe the cached information.  */
>   EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
> ...
> 
> we're definitely executing mark_threaded_blocks but we've executed
> and pruned all FSM threading paths before.
> 
> I suppose I should simply place the assert there and see what happens
> when I don't fixup the backwards threader ...

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index a86302be18e..1155b3b9a8f 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2122,6 +2122,7 @@ jump_thread_path_registry::mark_threaded_blocks 
(bitmap threaded_blocks)
        {
          if (e->aux)
            {
+             gcc_assert (loops_state_satisfies_p 
(LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS));
              vec<jump_thread_edge *> *path = THREAD_PATH (e);
 
              for (unsigned int i = 0, crossed_headers = 0;

passes bootstrap and regtest, so I suppose the backwards threader
doesn't need the initialization right now.

I'll push this instead.

Richard.
Aldy Hernandez June 15, 2021, 9:33 a.m. UTC | #4
On 6/15/21 10:27 AM, Richard Biener wrote:
> On Tue, 15 Jun 2021, Richard Biener wrote:
> 
>> On Tue, 15 Jun 2021, Aldy Hernandez wrote:
>>
>>>
>>>
>>> On 6/14/21 4:04 PM, Richard Biener wrote:
>>>> The same issue that arises in PR100934 can happen in the backward
>>>> threader now (Aldy?) - we eventually run into mark_irreducible_loops
>>>> for non-FSM thread paths checking for paths crossing irreducible
>>>> region boundaries.
>>>
>>> I haven't been following the above PR, so I may be missing something, but it
>>> seems like you would need to twiddle all users of the path registry, not just
>>> the backwards threader.
>>
>> Yes, VRP already uses LOOPS_NORMAL, I fixed DOM with the patch for
>> PR100934.  That leaves the backwards threader, so after this change
>> all consumers should be fixed.
>>
>> The question was more like if we'd ever run into the code in
>> mark_threaded_blocks that reads
>>
>>    /* Look for jump threading paths which cross multiple loop headers.
>>
>>       The code to thread through loop headers will change the CFG in ways
>>       that invalidate the cached loop iteration information.  So we must
>>       detect that case and wipe the cached information.  */
>>    EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
>> ...
>>
>> we're definitely executing mark_threaded_blocks but we've executed
>> and pruned all FSM threading paths before.
>>
>> I suppose I should simply place the assert there and see what happens
>> when I don't fixup the backwards threader ...
> 
> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
> index a86302be18e..1155b3b9a8f 100644
> --- a/gcc/tree-ssa-threadupdate.c
> +++ b/gcc/tree-ssa-threadupdate.c
> @@ -2122,6 +2122,7 @@ jump_thread_path_registry::mark_threaded_blocks
> (bitmap threaded_blocks)
>          {
>            if (e->aux)
>              {
> +             gcc_assert (loops_state_satisfies_p
> (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS));
>                vec<jump_thread_edge *> *path = THREAD_PATH (e);
>   
>                for (unsigned int i = 0, crossed_headers = 0;
> 
> passes bootstrap and regtest, so I suppose the backwards threader
> doesn't need the initialization right now.
> 
> I'll push this instead.

Thanks.  I'll keep an eye out with my proposed changes.

Aldy
Jeff Law June 15, 2021, 4:39 p.m. UTC | #5
On 6/15/2021 12:04 AM, Richard Biener wrote:
> On Tue, 15 Jun 2021, Aldy Hernandez wrote:
>
>>
>> On 6/14/21 4:04 PM, Richard Biener wrote:
>>> The same issue that arises in PR100934 can happen in the backward
>>> threader now (Aldy?) - we eventually run into mark_irreducible_loops
>>> for non-FSM thread paths checking for paths crossing irreducible
>>> region boundaries.
>> I haven't been following the above PR, so I may be missing something, but it
>> seems like you would need to twiddle all users of the path registry, not just
>> the backwards threader.
> Yes, VRP already uses LOOPS_NORMAL, I fixed DOM with the patch for
> PR100934.  That leaves the backwards threader, so after this change
> all consumers should be fixed.
>
> The question was more like if we'd ever run into the code in
> mark_threaded_blocks that reads
>
>    /* Look for jump threading paths which cross multiple loop headers.
>
>       The code to thread through loop headers will change the CFG in ways
>       that invalidate the cached loop iteration information.  So we must
>       detect that case and wipe the cached information.  */
>    EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
> ...
>
> we're definitely executing mark_threaded_blocks but we've executed
> and pruned all FSM threading paths before.
>
> I suppose I should simply place the assert there and see what happens
> when I don't fixup the backwards threader ...
I don't think we'll hit that, at least not directly.

All the backwards threader/FSM bits are handled before we get there.   
Of course that may mean we need a similar test earlier.

The fact that the backwards threader/FSM bits are inside 
tree-ssa-threadupdate is a historical wart.  It uses the registry and 
that's about it.


Jeff
diff mbox series

Patch

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 4e227cd0891..f094538b9ff 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -1555,11 +1555,17 @@  verify_loop_structure (void)
 	  error ("loop %d%'s header does not belong directly to it", i);
 	  err = 1;
 	}
-      if (loops_state_satisfies_p (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS)
-	  && (loop_latch_edge (loop)->flags & EDGE_IRREDUCIBLE_LOOP))
+      if (loops_state_satisfies_p (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS))
 	{
-	  error ("loop %d%'s latch is marked as part of irreducible region", i);
-	  err = 1;
+	  edge_iterator ei;
+	  FOR_EACH_EDGE (e, ei, loop->header->preds)
+	    if (dominated_by_p (CDI_DOMINATORS, e->src, loop->header)
+		&& e->flags & EDGE_IRREDUCIBLE_LOOP)
+	      {
+		error ("loop %d%'s latch is marked as part of irreducible"
+		       " region", i);
+		err = 1;
+	      }
 	}
     }
 
diff --git a/gcc/cfgloopanal.c b/gcc/cfgloopanal.c
index d0eade3dd34..54426b54881 100644
--- a/gcc/cfgloopanal.c
+++ b/gcc/cfgloopanal.c
@@ -113,7 +113,7 @@  mark_irreducible_loops (void)
 
 	/* Ignore latch edges.  */
 	if (e->dest->loop_father->header == e->dest
-	    && e->dest->loop_father->latch == act)
+	    && dominated_by_p (CDI_DOMINATORS, act, e->dest))
 	  continue;
 
 	/* Edges inside a single loop should be left where they are.  Edges
diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 17f14075f62..1fde0ede441 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -48,7 +48,8 @@  apply_loop_flags (unsigned flags)
 	 not work).  However, we avoid modifying cfg, which some
 	 passes may want.  */
       gcc_assert ((flags & ~(LOOPS_MAY_HAVE_MULTIPLE_LATCHES
-			     | LOOPS_HAVE_RECORDED_EXITS)) == 0);
+			     | LOOPS_HAVE_RECORDED_EXITS
+			     | LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS)) == 0);
       loops_state_set (LOOPS_MAY_HAVE_MULTIPLE_LATCHES);
     }
   else
diff --git a/gcc/predict.c b/gcc/predict.c
index 5d0cae5c4a4..d751e6cecce 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -4287,8 +4287,7 @@  rebuild_frequencies (void)
 
   if (profile_status_for_fn (cfun) == PROFILE_GUESSED)
     {
-      loop_optimizer_init (0);
-      mark_irreducible_loops ();
+      loop_optimizer_init (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS);
       connect_infinite_loops_to_exit ();
       estimate_bb_frequencies (true);
       remove_fake_exit_edges ();
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 7dd8594e3d4..1a5deafc463 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -887,7 +887,8 @@  pass_thread_jumps::gate (function *fun ATTRIBUTE_UNUSED)
 unsigned int
 pass_thread_jumps::execute (function *fun)
 {
-  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES
+		       | LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS);
 
   /* Try to thread each block with more than one successor.  */
   thread_jumps threader;
@@ -948,7 +949,8 @@  pass_early_thread_jumps::gate (function *fun ATTRIBUTE_UNUSED)
 unsigned int
 pass_early_thread_jumps::execute (function *fun)
 {
-  loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+  loop_optimizer_init (AVOID_CFG_MODIFICATIONS
+		       | LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS);
 
   /* Try to thread each block with more than one successor.  */
   thread_jumps threader (/*speed_p=*/false);
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index a86302be18e..387994fd882 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -1917,6 +1917,8 @@  jump_thread_path_registry::mark_threaded_blocks (bitmap threaded_blocks)
   edge e;
   edge_iterator ei;
 
+  gcc_assert (loops_state_satisfies_p (LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS));
+
   /* It is possible to have jump threads in which one is a subpath
      of the other.  ie, (A, B), (B, C), (C, D) where B is a joiner
      block and (B, C), (C, D) where no joiner block exists.