diff mbox series

middle-end: Fix peeled vect loop IV values.

Message ID patch-18056-tamar@arm.com
State New
Headers show
Series middle-end: Fix peeled vect loop IV values. | expand

Commit Message

Tamar Christina Dec. 6, 2023, 4:03 a.m. UTC
Hi All,

While waiting for reviews I found this case where both loop exit needs to go to
epilogue loop, but there was an IV related variable that was used in the scalar
iteration as well.

vect_update_ivs_after_vectorizer then blew the value away and replaced it with
the value if it took the normal exit.

For these cases where we've peeled an a vector iteration, we should skip
vect_update_ivs_after_vectorizer since all exits are "alternate" exits.

For this to be correct we have peeling put the right LCSSA variables so
vectorable_live_operations takes care of it.

This is triggered by new testcases 79 and 80 in early break testsuite
and I'll merge this commit in the main one.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
	Put right LCSSA var for peeled vect loops.
	(vect_do_peeling): Skip vect_update_ivs_after_vectorizer.

--- inline copy of patch -- 
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 7d48502e2e46240553509dfa6d75fcab7fea36d3..bfdbeb7faaba29aad51c0561dace680c96759484 100644




--
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 7d48502e2e46240553509dfa6d75fcab7fea36d3..bfdbeb7faaba29aad51c0561dace680c96759484 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1668,6 +1668,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
       edge loop_entry = single_succ_edge (new_preheader);
       if (flow_loops)
 	{
+	  bool peeled_iters = single_pred (loop->latch) != loop_exit->src;
 	  /* Link through the main exit first.  */
 	  for (auto gsi_from = gsi_start_phis (loop->header),
 	       gsi_to = gsi_start_phis (new_loop->header);
@@ -1692,11 +1693,19 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
 		      continue;
 		    }
 		}
+	      /* If we have multiple exits and the vector loop is peeled then we
+		 need to use the value at start of loop.  */
+	      if (peeled_iters)
+		{
+		  tree tmp_arg = gimple_phi_result (from_phi);
+		  if (!new_phi_args.get (tmp_arg))
+		    new_arg = tmp_arg;
+		}
 
 	      tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
 	      gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
 
-	      /* Main loop exit should use the final iter value.  */
+	      /* Otherwise, main loop exit should use the final iter value.  */
 	      SET_PHI_ARG_DEF (lcssa_phi, loop_exit->dest_idx, new_arg);
 
 	      adjust_phi_and_debug_stmts (to_phi, loop_entry, new_res);
@@ -3394,9 +3403,13 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
       if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
 	update_e = single_succ_edge (e->dest);
 
-      /* Update the main exit.  */
-      vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
-					update_e);
+      /* If we have a peeled vector iteration, all exits are the same, leave it
+	 and so the main exit needs to be treated the same as the alternative
+	 exits in that we leave their updates to vectorizable_live_operations.
+	 */
+      if (!LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
+	vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
+					  update_e);
 
       if (skip_epilog || LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
 	{

Comments

Richard Biener Dec. 6, 2023, 8:09 a.m. UTC | #1
On Wed, 6 Dec 2023, Tamar Christina wrote:

> Hi All,
> 
> While waiting for reviews I found this case where both loop exit needs to go to
> epilogue loop, but there was an IV related variable that was used in the scalar
> iteration as well.
> 
> vect_update_ivs_after_vectorizer then blew the value away and replaced it with
> the value if it took the normal exit.
> 
> For these cases where we've peeled an a vector iteration, we should skip
> vect_update_ivs_after_vectorizer since all exits are "alternate" exits.
> 
> For this to be correct we have peeling put the right LCSSA variables so
> vectorable_live_operations takes care of it.
> 
> This is triggered by new testcases 79 and 80 in early break testsuite
> and I'll merge this commit in the main one.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> 	Put right LCSSA var for peeled vect loops.
> 	(vect_do_peeling): Skip vect_update_ivs_after_vectorizer.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index 7d48502e2e46240553509dfa6d75fcab7fea36d3..bfdbeb7faaba29aad51c0561dace680c96759484 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -1668,6 +1668,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
>        edge loop_entry = single_succ_edge (new_preheader);
>        if (flow_loops)
>  	{
> +	  bool peeled_iters = single_pred (loop->latch) != loop_exit->src;
>  	  /* Link through the main exit first.  */
>  	  for (auto gsi_from = gsi_start_phis (loop->header),
>  	       gsi_to = gsi_start_phis (new_loop->header);
> @@ -1692,11 +1693,19 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
>  		      continue;
>  		    }
>  		}
> +	      /* If we have multiple exits and the vector loop is peeled then we
> +		 need to use the value at start of loop.  */

This comment doesn't really match 'peeled_iters'?  Iff the main IV exit
source isn't loop->latch then won't we miscompute?  I realize the
complication is that slpeel_tree_duplicate_loop_to_edge_cfg is used from
elsewhere as well (so we can't check LOOP_VINFO_EARLY_BREAKS_VECT_PEELED).

> +	      if (peeled_iters)
> +		{
> +		  tree tmp_arg = gimple_phi_result (from_phi);
> +		  if (!new_phi_args.get (tmp_arg))
> +		    new_arg = tmp_arg;
> +		}
>  
>  	      tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
>  	      gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
>  
> -	      /* Main loop exit should use the final iter value.  */
> +	      /* Otherwise, main loop exit should use the final iter value.  */
>  	      SET_PHI_ARG_DEF (lcssa_phi, loop_exit->dest_idx, new_arg);
>  
>  	      
adjust_phi_and_debug_stmts 
(to_phi, loop_entry, new_res);
> @@ -3394,9 +3403,13 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
>        if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
>  	update_e = single_succ_edge (e->dest);
>  
> -      /* Update the main exit.  */
> -      vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> -					update_e);
> +      /* If we have a peeled vector iteration, all exits are the same, leave it
> +	 and so the main exit needs to be treated the same as the alternative
> +	 exits in that we leave their updates to vectorizable_live_operations.
> +	 */
> +      if (!LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
> +	vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> +					  update_e);

and now we don't update the main exit?  What's
LOOP_VINFO_EARLY_BREAKS_VECT_PEELED again vs.
LOOP_VINFO_EARLY_BREAKS?

>        if (skip_epilog || LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
>  	{
> 
> 
> 
> 
>
Tamar Christina Dec. 6, 2023, 8:47 a.m. UTC | #2
> > Hi All,
> >
> > While waiting for reviews I found this case where both loop exit needs to go to
> > epilogue loop, but there was an IV related variable that was used in the scalar
> > iteration as well.
> >
> > vect_update_ivs_after_vectorizer then blew the value away and replaced it with
> > the value if it took the normal exit.
> >
> > For these cases where we've peeled an a vector iteration, we should skip
> > vect_update_ivs_after_vectorizer since all exits are "alternate" exits.
> >
> > For this to be correct we have peeling put the right LCSSA variables so
> > vectorable_live_operations takes care of it.
> >
> > This is triggered by new testcases 79 and 80 in early break testsuite
> > and I'll merge this commit in the main one.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> > 	Put right LCSSA var for peeled vect loops.
> > 	(vect_do_peeling): Skip vect_update_ivs_after_vectorizer.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > index
> 7d48502e2e46240553509dfa6d75fcab7fea36d3..bfdbeb7faaba29aad51c0561d
> ace680c96759484 100644
> > --- a/gcc/tree-vect-loop-manip.cc
> > +++ b/gcc/tree-vect-loop-manip.cc
> > @@ -1668,6 +1668,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop
> *loop, edge loop_exit,
> >        edge loop_entry = single_succ_edge (new_preheader);
> >        if (flow_loops)
> >  	{
> > +	  bool peeled_iters = single_pred (loop->latch) != loop_exit->src;
> >  	  /* Link through the main exit first.  */
> >  	  for (auto gsi_from = gsi_start_phis (loop->header),
> >  	       gsi_to = gsi_start_phis (new_loop->header);
> > @@ -1692,11 +1693,19 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class
> loop *loop, edge loop_exit,
> >  		      continue;
> >  		    }
> >  		}
> > +	      /* If we have multiple exits and the vector loop is peeled then we
> > +		 need to use the value at start of loop.  */
> 
> This comment doesn't really match 'peeled_iters'?  Iff the main IV exit
> source isn't loop->latch then won't we miscompute?  I realize the
> complication is that slpeel_tree_duplicate_loop_to_edge_cfg is used from
> elsewhere as well (so we can't check LOOP_VINFO_EARLY_BREAKS_VECT_PEELED).
> 

No, because in both exits we restart the scalar iteration at the start of the last vector iteration.
Normally, the counted main exit would be updated by vect_iters_bound_vf - vf.  Which is the same
As the induction value should we get to the final iteration.

More on it in your question below.

> > +	      if (peeled_iters)
> > +		{
> > +		  tree tmp_arg = gimple_phi_result (from_phi);
> > +		  if (!new_phi_args.get (tmp_arg))
> > +		    new_arg = tmp_arg;
> > +		}
> >
> >  	      tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> >  	      gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
> >
> > -	      /* Main loop exit should use the final iter value.  */
> > +	      /* Otherwise, main loop exit should use the final iter value.  */
> >  	      SET_PHI_ARG_DEF (lcssa_phi, loop_exit->dest_idx, new_arg);
> >
> >
> adjust_phi_and_debug_stmts
> (to_phi, loop_entry, new_res);
> > @@ -3394,9 +3403,13 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> niters, tree nitersm1,
> >        if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> >  	update_e = single_succ_edge (e->dest);
> >
> > -      /* Update the main exit.  */
> > -      vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> > -					update_e);
> > +      /* If we have a peeled vector iteration, all exits are the same, leave it
> > +	 and so the main exit needs to be treated the same as the alternative
> > +	 exits in that we leave their updates to vectorizable_live_operations.
> > +	 */
> > +      if (!LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
> > +	vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> > +					  update_e);
> 
> and now we don't update the main exit?  What's
> LOOP_VINFO_EARLY_BREAKS_VECT_PEELED again vs.
> LOOP_VINFO_EARLY_BREAKS?

So LOOP_VINFO_EARLY_BREAKS essentially says that there are multiple exits
That we can vectorize.

LOOP_VINFO_EARLY_BREAKS_VECT_PEELED is saying that in this loop, we've
picked as the main exit not the loops latch connected exit.  This means when we
exit from the "main" exit in the final iteration we may still have side effects to
perform and so the final iteration should be restarted.

Similarly exiting from an early exit in this case also means having to restart the
Loop at the same place to apply any partial side-effects.  This is because in these
Cases the loop exits at n - 1 iterations, the IV is updated and checked before the
final exit, so taking an early exit means we still have to apply partial side-effects.

Regards,
Tamar
> 
> >        if (skip_epilog || LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> >  	{
> >
> >
> >
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
Tamar Christina Dec. 6, 2023, 8:54 a.m. UTC | #3
> -----Original Message-----
> From: Tamar Christina <Tamar.Christina@arm.com>
> Sent: Wednesday, December 6, 2023 8:48 AM
> To: Richard Biener <rguenther@suse.de>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jlaw@ventanamicro.com
> Subject: RE: [PATCH]middle-end: Fix peeled vect loop IV values.
> 
> > > Hi All,
> > >
> > > While waiting for reviews I found this case where both loop exit needs to go to
> > > epilogue loop, but there was an IV related variable that was used in the scalar
> > > iteration as well.
> > >
> > > vect_update_ivs_after_vectorizer then blew the value away and replaced it with
> > > the value if it took the normal exit.
> > >
> > > For these cases where we've peeled an a vector iteration, we should skip
> > > vect_update_ivs_after_vectorizer since all exits are "alternate" exits.
> > >
> > > For this to be correct we have peeling put the right LCSSA variables so
> > > vectorable_live_operations takes care of it.
> > >
> > > This is triggered by new testcases 79 and 80 in early break testsuite
> > > and I'll merge this commit in the main one.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > 	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> > > 	Put right LCSSA var for peeled vect loops.
> > > 	(vect_do_peeling): Skip vect_update_ivs_after_vectorizer.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > index
> >
> 7d48502e2e46240553509dfa6d75fcab7fea36d3..bfdbeb7faaba29aad51c0561d
> > ace680c96759484 100644
> > > --- a/gcc/tree-vect-loop-manip.cc
> > > +++ b/gcc/tree-vect-loop-manip.cc
> > > @@ -1668,6 +1668,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop
> > *loop, edge loop_exit,
> > >        edge loop_entry = single_succ_edge (new_preheader);
> > >        if (flow_loops)
> > >  	{
> > > +	  bool peeled_iters = single_pred (loop->latch) != loop_exit->src;
> > >  	  /* Link through the main exit first.  */
> > >  	  for (auto gsi_from = gsi_start_phis (loop->header),
> > >  	       gsi_to = gsi_start_phis (new_loop->header);
> > > @@ -1692,11 +1693,19 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class
> > loop *loop, edge loop_exit,
> > >  		      continue;
> > >  		    }
> > >  		}
> > > +	      /* If we have multiple exits and the vector loop is peeled then we
> > > +		 need to use the value at start of loop.  */
> >
> > This comment doesn't really match 'peeled_iters'?  Iff the main IV exit
> > source isn't loop->latch then won't we miscompute?  I realize the
> > complication is that slpeel_tree_duplicate_loop_to_edge_cfg is used from
> > elsewhere as well (so we can't check
> LOOP_VINFO_EARLY_BREAKS_VECT_PEELED).
> >
> 
> No, because in both exits we restart the scalar iteration at the start of the last
> vector iteration.
> Normally, the counted main exit would be updated by vect_iters_bound_vf - vf.
> Which is the same
> As the induction value should we get to the final iteration.
> 
> More on it in your question below.
> 
> > > +	      if (peeled_iters)
> > > +		{
> > > +		  tree tmp_arg = gimple_phi_result (from_phi);
> > > +		  if (!new_phi_args.get (tmp_arg))
> > > +		    new_arg = tmp_arg;
> > > +		}
> > >
> > >  	      tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> > >  	      gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
> > >
> > > -	      /* Main loop exit should use the final iter value.  */
> > > +	      /* Otherwise, main loop exit should use the final iter value.  */
> > >  	      SET_PHI_ARG_DEF (lcssa_phi, loop_exit->dest_idx, new_arg);
> > >
> > >
> > adjust_phi_and_debug_stmts
> > (to_phi, loop_entry, new_res);
> > > @@ -3394,9 +3403,13 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> > niters, tree nitersm1,
> > >        if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> > >  	update_e = single_succ_edge (e->dest);
> > >
> > > -      /* Update the main exit.  */
> > > -      vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> > > -					update_e);
> > > +      /* If we have a peeled vector iteration, all exits are the same, leave it
> > > +	 and so the main exit needs to be treated the same as the alternative
> > > +	 exits in that we leave their updates to vectorizable_live_operations.
> > > +	 */
> > > +      if (!LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
> > > +	vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> > > +					  update_e);
> >
> > and now we don't update the main exit?  What's
> > LOOP_VINFO_EARLY_BREAKS_VECT_PEELED again vs.
> > LOOP_VINFO_EARLY_BREAKS?
> 
> So LOOP_VINFO_EARLY_BREAKS essentially says that there are multiple exits
> That we can vectorize.
> 
> LOOP_VINFO_EARLY_BREAKS_VECT_PEELED is saying that in this loop, we've
> picked as the main exit not the loops latch connected exit.  This means when we
> exit from the "main" exit in the final iteration we may still have side effects to
> perform and so the final iteration should be restarted.
> 
> Similarly exiting from an early exit in this case also means having to restart the
> Loop at the same place to apply any partial side-effects.  This is because in these
> Cases the loop exits at n - 1 iterations, the IV is updated and checked before the
> final exit, so taking an early exit means we still have to apply partial side-effects.
> 

As an example, the testcase is

__attribute__((noinline, noipa))
int f(x) {
  int i;
  for (i = 0; i < 8 && (x & 1) == 0; i++)
    x >>= 1;
  return i;
}

main() {
  if (f(4) != 2)
    abort();
  exit(0);
}

Where the value of x also needs to be preserved when we take either exit.

Cheers,
Tamar

> Regards,
> Tamar
> >
> > >        if (skip_epilog || LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> > >  	{
> > >
> > >
> > >
> > >
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
Richard Biener Dec. 6, 2023, 9:10 a.m. UTC | #4
On Wed, 6 Dec 2023, Tamar Christina wrote:

> > > Hi All,
> > >
> > > While waiting for reviews I found this case where both loop exit needs to go to
> > > epilogue loop, but there was an IV related variable that was used in the scalar
> > > iteration as well.
> > >
> > > vect_update_ivs_after_vectorizer then blew the value away and replaced it with
> > > the value if it took the normal exit.
> > >
> > > For these cases where we've peeled an a vector iteration, we should skip
> > > vect_update_ivs_after_vectorizer since all exits are "alternate" exits.
> > >
> > > For this to be correct we have peeling put the right LCSSA variables so
> > > vectorable_live_operations takes care of it.
> > >
> > > This is triggered by new testcases 79 and 80 in early break testsuite
> > > and I'll merge this commit in the main one.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > 	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> > > 	Put right LCSSA var for peeled vect loops.
> > > 	(vect_do_peeling): Skip vect_update_ivs_after_vectorizer.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > index
> > 7d48502e2e46240553509dfa6d75fcab7fea36d3..bfdbeb7faaba29aad51c0561d
> > ace680c96759484 100644
> > > --- a/gcc/tree-vect-loop-manip.cc
> > > +++ b/gcc/tree-vect-loop-manip.cc
> > > @@ -1668,6 +1668,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop
> > *loop, edge loop_exit,
> > >        edge loop_entry = single_succ_edge (new_preheader);
> > >        if (flow_loops)
> > >  	{
> > > +	  bool peeled_iters = single_pred (loop->latch) != loop_exit->src;
> > >  	  /* Link through the main exit first.  */
> > >  	  for (auto gsi_from = gsi_start_phis (loop->header),
> > >  	       gsi_to = gsi_start_phis (new_loop->header);
> > > @@ -1692,11 +1693,19 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class
> > loop *loop, edge loop_exit,
> > >  		      continue;
> > >  		    }
> > >  		}
> > > +	      /* If we have multiple exits and the vector loop is peeled then we
> > > +		 need to use the value at start of loop.  */
> > 
> > This comment doesn't really match 'peeled_iters'?  Iff the main IV exit
> > source isn't loop->latch then won't we miscompute?  I realize the
> > complication is that slpeel_tree_duplicate_loop_to_edge_cfg is used from
> > elsewhere as well (so we can't check LOOP_VINFO_EARLY_BREAKS_VECT_PEELED).
> > 
> 
> No, because in both exits we restart the scalar iteration at the start of the last vector iteration.
> Normally, the counted main exit would be updated by vect_iters_bound_vf - vf.  Which is the same
> As the induction value should we get to the final iteration.
> 
> More on it in your question below.
> 
> > > +	      if (peeled_iters)
> > > +		{
> > > +		  tree tmp_arg = gimple_phi_result (from_phi);
> > > +		  if (!new_phi_args.get (tmp_arg))
> > > +		    new_arg = tmp_arg;
> > > +		}
> > >
> > >  	      tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> > >  	      gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
> > >
> > > -	      /* Main loop exit should use the final iter value.  */
> > > +	      /* Otherwise, main loop exit should use the final iter value.  */
> > >  	      SET_PHI_ARG_DEF (lcssa_phi, loop_exit->dest_idx, new_arg);
> > >
> > >
> > adjust_phi_and_debug_stmts
> > (to_phi, loop_entry, new_res);
> > > @@ -3394,9 +3403,13 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree
> > niters, tree nitersm1,
> > >        if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> > >  	update_e = single_succ_edge (e->dest);
> > >
> > > -      /* Update the main exit.  */
> > > -      vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> > > -					update_e);
> > > +      /* If we have a peeled vector iteration, all exits are the same, leave it
> > > +	 and so the main exit needs to be treated the same as the alternative
> > > +	 exits in that we leave their updates to vectorizable_live_operations.
> > > +	 */
> > > +      if (!LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
> > > +	vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> > > +					  update_e);
> > 
> > and now we don't update the main exit?  What's
> > LOOP_VINFO_EARLY_BREAKS_VECT_PEELED again vs.
> > LOOP_VINFO_EARLY_BREAKS?
> 
> So LOOP_VINFO_EARLY_BREAKS essentially says that there are multiple exits
> That we can vectorize.
> 
> LOOP_VINFO_EARLY_BREAKS_VECT_PEELED is saying that in this loop, we've
> picked as the main exit not the loops latch connected exit.  This means when we
> exit from the "main" exit in the final iteration we may still have side effects to
> perform and so the final iteration should be restarted.
> 
> Similarly exiting from an early exit in this case also means having to restart the
> Loop at the same place to apply any partial side-effects.  This is because in these
> Cases the loop exits at n - 1 iterations, the IV is updated and checked before the
> final exit, so taking an early exit means we still have to apply partial side-effects.

Ah, OK.  LOOP_VINFO_EARLY_BREAKS_VECT_PEELED basically says all exits
should be treated as early.  In theory we could handle a "early break"
(aka non-counted IV) that is last specially and slightly more optimal
(didn't really think it through).  It also means we should prefer
the last exist as IV exit if possible.

Thanks for clarifying.

Richard.
diff mbox series

Patch

--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1668,6 +1668,7 @@  slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
       edge loop_entry = single_succ_edge (new_preheader);
       if (flow_loops)
 	{
+	  bool peeled_iters = single_pred (loop->latch) != loop_exit->src;
 	  /* Link through the main exit first.  */
 	  for (auto gsi_from = gsi_start_phis (loop->header),
 	       gsi_to = gsi_start_phis (new_loop->header);
@@ -1692,11 +1693,19 @@  slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
 		      continue;
 		    }
 		}
+	      /* If we have multiple exits and the vector loop is peeled then we
+		 need to use the value at start of loop.  */
+	      if (peeled_iters)
+		{
+		  tree tmp_arg = gimple_phi_result (from_phi);
+		  if (!new_phi_args.get (tmp_arg))
+		    new_arg = tmp_arg;
+		}
 
 	      tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
 	      gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
 
-	      /* Main loop exit should use the final iter value.  */
+	      /* Otherwise, main loop exit should use the final iter value.  */
 	      SET_PHI_ARG_DEF (lcssa_phi, loop_exit->dest_idx, new_arg);
 
 	      adjust_phi_and_debug_stmts (to_phi, loop_entry, new_res);
@@ -3394,9 +3403,13 @@  vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
       if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
 	update_e = single_succ_edge (e->dest);
 
-      /* Update the main exit.  */
-      vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
-					update_e);
+      /* If we have a peeled vector iteration, all exits are the same, leave it
+	 and so the main exit needs to be treated the same as the alternative
+	 exits in that we leave their updates to vectorizable_live_operations.
+	 */
+      if (!LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
+	vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
+					  update_e);
 
       if (skip_epilog || LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
 	{