Patchwork Fix array bound niter estimate (PR middle-end/54937)

login
register
mail settings
Submitter Jan Hubicka
Date Oct. 19, 2012, 4:24 p.m.
Message ID <20121019162426.GA19644@kam.mff.cuni.cz>
Download mbox | patch
Permalink /patch/192769/
State New
Headers show

Comments

Jan Hubicka - Oct. 19, 2012, 4:24 p.m.
> > > What about the conservative variant of simply
> > > 
> > >       else
> > >         delta = double_int_one;
> > 
> > I think it would be bad idea: it makes us to completely unroll one interation
> > too many that bloats code for no benefit. No optimization cancels the path in
> > CFG because of undefined effect and thus the code will be output (unless someone
> > smarter, like VRP, cleans up later, but it is more an exception than rule.)
> 
> OK, on deper tought I guess I can add double_int_one always at that spot and
> once we are done with everything I can walk nb_iter_bound for all statements
> known to not be executed on last iteration and record them to pointer set.
> 
> Finally I can walk from header in DFS stopping on loop exits, side effects and
> those stateemnts.  If I visit no loop exit or side effect I know I can lower
> iteration count by 1 (in estimate_numbers_of_iterations_loop).
> 
> This will give accurate answer and requires just little extra bookkeeping.
> 
> I will give this a try.

Here is updated patch.  It solves the testcase and gives better estimates than before.

Here is obvious improvements: record_estimate can put all statements to the list not only
those that dominates loop latch and maybe_lower_iteration_bound can track lowest estimate
it finds on its walk.  This will need bit more work and I am thus sending the bugfix
separately, because I think it should go to 4.7, too.

Honza

	* tree-ssa-loop-niter.c (record_estimate): Remove confused
	dominators check.
	(maybe_lower_iteration_bound): New function.
	(estimate_numbers_of_iterations_loop): Use it.
Richard Guenther - Oct. 22, 2012, 8:19 a.m.
On Fri, 19 Oct 2012, Jan Hubicka wrote:

> > > > What about the conservative variant of simply
> > > > 
> > > >       else
> > > >         delta = double_int_one;
> > > 
> > > I think it would be bad idea: it makes us to completely unroll one interation
> > > too many that bloats code for no benefit. No optimization cancels the path in
> > > CFG because of undefined effect and thus the code will be output (unless someone
> > > smarter, like VRP, cleans up later, but it is more an exception than rule.)
> > 
> > OK, on deper tought I guess I can add double_int_one always at that spot and
> > once we are done with everything I can walk nb_iter_bound for all statements
> > known to not be executed on last iteration and record them to pointer set.
> > 
> > Finally I can walk from header in DFS stopping on loop exits, side effects and
> > those stateemnts.  If I visit no loop exit or side effect I know I can lower
> > iteration count by 1 (in estimate_numbers_of_iterations_loop).
> > 
> > This will give accurate answer and requires just little extra bookkeeping.
> > 
> > I will give this a try.
> 
> Here is updated patch.  It solves the testcase and gives better estimates than before.
> 
> Here is obvious improvements: record_estimate can put all statements to the list not only
> those that dominates loop latch and maybe_lower_iteration_bound can track lowest estimate
> it finds on its walk.  This will need bit more work and I am thus sending the bugfix
> separately, because I think it should go to 4.7, too.
> 
> Honza
> 
> 	* tree-ssa-loop-niter.c (record_estimate): Remove confused
> 	dominators check.
> 	(maybe_lower_iteration_bound): New function.
> 	(estimate_numbers_of_iterations_loop): Use it.
> Index: tree-ssa-loop-niter.c
> ===================================================================
> --- tree-ssa-loop-niter.c	(revision 192537)
> +++ tree-ssa-loop-niter.c	(working copy)
> @@ -2535,7 +2541,6 @@ record_estimate (struct loop *loop, tree
>  		 gimple at_stmt, bool is_exit, bool realistic, bool upper)
>  {
>    double_int delta;
> -  edge exit;
>  
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> @@ -2570,14 +2577,10 @@ record_estimate (struct loop *loop, tree
>      }
>  
>    /* Update the number of iteration estimates according to the bound.
> -     If at_stmt is an exit or dominates the single exit from the loop,
> -     then the loop latch is executed at most BOUND times, otherwise
> -     it can be executed BOUND + 1 times.  */
> -  exit = single_exit (loop);
> -  if (is_exit
> -      || (exit != NULL
> -	  && dominated_by_p (CDI_DOMINATORS,
> -			     exit->src, gimple_bb (at_stmt))))
> +     If at_stmt is an exit then the loop latch is executed at most BOUND times,
> +     otherwise it can be executed BOUND + 1 times.  We will lower the estimate
> +     later if such statement must be executed on last iteration  */
> +  if (is_exit)
>      delta = double_int_zero;
>    else
>      delta = double_int_one;
> @@ -2953,6 +2956,87 @@ gcov_type_to_double_int (gcov_type val)
>    return ret;
>  }
>  
> +/* See if every path cross the loop goes through a statement that is known
> +   to not execute at the last iteration. In that case we can decrese iteration
> +   count by 1.  */
> +
> +static void
> +maybe_lower_iteration_bound (struct loop *loop)
> +{
> +  pointer_set_t *not_executed_last_iteration = pointer_set_create ();
> +  pointer_set_t *visited;
> +  struct nb_iter_bound *elt;
> +  bool found = false;
> +  VEC (basic_block, heap) *queue = NULL;
> +
> +  for (elt = loop->bounds; elt; elt = elt->next)
> +    {
> +      if (!elt->is_exit
> +	  && elt->bound.ult (loop->nb_iterations_upper_bound))
> +	{
> +	  found = true;
> +	  pointer_set_insert (not_executed_last_iteration, elt->stmt);
> +	}
> +    }

So you are looking for all stmts a bound was derived from.

> +  if (!found)
> +    {
> +      pointer_set_destroy (not_executed_last_iteration);

create this on-demand in the above loop?

> +      return;
> +    }
> +  visited = pointer_set_create ();
> +  VEC_safe_push (basic_block, heap, queue, loop->header);
> +  pointer_set_insert (visited, loop->header);

pointer-set for BB visited?  In most other places we use a
[s]bitmap with block numbers.

> +  found = false;
> +
> +  while (VEC_length (basic_block, queue) && !found)

looks like a do-while loop should be possible with a !VEC_empty ()
guard at the end.

> +    {
> +      basic_block bb = VEC_pop (basic_block, queue);
> +      gimple_stmt_iterator gsi;
> +      bool stmt_found = false;
> +
> +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +	{
> +	  gimple stmt = gsi_stmt (gsi);
> +	  if (pointer_set_contains (not_executed_last_iteration, stmt))
> +	    {
> +	      stmt_found = true;

we found one.

> +	      break;
> +	    }
> +	  if (gimple_has_side_effects (stmt))
> +	    {
> +	      found = true;

we found sth else?

> +	      break;
> +	    }
> +	}
> +      if (!stmt_found && !found)
> +	{

if we found neither walk more blocks.

> +          edge e;
> +          edge_iterator ei;
> +
> +          FOR_EACH_EDGE (e, ei, bb->succs)
> +	    {
> +	      if (loop_exit_edge_p (loop, e))
> +		{
> +		  found = true;

we reach a (possible) exit.  maybe rename found to
found_exit?

> +		  break;
> +		}
> +	      if (!pointer_set_insert (visited, e->dest))
> +		VEC_safe_push (basic_block, heap, queue, e->dest);
> +	    }
> +	}
> +    }
> +  if (!found)

if we didn't find an exit we reduce count.  double_int_one looks magic
here, but with the assertion that each queued 'stmt_found' upper
bound was less than loop->nb_iterations_upper_bound, subtracting one
is certainly conservative.  But why not use the maximum estimate from 
all stmts_found?

Thus, please add some comments, use a bitmap for visited and
rename variables to be more descriptive.

Overall I like this version a lot more ;)

Thanks,
Richard.

> +    {
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +	fprintf (dump_file, "Reducing loop iteration estimate by 1; "
> +		 "undefined statement must be executed at last iteration.");
> +      record_niter_bound (loop, loop->nb_iterations_upper_bound - double_int_one,
> +			  false, true);
> +    }
> +  pointer_set_destroy (visited);
> +  VEC_free (basic_block, heap, queue);
> +}
> +
>  /* Records estimates on numbers of iterations of LOOP.  If USE_UNDEFINED_P
>     is true also use estimates derived from undefined behavior.  */
>  
> @@ -2996,6 +3080,9 @@ estimate_numbers_of_iterations_loop (str
>  
>    infer_loop_bounds_from_undefined (loop);
>  
> +  if (loop->any_upper_bound)
> +    maybe_lower_iteration_bound (loop);
> +
>    /* If we have a measured profile, use it to estimate the number of
>       iterations.  */
>    if (loop->header->count != 0)
> 
>
Jan Hubicka - Oct. 22, 2012, 12:56 p.m.
> > +static void
> > +maybe_lower_iteration_bound (struct loop *loop)
> > +{
> > +  pointer_set_t *not_executed_last_iteration = pointer_set_create ();
> > +  pointer_set_t *visited;
> > +  struct nb_iter_bound *elt;
> > +  bool found = false;
> > +  VEC (basic_block, heap) *queue = NULL;
> > +
> > +  for (elt = loop->bounds; elt; elt = elt->next)
> > +    {
> > +      if (!elt->is_exit
> > +	  && elt->bound.ult (loop->nb_iterations_upper_bound))
> > +	{
> > +	  found = true;
> > +	  pointer_set_insert (not_executed_last_iteration, elt->stmt);
> > +	}
> > +    }
> 
> So you are looking for all stmts a bound was derived from.

Yes, with bound smaller than the current estimate.
> 
> > +  if (!found)
> > +    {
> > +      pointer_set_destroy (not_executed_last_iteration);
> 
> create this on-demand in the above loop?

Will do. 
> 
> > +      return;
> > +    }
> > +  visited = pointer_set_create ();
> > +  VEC_safe_push (basic_block, heap, queue, loop->header);
> > +  pointer_set_insert (visited, loop->header);
> 
> pointer-set for BB visited?  In most other places we use a
> [s]bitmap with block numbers.

Will switch to bitmap though I think it is mostly because this tradition was
invented before pointer-set. bitmap has liner walk in it, pointer-set should
scale better.
> 
> if we didn't find an exit we reduce count.  double_int_one looks magic
> here, but with the assertion that each queued 'stmt_found' upper
> bound was less than loop->nb_iterations_upper_bound, subtracting one
> is certainly conservative.  But why not use the maximum estimate from 
> all stmts_found?

Because it is always nb_iterations_upper_bound-1, see logic in record_estimate.
I plan to change this - basically we can change record_esitmate to record all
statements, not only those dominating exit and do Dijkstra's algorithm in this
walk looking for largest upper bound we can reach loopback with.

But as I wrote in the email, I would like to do this incrementally - fix the
bug first (possibly for 4.7, too - the bug is there I am not sure if it can
lead to wrong code) and change this next.  It means some further changes
thorough niter.c, but little challenge to implement Dijkstra with double-int
based queue.
> 
> Thus, please add some comments, use a bitmap for visited and
> rename variables to be more descriptive.

Will do, and need to analyze the bounds fortran failures :)

Thanks,
Honza

Patch

Index: tree-ssa-loop-niter.c
===================================================================
--- tree-ssa-loop-niter.c	(revision 192537)
+++ tree-ssa-loop-niter.c	(working copy)
@@ -2535,7 +2541,6 @@  record_estimate (struct loop *loop, tree
 		 gimple at_stmt, bool is_exit, bool realistic, bool upper)
 {
   double_int delta;
-  edge exit;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -2570,14 +2577,10 @@  record_estimate (struct loop *loop, tree
     }
 
   /* Update the number of iteration estimates according to the bound.
-     If at_stmt is an exit or dominates the single exit from the loop,
-     then the loop latch is executed at most BOUND times, otherwise
-     it can be executed BOUND + 1 times.  */
-  exit = single_exit (loop);
-  if (is_exit
-      || (exit != NULL
-	  && dominated_by_p (CDI_DOMINATORS,
-			     exit->src, gimple_bb (at_stmt))))
+     If at_stmt is an exit then the loop latch is executed at most BOUND times,
+     otherwise it can be executed BOUND + 1 times.  We will lower the estimate
+     later if such statement must be executed on last iteration  */
+  if (is_exit)
     delta = double_int_zero;
   else
     delta = double_int_one;
@@ -2953,6 +2956,87 @@  gcov_type_to_double_int (gcov_type val)
   return ret;
 }
 
+/* See if every path cross the loop goes through a statement that is known
+   to not execute at the last iteration. In that case we can decrese iteration
+   count by 1.  */
+
+static void
+maybe_lower_iteration_bound (struct loop *loop)
+{
+  pointer_set_t *not_executed_last_iteration = pointer_set_create ();
+  pointer_set_t *visited;
+  struct nb_iter_bound *elt;
+  bool found = false;
+  VEC (basic_block, heap) *queue = NULL;
+
+  for (elt = loop->bounds; elt; elt = elt->next)
+    {
+      if (!elt->is_exit
+	  && elt->bound.ult (loop->nb_iterations_upper_bound))
+	{
+	  found = true;
+	  pointer_set_insert (not_executed_last_iteration, elt->stmt);
+	}
+    }
+  if (!found)
+    {
+      pointer_set_destroy (not_executed_last_iteration);
+      return;
+    }
+  visited = pointer_set_create ();
+  VEC_safe_push (basic_block, heap, queue, loop->header);
+  pointer_set_insert (visited, loop->header);
+  found = false;
+
+  while (VEC_length (basic_block, queue) && !found)
+    {
+      basic_block bb = VEC_pop (basic_block, queue);
+      gimple_stmt_iterator gsi;
+      bool stmt_found = false;
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple stmt = gsi_stmt (gsi);
+	  if (pointer_set_contains (not_executed_last_iteration, stmt))
+	    {
+	      stmt_found = true;
+	      break;
+	    }
+	  if (gimple_has_side_effects (stmt))
+	    {
+	      found = true;
+	      break;
+	    }
+	}
+      if (!stmt_found && !found)
+	{
+          edge e;
+          edge_iterator ei;
+
+          FOR_EACH_EDGE (e, ei, bb->succs)
+	    {
+	      if (loop_exit_edge_p (loop, e))
+		{
+		  found = true;
+		  break;
+		}
+	      if (!pointer_set_insert (visited, e->dest))
+		VEC_safe_push (basic_block, heap, queue, e->dest);
+	    }
+	}
+    }
+  if (!found)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "Reducing loop iteration estimate by 1; "
+		 "undefined statement must be executed at last iteration.");
+      record_niter_bound (loop, loop->nb_iterations_upper_bound - double_int_one,
+			  false, true);
+    }
+  pointer_set_destroy (visited);
+  VEC_free (basic_block, heap, queue);
+}
+
 /* Records estimates on numbers of iterations of LOOP.  If USE_UNDEFINED_P
    is true also use estimates derived from undefined behavior.  */
 
@@ -2996,6 +3080,9 @@  estimate_numbers_of_iterations_loop (str
 
   infer_loop_bounds_from_undefined (loop);
 
+  if (loop->any_upper_bound)
+    maybe_lower_iteration_bound (loop);
+
   /* If we have a measured profile, use it to estimate the number of
      iterations.  */
   if (loop->header->count != 0)