Patchwork [PR45098,7/10] Nowrap limits iterations

login
register
mail settings
Submitter Tom de Vries
Date May 20, 2011, 9:56 a.m.
Message ID <4DD63AE1.7070600@codesourcery.com>
Download mbox | patch
Permalink /patch/96577/
State New
Headers show

Comments

Tom de Vries - May 20, 2011, 9:56 a.m.
On 05/18/2011 11:11 PM, Zdenek Dvorak wrote:
> Hi,
> 
>> This patch attemps to estimate the number of iterations of the loop based on
>> nonwrapping arithmetic in the loop body.
>>
>> 2011-05-05  Tom de Vries  <tom@codesourcery.com>
>>
>> 	PR target/45098
>> 	* tree-ssa-loop-ivopts.c (struct ivopts_data): Add fields
>> 	max_iterations_p and max_iterations.
>> 	(is_nonwrap_use, max_loop_iterations, set_max_iterations): New function.
>> 	(may_eliminate_iv): Use max_iterations_p and max_iterations.
>> 	(tree_ssa_iv_optimize_loop): Use set_max_iterations.
> 
> what are the cases this handles better than the existing analysis of maximum numbers of iterations
> (estimate_numbers_of_iterations)?

Consider tr1:

extern int pfoo (int*)  __attribute__((pure));

int tr1 (int array[], unsigned int n)
{
  int sum = 0;
  unsigned int i;
  i = 0;
  while (1)
    {
      sum += pfoo (&array[i]);
      if (!(i < n))
        break;
      i++;
    }
  return sum;
}

The number of iterations is limited by &array[i]. &array[0x3fffffff] is still
ok, but &array[0x40000000] wraps. So i runs maximally from 0 to 0x3fffffff,
which is 0x40000000 loop iterations. Since &array[i] dominates the single
loop exit, any statement in the loop is executed at most 0x40000000 times.
That's also the amount registered in nb_iterations_upper_bound.
Since int *p has 0x40000000 distinct values, it's ok to replace the
exit test !(i < n) with p == &array[n].


On the other hand, consider tr6:

int tr6 (int array[], unsigned int n)
{
  int sum = 0;
  unsigned int i;
  for (i = 0; i < n; ++i)
      sum += pfoo (&array[i]);
  return sum;
}

The same holds as before, but &array[i] does not dominate the single
loop exit, so any statement in the loop is executed at most 0x40000001 times.
Note that the loop body is executed at most 0x40000000 times, and that only
the exit test is executed at most 0x40000001 times.
That's also the amount registered in nb_iterations_upper_bound.
Since int *p has only 0x40000000 distinct values, it's not ok to replace
the exit test in terms of p.


> Would it be possible to add the handling of these cases
> to estimate_numbers_of_iterations, rather than doing it just for ivopts?

Yes, I did that now.

Thanks,
- Tom

2011-05-05  Tom de Vries  <tom@codesourcery.com>

	PR target/45098
	* tree-ssa-loop-niter.c (infer_loop_bounds_from_pointer_arith): New
	function.
	(infer_loop_bounds_from_undefined): Use new function.
	* tree-ssa-loop-ivopts.c (may_eliminate_iv): Fix
	estimated_loop_iterations comparison.
Zdenek Dvorak - May 21, 2011, 12:24 p.m.
Hi,

> > Would it be possible to add the handling of these cases
> > to estimate_numbers_of_iterations, rather than doing it just for ivopts?
> 
> Yes, I did that now.
> 
> Thanks,
> - Tom
> 
> 2011-05-05  Tom de Vries  <tom@codesourcery.com>
> 
> 	PR target/45098
> 	* tree-ssa-loop-niter.c (infer_loop_bounds_from_pointer_arith): New
> 	function.
> 	(infer_loop_bounds_from_undefined): Use new function.

this is OK.

> 	* tree-ssa-loop-ivopts.c (may_eliminate_iv): Fix
> 	estimated_loop_iterations comparison.

I don't think this part is correct, though:

> Index: gcc/tree-ssa-loop-ivopts.c
> ===================================================================
> --- gcc/tree-ssa-loop-ivopts.c (revision 173734)
> +++ gcc/tree-ssa-loop-ivopts.c (working copy)
> @@ -4391,8 +4391,13 @@ may_eliminate_iv (struct ivopts_data *da
>              {
>                if (!estimated_loop_iterations (loop, true, &max_niter))
>                  return false;
> -              /* The loop bound is already adjusted by adding 1.  */
> -              if (double_int_ucmp (max_niter, period_value) > 0)
> +              /* The max iterations applies also to the number of times the loop
> +                 exit condition is executed.  The number of distinct values of
> +                 the cand is period_value + 1.  So, test for
> +                 'period_value + 1 >= max_iterations'.
> +               */
> +              period_value = double_int_add (period_value, double_int_one);
> +              if (double_int_ucmp (max_niter, period_value) > 0)
>                  return false;
>              }
>            else

max_niter is the upper bound on the number of iterations of the loop, i.e., the number
of executions of its latch edge.  Therefore, the control induction variable of the loop
will (at the exit statement) achieve at most max_niter + 1 different values.  Conversely,
the number of distinct values that the control iv can represent is period + 1 (naming of
the "period" variable is a bit missleading).  Thus, the correct test is
# of available values >= # of required values, equivalently
period + 1 >= max_niter + 1, equivalently
period >= max_niter, i.e.,
the current test.

Zdenek
Tom de Vries - May 21, 2011, 5:59 p.m.
On 05/21/2011 02:24 PM, Zdenek Dvorak wrote:
>> 	* tree-ssa-loop-ivopts.c (may_eliminate_iv): Fix
>> 	estimated_loop_iterations comparison.
> 
> I don't think this part is correct, though:
> 
>> Index: gcc/tree-ssa-loop-ivopts.c
>> ===================================================================
>> --- gcc/tree-ssa-loop-ivopts.c (revision 173734)
>> +++ gcc/tree-ssa-loop-ivopts.c (working copy)
>> @@ -4391,8 +4391,13 @@ may_eliminate_iv (struct ivopts_data *da
>>              {
>>                if (!estimated_loop_iterations (loop, true, &max_niter))
>>                  return false;
>> -              /* The loop bound is already adjusted by adding 1.  */
>> -              if (double_int_ucmp (max_niter, period_value) > 0)
>> +              /* The max iterations applies also to the number of times the loop
>> +                 exit condition is executed.  The number of distinct values of
>> +                 the cand is period_value + 1.  So, test for
>> +                 'period_value + 1 >= max_iterations'.
>> +               */
>> +              period_value = double_int_add (period_value, double_int_one);
>> +              if (double_int_ucmp (max_niter, period_value) > 0)
>>                  return false;
>>              }
>>            else
> 

> max_niter is the upper bound on the number of iterations of the loop, i.e., the number
> of executions of its latch edge.

max_niter is set from estimated_loop_iterations, meaning from
loop->nb_iterations_upper_bound.

consider:
...
void f(int *a)
{
  int i;

  for (i = 0; i < 10; ++i)
    a[i] = 0;
}
...

at ivopts, it looks like this (compiled with -Os -fno-tree-vrp
-fno-tree-dominator-opts -fno-tree-loop-ivcanon, to get a source-like
representation)
...
f (int * a)
{
  int i;
  int * D.2009;
  unsigned int D.2008;
  unsigned int i.0;

<bb 2>:
  goto <bb 4>;

<bb 3>:
  i.0_3 = (unsigned int) i_1;
  D.2008_4 = i.0_3 * 4;
  D.2009_6 = a_5(D) + D.2008_4;
  *D.2009_6 = 0;
  i_7 = i_1 + 1;

<bb 4>:
  # i_1 = PHI <0(2), i_7(3)>
  if (i_1 <= 9)
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 5>:
  return;

}
...


The header block of the loop is bb 4, the latch block is bb 3:
...
(gdb) p loop.header.index
$4 = 4
(gdb) p loop.latch.index
$5 = 3
...

The number of times the latch edge is executed, is 10.

But loop->nb_iterations_upper_bound, or max_niter is 11:
...
(gdb) p *loop
$1 = {num = 1, ninsns = 0, header = 0xf7dc2440, latch = 0xf7dc2400, lpt_decision
= {decision = LPT_NONE, times = 0}, av_ninsns = 0, num_nodes = 2, superloops =
0xf7db6ee8, inner = 0x0, next = 0x0,
  aux = 0x0, nb_iterations = 0xf7d3d540, nb_iterations_upper_bound = {low = 11,
high = 0}, nb_iterations_estimate = {low = 11, high = 0}, any_upper_bound = 1
'\001', any_estimate = 1 '\001',
  can_be_parallel = 0 '\000', estimate_state = EST_AVAILABLE, bounds =
0xf7d3da2c, exits = 0xf7dc3d70}
...

> Therefore, the control induction variable of the loop
> will (at the exit statement) achieve at most max_niter + 1 different values.

Based on what I observe, I'd say the control induction variable of the loop will
achieve at most max_niter different values.

> Conversely,
> the number of distinct values that the control iv can represent is period + 1 (naming of
> the "period" variable is a bit missleading).

agree.

Thanks,
- Tom
H.J. Lu - May 23, 2011, 1:34 p.m.
On Sat, May 21, 2011 at 5:24 AM, Zdenek Dvorak <rakdver@kam.mff.cuni.cz> wrote:
> Hi,
>
>> > Would it be possible to add the handling of these cases
>> > to estimate_numbers_of_iterations, rather than doing it just for ivopts?
>>
>> Yes, I did that now.
>>
>> Thanks,
>> - Tom
>>
>> 2011-05-05  Tom de Vries  <tom@codesourcery.com>
>>
>>       PR target/45098
>>       * tree-ssa-loop-niter.c (infer_loop_bounds_from_pointer_arith): New
>>       function.
>>       (infer_loop_bounds_from_undefined): Use new function.
>
> this is OK.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49124

H.J.
Tom de Vries - May 28, 2011, 3:12 p.m.
Hi Zdenek,

On 05/21/2011 07:59 PM, Tom de Vries wrote:
> On 05/21/2011 02:24 PM, Zdenek Dvorak wrote:
>>> 	* tree-ssa-loop-ivopts.c (may_eliminate_iv): Fix
>>> 	estimated_loop_iterations comparison.
>>
>> I don't think this part is correct, though:
>>
>>> Index: gcc/tree-ssa-loop-ivopts.c
>>> ===================================================================
>>> --- gcc/tree-ssa-loop-ivopts.c (revision 173734)
>>> +++ gcc/tree-ssa-loop-ivopts.c (working copy)
>>> @@ -4391,8 +4391,13 @@ may_eliminate_iv (struct ivopts_data *da
>>>              {
>>>                if (!estimated_loop_iterations (loop, true, &max_niter))
>>>                  return false;
>>> -              /* The loop bound is already adjusted by adding 1.  */
>>> -              if (double_int_ucmp (max_niter, period_value) > 0)
>>> +              /* The max iterations applies also to the number of times the loop
>>> +                 exit condition is executed.  The number of distinct values of
>>> +                 the cand is period_value + 1.  So, test for
>>> +                 'period_value + 1 >= max_iterations'.
>>> +               */
>>> +              period_value = double_int_add (period_value, double_int_one);
>>> +              if (double_int_ucmp (max_niter, period_value) > 0)
>>>                  return false;
>>>              }
>>>            else
>>
> 
>> max_niter is the upper bound on the number of iterations of the loop, i.e., the number
>> of executions of its latch edge.
> 
> max_niter is set from estimated_loop_iterations, meaning from
> loop->nb_iterations_upper_bound.
> 
> consider:
> ...
> void f(int *a)
> {
>   int i;
> 
>   for (i = 0; i < 10; ++i)
>     a[i] = 0;
> }
> ...
> 
> at ivopts, it looks like this (compiled with -Os -fno-tree-vrp
> -fno-tree-dominator-opts -fno-tree-loop-ivcanon, to get a source-like
> representation)
> ...
> f (int * a)
> {
>   int i;
>   int * D.2009;
>   unsigned int D.2008;
>   unsigned int i.0;
> 
> <bb 2>:
>   goto <bb 4>;
> 
> <bb 3>:
>   i.0_3 = (unsigned int) i_1;
>   D.2008_4 = i.0_3 * 4;
>   D.2009_6 = a_5(D) + D.2008_4;
>   *D.2009_6 = 0;
>   i_7 = i_1 + 1;
> 
> <bb 4>:
>   # i_1 = PHI <0(2), i_7(3)>
>   if (i_1 <= 9)
>     goto <bb 3>;
>   else
>     goto <bb 5>;
> 
> <bb 5>:
>   return;
> 
> }
> ...
> 
> 
> The header block of the loop is bb 4, the latch block is bb 3:
> ...
> (gdb) p loop.header.index
> $4 = 4
> (gdb) p loop.latch.index
> $5 = 3
> ...
> 
> The number of times the latch edge is executed, is 10.
> 
> But loop->nb_iterations_upper_bound, or max_niter is 11:
> ...
> (gdb) p *loop
> $1 = {num = 1, ninsns = 0, header = 0xf7dc2440, latch = 0xf7dc2400, lpt_decision
> = {decision = LPT_NONE, times = 0}, av_ninsns = 0, num_nodes = 2, superloops =
> 0xf7db6ee8, inner = 0x0, next = 0x0,
>   aux = 0x0, nb_iterations = 0xf7d3d540, nb_iterations_upper_bound = {low = 11,
> high = 0}, nb_iterations_estimate = {low = 11, high = 0}, any_upper_bound = 1
> '\001', any_estimate = 1 '\001',
>   can_be_parallel = 0 '\000', estimate_state = EST_AVAILABLE, bounds =
> 0xf7d3da2c, exits = 0xf7dc3d70}
> ...
> 
>> Therefore, the control induction variable of the loop
>> will (at the exit statement) achieve at most max_niter + 1 different values.
> 
> Based on what I observe, I'd say the control induction variable of the loop will
> achieve at most max_niter different values.
> 

Any thoughts on my observations above?

Thanks,
- Tom
Zdenek Dvorak - May 30, 2011, 12:38 p.m.
Hi,

> > The header block of the loop is bb 4, the latch block is bb 3:
> > ...
> > (gdb) p loop.header.index
> > $4 = 4
> > (gdb) p loop.latch.index
> > $5 = 3
> > ...
> > 
> > The number of times the latch edge is executed, is 10.
> > 
> > But loop->nb_iterations_upper_bound, or max_niter is 11:

this is a bit strange, it looks like the # of iterations estimation is setting
nb_iterations_upper_bound too conservatively (or I gave nb_iterations_upper_bound
a different semantics than I remember -- but both my memory and the comment in cfgloop.h
suggest that nb_iterations_upper_bound >= nb_iterations, i.e., that it should be 10 in your
example),

Zdenek

Patch

Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c (revision 173734)
+++ gcc/tree-ssa-loop-niter.c (working copy)
@@ -2835,6 +2835,54 @@  infer_loop_bounds_from_array (struct loo
    that signed arithmetics in STMT does not overflow.  */
 
 static void
+infer_loop_bounds_from_pointer_arith (struct loop *loop, gimple stmt)
+{
+  tree def, base, step, scev, type, low, high;
+  tree var, ptr;
+
+  if (!is_gimple_assign (stmt)
+      || gimple_assign_rhs_code (stmt) != POINTER_PLUS_EXPR)
+    return;
+
+  def = gimple_assign_lhs (stmt);
+  if (TREE_CODE (def) != SSA_NAME)
+    return;
+
+  type = TREE_TYPE (def);
+  if (!nowrap_type_p (type))
+    return;
+
+  ptr = gimple_assign_rhs1 (stmt);
+  if (!expr_invariant_in_loop_p (loop, ptr))
+    return;
+
+  var = gimple_assign_rhs2 (stmt);
+  if (TYPE_PRECISION (type) != TYPE_PRECISION (TREE_TYPE (var)))
+    return;
+
+  scev = instantiate_parameters (loop, analyze_scalar_evolution (loop, def));
+  if (chrec_contains_undetermined (scev))
+    return;
+
+  base = initial_condition_in_loop_num (scev, loop->num);
+  step = evolution_part_in_loop_num (scev, loop->num);
+
+  if (!base || !step
+      || TREE_CODE (step) != INTEGER_CST
+      || tree_contains_chrecs (base, NULL)
+      || chrec_contains_symbols_defined_in_loop (base, loop->num))
+    return;
+
+  low = lower_bound_in_type (type, type);
+  high = upper_bound_in_type (type, type);
+
+  record_nonwrapping_iv (loop, base, step, stmt, low, high, false, true);
+}
+
+/* Determine information about number of iterations of a LOOP from the fact
+   that signed arithmetics in STMT does not overflow.  */
+
+static void
 infer_loop_bounds_from_signedness (struct loop *loop, gimple stmt)
 {
   tree def, base, step, scev, type, low, high;
@@ -2907,7 +2955,10 @@  infer_loop_bounds_from_undefined (struct
 	  infer_loop_bounds_from_array (loop, stmt, reliable);
 
 	  if (reliable)
-	    infer_loop_bounds_from_signedness (loop, stmt);
+            {
+              infer_loop_bounds_from_signedness (loop, stmt);
+              infer_loop_bounds_from_pointer_arith (loop, stmt);
+            }
   	}
 
     }
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c (revision 173734)
+++ gcc/tree-ssa-loop-ivopts.c (working copy)
@@ -4391,8 +4391,13 @@  may_eliminate_iv (struct ivopts_data *da
             {
               if (!estimated_loop_iterations (loop, true, &max_niter))
                 return false;
-              /* The loop bound is already adjusted by adding 1.  */
-              if (double_int_ucmp (max_niter, period_value) > 0)
+              /* The max iterations applies also to the number of times the loop
+                 exit condition is executed.  The number of distinct values of
+                 the cand is period_value + 1.  So, test for
+                 'period_value + 1 >= max_iterations'.
+               */
+              period_value = double_int_add (period_value, double_int_one);
+              if (double_int_ucmp (max_niter, period_value) > 0)
                 return false;
             }
           else