Patchwork ivopts improvement

login
register
mail settings
Submitter Tom de Vries
Date Feb. 28, 2011, 9:41 a.m.
Message ID <4D6B6DB9.7050302@codesourcery.com>
Download mbox | patch
Permalink /patch/84757/
State New
Headers show

Comments

Tom de Vries - Feb. 28, 2011, 9:41 a.m.
Hi Zdenek,

could you please review the following patch set?

The goal is to remove the 'i' iterator from the following example, by
replacing 'i < n' with 'p < base + n'.

void
f (char *base, unsigned long int i, unsigned long int n)
{
  char *p = base + i;
  do
    {
      *p = '\0';
      p++;
      i++;
   }
  while (i < n);
}

The difficulty is that this replacement is only valid under the
guarantee that base + n does not overflow. I was not able to deduce from
the C standard that merely the fact that it's a pointer type guarantees
this, so I added analysis to prove no overflow based on the memory
access '*p' and the relation between the 2 iterators.

patch set description:
- iterator.6.1-ml.patch: maximum bounds fix.
- iterator.6.2-ml.patch: infrastructure change, no change in
  functionality. calculate the new comparison operator at the same time
  as the new bound, and store it next to the new bound in the cost pair.
- iterator.6.3-ml.patch: infrastructure change. carved out new function.
- iterator.6.4-ml.patch: patch to rewrite an exit test with a '<'
  using a different iterator.

reg-tested on x86-64, and a 4.5 version of the patch set reg-tested on
MIPS, ARM and PPC.

Thanks,
- Tom
Paolo Bonzini - Feb. 28, 2011, 10:37 a.m.
On 02/28/2011 10:41 AM, Tom de Vries wrote:
> The difficulty is that this replacement is only valid under the
> guarantee that base + n does not overflow.

I think this should be done unconditionally under 
-funsafe-loop-optimizations.  Look in the code for examples of how to 
add warnings, it should be something like:

   bool last_iter_may_overflow;
   if (TYPE_OVERFLOW_WRAPS (...))
     last_iter_may_overflow = false;
   else
     {
       last_iter_may_overflow = !flag_unsafe_loop_optimizations;
       if (warn_unsafe_loop_optimizations)
         {
           if (flag_unsafe_loop_optimizations)
             warning ("assuming ... does not overflow");
           else
             warning ("not optimizing ... because ... could overflow");
         }
     }

   if (last_iter_may_overflow
       && stmt_after_increment (loop, cand, use->stmt))

Looks like good work, thanks!

Paolo
Zdenek Dvorak - Feb. 28, 2011, 10:46 a.m.
Hi,

> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
> 
> 	gcc/
> 	* tree-ssa-loop-ivopts.c (may_eliminate_iv): Fix boundary checks.
> 	* testsuite/gcc.dg/tree-ssa/ivopts-max.c: New test.
> 	* testsuite/gcc.dg/tree-ssa/ivopts-max-2.c: New test.
> 	* testsuite/gcc.dg/tree-ssa/ivopts-max-3.c: New test.

> diff -u gcc/tree-ssa-loop-ivopts.c gcc/tree-ssa-loop-ivopts.c
> --- gcc/tree-ssa-loop-ivopts.c	(revision 162585)
> +++ gcc/tree-ssa-loop-ivopts.c	(working copy)
> @@ -4079,8 +4079,12 @@
>    /* If the number of iterations is constant, compare against it directly.  */
>    if (TREE_CODE (nit) == INTEGER_CST)
>      {
> -      /* See cand_value_at.  */
> -      if (stmt_after_increment (loop, cand, use->stmt))
> +      /* An iv with TYPE_PRECISION 32 has 2^32 distinct values, and an iv_period
> +         of 2^32 - 1.  Such an iv can handle at most a nit of 2^32 - 1.  If the
> +         exit test is after the increment, the bound calculation will overflow.
> +         If overflow wraps, that's not a problem.  */
> +      if (!TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base))
> +          && stmt_after_increment (loop, cand, use->stmt))
>          {
>            if (!tree_int_cst_lt (nit, period))
>              return false;
> @@ -4100,7 +4104,9 @@
>        double_int period_value, max_niter;
>  
>        max_niter = desc->max;
> -      if (stmt_after_increment (loop, cand, use->stmt))
> +      /* See comment at TREE_CODE (nit) == INTEGER_CST.  */
> +      if (!TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base))
> +          && stmt_after_increment (loop, cand, use->stmt))
>          max_niter = double_int_add (max_niter, double_int_one);
>        period_value = tree_to_double_int (period);
>        if (double_int_ucmp (max_niter, period_value) > 0)

This looks strange.  If indeed there is an off-by-one error here, it should not depend
on whether TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base)) is true or not.  The only
candidates that satisfy !TYPE_OVERFLOW_WRAPS are those for that we are certain that their
computations cannot overflow (e.g., if they are the original induction variables of the loop
computed in a !TYPE_OVERFLOW_WRAPS type).

> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
> 
> 	gcc/
> 	* tree-ssa-loop-ivopts.c (struct cost_pair): Add comp field.
> 	(set_use_iv_cost): Add comp parameter.
> 	(set_use_iv_cost): Use comp parameter to init comp field.
> 	(determine_use_iv_cost_generic, determine_use_iv_cost_address)
> 	(determine_use_iv_cost_condition): Add comp argument to set_use_iv_cost
> 	call.
> 	(may_eliminate_iv): Add comp parameter and set.
> 	(determine_use_iv_cost_condition): Add comp argument to may_eliminate_iv
> 	call.
> 	(determine_use_iv_cost_condition): Reset comp if too expensive.
> 	(rewrite_use_compare): Use compare from comp field of cp.

This is ok.

> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
> 
> 	gcc/
> 	* tree-ssa-loop-ivopts.c (fold_build_plus): New function factored out of
> 	add_autoinc_candidates.
> 	(add_autoinc_candidates): Use fold_build_plus.

OK.

> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
> 
> 	gcc/
> 	* tree-ssa-loop-ivopts.c (niter_for_exit): Return COND_EXPR if
> 	!integer_zerop (desc->may_be_zero).
> 	(cand_valid_pointer_at_use_p, get_lt_bound, iv_elimination_compare_lt):
> 	New function.
> 	(may_eliminate_iv): Use iv_elimination_compare_lt.

> +/* Tries to determine if the iterator CAND is a valid pointer at USE.  */

For consistency with the rest of the comments in ivopts, please use "candidate"
instead of "iterator" in similar comments.

> +static bool
> +cand_valid_pointer_at_use_p (struct ivopts_data *data, struct iv_cand *cand,
> +			     struct iv_use *use)
> +{
> +  struct iv_use *cand_iv_use = iv_use (data, cand->iv->use_id);

This is rather strange.  cand->iv->use_id is never being set (all candidates are
allocated through add_candidate_1, which creates a new iv for them from the given
base and step).  So this just chooses a random use (the first one appearing in the
loop), regrardless whether it has anything to do with cand or not.

Zdenek
Richard Guenther - Feb. 28, 2011, 11 a.m.
On Mon, Feb 28, 2011 at 11:46 AM, Zdenek Dvorak <rakdver@kam.mff.cuni.cz> wrote:
> Hi,
>
>> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
>>
>>       gcc/
>>       * tree-ssa-loop-ivopts.c (may_eliminate_iv): Fix boundary checks.
>>       * testsuite/gcc.dg/tree-ssa/ivopts-max.c: New test.
>>       * testsuite/gcc.dg/tree-ssa/ivopts-max-2.c: New test.
>>       * testsuite/gcc.dg/tree-ssa/ivopts-max-3.c: New test.
>
>> diff -u gcc/tree-ssa-loop-ivopts.c gcc/tree-ssa-loop-ivopts.c
>> --- gcc/tree-ssa-loop-ivopts.c        (revision 162585)
>> +++ gcc/tree-ssa-loop-ivopts.c        (working copy)
>> @@ -4079,8 +4079,12 @@
>>    /* If the number of iterations is constant, compare against it directly.  */
>>    if (TREE_CODE (nit) == INTEGER_CST)
>>      {
>> -      /* See cand_value_at.  */
>> -      if (stmt_after_increment (loop, cand, use->stmt))
>> +      /* An iv with TYPE_PRECISION 32 has 2^32 distinct values, and an iv_period
>> +         of 2^32 - 1.  Such an iv can handle at most a nit of 2^32 - 1.  If the
>> +         exit test is after the increment, the bound calculation will overflow.
>> +         If overflow wraps, that's not a problem.  */
>> +      if (!TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base))
>> +          && stmt_after_increment (loop, cand, use->stmt))
>>          {
>>            if (!tree_int_cst_lt (nit, period))
>>              return false;
>> @@ -4100,7 +4104,9 @@
>>        double_int period_value, max_niter;
>>
>>        max_niter = desc->max;
>> -      if (stmt_after_increment (loop, cand, use->stmt))
>> +      /* See comment at TREE_CODE (nit) == INTEGER_CST.  */
>> +      if (!TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base))
>> +          && stmt_after_increment (loop, cand, use->stmt))
>>          max_niter = double_int_add (max_niter, double_int_one);
>>        period_value = tree_to_double_int (period);
>>        if (double_int_ucmp (max_niter, period_value) > 0)
>
> This looks strange.  If indeed there is an off-by-one error here, it should not depend
> on whether TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base)) is true or not.  The only
> candidates that satisfy !TYPE_OVERFLOW_WRAPS are those for that we are certain that their
> computations cannot overflow (e.g., if they are the original induction variables of the loop
> computed in a !TYPE_OVERFLOW_WRAPS type).
>
>> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
>>
>>       gcc/
>>       * tree-ssa-loop-ivopts.c (struct cost_pair): Add comp field.
>>       (set_use_iv_cost): Add comp parameter.
>>       (set_use_iv_cost): Use comp parameter to init comp field.
>>       (determine_use_iv_cost_generic, determine_use_iv_cost_address)
>>       (determine_use_iv_cost_condition): Add comp argument to set_use_iv_cost
>>       call.
>>       (may_eliminate_iv): Add comp parameter and set.
>>       (determine_use_iv_cost_condition): Add comp argument to may_eliminate_iv
>>       call.
>>       (determine_use_iv_cost_condition): Reset comp if too expensive.
>>       (rewrite_use_compare): Use compare from comp field of cp.
>
> This is ok.
>
>> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
>>
>>       gcc/
>>       * tree-ssa-loop-ivopts.c (fold_build_plus): New function factored out of
>>       add_autoinc_candidates.
>>       (add_autoinc_candidates): Use fold_build_plus.
>
> OK.
>
>> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
>>
>>       gcc/
>>       * tree-ssa-loop-ivopts.c (niter_for_exit): Return COND_EXPR if
>>       !integer_zerop (desc->may_be_zero).
>>       (cand_valid_pointer_at_use_p, get_lt_bound, iv_elimination_compare_lt):
>>       New function.
>>       (may_eliminate_iv): Use iv_elimination_compare_lt.
>
>> +/* Tries to determine if the iterator CAND is a valid pointer at USE.  */
>
> For consistency with the rest of the comments in ivopts, please use "candidate"
> instead of "iterator" in similar comments.
>
>> +static bool
>> +cand_valid_pointer_at_use_p (struct ivopts_data *data, struct iv_cand *cand,
>> +                          struct iv_use *use)
>> +{
>> +  struct iv_use *cand_iv_use = iv_use (data, cand->iv->use_id);
>
> This is rather strange.  cand->iv->use_id is never being set (all candidates are
> allocated through add_candidate_1, which creates a new iv for them from the given
> base and step).  So this just chooses a random use (the first one appearing in the
> loop), regrardless whether it has anything to do with cand or not.

Btw, patches like this should get some benchmarking for a primary platform.

Richard.

> Zdenek
>
Tom de Vries - Feb. 28, 2011, 2:22 p.m.
Hi Paolo,

On 02/28/2011 11:37 AM, Paolo Bonzini wrote:
> On 02/28/2011 10:41 AM, Tom de Vries wrote:
>> The difficulty is that this replacement is only valid under the
>> guarantee that base + n does not overflow.
> 
> I think this should be done unconditionally under 
> -funsafe-loop-optimizations.

void
f (char *base, unsigned long int i, unsigned long int n)
{
  char *p = base + i;
  do
    {
      *p = '\0';
      p++;
      i++;
   }
  while (i < n);
}

AFAIU, -funsafe-loop-optimizations allows me to assume that i does not
overflow.  Are you saying that -funsafe-loop-optimizations also implies
that p does not overflow?

- Tom
Paolo Bonzini - Feb. 28, 2011, 2:43 p.m.
On 02/28/2011 03:22 PM, Tom de Vries wrote:
>> >
>> >  I think this should be done unconditionally under
>> >  -funsafe-loop-optimizations.
> void
> f (char *base, unsigned long int i, unsigned long int n)
> {
>    char *p = base + i;
>    do
>      {
>        *p = '\0';
>        p++;
>        i++;
>     }
>    while (i<  n);
> }
>
> AFAIU, -funsafe-loop-optimizations allows me to assume that i does not
> overflow.  Are you saying that -funsafe-loop-optimizations also implies
> that p does not overflow?

I'm wondering if the C standard doesn't allow you to accept this always, 
at least if the increment to i (and p) is positive.

I'm thinking that "undefined behavior for NULL pointer access" implies 
that an object cannot be allocated across a NULL pointer.  And since 
pointer arithmetic beyond object boundaries is undefined, an overflow of 
p (which would cross a NULL pointer) is also undefined.

I'm not sure what your code's intentions would be in a more general case 
that incrementing by 1.  But, in particular, this is valid code:

void
f (char *base, unsigned long int i, unsigned long int n)
{
    char *p = base + i;
    do
      {
        *p = '\0';
        p--;
        i--;
     }
    while (i > n);
}

while this isn't (assuming sizeof(long)==sizeof(void *) of course):

void
f (char *base, unsigned long int i, unsigned long int n)
{
    char *p = base + i;
    do
      {
        *p = '\0';
        p += ~0UL;
        i += ~0UL;
     }
    while (i > n);
}

Paolo
Richard Guenther - Feb. 28, 2011, 2:59 p.m.
On Mon, Feb 28, 2011 at 3:43 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> On 02/28/2011 03:22 PM, Tom de Vries wrote:
>>>
>>> >
>>> >  I think this should be done unconditionally under
>>> >  -funsafe-loop-optimizations.
>>
>> void
>> f (char *base, unsigned long int i, unsigned long int n)
>> {
>>   char *p = base + i;
>>   do
>>     {
>>       *p = '\0';
>>       p++;
>>       i++;
>>    }
>>   while (i<  n);
>> }
>>
>> AFAIU, -funsafe-loop-optimizations allows me to assume that i does not
>> overflow.  Are you saying that -funsafe-loop-optimizations also implies
>> that p does not overflow?
>
> I'm wondering if the C standard doesn't allow you to accept this always, at
> least if the increment to i (and p) is positive.
>
> I'm thinking that "undefined behavior for NULL pointer access" implies that
> an object cannot be allocated across a NULL pointer.  And since pointer
> arithmetic beyond object boundaries is undefined, an overflow of p (which
> would cross a NULL pointer) is also undefined.

The important thing is that IVOPTs is not allowed to introduce IVs that
overflow either.  And it is known to create weird IVs starting from
-4 ...

Richard.
Tom de Vries - Feb. 28, 2011, 4:04 p.m.
On 02/28/2011 11:46 AM, Zdenek Dvorak wrote:
> Hi,
> 
>> 23-02-2011  Tom de Vries  <tom@codesourcery.com>
>>
>> 	gcc/
>> 	* tree-ssa-loop-ivopts.c (may_eliminate_iv): Fix boundary checks.
>> 	* testsuite/gcc.dg/tree-ssa/ivopts-max.c: New test.
>> 	* testsuite/gcc.dg/tree-ssa/ivopts-max-2.c: New test.
>> 	* testsuite/gcc.dg/tree-ssa/ivopts-max-3.c: New test.
> 
>> diff -u gcc/tree-ssa-loop-ivopts.c gcc/tree-ssa-loop-ivopts.c
>> --- gcc/tree-ssa-loop-ivopts.c	(revision 162585)
>> +++ gcc/tree-ssa-loop-ivopts.c	(working copy)
>> @@ -4079,8 +4079,12 @@
>>    /* If the number of iterations is constant, compare against it directly.  */
>>    if (TREE_CODE (nit) == INTEGER_CST)
>>      {
>> -      /* See cand_value_at.  */
>> -      if (stmt_after_increment (loop, cand, use->stmt))
>> +      /* An iv with TYPE_PRECISION 32 has 2^32 distinct values, and an iv_period
>> +         of 2^32 - 1.  Such an iv can handle at most a nit of 2^32 - 1.  If the
>> +         exit test is after the increment, the bound calculation will overflow.
>> +         If overflow wraps, that's not a problem.  */
>> +      if (!TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base))
>> +          && stmt_after_increment (loop, cand, use->stmt))
>>          {
>>            if (!tree_int_cst_lt (nit, period))
>>              return false;
>> @@ -4100,7 +4104,9 @@
>>        double_int period_value, max_niter;
>>  
>>        max_niter = desc->max;
>> -      if (stmt_after_increment (loop, cand, use->stmt))
>> +      /* See comment at TREE_CODE (nit) == INTEGER_CST.  */
>> +      if (!TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base))
>> +          && stmt_after_increment (loop, cand, use->stmt))
>>          max_niter = double_int_add (max_niter, double_int_one);
>>        period_value = tree_to_double_int (period);
>>        if (double_int_ucmp (max_niter, period_value) > 0)
> 
> This looks strange.  If indeed there is an off-by-one error here, it should not depend
> on whether TYPE_OVERFLOW_WRAPS (TREE_TYPE (cand->iv->base)) is true or not.  The only
> candidates that satisfy !TYPE_OVERFLOW_WRAPS are those for that we are certain that their
> computations cannot overflow (e.g., if they are the original induction variables of the loop
> computed in a !TYPE_OVERFLOW_WRAPS type).
> 

Ok, in that case I'll remove the stmt_after_increment related code in
may_eliminate_iv.

>> +static bool
>> +cand_valid_pointer_at_use_p (struct ivopts_data *data, struct iv_cand *cand,
>> +			     struct iv_use *use)
>> +{
>> +  struct iv_use *cand_iv_use = iv_use (data, cand->iv->use_id);
> 
> This is rather strange.  cand->iv->use_id is never being set (all candidates are
> allocated through add_candidate_1, which creates a new iv for them from the given
> base and step).  So this just chooses a random use (the first one appearing in the
> loop), regrardless whether it has anything to do with cand or not.
> 

I see, that needs fixing indeed. Thanks for noticing that.

- Tom
Zdenek Dvorak - Feb. 28, 2011, 6:12 p.m.
Hi,

> >>> >  I think this should be done unconditionally under
> >>> >  -funsafe-loop-optimizations.
> >>
> >> void
> >> f (char *base, unsigned long int i, unsigned long int n)
> >> {
> >>   char *p = base + i;
> >>   do
> >>     {
> >>       *p = '\0';
> >>       p++;
> >>       i++;
> >>    }
> >>   while (i<  n);
> >> }
> >>
> >> AFAIU, -funsafe-loop-optimizations allows me to assume that i does not
> >> overflow.  Are you saying that -funsafe-loop-optimizations also implies
> >> that p does not overflow?
> >
> > I'm wondering if the C standard doesn't allow you to accept this always, at
> > least if the increment to i (and p) is positive.
> >
> > I'm thinking that "undefined behavior for NULL pointer access" implies that
> > an object cannot be allocated across a NULL pointer.  And since pointer
> > arithmetic beyond object boundaries is undefined, an overflow of p (which
> > would cross a NULL pointer) is also undefined.
> 
> The important thing is that IVOPTs is not allowed to introduce IVs that
> overflow either.  And it is known to create weird IVs starting from
> -4 ...

as long as we do the arithmetics in unsigned integer type and only cast to the
pointer type the final result, C standard (plus the gcc-specific interpretation
of casts) allows us to do that.

Nevertheless, pointer arithmetics has defined behavior only within
a single memory object or at most one array element after it; which more
or less forbids overflows, when the arithmetics is performed in pointer
type,

Zdenek

Patch

Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c	(revision 0)
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts -fno-tree-vectorize -fno-tree-loop-ivcanon" } */
+
+void
+f (char *p, unsigned long int i, unsigned long int n)
+{
+  p += i;
+  do
+    {
+      *p = '\0';
+      p += 1;
+      i++;
+    }
+  while (i < n);
+}
+
+/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "PHI <p_" 1 "ivopts"} } */
+/* { dg-final { scan-tree-dump-times "p_\[0-9\]* <" 1 "ivopts"} } */
+/* { dg-final { cleanup-tree-dump "ivopts" } } */