diff mbox

[PR64705] Don't aggressively expand induction variable's base

Message ID CAHFci28_4e8j-qrUg7DMjA0qSqKnV3XS074zG0s1pbq3hZFppg@mail.gmail.com
State New
Headers show

Commit Message

Bin.Cheng Feb. 12, 2015, 7:28 a.m. UTC
On Wed, Feb 11, 2015 at 7:24 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Feb 11, 2015 at 9:23 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Feb 10, 2015 at 12:24 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>
>> Previously, the computation of _1174 can be replaced by _629 in bb8 in
>> DOM2 pass, while it can't after patching.  This is the only possible
>> regression that I can see on TREE level.  There is another difference
>> but not regression on TREE.  Seems real change happens on RTL pre with
>> different register uses in the input.  I failed to go further or
>> extract a test case, it's just very volatile.
>
> Well, I can see what is happening and indeed we shouldn't blame the
> patch for this.
>
>> With all of this, I guess this patch shouldn't be blamed for this.  I
>> also wonder if the PR should be fixed in this way since the patch
>> definitely is a corner case.
>
> It might not fix the real issue (whatever that is), but not making
> IVOPTs (or tree-affines) life harder is very good (I believe I have
> seen this kind of issue as well).
I guess IV's base is expanded because we want to explore more CSE opportunities?
I suspect this doesn't work very well because of two reasons: 1)
overload the tree-affine facility; 2) weak IV rewriting capacity in
GCC (for example, mess up loop variant/invariant part expressions).  I
will do experiments on this.

As for the patch itself, I collected instrumental data from GCC
bootstrap and Spec2k6 compilation.  Can confirm in most cases
(bootstrap 99.9%, spec2k6 99.1%), there is only one ssa name in IV's
step.

>
> So I do think that the patch is fine.  Just seen the known-to-work GCC 3.4
> version so it's even a regression ....

Here is the refined patch according to your comments.  It passes
bootstrap and test on x86_64.

Thanks,
bin

2015-02-12  Bin Cheng  <bin.cheng@arm.com>

    PR tree-optimization/64705
    * tree-ssa-loop-niter.h (expand_simple_operations): New parameter.
    * tree-ssa-loop-niter.c (expand_simple_operations): New parameter.
    * tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New.
    (find_bivs, find_givs_in_stmt_scev): Pass new argument to
    expand_simple_operations.

gcc/testsuite/ChangeLog
2015-02-12  Bin Cheng  <bin.cheng@arm.com>

    PR tree-optimization/64705
    * gcc.dg/tree-ssa/pr64705.c: New test.

Comments

Richard Biener Feb. 12, 2015, 1:11 p.m. UTC | #1
On Thu, Feb 12, 2015 at 8:28 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Wed, Feb 11, 2015 at 7:24 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Feb 11, 2015 at 9:23 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Tue, Feb 10, 2015 at 12:24 AM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>
>>> Previously, the computation of _1174 can be replaced by _629 in bb8 in
>>> DOM2 pass, while it can't after patching.  This is the only possible
>>> regression that I can see on TREE level.  There is another difference
>>> but not regression on TREE.  Seems real change happens on RTL pre with
>>> different register uses in the input.  I failed to go further or
>>> extract a test case, it's just very volatile.
>>
>> Well, I can see what is happening and indeed we shouldn't blame the
>> patch for this.
>>
>>> With all of this, I guess this patch shouldn't be blamed for this.  I
>>> also wonder if the PR should be fixed in this way since the patch
>>> definitely is a corner case.
>>
>> It might not fix the real issue (whatever that is), but not making
>> IVOPTs (or tree-affines) life harder is very good (I believe I have
>> seen this kind of issue as well).
> I guess IV's base is expanded because we want to explore more CSE opportunities?
> I suspect this doesn't work very well because of two reasons: 1)
> overload the tree-affine facility; 2) weak IV rewriting capacity in
> GCC (for example, mess up loop variant/invariant part expressions).  I
> will do experiments on this.
>
> As for the patch itself, I collected instrumental data from GCC
> bootstrap and Spec2k6 compilation.  Can confirm in most cases
> (bootstrap 99.9%, spec2k6 99.1%), there is only one ssa name in IV's
> step.
>
>>
>> So I do think that the patch is fine.  Just seen the known-to-work GCC 3.4
>> version so it's even a regression ....
>
> Here is the refined patch according to your comments.  It passes
> bootstrap and test on x86_64.

Ok.

Thanks,
Richard.

> Thanks,
> bin
>
> 2015-02-12  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/64705
>     * tree-ssa-loop-niter.h (expand_simple_operations): New parameter.
>     * tree-ssa-loop-niter.c (expand_simple_operations): New parameter.
>     * tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New.
>     (find_bivs, find_givs_in_stmt_scev): Pass new argument to
>     expand_simple_operations.
>
> gcc/testsuite/ChangeLog
> 2015-02-12  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/64705
>     * gcc.dg/tree-ssa/pr64705.c: New test.
diff mbox

Patch

Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c	(revision 219574)
+++ gcc/tree-ssa-loop-niter.c	(working copy)
@@ -1552,10 +1552,11 @@  simplify_replace_tree (tree expr, tree old, tree n
 }
 
 /* Expand definitions of ssa names in EXPR as long as they are simple
-   enough, and return the new expression.  */
+   enough, and return the new expression.  If STOP is specified, stop
+   expanding if EXPR equals to it.  */
 
 tree
-expand_simple_operations (tree expr)
+expand_simple_operations (tree expr, tree stop)
 {
   unsigned i, n;
   tree ret = NULL_TREE, e, ee, e1;
@@ -1575,7 +1576,7 @@  tree
       for (i = 0; i < n; i++)
 	{
 	  e = TREE_OPERAND (expr, i);
-	  ee = expand_simple_operations (e);
+	  ee = expand_simple_operations (e, stop);
 	  if (e == ee)
 	    continue;
 
@@ -1594,7 +1595,8 @@  tree
       return ret;
     }
 
-  if (TREE_CODE (expr) != SSA_NAME)
+  /* Stop if it's not ssa name or the one we don't want to expand.  */
+  if (TREE_CODE (expr) != SSA_NAME || expr == stop)
     return expr;
 
   stmt = SSA_NAME_DEF_STMT (expr);
@@ -1614,7 +1616,7 @@  tree
 	  && src->loop_father != dest->loop_father)
 	return expr;
 
-      return expand_simple_operations (e);
+      return expand_simple_operations (e, stop);
     }
   if (gimple_code (stmt) != GIMPLE_ASSIGN)
     return expr;
@@ -1634,7 +1636,7 @@  tree
 	return e;
 
       if (code == SSA_NAME)
-	return expand_simple_operations (e);
+	return expand_simple_operations (e, stop);
 
       return expr;
     }
@@ -1643,7 +1645,7 @@  tree
     {
     CASE_CONVERT:
       /* Casts are simple.  */
-      ee = expand_simple_operations (e);
+      ee = expand_simple_operations (e, stop);
       return fold_build1 (code, TREE_TYPE (expr), ee);
 
     case PLUS_EXPR:
@@ -1658,7 +1660,7 @@  tree
       if (!is_gimple_min_invariant (e1))
 	return expr;
 
-      ee = expand_simple_operations (e);
+      ee = expand_simple_operations (e, stop);
       return fold_build2 (code, TREE_TYPE (expr), ee, e1);
 
     default:
Index: gcc/tree-ssa-loop-niter.h
===================================================================
--- gcc/tree-ssa-loop-niter.h	(revision 219574)
+++ gcc/tree-ssa-loop-niter.h	(working copy)
@@ -20,7 +20,7 @@  along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_TREE_SSA_LOOP_NITER_H
 #define GCC_TREE_SSA_LOOP_NITER_H
 
-extern tree expand_simple_operations (tree);
+extern tree expand_simple_operations (tree, tree = NULL);
 extern bool loop_only_exit_p (const struct loop *, const_edge);
 extern bool number_of_iterations_exit (struct loop *, edge,
 				       struct tree_niter_desc *niter, bool,
Index: gcc/testsuite/gcc.dg/tree-ssa/pr64705.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr64705.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr64705.c	(revision 0)
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+double g;
+
+int foo(char *flags, long len, long i, long steps)
+{
+  register long step, iter;
+
+  if(*(flags + i))
+    {
+      step = i + i + 3;
+      for(iter = i + step ; iter <= len ; iter += step)
+	{
+	  steps++;
+	  *(flags + iter)=0;
+	}
+    }
+   g = 5.0*(double)steps;
+
+   return 0;
+}
+
+/* Don't expand iv {base+step, step}_loop into {base+x+y, step}_loop
+   even if "step == x + y".  */
+/* { dg-final { scan-tree-dump "base step_\[0-9\]* \\+ iter|base iter_\[0-9\]* \\+ step" "ivopts"} } */
+/* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 219574)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -1070,13 +1070,40 @@  determine_biv_step (gphi *phi)
   return integer_zerop (iv.step) ? NULL_TREE : iv.step;
 }
 
+/* Return the first non-invariant ssa var found in EXPR.  */
+
+static tree
+extract_single_var_from_expr (tree expr)
+{
+  int i, n;
+  tree tmp;
+  enum tree_code code;
+
+  if (!expr || is_gimple_min_invariant (expr))
+    return NULL;
+
+  code = TREE_CODE (expr);
+  if (IS_EXPR_CODE_CLASS (TREE_CODE_CLASS (code)))
+    {
+      n = TREE_OPERAND_LENGTH (expr);
+      for (i = 0; i < n; i++)
+	{
+	  tmp = extract_single_var_from_expr (TREE_OPERAND (expr, i));
+
+	  if (tmp)
+	    return tmp;
+	}
+    }
+  return (TREE_CODE (expr) == SSA_NAME) ? expr : NULL;
+}
+
 /* Finds basic ivs.  */
 
 static bool
 find_bivs (struct ivopts_data *data)
 {
   gphi *phi;
-  tree step, type, base;
+  tree step, type, base, stop;
   bool found = false;
   struct loop *loop = data->current_loop;
   gphi_iterator psi;
@@ -1093,7 +1120,13 @@  find_bivs (struct ivopts_data *data)
 	continue;
 
       base = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
-      base = expand_simple_operations (base);
+      /* Stop expanding iv base at the first ssa var referred by iv step.
+	 Ideally we should stop at any ssa var, because that's expensive
+	 and unusual to happen, we just do it on the first one.
+
+	 See PR64705 for the rationale.  */
+      stop = extract_single_var_from_expr (step);
+      base = expand_simple_operations (base, stop);
       if (contains_abnormal_ssa_name_p (base)
 	  || contains_abnormal_ssa_name_p (step))
 	continue;
@@ -1165,7 +1198,7 @@  mark_bivs (struct ivopts_data *data)
 static bool
 find_givs_in_stmt_scev (struct ivopts_data *data, gimple stmt, affine_iv *iv)
 {
-  tree lhs;
+  tree lhs, stop;
   struct loop *loop = data->current_loop;
 
   iv->base = NULL_TREE;
@@ -1180,13 +1213,20 @@  find_givs_in_stmt_scev (struct ivopts_data *data,
 
   if (!simple_iv (loop, loop_containing_stmt (stmt), lhs, iv, true))
     return false;
-  iv->base = expand_simple_operations (iv->base);
 
+  /* Stop expanding iv base at the first ssa var referred by iv step.
+     Ideally we should stop at any ssa var, because that's expensive
+     and unusual to happen, we just do it on the first one.
+
+     See PR64705 for the rationale.  */
+  stop = extract_single_var_from_expr (iv->step);
+  iv->base = expand_simple_operations (iv->base, stop);
+
   if (contains_abnormal_ssa_name_p (iv->base)
       || contains_abnormal_ssa_name_p (iv->step))
     return false;
 
-  /* If STMT could throw, then do not consider STMT as defining a GIV.  
+  /* If STMT could throw, then do not consider STMT as defining a GIV.
      While this will suppress optimizations, we can not safely delete this
      GIV and associated statements, even if it appears it is not used.  */
   if (stmt_could_throw_p (stmt))