diff mbox

[PR72817/PR73450] Fix wrong code caused by niter analyzer for NE_EXPR exit cond.

Message ID AM4PR0802MB2163D2B5FB7166CA568A623FE71E0@AM4PR0802MB2163.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Bin Cheng Aug. 11, 2016, 4:35 p.m. UTC
Hi,
I made a mistake when improving loop niter analysis for NE_EXPR exit condition.
It can results in wrong code as reported in PR72817/PR73450.  In previous patch, 
it simplifies below condition:
	    base <= FINAL ; step > 0
	    base >= FINAL ; step < 0
into:
	    base - step < FINAL ; step > 0 && base - step doesn't underflow
	    base - step > FINAL ; step < 0 && base - step doesn't overflow
But this is not enough.  Since we adjusted IV.base to "new_base = IV.base - IV.step"
in the condition, we need to check if |FINAL - new_base| is multiple of |step| for the
adjusted base.

This patch fixes the issue as well as revises the comment.  Bootstrap and test on
x86_64.  Is it OK?

Thanks,
bin

2016-08-11  Bin Cheng  <bin.cheng@arm.com>

	PR tree-optimization/72817
	PR tree-optimization/73450
	* tree-ssa-loop-niter.c (number_of_iterations_ne): Check
	multiple_of_p for adjusted IV.base.

gcc/testsuite/ChangeLog
2016-08-11  Bin Cheng  <bin.cheng@arm.com>

	PR tree-optimization/72817
	PR tree-optimization/73450
	* gcc.dg/tree-ssa/pr72817.c: New test.
	* gcc.dg/tree-ssa/pr73450.c: New test.

Comments

Richard Biener Aug. 12, 2016, 8:06 a.m. UTC | #1
On Thu, Aug 11, 2016 at 6:35 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> I made a mistake when improving loop niter analysis for NE_EXPR exit condition.
> It can results in wrong code as reported in PR72817/PR73450.  In previous patch,
> it simplifies below condition:
>             base <= FINAL ; step > 0
>             base >= FINAL ; step < 0
> into:
>             base - step < FINAL ; step > 0 && base - step doesn't underflow
>             base - step > FINAL ; step < 0 && base - step doesn't overflow
> But this is not enough.  Since we adjusted IV.base to "new_base = IV.base - IV.step"
> in the condition, we need to check if |FINAL - new_base| is multiple of |step| for the
> adjusted base.
>
> This patch fixes the issue as well as revises the comment.  Bootstrap and test on
> x86_64.  Is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-08-11  Bin Cheng  <bin.cheng@arm.com>
>
>         PR tree-optimization/72817
>         PR tree-optimization/73450
>         * tree-ssa-loop-niter.c (number_of_iterations_ne): Check
>         multiple_of_p for adjusted IV.base.
>
> gcc/testsuite/ChangeLog
> 2016-08-11  Bin Cheng  <bin.cheng@arm.com>
>
>         PR tree-optimization/72817
>         PR tree-optimization/73450
>         * gcc.dg/tree-ssa/pr72817.c: New test.
>         * gcc.dg/tree-ssa/pr73450.c: New test.
diff mbox

Patch

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index c740ffa..8851862 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -999,33 +999,36 @@  number_of_iterations_ne (struct loop *loop, tree type, affine_iv *iv,
   mpz_clear (max);
 
   /* Compute no-overflow information for the control iv.  This can be
-     proven when below two conditions hold.
-
-       1) |FINAL - base| is an exact multiple of step.
-       2) IV evaluates toward FINAL at beginning, i.e:
+     proven when below two conditions are satisfied:
 
+       1) IV evaluates toward FINAL at beginning, i.e:
 	    base <= FINAL ; step > 0
 	    base >= FINAL ; step < 0
 
-	  Note the first condition holds, the second can be then relaxed
-	  to below condition.
+       2) |FINAL - base| is an exact multiple of step.
+
+     Unfortunately, it's hard to prove above conditions after pass loop-ch
+     because loop with exit condition (IV != FINAL) usually will be guarded
+     by initial-condition (IV.base - IV.step != FINAL).  In this case, we
+     can alternatively try to prove below conditions:
+
+       1') IV evaluates toward FINAL at beginning, i.e:
+	    new_base = base - step < FINAL ; step > 0
+					     && base - step doesn't underflow
+	    new_base = base - step > FINAL ; step < 0
+					     && base - step doesn't overflow
 
-	    base - step < FINAL ; step > 0
-				  && base - step doesn't underflow
-	    base - step > FINAL ; step < 0
-				  && base - step doesn't overflow
+       2') |FINAL - new_base| is an exact multiple of step.
 
-	  The relaxation is important because after pass loop-ch, loop
-	  with exit condition (IV != FINAL) will usually be guarded by
-	  pre-condition (IV.base - IV.step != FINAL).  Please refer to
-	  PR34114 as an example.
+     Please refer to PR34114 as an example of loop-ch's impact, also refer
+     to PR72817 as an example why condition 2') is necessary.
 
-     Also note, for NE_EXPR, base equals to FINAL is a special case, in
+     Note, for NE_EXPR, base equals to FINAL is a special case, in
      which the loop exits immediately, and the iv does not overflow.  */
   if (!niter->control.no_overflow
       && (integer_onep (s) || multiple_of_p (type, c, s)))
     {
-      tree t, cond, relaxed_cond = boolean_false_node;
+      tree t, cond, new_c, relaxed_cond = boolean_false_node;
 
       if (tree_int_cst_sign_bit (iv->step))
 	{
@@ -1039,8 +1042,12 @@  number_of_iterations_ne (struct loop *loop, tree type, affine_iv *iv,
 	      if (integer_nonzerop (t))
 		{
 		  t = fold_build2 (MINUS_EXPR, type, iv->base, iv->step);
-		  relaxed_cond = fold_build2 (GT_EXPR, boolean_type_node,
-					      t, final);
+		  new_c = fold_build2 (MINUS_EXPR, niter_type,
+				       fold_convert (niter_type, t),
+				       fold_convert (niter_type, final));
+		  if (multiple_of_p (type, new_c, s))
+		    relaxed_cond = fold_build2 (GT_EXPR, boolean_type_node,
+						t, final);
 		}
 	    }
 	}
@@ -1056,8 +1063,12 @@  number_of_iterations_ne (struct loop *loop, tree type, affine_iv *iv,
 	      if (integer_nonzerop (t))
 		{
 		  t = fold_build2 (MINUS_EXPR, type, iv->base, iv->step);
-		  relaxed_cond = fold_build2 (LT_EXPR, boolean_type_node,
-					      t, final);
+		  new_c = fold_build2 (MINUS_EXPR, niter_type,
+				       fold_convert (niter_type, final),
+				       fold_convert (niter_type, t));
+		  if (multiple_of_p (type, new_c, s))
+		    relaxed_cond = fold_build2 (LT_EXPR, boolean_type_node,
+						t, final);
 		}
 	    }
 	}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr72817.c b/gcc/testsuite/gcc.dg/tree-ssa/pr72817.c
new file mode 100644
index 0000000..290e096
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr72817.c
@@ -0,0 +1,13 @@ 
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+char a;
+short b;
+
+int main ()
+{
+  for (a = 3; a != -1; a -= 5)
+    while (b)
+      ;
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr73450.c b/gcc/testsuite/gcc.dg/tree-ssa/pr73450.c
new file mode 100644
index 0000000..7dd44db
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr73450.c
@@ -0,0 +1,14 @@ 
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+int a;
+char b;
+int main() {
+  char c = 0;
+  for (; c != 3; c = c + 7) {
+    a = b & a;
+    if (a)
+      break;
+  }
+  return 0;
+}