diff mbox series

PR tree-optimization/102950: Improved EVRP for signed BIT_XOR_EXPR.

Message ID 005401d81749$b1d77ad0$15867070$@nextmovesoftware.com
State New
Headers show
Series PR tree-optimization/102950: Improved EVRP for signed BIT_XOR_EXPR. | expand

Commit Message

Roger Sayle Feb. 1, 2022, 8:57 a.m. UTC
This patch fixes PR tree-optimization/102950, which is a P2 regression,
by providing better range bounds for BIT_XOR_EXPR, BIT_AND_EXPR and
BIT_IOR_EXPR on signed integer types.  In general terms, any binary
bitwise operation on sign-extended or zero-extended integer types will
produce results that are themselves sign-extended or zero-extended.
More precisely, we can derive signed bounds from the number of leading
redundant sign bit copies, from the equation:
        clrsb(X op Y) >= min (clrsb (X), clrsb(Y))
and from the property that for any (signed or unsigned) range [lb, ub]
that clrsb([lb, ub]) >= min (clrsb(lb), clrsb(ub)).

These can be used to show that [-1, 0] op [-1, 0] is [-1, 0] or that
[-128, 127] op [-128, 127] is [-128, 127], even when tracking nonzero
bits would result in VARYING (as every bit can be 0 or 1).  This is
equivalent to determining the minimum type precision in which the
operation can be performed then sign extending the result.

One additional refinement is to observe that X ^ Y can never be
zero if the ranges of X and Y don't overlap, i.e. X can't be equal
to Y.

Previously, the expression "(int)(char)a ^ 233" in the PR was considered
VARYING, but with the above changes now has the range [-256, -1][1, 255],
which is sufficient to optimize away the call to foo.


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2022-02-01  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR tree-optimization/102950
	* range-op.cc (wi_optimize_signed_bitwise_op): New function to
	determine bounds of bitwise operations on signed types.
	(operator_bitwise_and::wi_fold): Call the above function.
	(operator_bitwise_or::wi_fold): Likewise.
	(operator_bitwise_xor::wi_fold): Likewise.  Additionally, the
	result can't be zero if the operands can't be equal.

gcc/testsuite/ChangeLog
	PR tree-optimization/102950
	gcc.dg/pr102950.c: New test case.
	gcc.dg/tree-ssa/evrp10.c: New test case.


Thanks in advance (and Happy Chinese New Year),
Roger
--

Comments

Richard Biener May 2, 2022, 11:49 a.m. UTC | #1
On Tue, Feb 1, 2022 at 9:57 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch fixes PR tree-optimization/102950, which is a P2 regression,
> by providing better range bounds for BIT_XOR_EXPR, BIT_AND_EXPR and
> BIT_IOR_EXPR on signed integer types.  In general terms, any binary
> bitwise operation on sign-extended or zero-extended integer types will
> produce results that are themselves sign-extended or zero-extended.
> More precisely, we can derive signed bounds from the number of leading
> redundant sign bit copies, from the equation:
>         clrsb(X op Y) >= min (clrsb (X), clrsb(Y))
> and from the property that for any (signed or unsigned) range [lb, ub]
> that clrsb([lb, ub]) >= min (clrsb(lb), clrsb(ub)).
>
> These can be used to show that [-1, 0] op [-1, 0] is [-1, 0] or that
> [-128, 127] op [-128, 127] is [-128, 127], even when tracking nonzero
> bits would result in VARYING (as every bit can be 0 or 1).  This is
> equivalent to determining the minimum type precision in which the
> operation can be performed then sign extending the result.
>
> One additional refinement is to observe that X ^ Y can never be
> zero if the ranges of X and Y don't overlap, i.e. X can't be equal
> to Y.
>
> Previously, the expression "(int)(char)a ^ 233" in the PR was considered
> VARYING, but with the above changes now has the range [-256, -1][1, 255],
> which is sufficient to optimize away the call to foo.
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?

OK for trunk.

Thanks,
Richard.

>
> 2022-02-01  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR tree-optimization/102950
>         * range-op.cc (wi_optimize_signed_bitwise_op): New function to
>         determine bounds of bitwise operations on signed types.
>         (operator_bitwise_and::wi_fold): Call the above function.
>         (operator_bitwise_or::wi_fold): Likewise.
>         (operator_bitwise_xor::wi_fold): Likewise.  Additionally, the
>         result can't be zero if the operands can't be equal.
>
> gcc/testsuite/ChangeLog
>         PR tree-optimization/102950
>         gcc.dg/pr102950.c: New test case.
>         gcc.dg/tree-ssa/evrp10.c: New test case.
>
>
> Thanks in advance (and Happy Chinese New Year),
> Roger
> --
>
diff mbox series

Patch

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 19bdf30..71264ba 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2659,6 +2659,29 @@  operator_bitwise_and::fold_range (irange &r, tree type,
 }
 
 
+// Optimize BIT_AND_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR of signed types
+// by considering the number of leading redundant sign bit copies.
+// clrsb (X op Y) = min (clrsb (X), clrsb (Y)), so for example
+// [-1, 0] op [-1, 0] is [-1, 0] (where nonzero_bits doesn't help).
+static bool
+wi_optimize_signed_bitwise_op (irange &r, tree type,
+			       const wide_int &lh_lb, const wide_int &lh_ub,
+			       const wide_int &rh_lb, const wide_int &rh_ub)
+{
+  int lh_clrsb = MIN (wi::clrsb (lh_lb), wi::clrsb (lh_ub));
+  int rh_clrsb = MIN (wi::clrsb (rh_lb), wi::clrsb (rh_ub));
+  int new_clrsb = MIN (lh_clrsb, rh_clrsb);
+  if (new_clrsb == 0)
+    return false;
+  int type_prec = TYPE_PRECISION (type);
+  int rprec = (type_prec - new_clrsb) - 1;
+  value_range_with_overflow (r, type,
+			     wi::mask (rprec, true, type_prec),
+			     wi::mask (rprec, false, type_prec));
+  return true;
+}
+
+
 // Optimize BIT_AND_EXPR and BIT_IOR_EXPR in terms of a mask if
 // possible.  Basically, see if we can optimize:
 //
@@ -2839,7 +2862,14 @@  operator_bitwise_and::wi_fold (irange &r, tree type,
     }
   // If the limits got swapped around, return varying.
   if (wi::gt_p (new_lb, new_ub,sign))
-    r.set_varying (type);
+    {
+      if (sign == SIGNED
+	  && wi_optimize_signed_bitwise_op (r, type,
+					    lh_lb, lh_ub,
+					    rh_lb, rh_ub))
+	return;
+      r.set_varying (type);
+    }
   else
     value_range_with_overflow (r, type, new_lb, new_ub);
 }
@@ -3093,6 +3123,11 @@  operator_bitwise_or::wi_fold (irange &r, tree type,
 	  || wi::lt_p (lh_ub, 0, sign)
 	  || wi::lt_p (rh_ub, 0, sign))
 	r.set_nonzero (type);
+      else if (sign == SIGNED
+	       && wi_optimize_signed_bitwise_op (r, type,
+						 lh_lb, lh_ub,
+						 rh_lb, rh_ub))
+	return;
       else
 	r.set_varying (type);
       return;
@@ -3180,8 +3215,23 @@  operator_bitwise_xor::wi_fold (irange &r, tree type,
   // is better than VARYING.
   if (wi::lt_p (new_lb, 0, sign) || wi::ge_p (new_ub, 0, sign))
     value_range_with_overflow (r, type, new_lb, new_ub);
+  else if (sign == SIGNED
+	   && wi_optimize_signed_bitwise_op (r, type,
+					     lh_lb, lh_ub,
+					     rh_lb, rh_ub))
+    ;  /* Do nothing.  */
   else
     r.set_varying (type);
+
+  /* Furthermore, XOR is non-zero if its arguments can't be equal.  */
+  if (wi::lt_p (lh_ub, rh_lb, sign)
+      || wi::lt_p (rh_ub, lh_lb, sign)
+      || wi::ne_p (result_one_bits, 0))
+    {
+      int_range<2> tmp;
+      tmp.set_nonzero (type);
+      r.intersect (tmp);
+    }
 }
 
 bool
diff --git a/gcc/testsuite/gcc.dg/pr102950.c b/gcc/testsuite/gcc.dg/pr102950.c
new file mode 100644
index 0000000..0ab23bd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr102950.c
@@ -0,0 +1,21 @@ 
+/* { dg-do link } */
+/* { dg-options "-O2" } */
+
+extern void link_error(void);
+
+static char a;
+static short d(unsigned e) {
+  char b;
+  short c;
+  a = b = e;
+  if (b)
+    return 0;
+  if (1 >= e) {
+    c = e == 0;
+    if (c)
+      link_error();
+  }
+  return 0;
+}
+int main() { d(a ^ 233); }
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/evrp10.c b/gcc/testsuite/gcc.dg/tree-ssa/evrp10.c
new file mode 100644
index 0000000..6ca00e4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/evrp10.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" }*/
+
+typedef __INT32_TYPE__ int32_t;
+
+int32_t and(int32_t x, int32_t y)
+{
+  int32_t tx = x >> 24;
+  int32_t ty = y >> 24;
+  int32_t t = tx & ty;
+  return t;
+}
+
+int32_t ior(int32_t x, int32_t y)
+{
+  int32_t tx = x >> 24;
+  int32_t ty = y >> 24;
+  int32_t t = tx | ty;
+  return t;
+}
+
+int32_t xor(int32_t x, int32_t y)
+{
+  int32_t tx = x >> 24;
+  int32_t ty = y >> 24;
+  int32_t t = tx ^ ty;
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\[-128, 127\\\]" 9 "evrp" } } */