Patchwork [tree-optimization] : Improve reassociation pass for bitwise-operations

login
register
mail settings
Submitter Kai Tietz
Date Aug. 4, 2011, 10:45 a.m.
Message ID <CAEwic4Y1fXOtm=Sf+gpunVchkC8p+byvT9-r-ovfk_WgrE3LPA@mail.gmail.com>
Download mbox | patch
Permalink /patch/108412/
State New
Headers show

Comments

Kai Tietz - Aug. 4, 2011, 10:45 a.m.
2011/8/3 Richard Guenther <richard.guenther@gmail.com>:
> On Wed, Aug 3, 2011 at 3:32 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2011/8/3 Michael Matz <matz@suse.de>:
>>> Hi,
>>>
>>> On Tue, 2 Aug 2011, Kai Tietz wrote:
>>>
>>>> this patch improves the ability of reassociation pass to simplifiy
>>>> more complex bitwise-binary
>>>> operations with comparisons.  We break-up for this patch statements
>>>> like (X | Y) != 0 to X != 0 | Y != 0,
>>>> and (X | Y) == 0 to expanded X == 0 & Y == 0.
>>>> Additionally we expand logical-not expressions like ~(A & B) -> ~A |
>>>> ~B, ~(A & B) -> ~A | ~B, and
>>>> ~(A ^ B) -> A ^ ~B.  These expansion are just temporary for this pass
>>>> and getting later by fold
>>>> reversed again back to their original form.
>>>
>>> Implement all of this in the normal reassoc machinery that already exists.
>>> Don't implement your own walker (which btw is superlinear because you
>>> recurse into both operands).  If no simplifications were possible you have
>>> to fold back the NOTs into the shorter sequences again, which conveniently
>>> reassoc already can do for negates and PLUS/MINUS chains.
>>>
>>> Hence extend the existing support for arithmetic operations to logical
>>> operations.
>>>
>>>
>>> Ciao,
>>> Michael.
>>
>> What you mean by existing machinery for negate expression here?  This
>> machinery doen't work in this case and additionally doesn't provide
>> the opportunity to do a complete reassociation rewrite of
>> bitwise-expression-chains.
>>
>> Eg: the case (~(a | c) & (b & ~d))  would be expanded (by code in
>> patch) to ~a & ~c & b & ~d.
>> This intermediate result is good to inspect doubles, or inverted optimizations.
>> On rebuilding of tree the result gets transformed (or should) to ~(a |
>> c | d) & b.
>
> It depends on the context whether a conjunctive or a disjunctive normal
> form is what you want.  As you are mixing two operation kinds reassoc
> isn't the perfect place to deal with this (yet).
>
> You don't seem to stop when single-use chains end (which is where
> reassoc will give up) and even visit stmts multiple times that way.
> You need to at least do this "unfolding" in a lot more controlled manner.
>
> Richard.

Hello,

This revised patch takes care that we don't visit statements, which
aren't modified, recursively.
Additionally it takes care that we just do normalize statements with single-use.

ChangeLog

2011-08-04  Kai Tietz  <ktietz@redhat.com>

	* tree-ssa-reassoc.c (gimple build_and_add_sum): Add forwarder
	declaration and add support for unary-expression.
	(remove_visited_stmt_chain): Add forwarder declaration.
	(make_new_tmp_statement): New helper function.
	(expand_not_bitwise_binary): Likewise.
	(break_up_bitwise_combined_stmt): Likeiwise.
	(break_up_subtract_bb): Add call to break_up_bitwise_combined_stmt.


ChangeLog

2011-08-03  Kai Tietz  <ktietz@redhat.com>

	* gcc.dg/tree-ssa/reassoc-24.c: New test.
	* gcc.dg/tree-ssa/reassoc-25.c: New test.
	* gcc.dg/tree-ssa/reassoc-26.c: New test.
	* gcc.dg/tree-ssa/reassoc-27.c: New test.
	* gcc.dg/tree-ssa/reassoc-28.c: New test.
	* gcc.dg/tree-ssa/reassoc-29.c: New test.

Bootstrapped and regression tested for all languages (including Ada
and Obj-C++) on host x86_64-pc-linux-gnu.
Ok for apply?

Regards,
Kai

Patch

Index: gcc/gcc/tree-ssa-reassoc.c
===================================================================
--- gcc.orig/gcc/tree-ssa-reassoc.c
+++ gcc/gcc/tree-ssa-reassoc.c
@@ -41,6 +41,10 @@  along with GCC; see the file COPYING3.
 #include "cfgloop.h"
 #include "flags.h"

+/* Forwarders.  */
+static gimple build_and_add_sum (tree, tree, tree, enum tree_code);
+static void remove_visited_stmt_chain (tree);
+
 /*  This is a simple global reassociation pass.  It is, in part, based
     on the LLVM pass of the same name (They do some things more/less
     than we do, in different orders, etc).
@@ -48,7 +52,9 @@  along with GCC; see the file COPYING3.
     It consists of five steps:

     1. Breaking up subtract operations into addition + negate, where
-    it would promote the reassociation of adds.
+    it would promote the reassociation of adds.  Additionally breaking
+    up combined expression made out of boolean-typed bitwise expressions
+    for improving simplification.

     2. Left linearization of the expression trees, so that (A+B)+(C+D)
     becomes (((A+B)+C)+D), which is easier for us to rewrite later.
@@ -554,6 +560,232 @@  get_unary_op (tree name, enum tree_code
   return NULL_TREE;
 }

+/* Create a temorary register expression with type TYPE, tree code CODE, and
+   operands OP1 and OP2.  If REF_DEF is a valid gimple statement, we use its
+   location for new generated temporary.
+   Function returns left-hand-side of new generated temporary register.  */
+static tree
+make_new_tmp_statement (tree type, enum tree_code code, tree op1, tree op2,
+			gimple ref_def)
+{
+  gimple sum;
+  tree tmpvar = create_tmp_reg (type, NULL);
+  add_referenced_var (tmpvar);
+  sum = build_and_add_sum (tmpvar, op1, op2, code);
+  if (ref_def)
+    gimple_set_location (sum, gimple_location (ref_def));
+  return gimple_get_lhs (sum);
+}
+
+/* Perfrom on tree LHS with optional definition statement EPXR
+   a logic-not operation.  TYPE is of kind boolean with a 1-bit
+   precision.  */
+static tree
+expand_not_bitwise_binary (tree type, tree lhs, gimple expr)
+{
+  enum tree_code code = ERROR_MARK;
+  tree op1 = NULL, op2 = NULL;
+  gimple s1 = NULL, s2 = NULL;
+  gimple_stmt_iterator gsi;
+
+  if (TREE_CODE (lhs) == INTEGER_CST)
+    return fold_build1 (BIT_NOT_EXPR, type, lhs);
+
+  if (expr && is_gimple_assign (expr))
+    code = gimple_assign_rhs_code (expr);
+
+  /* If statement lhs isn't a single-use statement,
+     we don't want to modify it. So we can do only default-case
+     operation for it.  */
+  if (code != ERROR_MARK && !has_single_use (lhs))
+    code = ERROR_MARK;
+
+  if (TREE_CODE_CLASS (code) == tcc_comparison
+      || code == BIT_XOR_EXPR
+      || code == BIT_AND_EXPR
+      || code == BIT_IOR_EXPR)
+    {
+      op1 = gimple_assign_rhs1 (expr);
+      op2 = gimple_assign_rhs2 (expr);
+    }
+  else if (code == BIT_NOT_EXPR)
+    op1 = gimple_assign_rhs1 (expr);
+  else
+    return make_new_tmp_statement (type, BIT_NOT_EXPR, lhs, NULL_TREE, expr);
+
+
+  /* ~(~X) -> X.  */
+  if (code == BIT_NOT_EXPR)
+    return op1;
+
+  /* Invert comparison if possible, otherwise fall through to
+     default case.  */
+  else if (TREE_CODE_CLASS (code) == tcc_comparison)
+    {
+      enum tree_code ncode;
+      ncode = invert_tree_comparison (code,
+				      HONOR_NANS (TYPE_MODE (type)));
+      if (ncode != ERROR_MARK)
+	return make_new_tmp_statement (type, ncode, op1, op2, expr);
+    }
+  /* ~(A & B) -> ~A | ~B.  */
+  else if (code == BIT_AND_EXPR)
+    {
+      if (TREE_CODE (op1) == SSA_NAME)
+	s1 = SSA_NAME_DEF_STMT (op1);
+      if (TREE_CODE (op2) == SSA_NAME)
+	s2 = SSA_NAME_DEF_STMT (op2);
+      /* We have to do inversion in one step to avoid looping.  */
+      op1 = expand_not_bitwise_binary (type, op1, s1);
+      op2 = expand_not_bitwise_binary (type, op2, s2);
+      if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2))
+	return op1;
+      else if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2))
+	return op2;
+      return make_new_tmp_statement (type, BIT_IOR_EXPR, op1, op2, expr);
+    }
+  /* ~(A | B) -> ~A & ~B.  */
+  else if (code == BIT_IOR_EXPR)
+    {
+      if (TREE_CODE (op1) == SSA_NAME)
+	s1 = SSA_NAME_DEF_STMT (op1);
+      if (TREE_CODE (op2) == SSA_NAME)
+	s2 = SSA_NAME_DEF_STMT (op2);
+      op1 = expand_not_bitwise_binary (type, op1, s1);
+      op2 = expand_not_bitwise_binary (type, op2, s2);
+      if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2))
+	return op2;
+      else if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2))
+	return op1;
+      return make_new_tmp_statement (type, BIT_AND_EXPR, op1, op2, expr);
+    }
+  /* ~(A ^ B) -> ~A ^ B.  Handle here special cases for B being
+     an integer constant, or being a logical-not.  */
+  else if (code == BIT_XOR_EXPR)
+    {
+      if (TREE_CODE (op1) == SSA_NAME)
+	s1 = SSA_NAME_DEF_STMT (op1);
+      if (TREE_CODE (op2) == SSA_NAME)
+	s2 = SSA_NAME_DEF_STMT (op2);
+      if (TREE_CODE (op2) != INTEGER_CST
+	  && (!s2 || !is_gimple_assign (s2)
+	      || gimple_assign_rhs_code (s2) != BIT_NOT_EXPR))
+        op1 = expand_not_bitwise_binary (type, op1, s1);
+      else
+        op2 = expand_not_bitwise_binary (type, op2, s2);
+
+      if (TREE_CODE (op2) == INTEGER_CST && integer_zerop (op2))
+	return op1;
+
+      gsi = gsi_for_stmt (expr);
+      if (TREE_CODE (op2) == INTEGER_CST && integer_all_onesp (op2))
+	return make_new_tmp_statement (type, BIT_NOT_EXPR, op1, NULL, expr);
+      return make_new_tmp_statement (type, BIT_XOR_EXPR, op1, op2, expr);
+    }
+
+  /* Default case lhs -> ~lhs  */
+  return make_new_tmp_statement (type, BIT_NOT_EXPR, lhs, NULL_TREE, expr);
+}
+
+/* Break up statement STMT if it is a combined expressions made out of
+   bitwise operations.  Handle expansion of (A | B) !=/== 0, and ~(A op B).  */
+static bool
+break_up_bitwise_combined_stmt (gimple stmt)
+{
+  tree op1, op2;
+  gimple op1_def, op2_def;
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+  gimple_stmt_iterator gsi;
+
+  /* Don't do anything for none integral type.  */
+  if (!INTEGRAL_TYPE_P (type))
+    return false;
+
+  op1 = gimple_assign_rhs1 (stmt);
+  if (TREE_CODE (op1) != SSA_NAME)
+    return false;
+  op1_def = SSA_NAME_DEF_STMT (op1);
+  if (!op1_def
+      || !is_gimple_assign (op1_def)
+      || !has_single_use (op1))
+    return false;
+
+  if (code == NE_EXPR || code == EQ_EXPR)
+    {
+      tree zero = gimple_assign_rhs2 (stmt);
+      tree old_op = op1;
+
+      /* Check that right-hand operand has integral type and
+         not a boolean.  And see if it is constant zero valued.  */
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (zero))
+	  || TREE_CODE (TREE_TYPE (zero)) == BOOLEAN_TYPE
+	  || !integer_zerop (zero))
+	return false;
+      /* Is left-hand operand bitwise-or expression?  */
+      if (gimple_assign_rhs_code (op1_def) != BIT_IOR_EXPR)
+	return false;
+      op1 = make_new_tmp_statement (type, code,
+				   gimple_assign_rhs1 (op1_def), zero,
+				   stmt);
+      op2 = make_new_tmp_statement (type, code,
+				   gimple_assign_rhs2 (op1_def), zero,
+				   stmt);
+
+      gsi = gsi_for_stmt (stmt);
+      gimple_assign_set_rhs_with_ops (&gsi, (code == NE_EXPR ? BIT_IOR_EXPR
+							     : BIT_AND_EXPR),
+				      op1, op2);
+      update_stmt (gsi_stmt (gsi));
+      remove_visited_stmt_chain (old_op);
+      return true;
+    }
+  /* Handle expansion for expansion of ~X.  */
+  else if (code == BIT_NOT_EXPR)
+    {
+      tree old_op;
+      enum tree_code inner_code = gimple_assign_rhs_code (op1_def);
+      if (inner_code != BIT_AND_EXPR && inner_code != BIT_IOR_EXPR
+          && inner_code != BIT_XOR_EXPR)
+	return false;
+      old_op = op1;
+      op1 = gimple_assign_rhs1 (op1_def);
+      op2 = gimple_assign_rhs2 (op1_def);
+      op1_def = op2_def = NULL;
+      if (TREE_CODE (op1) != SSA_NAME
+	|| (op1_def = SSA_NAME_DEF_STMT (op1)) == NULL
+	|| !is_gimple_assign (op1_def))
+	op1_def = NULL;
+      if (TREE_CODE (op2) != SSA_NAME
+	|| (op2_def = SSA_NAME_DEF_STMT (op2)) == NULL
+	|| !is_gimple_assign (op2_def))
+	op2_def = NULL;
+      if (inner_code == BIT_XOR_EXPR)
+	{
+          if (TREE_CODE (op2) != INTEGER_CST
+	      && (!op2_def || !is_gimple_assign (op2_def)
+	          || !has_single_use (op2)
+	          || gimple_assign_rhs_code (op2_def) != BIT_NOT_EXPR))
+            op1 = expand_not_bitwise_binary (type, op1, op1_def);
+          else
+            op2 = expand_not_bitwise_binary (type, op2, op2_def);
+	}
+      else
+	{
+          op1 = expand_not_bitwise_binary (type, op1, op1_def);
+          op2 = expand_not_bitwise_binary (type, op2, op2_def);
+	  inner_code = (inner_code == BIT_AND_EXPR ? BIT_IOR_EXPR
+						   : BIT_AND_EXPR);
+	}
+      gsi = gsi_for_stmt (stmt);
+      gimple_assign_set_rhs_with_ops (&gsi, inner_code, op1, op2);
+      update_stmt (gsi_stmt (gsi));
+      remove_visited_stmt_chain (old_op);
+      return true;
+    }
+  return false;
+}
+
 /* If CURR and LAST are a pair of ops that OPCODE allows us to
    eliminate through equivalences, do so, remove them from OPS, and
    return true.  Otherwise, return false.  */
@@ -1015,8 +1247,8 @@  zero_one_operation (tree *def, enum tree
   while (1);
 }

-/* Builds one statement performing OP1 OPCODE OP2 using TMPVAR for
-   the result.  Places the statement after the definition of either
+/* Builds one statement performing OP1 OPCODE OP2, OPCODE op1 using TMPVAR
+   for the result.  Places the statement after the definition of either
    OP1 or OP2.  Returns the new statement.  */

 static gimple
@@ -1035,7 +1267,7 @@  build_and_add_sum (tree tmpvar, tree op1
   /* Find an insertion place and insert.  */
   if (TREE_CODE (op1) == SSA_NAME)
     op1def = SSA_NAME_DEF_STMT (op1);
-  if (TREE_CODE (op2) == SSA_NAME)
+  if (op2 && TREE_CODE (op2) == SSA_NAME)
     op2def = SSA_NAME_DEF_STMT (op2);
   if ((!op1def || gimple_nop_p (op1def))
       && (!op2def || gimple_nop_p (op2def)))
@@ -2133,6 +2365,17 @@  can_reassociate_p (tree op)
    we want to break up k = t - q, but we won't until we've transformed q
    = b - r, which won't be broken up until we transform b = c - d.

+   Break up comparison !=/== 0 operations of bitwise-or operations for
+   being able to optimize within combined conditions.
+   (A | B) != 0 -> (A != 0) || (B != 0)
+   (A | B) == 0 -> (A == 0) && (B != 0)
+
+   Break up logical-not expressions of bitwise boolean-typed and/or/xor
+   operations for being able to optimize wihin combined conditions.
+   ~(A | B) -> ~A | ~B
+   ~(A & B) -> ~A & ~B
+   ~(A ^ B) -> A ^ ~B (special case if B is a constant)
+
    En passant, clear the GIMPLE visited flag on every statement.  */

 static void
@@ -2141,21 +2384,32 @@  break_up_subtract_bb (basic_block bb)
   gimple_stmt_iterator gsi;
   basic_block son;

-  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
     {
       gimple stmt = gsi_stmt (gsi);
       gimple_set_visited (stmt, false);
-
-      if (!is_gimple_assign (stmt)
-	  || !can_reassociate_p (gimple_assign_lhs (stmt)))
+      if (!is_gimple_assign (stmt))
+        {
+	  gsi_next (&gsi);
+	  continue;
+        }
+      if (break_up_bitwise_combined_stmt (stmt))
 	continue;
+      if (!can_reassociate_p (gimple_assign_lhs (stmt)))
+	{
+	  gsi_next (&gsi);
+	  continue;
+	}

       /* Look for simple gimple subtract operations.  */
       if (gimple_assign_rhs_code (stmt) == MINUS_EXPR)
 	{
 	  if (!can_reassociate_p (gimple_assign_rhs1 (stmt))
 	      || !can_reassociate_p (gimple_assign_rhs2 (stmt)))
-	    continue;
+	    {
+	      gsi_next (&gsi);
+	      continue;
+	    }

 	  /* Check for a subtract used only in an addition.  If this
 	     is the case, transform it into add of a negate for better
@@ -2167,6 +2421,7 @@  break_up_subtract_bb (basic_block bb)
       else if (gimple_assign_rhs_code (stmt) == NEGATE_EXPR
 	       && can_reassociate_p (gimple_assign_rhs1 (stmt)))
 	VEC_safe_push (tree, heap, plus_negates, gimple_assign_lhs (stmt));
+      gsi_next (&gsi);
     }
   for (son = first_dom_son (CDI_DOMINATORS, bb);
        son;
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = a != 0 & c != 0 & b != 0;
+  int r2 = a == 0 | b != 0 | d == 0;
+  return (r1 != 0 & r2 == 0);
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = a != 0 & c != 0 & b != 0;
+  int r2 = a == 0 | b != 0 | d == 0;
+  return (r1 == 0 | r2 != 0);
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-26.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = (a | b | c) == 0;
+  int r2 = (a | d | c) != 0 | b == 0;
+  return (r1 == 0 | r2 != 0);
+}
+
+/* { dg-final { scan-tree-dump-times "return 1" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-27.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-27.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = a & ~c & b;
+  int r2 = ~a | b | ~d;
+  return (r1 & ~r2);
+}
+
+/* { dg-final { scan-tree-dump-times "return 0" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-28.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-28.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = a & ~c & b;
+  int r2 = ~a | b | ~d;
+  return (~r1 | r2);
+}
+
+/* { dg-final { scan-tree-dump-times "return -1" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-29.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/reassoc-29.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ffast-math" } */
+
+int foo (int a, int b, int c, int d)
+{
+  int r1 = ~(a | b | c);
+  int r2 = (a | d | c) | ~b;
+  return (~r1 | r2);
+}
+
+/* { dg-final { scan-tree-dump-times "return -1" 1 "optimized"} } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */