Patchwork [2,of,2] RTL expansion for zero sign extension elimination with VRP

login
register
mail settings
Submitter Kugan
Date June 3, 2013, 2:16 a.m.
Message ID <51ABFC6E.30205@linaro.org>
Download mbox | patch
Permalink /patch/248153/
State New
Headers show

Comments

Kugan - June 3, 2013, 2:16 a.m.
Hi,

This patch  removes some of the redundant sign/zero extensions using 
value range information during RTL expansion.

When GIMPLE_ASSIGN stmts with LHS type smaller than word is expanded to 
RTL, if we can prove that RHS expression value can always fit in LHS 
type and there is no sign conversion, truncation and extension to fit 
the type is redundant. For a SUBREG_PROMOTED_VAR_P, Subreg and Zero/sign 
extensions are therefore redundant.

For example, when an expression is evaluated and it's value is assigned
to variable of type short, the generated RTL would look something like
the following.

(set (reg:SI 110)
                  (zero_extend:SI (subreg:HI (reg:SI 117) 0)))

However, if during value range propagation, if we can say for certain
that the value of the expression which is present in register 117 is
within the limits of short and there is no sign conversion, we do not
need to perform the subreg and zero_extend; instead we can generate the
following RTl.

(set (reg:SI 110)
                  (reg:SI 117)))

Same could be done for other assign statements.

This patch is based on the earlier attempt posted in 
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00610.html and addresses 
the review comments of  Richard Biener. I am post-processing the 
expand_expr_real_2 output in expand_gimple_stmt though. Reason for this 
is that I would like to process all the possible assignment stmts, not 
just  CASE_CONVERT case and/or the REDUCE_BITFIELD.

This change along with expansion improve the geomean of spec2k int 
benchmark with ref by about ~3.5% on an arm chromebook.

Tested  on X86_64 and ARM.

I would like review comments on this.

Thanks,
Kugan


+2013-06-03  Kugan Vivekanandarajah  <kuganv@linaro.org>
+
+	* gcc/dojump.c (do_compare_and_jump): generates rtl without
+	zero/sign extension if redundant.
+	* gcc/cfgexpand.c (expand_gimple_stmt_1): Likewise.
+	* gcc/gimple.c (gimple_assign_is_zero_sign_ext_redundant) : New
+	function.
+	* gcc/gimple.h (gimple_assign_is_zero_sign_ext_redundant) : New
+	function definition.
+
Eric Botcazou - July 1, 2013, 9:21 a.m.
[Sorry for the delay]

> For example, when an expression is evaluated and it's value is assigned
> to variable of type short, the generated RTL would look something like
> the following.
> 
> (set (reg:SI 110)
>                   (zero_extend:SI (subreg:HI (reg:SI 117) 0)))
> 
> However, if during value range propagation, if we can say for certain
> that the value of the expression which is present in register 117 is
> within the limits of short and there is no sign conversion, we do not
> need to perform the subreg and zero_extend; instead we can generate the
> following RTl.
> 
> (set (reg:SI 110)
>                   (reg:SI 117)))
> 
> Same could be done for other assign statements.

The idea looks interesting.  Some remarks:

> +2013-06-03  Kugan Vivekanandarajah  <kuganv@linaro.org>
> +
> +	* gcc/dojump.c (do_compare_and_jump): generates rtl without
> +	zero/sign extension if redundant.
> +	* gcc/cfgexpand.c (expand_gimple_stmt_1): Likewise.
> +	* gcc/gimple.c (gimple_assign_is_zero_sign_ext_redundant) : New
> +	function.
> +	* gcc/gimple.h (gimple_assign_is_zero_sign_ext_redundant) : New
> +	function definition.

No gcc/ prefix in entries for gcc/ChangeLog.  "Generate RTL without..."


+            /* If the value in SUBREG of temp fits that SUBREG (does not
+               overflow) and is assigned to target SUBREG of the same mode
+               without sign convertion, we can skip the SUBREG
+               and extension.  */
+            else if (promoted
+                     && gimple_assign_is_zero_sign_ext_redundant (stmt)
+                     && (GET_CODE (temp) == SUBREG)
+                     && (GET_MODE (target) == GET_MODE (temp))
+                     && (GET_MODE (SUBREG_REG (target))
+                         == GET_MODE (SUBREG_REG (temp))))
+	      emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
 	    else if (promoted)
 	      {
 		int unsignedp = SUBREG_PROMOTED_UNSIGNED_P (target);

Can we relax the strict mode equality here?  This change augments the same 
transformation applied to the RHS when it is also a SUBREG_PROMOTED_VAR_P at 
the beginning of convert_move, but the condition on the mode is less strict in 
the latter case, so maybe it can serve as a model here.


+  /* Is zero/sign extension redundant as per VRP.  */
+  bool op0_ext_redundant = false;
+  bool op1_ext_redundant = false;
+
+  /* If promoted and the value in SUBREG of op0 fits (does not overflow),
+     it is a candidate for extension elimination.  */
+  if (GET_CODE (op0) == SUBREG && SUBREG_PROMOTED_VAR_P (op0))
+    op0_ext_redundant =
+      gimple_assign_is_zero_sign_ext_redundant (SSA_NAME_DEF_STMT (treeop0));
+
+  /* If promoted and the value in SUBREG of op1 fits (does not overflow),
+     it is a candidate for extension elimination.  */
+  if (GET_CODE (op1) == SUBREG && SUBREG_PROMOTED_VAR_P (op1))
+    op1_ext_redundant =
+      gimple_assign_is_zero_sign_ext_redundant (SSA_NAME_DEF_STMT (treeop1));

Are the gimple_assign_is_zero_sign_ext_redundant checks necessary here?  
When set on a SUBREG, SUBREG_PROMOTED_VAR_P guarantees that SUBREG_REG is 
always properly extended (otherwise it's a bug) so don't you just need to 
compare SUBREG_PROMOTED_UNSIGNED_SET?  See do_jump for an existing case.


+  /* If zero/sign extension is redundant, generate RTL
+     for operands without zero/sign extension.  */
+  if ((op0_ext_redundant || TREE_CODE (treeop0) == INTEGER_CST)
+      && (op1_ext_redundant || TREE_CODE (treeop1) == INTEGER_CST))

Don't you need to be careful with the INTEGER_CSTs here?  The CONST_INTs are 
always sign-extended in RTL so 0x80 is always represented by (const_int -128) 
in QImode, whatever the signedness.  If SUBREG_PROMOTED_UNSIGNED_SET is true,
then comparing in QImode and comparing in e.g. SImode wouldn't be equivalent.

Patch

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index c187273..ce980bc 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2311,6 +2311,17 @@  expand_gimple_stmt_1 (gimple stmt)
 
 	    if (temp == target)
 	      ;
+            /* If the value in SUBREG of temp fits that SUBREG (does not
+               overflow) and is assigned to target SUBREG of the same mode
+               without sign convertion, we can skip the SUBREG
+               and extension.  */
+            else if (promoted
+                     && gimple_assign_is_zero_sign_ext_redundant (stmt)
+                     && (GET_CODE (temp) == SUBREG)
+                     && (GET_MODE (target) == GET_MODE (temp))
+                     && (GET_MODE (SUBREG_REG (target))
+                         == GET_MODE (SUBREG_REG (temp))))
+	      emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
 	    else if (promoted)
 	      {
 		int unsignedp = SUBREG_PROMOTED_UNSIGNED_P (target);
diff --git a/gcc/dojump.c b/gcc/dojump.c
index 3f04eac..cb13f3a 100644
--- a/gcc/dojump.c
+++ b/gcc/dojump.c
@@ -34,6 +34,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "ggc.h"
 #include "basic-block.h"
 #include "tm_p.h"
+#include "gimple.h"
 
 static bool prefer_and_bit_test (enum machine_mode, int);
 static void do_jump_by_parts_greater (tree, tree, int, rtx, rtx, int);
@@ -1108,6 +1109,60 @@  do_compare_and_jump (tree treeop0, tree treeop1, enum rtx_code signed_code,
 
   type = TREE_TYPE (treeop0);
   mode = TYPE_MODE (type);
+
+  /* Is zero/sign extension redundant as per VRP.  */
+  bool op0_ext_redundant = false;
+  bool op1_ext_redundant = false;
+
+  /* If promoted and the value in SUBREG of op0 fits (does not overflow),
+     it is a candidate for extension elimination.  */
+  if (GET_CODE (op0) == SUBREG && SUBREG_PROMOTED_VAR_P (op0))
+    op0_ext_redundant =
+      gimple_assign_is_zero_sign_ext_redundant (SSA_NAME_DEF_STMT (treeop0));
+
+  /* If promoted and the value in SUBREG of op1 fits (does not overflow),
+     it is a candidate for extension elimination.  */
+  if (GET_CODE (op1) == SUBREG && SUBREG_PROMOTED_VAR_P (op1))
+    op1_ext_redundant =
+      gimple_assign_is_zero_sign_ext_redundant (SSA_NAME_DEF_STMT (treeop1));
+
+  /* If zero/sign extension is redundant, generate RTL
+     for operands without zero/sign extension.  */
+  if ((op0_ext_redundant || TREE_CODE (treeop0) == INTEGER_CST)
+      && (op1_ext_redundant || TREE_CODE (treeop1) == INTEGER_CST))
+    {
+      if (TREE_CODE (treeop1) == INTEGER_CST)
+        {
+          /* First operand is constant.  */
+          rtx new_op0 = gen_reg_rtx (GET_MODE (SUBREG_REG (op0)));
+
+          emit_move_insn (new_op0, SUBREG_REG (op0));
+          op0 = new_op0;
+        }
+      else if (TREE_CODE (treeop0) == INTEGER_CST)
+        {
+          /* Other operand is constant.  */
+          rtx new_op1 = gen_reg_rtx (GET_MODE (SUBREG_REG (op1)));
+
+          emit_move_insn (new_op1, SUBREG_REG (op1));
+          op1 = new_op1;
+        }
+      /* If both the comapre registers fits SUBREG and of the same mode.  */
+      else if ((TREE_CODE (treeop0) != INTEGER_CST)
+              && (TREE_CODE (treeop1) != INTEGER_CST)
+              && (GET_MODE (op0) == GET_MODE (op1))
+              && (GET_MODE (SUBREG_REG (op0)) == GET_MODE (SUBREG_REG (op1))))
+        {
+          rtx new_op0 = gen_reg_rtx (GET_MODE (SUBREG_REG (op0)));
+          rtx new_op1 = gen_reg_rtx (GET_MODE (SUBREG_REG (op1)));
+
+          emit_move_insn (new_op0, SUBREG_REG (op0));
+          emit_move_insn (new_op1, SUBREG_REG (op1));
+          op0 = new_op0;
+          op1 = new_op1;
+        }
+    }
+
   if (TREE_CODE (treeop0) == INTEGER_CST
       && (TREE_CODE (treeop1) != INTEGER_CST
           || (GET_MODE_BITSIZE (mode)
diff --git a/gcc/gimple.c b/gcc/gimple.c
index f507419..e9499b6 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -200,6 +200,75 @@  gimple_call_reset_alias_info (gimple s)
     pt_solution_reset (gimple_call_clobber_set (s));
 }
 
+
+/* process gimple assign stmts and see if the sign/zero extensions are
+   redundant.  i.e.  if an assignment gimple statement has RHS expression
+   value that can fit in LHS type, truncation is redundant.  Zero/sign
+   extensions in this case can be removed.  */
+
+bool
+gimple_assign_is_zero_sign_ext_redundant (gimple stmt)
+{
+  double_int type_min, type_max;
+  tree int_val = NULL_TREE;
+  range_info_def *ri;
+
+  /* skip if not assign stmt.  */
+  if (!is_gimple_assign (stmt))
+    return false;
+
+  tree lhs = gimple_assign_lhs (stmt);
+
+  /* We can remove extension only for non-pointer and integral stmts.  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+      || POINTER_TYPE_P (TREE_TYPE (lhs)))
+    return false;
+
+  type_max = tree_to_double_int (TYPE_MAX_VALUE (TREE_TYPE (lhs)));
+  type_min = tree_to_double_int (TYPE_MIN_VALUE (TREE_TYPE (lhs)));
+
+  /* For binary expressions, if one of the argument is constant and is
+     larger than tne signed maximum, it can be intepreted as nagative
+     and sign extended.  This could lead to problems so return false in
+     this case.  */
+  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_binary)
+    {
+      tree rhs1 = gimple_assign_rhs1 (stmt);
+      tree rhs2 = gimple_assign_rhs2 (stmt);
+
+      if (TREE_CODE (rhs1) == INTEGER_CST)
+        int_val = rhs1;
+      else if (TREE_CODE (rhs2) == INTEGER_CST)
+        int_val = rhs2;
+
+      if (int_val != NULL_TREE)
+        {
+          tree max = TYPE_MIN_VALUE (TREE_TYPE (lhs));
+
+          /* if type is unsigned, get the max for signed equivalent.  */
+          if (!INT_CST_LT (TYPE_MIN_VALUE (TREE_TYPE (lhs)), integer_zero_node))
+            max = int_const_binop (RSHIFT_EXPR,
+                                    max, build_int_cst (TREE_TYPE (max), 1));
+          if (!INT_CST_LT (int_val, max))
+            return false;
+        }
+    }
+
+  /* Get the value range.  */
+  ri = SSA_NAME_RANGE_INFO (lhs);
+
+  /* Check if value range is valid.  */
+  if (!ri || !ri->valid || !ri->vr_range)
+    return false;
+
+  /* Value range fits type.  */
+  if (ri->max.scmp (type_max) != 1
+      && (type_min.scmp (ri->min) != 1))
+    return true;
+
+  return false;
+}
+
 /* Helper for gimple_build_call, gimple_build_call_valist,
    gimple_build_call_vec and gimple_build_call_from_tree.  Build the basic
    components of a GIMPLE_CALL statement to function FN with NARGS
diff --git a/gcc/gimple.h b/gcc/gimple.h
index b4de403..e924c5b 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -829,6 +829,7 @@  int gimple_call_flags (const_gimple);
 int gimple_call_return_flags (const_gimple);
 int gimple_call_arg_flags (const_gimple, unsigned);
 void gimple_call_reset_alias_info (gimple);
+bool gimple_assign_is_zero_sign_ext_redundant (gimple);
 bool gimple_assign_copy_p (gimple);
 bool gimple_assign_ssa_name_copy_p (gimple);
 bool gimple_assign_unary_nop_p (gimple);