diff mbox

[RFC,PR63586] Convert x+x+x+x into 4*x

Message ID 56D3C8D9.5020508@linaro.org
State New
Headers show

Commit Message

Kugan Vivekanandarajah Feb. 29, 2016, 4:28 a.m. UTC
> That looks better, but I think the unordered_remove will break operand sorting
> and thus you probably don't handle x + x + x + x + y + y + y + y + y +
> y + z + z + z + z
> optimally.
>
> I'd say you simply want to avoid the recursion and collect a vector of
> [start, end] pairs
> before doing any modification to the ops vector.

Hi Richard,

Is the attached patch looks better?

Thanks,
Kugan

Comments

Christophe Lyon March 2, 2016, 2:28 p.m. UTC | #1
On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
>> That looks better, but I think the unordered_remove will break operand
>> sorting
>> and thus you probably don't handle x + x + x + x + y + y + y + y + y +
>> y + z + z + z + z
>> optimally.
>>
>> I'd say you simply want to avoid the recursion and collect a vector of
>> [start, end] pairs
>> before doing any modification to the ops vector.
>
>
> Hi Richard,
>
> Is the attached patch looks better?
>

Minor comment, I've noticed typos in your updated comment:
"There should be two multiplication left in test1 (inculding one generated"
should be
"There should be two multiplications left in test1 (including one generated"



> Thanks,
> Kugan
Richard Biener April 19, 2016, 11:56 a.m. UTC | #2
On Wed, Mar 2, 2016 at 3:28 PM, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>> That looks better, but I think the unordered_remove will break operand
>>> sorting
>>> and thus you probably don't handle x + x + x + x + y + y + y + y + y +
>>> y + z + z + z + z
>>> optimally.
>>>
>>> I'd say you simply want to avoid the recursion and collect a vector of
>>> [start, end] pairs
>>> before doing any modification to the ops vector.
>>
>>
>> Hi Richard,
>>
>> Is the attached patch looks better?
>>
>
> Minor comment, I've noticed typos in your updated comment:
> "There should be two multiplication left in test1 (inculding one generated"
> should be
> "There should be two multiplications left in test1 (including one generated"

+/* Transoform repeated addition of same values into multiply with
+   constant.  */

Transform

+static void
+transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt,
vec<operand_entry *> *ops)

split the long line

op_list looks redundant - ops[start]->op gives you the desired value
already and if you
use a vec<std::pair<int, int>> you can have a more C++ish start,end pair.

+      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul");
+      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
+                                              op, build_int_cst
(TREE_TYPE(op), count));

this won't work for floating point or complex numbers - you need to use sth like
fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count));

For FP types you need to guard the transform with flag_unsafe_math_optimizations

+      gimple_set_location (mul_stmt, gimple_location (stmt));
+      gimple_set_uid (mul_stmt, gimple_uid (stmt));
+      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);

I think you do not want to set the stmt uid and you want to insert the
stmt right
after the def of op (or at the original first add - though you can't
get your hands at
that easily).  You also don't want to set the location to the last stmt of the
whole add sequence - simply leave it unset.

+      oe = operand_entry_pool.allocate ();
+      oe->op = tmp;
+      oe->rank = get_rank (op) * count;

?  Why that?  oe->rank should be get_rank (tmp).

+      oe->id = 0;

other places use next_operand_entry_id++.  I think you want to simply
use add_to_ops_vec (oe, tmp); here for all of the above.

Please return whether you did any optimization and do the
qsort of the operand vector only if you did sth.

Testcase with FP math missing.  Likewise with complex or vector math.

Thanks,
Richard.

>> Thanks,
>> Kugan
Richard Biener April 19, 2016, 11:56 a.m. UTC | #3
On Tue, Apr 19, 2016 at 1:56 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Mar 2, 2016 at 3:28 PM, Christophe Lyon
> <christophe.lyon@linaro.org> wrote:
>> On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>> That looks better, but I think the unordered_remove will break operand
>>>> sorting
>>>> and thus you probably don't handle x + x + x + x + y + y + y + y + y +
>>>> y + z + z + z + z
>>>> optimally.
>>>>
>>>> I'd say you simply want to avoid the recursion and collect a vector of
>>>> [start, end] pairs
>>>> before doing any modification to the ops vector.
>>>
>>>
>>> Hi Richard,
>>>
>>> Is the attached patch looks better?
>>>
>>
>> Minor comment, I've noticed typos in your updated comment:
>> "There should be two multiplication left in test1 (inculding one generated"
>> should be
>> "There should be two multiplications left in test1 (including one generated"
>
> +/* Transoform repeated addition of same values into multiply with
> +   constant.  */
>
> Transform
>
> +static void
> +transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt,
> vec<operand_entry *> *ops)
>
> split the long line
>
> op_list looks redundant - ops[start]->op gives you the desired value
> already and if you
> use a vec<std::pair<int, int>> you can have a more C++ish start,end pair.
>
> +      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul");
> +      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
> +                                              op, build_int_cst
> (TREE_TYPE(op), count));
>
> this won't work for floating point or complex numbers - you need to use sth like
> fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count));
>
> For FP types you need to guard the transform with flag_unsafe_math_optimizations
>
> +      gimple_set_location (mul_stmt, gimple_location (stmt));
> +      gimple_set_uid (mul_stmt, gimple_uid (stmt));
> +      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);
>
> I think you do not want to set the stmt uid and you want to insert the
> stmt right
> after the def of op (or at the original first add - though you can't
> get your hands at
> that easily).  You also don't want to set the location to the last stmt of the
> whole add sequence - simply leave it unset.
>
> +      oe = operand_entry_pool.allocate ();
> +      oe->op = tmp;
> +      oe->rank = get_rank (op) * count;
>
> ?  Why that?  oe->rank should be get_rank (tmp).
>
> +      oe->id = 0;
>
> other places use next_operand_entry_id++.  I think you want to simply
> use add_to_ops_vec (oe, tmp); here for all of the above.
>
> Please return whether you did any optimization and do the
> qsort of the operand vector only if you did sth.
>
> Testcase with FP math missing.  Likewise with complex or vector math.

Btw, does it handle associating

  x + 3 * x + x

to

  5 * x

?

Richard.

> Thanks,
> Richard.
>
>>> Thanks,
>>> Kugan
diff mbox

Patch

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
index e69de29..a002bdd 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c
@@ -0,0 +1,59 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-reassoc1" } */
+
+unsigned f1 (unsigned x)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    y = y + x;
+    return y;
+}
+
+unsigned f2 (unsigned x, unsigned z)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    return y;
+}
+
+unsigned f3 (unsigned x, unsigned z, unsigned k)
+{
+    unsigned y = x + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + k;
+    return y;
+}
+
+unsigned f4 (unsigned x, unsigned z, unsigned k)
+{
+    unsigned y = k + x;
+    y = y + x;
+    y = y + x;
+    y = y + z;
+    y = y + z;
+    y = y + z;
+    y = y + x;
+    return y;
+}
+
+unsigned f5 (unsigned x, unsigned y, unsigned z)
+{
+    return x + x + x + x + y + y + y + y + y +
+      y + z + z + z + z;
+}
+
+
+/* { dg-final { scan-tree-dump-times "\\\*" 10 "reassoc1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
index 62802d1..16ebc86 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c
@@ -19,6 +19,7 @@  unsigned int test2 (unsigned int x, unsigned int y, unsigned int z,
   return tmp1 + tmp2 + tmp3;
 }
 
-/* There should be one multiplication left in test1 and three in test2.  */
+/* There should be two multiplication left in test1 (inculding one generated
+   when converting addition to multiplication) and three in test2.  */
 
-/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */
+/* { dg-final { scan-tree-dump-times "\\\*" 5 "reassoc1" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 17eb64f..0a43faf 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -1698,6 +1698,79 @@  eliminate_redundant_comparison (enum tree_code opcode,
   return false;
 }
 
+/* Transoform repeated addition of same values into multiply with
+   constant.  */
+static void
+transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt, vec<operand_entry *> *ops)
+{
+  operand_entry *oe;
+  tree op = NULL_TREE;
+  int j;
+  int i, start = -1, end = 0, count = 0;
+  vec<int>  start_inds = vNULL;
+  vec<int>  end_inds = vNULL;
+  vec<tree>  op_list = vNULL;
+
+  /* Look for repeated operands.  */
+  FOR_EACH_VEC_ELT (*ops, i, oe)
+    {
+      if (start == -1)
+	{
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+      else if (operand_equal_p (oe->op, op, 0))
+	{
+	  count++;
+	  end = i;
+	}
+      else
+	{
+	  if (count > 1)
+	    {
+	      start_inds.safe_push (start);
+	      end_inds.safe_push (end);
+	      op_list.safe_push (op);
+	    }
+	  count = 1;
+	  op = oe->op;
+	  start = i;
+	}
+    }
+
+  if (count > 1)
+    {
+      start_inds.safe_push (start);
+      end_inds.safe_push (end);
+      op_list.safe_push (op);
+    }
+
+  for (j = start_inds.length () - 1; j >= 0; --j)
+    {
+      /* Convert repeated operand addition to multiplication.  */
+      start = start_inds[j];
+      end = end_inds[j];
+      op = op_list[j];
+      count = end - start + 1;
+      for (i = end; i >= start; --i)
+	ops->unordered_remove (i);
+      tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul");
+      gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
+					       op, build_int_cst (TREE_TYPE(op), count));
+      gimple_set_location (mul_stmt, gimple_location (stmt));
+      gimple_set_uid (mul_stmt, gimple_uid (stmt));
+      gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);
+      oe = operand_entry_pool.allocate ();
+      oe->op = tmp;
+      oe->rank = get_rank (op) * count;
+      oe->id = 0;
+      oe->count = 1;
+      ops->safe_push (oe);
+    }
+}
+
+
 /* Perform various identities and other optimizations on the list of
    operand entries, stored in OPS.  The tree code for the binary
    operation between all the operands is OPCODE.  */
@@ -4922,6 +4995,12 @@  reassociate_bb (basic_block bb)
 		  && flag_unsafe_math_optimizations)
 		powi_result = attempt_builtin_powi (stmt, &ops);
 
+	      if (rhs_code == PLUS_EXPR)
+		{
+		  transform_add_to_multiply (&gsi, stmt, &ops);
+		  ops.qsort (sort_by_operand_rank);
+		}
+
 	      /* If the operand vector is now empty, all operands were 
 		 consumed by the __builtin_powi optimization.  */
 	      if (ops.length () == 0)