folding (vec_)cond_expr in a binary operation

Message ID alpine.DEB.2.10.1309031344320.11225@stedding.saclay.inria.fr New show

Commit Message

Marc Glisse Sept. 3, 2013, 1:38 p.m.
```(attaching the latest version. I only added comments since regtesting it)

On Tue, 3 Sep 2013, Richard Biener wrote:

>>> Btw, as for the patch I don't like that you basically feed everything into
>>> fold.  Yes, I know we do that for conditions because that's quite
>>> important
>>> and nobody has re-written the condition folding as gimple pattern matcher.
>>> I doubt that this transform is important enough to warrant another
>>> exception ;)
>>
>>
>> I am not sure what you want here. When I notice the pattern (a?b:c) op d, I
>> need to check whether b op d and c op d are likely to simplify before
>> transforming it to a?(b op d):(c op d). And currently I can't see any way to
>> do that other than feeding (b op d) to fold. Even if/when we get a gimple
>> folder, we will still need to call it in about the same way.
>
> With a gimple folder you'd avoid building trees.

Ah, so the problem is the cost of building those 2 trees? It will indeed
be better when we can avoid it, but I really don't expect the cost to be
that much...

> You are testing for
> a quite complex pattern here - (a?b:c) & (d?e:f).

But it is really handled in several steps. IIRC:
(a?1:0)&x becomes a?(x&1):0.
Since x is d?1:0, x&1 becomes d?1:0.
a?(d?1:0):0 is not (yet?) simplified further.

Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0
then a?(d?-1:0):0 and finally a?d:0.

Each step can also trigger individually.

> It seems to be that
> for example
>
> +  vec c=(a>3)?o:z;
> +  vec d=(b>2)?o:z;
> +  vec e=c&d;
>
> would be better suited in the combine phase (you are interested in
> combining the comparisons).  That is, look for a [&|^] b where
> a and b are [VEC_]COND_EXPRs with the same values.

Hmm, I am already looking for a binary op which has at least one operand
which is a [VEC_]COND_EXPR, in the combine (=backward) part of
tree-ssa-forwprop.  Isn't that almost what you are suggesting?

> Heh, and it's not obvious to me with looking for a minute what this
> should be simplified to ...

(a?x:y)&(b?x:y) doesn't really simplify in general.

> (so the code and the testcase needs some
> comments on what you are trying to catch ...)

a<b vectorizes to (a<b)?1:0. If we compute & and | of conditions, we end
up with & and | of vec_cond_expr, and it seems preferable to clean that
up, so ((a<b)?1:0)&((c<d)?1:0) would become ((a<b)&(c<d))?1:0. I don't
quite produce (a<b)&(c<d) but (a<b)?(c<d):0 instead, I guess the last
step (replacing vec_cond_expr with and_expr) would be easy to do at
expansion time if it isn't already. It could also be done earlier, but
we want to be careful since fold_binary_op_with_conditional_arg had
related infinite loops in the past.

This transformation is basically a gimple version of
fold_binary_op_with_conditional_arg, so it applies more widely than just
this vector example.
```

Marc Glisse Sept. 11, 2013, 10:36 a.m. | #1
```Any other comments on this patch?

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00129.html

On Tue, 3 Sep 2013, Marc Glisse wrote:

>
> On Tue, 3 Sep 2013, Richard Biener wrote:
>
>>>> Btw, as for the patch I don't like that you basically feed everything
>>>> into
>>>> fold.  Yes, I know we do that for conditions because that's quite
>>>> important
>>>> and nobody has re-written the condition folding as gimple pattern
>>>> matcher.
>>>> I doubt that this transform is important enough to warrant another
>>>> exception ;)
>>>
>>>
>>> I am not sure what you want here. When I notice the pattern (a?b:c) op d,
>>> I
>>> need to check whether b op d and c op d are likely to simplify before
>>> transforming it to a?(b op d):(c op d). And currently I can't see any way
>>> to
>>> do that other than feeding (b op d) to fold. Even if/when we get a gimple
>>> folder, we will still need to call it in about the same way.
>>
>> With a gimple folder you'd avoid building trees.
>
> Ah, so the problem is the cost of building those 2 trees? It will indeed
> be better when we can avoid it, but I really don't expect the cost to be
> that much...
>
>> You are testing for
>> a quite complex pattern here - (a?b:c) & (d?e:f).
>
> But it is really handled in several steps. IIRC:
> (a?1:0)&x becomes a?(x&1):0.
> Since x is d?1:0, x&1 becomes d?1:0.
> a?(d?1:0):0 is not (yet?) simplified further.
>
> Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0
> then a?(d?-1:0):0 and finally a?d:0.
>
> Each step can also trigger individually.
>
>> It seems to be that
>> for example
>>
>> +  vec c=(a>3)?o:z;
>> +  vec d=(b>2)?o:z;
>> +  vec e=c&d;
>>
>> would be better suited in the combine phase (you are interested in
>> combining the comparisons).  That is, look for a [&|^] b where
>> a and b are [VEC_]COND_EXPRs with the same values.
>
> Hmm, I am already looking for a binary op which has at least one operand
> which is a [VEC_]COND_EXPR, in the combine (=backward) part of
> tree-ssa-forwprop.  Isn't that almost what you are suggesting?
>
>> Heh, and it's not obvious to me with looking for a minute what this should
>> be simplified to ...
>
> (a?x:y)&(b?x:y) doesn't really simplify in general.
>
>> (so the code and the testcase needs some
>> comments on what you are trying to catch ...)
>
> a<b vectorizes to (a<b)?1:0. If we compute & and | of conditions, we end
> up with & and | of vec_cond_expr, and it seems preferable to clean that
> up, so ((a<b)?1:0)&((c<d)?1:0) would become ((a<b)&(c<d))?1:0. I don't
> quite produce (a<b)&(c<d) but (a<b)?(c<d):0 instead, I guess the last
> step (replacing vec_cond_expr with and_expr) would be easy to do at
> expansion time if it isn't already. It could also be done earlier, but
> we want to be careful since fold_binary_op_with_conditional_arg had
> related infinite loops in the past.
>
> This transformation is basically a gimple version of
> fold_binary_op_with_conditional_arg, so it applies more widely than just
> this vector example.
```

Patch

```Index: tree-ssa-forwprop.c
===================================================================
--- tree-ssa-forwprop.c	(revision 202185)
+++ tree-ssa-forwprop.c	(working copy)
@@ -363,23 +363,20 @@  combine_cond_expr_cond (gimple stmt, enu
gcc_assert (TREE_CODE_CLASS (code) == tcc_comparison);

fold_defer_overflow_warnings ();
t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
if (!t)
{
fold_undefer_overflow_warnings (false, NULL, 0);
return NULL_TREE;
}

-  /* Require that we got a boolean type out if we put one in.  */
-  gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
-
/* Canonicalize the combined condition for use in a COND_EXPR.  */
t = canonicalize_cond_expr_cond (t);

/* Bail out if we required an invariant but didn't get one.  */
if (!t || (invariant_only && !is_gimple_min_invariant (t)))
{
fold_undefer_overflow_warnings (false, NULL, 0);
return NULL_TREE;
}

@@ -1135,20 +1132,131 @@  forward_propagate_comparison (gimple_stm

/* Remove defining statements.  */
return remove_prop_source_from_use (name);

bailout:
gsi_next (defgsi);
return false;
}

+/* Combine the binary operation defined in *GSI with one of its arguments
+   that is a condition, i.e. (A ? B : C) op D becomes A ? (B op D) : (C op D).
+   Returns 1 if there were any changes made, 2 if cfg-cleanup needs to
+   run.  Else it returns 0.  */
+
+static int
+combine_binop_with_condition (gimple_stmt_iterator *gsi)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+  enum tree_code code = gimple_assign_rhs_code (stmt);
+  tree rhs1 = gimple_assign_rhs1 (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  gimple def_stmt;
+  enum tree_code defcode;
+  bool trueok = false;
+  bool falseok = false;
+  tree true_op, false_op;
+  location_t loc = gimple_location (stmt);
+
+  if (TREE_CODE (rhs1) != SSA_NAME)
+    goto second_op;
+
+  def_stmt = get_prop_source_stmt (rhs1, true, NULL);
+  if (!def_stmt || !can_propagate_from (def_stmt))
+    goto second_op;
+
+  defcode = gimple_assign_rhs_code (def_stmt);
+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
+    goto second_op;
+
+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt),
+			     gimple_assign_rhs2 (stmt));
+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt),
+			      gimple_assign_rhs2 (stmt));
+
+  if (is_gimple_val (true_op))
+    trueok = true;
+  if (is_gimple_val (false_op))
+    falseok = true;
+  /* At least one of them has to simplify, or it isn't worth it.  */
+  if (!trueok && !falseok)
+    goto second_op;
+  if (!trueok)
+    {
+      if (!valid_gimple_rhs_p (true_op))
+	goto second_op;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
+      true_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  if (!falseok)
+    {
+      if (!valid_gimple_rhs_p (false_op))
+	goto second_op;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
+      false_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  goto finish;
+
+second_op:
+  if (TREE_CODE (rhs2) != SSA_NAME)
+    return 0;
+
+  def_stmt = get_prop_source_stmt (rhs2, true, NULL);
+  if (!def_stmt || !can_propagate_from (def_stmt))
+    return 0;
+
+  defcode = gimple_assign_rhs_code (def_stmt);
+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
+    return 0;
+
+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
+			     gimple_assign_rhs2 (def_stmt));
+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
+			      gimple_assign_rhs3 (def_stmt));
+
+  trueok = is_gimple_val (true_op);
+  falseok = is_gimple_val (false_op);
+  /* At least one of them has to simplify, or it isn't worth it.  */
+  if (!trueok && !falseok)
+    return 0;
+  if (!trueok)
+    {
+      if (!valid_gimple_rhs_p (true_op))
+	return 0;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
+      true_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+  if (!falseok)
+    {
+      if (!valid_gimple_rhs_p (false_op))
+	return 0;
+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
+      false_op = gimple_assign_lhs (g);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+    }
+finish:
+  gimple_assign_set_rhs_with_ops_1 (gsi, defcode, gimple_assign_rhs1 (def_stmt),
+				    true_op, false_op);
+
+  fold_stmt (gsi);
+  update_stmt (gsi_stmt (*gsi));
+
+  /* Remove defining statements.  */
+  return remove_prop_source_from_use (gimple_assign_lhs (def_stmt)) + 1;
+}
+
+
/* GSI_P points to a statement which performs a narrowing integral
conversion.

Look for cases like:

t = x & c;
y = (T) t;

Turn them into:

@@ -3403,21 +3511,35 @@  ssa_forward_propagate_and_combine (void)
/* Mark stmt as potentially needing revisiting.  */
gimple_set_plf (stmt, GF_PLF_1, false);

switch (gimple_code (stmt))
{
case GIMPLE_ASSIGN:
{
tree rhs1 = gimple_assign_rhs1 (stmt);
enum tree_code code = gimple_assign_rhs_code (stmt);

-		if ((code == BIT_NOT_EXPR
+		/* Something (associate_plusminus?) doesn't set "changed"
+		   properly, so we can't put this at the end with
+		   if (!changed && ...).  */
+		if (TREE_CODE_CLASS (code) == tcc_binary
+		    || TREE_CODE_CLASS (code) == tcc_comparison)
+		  {
+		    int did_something;
+		    did_something = combine_binop_with_condition (&gsi);
+		    if (did_something == 2)
+		      cfg_changed = true;
+		    changed = did_something != 0;
+		  }
+		if (changed)
+		  ;
+		else if ((code == BIT_NOT_EXPR
|| code == NEGATE_EXPR)
&& TREE_CODE (rhs1) == SSA_NAME)
changed = simplify_not_neg_expr (&gsi);
else if (code == COND_EXPR
|| code == VEC_COND_EXPR)
{
/* In this case the entire COND_EXPR is in rhs1. */
if (forward_propagate_into_cond (&gsi)
|| combine_cond_exprs (&gsi))
{
Index: fold-const.c
===================================================================
--- fold-const.c	(revision 202185)
+++ fold-const.c	(working copy)
@@ -14122,21 +14122,23 @@  fold_ternary_loc (location_t loc, enum t

/* Convert A ? 1 : 0 to simply A.  */
if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
: (integer_onep (op1)
&& !VECTOR_TYPE_P (type)))
&& integer_zerop (op2)
/* If we try to convert OP0 to our type, the
call to fold will try to move the conversion inside
a COND, which will recurse.  In that case, the COND_EXPR
is probably the best choice, so leave it alone.  */
-	  && type == TREE_TYPE (arg0))
+	  && (type == TREE_TYPE (arg0)
+	      || (in_gimple_form
+		  && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
return pedantic_non_lvalue_loc (loc, arg0);

/* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
over COND_EXPR in cases such as floating point comparisons.  */
if (integer_zerop (op1)
&& (code == VEC_COND_EXPR ? integer_all_onesp (op2)
: (integer_onep (op2)
&& !VECTOR_TYPE_P (type)))
&& truth_value_p (TREE_CODE (arg0)))
return pedantic_non_lvalue_loc (loc,
Index: testsuite/g++.dg/tree-ssa/pr57755.C
===================================================================
--- testsuite/g++.dg/tree-ssa/pr57755.C	(revision 0)
+++ testsuite/g++.dg/tree-ssa/pr57755.C	(revision 0)
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1" } */
+typedef int vec __attribute__((vector_size(4*sizeof(int))));
+
+vec f(vec a,vec b){
+  vec z=a!=a; /* zero */
+  vec o=z+1; /* one */
+  vec c=(a>3)?o:z; /* scalar equivalent: c = a>3 */
+  vec d=(b>2)?o:z; /* scalar equivalent: d = b>2 */
+  vec e=c&d; /* simplify to (a>3)?((b>2)?1:0):0 */
+  return e!=0; /* simplify to (a>3)?((b>2)?-1:0):0 then (a>3)?(b>2):0 */
+}
+
+vec g(vec a,vec b){
+  vec z=a!=a; /* zero */
+  vec c=(a<=42)?b:z; /* equivalent to (a<=42)&b */
+  return b&c; /* notice that &b is useless */
+  /* return (a<=42)?b:0 */
+}
+
+/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
+/* { dg-final { cleanup-tree-dump "forwprop1" } } */

```