Patchwork folding (vec_)cond_expr in a binary operation

login
register
mail settings
Submitter Marc Glisse
Date Sept. 3, 2013, 1:38 p.m.
Message ID <alpine.DEB.2.10.1309031344320.11225@stedding.saclay.inria.fr>
Download mbox | patch
Permalink /patch/272256/
State New
Headers show

Comments

Marc Glisse - Sept. 3, 2013, 1:38 p.m.
(attaching the latest version. I only added comments since regtesting it)

On Tue, 3 Sep 2013, Richard Biener wrote:

>>> Btw, as for the patch I don't like that you basically feed everything into
>>> fold.  Yes, I know we do that for conditions because that's quite
>>> important
>>> and nobody has re-written the condition folding as gimple pattern matcher.
>>> I doubt that this transform is important enough to warrant another
>>> exception ;)
>>
>>
>> I am not sure what you want here. When I notice the pattern (a?b:c) op d, I
>> need to check whether b op d and c op d are likely to simplify before
>> transforming it to a?(b op d):(c op d). And currently I can't see any way to
>> do that other than feeding (b op d) to fold. Even if/when we get a gimple
>> folder, we will still need to call it in about the same way.
>
> With a gimple folder you'd avoid building trees.

Ah, so the problem is the cost of building those 2 trees? It will indeed
be better when we can avoid it, but I really don't expect the cost to be
that much...

> You are testing for
> a quite complex pattern here - (a?b:c) & (d?e:f).

But it is really handled in several steps. IIRC:
(a?1:0)&x becomes a?(x&1):0.
Since x is d?1:0, x&1 becomes d?1:0.
a?(d?1:0):0 is not (yet?) simplified further.

Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0
then a?(d?-1:0):0 and finally a?d:0.

Each step can also trigger individually.

> It seems to be that
> for example
>
> +  vec c=(a>3)?o:z;
> +  vec d=(b>2)?o:z;
> +  vec e=c&d;
>
> would be better suited in the combine phase (you are interested in
> combining the comparisons).  That is, look for a [&|^] b where
> a and b are [VEC_]COND_EXPRs with the same values.

Hmm, I am already looking for a binary op which has at least one operand
which is a [VEC_]COND_EXPR, in the combine (=backward) part of
tree-ssa-forwprop.  Isn't that almost what you are suggesting?

> Heh, and it's not obvious to me with looking for a minute what this 
> should be simplified to ...

(a?x:y)&(b?x:y) doesn't really simplify in general.

> (so the code and the testcase needs some
> comments on what you are trying to catch ...)

a<b vectorizes to (a<b)?1:0. If we compute & and | of conditions, we end
up with & and | of vec_cond_expr, and it seems preferable to clean that
up, so ((a<b)?1:0)&((c<d)?1:0) would become ((a<b)&(c<d))?1:0. I don't
quite produce (a<b)&(c<d) but (a<b)?(c<d):0 instead, I guess the last
step (replacing vec_cond_expr with and_expr) would be easy to do at
expansion time if it isn't already. It could also be done earlier, but
we want to be careful since fold_binary_op_with_conditional_arg had
related infinite loops in the past.

This transformation is basically a gimple version of
fold_binary_op_with_conditional_arg, so it applies more widely than just
this vector example.
Marc Glisse - Sept. 11, 2013, 10:36 a.m.
Any other comments on this patch?

http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00129.html

On Tue, 3 Sep 2013, Marc Glisse wrote:

> (attaching the latest version. I only added comments since regtesting it)
>
> On Tue, 3 Sep 2013, Richard Biener wrote:
>
>>>> Btw, as for the patch I don't like that you basically feed everything 
>>>> into
>>>> fold.  Yes, I know we do that for conditions because that's quite
>>>> important
>>>> and nobody has re-written the condition folding as gimple pattern 
>>>> matcher.
>>>> I doubt that this transform is important enough to warrant another
>>>> exception ;)
>>> 
>>> 
>>> I am not sure what you want here. When I notice the pattern (a?b:c) op d, 
>>> I
>>> need to check whether b op d and c op d are likely to simplify before
>>> transforming it to a?(b op d):(c op d). And currently I can't see any way 
>>> to
>>> do that other than feeding (b op d) to fold. Even if/when we get a gimple
>>> folder, we will still need to call it in about the same way.
>> 
>> With a gimple folder you'd avoid building trees.
>
> Ah, so the problem is the cost of building those 2 trees? It will indeed
> be better when we can avoid it, but I really don't expect the cost to be
> that much...
>
>> You are testing for
>> a quite complex pattern here - (a?b:c) & (d?e:f).
>
> But it is really handled in several steps. IIRC:
> (a?1:0)&x becomes a?(x&1):0.
> Since x is d?1:0, x&1 becomes d?1:0.
> a?(d?1:0):0 is not (yet?) simplified further.
>
> Now if we compare to 0: a?(d?1:0):0 != 0 simplifies to a?(d?1:0)!=0:0
> then a?(d?-1:0):0 and finally a?d:0.
>
> Each step can also trigger individually.
>
>> It seems to be that
>> for example
>> 
>> +  vec c=(a>3)?o:z;
>> +  vec d=(b>2)?o:z;
>> +  vec e=c&d;
>> 
>> would be better suited in the combine phase (you are interested in
>> combining the comparisons).  That is, look for a [&|^] b where
>> a and b are [VEC_]COND_EXPRs with the same values.
>
> Hmm, I am already looking for a binary op which has at least one operand
> which is a [VEC_]COND_EXPR, in the combine (=backward) part of
> tree-ssa-forwprop.  Isn't that almost what you are suggesting?
>
>> Heh, and it's not obvious to me with looking for a minute what this should 
>> be simplified to ...
>
> (a?x:y)&(b?x:y) doesn't really simplify in general.
>
>> (so the code and the testcase needs some
>> comments on what you are trying to catch ...)
>
> a<b vectorizes to (a<b)?1:0. If we compute & and | of conditions, we end
> up with & and | of vec_cond_expr, and it seems preferable to clean that
> up, so ((a<b)?1:0)&((c<d)?1:0) would become ((a<b)&(c<d))?1:0. I don't
> quite produce (a<b)&(c<d) but (a<b)?(c<d):0 instead, I guess the last
> step (replacing vec_cond_expr with and_expr) would be easy to do at
> expansion time if it isn't already. It could also be done earlier, but
> we want to be careful since fold_binary_op_with_conditional_arg had
> related infinite loops in the past.
>
> This transformation is basically a gimple version of
> fold_binary_op_with_conditional_arg, so it applies more widely than just
> this vector example.

Patch

Index: tree-ssa-forwprop.c

===================================================================
--- tree-ssa-forwprop.c	(revision 202185)

+++ tree-ssa-forwprop.c	(working copy)

@@ -363,23 +363,20 @@  combine_cond_expr_cond (gimple stmt, enu

   gcc_assert (TREE_CODE_CLASS (code) == tcc_comparison);
 
   fold_defer_overflow_warnings ();
   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
   if (!t)
     {
       fold_undefer_overflow_warnings (false, NULL, 0);
       return NULL_TREE;
     }
 
-  /* Require that we got a boolean type out if we put one in.  */

-  gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));

-

   /* Canonicalize the combined condition for use in a COND_EXPR.  */
   t = canonicalize_cond_expr_cond (t);
 
   /* Bail out if we required an invariant but didn't get one.  */
   if (!t || (invariant_only && !is_gimple_min_invariant (t)))
     {
       fold_undefer_overflow_warnings (false, NULL, 0);
       return NULL_TREE;
     }
 
@@ -1135,20 +1132,131 @@  forward_propagate_comparison (gimple_stm

 
   /* Remove defining statements.  */
   return remove_prop_source_from_use (name);
 
 bailout:
   gsi_next (defgsi);
   return false;
 }
 
 
+/* Combine the binary operation defined in *GSI with one of its arguments

+   that is a condition, i.e. (A ? B : C) op D becomes A ? (B op D) : (C op D).

+   Returns 1 if there were any changes made, 2 if cfg-cleanup needs to

+   run.  Else it returns 0.  */

+

+static int

+combine_binop_with_condition (gimple_stmt_iterator *gsi)

+{

+  gimple stmt = gsi_stmt (*gsi);

+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));

+  enum tree_code code = gimple_assign_rhs_code (stmt);

+  tree rhs1 = gimple_assign_rhs1 (stmt);

+  tree rhs2 = gimple_assign_rhs2 (stmt);

+  gimple def_stmt;

+  enum tree_code defcode;

+  bool trueok = false;

+  bool falseok = false;

+  tree true_op, false_op;

+  location_t loc = gimple_location (stmt);

+

+  if (TREE_CODE (rhs1) != SSA_NAME)

+    goto second_op;

+

+  def_stmt = get_prop_source_stmt (rhs1, true, NULL);

+  if (!def_stmt || !can_propagate_from (def_stmt))

+    goto second_op;

+

+  defcode = gimple_assign_rhs_code (def_stmt);

+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)

+    goto second_op;

+

+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt),

+			     gimple_assign_rhs2 (stmt));

+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt),

+			      gimple_assign_rhs2 (stmt));

+

+  if (is_gimple_val (true_op))

+    trueok = true;

+  if (is_gimple_val (false_op))

+    falseok = true;

+  /* At least one of them has to simplify, or it isn't worth it.  */

+  if (!trueok && !falseok)

+    goto second_op;

+  if (!trueok)

+    {

+      if (!valid_gimple_rhs_p (true_op))

+	goto second_op;

+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);

+      true_op = gimple_assign_lhs (g);

+      gsi_insert_before (gsi, g, GSI_SAME_STMT);

+    }

+  if (!falseok)

+    {

+      if (!valid_gimple_rhs_p (false_op))

+	goto second_op;

+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);

+      false_op = gimple_assign_lhs (g);

+      gsi_insert_before (gsi, g, GSI_SAME_STMT);

+    }

+  goto finish;

+

+second_op:

+  if (TREE_CODE (rhs2) != SSA_NAME)

+    return 0;

+

+  def_stmt = get_prop_source_stmt (rhs2, true, NULL);

+  if (!def_stmt || !can_propagate_from (def_stmt))

+    return 0;

+

+  defcode = gimple_assign_rhs_code (def_stmt);

+  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)

+    return 0;

+

+  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),

+			     gimple_assign_rhs2 (def_stmt));

+  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),

+			      gimple_assign_rhs3 (def_stmt));

+

+  trueok = is_gimple_val (true_op);

+  falseok = is_gimple_val (false_op);

+  /* At least one of them has to simplify, or it isn't worth it.  */

+  if (!trueok && !falseok)

+    return 0;

+  if (!trueok)

+    {

+      if (!valid_gimple_rhs_p (true_op))

+	return 0;

+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);

+      true_op = gimple_assign_lhs (g);

+      gsi_insert_before (gsi, g, GSI_SAME_STMT);

+    }

+  if (!falseok)

+    {

+      if (!valid_gimple_rhs_p (false_op))

+	return 0;

+      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);

+      false_op = gimple_assign_lhs (g);

+      gsi_insert_before (gsi, g, GSI_SAME_STMT);

+    }

+finish:

+  gimple_assign_set_rhs_with_ops_1 (gsi, defcode, gimple_assign_rhs1 (def_stmt),

+				    true_op, false_op);

+

+  fold_stmt (gsi);

+  update_stmt (gsi_stmt (*gsi));

+

+  /* Remove defining statements.  */

+  return remove_prop_source_from_use (gimple_assign_lhs (def_stmt)) + 1;

+}

+

+

 /* GSI_P points to a statement which performs a narrowing integral
    conversion.
 
    Look for cases like:
 
      t = x & c;
      y = (T) t;
 
    Turn them into:
 
@@ -3403,21 +3511,35 @@  ssa_forward_propagate_and_combine (void)

 	  /* Mark stmt as potentially needing revisiting.  */
 	  gimple_set_plf (stmt, GF_PLF_1, false);
 
 	  switch (gimple_code (stmt))
 	    {
 	    case GIMPLE_ASSIGN:
 	      {
 		tree rhs1 = gimple_assign_rhs1 (stmt);
 		enum tree_code code = gimple_assign_rhs_code (stmt);
 
-		if ((code == BIT_NOT_EXPR

+		/* Something (associate_plusminus?) doesn't set "changed"

+		   properly, so we can't put this at the end with

+		   if (!changed && ...).  */

+		if (TREE_CODE_CLASS (code) == tcc_binary

+		    || TREE_CODE_CLASS (code) == tcc_comparison)

+		  {

+		    int did_something;

+		    did_something = combine_binop_with_condition (&gsi);

+		    if (did_something == 2)

+		      cfg_changed = true;

+		    changed = did_something != 0;

+		  }

+		if (changed)

+		  ;

+		else if ((code == BIT_NOT_EXPR

 		     || code == NEGATE_EXPR)
 		    && TREE_CODE (rhs1) == SSA_NAME)
 		  changed = simplify_not_neg_expr (&gsi);
 		else if (code == COND_EXPR
 			 || code == VEC_COND_EXPR)
 		  {
 		    /* In this case the entire COND_EXPR is in rhs1. */
 		    if (forward_propagate_into_cond (&gsi)
 			|| combine_cond_exprs (&gsi))
 		      {
Index: fold-const.c

===================================================================
--- fold-const.c	(revision 202185)

+++ fold-const.c	(working copy)

@@ -14122,21 +14122,23 @@  fold_ternary_loc (location_t loc, enum t

 
       /* Convert A ? 1 : 0 to simply A.  */
       if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
 				 : (integer_onep (op1)
 				    && !VECTOR_TYPE_P (type)))
 	  && integer_zerop (op2)
 	  /* If we try to convert OP0 to our type, the
 	     call to fold will try to move the conversion inside
 	     a COND, which will recurse.  In that case, the COND_EXPR
 	     is probably the best choice, so leave it alone.  */
-	  && type == TREE_TYPE (arg0))

+	  && (type == TREE_TYPE (arg0)

+	      || (in_gimple_form

+		  && useless_type_conversion_p (type, TREE_TYPE (arg0)))))

 	return pedantic_non_lvalue_loc (loc, arg0);
 
       /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
 	 over COND_EXPR in cases such as floating point comparisons.  */
       if (integer_zerop (op1)
 	  && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
 				    : (integer_onep (op2)
 				       && !VECTOR_TYPE_P (type)))
 	  && truth_value_p (TREE_CODE (arg0)))
 	return pedantic_non_lvalue_loc (loc,
Index: testsuite/g++.dg/tree-ssa/pr57755.C

===================================================================
--- testsuite/g++.dg/tree-ssa/pr57755.C	(revision 0)

+++ testsuite/g++.dg/tree-ssa/pr57755.C	(revision 0)

@@ -0,0 +1,24 @@ 

+/* { dg-do compile } */

+/* { dg-options "-O -fdump-tree-forwprop1" } */

+typedef int vec __attribute__((vector_size(4*sizeof(int))));

+

+vec f(vec a,vec b){

+  vec z=a!=a; /* zero */

+  vec o=z+1; /* one */

+  vec c=(a>3)?o:z; /* scalar equivalent: c = a>3 */

+  vec d=(b>2)?o:z; /* scalar equivalent: d = b>2 */

+  vec e=c&d; /* simplify to (a>3)?((b>2)?1:0):0 */

+  return e!=0; /* simplify to (a>3)?((b>2)?-1:0):0 then (a>3)?(b>2):0 */

+}

+

+vec g(vec a,vec b){

+  vec z=a!=a; /* zero */

+  vec c=(a<=42)?b:z; /* equivalent to (a<=42)&b */

+  return b&c; /* notice that &b is useless */

+  /* return (a<=42)?b:0 */

+}

+

+/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */

+/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */

+/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */

+/* { dg-final { cleanup-tree-dump "forwprop1" } } */