Patchwork folding (vec_)cond_expr in a binary operation

Submitter Marc Glisse
Date July 6, 2013, 7:42 p.m.
Message ID <alpine.DEB.2.02.1307061712010.5572@stedding.saclay.inria.fr>
Permalink /patch/257299/
State New

Comments

Marc Glisse - July 6, 2013, 7:42 p.m.
Hello,

the first attached patch does not bootstrap and has at least 2 main 
issues. The second patch does pass bootstrap+testsuite, but I liked the 
first more...

First, the fold-const bit causes an assertion failure (building libjava) 
in combine_cond_expr_cond, which calls:

   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);

and then checks:

   /* Require that we got a boolean type out if we put one in.  */
   gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));

which makes sense... Note that the 2 relevant types are:

(gdb) call debug_tree((tree)0x7ffff5e78930)
  <integer_type 0x7ffff5e78930 jboolean asm_written public unsigned type_3 QI
     size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8 bitsizetype> constant 8>
     unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000 sizetype> constant 1>
     align 8 symtab -169408800 alias set -1 canonical type 0x7ffff6635c78 precision 1 min <integer_cst 0x7ffff66321e0 0> max <integer_cst 0x7ffff6632200 1>
     pointer_to_this <pointer_type 0x7ffff5a1e540> chain <integer_type 0x7ffff6635dc8>>
(gdb) call debug_tree(type)
  <boolean_type 0x7ffff64f69d8 bool sizes-gimplified asm_written public unsigned QI
     size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8 bitsizetype> constant 8>
     unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000 sizetype> constant 1>
     align 8 symtab -170348640 alias set 19 canonical type 0x7ffff64f69d8 precision 1 min <integer_cst 0x7ffff64f92a0 0> max <integer_cst 0x7ffff64f92e0 1>
     pointer_to_this <pointer_type 0x7ffff5912dc8>>

I need to relax the conditions on the vec?-1:0 optimization because with 
vectors we often end up with several types that are equivalent. Can you 
think of a good way to do it? In the second patch I limit the 
transformation to vectors and hope it doesn't cause as much trouble 
there...



Second, the way the forwprop transformation is done, it can be necessary 
to run the forwprop pass several times in a row, which is not nice. It 
takes:

stmt_cond
stmt_binop

and produces:

stmt_folded
stmt_cond2

But one of the arguments of stmt_folded could be an ssa_name defined by a 
cond_expr, which could now be propagated into stmt_folded (there may be 
other new possible transformations). However, that cond_expr has already 
been visited and won't be again. The combine part of the pass uses a PLF 
to revisit statements, but the forwprop part doesn't have any specific 
mechanism.

The workarounds I can think of are:
- manually check if some of the arguments of stmt_folded are cond_expr and 
recursively call the function on them;
- move the transformation to the combine part of the pass;
- setup some system to revisit all earlier statements?

I went with the second option in the second patch, but this limitation of 
forward propagation seems strange to me.


(s/combine_binop_with_condition/forward_propagate_condition/ for the first 
patch)

2013-07-06  Marc Glisse  <marc.glisse@inria.fr>

 	PR tree-optimization/57755
gcc/
 	* tree-ssa-forwprop.c (combine_binop_with_condition): New function.
 	(ssa_forward_propagate_and_combine): Call it.
 	* fold-const.c (fold_ternary_loc): In gimple form, don't check for
 	type equality.

gcc/testsuite/
 	* g++.dg/tree-ssa/pr57755.C: New testcase.

(this message does not supersede 
http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01624.html )
Richard Guenther - Aug. 30, 2013, 9:36 a.m.
On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> the first attached patch does not bootstrap and has at least 2 main issues.
> The second patch does pass bootstrap+testsuite, but I liked the first
> more...
>
> First, the fold-const bit causes an assertion failure (building libjava) in
> combine_cond_expr_cond, which calls:
>
>   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
>
> and then checks:
>
>   /* Require that we got a boolean type out if we put one in.  */
>   gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
>
> which makes sense... Note that the 2 relevant types are:

well, the check is probably old and can be relaxed to also allow all
compatible types.

> (gdb) call debug_tree((tree)0x7ffff5e78930)
>  <integer_type 0x7ffff5e78930 jboolean asm_written public unsigned type_3 QI
>     size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8
> bitsizetype> constant 8>
>     unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000
> sizetype> constant 1>
>     align 8 symtab -169408800 alias set -1 canonical type 0x7ffff6635c78
> precision 1 min <integer_cst 0x7ffff66321e0 0> max <integer_cst
> 0x7ffff6632200 1>
>     pointer_to_this <pointer_type 0x7ffff5a1e540> chain <integer_type
> 0x7ffff6635dc8>>
> (gdb) call debug_tree(type)
>  <boolean_type 0x7ffff64f69d8 bool sizes-gimplified asm_written public
> unsigned QI
>     size <integer_cst 0x7ffff64dcfa0 type <integer_type 0x7ffff64f60a8
> bitsizetype> constant 8>
>     unit size <integer_cst 0x7ffff64dcfc0 type <integer_type 0x7ffff64f6000
> sizetype> constant 1>
>     align 8 symtab -170348640 alias set 19 canonical type 0x7ffff64f69d8
> precision 1 min <integer_cst 0x7ffff64f92a0 0> max <integer_cst
> 0x7ffff64f92e0 1>
>     pointer_to_this <pointer_type 0x7ffff5912dc8>>
>
> I need to relax the conditions on the vec?-1:0 optimization because with
> vectors we often end up with several types that are equivalent. Can you
> think of a good way to do it? In the second patch I limit the transformation
> to vectors and hope it doesn't cause as much trouble there...
>
>
>
> Second, the way the forwprop transformation is done, it can be necessary to
> run the forwprop pass several times in a row, which is not nice. It takes:
>
> stmt_cond
> stmt_binop
>
> and produces:
>
> stmt_folded
> stmt_cond2
>
> But one of the arguments of stmt_folded could be an ssa_name defined by a
> cond_expr, which could now be propagated into stmt_folded (there may be
> other new possible transformations). However, that cond_expr has already
> been visited and won't be again. The combine part of the pass uses a PLF to
> revisit statements, but the forwprop part doesn't have any specific
> mechanism.

forwprop is designed to re-process stmts it has folded to catch cascading
effects.  Now, as it has both a forward and a backward run, you don't catch
2nd-order effects without iterating them.  On my TODO list is to do only one kind,
either forward or backward propagation.

> The workarounds I can think of are:
> - manually check if some of the arguments of stmt_folded are cond_expr and
> recursively call the function on them;
> - move the transformation to the combine part of the pass;

this is it.

> - setup some system to revisit all earlier statements?
>
> I went with the second option in the second patch, but this limitation of
> forward propagation seems strange to me.
>
>
> (s/combine_binop_with_condition/forward_propagate_condition/ for the first
> patch)

Btw, as for the patch I don't like that you basically feed everything into
fold.  Yes, I know we do that for conditions because that's quite important
and nobody has re-written the condition folding as a gimple pattern matcher.
I doubt that this transform is important enough to warrant another exception ;)

Richard.

> 2013-07-06  Marc Glisse  <marc.glisse@inria.fr>
>
>         PR tree-optimization/57755
> gcc/
>         * tree-ssa-forwprop.c (combine_binop_with_condition): New function.
>         (ssa_forward_propagate_and_combine): Call it.
>         * fold-const.c (fold_ternary_loc): In gimple form, don't check for
>         type equality.
>
> gcc/testsuite/
>         * g++.dg/tree-ssa/pr57755.C: New testcase.
>
> (this message does not supersede
> http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01624.html )
>
> --
> Marc Glisse
> Index: fold-const.c
> ===================================================================
> --- fold-const.c        (revision 200736)
> +++ fold-const.c        (working copy)
> @@ -14124,21 +14124,23 @@ fold_ternary_loc (location_t loc, enum t
>
>        /* Convert A ? 1 : 0 to simply A.  */
>        if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
>                                  : (integer_onep (op1)
>                                     && !VECTOR_TYPE_P (type)))
>           && integer_zerop (op2)
>           /* If we try to convert OP0 to our type, the
>              call to fold will try to move the conversion inside
>              a COND, which will recurse.  In that case, the COND_EXPR
>              is probably the best choice, so leave it alone.  */
> -         && type == TREE_TYPE (arg0))
> +         && (type == TREE_TYPE (arg0)
> +             || (in_gimple_form
> +                 && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
>         return pedantic_non_lvalue_loc (loc, arg0);
>
>        /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
>          over COND_EXPR in cases such as floating point comparisons.  */
>        if (integer_zerop (op1)
>           && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
>                                     : (integer_onep (op2)
>                                        && !VECTOR_TYPE_P (type)))
>           && truth_value_p (TREE_CODE (arg0)))
>         return pedantic_non_lvalue_loc (loc,
> Index: tree-ssa-forwprop.c
> ===================================================================
> --- tree-ssa-forwprop.c (revision 200736)
> +++ tree-ssa-forwprop.c (working copy)
> @@ -1135,20 +1135,135 @@ forward_propagate_comparison (gimple_stm
>
>    /* Remove defining statements.  */
>    return remove_prop_source_from_use (name);
>
>  bailout:
>    gsi_next (defgsi);
>    return false;
>  }
>
>
> +/* Forward propagate the condition defined in *DEFGSI to uses in
> +   binary operations.
> +   Returns true if stmt is now unused.  Advance DEFGSI to the next
> +   statement.  */
> +
> +static bool
> +forward_propagate_condition (gimple_stmt_iterator *defgsi)
> +{
> +  gimple stmt = gsi_stmt (*defgsi);
> +  tree name = gimple_assign_lhs (stmt);
> +  enum tree_code defcode = gimple_assign_rhs_code (stmt);
> +  gimple_stmt_iterator gsi;
> +  enum tree_code code;
> +  tree lhs, type;
> +  gimple use_stmt;
> +
> +  /* Don't propagate ssa names that occur in abnormal phis.  */
> +  if ((TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
> +       && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs1 (stmt)))
> +      || (TREE_CODE (gimple_assign_rhs2 (stmt)) == SSA_NAME
> +        && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs2 (stmt)))
> +      || (TREE_CODE (gimple_assign_rhs3 (stmt)) == SSA_NAME
> +        && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs3 (stmt))))
> +    goto bailout;
> +
> +  /* Do not un-cse, but propagate through copies.  */
> +  use_stmt = get_prop_dest_stmt (name, &name);
> +  if (!use_stmt
> +      || !is_gimple_assign (use_stmt)
> +      || gimple_has_side_effects (stmt)
> +      || gimple_has_side_effects (use_stmt))
> +    goto bailout;
> +
> +  gsi = gsi_for_stmt (use_stmt);
> +  code = gimple_assign_rhs_code (use_stmt);
> +  lhs = gimple_assign_lhs (use_stmt);
> +  type = TREE_TYPE (lhs);
> +
> +  if (TREE_CODE_CLASS (code) == tcc_binary
> +      || TREE_CODE_CLASS (code) == tcc_comparison)
> +    {
> +      bool cond_is_left = false;
> +      bool trueok = false;
> +      bool falseok = false;
> +      if (name == gimple_assign_rhs1 (use_stmt))
> +       cond_is_left = true;
> +      else
> +       gcc_assert (name == gimple_assign_rhs2 (use_stmt));
> +      tree true_op, false_op;
> +      if (cond_is_left)
> +       {
> +         true_op = fold_build2_loc (gimple_location (stmt), code, type,
> +                                    gimple_assign_rhs2 (stmt),
> +                                    gimple_assign_rhs2 (use_stmt));
> +         false_op = fold_build2_loc (gimple_location (stmt), code, type,
> +                                     gimple_assign_rhs3 (stmt),
> +                                     gimple_assign_rhs2 (use_stmt));
> +       }
> +      else
> +       {
> +         true_op = fold_build2_loc (gimple_location (stmt), code, type,
> +                                    gimple_assign_rhs1 (use_stmt),
> +                                    gimple_assign_rhs2 (stmt));
> +         false_op = fold_build2_loc (gimple_location (stmt), code, type,
> +                                     gimple_assign_rhs1 (use_stmt),
> +                                     gimple_assign_rhs3 (stmt));
> +       }
> +
> +      if (is_gimple_val (true_op))
> +       trueok = true;
> +      if (is_gimple_val (false_op))
> +       falseok = true;
> +      /* At least one of them has to simplify, or it isn't worth it.  */
> +      if (!trueok && !falseok)
> +       goto bailout;
> +      if (!trueok)
> +       {
> +         if (!valid_gimple_rhs_p (true_op))
> +           goto bailout;
> +         gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
> +         true_op = gimple_assign_lhs (g);
> +         gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +       }
> +      if (!falseok)
> +       {
> +         if (!valid_gimple_rhs_p (false_op))
> +           goto bailout;
> +         gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
> +         false_op = gimple_assign_lhs (g);
> +         gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +       }
> +
> +      gimple_assign_set_rhs_with_ops_1 (&gsi, defcode,
> +                                       gimple_assign_rhs1 (stmt),
> +                                       true_op, false_op);
> +    }
> +  else
> +    goto bailout;
> +
> +  fold_stmt (&gsi);
> +  update_stmt (gsi_stmt (gsi));
> +
> +  /* When we remove stmt now the iterator defgsi goes off it's current
> +     sequence, hence advance it now.  */
> +  gsi_next (defgsi);
> +
> +  /* Remove defining statements.  */
> +  return remove_prop_source_from_use (name);
> +
> +bailout:
> +  gsi_next (defgsi);
> +  return false;
> +}
> +
> +
>  /* GSI_P points to a statement which performs a narrowing integral
>     conversion.
>
>     Look for cases like:
>
>       t = x & c;
>       y = (T) t;
>
>     Turn them into:
>
> @@ -3382,20 +3497,25 @@ ssa_forward_propagate_and_combine (void)
>                     gsi_next (&gsi);
>                 }
>               else
>                 gsi_next (&gsi);
>             }
>           else if (TREE_CODE_CLASS (code) == tcc_comparison)
>             {
>               if (forward_propagate_comparison (&gsi))
>                 cfg_changed = true;
>             }
> +         else if (code == COND_EXPR || code == VEC_COND_EXPR)
> +           {
> +             if (forward_propagate_condition (&gsi))
> +               cfg_changed = true;
> +           }
>           else
>             gsi_next (&gsi);
>         }
>
>        /* Combine stmts with the stmts defining their operands.
>          Note we update GSI within the loop as necessary.  */
>        for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
>         {
>           gimple stmt = gsi_stmt (gsi);
>           bool changed = false;
> Index: testsuite/g++.dg/tree-ssa/pr57755.c
> ===================================================================
> --- testsuite/g++.dg/tree-ssa/pr57755.c (revision 0)
> +++ testsuite/g++.dg/tree-ssa/pr57755.c (revision 0)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1" } */
> +typedef int vec __attribute__((vector_size(4*sizeof(int))));
> +
> +vec f(vec a,vec b){
> +  vec z=a!=a;
> +  vec o=z+1;
> +  vec c=(a>3)?o:z;
> +  vec d=(b>2)?o:z;
> +  vec e=c&d;
> +  return e!=0;
> +}
> +
> +vec g(vec a,vec b){
> +  vec z=a!=a;
> +  vec c=(a<=42)?b:z;
> +  return b&c;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
> +/* { dg-final { cleanup-tree-dump "forwprop1" } } */
>
> Property changes on: testsuite/g++.dg/tree-ssa/pr57755.c
> ___________________________________________________________________
> Added: svn:eol-style
>    + native
> Added: svn:keywords
>    + Author Date Id Revision URL
>
>
> Index: testsuite/g++.dg/tree-ssa/pr57755.C
> ===================================================================
> --- testsuite/g++.dg/tree-ssa/pr57755.C (revision 0)
> +++ testsuite/g++.dg/tree-ssa/pr57755.C (revision 0)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-forwprop1" } */
> +typedef int vec __attribute__((vector_size(4*sizeof(int))));
> +
> +vec f(vec a,vec b){
> +  vec z=a!=a;
> +  vec o=z+1;
> +  vec c=(a>3)?o:z;
> +  vec d=(b>2)?o:z;
> +  vec e=c&d;
> +  return e!=0;
> +}
> +
> +vec g(vec a,vec b){
> +  vec z=a!=a;
> +  vec c=(a<=42)?b:z;
> +  return b&c;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
> +/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
> +/* { dg-final { cleanup-tree-dump "forwprop1" } } */
>
> Property changes on: testsuite/g++.dg/tree-ssa/pr57755.C
> ___________________________________________________________________
> Added: svn:keywords
>    + Author Date Id Revision URL
> Added: svn:eol-style
>    + native
>
> Index: fold-const.c
> ===================================================================
> --- fold-const.c        (revision 200736)
> +++ fold-const.c        (working copy)
> @@ -14124,21 +14124,23 @@ fold_ternary_loc (location_t loc, enum t
>
>        /* Convert A ? 1 : 0 to simply A.  */
>        if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
>                                  : (integer_onep (op1)
>                                     && !VECTOR_TYPE_P (type)))
>           && integer_zerop (op2)
>           /* If we try to convert OP0 to our type, the
>              call to fold will try to move the conversion inside
>              a COND, which will recurse.  In that case, the COND_EXPR
>              is probably the best choice, so leave it alone.  */
> -         && type == TREE_TYPE (arg0))
> +         && (type == TREE_TYPE (arg0)
> +             || (in_gimple_form && code == VEC_COND_EXPR
> +                 && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
>         return pedantic_non_lvalue_loc (loc, arg0);
>
>        /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
>          over COND_EXPR in cases such as floating point comparisons.  */
>        if (integer_zerop (op1)
>           && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
>                                     : (integer_onep (op2)
>                                        && !VECTOR_TYPE_P (type)))
>           && truth_value_p (TREE_CODE (arg0)))
>         return pedantic_non_lvalue_loc (loc,
> Index: tree-ssa-forwprop.c
> ===================================================================
> --- tree-ssa-forwprop.c (revision 200736)
> +++ tree-ssa-forwprop.c (working copy)
> @@ -1135,20 +1135,131 @@ forward_propagate_comparison (gimple_stm
>
>    /* Remove defining statements.  */
>    return remove_prop_source_from_use (name);
>
>  bailout:
>    gsi_next (defgsi);
>    return false;
>  }
>
>
> +/* Combine the binary operation defined in *GSI with one of its arguments
> +   that is a condition.
> +   Returns 1 if there were any changes made, 2 if cfg-cleanup needs to
> +   run.  Else it returns 0.  */
> +
> +static int
> +combine_binop_with_condition (gimple_stmt_iterator *gsi)
> +{
> +  gimple stmt = gsi_stmt (*gsi);
> +  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
> +  tree rhs1 = gimple_assign_rhs1 (stmt);
> +  tree rhs2 = gimple_assign_rhs2 (stmt);
> +  gimple def_stmt;
> +  enum tree_code defcode;
> +  bool trueok = false;
> +  bool falseok = false;
> +  tree true_op, false_op;
> +  location_t loc = gimple_location (stmt);
> +
> +  if (TREE_CODE (rhs1) != SSA_NAME)
> +    goto second_op;
> +
> +  def_stmt = get_prop_source_stmt (rhs1, true, NULL);
> +  if (!def_stmt || !can_propagate_from (def_stmt))
> +    goto second_op;
> +
> +  defcode = gimple_assign_rhs_code (def_stmt);
> +  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
> +    goto second_op;
> +
> +  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs2 (def_stmt),
> +                            gimple_assign_rhs2 (stmt));
> +  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs3 (def_stmt),
> +                             gimple_assign_rhs2 (stmt));
> +
> +  if (is_gimple_val (true_op))
> +    trueok = true;
> +  if (is_gimple_val (false_op))
> +    falseok = true;
> +  /* At least one of them has to simplify, or it isn't worth it.  */
> +  if (!trueok && !falseok)
> +    goto second_op;
> +  if (!trueok)
> +    {
> +      if (!valid_gimple_rhs_p (true_op))
> +       goto second_op;
> +      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
> +      true_op = gimple_assign_lhs (g);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +    }
> +  if (!falseok)
> +    {
> +      if (!valid_gimple_rhs_p (false_op))
> +       goto second_op;
> +      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
> +      false_op = gimple_assign_lhs (g);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +    }
> +  goto finish;
> +
> +second_op:
> +  if (TREE_CODE (rhs2) != SSA_NAME)
> +    return 0;
> +
> +  def_stmt = get_prop_source_stmt (rhs2, true, NULL);
> +  if (!def_stmt || !can_propagate_from (def_stmt))
> +    return 0;
> +
> +  defcode = gimple_assign_rhs_code (def_stmt);
> +  if (defcode != COND_EXPR && defcode != VEC_COND_EXPR)
> +    return 0;
> +
> +  true_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
> +                            gimple_assign_rhs2 (def_stmt));
> +  false_op = fold_build2_loc (loc, code, type, gimple_assign_rhs1 (stmt),
> +                             gimple_assign_rhs3 (def_stmt));
> +
> +  trueok = is_gimple_val (true_op);
> +  falseok = is_gimple_val (false_op);
> +  /* At least one of them has to simplify, or it isn't worth it.  */
> +  if (!trueok && !falseok)
> +    return 0;
> +  if (!trueok)
> +    {
> +      if (!valid_gimple_rhs_p (true_op))
> +       return 0;
> +      gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
> +      true_op = gimple_assign_lhs (g);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +    }
> +  if (!falseok)
> +    {
> +      if (!valid_gimple_rhs_p (false_op))
> +       return 0;
> +      gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
> +      false_op = gimple_assign_lhs (g);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +    }
> +finish:
> +  gimple_assign_set_rhs_with_ops_1 (gsi, defcode, gimple_assign_rhs1 (def_stmt),
> +                                   true_op, false_op);
> +
> +  fold_stmt (gsi);
> +  update_stmt (gsi_stmt (*gsi));
> +
> +  /* Remove defining statements.  */
> +  return remove_prop_source_from_use (gimple_assign_lhs (def_stmt)) + 1;
> +}
> +
> +
>  /* GSI_P points to a statement which performs a narrowing integral
>     conversion.
>
>     Look for cases like:
>
>       t = x & c;
>       y = (T) t;
>
>     Turn them into:
>
> @@ -3403,21 +3514,35 @@ ssa_forward_propagate_and_combine (void)
>           /* Mark stmt as potentially needing revisiting.  */
>           gimple_set_plf (stmt, GF_PLF_1, false);
>
>           switch (gimple_code (stmt))
>             {
>             case GIMPLE_ASSIGN:
>               {
>                 tree rhs1 = gimple_assign_rhs1 (stmt);
>                 enum tree_code code = gimple_assign_rhs_code (stmt);
>
> -               if ((code == BIT_NOT_EXPR
> +               /* Something (associate_plusminus?) doesn't set "changed"
> +                  properly, so we can't put this at the end with
> +                  if (!changed && ...).  */
> +               if (TREE_CODE_CLASS (code) == tcc_binary
> +                   || TREE_CODE_CLASS (code) == tcc_comparison)
> +                 {
> +                   int did_something;
> +                   did_something = combine_binop_with_condition (&gsi);
> +                   if (did_something == 2)
> +                     cfg_changed = true;
> +                   changed = did_something != 0;
> +                 }
> +               if (changed)
> +                 ;
> +               else if ((code == BIT_NOT_EXPR
>                      || code == NEGATE_EXPR)
>                     && TREE_CODE (rhs1) == SSA_NAME)
>                   changed = simplify_not_neg_expr (&gsi);
>                 else if (code == COND_EXPR
>                          || code == VEC_COND_EXPR)
>                   {
>                     /* In this case the entire COND_EXPR is in rhs1. */
>                     if (forward_propagate_into_cond (&gsi)
>                         || combine_cond_exprs (&gsi))
>                       {
>
Marc Glisse - Sept. 2, 2013, 9:46 a.m.
On Fri, 30 Aug 2013, Richard Biener wrote:

> On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> First, the fold-const bit causes an assertion failure (building libjava) in
>> combine_cond_expr_cond, which calls:
>>
>>   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
>>
>> and then checks:
>>
>>   /* Require that we got a boolean type out if we put one in.  */
>>   gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
>>
>> which makes sense... Note that the 2 relevant types are:
>
> well, the check is probably old and can be relaxed to also allow all
> compatible types.

Ok. Maybe it could even be removed then; we shouldn't have to check, in every
function that calls fold_binary_loc, that it returns something of the type
that was asked for.

>> Second, the way the forwprop transformation is done, it can be necessary to
>> run the forwprop pass several times in a row, which is not nice. It takes:
>>
>> stmt_cond
>> stmt_binop
>>
>> and produces:
>>
>> stmt_folded
>> stmt_cond2
>>
>> But one of the arguments of stmt_folded could be an ssa_name defined by a
>> cond_expr, which could now be propagated into stmt_folded (there may be
>> other new possible transformations). However, that cond_expr has already
>> been visited and won't be again. The combine part of the pass uses a PLF to
>> revisit statements, but the forwprop part doesn't have any specific
>> mechanism.
>
> forwprop is designed to re-process stmts it has folded to catch cascading
> effects.  Now, as it has both a forward and a backward run, you don't catch
> 2nd-order effects without iterating them.  On my TODO list is to do only one kind,
> either forward or backward propagation.

My impression is that even internally in the forward part, the
reprocessing doesn't really work, but that'll disappear anyway when the
forward part disappears.

> Btw, as for the patch I don't like that you basically feed everything into
> fold.  Yes, I know we do that for conditions because that's quite important
> and nobody has re-written the condition folding as a gimple pattern matcher.
> I doubt that this transform is important enough to warrant another 
> exception ;)

I am not sure what you want here. When I notice the pattern (a?b:c) op d, 
I need to check whether b op d and c op d are likely to simplify before 
transforming it to a?(b op d):(c op d). And currently I can't see any way 
to do that other than feeding (b op d) to fold. Even if/when we get a 
gimple folder, we will still need to call it in about the same way.
Richard Guenther - Sept. 3, 2013, 10:41 a.m.
On Mon, Sep 2, 2013 at 11:46 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 30 Aug 2013, Richard Biener wrote:
>
>> On Sat, Jul 6, 2013 at 9:42 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> First, the fold-const bit causes an assertion failure (building libjava)
>>> in
>>> combine_cond_expr_cond, which calls:
>>>
>>>   t = fold_binary_loc (gimple_location (stmt), code, type, op0, op1);
>>>
>>> and then checks:
>>>
>>>   /* Require that we got a boolean type out if we put one in.  */
>>>   gcc_assert (TREE_CODE (TREE_TYPE (t)) == TREE_CODE (type));
>>>
>>> which makes sense... Note that the 2 relevant types are:
>>
>>
>> well, the check is probably old and can be relaxed to also allow all
>> compatible types.
>
>
> Ok. Maybe it could even be removed then, we shouldn't check in every
> function that calls fold_binary_loc that it returns something of the type
> that was asked for.
>
>
>>> Second, the way the forwprop transformation is done, it can be necessary
>>> to
>>> run the forwprop pass several times in a row, which is not nice. It
>>> takes:
>>>
>>> stmt_cond
>>> stmt_binop
>>>
>>> and produces:
>>>
>>> stmt_folded
>>> stmt_cond2
>>>
>>> But one of the arguments of stmt_folded could be an ssa_name defined by a
>>> cond_expr, which could now be propagated into stmt_folded (there may be
>>> other new possible transformations). However, that cond_expr has already
>>> been visited and won't be again. The combine part of the pass uses a PLF
>>> to
>>> revisit statements, but the forwprop part doesn't have any specific
>>> mechanism.
>>
>>
>> forwprop is designed to re-process stmts it has folded to catch cascading
>> effects.  Now, as it has both a forward and a backward run, you don't catch
>> 2nd-order effects without iterating them.  On my TODO list is to do only one kind,
>> either forward or backward propagation.
>
>
> My impression is that even internally in the forward part, the
> reprocessing doesn't really work, but that'll disappear anyway when the
> forward part disappears.
>
>
>> Btw, as for the patch I don't like that you basically feed everything into
>> fold.  Yes, I know we do that for conditions because that's quite
>> important
>> and nobody has re-written the condition folding as a gimple pattern matcher.
>> I doubt that this transform is important enough to warrant another
>> exception ;)
>
>
> I am not sure what you want here. When I notice the pattern (a?b:c) op d, I
> need to check whether b op d and c op d are likely to simplify before
> transforming it to a?(b op d):(c op d). And currently I can't see any way to
> do that other than feeding (b op d) to fold. Even if/when we get a gimple
> folder, we will still need to call it in about the same way.

With a gimple folder you'd avoid building trees.  You are testing for
a quite complex pattern here - (a?b:c) & (d?e:f).  It seems to me that,
for example,

+  vec c=(a>3)?o:z;
+  vec d=(b>2)?o:z;
+  vec e=c&d;

would be better suited in the combine phase (you are interested in
combining the comparisons).  That is, look for a [&|^] b where
a and b are [VEC_]COND_EXPRs with the same values.  Heh,
and it's not obvious to me after looking for a minute what this
should be simplified to ... (so the code and the testcase need some
comments on what you are trying to catch ...)

Richard.



> --
> Marc Glisse

Patch

Index: fold-const.c

===================================================================
--- fold-const.c	(revision 200736)

+++ fold-const.c	(working copy)

@@ -14124,21 +14124,23 @@  fold_ternary_loc (location_t loc, enum t

 
       /* Convert A ? 1 : 0 to simply A.  */
       if ((code == VEC_COND_EXPR ? integer_all_onesp (op1)
 				 : (integer_onep (op1)
 				    && !VECTOR_TYPE_P (type)))
 	  && integer_zerop (op2)
 	  /* If we try to convert OP0 to our type, the
 	     call to fold will try to move the conversion inside
 	     a COND, which will recurse.  In that case, the COND_EXPR
 	     is probably the best choice, so leave it alone.  */
-	  && type == TREE_TYPE (arg0))
+	  && (type == TREE_TYPE (arg0)
+	      || (in_gimple_form
+		  && useless_type_conversion_p (type, TREE_TYPE (arg0)))))
 	return pedantic_non_lvalue_loc (loc, arg0);
 
       /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
 	 over COND_EXPR in cases such as floating point comparisons.  */
       if (integer_zerop (op1)
 	  && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
 				    : (integer_onep (op2)
 				       && !VECTOR_TYPE_P (type)))
 	  && truth_value_p (TREE_CODE (arg0)))
 	return pedantic_non_lvalue_loc (loc,
Index: tree-ssa-forwprop.c
===================================================================
--- tree-ssa-forwprop.c	(revision 200736)
+++ tree-ssa-forwprop.c	(working copy)
@@ -1135,20 +1135,135 @@  forward_propagate_comparison (gimple_stm
 
   /* Remove defining statements.  */
   return remove_prop_source_from_use (name);
 
 bailout:
   gsi_next (defgsi);
   return false;
 }
 
 
+/* Forward propagate the condition defined in *DEFGSI to uses in
+   binary operations.
+   Returns true if stmt is now unused.  Advance DEFGSI to the next
+   statement.  */
+
+static bool
+forward_propagate_condition (gimple_stmt_iterator *defgsi)
+{
+  gimple stmt = gsi_stmt (*defgsi);
+  tree name = gimple_assign_lhs (stmt);
+  enum tree_code defcode = gimple_assign_rhs_code (stmt);
+  gimple_stmt_iterator gsi;
+  enum tree_code code;
+  tree lhs, type;
+  gimple use_stmt;
+
+  /* Don't propagate ssa names that occur in abnormal phis.  */
+  if ((TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
+       && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs1 (stmt)))
+      || (TREE_CODE (gimple_assign_rhs2 (stmt)) == SSA_NAME
+        && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs2 (stmt)))
+      || (TREE_CODE (gimple_assign_rhs3 (stmt)) == SSA_NAME
+        && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_assign_rhs3 (stmt))))
+    goto bailout;
+
+  /* Do not un-cse, but propagate through copies.  */
+  use_stmt = get_prop_dest_stmt (name, &name);
+  if (!use_stmt
+      || !is_gimple_assign (use_stmt)
+      || gimple_has_side_effects (stmt)
+      || gimple_has_side_effects (use_stmt))
+    goto bailout;
+
+  gsi = gsi_for_stmt (use_stmt);
+  code = gimple_assign_rhs_code (use_stmt);
+  lhs = gimple_assign_lhs (use_stmt);
+  type = TREE_TYPE (lhs);
+
+  if (TREE_CODE_CLASS (code) == tcc_binary
+      || TREE_CODE_CLASS (code) == tcc_comparison)
+    {
+      bool cond_is_left = false;
+      bool trueok = false;
+      bool falseok = false;
+      if (name == gimple_assign_rhs1 (use_stmt))
+	cond_is_left = true;
+      else
+	gcc_assert (name == gimple_assign_rhs2 (use_stmt));
+      tree true_op, false_op;
+      if (cond_is_left)
+	{
+	  true_op = fold_build2_loc (gimple_location (stmt), code, type,
+				     gimple_assign_rhs2 (stmt),
+				     gimple_assign_rhs2 (use_stmt));
+	  false_op = fold_build2_loc (gimple_location (stmt), code, type,
+				      gimple_assign_rhs3 (stmt),
+				      gimple_assign_rhs2 (use_stmt));
+	}
+      else
+	{
+	  true_op = fold_build2_loc (gimple_location (stmt), code, type,
+				     gimple_assign_rhs1 (use_stmt),
+				     gimple_assign_rhs2 (stmt));
+	  false_op = fold_build2_loc (gimple_location (stmt), code, type,
+				      gimple_assign_rhs1 (use_stmt),
+				      gimple_assign_rhs3 (stmt));
+	}
+
+      if (is_gimple_val (true_op))
+	trueok = true;
+      if (is_gimple_val (false_op))
+	falseok = true;
+      /* At least one of them has to simplify, or it isn't worth it.  */
+      if (!trueok && !falseok)
+	goto bailout;
+      if (!trueok)
+	{
+	  if (!valid_gimple_rhs_p (true_op))
+	    goto bailout;
+	  gimple g = gimple_build_assign (make_ssa_name (type, NULL), true_op);
+	  true_op = gimple_assign_lhs (g);
+	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+	}
+      if (!falseok)
+	{
+	  if (!valid_gimple_rhs_p (false_op))
+	    goto bailout;
+	  gimple g = gimple_build_assign (make_ssa_name (type, NULL), false_op);
+	  false_op = gimple_assign_lhs (g);
+	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+	}
+
+      gimple_assign_set_rhs_with_ops_1 (&gsi, defcode,
+					gimple_assign_rhs1 (stmt),
+					true_op, false_op);
+    }
+  else
+    goto bailout;
+
+  fold_stmt (&gsi);
+  update_stmt (gsi_stmt (gsi));
+
+  /* When we remove stmt now the iterator defgsi goes off its current
+     sequence, hence advance it now.  */
+  gsi_next (defgsi);
+
+  /* Remove defining statements.  */
+  return remove_prop_source_from_use (name);
+
+bailout:
+  gsi_next (defgsi);
+  return false;
+}
+
+
 /* GSI_P points to a statement which performs a narrowing integral
    conversion.
 
    Look for cases like:
 
      t = x & c;
      y = (T) t;
 
    Turn them into:
 
@@ -3382,20 +3497,25 @@  ssa_forward_propagate_and_combine (void)
 		    gsi_next (&gsi);
 		}
 	      else
 		gsi_next (&gsi);
 	    }
 	  else if (TREE_CODE_CLASS (code) == tcc_comparison)
 	    {
 	      if (forward_propagate_comparison (&gsi))
 	        cfg_changed = true;
 	    }
+	  else if (code == COND_EXPR || code == VEC_COND_EXPR)
+	    {
+	      if (forward_propagate_condition (&gsi))
+	        cfg_changed = true;
+	    }
 	  else
 	    gsi_next (&gsi);
 	}
 
       /* Combine stmts with the stmts defining their operands.
 	 Note we update GSI within the loop as necessary.  */
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
 	{
 	  gimple stmt = gsi_stmt (gsi);
 	  bool changed = false;
Index: testsuite/g++.dg/tree-ssa/pr57755.c
===================================================================
--- testsuite/g++.dg/tree-ssa/pr57755.c	(revision 0)
+++ testsuite/g++.dg/tree-ssa/pr57755.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1" } */
+typedef int vec __attribute__((vector_size(4*sizeof(int))));
+
+vec f(vec a,vec b){
+  vec z=a!=a;
+  vec o=z+1;
+  vec c=(a>3)?o:z;
+  vec d=(b>2)?o:z;
+  vec e=c&d;
+  return e!=0;
+}
+
+vec g(vec a,vec b){
+  vec z=a!=a;
+  vec c=(a<=42)?b:z;
+  return b&c;
+}
+
+/* { dg-final { scan-tree-dump-times "VEC_COND_EXPR" 2 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "!=" "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "&" "forwprop1" } } */
+/* { dg-final { cleanup-tree-dump "forwprop1" } } */