Patchwork Constant-fold vector comparisons

login
register
mail settings
Submitter Marc Glisse
Date Oct. 22, 2012, 8:31 p.m.
Message ID <alpine.DEB.2.02.1210222210420.14670@stedding.saclay.inria.fr>
Download mbox | patch
Permalink /patch/193271/
State New
Headers show

Comments

Marc Glisse - Oct. 22, 2012, 8:31 p.m.
On Mon, 15 Oct 2012, Richard Biener wrote:

> On Fri, Oct 12, 2012 at 4:07 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Sat, 29 Sep 2012, Marc Glisse wrote:
>>
>>> 1) it handles constant folding of vector comparisons,
>>>
>>> 2) it fixes another place where vectors are not expected
>>
>>
>> Here is a new version of this patch.
>>
>> In a first try, I got bitten by the operator priorities in "a && b?c:d",
>> which g++ doesn't warn about.
>>
>>
>> 2012-10-12  Marc Glisse  <marc.glisse@inria.fr>
>>
>> gcc/
>>         * tree-ssa-forwprop.c (forward_propagate_into_cond): Handle vectors.
>>
>>         * fold-const.c (fold_relational_const): Handle VECTOR_CST.
>>
>> gcc/testsuite/
>>         * gcc.dg/tree-ssa/foldconst-6.c: New testcase.

Here is a new version, with the same ChangeLog plus

 	* doc/generic.texi (VEC_COND_EXPR): Document current policy.

> Which means I'd prefer if you simply condition the existing ~ and ^
> handling on COND_EXPR.

Done.

>> -      if (integer_onep (tmp))
>> +      if ((gimple_assign_rhs_code (stmt) == VEC_COND_EXPR)
>> +         ? integer_all_onesp (tmp) : integer_onep (tmp))
>
> and cache gimple_assign_rhs_code as a 'code' variable at the beginning
> of the function.

Done.

>> +  if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
>> +    {
>> +      int count = VECTOR_CST_NELTS (op0);
>> +      tree *elts =  XALLOCAVEC (tree, count);
>> +      gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
>
> A better check would be that VECTOR_CST_NELTS of type is the same
> as that of op0.

I wasn't sure which check you meant, so I added both possibilities. I am 
fine with removing either or both, actually.

> Ok with these changes.

A few too many changes, I prefer to re-post, in case.


On Tue, 16 Oct 2012, Richard Biener wrote:

>> I liked your idea of the signed boolean vector, as a way to express that we
>> know some vector can only have values 0 and -1, but I am not sure how to use
>> it.
>
> Ah no, I didn't mean to suggest that ;)

Maybe you didn't, but I still took the idea from your words ;-)

>>> Thus, as we defined true to -1 and false to 0 we cannot, unless relaxing
>>> what VEC_COND_EXRP treats as true or false, optimize any of ~ or ^ -1 away.
>>
>> It seems to me that what prevents from optimizing is if we want to keep the
>> door open for a future relaxation of what VEC_COND_EXPR accepts as its first
>> argument. Which means: produce only -1 and 0, but don't assume we are only
>> reading -1 and 0 (unless we have a reason to know it, for instance because
>> it is the result of a comparison), and don't assume any specific
>> interpretation on those other values. Not sure how much that limits possible
>> optimizations.
>
> I'm not sure either - I'd rather leave the possibility open until we see
> a compelling reason to go either way (read: a testcase where it matters
> in practice).

Ok, I implemented the safe way. My current opinion is that we should go 
with a VEC_COND_EXPR that only accepts 0 and -1 (it is easy to pass a 
LT_EXPR or NE_EXPR as first argument if that is what one wants), but it 
can wait.
Richard Guenther - Oct. 23, 2012, 9:55 a.m.
On Mon, Oct 22, 2012 at 10:31 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Mon, 15 Oct 2012, Richard Biener wrote:
>
>> On Fri, Oct 12, 2012 at 4:07 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> On Sat, 29 Sep 2012, Marc Glisse wrote:
>>>
>>>> 1) it handles constant folding of vector comparisons,
>>>>
>>>> 2) it fixes another place where vectors are not expected
>>>
>>>
>>>
>>> Here is a new version of this patch.
>>>
>>> In a first try, I got bitten by the operator priorities in "a && b?c:d",
>>> which g++ doesn't warn about.
>>>
>>>
>>> 2012-10-12  Marc Glisse  <marc.glisse@inria.fr>
>>>
>>> gcc/
>>>         * tree-ssa-forwprop.c (forward_propagate_into_cond): Handle
>>> vectors.
>>>
>>>         * fold-const.c (fold_relational_const): Handle VECTOR_CST.
>>>
>>> gcc/testsuite/
>>>         * gcc.dg/tree-ssa/foldconst-6.c: New testcase.
>
>
> Here is a new version, with the same ChangeLog plus
>
>         * doc/generic.texi (VEC_COND_EXPR): Document current policy.
>
>
>> Which means I'd prefer if you simply condition the existing ~ and ^
>> handling on COND_EXPR.
>
>
> Done.
>
>
>>> -      if (integer_onep (tmp))
>>> +      if ((gimple_assign_rhs_code (stmt) == VEC_COND_EXPR)
>>> +         ? integer_all_onesp (tmp) : integer_onep (tmp))
>>
>>
>> and cache gimple_assign_rhs_code as a 'code' variable at the beginning
>> of the function.
>
>
> Done.
>
>
>>> +  if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
>>> +    {
>>> +      int count = VECTOR_CST_NELTS (op0);
>>> +      tree *elts =  XALLOCAVEC (tree, count);
>>> +      gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
>>
>>
>> A better check would be that VECTOR_CST_NELTS of type is the same
>> as that of op0.
>
>
> I wasn't sure which check you meant, so I added both possibilities. I am
> fine with removing either or both, actually.
>
>> Ok with these changes.
>
>
> A few too many changes, I prefer to re-post, in case.
>
>
>
> On Tue, 16 Oct 2012, Richard Biener wrote:
>
>>> I liked your idea of the signed boolean vector, as a way to express that
>>> we
>>> know some vector can only have values 0 and -1, but I am not sure how to
>>> use
>>> it.
>>
>>
>> Ah no, I didn't mean to suggest that ;)
>
>
> Maybe you didn't, but I still took the idea from your words ;-)
>
>
>>>> Thus, as we defined true to -1 and false to 0 we cannot, unless relaxing
>>>> what VEC_COND_EXRP treats as true or false, optimize any of ~ or ^ -1
>>>> away.
>>>
>>>
>>> It seems to me that what prevents from optimizing is if we want to keep
>>> the
>>> door open for a future relaxation of what VEC_COND_EXPR accepts as its
>>> first
>>> argument. Which means: produce only -1 and 0, but don't assume we are
>>> only
>>> reading -1 and 0 (unless we have a reason to know it, for instance
>>> because
>>> it is the result of a comparison), and don't assume any specific
>>> interpretation on those other values. Not sure how much that limits
>>> possible
>>> optimizations.
>>
>>
>> I'm not sure either - I'd rather leave the possibility open until we see
>> a compelling reason to go either way (read: a testcase where it matters
>> in practice).
>
>
> Ok, I implemented the safe way. My current opinion is that we should go with
> a VEC_COND_EXPR that only accepts 0 and -1 (it is easy to pass a LT_EXPR or
> NE_EXPR as first argument if that is what one wants), but it can wait.

Ok with ...

> --
> Marc Glisse
> Index: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c (revision 0)
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-ccp1" } */
> +
> +typedef long vec __attribute__ ((vector_size (2 * sizeof(long))));
> +
> +vec f ()
> +{
> +  vec a = { -2, 666 };
> +  vec b = { 3, 2 };
> +  return a < b;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "666" "ccp1"} } */
> +/* { dg-final { cleanup-tree-dump "ccp1" } } */
>
> Property changes on: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c
> ___________________________________________________________________
> Added: svn:keywords
>    + Author Date Id Revision URL
> Added: svn:eol-style
>    + native
>
> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c    (revision 192695)
> +++ gcc/fold-const.c    (working copy)
> @@ -16123,20 +16123,45 @@ fold_relational_const (enum tree_code co
>                                           TREE_IMAGPART (op0),
>                                           TREE_IMAGPART (op1));
>        if (code == EQ_EXPR)
>         return fold_build2 (TRUTH_ANDIF_EXPR, type, rcond, icond);
>        else if (code == NE_EXPR)
>         return fold_build2 (TRUTH_ORIF_EXPR, type, rcond, icond);
>        else
>         return NULL_TREE;
>      }
>
> +  if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
> +    {
> +      unsigned count = VECTOR_CST_NELTS (op0);
> +      tree *elts =  XALLOCAVEC (tree, count);
> +      gcc_assert (VECTOR_CST_NELTS (op1) == count);
> +      gcc_assert (TYPE_VECTOR_SUBPARTS (type) == count);

gcc_assert (VECTOR_CST_NELTS (op1) == count
                   && TYPE_VECTOR_SUBPARTS (type) == count);

Thanks,
Richard.

> +
> +      for (unsigned i = 0; i < count; i++)
> +       {
> +         tree elem_type = TREE_TYPE (type);
> +         tree elem0 = VECTOR_CST_ELT (op0, i);
> +         tree elem1 = VECTOR_CST_ELT (op1, i);
> +
> +         tree tem = fold_relational_const (code, elem_type,
> +                                           elem0, elem1);
> +
> +         if (tem == NULL_TREE)
> +           return NULL_TREE;
> +
> +         elts[i] = build_int_cst (elem_type, integer_zerop (tem) ? 0 : -1);
> +       }
> +
> +      return build_vector (type, elts);
> +    }
> +
>    /* From here on we only handle LT, LE, GT, GE, EQ and NE.
>
>       To compute GT, swap the arguments and do LT.
>       To compute GE, do LT and invert the result.
>       To compute LE, swap the arguments, do LT and invert the result.
>       To compute NE, do EQ and invert the result.
>
>       Therefore, the code below must handle only EQ and LT.  */
>
>    if (code == LE_EXPR || code == GT_EXPR)
> Index: gcc/doc/generic.texi
> ===================================================================
> --- gcc/doc/generic.texi        (revision 192695)
> +++ gcc/doc/generic.texi        (working copy)
> @@ -1773,22 +1773,23 @@ elements of the two vectors are merged (
>  vector.
>
>  @item VEC_COND_EXPR
>  These nodes represent @code{?:} expressions.  The three operands must be
>  vectors of the same size and number of elements.  The second and third
>  operands must have the same type as the entire expression.  The first
>  operand is of signed integral vector type.  If an element of the first
>  operand evaluates to a zero value, the corresponding element of the
>  result is taken from the third operand. If it evaluates to a minus one
>  value, it is taken from the second operand. It should never evaluate to
> -any other value. In contrast with a @code{COND_EXPR}, all operands are
> -always evaluated.
> +any other value currently, but optimizations should not rely on that
> +property. In contrast with a @code{COND_EXPR}, all operands are always
> +evaluated.
>  @end table
>
>
>  @c ---------------------------------------------------------------------
>  @c Statements
>  @c ---------------------------------------------------------------------
>
>  @node Statements
>  @section Statements
>  @cindex Statements
> Index: gcc/tree-ssa-forwprop.c
> ===================================================================
> --- gcc/tree-ssa-forwprop.c     (revision 192695)
> +++ gcc/tree-ssa-forwprop.c     (working copy)
> @@ -544,66 +544,69 @@ forward_propagate_into_gimple_cond (gimp
>  /* Propagate from the ssa name definition statements of COND_EXPR
>     in the rhs of statement STMT into the conditional if that simplifies it.
>     Returns true zero if the stmt was changed.  */
>
>  static bool
>  forward_propagate_into_cond (gimple_stmt_iterator *gsi_p)
>  {
>    gimple stmt = gsi_stmt (*gsi_p);
>    tree tmp = NULL_TREE;
>    tree cond = gimple_assign_rhs1 (stmt);
> +  enum tree_code code = gimple_assign_rhs_code (stmt);
>    bool swap = false;
>
>    /* We can do tree combining on SSA_NAME and comparison expressions.  */
>    if (COMPARISON_CLASS_P (cond))
>      tmp = forward_propagate_into_comparison_1 (stmt, TREE_CODE (cond),
>                                                TREE_TYPE (cond),
>                                                TREE_OPERAND (cond, 0),
>                                                TREE_OPERAND (cond, 1));
>    else if (TREE_CODE (cond) == SSA_NAME)
>      {
> -      enum tree_code code;
> +      enum tree_code def_code;
>        tree name = cond;
>        gimple def_stmt = get_prop_source_stmt (name, true, NULL);
>        if (!def_stmt || !can_propagate_from (def_stmt))
>         return 0;
>
> -      code = gimple_assign_rhs_code (def_stmt);
> -      if (TREE_CODE_CLASS (code) == tcc_comparison)
> +      def_code = gimple_assign_rhs_code (def_stmt);
> +      if (TREE_CODE_CLASS (def_code) == tcc_comparison)
>         tmp = fold_build2_loc (gimple_location (def_stmt),
> -                              code,
> +                              def_code,
>                                TREE_TYPE (cond),
>                                gimple_assign_rhs1 (def_stmt),
>                                gimple_assign_rhs2 (def_stmt));
> -      else if ((code == BIT_NOT_EXPR
> -               && TYPE_PRECISION (TREE_TYPE (cond)) == 1)
> -              || (code == BIT_XOR_EXPR
> -                  && integer_onep (gimple_assign_rhs2 (def_stmt))))
> +      else if (code == COND_EXPR
> +              && ((def_code == BIT_NOT_EXPR
> +                   && TYPE_PRECISION (TREE_TYPE (cond)) == 1)
> +                  || (def_code == BIT_XOR_EXPR
> +                      && integer_onep (gimple_assign_rhs2 (def_stmt)))))
>         {
>           tmp = gimple_assign_rhs1 (def_stmt);
>           swap = true;
>         }
>      }
>
>    if (tmp
>        && is_gimple_condexpr (tmp))
>      {
>        if (dump_file && tmp)
>         {
>           fprintf (dump_file, "  Replaced '");
>           print_generic_expr (dump_file, cond, 0);
>           fprintf (dump_file, "' with '");
>           print_generic_expr (dump_file, tmp, 0);
>           fprintf (dump_file, "'\n");
>         }
>
> -      if (integer_onep (tmp))
> +      if ((code == VEC_COND_EXPR) ? integer_all_onesp (tmp)
> +                                 : integer_onep (tmp))
>         gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs2 (stmt));
>        else if (integer_zerop (tmp))
>         gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs3 (stmt));
>        else
>         {
>           gimple_assign_set_rhs1 (stmt, unshare_expr (tmp));
>           if (swap)
>             {
>               tree t = gimple_assign_rhs2 (stmt);
>               gimple_assign_set_rhs2 (stmt, gimple_assign_rhs3 (stmt));
>

Patch

Index: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c

===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c	(revision 0)

+++ gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c	(revision 0)

@@ -0,0 +1,14 @@ 

+/* { dg-do compile } */

+/* { dg-options "-O -fdump-tree-ccp1" } */

+

+typedef long vec __attribute__ ((vector_size (2 * sizeof(long))));

+

+vec f ()

+{

+  vec a = { -2, 666 };

+  vec b = { 3, 2 };

+  return a < b;

+}

+

+/* { dg-final { scan-tree-dump-not "666" "ccp1"} } */

+/* { dg-final { cleanup-tree-dump "ccp1" } } */


Property changes on: gcc/testsuite/gcc.dg/tree-ssa/foldconst-6.c
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: gcc/fold-const.c

===================================================================
--- gcc/fold-const.c	(revision 192695)

+++ gcc/fold-const.c	(working copy)

@@ -16123,20 +16123,45 @@  fold_relational_const (enum tree_code co

 					  TREE_IMAGPART (op0),
 					  TREE_IMAGPART (op1));
       if (code == EQ_EXPR)
 	return fold_build2 (TRUTH_ANDIF_EXPR, type, rcond, icond);
       else if (code == NE_EXPR)
 	return fold_build2 (TRUTH_ORIF_EXPR, type, rcond, icond);
       else
 	return NULL_TREE;
     }
 
+  if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)

+    {

+      unsigned count = VECTOR_CST_NELTS (op0);

+      tree *elts =  XALLOCAVEC (tree, count);

+      gcc_assert (VECTOR_CST_NELTS (op1) == count);

+      gcc_assert (TYPE_VECTOR_SUBPARTS (type) == count);

+

+      for (unsigned i = 0; i < count; i++)

+	{

+	  tree elem_type = TREE_TYPE (type);

+	  tree elem0 = VECTOR_CST_ELT (op0, i);

+	  tree elem1 = VECTOR_CST_ELT (op1, i);

+

+	  tree tem = fold_relational_const (code, elem_type,

+					    elem0, elem1);

+

+	  if (tem == NULL_TREE)

+	    return NULL_TREE;

+

+	  elts[i] = build_int_cst (elem_type, integer_zerop (tem) ? 0 : -1);

+	}

+

+      return build_vector (type, elts);

+    }

+

   /* From here on we only handle LT, LE, GT, GE, EQ and NE.
 
      To compute GT, swap the arguments and do LT.
      To compute GE, do LT and invert the result.
      To compute LE, swap the arguments, do LT and invert the result.
      To compute NE, do EQ and invert the result.
 
      Therefore, the code below must handle only EQ and LT.  */
 
   if (code == LE_EXPR || code == GT_EXPR)
Index: gcc/doc/generic.texi

===================================================================
--- gcc/doc/generic.texi	(revision 192695)

+++ gcc/doc/generic.texi	(working copy)

@@ -1773,22 +1773,23 @@  elements of the two vectors are merged (

 vector.
 
 @item VEC_COND_EXPR
 These nodes represent @code{?:} expressions.  The three operands must be
 vectors of the same size and number of elements.  The second and third
 operands must have the same type as the entire expression.  The first
 operand is of signed integral vector type.  If an element of the first
 operand evaluates to a zero value, the corresponding element of the
 result is taken from the third operand. If it evaluates to a minus one
 value, it is taken from the second operand. It should never evaluate to
-any other value. In contrast with a @code{COND_EXPR}, all operands are

-always evaluated.

+any other value currently, but optimizations should not rely on that

+property. In contrast with a @code{COND_EXPR}, all operands are always

+evaluated.

 @end table
 
 
 @c ---------------------------------------------------------------------
 @c Statements
 @c ---------------------------------------------------------------------
 
 @node Statements
 @section Statements
 @cindex Statements
Index: gcc/tree-ssa-forwprop.c

===================================================================
--- gcc/tree-ssa-forwprop.c	(revision 192695)

+++ gcc/tree-ssa-forwprop.c	(working copy)

@@ -544,66 +544,69 @@  forward_propagate_into_gimple_cond (gimp

 /* Propagate from the ssa name definition statements of COND_EXPR
    in the rhs of statement STMT into the conditional if that simplifies it.
    Returns true zero if the stmt was changed.  */
 
 static bool
 forward_propagate_into_cond (gimple_stmt_iterator *gsi_p)
 {
   gimple stmt = gsi_stmt (*gsi_p);
   tree tmp = NULL_TREE;
   tree cond = gimple_assign_rhs1 (stmt);
+  enum tree_code code = gimple_assign_rhs_code (stmt);

   bool swap = false;
 
   /* We can do tree combining on SSA_NAME and comparison expressions.  */
   if (COMPARISON_CLASS_P (cond))
     tmp = forward_propagate_into_comparison_1 (stmt, TREE_CODE (cond),
 					       TREE_TYPE (cond),
 					       TREE_OPERAND (cond, 0),
 					       TREE_OPERAND (cond, 1));
   else if (TREE_CODE (cond) == SSA_NAME)
     {
-      enum tree_code code;

+      enum tree_code def_code;

       tree name = cond;
       gimple def_stmt = get_prop_source_stmt (name, true, NULL);
       if (!def_stmt || !can_propagate_from (def_stmt))
 	return 0;
 
-      code = gimple_assign_rhs_code (def_stmt);

-      if (TREE_CODE_CLASS (code) == tcc_comparison)

+      def_code = gimple_assign_rhs_code (def_stmt);

+      if (TREE_CODE_CLASS (def_code) == tcc_comparison)

 	tmp = fold_build2_loc (gimple_location (def_stmt),
-			       code,

+			       def_code,

 			       TREE_TYPE (cond),
 			       gimple_assign_rhs1 (def_stmt),
 			       gimple_assign_rhs2 (def_stmt));
-      else if ((code == BIT_NOT_EXPR

-		&& TYPE_PRECISION (TREE_TYPE (cond)) == 1)

-	       || (code == BIT_XOR_EXPR

-		   && integer_onep (gimple_assign_rhs2 (def_stmt))))

+      else if (code == COND_EXPR

+	       && ((def_code == BIT_NOT_EXPR

+		    && TYPE_PRECISION (TREE_TYPE (cond)) == 1)

+		   || (def_code == BIT_XOR_EXPR

+		       && integer_onep (gimple_assign_rhs2 (def_stmt)))))

 	{
 	  tmp = gimple_assign_rhs1 (def_stmt);
 	  swap = true;
 	}
     }
 
   if (tmp
       && is_gimple_condexpr (tmp))
     {
       if (dump_file && tmp)
 	{
 	  fprintf (dump_file, "  Replaced '");
 	  print_generic_expr (dump_file, cond, 0);
 	  fprintf (dump_file, "' with '");
 	  print_generic_expr (dump_file, tmp, 0);
 	  fprintf (dump_file, "'\n");
 	}
 
-      if (integer_onep (tmp))

+      if ((code == VEC_COND_EXPR) ? integer_all_onesp (tmp)

+				  : integer_onep (tmp))

 	gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs2 (stmt));
       else if (integer_zerop (tmp))
 	gimple_assign_set_rhs_from_tree (gsi_p, gimple_assign_rhs3 (stmt));
       else
 	{
 	  gimple_assign_set_rhs1 (stmt, unshare_expr (tmp));
 	  if (swap)
 	    {
 	      tree t = gimple_assign_rhs2 (stmt);
 	      gimple_assign_set_rhs2 (stmt, gimple_assign_rhs3 (stmt));