diff mbox

[RFC,PR61839] Convert CST BINOP COND_EXPR to COND_EXPR ? (CST BINOP 1) : (CST BINOP 0)

Message ID 028aa472-0193-1077-bad3-d1dba1b324e2@linaro.org
State New
Headers show

Commit Message

Kugan Vivekanandarajah Aug. 11, 2016, 4:11 a.m. UTC
Hi Richard,

On 09/08/16 20:22, Richard Biener wrote:
> On Tue, Aug 9, 2016 at 4:51 AM, Kugan Vivekanandarajah
> <kugan.vivekanandarajah@linaro.org> wrote:
>> Hi Richard,
>>
>> Thanks for the review.
>>
>> On 29 April 2016 at 20:47, Richard Biener <richard.guenther@gmail.com> wrote:
>>> On Sun, Apr 17, 2016 at 1:14 AM, kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>> As explained in PR61839,
>>>>
>>>> Following difference results in extra instructions:
>>>> -  c = b != 0 ? 486097858 : 972195717;
>>>> +  c = a + 972195718 >> (b != 0);
>>>>
>>>> As suggested in PR, attached patch converts CST BINOP COND_EXPR to COND_EXPR
>>>> ? (CST BINOP 1) : (CST BINOP 0).
>>>>
>>>> Bootstrapped and regression tested for x86-64-linux-gnu with no new
>>>> regression. Is this OK for statege-1.
>>>
>>> You are missing a testcase.
>>>
>>> I think the transform can be generalized to any two-value value-range by
>>> instead of
>>>
>>>   lhs = cond_res ? (cst binop 1) : (cst binop 0)
>>>
>>> emitting
>>>
>>>   lhs = tmp == val1 ? (cst binop val1) : (cst binop val2);
>>>
>>> In the PR I asked the transform to be only carried out if cond_res and
>>> tmp have a single use (and thus they'd eventually vanish).
>>>
>>> I'm not sure if a general two-value "constant" propagation is profitable
>>> which is why I was originally asking for the pattern to only apply
>>> if the resulting value is used in a comparison which we could then
>>> in turn simplify by substituting COND_RES (or ! COND_RES) for it.
>>> For the general two-value case we'd substitute it with tmp [=!]= val[12]
>>> dependent on which constant is cheaper to test for.
>>>
>>> So I think this needs some exploring work on which way to go
>>> and which transform is profitable in the end.  I think the general
>>> two-value case feeding a condition will be always profitable.
>>
>>
>> Please find a modified version which checks for two-valued variable
>> and uses this to optimize. In the attached test case (in function
>> bar), we end up doing the conversion twice.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu without no new
>> regressions. Is this OK for trunk?
>
> +/* Return true if VAR is a two-valued variable.  Set MIN and MAX when it is
> +   true.  Return false otherwise.  */
> +
> +static bool
> +two_valued_val_range_p (tree var, tree *min, tree *max)
> +{
>
> I'd use A and B, not MIN/MAX given it's two values, not necessarily
> a two-valued range (for example for ~[1, UINT_MAX-1] which you
I have changed this. I don't  think this would be the only 
VR_ANTI_RANGE. Others like TYPE_MIN + 1, TYPE_MIN + 2 should come as 
VR_RANGE.

> don't handle).  In theory VRP might get a more sophisticated range
> representation to also allow a range consisting of just 3 and 7 for example.
>
I am not sure how this will be represented as value range. Is this for 
enum types where thhere is no valid values between 3 and 7 ?

> +  tree tmp
> +    = int_const_binop (PLUS_EXPR,
> +                      vr->min,
> +                      build_int_cst_type (TREE_TYPE (var), 1));
> +  if (0 != compare_values (tmp, vr->max))
> +    return false;
>
> I think simply
>
>    if (wi::sub (vr->max, vr->min) == 1)
I have changed this.

>
> might work as well and avoid building a tree node.
>
> +      /* Convert:
> +        LHS = CST BINOP VAR
> +        where VAR is two-valued.
> +
> +        To:
> +        LHS = VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2) */
> +
> +      if (TREE_CODE_CLASS (rhs_code) == tcc_binary
> +         && TREE_CODE (rhs1) == INTEGER_CST
> +         && TREE_CODE (rhs2) == SSA_NAME
>
> Note that for all commutative tcc_binary operators the constant will be on the
> other operand.  I think you need to handle the constant appearing in both places
> (and for division for example watch out for a zero divisor).

I have now canonicalized it in the beginning. I don't think it will 
affect other simplifications that comes after this transformation.

>
> +         && has_single_use (rhs2)
> +         && two_valued_val_range_p (rhs2, &min, &max))
> +
> +       {
> +         tree cond = build2 (EQ_EXPR, TREE_TYPE (rhs2), rhs2, min);
> +         tree new_rhs1 =  int_const_binop (rhs_code, rhs1, min);
> +         tree new_rhs2 =  int_const_binop (rhs_code, rhs1, max);
>
> too many spaces after '='.
Done.

>
> +
> +         if (new_rhs1 && new_rhs2)
>
> You didn't address my point about profitability - you test for a single use
> but not for the kind of use.  Please instead use
I checked with some simple test-cases. In those cases it either improves 
or no difference.

>
>     && single_imm_use (rhs2, &use_p, &use_stmt)
>     && gimple_code (use_stmt) == GIMPLE_COND
Done.

>
> The testcase won't work on targets with small integers thus please
> require int32plus.  With the existing scan-dumps it's not obvious
Done.

> what transform it is testing for - please add a comment before
> the dump scan reflecting the desired transform.  Maybe also scan
> "optimized" instead to also verify that followup transforms trigger.
>
Done.


ASM difference for x86-64 is
@@ -11,11 +11,7 @@
  	movl	$1, 12(%rsp)
  	movl	12(%rsp), %eax
  	testl	%eax, %eax
-	movl	$972195717, %eax
-	setne	%cl
-	sarl	%cl, %eax
-	cmpl	$486097858, %eax
-	jne	.L5
+	je	.L5
  	xorl	%eax, %eax
  	addq	$24, %rsp
  	.cfi_remember_state

function bar in the test-case is already optimized so no difference. I 
just added to make sure that the operation is correct.

Bootstrapped and regression tested on x86_64-linux-gn with no new 
regressions. Is this OK for trunk now.


Thanks,
Kugan

> Thanks,
> Richard.
>
>> Thanks,
>> Kugan
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-08-09  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>>     PR tree-optimization/61839
>>     * gcc.dg/tree-ssa/pr61839.c: New test.
>>
>> gcc/ChangeLog:
>>
>> 2016-08-09  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>>     PR tree-optimization/61839
>>     * tree-vrp.c (two_valued_val_range_p): New.
>>     (simplify_stmt_using_ranges): Convert CST BINOP VAR where VAR is
>>     two-valued to VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2).

Comments

Richard Biener Aug. 11, 2016, 10:04 a.m. UTC | #1
On Thu, Aug 11, 2016 at 6:11 AM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
>
> On 09/08/16 20:22, Richard Biener wrote:
>>
>> On Tue, Aug 9, 2016 at 4:51 AM, Kugan Vivekanandarajah
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>>
>>> On 29 April 2016 at 20:47, Richard Biener <richard.guenther@gmail.com>
>>> wrote:
>>>>
>>>> On Sun, Apr 17, 2016 at 1:14 AM, kugan
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>> As explained in PR61839,
>>>>>
>>>>> Following difference results in extra instructions:
>>>>> -  c = b != 0 ? 486097858 : 972195717;
>>>>> +  c = a + 972195718 >> (b != 0);
>>>>>
>>>>> As suggested in PR, attached patch converts CST BINOP COND_EXPR to
>>>>> COND_EXPR
>>>>> ? (CST BINOP 1) : (CST BINOP 0).
>>>>>
>>>>> Bootstrapped and regression tested for x86-64-linux-gnu with no new
>>>>> regression. Is this OK for statege-1.
>>>>
>>>>
>>>> You are missing a testcase.
>>>>
>>>> I think the transform can be generalized to any two-value value-range by
>>>> instead of
>>>>
>>>>   lhs = cond_res ? (cst binop 1) : (cst binop 0)
>>>>
>>>> emitting
>>>>
>>>>   lhs = tmp == val1 ? (cst binop val1) : (cst binop val2);
>>>>
>>>> In the PR I asked the transform to be only carried out if cond_res and
>>>> tmp have a single use (and thus they'd eventually vanish).
>>>>
>>>> I'm not sure if a general two-value "constant" propagation is profitable
>>>> which is why I was originally asking for the pattern to only apply
>>>> if the resulting value is used in a comparison which we could then
>>>> in turn simplify by substituting COND_RES (or ! COND_RES) for it.
>>>> For the general two-value case we'd substitute it with tmp [=!]= val[12]
>>>> dependent on which constant is cheaper to test for.
>>>>
>>>> So I think this needs some exploring work on which way to go
>>>> and which transform is profitable in the end.  I think the general
>>>> two-value case feeding a condition will be always profitable.
>>>
>>>
>>>
>>> Please find a modified version which checks for two-valued variable
>>> and uses this to optimize. In the attached test case (in function
>>> bar), we end up doing the conversion twice.
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu without no new
>>> regressions. Is this OK for trunk?
>>
>>
>> +/* Return true if VAR is a two-valued variable.  Set MIN and MAX when it
>> is
>> +   true.  Return false otherwise.  */
>> +
>> +static bool
>> +two_valued_val_range_p (tree var, tree *min, tree *max)
>> +{
>>
>> I'd use A and B, not MIN/MAX given it's two values, not necessarily
>> a two-valued range (for example for ~[1, UINT_MAX-1] which you
>
> I have changed this. I don't  think this would be the only VR_ANTI_RANGE.
> Others like TYPE_MIN + 1, TYPE_MIN + 2 should come as VR_RANGE.
>
>> don't handle).  In theory VRP might get a more sophisticated range
>> representation to also allow a range consisting of just 3 and 7 for
>> example.
>>
> I am not sure how this will be represented as value range. Is this for enum
> types where thhere is no valid values between 3 and 7 ?
>
>> +  tree tmp
>> +    = int_const_binop (PLUS_EXPR,
>> +                      vr->min,
>> +                      build_int_cst_type (TREE_TYPE (var), 1));
>> +  if (0 != compare_values (tmp, vr->max))
>> +    return false;
>>
>> I think simply
>>
>>    if (wi::sub (vr->max, vr->min) == 1)
>
> I have changed this.
>
>>
>> might work as well and avoid building a tree node.
>>
>> +      /* Convert:
>> +        LHS = CST BINOP VAR
>> +        where VAR is two-valued.
>> +
>> +        To:
>> +        LHS = VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2) */
>> +
>> +      if (TREE_CODE_CLASS (rhs_code) == tcc_binary
>> +         && TREE_CODE (rhs1) == INTEGER_CST
>> +         && TREE_CODE (rhs2) == SSA_NAME
>>
>> Note that for all commutative tcc_binary operators the constant will be on
>> the
>> other operand.  I think you need to handle the constant appearing in both
>> places
>> (and for division for example watch out for a zero divisor).
>
>
> I have now canonicalized it in the beginning. I don't think it will affect
> other simplifications that comes after this transformation.
>
>>
>> +         && has_single_use (rhs2)
>> +         && two_valued_val_range_p (rhs2, &min, &max))
>> +
>> +       {
>> +         tree cond = build2 (EQ_EXPR, TREE_TYPE (rhs2), rhs2, min);
>> +         tree new_rhs1 =  int_const_binop (rhs_code, rhs1, min);
>> +         tree new_rhs2 =  int_const_binop (rhs_code, rhs1, max);
>>
>> too many spaces after '='.
>
> Done.
>
>>
>> +
>> +         if (new_rhs1 && new_rhs2)
>>
>> You didn't address my point about profitability - you test for a single
>> use
>> but not for the kind of use.  Please instead use
>
> I checked with some simple test-cases. In those cases it either improves or
> no difference.
>
>>
>>     && single_imm_use (rhs2, &use_p, &use_stmt)
>>     && gimple_code (use_stmt) == GIMPLE_COND
>
> Done.
>
>>
>> The testcase won't work on targets with small integers thus please
>> require int32plus.  With the existing scan-dumps it's not obvious
>
> Done.
>
>> what transform it is testing for - please add a comment before
>> the dump scan reflecting the desired transform.  Maybe also scan
>> "optimized" instead to also verify that followup transforms trigger.
>>
> Done.
>
>
> ASM difference for x86-64 is
> @@ -11,11 +11,7 @@
>         movl    $1, 12(%rsp)
>         movl    12(%rsp), %eax
>         testl   %eax, %eax
> -       movl    $972195717, %eax
> -       setne   %cl
> -       sarl    %cl, %eax
> -       cmpl    $486097858, %eax
> -       jne     .L5
> +       je      .L5
>         xorl    %eax, %eax
>         addq    $24, %rsp
>         .cfi_remember_state
>
> function bar in the test-case is already optimized so no difference. I just
> added to make sure that the operation is correct.
>
> Bootstrapped and regression tested on x86_64-linux-gn with no new
> regressions. Is this OK for trunk now.

+two_valued_val_range_p (tree var, tree *a, tree *b)
+{
+  value_range *vr = get_value_range (var);
+  if ((vr->type != VR_RANGE
+       && vr->type != VR_ANTI_RANGE)
+      || !range_int_cst_p (vr))
+    return false;

range_int_cst_p checks for vr->type == VR_RANGE so the anti-range handling
doesn't ever trigger - which means you should add a testcase for it as well.


+    {
+      *a = TYPE_MIN_VALUE (TREE_TYPE (var));
+      *b = TYPE_MAX_VALUE (TREE_TYPE (var));

note that for pointer types this doesn't work, please also use
vrp_val_min/max for
consistency.  I think you want to add a INTEGRAL_TYPE_P (TREE_TYPE (var))
to the guard of two_valued_val_range_p.

+      /* First canonicalize to simplify tests.  */
+      if (commutative_tree_code (rhs_code)
+         && TREE_CODE (rhs2) == INTEGER_CST)
+       std::swap (rhs1, rhs2);

note that this doesn't really address my comment - if you just want to handle
commutative ops then simply look for the constant in the second place rather
than the first which is the canonical operand order.  But even for
non-commutative
operations we might want to apply this optimization - and then for both cases,
rhs1 or rhs2 being constant.  Like x / 5 and 5 / x.

Note that you can rely on int_const_binop returning NULL_TREE for "invalid"
ops like x % 0 or x / 0, so no need to explicitely guard this here.

Thanks,
Richard.

>
> Thanks,
> Kugan
>
>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Kugan
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2016-08-09  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>>     PR tree-optimization/61839
>>>     * gcc.dg/tree-ssa/pr61839.c: New test.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2016-08-09  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>>     PR tree-optimization/61839
>>>     * tree-vrp.c (two_valued_val_range_p): New.
>>>     (simplify_stmt_using_ranges): Convert CST BINOP VAR where VAR is
>>>     two-valued to VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2).
diff mbox

Patch

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr61839.c b/gcc/testsuite/gcc.dg/tree-ssa/pr61839.c
index e69de29..abe46a0 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr61839.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr61839.c
@@ -0,0 +1,44 @@ 
+/* PR tree-optimization/61839.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-optimized" } */
+/* { dg-require-effective-target int32plus } */
+
+__attribute__ ((noinline))
+int foo ()
+{
+  int a = -1;
+  volatile unsigned b = 1U;
+  int c = 1;
+  c = (a + 972195718) >> (1LU <= b);
+  if (c == 486097858)
+    ;
+  else
+    __builtin_abort ();
+  return 0;
+}
+
+__attribute__ ((noinline))
+int bar ()
+{
+  int a = -1;
+  volatile unsigned b = 1U;
+  int c = 1;
+  c = (a + 972195718) >> (b ? 2 : 3);
+  if (c == 243048929)
+    ;
+  else
+    __builtin_abort ();
+  return 0;
+}
+
+int main ()
+{
+  foo ();
+  bar ();
+}
+
+/* Scan for c = 972195717) >> [0, 1] in function foo.  */
+/* { dg-final { scan-tree-dump-times "972195717 : 486097858" 1  "vrp1" } } */
+/* Scan for c = 972195717) >> [2, 3] in function bar.  */
+/* { dg-final { scan-tree-dump-times "243048929 : 121524464" 2  "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "486097858" 0  "optimized" } } */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 7c7ad91..b9013b3 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -10004,6 +10004,39 @@  simplify_internal_call_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt)
   return true;
 }
 
+/* Return true if VAR is a two-valued variable.  Set a and b with the
+   two-values when it is true.  Return false otherwise.  */
+
+static bool
+two_valued_val_range_p (tree var, tree *a, tree *b)
+{
+  value_range *vr = get_value_range (var);
+  if ((vr->type != VR_RANGE
+       && vr->type != VR_ANTI_RANGE)
+      || !range_int_cst_p (vr))
+    return false;
+
+  if (vr->type == VR_RANGE
+      && wi::sub (vr->max, vr->min) == 1)
+    {
+      *a = vr->min;
+      *b = vr->max;
+      return true;
+    }
+
+  /* ~[TYPE_MIN + 1, TYPE_MAX - 1] */
+  if (vr->type == VR_ANTI_RANGE
+      && wi::sub (vr->min, wi::min_value (TREE_TYPE (var))) == 1
+      && wi::sub (wi::max_value (TREE_TYPE (var)), vr->max) == 1)
+    {
+      *a = TYPE_MIN_VALUE (TREE_TYPE (var));
+      *b = TYPE_MAX_VALUE (TREE_TYPE (var));
+      return true;
+    }
+
+  return false;
+}
+
 /* Simplify STMT using ranges if possible.  */
 
 static bool
@@ -10014,6 +10047,61 @@  simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
     {
       enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
       tree rhs1 = gimple_assign_rhs1 (stmt);
+      tree rhs2 = gimple_assign_rhs2 (stmt);
+      tree lhs = gimple_assign_lhs (stmt);
+      tree val1 = NULL_TREE, val2 = NULL_TREE;
+      use_operand_p use_p;
+      gimple *use_stmt;
+
+      /* First canonicalize to simplify tests.  */
+      if (commutative_tree_code (rhs_code)
+	  && TREE_CODE (rhs2) == INTEGER_CST)
+	std::swap (rhs1, rhs2);
+
+      /* Convert:
+	 LHS = CST BINOP VAR
+	 Where VAR is two-valued and LHS is used in GIMPLE_COND only
+	 To:
+	 LHS = VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2) */
+
+      if (TREE_CODE_CLASS (rhs_code) == tcc_binary
+	  && TREE_CODE (rhs1) == INTEGER_CST
+	  && TREE_CODE (rhs2) == SSA_NAME
+	  && single_imm_use (lhs, &use_p, &use_stmt)
+	  && gimple_code (use_stmt) == GIMPLE_COND
+	  && two_valued_val_range_p (rhs2, &val1, &val2))
+
+	{
+	  tree new_rhs1 = NULL_TREE;
+	  tree new_rhs2 = NULL_TREE;
+	  switch (rhs_code)
+	    {
+	    case TRUNC_DIV_EXPR:
+	    case CEIL_DIV_EXPR:
+	    case FLOOR_DIV_EXPR:
+	    case ROUND_DIV_EXPR:
+	    case EXACT_DIV_EXPR:
+	      /* Prevent divide by zero.  */
+	      if (integer_zerop (val1)
+		  || integer_zerop (val2))
+		break;
+	    default:
+	      new_rhs1 = int_const_binop (rhs_code, rhs1, val1);
+	      new_rhs2 = int_const_binop (rhs_code, rhs1, val2);
+	      break;
+	    }
+
+	  if (new_rhs1 && new_rhs2)
+	    {
+	      tree cond = build2 (EQ_EXPR, TREE_TYPE (rhs2), rhs2, val1);
+	      gimple_assign_set_rhs_with_ops (gsi,
+					      COND_EXPR, cond,
+					      new_rhs1,
+					      new_rhs2);
+	      update_stmt (gsi_stmt (*gsi));
+	      return true;
+	    }
+	}
 
       switch (rhs_code)
 	{