diff mbox

[RTL-ifcvt] PR rtl-optimization/68841: Make sure one basic block doesn't clobber CC reg usage of the other

Message ID 568E2E6C.4050806@foss.arm.com
State New
Headers show

Commit Message

Kyrill Tkachov Jan. 7, 2016, 9:22 a.m. UTC
Hi Bernd,

On 06/01/16 10:36, Kyrill Tkachov wrote:
> Hi Bernd,
>
> On 05/01/16 18:19, Kyrill Tkachov wrote:
>>
>> On 05/01/16 17:06, Kyrill Tkachov wrote:
>>>
>>> On 05/01/16 16:34, Bernd Schmidt wrote:
>>>> On 01/05/2016 03:22 PM, Kyrill Tkachov wrote:
>>>>>
>>>>> This works around the issue but we don't want to do perform the check
>>>>> for pairs of
>>>>> simple basic blocks because then we'll end up rejecting code that does
>>>>> things like:
>>>>> x = cond ? x + 1 : x - 1
>>>>> i.e. source of the set in both blocks reads and writes the same register.
>>>>> We can deal with this safely later on in the function since we rename
>>>>> the destinations
>>>>> of the two sets, so we don't want to reject this case here.
>>>>
>>>> So we need to teach bbs_ok_for_cmove_arith that this is going to happen. How about the approach below? Still seems to fix the issue, and it looks like the CC set is present in the df info so everything should work as intended. Right?
>>>>
>>>
>>> Yeah, this looks like it works.
>>> However, now we reject if-conversion whereas with my patch we still tried switching the order in which
>>> the blocks were emitted, which allowed for a bit more aggressive if-conversion.
>>> I don't know if this approach is overly restrictive yet.
>>> I'll try its effects on codegen quality on SPEC as soon as I get some cycles.
>>> But this approach does look appealing to me.
>>>
>>
>> Hmm, from a first look at SPEC, it seems to still overly restrict ifconversion in the
>> x = cond ? x + 1 : x - 1 case.
>> I'll look deeper tomorrow as to what's going on there.
>>
>
> Ok, found the problem.
> bbs_ok_for_cmove_arith also checks that we don't perform any stores in the basic block, which kills opportunities
> to convert operations of the form [addr] = c ? a : b. Since with your patch we now call this even for
> simple basic blocks we miss these opportunities.
>
> bb_valid_for_noce_process_p called earlier should have already ensured that for complex
> blocks we allow only the last set to be a store, so we should be able to relax the restriction in
> bbs_ok_for_cmove_arith. That change in combination with your patch fixes the testcase and has no code quality
> fallout that I can see (it even slightly increases if-conversion opportunities). I'll test more thoroughly
> and post a patch in due time.
>


And here is the updated patch.
It is a modified version of the patch with the restriction on stores in the basic block lifted.
Also I chose instead to pass the register 'x' on which we should not be recording conflicts
rather than a boolean will_remain.

I found if we don't do this we will pessimise cases where
one basic block is something like:
t1 = x + y;
x = t1 + 2;

and the other is something like:
x = x + 1.

We want to ignore the conflict on x in both invocations of bbs_ok_for_cmove_arith.

With this patch the testcase passes and there is no codegen fallout on SPEC.
Bootstrapped and tested on arm, aarch64, x86_64.

How does this look?

Thanks,
Kyrill

2016-01-06  Bernd Schmidt  <bschmidt@redhat.com>
             Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR rtl-optimization/68841
     * ifcvt.c (bbs_ok_for_cmove_arith): Add to_rename parameter.
     Don't record conflicts on to_rename if it's present.
     Allow memory destinations in sets.
     (noce_try_cmove_arith): Call bbs_ok_for_cmove_arith even on simple
     blocks.

2016-01-06  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR rtl-optimization/68841
     * gcc.dg/pr68841.c: New test.
     * gcc.c-torture/execute/pr68841.c: New test.

> Thanks,
> Kyrill
>
>> Kyrill
>>
>>> Thanks for the help,
>>> Kyrill
>>>
>>>>
>>>> Bernd
>>>
>>
>

Comments

Bernd Schmidt Jan. 7, 2016, 11:52 a.m. UTC | #1
On 01/07/2016 10:22 AM, Kyrill Tkachov wrote:
> +/* Return true iff the registers that the insns in BB_A set do not get
> +   used in BB_B.  If TO_RENAME is non NULL then it is a REG that will be
> +   renamed later by the caller and so conflicts on it should be ignored
> +   in this function.  */

Typical spelling is "non-NULL".

> @@ -1942,8 +1944,12 @@ bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
>   	}
>
>         /* Make sure this is a REG and not some instance
> -	 of ZERO_EXTRACT or SUBREG or other dangerous stuff.  */
> -      if (!REG_P (SET_DEST (sset_b)))
> +	 of ZERO_EXTRACT or SUBREG or other dangerous stuff.
> +	 If we have a memory destination then we have a pair of simple
> +	 basic blocks performing an operation of the form [addr] = c ? a : b.
> +	 bb_valid_for_noce_process_p will have ensured that these are
> +	 the only stores present.  */

I looked and this is indeed the case. Still slightly scary. Should the 
MEM be rtx_equal_p to TO_RENAME? It would be good to test or at least 
assert this.


Bernd
Kyrill Tkachov Jan. 7, 2016, 1:45 p.m. UTC | #2
On 07/01/16 11:52, Bernd Schmidt wrote:
> On 01/07/2016 10:22 AM, Kyrill Tkachov wrote:
>> +/* Return true iff the registers that the insns in BB_A set do not get
>> +   used in BB_B.  If TO_RENAME is non NULL then it is a REG that will be
>> +   renamed later by the caller and so conflicts on it should be ignored
>> +   in this function.  */
>
> Typical spelling is "non-NULL".
>
>> @@ -1942,8 +1944,12 @@ bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
>>       }
>>
>>         /* Make sure this is a REG and not some instance
>> -     of ZERO_EXTRACT or SUBREG or other dangerous stuff.  */
>> -      if (!REG_P (SET_DEST (sset_b)))
>> +     of ZERO_EXTRACT or SUBREG or other dangerous stuff.
>> +     If we have a memory destination then we have a pair of simple
>> +     basic blocks performing an operation of the form [addr] = c ? a : b.
>> +     bb_valid_for_noce_process_p will have ensured that these are
>> +     the only stores present.  */
>
> I looked and this is indeed the case. Still slightly scary. Should the MEM be rtx_equal_p to TO_RENAME? It would be good to test or at least assert this.
>

I tried asserting that and it caused trouble because a bit further up in noce_process_if_block it does:
   /* Only operate on register destinations, and even then avoid extending
      the lifetime of hard registers on small register class machines.  */
   orig_x = x;
   if (!REG_P (x)
       || (HARD_REGISTER_P (x)
       && targetm.small_register_classes_for_mode_p (GET_MODE (x))))
     {
       if (GET_MODE (x) == BLKmode)
         return FALSE;

       if (GET_CODE (x) == ZERO_EXTRACT
           && (!CONST_INT_P (XEXP (x, 1))
               || !CONST_INT_P (XEXP (x, 2))))
         return FALSE;

       x = gen_reg_rtx (GET_MODE (GET_CODE (x) == STRICT_LOW_PART
                  ? XEXP (x, 0) : x));
     }

It changes X to a register and after noce_try_cmove_arith it emits the store.
So, while the basic blocks that we inspect in bbs_ok_for_cmove_arith contain memory
destinations, the move to x will be a register move.
move will be performed t

Kyrill

>
> Bernd
Bernd Schmidt Jan. 7, 2016, 1:47 p.m. UTC | #3
On 01/07/2016 02:45 PM, Kyrill Tkachov wrote:
>
> On 07/01/16 11:52, Bernd Schmidt wrote:
>> I looked and this is indeed the case. Still slightly scary. Should the
>> MEM be rtx_equal_p to TO_RENAME? It would be good to test or at least
>> assert this.
>>
>
> I tried asserting that and it caused trouble because a bit further up in
> noce_process_if_block it does:
>    /* Only operate on register destinations, and even then avoid extending
>       the lifetime of hard registers on small register class machines.  */
>    orig_x = x;
>    if (!REG_P (x)
>        || (HARD_REGISTER_P (x)
>        && targetm.small_register_classes_for_mode_p (GET_MODE (x))))
>      {
>        if (GET_MODE (x) == BLKmode)
>          return FALSE;
>
>        if (GET_CODE (x) == ZERO_EXTRACT
>            && (!CONST_INT_P (XEXP (x, 1))
>                || !CONST_INT_P (XEXP (x, 2))))
>          return FALSE;
>
>        x = gen_reg_rtx (GET_MODE (GET_CODE (x) == STRICT_LOW_PART
>                   ? XEXP (x, 0) : x));
>      }
>
> It changes X to a register and after noce_try_cmove_arith it emits the
> store.

This suggests that orig_x needs to be passed to bbs_ok_for_cmove_arith, 
even disregarding the question of using it to assert that only expected 
MEMs are being modified.


Bernd
Kyrill Tkachov Jan. 7, 2016, 2 p.m. UTC | #4
On 07/01/16 13:47, Bernd Schmidt wrote:
> On 01/07/2016 02:45 PM, Kyrill Tkachov wrote:
>>
>> On 07/01/16 11:52, Bernd Schmidt wrote:
>>> I looked and this is indeed the case. Still slightly scary. Should the
>>> MEM be rtx_equal_p to TO_RENAME? It would be good to test or at least
>>> assert this.
>>>
>>
>> I tried asserting that and it caused trouble because a bit further up in
>> noce_process_if_block it does:
>>    /* Only operate on register destinations, and even then avoid extending
>>       the lifetime of hard registers on small register class machines.  */
>>    orig_x = x;
>>    if (!REG_P (x)
>>        || (HARD_REGISTER_P (x)
>>        && targetm.small_register_classes_for_mode_p (GET_MODE (x))))
>>      {
>>        if (GET_MODE (x) == BLKmode)
>>          return FALSE;
>>
>>        if (GET_CODE (x) == ZERO_EXTRACT
>>            && (!CONST_INT_P (XEXP (x, 1))
>>                || !CONST_INT_P (XEXP (x, 2))))
>>          return FALSE;
>>
>>        x = gen_reg_rtx (GET_MODE (GET_CODE (x) == STRICT_LOW_PART
>>                   ? XEXP (x, 0) : x));
>>      }
>>
>> It changes X to a register and after noce_try_cmove_arith it emits the
>> store.
>
> This suggests that orig_x needs to be passed to bbs_ok_for_cmove_arith, even disregarding the question of using it to assert that only expected MEMs are being modified.
>

Yes, which should be the original destinations of insn_a and insn_b of if_info.
I'll work on that. Though if we execute the above path then x will be a fresh new pseudo and so
will not cause any conflicts anyway.

Kyrill

>
> Bernd
diff mbox

Patch

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 5812ce30151ed7425780890c66e7763f5758df7e..96957aac5481acbc7ac5f186a653fe63ad3e5f51 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -1896,11 +1896,13 @@  insn_valid_noce_process_p (rtx_insn *insn, rtx cc)
 }
 
 
-/* Return true iff the registers that the insns in BB_A set do not
-   get used in BB_B.  */
+/* Return true iff the registers that the insns in BB_A set do not get
+   used in BB_B.  If TO_RENAME is non NULL then it is a REG that will be
+   renamed later by the caller and so conflicts on it should be ignored
+   in this function.  */
 
 static bool
-bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
+bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b, rtx to_rename)
 {
   rtx_insn *a_insn;
   bitmap bba_sets = BITMAP_ALLOC (&reg_obstack);
@@ -1920,10 +1922,10 @@  bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
 	  BITMAP_FREE (bba_sets);
 	  return false;
 	}
-
       /* Record all registers that BB_A sets.  */
       FOR_EACH_INSN_DEF (def, a_insn)
-	bitmap_set_bit (bba_sets, DF_REF_REGNO (def));
+	if (!(to_rename && DF_REF_REG (def) == to_rename))
+	  bitmap_set_bit (bba_sets, DF_REF_REGNO (def));
     }
 
   rtx_insn *b_insn;
@@ -1942,8 +1944,12 @@  bbs_ok_for_cmove_arith (basic_block bb_a, basic_block bb_b)
 	}
 
       /* Make sure this is a REG and not some instance
-	 of ZERO_EXTRACT or SUBREG or other dangerous stuff.  */
-      if (!REG_P (SET_DEST (sset_b)))
+	 of ZERO_EXTRACT or SUBREG or other dangerous stuff.
+	 If we have a memory destination then we have a pair of simple
+	 basic blocks performing an operation of the form [addr] = c ? a : b.
+	 bb_valid_for_noce_process_p will have ensured that these are
+	 the only stores present.  */
+      if (!REG_P (SET_DEST (sset_b)) && !MEM_P (SET_DEST (sset_b)))
 	{
 	  BITMAP_FREE (bba_sets);
 	  return false;
@@ -2112,9 +2118,9 @@  noce_try_cmove_arith (struct noce_if_info *if_info)
 	}
     }
 
-  if (then_bb && else_bb && !a_simple && !b_simple
-      && (!bbs_ok_for_cmove_arith (then_bb, else_bb)
-	  || !bbs_ok_for_cmove_arith (else_bb, then_bb)))
+  if (then_bb && else_bb
+      && (!bbs_ok_for_cmove_arith (then_bb, else_bb, x)
+	  || !bbs_ok_for_cmove_arith (else_bb, then_bb, x)))
     return FALSE;
 
   start_sequence ();
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr68841.c b/gcc/testsuite/gcc.c-torture/execute/pr68841.c
new file mode 100644
index 0000000000000000000000000000000000000000..15a27e7dc382d97398ca05427f431f5ecd3b89da
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr68841.c
@@ -0,0 +1,31 @@ 
+static inline int
+foo (int *x, int y)
+{
+  int z = *x;
+  while (y > z)
+    z *= 2;
+  return z;
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 1; i < 17; i++)
+    {
+      int j;
+      int k;
+      j = foo (&i, 7);
+      if (i >= 7)
+	k = i;
+      else if (i >= 4)
+	k = 8 + (i - 4) * 2;
+      else if (i == 3)
+	k = 12;
+      else
+	k = 8;
+      if (j != k)
+	__builtin_abort ();
+    }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr68841.c b/gcc/testsuite/gcc.dg/pr68841.c
new file mode 100644
index 0000000000000000000000000000000000000000..470048cc24f0d7150ed1e3141181bc1e8472ae12
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68841.c
@@ -0,0 +1,34 @@ 
+/* { dg-do run } */
+/* { dg-options "-Og -fif-conversion -flive-range-shrinkage -fpeel-loops -frerun-cse-after-loop" } */
+
+static inline int
+foo (int *x, int y)
+{
+  int z = *x;
+  while (y > z)
+    z *= 2;
+  return z;
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 1; i < 17; i++)
+    {
+      int j;
+      int k;
+      j = foo (&i, 7);
+      if (i >= 7)
+	k = i;
+      else if (i >= 4)
+	k = 8 + (i - 4) * 2;
+      else if (i == 3)
+	k = 12;
+      else
+	k = 8;
+      if (j != k)
+	__builtin_abort ();
+    }
+  return 0;
+}