diff mbox

[AArch64] Peephole for SUBS

Message ID 58F9C0E4.7070108@foss.arm.com
State New
Headers show

Commit Message

Kyrill Tkachov April 21, 2017, 8:20 a.m. UTC
Hi all,

A pattern that sometimes occurs in the wild is to subtract two operands and separately
compare them. This can be implemented as a single SUBS instruction and we actually have
a define_insn for this: sub<mode>3_compare1.
However, I'm not sure if that's enough by itself to match these constructs. Adding a peephole
that will actually bring the subtraction and comparison SETs together into a PARALLEL helps
a lot in matching these (note that there is no dependency between the subtract and the comparison).

This patch adds such a peephole. It's really simple and straightforward. The only thing to look out for
is the case when the output of the subtract is a register that is also one of the operands:
SUB W0, W0, W1
CMP W0, W1

should not be transformed into:
SUBS W0, W0, W1.

The testcase in the patch provides a motivating example where we now generate a single SUBS instead of a SUB
followed by a CMP.

This transformation triggers a few times in SPEC2006. Not enough to actually move the needle,
but it's the Right Thing to Do (tm).

I've seen it catch cases that compute an absolute difference, for example:
int
foo (int a, int b)
{
   if (a < b)
     return b - a;
   else
     return a - b;
}

will now generate:
foo:
         sub     w2, w1, w0
         subs    w3, w0, w1
         csel    w0, w3, w2, ge
         ret

instead of:
foo:
         sub     w2, w1, w0
         sub     w3, w0, w1
         cmp     w0, w1
         csel    w0, w3, w2, ge
         ret


Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for GCC 8?

Thanks,
Kyrill

2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/aarch64.c (define_peephole2 above
     *sub_<shift>_<mode>): New peephole.

2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/subs_compare_1.c: New test.

Comments

Kyrill Tkachov May 11, 2017, 10:14 a.m. UTC | #1
Ping.

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00931.html

Thanks,
Kyrill

On 21/04/17 09:20, Kyrill Tkachov wrote:
> Hi all,
>
> A pattern that sometimes occurs in the wild is to subtract two operands and separately
> compare them. This can be implemented as a single SUBS instruction and we actually have
> a define_insn for this: sub<mode>3_compare1.
> However, I'm not sure if that's enough by itself to match these constructs. Adding a peephole
> that will actually bring the subtraction and comparison SETs together into a PARALLEL helps
> a lot in matching these (note that there is no dependency between the subtract and the comparison).
>
> This patch adds such a peephole. It's really simple and straightforward. The only thing to look out for
> is the case when the output of the subtract is a register that is also one of the operands:
> SUB W0, W0, W1
> CMP W0, W1
>
> should not be transformed into:
> SUBS W0, W0, W1.
>
> The testcase in the patch provides a motivating example where we now generate a single SUBS instead of a SUB
> followed by a CMP.
>
> This transformation triggers a few times in SPEC2006. Not enough to actually move the needle,
> but it's the Right Thing to Do (tm).
>
> I've seen it catch cases that compute an absolute difference, for example:
> int
> foo (int a, int b)
> {
>   if (a < b)
>     return b - a;
>   else
>     return a - b;
> }
>
> will now generate:
> foo:
>         sub     w2, w1, w0
>         subs    w3, w0, w1
>         csel    w0, w3, w2, ge
>         ret
>
> instead of:
> foo:
>         sub     w2, w1, w0
>         sub     w3, w0, w1
>         cmp     w0, w1
>         csel    w0, w3, w2, ge
>         ret
>
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for GCC 8?
>
> Thanks,
> Kyrill
>
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64.c (define_peephole2 above
>     *sub_<shift>_<mode>): New peephole.
>
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/subs_compare_1.c: New test.
Kyrill Tkachov June 2, 2017, 10:38 a.m. UTC | #2
Ping.

Thanks,
Kyrill

On 11/05/17 11:14, Kyrill Tkachov wrote:
> Ping.
>
> https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00931.html
>
> Thanks,
> Kyrill
>
> On 21/04/17 09:20, Kyrill Tkachov wrote:
>> Hi all,
>>
>> A pattern that sometimes occurs in the wild is to subtract two operands and separately
>> compare them. This can be implemented as a single SUBS instruction and we actually have
>> a define_insn for this: sub<mode>3_compare1.
>> However, I'm not sure if that's enough by itself to match these constructs. Adding a peephole
>> that will actually bring the subtraction and comparison SETs together into a PARALLEL helps
>> a lot in matching these (note that there is no dependency between the subtract and the comparison).
>>
>> This patch adds such a peephole. It's really simple and straightforward. The only thing to look out for
>> is the case when the output of the subtract is a register that is also one of the operands:
>> SUB W0, W0, W1
>> CMP W0, W1
>>
>> should not be transformed into:
>> SUBS W0, W0, W1.
>>
>> The testcase in the patch provides a motivating example where we now generate a single SUBS instead of a SUB
>> followed by a CMP.
>>
>> This transformation triggers a few times in SPEC2006. Not enough to actually move the needle,
>> but it's the Right Thing to Do (tm).
>>
>> I've seen it catch cases that compute an absolute difference, for example:
>> int
>> foo (int a, int b)
>> {
>>   if (a < b)
>>     return b - a;
>>   else
>>     return a - b;
>> }
>>
>> will now generate:
>> foo:
>>         sub     w2, w1, w0
>>         subs    w3, w0, w1
>>         csel    w0, w3, w2, ge
>>         ret
>>
>> instead of:
>> foo:
>>         sub     w2, w1, w0
>>         sub     w3, w0, w1
>>         cmp     w0, w1
>>         csel    w0, w3, w2, ge
>>         ret
>>
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Ok for GCC 8?
>>
>> Thanks,
>> Kyrill
>>
>> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * config/aarch64/aarch64.c (define_peephole2 above
>>     *sub_<shift>_<mode>): New peephole.
>>
>> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     * gcc.target/aarch64/subs_compare_1.c: New test.
>
James Greenhalgh June 2, 2017, 2:04 p.m. UTC | #3
On Fri, Apr 21, 2017 at 09:20:52AM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> A pattern that sometimes occurs in the wild is to subtract two operands and
> separately compare them. This can be implemented as a single SUBS instruction
> and we actually have a define_insn for this: sub<mode>3_compare1.  However,
> I'm not sure if that's enough by itself to match these constructs. Adding a
> peephole that will actually bring the subtraction and comparison SETs
> together into a PARALLEL helps a lot in matching these (note that there is no
> dependency between the subtract and the comparison).
> 
> This patch adds such a peephole. It's really simple and straightforward. The
> only thing to look out for is the case when the output of the subtract is a
> register that is also one of the operands:
> SUB W0, W0, W1
> CMP W0, W1
> 
> should not be transformed into:
> SUBS W0, W0, W1.
> 
> The testcase in the patch provides a motivating example where we now generate
> a single SUBS instead of a SUB followed by a CMP.
> 
> This transformation triggers a few times in SPEC2006. Not enough to actually
> move the needle, but it's the Right Thing to Do (tm).
> 
> I've seen it catch cases that compute an absolute difference, for example:
> int
> foo (int a, int b)
> {
>   if (a < b)
>     return b - a;
>   else
>     return a - b;
> }
> 
> will now generate:
> foo:
>         sub     w2, w1, w0
>         subs    w3, w0, w1
>         csel    w0, w3, w2, ge
>         ret
> 
> instead of:
> foo:
>         sub     w2, w1, w0
>         sub     w3, w0, w1
>         cmp     w0, w1
>         csel    w0, w3, w2, ge
>         ret
> 
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for GCC 8?

OK.

Thanks,
James

> 
> Thanks,
> Kyrill
> 
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * config/aarch64/aarch64.c (define_peephole2 above
>     *sub_<shift>_<mode>): New peephole.
> 
> 2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * gcc.target/aarch64/subs_compare_1.c: New test.
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f046466f8af731db6752d69690ebfd071cd55d3e..2a0341e1a957ebd28bc9e29465803501be23cd72 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2344,6 +2344,24 @@  (define_insn "sub<mode>3_compare1"
   [(set_attr "type" "alus_sreg")]
 )
 
+(define_peephole2
+  [(set (match_operand:GPI 0 "register_operand")
+	(minus:GPI (match_operand:GPI 1 "aarch64_reg_or_zero")
+		    (match_operand:GPI 2 "aarch64_reg_or_zero")))
+   (set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (match_dup 1)
+	  (match_dup 2)))]
+  "!reg_overlap_mentioned_p (operands[0], operands[1])
+   && !reg_overlap_mentioned_p (operands[0], operands[2])"
+  [(const_int 0)]
+  {
+    emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1],
+					 operands[2]));
+    DONE;
+  }
+)
+
 (define_insn "*sub_<shift>_<mode>"
   [(set (match_operand:GPI 0 "register_operand" "=r")
 	(minus:GPI (match_operand:GPI 3 "register_operand" "r")
diff --git a/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c b/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..95c8f696fee7e992c27625108850c02319426de5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/subs_compare_1.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  int x = a - b;
+  if (a <= b)
+    return x;
+  else
+    return 0;
+}
+
+/* { dg-final { scan-assembler-times "subs\\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-not "cmp\\tw\[0-9\]+, w\[0-9\]+" } } */