[AArch64] Use SUBS for parallel subtraction and comparison with immediate

Submitted by Kyrill Tkachov on April 21, 2017, 8:20 a.m.

Details

Message ID 58F9C0E7.4070600@foss.arm.com
State New

Commit Message

Kyrill Tkachov April 21, 2017, 8:20 a.m.
Hi all,

Our sub<mode>3_compare1 pattern is not enough to catch cases where we subtract an immediate
and compare against it in a PARALLEL. This is due to the RTL canonicalisation rules, which require
that a subtraction of an immediate IMM be represented as (plus x -IMM).
So we need a bit of trickery to catch those cases and this patch does that.
It adds a new define_insn to match the plus-negatable-immediate in parallel with a comparison
and a peephole that will bring the two together when possible.
Otherwise it's pretty straightforward.
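
To make the matching condition concrete, here is a standalone C model of what the new predicate and insn condition check. It is a sketch, not GCC source: the function names are hypothetical, and it assumes that an encodable ADD/SUB immediate is a 12-bit unsigned value, optionally shifted left by 12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed model of an encodable AArch64 ADD/SUB immediate:
   a 12-bit unsigned value, optionally shifted left by 12.  */
static bool
uimm12_shift_p (int64_t val)
{
  return (val & INT64_C (0xfff)) == val
	 || (val & (INT64_C (0xfff) << 12)) == val;
}

/* Model of the new predicate: the addend in (plus x ADDEND) is
   acceptable when its negation is an encodable immediate.  The
   negation is done in unsigned arithmetic to avoid signed overflow.  */
static bool
sub_immediate_p (int64_t addend)
{
  return uimm12_shift_p ((int64_t) (0 - (uint64_t) addend));
}

/* Model of the insn condition: the compare immediate must be exactly
   the negation of the addend, so both describe "x - IMM".  */
static bool
can_fuse_p (int64_t addend, int64_t cmp_imm)
{
  return sub_immediate_p (addend)
	 && (uint64_t) cmp_imm == 0 - (uint64_t) addend;
}

/* The immediate that ends up in the SUBS instruction: the output
   template prints the addend negated (the 'n' operand modifier).  */
static int64_t
subs_immediate (int64_t addend)
{
  assert (sub_immediate_p (addend));
  return -addend;
}
```

For the testcase, `a - 4` is canonicalised to an addend of -4 and the comparison uses 4, so `can_fuse_p (-4, 4)` holds and `subs_immediate (-4)` gives the `#4` that appears in the SUBS.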

The testcase in the patch now generates a single SUBS-immediate instead of a SUB followed by a CMP.
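
The reason a single SUBS is sufficient is that CMP is an alias of SUBS that discards the result, so both set the condition flags identically. The following C model is only an illustration with hypothetical names (`subs32`, `lt_from_flags`, `foo_model`); it computes the result and the NZCV flags of a 32-bit subtraction and then evaluates the signed "less than" condition from those flags alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result and condition flags of a 32-bit flag-setting subtraction.  */
struct subs_result
{
  int32_t value;
  bool n, z, c, v;
};

/* Model of "subs wd, wn, #imm": compute a - imm with wraparound and
   derive the N, Z, C and V flags from the same operation.  */
static struct subs_result
subs32 (int32_t a, int32_t imm)
{
  uint32_t ua = (uint32_t) a;
  uint32_t ub = (uint32_t) imm;
  uint32_t diff = ua - ub;
  struct subs_result r;

  r.value = (int32_t) diff;
  r.n = (diff >> 31) != 0;			/* result is negative */
  r.z = diff == 0;				/* result is zero */
  r.c = ua >= ub;				/* no borrow occurred */
  r.v = (((ua ^ ub) & (ua ^ diff)) >> 31) != 0;	/* signed overflow */
  return r;
}

/* The signed "less than" condition (LT) is N != V.  */
static bool
lt_from_flags (struct subs_result r)
{
  return r.n != r.v;
}

/* Model of the testcase: one subtraction provides both x and the
   outcome of the a < 4 test.  */
static int32_t
foo_model (int32_t a)
{
  struct subs_result r = subs32 (a, 4);
  return lt_from_flags (r) ? r.value : 0;
}
```

The model uses one subtraction for both the value of `x` and the branch condition, which is the effect the new pattern aims for in the generated code.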

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for GCC 8?

Thanks,
Kyrill

2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	* config/aarch64/aarch64.md (sub<mode>3_compare1_imm): New define_insn.
	(peephole2): New peephole2 to emit the above.
	* config/aarch64/predicates.md (aarch64_sub_immediate): New predicate.

2017-04-21  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	* gcc.target/aarch64/subs_compare_2.c: New test.

Comments

Kyrill Tkachov May 11, 2017, 10:14 a.m.
Ping.

https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00932.html

Thanks,
Kyrill

Kyrill Tkachov June 2, 2017, 10:39 a.m.
Ping.

Thanks,
Kyrill

James Greenhalgh June 2, 2017, 2:02 p.m.
On Fri, Apr 21, 2017 at 09:20:55AM +0100, Kyrill Tkachov wrote:
> Hi all,
> 
> Our sub<mode>3_compare1 pattern is not enough to catch cases where we
> subtract an immediate and compare against in PARALLEL. This is due to the RTL
> canonicalisation rules that require subtractions of immediate IMM be
> represented as (plus x -IMM).  So we need a bit of trickery to catch those
> cases and this patch does that.  It adds a new define_insn to match the
> plus-negatable-immediate in parallel with a comparison and a peephole that
> will bring the two together when possible.  Otherwise it's pretty
> straightforward.
> 
> The testcase in the patch now generates a single SUBS-immediate instead of a
> SUB followed by a CMP.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for GCC 8?

OK.

Thanks,
James


Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 2a0341e1a957ebd28bc9e29465803501be23cd72..ff34e0d5ff4713f7b8005855f62e834aceef51f0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2344,6 +2344,19 @@  (define_insn "sub<mode>3_compare1"
   [(set_attr "type" "alus_sreg")]
 )
 
+(define_insn "sub<mode>3_compare1_imm"
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (match_operand:GPI 1 "register_operand" "r")
+	  (match_operand:GPI 3 "const_int_operand" "n")))
+   (set (match_operand:GPI 0 "register_operand" "=r")
+	(plus:GPI (match_dup 1)
+		  (match_operand:GPI 2 "aarch64_sub_immediate" "J")))]
+  "INTVAL (operands[3]) == -INTVAL (operands[2])"
+  "subs\\t%<w>0, %<w>1, #%n2"
+  [(set_attr "type" "alus_sreg")]
+)
+
 (define_peephole2
   [(set (match_operand:GPI 0 "register_operand")
 	(minus:GPI (match_operand:GPI 1 "aarch64_reg_or_zero")
@@ -2362,6 +2375,24 @@  (define_peephole2
   }
 )
 
+(define_peephole2
+  [(set (match_operand:GPI 0 "register_operand")
+	(plus:GPI (match_operand:GPI 1 "register_operand")
+		  (match_operand:GPI 2 "aarch64_sub_immediate")))
+   (set (reg:CC CC_REGNUM)
+	(compare:CC
+	  (match_dup 1)
+	  (match_operand:GPI 3 "const_int_operand")))]
+  "!reg_overlap_mentioned_p (operands[0], operands[1])
+   && INTVAL (operands[3]) == -INTVAL (operands[2])"
+  [(const_int 0)]
+  {
+    emit_insn (gen_sub<mode>3_compare1_imm (operands[0], operands[1],
+					 operands[2], operands[3]));
+    DONE;
+  }
+)
+
 (define_insn "*sub_<shift>_<mode>"
   [(set (match_operand:GPI 0 "register_operand" "=r")
 	(minus:GPI (match_operand:GPI 3 "register_operand" "r")
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 875ae6180e232e25b29c8787c0acf7a5dfa82d94..4bd8f45562c017bca736ae466ede6b9e4de0d17a 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -77,6 +77,10 @@  (define_predicate "aarch64_fp_pow2"
 (define_predicate "aarch64_fp_vec_pow2"
   (match_test "aarch64_vec_fpconst_pow_of_2 (op) > 0"))
 
+(define_predicate "aarch64_sub_immediate"
+  (and (match_code "const_int")
+       (match_test "aarch64_uimm12_shift (-INTVAL (op))")))
+
 (define_predicate "aarch64_plus_immediate"
   (and (match_code "const_int")
        (ior (match_test "aarch64_uimm12_shift (INTVAL (op))")
diff --git a/gcc/testsuite/gcc.target/aarch64/subs_compare_2.c b/gcc/testsuite/gcc.target/aarch64/subs_compare_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..60c6d9e5ccd8fce42c388c831a8060dead128491
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/subs_compare_2.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  int x = a - 4;
+  if (a < 4)
+    return x;
+  else
+    return 0;
+}
+
+/* { dg-final { scan-assembler-times "subs\\tw\[0-9\]+, w\[0-9\]+, #4" 1 } } */
+/* { dg-final { scan-assembler-not "cmp\\tw\[0-9\]+, w\[0-9\]+" } } */