
[AArch64] Split X-reg UBFIZ into W-reg LSL when possible

Message ID 5853DC63.3030602@foss.arm.com
State New

Commit Message

Kyrill Tkachov Dec. 16, 2016, 12:21 p.m. UTC
On 15/12/16 11:56, James Greenhalgh wrote:
> On Thu, Dec 08, 2016 at 09:35:09AM +0000, Kyrill Tkachov wrote:
>> Hi all,
>>
>> Similar to the previous patch this transforms X-reg UBFIZ instructions into
>> W-reg LSL instructions when the UBFIZ operands add up to 32, so we can take
>> advantage of the implicit zero-extension to DImode
>> when writing to a W-register.
>>
>> This is done by splitting the existing *andim_ashift<mode>_bfiz pattern into
>> its two SImode and DImode specialisations and changing the DImode pattern
>> into a define_insn_and_split that splits into a
>> zero-extended SImode ashift when the operands match up.
>>
>> So for the code in the testcase we generate:
>> LSL     W0, W0, 5
>>
>> instead of:
>> UBFIZ   X0, X0, 5, 27
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Since we're in stage 3 perhaps this is not for GCC 7, but it is fairly low
>> risk.  I'm happy for it to wait for the next release if necessary.
> My comments on the previous patch also apply here. This patch should only
> need to add one new split pattern.
>
> Thanks,
> James

Thanks, here is the version adding just a single define_split.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok?

Thanks,
Kyrill

2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/aarch64.md: New define_split above bswap<mode>2.

2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/ubfiz_lsl_1.c: New test.
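
To illustrate why the two instructions in the description are interchangeable, here is a minimal C sketch that models them on plain integers. The function names (ubfiz_x_5_27, lsl_w_5, testcase_expr, models_agree) are hypothetical helpers invented for this illustration and are not part of the patch. Unsigned types are used instead of the testcase's long long so that the shifts are well defined in C.

```c
#include <stdint.h>

/* Model of UBFIZ Xd, Xn, #5, #27: take the low 27 bits of x, place them
   at bit position 5 and zero every other bit of the 64-bit result.  */
static uint64_t
ubfiz_x_5_27 (uint64_t x)
{
  return (x & ((UINT64_C (1) << 27) - 1)) << 5;
}

/* Model of LSL Wd, Wn, #5: shift the low 32 bits left by 5 and drop the
   bits shifted out.  Writing a W register zeroes the upper 32 bits of
   the corresponding X register, modelled by the widening cast.  */
static uint64_t
lsl_w_5 (uint64_t x)
{
  return (uint64_t) (uint32_t) ((uint32_t) x << 5);
}

/* The expression from the testcase, on an unsigned operand.  */
static uint64_t
testcase_expr (uint64_t x)
{
  return (x << 5) & UINT64_C (0xffffffff);
}

/* Return 1 if all three models agree on a handful of sample inputs.  */
static int
models_agree (void)
{
  static const uint64_t samples[] = {
    0, 1, 0x07ffffff, 0x08000000, 0xffffffff,
    UINT64_C (0x123456789abcdef0), UINT64_C (0xffffffffffffffff)
  };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      uint64_t x = samples[i];
      if (ubfiz_x_5_27 (x) != lsl_w_5 (x) || lsl_w_5 (x) != testcase_expr (x))
	return 0;
    }
  return 1;
}
```

Because the position (5) and the width (27) add up to 32, the extracted field ends exactly at bit 31, so the 32-bit shift followed by zero-extension produces the same 64-bit value.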

Comments

James Greenhalgh Dec. 16, 2016, 4:16 p.m. UTC | #1
On Fri, Dec 16, 2016 at 12:21:55PM +0000, Kyrill Tkachov wrote:
> 
> On 15/12/16 11:56, James Greenhalgh wrote:
> >On Thu, Dec 08, 2016 at 09:35:09AM +0000, Kyrill Tkachov wrote:
> >>Hi all,
> >>
> >>Similar to the previous patch this transforms X-reg UBFIZ instructions into
> >>W-reg LSL instructions when the UBFIZ operands add up to 32, so we can take
> >>advantage of the implicit zero-extension to DImode
> >>when writing to a W-register.
> >>
> >>This is done by splitting the existing *andim_ashift<mode>_bfiz pattern into
> >>its two SImode and DImode specialisations and changing the DImode pattern
> >>into a define_insn_and_split that splits into a
> >>zero-extended SImode ashift when the operands match up.
> >>
> >>So for the code in the testcase we generate:
> >>LSL     W0, W0, 5
> >>
> >>instead of:
> >>UBFIZ   X0, X0, 5, 27
> >>
> >>Bootstrapped and tested on aarch64-none-linux-gnu.
> >>
> >>Since we're in stage 3 perhaps this is not for GCC 7, but it is fairly low
> >>risk.  I'm happy for it to wait for the next release if necessary.
> >My comments on the previous patch also apply here. This patch should only
> >need to add one new split pattern.

OK with a small nit fixed.

Thanks,
James

> Thanks, here is the version adding just a single define_split.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.


> 2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * config/aarch64/aarch64.md: New define_split above bswap<mode>2.
> 
> 2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * gcc.target/aarch64/ubfiz_lsl_1.c: New test.

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 5a40ee6abd5e123116aaaa478dced2207dd59478..b0f7bcbb84159fc8c0c733d0b40f2f08eea241a9 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4454,6 +4454,24 @@ (define_insn "*andim_ashift<mode>_bfiz"
>    [(set_attr "type" "bfx")]
>  )
>  
> +;; When the bitposition and width of the equivalent extraction add up to 32

s/bitposition/bit position/

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a40ee6abd5e123116aaaa478dced2207dd59478..b0f7bcbb84159fc8c0c733d0b40f2f08eea241a9 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4454,6 +4454,24 @@  (define_insn "*andim_ashift<mode>_bfiz"
   [(set_attr "type" "bfx")]
 )
 
+;; When the bit position and width of the equivalent extraction add up to 32
+;; we can use a W-reg LSL instruction taking advantage of the implicit
+;; zero-extension of the X-reg.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+	(and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+			     (match_operand 2 "const_int_operand"))
+		 (match_operand 3 "const_int_operand")))]
+ "aarch64_mask_and_shift_for_ubfiz_p (DImode, operands[3], operands[2])
+  && (INTVAL (operands[2]) + popcount_hwi (INTVAL (operands[3])))
+      == GET_MODE_BITSIZE (SImode)"
+  [(set (match_dup 0)
+	(zero_extend:DI (ashift:SI (match_dup 4) (match_dup 2))))]
+  {
+    operands[4] = gen_lowpart (SImode, operands[1]);
+  }
+)
+
 (define_insn "bswap<mode>2"
   [(set (match_operand:GPI 0 "register_operand" "=r")
         (bswap:GPI (match_operand:GPI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c b/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..d3fd3f234f2324d71813298210fdcf0660ac45b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check that an X-reg UBFIZ can be simplified into a W-reg LSL.  */
+
+long long
+f2 (long long x)
+{
+  return (x << 5) & 0xffffffff;
+}
+
+/* { dg-final { scan-assembler "lsl\tw" } } */
+/* { dg-final { scan-assembler-not "ubfiz\tx" } } */
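
The condition guarding the define_split can be modelled outside the compiler as follows. This is a rough sketch, not GCC code: popcount64, mask_and_shift_ok and can_split_to_w_lsl are hypothetical names, and mask_and_shift_ok only approximates what aarch64_mask_and_shift_for_ubfiz_p is assumed to check (the mask is a contiguous run of ones that starts at the shift amount). It also assumes the mask has already been canonicalised so that bits below the shift amount are clear, which is why the testcase mask appears as 0xffffffe0 rather than 0xffffffff.

```c
#include <stdint.h>

/* Hypothetical stand-in for popcount_hwi: count the set bits in v.  */
static int
popcount64 (uint64_t v)
{
  int n = 0;
  while (v)
    {
      v &= v - 1;	/* Clear the lowest set bit.  */
      n++;
    }
  return n;
}

/* Assumed shape of the UBFIZ validity check: no mask bits below the
   shift amount, and the bits from the shift amount upwards form a
   contiguous run of ones (run + 1 is a power of two).  */
static int
mask_and_shift_ok (uint64_t mask, unsigned shift)
{
  if (shift >= 64)
    return 0;
  uint64_t run = (mask >> shift) + 1;
  int run_is_pow2 = run != 0 && (run & (run - 1)) == 0;
  int low_bits_clear = (mask & ((UINT64_C (1) << shift) - 1)) == 0;
  return run_is_pow2 && low_bits_clear;
}

/* Model of the split condition: a valid UBFIZ whose bit position plus
   width is exactly 32, so the field ends at bit 31.  */
static int
can_split_to_w_lsl (uint64_t mask, unsigned shift)
{
  return mask_and_shift_ok (mask, shift)
	 && shift + (unsigned) popcount64 (mask) == 32;
}
```

Under these assumptions, a shift of 5 with mask 0xffffffe0 qualifies (5 + 27 == 32), while a shift of 5 with mask 0xffe0 describes a valid but narrower field (5 + 11 == 16) and would stay a UBFIZ.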