diff mbox series

[Arm] Fix NEON REG to REG reload failures. (PR/target 88850)

Message ID 20190205101308.GA12744@arm.com
State New
Headers show
Series [Arm] Fix NEON REG to REG reload failures. (PR/target 88850) | expand

Commit Message

Tamar Christina Feb. 5, 2019, 10:13 a.m. UTC
Hi All,

We currently return cost 2 for NEON REG to REG moves, which would be incorrect
for 64 bit moves.  We currently don't have a pattern for this in the neon_move
alternatives because this is a bit of a special case.  We would almost never
want it to use this r -> r pattern unless it really has no choice.

As such we add a new neon r -> r move pattern but also hide it from being used
to determine register preferences and also disparage it during LRA.

Bootstrapped Regtested on arm-none-gnueabihf and no issues.

Ok for trunk?

Thanks,
Tamar

gcc/ChangeLog:

2019-02-05  Tamar Christina  <tamar.christina@arm.com>

	PR/target 88850
	* config/arm/neon.md (*neon_mov<mode>): Add r -> r case.

gcc/testsuite/ChangeLog:

2019-02-05  Tamar Christina  <tamar.christina@arm.com>

	PR/target 88850
	* gcc.target/arm/pr88850.c: New test.

--

Comments

Kyrill Tkachov Feb. 6, 2019, 1:58 p.m. UTC | #1
Hi Tamar,

On 05/02/19 10:13, Tamar Christina wrote:
> Hi All,
>
> We currently return cost 2 for NEON REG to REG moves, which would be incorrect
> for 64 bit moves.  We currently don't have a pattern for this in the neon_move
> alternatives because this is a bit of a special case.  We would almost never
> want it to use this r -> r pattern unless it really has no choice.
>
> As such we add a new neon r -> r move pattern but also hide it from being used
> to determine register preferences and also disparage it during LRA.
>
> Bootstrapped Regtested on arm-none-gnueabihf and no issues.
>
> Ok for trunk?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 2019-02-05  Tamar Christina  <tamar.christina@arm.com>
>
>         PR/target 88850
>         * config/arm/neon.md (*neon_mov<mode>): Add r -> r case.
>
> gcc/testsuite/ChangeLog:
>
> 2019-02-05  Tamar Christina  <tamar.christina@arm.com>
>
>         PR/target 88850
>         * gcc.target/arm/pr88850.c: New test.
>
> -- 

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index f9d7ba35b137fed383f84eecbe81dd942943d216..21ff5adbd40c362aa977171bd1726b88b383a265 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -25,9 +25,9 @@
  
  (define_insn "*neon_mov<mode>"
    [(set (match_operand:VDX 0 "nonimmediate_operand"
-	  "=w,Un,w, w,  ?r,?w,?r, ?Us")
+	  "=w,Un,w, w,  ?r,?w,?r, ?Us,*r")
  	(match_operand:VDX 1 "general_operand"
-	  " w,w, Dn,Uni, w, r, Usi,r"))]
+	  " w,w, Dn,Uni, w, r, Usi,r,*r"))]
    "TARGET_NEON
     && (register_operand (operands[0], <MODE>mode)
         || register_operand (operands[1], <MODE>mode))"
@@ -62,11 +62,11 @@
  }
   [(set_attr "type" "neon_move<q>,neon_store1_1reg,neon_move<q>,\
                      neon_load1_1reg, neon_to_gp<q>,neon_from_gp<q>,\
-                    neon_load1_2reg, neon_store1_2reg")
-  (set_attr "length" "4,4,4,4,4,4,8,8")
-  (set_attr "arm_pool_range"     "*,*,*,1020,*,*,1020,*")
-  (set_attr "thumb2_pool_range"     "*,*,*,1018,*,*,1018,*")
-  (set_attr "neg_pool_range" "*,*,*,1004,*,*,1004,*")])
+                    neon_load1_2reg, neon_store1_2reg, multiple")
+  (set_attr "length" "4,4,4,4,4,4,8,8,8")
+  (set_attr "arm_pool_range"     "*,*,*,1020,*,*,1020,*,*")
+  (set_attr "thumb2_pool_range"     "*,*,*,1018,*,*,1018,*,*")
+  (set_attr "neg_pool_range" "*,*,*,1004,*,*,1004,*,*")])
  

So you add a new alternative but don't modify any of the output logic, which means it will use
output_move_double. Can that handle an "r,r" alternative?
Or do you expect a split to always happen here? If so, the output should be '#'

Thanks,
Kyrill
Tamar Christina Feb. 6, 2019, 2:28 p.m. UTC | #2
Hi Kyrill,

> 
> So you add a new alternative but don't modify any of the output logic, which
> means it will use output_move_double. Can that handle an "r,r" alternative?
> Or do you expect a split to always happen here? If so, the output should be
> '#'

Yes.. so output_move_double can't indeed handle it, but it doesn't actually get used. The pattern here is only because reload has decided there must be one because of our cost = 2.  In the end they will always be split due to the splitter in arm.md line 5897 which will split any 64 bit values after reload. And since this is an anonymous pattern you can't call It, so there really shouldn't be any case in which the r -> r case doesn't get split as VDX only contains 64 bit types.

However I agree it may be confusing and probably a bit more correct if I put the #.

Kind Regards,
Tamar
> 
> Thanks,
> Kyrill
Kyrylo Tkachov Feb. 6, 2019, 2:31 p.m. UTC | #3
On 06/02/19 14:28, Tamar Christina wrote:
> Hi Kyrill,
>
> >
> > So you add a new alternative but don't modify any of the output logic, which
> > means it will use output_move_double. Can that handle an "r,r" alternative?
> > Or do you expect a split to always happen here? If so, the output should be
> > '#'
>
> Yes.. so output_move_double can't indeed handle it, but it doesn't actually get used. The pattern here is only because reload has decided there must be one because of our cost = 2.  In the end they will always be split due to the splitter in arm.md line 5897 which will split any 64 bit values after reload. And since this is an anonymous pattern you can't call It, so there really shouldn't be any case in which the r -> r case doesn't get split as VDX only contains 64 bit types.
>

Ok.

> However I agree it may be confusing and probably a bit more correct if I put the #.

Yeah, returning a '#' explicitly is the correct thing to do in that case.
Ok with that change.

Thanks,
Kyrill

>
> Kind Regards,
> Tamar
> >
> > Thanks,
> > Kyrill
>
diff mbox series

Patch

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index f9d7ba35b137fed383f84eecbe81dd942943d216..21ff5adbd40c362aa977171bd1726b88b383a265 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -25,9 +25,9 @@ 
 
 (define_insn "*neon_mov<mode>"
   [(set (match_operand:VDX 0 "nonimmediate_operand"
-	  "=w,Un,w, w,  ?r,?w,?r, ?Us")
+	  "=w,Un,w, w,  ?r,?w,?r, ?Us,*r")
 	(match_operand:VDX 1 "general_operand"
-	  " w,w, Dn,Uni, w, r, Usi,r"))]
+	  " w,w, Dn,Uni, w, r, Usi,r,*r"))]
   "TARGET_NEON
    && (register_operand (operands[0], <MODE>mode)
        || register_operand (operands[1], <MODE>mode))"
@@ -62,11 +62,11 @@ 
 }
  [(set_attr "type" "neon_move<q>,neon_store1_1reg,neon_move<q>,\
                     neon_load1_1reg, neon_to_gp<q>,neon_from_gp<q>,\
-                    neon_load1_2reg, neon_store1_2reg")
-  (set_attr "length" "4,4,4,4,4,4,8,8")
-  (set_attr "arm_pool_range"     "*,*,*,1020,*,*,1020,*")
-  (set_attr "thumb2_pool_range"     "*,*,*,1018,*,*,1018,*")
-  (set_attr "neg_pool_range" "*,*,*,1004,*,*,1004,*")])
+                    neon_load1_2reg, neon_store1_2reg, multiple")
+  (set_attr "length" "4,4,4,4,4,4,8,8,8")
+  (set_attr "arm_pool_range"     "*,*,*,1020,*,*,1020,*,*")
+  (set_attr "thumb2_pool_range"     "*,*,*,1018,*,*,1018,*,*")
+  (set_attr "neg_pool_range" "*,*,*,1004,*,*,1004,*,*")])
 
 (define_insn "*neon_mov<mode>"
   [(set (match_operand:VQXMOV 0 "nonimmediate_operand"
diff --git a/gcc/testsuite/gcc.target/arm/pr88850.c b/gcc/testsuite/gcc.target/arm/pr88850.c
new file mode 100644
index 0000000000000000000000000000000000000000..66d0e9e620aef4f092a33ead79521901b5d89636
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr88850.c
@@ -0,0 +1,22 @@ 
+/* PR target/88850 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a -mfloat-abi=softfp -mfpu=neon -fdump-rtl-final" } */
+/* { dg-require-effective-target arm_neon_ok } */
+
+typedef __builtin_neon_qi int8x8_t __attribute__ ((__vector_size__ (8)));
+void bar (int8x8_t, int8x8_t);
+
+void
+foo (int8x8_t x, int8x8_t y)
+{
+ bar (y, x);
+}
+
+void foo2 (int8x8_t x)
+{
+  int8x8_t y;
+  bar (y, x);
+}
+
+/* Check that these 64-bit moves are done in SI.  */
+/* { dg-final { scan-rtl-dump "_movsi_vfp" "final" } } */