Message ID | 5AB140DE.20609@foss.arm.com |
---|---|
State | New |
Headers | show |
Series | [arm] PR target/82518: Return false in ARRAY_MODE_SUPPORTED_P for BYTES_BIG_ENDIAN | expand |
On 20 March 2018 at 18:11, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > Hi all, > > This PR shows that we get the load/store_lanes logic wrong for arm > big-endian. > It is tricky to get right. Aarch64 does it by adding the appropriate > lane-swapping > operations during expansion. > > I'd like to do the same on arm eventually, but we'd need to port and > validate the VTBL-generating > code and add it to all the right places and I'm not comfortable enough doing > it for GCC 8, but I am keen > in getting the wrong-code fixed. > As I say in the PR, vectorisation on armeb is already severely restricted > (we disable many patterns on BYTES_BIG_ENDIAN) > and the load/store_lanes patterns really were not working properly at all, > so disabling them is not > a radical approach. > > The way to do that is to return false in ARRAY_MODE_SUPPORTED_P for > BYTES_BIG_ENDIAN. > > Bootstrapped and tested on arm-none-linux-gnueabihf. > Also tested on armeb-none-eabi. > > Committing to trunk. > Thanks, > Kyrill > > 2018-03-20 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR target/82518 > * config/arm/arm.c (arm_array_mode_supported_p): Return false for > BYTES_BIG_ENDIAN. > > 2018-03-20 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR target/82518 > * lib/target-supports.exp (check_effective_target_vect_load_lanes): > Disable for armeb targets. > * gcc.target/arm/pr82518.c: New test. Hi Kyrill, I think you need: Index: pr82518.c =================================================================== --- pr82518.c (revision 258705) +++ pr82518.c (working copy) @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-require-effective-target arm_neon_hw } */ /* { dg-additional-options "-O3 -fno-inline -std=gnu99" } */ /* { dg-add-options arm_neon } */ (The test fails to run for me when using qemu --cpu arm926) OK? Christophe
On 21/03/18 09:07, Christophe Lyon wrote: > On 20 March 2018 at 18:11, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: >> Hi all, >> >> This PR shows that we get the load/store_lanes logic wrong for arm >> big-endian. >> It is tricky to get right. Aarch64 does it by adding the appropriate >> lane-swapping >> operations during expansion. >> >> I'd like to do the same on arm eventually, but we'd need to port and >> validate the VTBL-generating >> code and add it to all the right places and I'm not comfortable enough doing >> it for GCC 8, but I am keen >> in getting the wrong-code fixed. >> As I say in the PR, vectorisation on armeb is already severely restricted >> (we disable many patterns on BYTES_BIG_ENDIAN) >> and the load/store_lanes patterns really were not working properly at all, >> so disabling them is not >> a radical approach. >> >> The way to do that is to return false in ARRAY_MODE_SUPPORTED_P for >> BYTES_BIG_ENDIAN. >> >> Bootstrapped and tested on arm-none-linux-gnueabihf. >> Also tested on armeb-none-eabi. >> >> Committing to trunk. >> Thanks, >> Kyrill >> >> 2018-03-20 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> PR target/82518 >> * config/arm/arm.c (arm_array_mode_supported_p): Return false for >> BYTES_BIG_ENDIAN. >> >> 2018-03-20 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> PR target/82518 >> * lib/target-supports.exp (check_effective_target_vect_load_lanes): >> Disable for armeb targets. >> * gcc.target/arm/pr82518.c: New test. > Hi Kyrill, > > I think you need: > Index: pr82518.c > =================================================================== > --- pr82518.c (revision 258705) > +++ pr82518.c (working copy) > @@ -1,5 +1,5 @@ > /* { dg-do run } */ > -/* { dg-require-effective-target arm_neon_ok } */ > +/* { dg-require-effective-target arm_neon_hw } */ > /* { dg-additional-options "-O3 -fno-inline -std=gnu99" } */ > /* { dg-add-options arm_neon } */ > > (The test fails to run for me when using qemu --cpu arm926) > > OK? Yes, you're right. Thanks Christophe. Kyrill > Christophe
> On Mar 20, 2018, at 8:11 PM, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > > Hi all, > > This PR shows that we get the load/store_lanes logic wrong for arm big-endian. > It is tricky to get right. Aarch64 does it by adding the appropriate lane-swapping > operations during expansion. > > I'd like to do the same on arm eventually, but we'd need to port and validate the VTBL-generating > code and add it to all the right places and I'm not comfortable enough doing it for GCC 8, but I am keen > in getting the wrong-code fixed. > As I say in the PR, vectorisation on armeb is already severely restricted (we disable many patterns on BYTES_BIG_ENDIAN) > and the load/store_lanes patterns really were not working properly at all, so disabling them is not > a radical approach. > > The way to do that is to return false in ARRAY_MODE_SUPPORTED_P for BYTES_BIG_ENDIAN. > > Bootstrapped and tested on arm-none-linux-gnueabihf. > Also tested on armeb-none-eabi. > > Committing to trunk. ... > --- a/gcc/testsuite/lib/target-supports.exp > +++ b/gcc/testsuite/lib/target-supports.exp > @@ -6609,7 +6609,8 @@ proc check_effective_target_vect_load_lanes { } { > verbose "check_effective_target_vect_load_lanes: using cached result" 2 > } else { > set et_vect_load_lanes 0 > - if { ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]) > + # We don't support load_lanes correctly on big-endian arm. > + if { ([istarget arm-*-*] && [check_effective_target_arm_neon_ok]) > || [istarget aarch64*-*-*] } { > set et_vect_load_lanes 1 > } > Hi Kyrill, This part makes armv8l-linux-gnueabihf target fail a few of slp-perm-* tests. Using [check_effective_target_arm_little_endian] should fix this. Would you please fix this on master and gcc-7-branch? Thanks! -- Maxim Kuvyrkov www.linaro.org
On 25/04/18 18:31, Maxim Kuvyrkov wrote: >> On Mar 20, 2018, at 8:11 PM, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: >> >> Hi all, >> >> This PR shows that we get the load/store_lanes logic wrong for arm big-endian. >> It is tricky to get right. Aarch64 does it by adding the appropriate lane-swapping >> operations during expansion. >> >> I'd like to do the same on arm eventually, but we'd need to port and validate the VTBL-generating >> code and add it to all the right places and I'm not comfortable enough doing it for GCC 8, but I am keen >> in getting the wrong-code fixed. >> As I say in the PR, vectorisation on armeb is already severely restricted (we disable many patterns on BYTES_BIG_ENDIAN) >> and the load/store_lanes patterns really were not working properly at all, so disabling them is not >> a radical approach. >> >> The way to do that is to return false in ARRAY_MODE_SUPPORTED_P for BYTES_BIG_ENDIAN. >> >> Bootstrapped and tested on arm-none-linux-gnueabihf. >> Also tested on armeb-none-eabi. >> >> Committing to trunk. > ... >> --- a/gcc/testsuite/lib/target-supports.exp >> +++ b/gcc/testsuite/lib/target-supports.exp >> @@ -6609,7 +6609,8 @@ proc check_effective_target_vect_load_lanes { } { >> verbose "check_effective_target_vect_load_lanes: using cached result" 2 >> } else { >> set et_vect_load_lanes 0 >> - if { ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]) >> + # We don't support load_lanes correctly on big-endian arm. >> + if { ([istarget arm-*-*] && [check_effective_target_arm_neon_ok]) >> || [istarget aarch64*-*-*] } { >> set et_vect_load_lanes 1 >> } >> > Hi Kyrill, > > This part makes armv8l-linux-gnueabihf target fail a few of slp-perm-* tests. Using [check_effective_target_arm_little_endian] should fix this. > > Would you please fix this on master and gcc-7-branch? Hi Maxim, Thanks for catching this. This patch fixes the failures (I've reproduced them on a armv8l-linux-gnueabihf target). Committing to trunk. Richi, Jakub, is this ok to commit to the GCC 8 branch at this time? Thanks, Kyrill 2018-04-27 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * lib/target-supports.exp (check_effective_target_vect_load_lanes): Use check_effective_target_arm_little_endian. diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 00f988cc756b4f5898b9c6290722d1353240b3a3..6d497fbcf8033cae06f54e0601f14e4affe29923 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -6588,7 +6588,7 @@ proc check_effective_target_vect_load_lanes { } { } else { set et_vect_load_lanes 0 # We don't support load_lanes correctly on big-endian arm. - if { ([istarget arm-*-*] && [check_effective_target_arm_neon_ok]) + if { ([check_effective_target_arm_little_endian] && [check_effective_target_arm_neon_ok]) || [istarget aarch64*-*-*] } { set et_vect_load_lanes 1 }
commit 861f9881d3af7e6f94e9b008e88aa7ba7c44a4df Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Tue Feb 20 15:14:01 2018 +0000 [arm] PR target/82518: Return false in ARRAY_MODE_SUPPORTED_P for BYTES_BIG_ENDIAN diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 73e9fc1..09deb60 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -27151,7 +27151,10 @@ static bool arm_array_mode_supported_p (machine_mode mode, unsigned HOST_WIDE_INT nelems) { - if (TARGET_NEON + /* We don't want to enable interleaved loads and stores for BYTES_BIG_ENDIAN + for now, as the lane-swapping logic needs to be extended in the expanders. + See PR target/82518. */ + if (TARGET_NEON && !BYTES_BIG_ENDIAN && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode)) && (nelems >= 2 && nelems <= 4)) return true; diff --git a/gcc/testsuite/gcc.target/arm/pr82518.c b/gcc/testsuite/gcc.target/arm/pr82518.c new file mode 100644 index 0000000..c3e45b8 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr82518.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-additional-options "-O3 -fno-inline -std=gnu99" } */ +/* { dg-add-options arm_neon } */ + +typedef struct { int x, y; } X; + +void f4(X *p, int n) +{ + for (int i = 0; i < n; i++) + { p[i].x = i; + p[i].y = i + 1; + } +} + +__attribute ((aligned (16))) X arr[100]; + +int main(void) +{ + volatile int fail = 0; + f4 (arr, 100); + for (int i = 0; i < 100; i++) + if (arr[i].y != i+1 || arr[i].x != i) + fail = 1; + if (fail) + __builtin_abort (); + + return 0; +} diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 640f76e..5ff5ec6 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -6609,7 +6609,8 @@ proc check_effective_target_vect_load_lanes { } { verbose "check_effective_target_vect_load_lanes: using cached result" 2 } else { set et_vect_load_lanes 0 - if { ([istarget arm*-*-*] && [check_effective_target_arm_neon_ok]) + # We don't support load_lanes correctly on big-endian arm. + if { ([istarget arm-*-*] && [check_effective_target_arm_neon_ok]) || [istarget aarch64*-*-*] } { set et_vect_load_lanes 1 }