Message ID | 20200817085933.z5fg52xopegjojtn@arm.com |
---|---|
State | New |
Headers | show |
Series | aarch64: Don't generate invalid zero/sign-extend syntax | expand |
Alex Coplan <alex.coplan@arm.com> writes: > Note that an obvious omission here is that this patch does not touch the > mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found > that I couldn't hit these patterns with C code since multiplications by > powers of two always get turned into shifts by earlier RTL passes. If > there's a way to reliably hit these patterns, then perhaps these should > be updated as well. Hmm. Feels like we should either update them or delete them. E.g.: *adds_<optab><mode>_multp2 *subs_<optab><mode>_multp2 were added alongside the adds3.c and subs3.c tests that you're updating, so if the tests don't/no longer need the multp2 patterns to pass, there's a good chance that the patterns are redundant. For reasons I never understood, the canonical representation is to use (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s. So perhaps the patterns were originally for address calculations that had been moved outside of a (mem …) and not updated to shifts instead of mults. AFAICT the full list of affected patterns is: *adds_<optab><mode>_multp2 *subs_<optab><mode>_multp2 *add_<optab><ALLX:mode>_mult_<GPI:mode> *add_uxt<mode>_multp2 *sub_uxt<mode>_multp2 Is that right? If so, I think we should consider a follow-on patch to delete them. > Testing: > * New test which checks for the correct syntax in all updated > patterns (fails before and passes after the aarch64.md change). > * New test can be assembled by both GAS and llvm-mc following the > change. > * Bootstrapped and regtested on aarch64-none-linux-gnu. > > OK for master? OK as-is if paired with a follow-on patch to delete the patterns above (preapproved if it passes testing). Also OK without a follow-on patch if the fix is extended to the patterns above too (but the first option is better :-)). Thanks for taking the time to find a test for each pattern. Richard
Richard Sandiford <richard.sandiford@arm.com> wrote: > Alex Coplan <alex.coplan@arm.com> writes: >> Note that an obvious omission here is that this patch does not touch the >> mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found >> that I couldn't hit these patterns with C code since multiplications by >> powers of two always get turned into shifts by earlier RTL passes. If >> there's a way to reliably hit these patterns, then perhaps these should >> be updated as well. > > Hmm. Feels like we should either update them or delete them. E.g.: > > *adds_<optab><mode>_multp2 > *subs_<optab><mode>_multp2 FWIW add_extvdi_multp2 seems to fire for me, building libstdc++ (c++11 cow-wstring-inst) on my [very experimental] initial attempts at a Darwin port. (I see these failures too because the platform assembler is based off the LLVM backend which complains) Iain > were added alongside the adds3.c and subs3.c tests that you're updating, > so if the tests don't/no longer need the multp2 patterns to pass, > there's a good chance that the patterns are redundant. > > For reasons I never understood, the canonical representation is to use > (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s. > So perhaps the patterns were originally for address calculations that had > been moved outside of a (mem …) and not updated to shifts instead of mults. > > AFAICT the full list of affected patterns is: > > *adds_<optab><mode>_multp2 > *subs_<optab><mode>_multp2 > *add_<optab><ALLX:mode>_mult_<GPI:mode> > *add_uxt<mode>_multp2 > *sub_uxt<mode>_multp2 > > Is that right? If so, I think we should consider a follow-on patch > to delete them. > >> Testing: >> * New test which checks for the correct syntax in all updated >> patterns (fails before and passes after the aarch64.md change). >> * New test can be assembled by both GAS and llvm-mc following the >> change. >> * Bootstrapped and regtested on aarch64-none-linux-gnu. >> >> OK for master? > > OK as-is if paired with a follow-on patch to delete the patterns above > (preapproved if it passes testing). Also OK without a follow-on patch > if the fix is extended to the patterns above too (but the first option > is better :-)). > > Thanks for taking the time to find a test for each pattern. > > Richard
Hi Richard, > -----Original Message----- > From: Richard Sandiford <richard.sandiford@arm.com> > Sent: 18 August 2020 09:35 > To: Alex Coplan <Alex.Coplan@arm.com> > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw <Richard.Earnshaw@arm.com>; > Marcus Shawcroft <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov > <Kyrylo.Tkachov@arm.com> > Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend > syntax > > Alex Coplan <alex.coplan@arm.com> writes: > > Note that an obvious omission here is that this patch does not touch > the > > mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found > > that I couldn't hit these patterns with C code since multiplications by > > powers of two always get turned into shifts by earlier RTL passes. If > > there's a way to reliably hit these patterns, then perhaps these should > > be updated as well. > > Hmm. Feels like we should either update them or delete them. E.g.: > > *adds_<optab><mode>_multp2 > *subs_<optab><mode>_multp2 > > were added alongside the adds3.c and subs3.c tests that you're updating, > so if the tests don't/no longer need the multp2 patterns to pass, > there's a good chance that the patterns are redundant. > > For reasons I never understood, the canonical representation is to use > (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s. > So perhaps the patterns were originally for address calculations that had > been moved outside of a (mem …) and not updated to shifts instead of > mults. Thanks for the review, and for clarifying this. I tried removing these together with the *_mul_imm_ patterns (e.g. *adds_mul_imm_<mode>) and the only failure was gcc/testsuite/gcc.dg/torture/pr34330.c when compiled with -Os -ftree-vectorize which appears to depend on the *add_mul_imm_di pattern. Without this pattern, we ICE in LRA on this input. In this case, GCC appears to have done exactly what you described: we have the (plus (mult ...) ...) nodes inside (mem)s prior to register allocation, and then we end up pulling these out without converting them to shifts. Seeing this behaviour (and in particular seeing the ICE) makes me hesitant to just go ahead and remove the other patterns. That said, I have a patch to remove the following patterns: *adds_<optab><mode>_multp2 *subs_<optab><mode>_multp2 *add_<optab><ALLX:mode>_mult_<GPI:mode> *add_<optab><SHORT:mode>_mult_si_uxtw *add_<optab><mode>_multp2 *add_<optab>si_multp2_uxtw *add_uxt<mode>_multp2 *add_uxtsi_multp2_uxtw *sub_<optab><mode>_multp2 *sub_<optab>si_multp2_uxtw *sub_uxt<mode>_multp2 *sub_uxtsi_multp2_uxtw *sub_uxtsi_multp2_uxtw (together with the predicate aarch64_pwr_imm3 which is only used in these patterns) and this bootstraps/regtests just fine. So, I have a couple of questions: (1) Should it be considered a bug if we pull (plus (mult (power of 2) ...) ...) out of a (mem) RTX without re-writing the (mult) as a shift? (2) If not, can we otherwise justify the removal of the patterns here? I'm happy to go ahead with this if either (1) or (2) are true. Thanks, Alex
Alex Coplan <Alex.Coplan@arm.com> writes: > Hi Richard, > >> -----Original Message----- >> From: Richard Sandiford <richard.sandiford@arm.com> >> Sent: 18 August 2020 09:35 >> To: Alex Coplan <Alex.Coplan@arm.com> >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw <Richard.Earnshaw@arm.com>; >> Marcus Shawcroft <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov >> <Kyrylo.Tkachov@arm.com> >> Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend >> syntax >> >> Alex Coplan <alex.coplan@arm.com> writes: >> > Note that an obvious omission here is that this patch does not touch >> the >> > mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found >> > that I couldn't hit these patterns with C code since multiplications by >> > powers of two always get turned into shifts by earlier RTL passes. If >> > there's a way to reliably hit these patterns, then perhaps these should >> > be updated as well. >> >> Hmm. Feels like we should either update them or delete them. E.g.: >> >> *adds_<optab><mode>_multp2 >> *subs_<optab><mode>_multp2 >> >> were added alongside the adds3.c and subs3.c tests that you're updating, >> so if the tests don't/no longer need the multp2 patterns to pass, >> there's a good chance that the patterns are redundant. >> >> For reasons I never understood, the canonical representation is to use >> (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s. >> So perhaps the patterns were originally for address calculations that had >> been moved outside of a (mem …) and not updated to shifts instead of >> mults. > > Thanks for the review, and for clarifying this. I tried removing these > together with the *_mul_imm_ patterns (e.g. *adds_mul_imm_<mode>) and > the only failure was gcc/testsuite/gcc.dg/torture/pr34330.c when > compiled with -Os -ftree-vectorize which appears to depend on the > *add_mul_imm_di pattern. Without this pattern, we ICE in LRA on this > input. > > In this case, GCC appears to have done exactly what you described: we > have the (plus (mult ...) ...) nodes inside (mem)s prior to register > allocation, and then we end up pulling these out without converting them > to shifts. > > Seeing this behaviour (and in particular seeing the ICE) makes me > hesitant to just go ahead and remove the other patterns. That said, I > have a patch to remove the following patterns: > > *adds_<optab><mode>_multp2 > *subs_<optab><mode>_multp2 > *add_<optab><ALLX:mode>_mult_<GPI:mode> > *add_<optab><SHORT:mode>_mult_si_uxtw > *add_<optab><mode>_multp2 > *add_<optab>si_multp2_uxtw > *add_uxt<mode>_multp2 > *add_uxtsi_multp2_uxtw > *sub_<optab><mode>_multp2 > *sub_<optab>si_multp2_uxtw > *sub_uxt<mode>_multp2 > *sub_uxtsi_multp2_uxtw > *sub_uxtsi_multp2_uxtw > > (together with the predicate aarch64_pwr_imm3 which is only used in > these patterns) and this bootstraps/regtests just fine. > > So, I have a couple of questions: > > (1) Should it be considered a bug if we pull (plus (mult (power of 2) > ...) ...) out of a (mem) RTX without re-writing the (mult) as a > shift? IMO, yes. But if we have an example in which it happens, we have to fix it before removing the patterns. That could end up being a bit of a rabbit hole, and could affect other targets too. If we keep the patterns, we should fix the [su]xtw problem in: *adds_<optab><mode>_multp2 *subs_<optab><mode>_multp2 *add_<optab><ALLX:mode>_mult_<GPI:mode> *add_uxt<mode>_multp2 *sub_uxt<mode>_multp2 too. (Plus any others I missed, if that isn't the full list.) Thanks, Richard
On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote: > > Hello, > > Given the following C function: > > double *f(double *p, unsigned x) > { > return p + x; > } > > prior to this patch, GCC at -O2 would generate: > > f: > add x0, x0, x1, uxtw 3 > ret > > but this add instruction uses architecturally-invalid syntax: the width > of the third operand conflicts with the width of the extension > specifier. The third operand is only permitted to be an x register when > the extension specifier is (u|s)xtx. > > This instruction, and analogous insns for adds, sub, subs, and cmp, are > rejected by clang, but accepted by binutils. Assembling and > disassembling such an insn with binutils gives the architecturally-valid > version in the disassembly: > > 0: 8b214c00 add x0, x0, w1, uxtw #3 > > This patch fixes several patterns in the AArch64 backend to use the > standard syntax as specified in the Arm ARM such that GCC's output can > be assembled by assemblers other than GAS. > > Note that an obvious omission here is that this patch does not touch the > mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found > that I couldn't hit these patterns with C code since multiplications by > powers of two always get turned into shifts by earlier RTL passes. If > there's a way to reliably hit these patterns, then perhaps these should > be updated as well. > > Testing: > * New test which checks for the correct syntax in all updated > patterns (fails before and passes after the aarch64.md change). > * New test can be assembled by both GAS and llvm-mc following the > change. > * Bootstrapped and regtested on aarch64-none-linux-gnu. > > OK for master? > > Thanks, > Alex > > --- > > gcc/ChangeLog: > > * config/aarch64/aarch64.md > (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand > agrees with width of extension specifier. > (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise. > (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise. > (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > (*add_uxt<mode>_shift2): Likewise. > (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise. > (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > (*sub_uxt<mode>_shift2): Likewise. > (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise. > (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax. > * gcc.target/aarch64/cmp.c: Likewise. > * gcc.target/aarch64/subs3.c: Likewise. > * gcc.target/aarch64/subsp.c: Likewise. > * gcc.target/aarch64/extend-syntax.c: New test. > Hi, I've noticed some of the new tests fail with -mabi=ilp32: gcc.target/aarch64/extend-syntax.c check-function-bodies add1 gcc.target/aarch64/extend-syntax.c check-function-bodies add3 gcc.target/aarch64/extend-syntax.c check-function-bodies sub2 gcc.target/aarch64/extend-syntax.c check-function-bodies sub3 gcc.target/aarch64/extend-syntax.c scan-assembler-times subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1 gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n Christophe
Hi Christophe, > -----Original Message----- > From: Christophe Lyon <christophe.lyon@linaro.org> > Sent: 08 September 2020 09:15 > To: Alex Coplan <Alex.Coplan@arm.com> > Cc: gcc Patches <gcc-patches@gcc.gnu.org>; Richard Earnshaw > <Richard.Earnshaw@arm.com>; Marcus Shawcroft <Marcus.Shawcroft@arm.com> > Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend > syntax > > > gcc/ChangeLog: > > > > * config/aarch64/aarch64.md > > (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand > > agrees with width of extension specifier. > > (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > > (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > > (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > (*add_uxt<mode>_shift2): Likewise. > > (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > (*sub_uxt<mode>_shift2): Likewise. > > (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise. > > (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax. > > * gcc.target/aarch64/cmp.c: Likewise. > > * gcc.target/aarch64/subs3.c: Likewise. > > * gcc.target/aarch64/subsp.c: Likewise. > > * gcc.target/aarch64/extend-syntax.c: New test. > > > > Hi, > > I've noticed some of the new tests fail with -mabi=ilp32: > gcc.target/aarch64/extend-syntax.c check-function-bodies add1 > gcc.target/aarch64/extend-syntax.c check-function-bodies add3 > gcc.target/aarch64/extend-syntax.c check-function-bodies sub2 > gcc.target/aarch64/extend-syntax.c check-function-bodies sub3 > gcc.target/aarch64/extend-syntax.c scan-assembler-times > subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1 > gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw > 4\n Thanks for catching these. The failures in extend-syntax.c just need the assertions tweaking, I have a patch to fix those. The failure in subsp.c is more interesting: looks like a missed optimisation on ILP32, I'm taking a look. > > Christophe Thanks, Alex
Hi Christophe, On 08/09/2020 10:14, Christophe Lyon wrote: > On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote: > > > > gcc/ChangeLog: > > > > * config/aarch64/aarch64.md > > (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand > > agrees with width of extension specifier. > > (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > > (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > > (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > (*add_uxt<mode>_shift2): Likewise. > > (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > (*sub_uxt<mode>_shift2): Likewise. > > (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise. > > (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax. > > * gcc.target/aarch64/cmp.c: Likewise. > > * gcc.target/aarch64/subs3.c: Likewise. > > * gcc.target/aarch64/subsp.c: Likewise. > > * gcc.target/aarch64/extend-syntax.c: New test. > > > > Hi, > > I've noticed some of the new tests fail with -mabi=ilp32: > gcc.target/aarch64/extend-syntax.c check-function-bodies add1 > gcc.target/aarch64/extend-syntax.c check-function-bodies add3 > gcc.target/aarch64/extend-syntax.c check-function-bodies sub2 > gcc.target/aarch64/extend-syntax.c check-function-bodies sub3 > gcc.target/aarch64/extend-syntax.c scan-assembler-times > subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1 > gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n > > Christophe AFAICT the second scan-assembler in that subsp test failed on ILP32 before my commit. This is because we generate slightly suboptimal code here. On LP64 with -O, we get: f2: stp x29, x30, [sp, -16]! mov x29, sp add w1, w1, 1 sub sp, sp, x1, sxtw 4 mov x0, sp bl foo mov sp, x29 ldp x29, x30, [sp], 16 ret On ILP32, we get: f2: stp x29, x30, [sp, -16]! mov x29, sp add w1, w1, 1 lsl w1, w1, 4 sub sp, sp, x1 mov w0, wsp bl foo mov sp, x29 ldp x29, x30, [sp], 16 ret And we see similar results all the way back to GCC 6. So AFAICT this scan-assembler has never worked. The attached patch disables it on ILP32 since this isn't a code quality regression. This patch also fixes up the DejaGnu directives in extend-syntax.c to work on ILP32: we change the check-function-bodies directive to only run on LP64, adding scan-assembler directives for ILP32 where required. OK for trunk? Thanks, Alex diff --git a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c index 23fa9f4ffc5..1bfcdb59dde 100644 --- a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c +++ b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c @@ -20,6 +20,7 @@ unsigned long long *add1(unsigned long long *p, unsigned x) */ unsigned long long add2(unsigned long long x, unsigned y) { + /* { dg-final { scan-assembler-times "add\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */ return x + y; } @@ -34,6 +35,9 @@ double *add3(double *p, int x) return p + x; } +// add1 and add3 should both generate this on ILP32: +/* { dg-final { scan-assembler-times "add\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */ + // Hits *sub_zero_extendsi_di (*sub_<optab><ALLX:mode>_<GPI:mode>). /* ** sub1: @@ -42,6 +46,7 @@ double *add3(double *p, int x) */ unsigned long long sub1(unsigned long long x, unsigned n) { + /* { dg-final { scan-assembler-times "sub\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */ return x - n; } @@ -67,6 +72,9 @@ double *sub3(double *p, int n) return p - n; } +// sub2 and sub3 should both generate this on ILP32: +/* { dg-final { scan-assembler-times "sub\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */ + // Hits *adds_zero_extendsi_di (*adds_<optab><ALLX:mode>_<GPI:mode>). int adds1(unsigned long long x, unsigned y) { @@ -97,7 +105,8 @@ int subs1(unsigned long long x, unsigned y) unsigned long long *w; int subs2(unsigned long long *x, int y) { - /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */ + /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 { target lp64 } } } */ + /* { dg-final { scan-assembler-times "subs\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" 1 { target ilp32 } } } */ unsigned long long *t = x - y; w = t; return !!t; @@ -117,4 +126,4 @@ int cmp2(unsigned long long x, int y) return x == ((unsigned long long)y << 3); } -/* { dg-final { check-function-bodies "**" "" "" } } */ +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/subsp.c b/gcc/testsuite/gcc.target/aarch64/subsp.c index 341b83dca86..e7f61e0799b 100644 --- a/gcc/testsuite/gcc.target/aarch64/subsp.c +++ b/gcc/testsuite/gcc.target/aarch64/subsp.c @@ -16,4 +16,4 @@ f2 (int *x, int y) } /* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*\n" } } */ -/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" } } */ +/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" { target lp64 } } } */
Ping. Are these testsuite fixes for ILP32 OK? On 18/09/2020 17:15, Alex Coplan wrote: > Hi Christophe, > > On 08/09/2020 10:14, Christophe Lyon wrote: > > On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote: > > > > > > gcc/ChangeLog: > > > > > > * config/aarch64/aarch64.md > > > (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand > > > agrees with width of extension specifier. > > > (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > > (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > > > (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > > > (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > > (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > (*add_uxt<mode>_shift2): Likewise. > > > (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > > (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > (*sub_uxt<mode>_shift2): Likewise. > > > (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise. > > > (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax. > > > * gcc.target/aarch64/cmp.c: Likewise. > > > * gcc.target/aarch64/subs3.c: Likewise. > > > * gcc.target/aarch64/subsp.c: Likewise. > > > * gcc.target/aarch64/extend-syntax.c: New test. > > > > > > > Hi, > > > > I've noticed some of the new tests fail with -mabi=ilp32: > > gcc.target/aarch64/extend-syntax.c check-function-bodies add1 > > gcc.target/aarch64/extend-syntax.c check-function-bodies add3 > > gcc.target/aarch64/extend-syntax.c check-function-bodies sub2 > > gcc.target/aarch64/extend-syntax.c check-function-bodies sub3 > > gcc.target/aarch64/extend-syntax.c scan-assembler-times > > subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1 > > gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n > > > > Christophe > > AFAICT the second scan-assembler in that subsp test failed on ILP32 > before my commit. This is because we generate slightly suboptimal code > here. On LP64 with -O, we get: > > f2: > stp x29, x30, [sp, -16]! > mov x29, sp > add w1, w1, 1 > sub sp, sp, x1, sxtw 4 > mov x0, sp > bl foo > mov sp, x29 > ldp x29, x30, [sp], 16 > ret > > On ILP32, we get: > > f2: > stp x29, x30, [sp, -16]! > mov x29, sp > add w1, w1, 1 > lsl w1, w1, 4 > sub sp, sp, x1 > mov w0, wsp > bl foo > mov sp, x29 > ldp x29, x30, [sp], 16 > ret > > And we see similar results all the way back to GCC 6. So AFAICT this > scan-assembler has never worked. The attached patch disables it on ILP32 > since this isn't a code quality regression. > > This patch also fixes up the DejaGnu directives in extend-syntax.c to > work on ILP32: we change the check-function-bodies directive to only run > on LP64, adding scan-assembler directives for ILP32 where required. > > OK for trunk? > > Thanks, > Alex > diff --git a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c > index 23fa9f4ffc5..1bfcdb59dde 100644 > --- a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c > +++ b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c > @@ -20,6 +20,7 @@ unsigned long long *add1(unsigned long long *p, unsigned x) > */ > unsigned long long add2(unsigned long long x, unsigned y) > { > + /* { dg-final { scan-assembler-times "add\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */ > return x + y; > } > > @@ -34,6 +35,9 @@ double *add3(double *p, int x) > return p + x; > } > > +// add1 and add3 should both generate this on ILP32: > +/* { dg-final { scan-assembler-times "add\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */ > + > // Hits *sub_zero_extendsi_di (*sub_<optab><ALLX:mode>_<GPI:mode>). > /* > ** sub1: > @@ -42,6 +46,7 @@ double *add3(double *p, int x) > */ > unsigned long long sub1(unsigned long long x, unsigned n) > { > + /* { dg-final { scan-assembler-times "sub\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */ > return x - n; > } > > @@ -67,6 +72,9 @@ double *sub3(double *p, int n) > return p - n; > } > > +// sub2 and sub3 should both generate this on ILP32: > +/* { dg-final { scan-assembler-times "sub\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */ > + > // Hits *adds_zero_extendsi_di (*adds_<optab><ALLX:mode>_<GPI:mode>). > int adds1(unsigned long long x, unsigned y) > { > @@ -97,7 +105,8 @@ int subs1(unsigned long long x, unsigned y) > unsigned long long *w; > int subs2(unsigned long long *x, int y) > { > - /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */ > + /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 { target lp64 } } } */ > + /* { dg-final { scan-assembler-times "subs\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" 1 { target ilp32 } } } */ > unsigned long long *t = x - y; > w = t; > return !!t; > @@ -117,4 +126,4 @@ int cmp2(unsigned long long x, int y) > return x == ((unsigned long long)y << 3); > } > > -/* { dg-final { check-function-bodies "**" "" "" } } */ > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */ > diff --git a/gcc/testsuite/gcc.target/aarch64/subsp.c b/gcc/testsuite/gcc.target/aarch64/subsp.c > index 341b83dca86..e7f61e0799b 100644 > --- a/gcc/testsuite/gcc.target/aarch64/subsp.c > +++ b/gcc/testsuite/gcc.target/aarch64/subsp.c > @@ -16,4 +16,4 @@ f2 (int *x, int y) > } > > /* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*\n" } } */ > -/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" } } */ > +/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" { target lp64 } } } */
On Wed, 30 Sep 2020 at 16:03, Alex Coplan <alex.coplan@arm.com> wrote: > > Ping. Are these testsuite fixes for ILP32 OK? > LGTM, by looking at the patch (I didn't run it in ilp32 mode) Thanks Christophe > On 18/09/2020 17:15, Alex Coplan wrote: > > Hi Christophe, > > > > On 08/09/2020 10:14, Christophe Lyon wrote: > > > On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote: > > > > > > > > gcc/ChangeLog: > > > > > > > > * config/aarch64/aarch64.md > > > > (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand > > > > agrees with width of extension specifier. > > > > (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > > > (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > > > > (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. > > > > (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > > > (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > > (*add_uxt<mode>_shift2): Likewise. > > > > (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise. > > > > (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > > (*sub_uxt<mode>_shift2): Likewise. > > > > (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise. > > > > (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. > > > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax. > > > > * gcc.target/aarch64/cmp.c: Likewise. > > > > * gcc.target/aarch64/subs3.c: Likewise. > > > > * gcc.target/aarch64/subsp.c: Likewise. > > > > * gcc.target/aarch64/extend-syntax.c: New test. > > > > > > > > > > Hi, > > > > > > I've noticed some of the new tests fail with -mabi=ilp32: > > > gcc.target/aarch64/extend-syntax.c check-function-bodies add1 > > > gcc.target/aarch64/extend-syntax.c check-function-bodies add3 > > > gcc.target/aarch64/extend-syntax.c check-function-bodies sub2 > > > gcc.target/aarch64/extend-syntax.c check-function-bodies sub3 > > > gcc.target/aarch64/extend-syntax.c scan-assembler-times > > > subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1 > > > gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n > > > > > > Christophe > > > > AFAICT the second scan-assembler in that subsp test failed on ILP32 > > before my commit. This is because we generate slightly suboptimal code > > here. On LP64 with -O, we get: > > > > f2: > > stp x29, x30, [sp, -16]! > > mov x29, sp > > add w1, w1, 1 > > sub sp, sp, x1, sxtw 4 > > mov x0, sp > > bl foo > > mov sp, x29 > > ldp x29, x30, [sp], 16 > > ret > > > > On ILP32, we get: > > > > f2: > > stp x29, x30, [sp, -16]! > > mov x29, sp > > add w1, w1, 1 > > lsl w1, w1, 4 > > sub sp, sp, x1 > > mov w0, wsp > > bl foo > > mov sp, x29 > > ldp x29, x30, [sp], 16 > > ret > > > > And we see similar results all the way back to GCC 6. So AFAICT this > > scan-assembler has never worked. The attached patch disables it on ILP32 > > since this isn't a code quality regression. > > > > This patch also fixes up the DejaGnu directives in extend-syntax.c to > > work on ILP32: we change the check-function-bodies directive to only run > > on LP64, adding scan-assembler directives for ILP32 where required. > > > > OK for trunk? > > > > Thanks, > > Alex > > > diff --git a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c > > index 23fa9f4ffc5..1bfcdb59dde 100644 > > --- a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c > > +++ b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c > > @@ -20,6 +20,7 @@ unsigned long long *add1(unsigned long long *p, unsigned x) > > */ > > unsigned long long add2(unsigned long long x, unsigned y) > > { > > + /* { dg-final { scan-assembler-times "add\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */ > > return x + y; > > } > > > > @@ -34,6 +35,9 @@ double *add3(double *p, int x) > > return p + x; > > } > > > > +// add1 and add3 should both generate this on ILP32: > > +/* { dg-final { scan-assembler-times "add\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */ > > + > > // Hits *sub_zero_extendsi_di (*sub_<optab><ALLX:mode>_<GPI:mode>). > > /* > > ** sub1: > > @@ -42,6 +46,7 @@ double *add3(double *p, int x) > > */ > > unsigned long long sub1(unsigned long long x, unsigned n) > > { > > + /* { dg-final { scan-assembler-times "sub\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */ > > return x - n; > > } > > > > @@ -67,6 +72,9 @@ double *sub3(double *p, int n) > > return p - n; > > } > > > > +// sub2 and sub3 should both generate this on ILP32: > > +/* { dg-final { scan-assembler-times "sub\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */ > > + > > // Hits *adds_zero_extendsi_di (*adds_<optab><ALLX:mode>_<GPI:mode>). > > int adds1(unsigned long long x, unsigned y) > > { > > @@ -97,7 +105,8 @@ int subs1(unsigned long long x, unsigned y) > > unsigned long long *w; > > int subs2(unsigned long long *x, int y) > > { > > - /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */ > > + /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 { target lp64 } } } */ > > + /* { dg-final { scan-assembler-times "subs\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" 1 { target ilp32 } } } */ > > unsigned long long *t = x - y; > > w = t; > > return !!t; > > @@ -117,4 +126,4 @@ int cmp2(unsigned long long x, int y) > > return x == ((unsigned long long)y << 3); > > } > > > > -/* { dg-final { check-function-bodies "**" "" "" } } */ > > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */ > > diff --git a/gcc/testsuite/gcc.target/aarch64/subsp.c b/gcc/testsuite/gcc.target/aarch64/subsp.c > > index 341b83dca86..e7f61e0799b 100644 > > --- a/gcc/testsuite/gcc.target/aarch64/subsp.c > > +++ b/gcc/testsuite/gcc.target/aarch64/subsp.c > > @@ -16,4 +16,4 @@ f2 (int *x, int y) > > } > > > > /* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*\n" } } */ > > -/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" } } */ > > +/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" { target lp64 } } } */
Alex Coplan <alex.coplan@arm.com> writes: > Hi Christophe, > > On 08/09/2020 10:14, Christophe Lyon wrote: >> On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote: >> > >> > gcc/ChangeLog: >> > >> > * config/aarch64/aarch64.md >> > (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand >> > agrees with width of extension specifier. >> > (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise. >> > (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. >> > (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise. >> > (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise. >> > (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. >> > (*add_uxt<mode>_shift2): Likewise. >> > (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise. >> > (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. >> > (*sub_uxt<mode>_shift2): Likewise. >> > (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise. >> > (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise. >> > >> > >> > gcc/testsuite/ChangeLog: >> > >> > * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax. >> > * gcc.target/aarch64/cmp.c: Likewise. >> > * gcc.target/aarch64/subs3.c: Likewise. >> > * gcc.target/aarch64/subsp.c: Likewise. >> > * gcc.target/aarch64/extend-syntax.c: New test. >> > >> >> Hi, >> >> I've noticed some of the new tests fail with -mabi=ilp32: >> gcc.target/aarch64/extend-syntax.c check-function-bodies add1 >> gcc.target/aarch64/extend-syntax.c check-function-bodies add3 >> gcc.target/aarch64/extend-syntax.c check-function-bodies sub2 >> gcc.target/aarch64/extend-syntax.c check-function-bodies sub3 >> gcc.target/aarch64/extend-syntax.c scan-assembler-times >> subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1 >> gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n >> >> Christophe > > AFAICT the second scan-assembler in that subsp test failed on ILP32 > before my commit. This is because we generate slightly suboptimal code > here. On LP64 with -O, we get: > > f2: > stp x29, x30, [sp, -16]! > mov x29, sp > add w1, w1, 1 > sub sp, sp, x1, sxtw 4 > mov x0, sp > bl foo > mov sp, x29 > ldp x29, x30, [sp], 16 > ret > > On ILP32, we get: > > f2: > stp x29, x30, [sp, -16]! > mov x29, sp > add w1, w1, 1 > lsl w1, w1, 4 > sub sp, sp, x1 > mov w0, wsp > bl foo > mov sp, x29 > ldp x29, x30, [sp], 16 > ret > > And we see similar results all the way back to GCC 6. So AFAICT this > scan-assembler has never worked. The attached patch disables it on ILP32 > since this isn't a code quality regression. > > This patch also fixes up the DejaGnu directives in extend-syntax.c to > work on ILP32: we change the check-function-bodies directive to only run > on LP64, adding scan-assembler directives for ILP32 where required. > > OK for trunk? OK, thanks. Sorry for the slow review. Richard
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 9b20dd0b1a0..b1e83dfda78 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -2383,7 +2383,7 @@ (set (match_operand:GPI 0 "register_operand" "=r") (plus:GPI (ANY_EXTEND:GPI (match_dup 1)) (match_dup 2)))] "" - "adds\\t%<GPI:w>0, %<GPI:w>2, %<GPI:w>1, <su>xt<ALLX:size>" + "adds\\t%<GPI:w>0, %<GPI:w>2, %w1, <su>xt<ALLX:size>" [(set_attr "type" "alus_ext")] ) @@ -2397,7 +2397,7 @@ (set (match_operand:GPI 0 "register_operand" "=r") (minus:GPI (match_dup 1) (ANY_EXTEND:GPI (match_dup 2))))] "" - "subs\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size>" + "subs\\t%<GPI:w>0, %<GPI:w>1, %w2, <su>xt<ALLX:size>" [(set_attr "type" "alus_ext")] ) @@ -2415,7 +2415,7 @@ (match_dup 2)) (match_dup 3)))] "" - "adds\\t%<GPI:w>0, %<GPI:w>3, %<GPI:w>1, <su>xt<ALLX:size> %2" + "adds\\t%<GPI:w>0, %<GPI:w>3, %w1, <su>xt<ALLX:size> %2" [(set_attr "type" "alus_ext")] ) @@ -2433,7 +2433,7 @@ (ashift:GPI (ANY_EXTEND:GPI (match_dup 2)) (match_dup 3))))] "" - "subs\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size> %3" + "subs\\t%<GPI:w>0, %<GPI:w>1, %w2, <su>xt<ALLX:size> %3" [(set_attr "type" "alus_ext")] ) @@ -2549,7 +2549,7 @@ (plus:GPI (ANY_EXTEND:GPI (match_operand:ALLX 1 "register_operand" "r")) (match_operand:GPI 2 "register_operand" "r")))] "" - "add\\t%<GPI:w>0, %<GPI:w>2, %<GPI:w>1, <su>xt<ALLX:size>" + "add\\t%<GPI:w>0, %<GPI:w>2, %w1, <su>xt<ALLX:size>" [(set_attr "type" "alu_ext")] ) @@ -2571,7 +2571,7 @@ (match_operand 2 "aarch64_imm3" "Ui3")) (match_operand:GPI 3 "register_operand" "r")))] "" - "add\\t%<GPI:w>0, %<GPI:w>3, %<GPI:w>1, <su>xt<ALLX:size> %2" + "add\\t%<GPI:w>0, %<GPI:w>3, %w1, <su>xt<ALLX:size> %2" [(set_attr "type" "alu_ext")] ) @@ -2819,7 +2819,7 @@ "* operands[3] = GEN_INT (aarch64_uxt_size (INTVAL(operands[2]), INTVAL (operands[3]))); - return \"add\t%<w>0, %<w>4, %<w>1, uxt%e3 %2\";" + return \"add\t%<w>0, %<w>4, %w1, uxt%e3 %2\";" [(set_attr "type" "alu_ext")] ) @@ -3305,7 +3305,7 @@ (ANY_EXTEND:GPI (match_operand:ALLX 2 "register_operand" "r"))))] "" - "sub\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size>" + "sub\\t%<GPI:w>0, %<GPI:w>1, %w2, <su>xt<ALLX:size>" [(set_attr "type" "alu_ext")] ) @@ -3328,7 +3328,7 @@ (match_operand:ALLX 2 "register_operand" "r")) (match_operand 3 "aarch64_imm3" "Ui3"))))] "" - "sub\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size> %3" + "sub\\t%<GPI:w>0, %<GPI:w>1, %w2, <su>xt<ALLX:size> %3" [(set_attr "type" "alu_ext")] ) @@ -3607,7 +3607,7 @@ "* operands[3] = GEN_INT (aarch64_uxt_size (INTVAL (operands[2]), INTVAL (operands[3]))); - return \"sub\t%<w>0, %<w>4, %<w>1, uxt%e3 %2\";" + return \"sub\t%<w>0, %<w>4, %w1, uxt%e3 %2\";" [(set_attr "type" "alu_ext")] ) @@ -4054,7 +4054,7 @@ (match_operand:ALLX 0 "register_operand" "r")) (match_operand:GPI 1 "register_operand" "r")))] "" - "cmp\\t%<GPI:w>1, %<GPI:w>0, <su>xt<ALLX:size>" + "cmp\\t%<GPI:w>1, %w0, <su>xt<ALLX:size>" [(set_attr "type" "alus_ext")] ) @@ -4066,7 +4066,7 @@ (match_operand 1 "aarch64_imm3" "Ui3")) (match_operand:GPI 2 "register_operand" "r")))] "" - "cmp\\t%<GPI:w>2, %<GPI:w>0, <su>xt<ALLX:size> %1" + "cmp\\t%<GPI:w>2, %w0, <su>xt<ALLX:size> %1" [(set_attr "type" "alus_ext")] ) diff --git a/gcc/testsuite/gcc.target/aarch64/adds3.c b/gcc/testsuite/gcc.target/aarch64/adds3.c index c5518bdcaf2..e938c8049cf 100644 --- a/gcc/testsuite/gcc.target/aarch64/adds3.c +++ b/gcc/testsuite/gcc.target/aarch64/adds3.c @@ -58,4 +58,4 @@ int main () return 0; } -/* { dg-final { scan-assembler-times "adds\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, sxtw" 2 } } */ +/* { dg-final { scan-assembler-times "adds\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw" 2 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/cmp.c b/gcc/testsuite/gcc.target/aarch64/cmp.c index ee57dd283bf..a6487a4f77a 100644 --- a/gcc/testsuite/gcc.target/aarch64/cmp.c +++ b/gcc/testsuite/gcc.target/aarch64/cmp.c @@ -58,4 +58,5 @@ cmp_di_test4 (int a, s64 b, s64 c) } /* { dg-final { scan-assembler-times "cmp\tw\[0-9\]+, w\[0-9\]+" 2 } } */ -/* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, x\[0-9\]+" 4 } } */ +/* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, x\[0-9\]+" 2 } } */ +/* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, w\[0-9\]+, sxtw" 2 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c new file mode 100644 index 00000000000..23fa9f4ffc5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c @@ -0,0 +1,120 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +// Hits *add_uxtdi_shift2 (*add_uxt<mode>_shift2). +/* +** add1: +** add x0, x0, w1, uxtw 3 +** ret +*/ +unsigned long long *add1(unsigned long long *p, unsigned x) +{ + return p + x; +} + +// Hits *add_zero_extendsi_di (*add_<optab><ALLX:mode>_<GPI:mode>). +/* +** add2: +** add x0, x0, w1, uxtw +** ret +*/ +unsigned long long add2(unsigned long long x, unsigned y) +{ + return x + y; +} + +// Hits *add_extendsi_shft_di (*add_<optab><ALLX:mode>_shft_<GPI:mode>). +/* +** add3: +** add x0, x0, w1, sxtw 3 +** ret +*/ +double *add3(double *p, int x) +{ + return p + x; +} + +// Hits *sub_zero_extendsi_di (*sub_<optab><ALLX:mode>_<GPI:mode>). +/* +** sub1: +** sub x0, x0, w1, uxtw +** ret +*/ +unsigned long long sub1(unsigned long long x, unsigned n) +{ + return x - n; +} + +// Hits *sub_uxtdi_shift2 (*sub_uxt<mode>_shift2). +/* +** sub2: +** sub x0, x0, w1, uxtw 3 +** ret +*/ +double *sub2(double *x, unsigned n) +{ + return x - n; +} + +// Hits *sub_extendsi_shft_di (*sub_<optab><ALLX:mode>_shft_<GPI:mode>). +/* +** sub3: +** sub x0, x0, w1, sxtw 3 +** ret +*/ +double *sub3(double *p, int n) +{ + return p - n; +} + +// Hits *adds_zero_extendsi_di (*adds_<optab><ALLX:mode>_<GPI:mode>). +int adds1(unsigned long long x, unsigned y) +{ + /* { dg-final { scan-assembler-times "adds\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, uxtw" 1 } } */ + unsigned long long l = x + y; + return !!l; +} + +// Hits *adds_extendsi_shift_di (*adds_<optab><ALLX:mode>_shift_<GPI:mode>). +int adds2(long long x, int y) +{ + /* { dg-final { scan-assembler-times "adds\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */ + long long t = x + ((long long)y << 3); + return !!t; +} + +// Hits *subs_zero_extendsi_di (*subs_<optab><ALLX:mode>_<GPI:mode>). +unsigned long long z; +int subs1(unsigned long long x, unsigned y) +{ + /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, uxtw" 1 } } */ + unsigned long long t = x - y; + z = t; + return !!t; +} + +// Hits *subs_extendsi_shift_di (*subs_<optab><ALLX:mode>_shift_<GPI:mode>). +unsigned long long *w; +int subs2(unsigned long long *x, int y) +{ + /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */ + unsigned long long *t = x - y; + w = t; + return !!t; +} + +// Hits *cmp_swp_zero_extendsi_regdi (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>). +int cmp(unsigned long long x, unsigned y) +{ + /* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, w\[0-9\]+, uxtw" 1 } } */ + return !!(x - y); +} + +// Hits *cmp_swp_extendsi_shft_di (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>). +int cmp2(unsigned long long x, int y) +{ + /* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */ + return x == ((unsigned long long)y << 3); +} + +/* { dg-final { check-function-bodies "**" "" "" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/subs3.c b/gcc/testsuite/gcc.target/aarch64/subs3.c index 59581bf1ab7..0470a3bde34 100644 --- a/gcc/testsuite/gcc.target/aarch64/subs3.c +++ b/gcc/testsuite/gcc.target/aarch64/subs3.c @@ -58,4 +58,4 @@ int main () return 0; } -/* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, sxtw" 2 } } */ +/* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw" 2 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/subsp.c b/gcc/testsuite/gcc.target/aarch64/subsp.c index 6ef6b2c90ae..341b83dca86 100644 --- a/gcc/testsuite/gcc.target/aarch64/subsp.c +++ b/gcc/testsuite/gcc.target/aarch64/subsp.c @@ -16,4 +16,4 @@ f2 (int *x, int y) } /* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*\n" } } */ -/* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*, sxtw 4\n" } } */ +/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" } } */