diff mbox series

aarch64: Don't generate invalid zero/sign-extend syntax

Message ID 20200817085933.z5fg52xopegjojtn@arm.com
State New
Headers show
Series aarch64: Don't generate invalid zero/sign-extend syntax | expand

Commit Message

Alex Coplan Aug. 17, 2020, 8:59 a.m. UTC
Hello,

Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw #3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

Note that an obvious omission here is that this patch does not touch the
mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found
that I couldn't hit these patterns with C code since multiplications by
powers of two always get turned into shifts by earlier RTL passes. If
there's a way to reliably hit these patterns, then perhaps these should
be updated as well.

Testing:
 * New test which checks for the correct syntax in all updated
   patterns (fails before and passes after the aarch64.md change).
 * New test can be assembled by both GAS and llvm-mc following the
   change.
 * Bootstrapped and regtested on aarch64-none-linux-gnu.

OK for master?

Thanks,
Alex

---

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
	agrees with width of extension specifier.
	(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*add_uxt<mode>_shift2): Likewise.
	(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
	(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
	(*sub_uxt<mode>_shift2): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
	(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.


gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
	* gcc.target/aarch64/cmp.c: Likewise.
	* gcc.target/aarch64/subs3.c: Likewise.
	* gcc.target/aarch64/subsp.c: Likewise.
	* gcc.target/aarch64/extend-syntax.c: New test.

Comments

Richard Sandiford Aug. 18, 2020, 8:35 a.m. UTC | #1
Alex Coplan <alex.coplan@arm.com> writes:
> Note that an obvious omission here is that this patch does not touch the
> mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found
> that I couldn't hit these patterns with C code since multiplications by
> powers of two always get turned into shifts by earlier RTL passes. If
> there's a way to reliably hit these patterns, then perhaps these should
> be updated as well.

Hmm.  Feels like we should either update them or delete them.  E.g.:

  *adds_<optab><mode>_multp2
  *subs_<optab><mode>_multp2

were added alongside the adds3.c and subs3.c tests that you're updating,
so if the tests don't/no longer need the multp2 patterns to pass,
there's a good chance that the patterns are redundant.

For reasons I never understood, the canonical representation is to use
(mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s.
So perhaps the patterns were originally for address calculations that had
been moved outside of a (mem …) and not updated to shifts instead of mults.

AFAICT the full list of affected patterns is:

  *adds_<optab><mode>_multp2
  *subs_<optab><mode>_multp2
  *add_<optab><ALLX:mode>_mult_<GPI:mode>
  *add_uxt<mode>_multp2
  *sub_uxt<mode>_multp2

Is that right?  If so, I think we should consider a follow-on patch
to delete them.

> Testing:
>  * New test which checks for the correct syntax in all updated
>    patterns (fails before and passes after the aarch64.md change).
>  * New test can be assembled by both GAS and llvm-mc following the
>    change.
>  * Bootstrapped and regtested on aarch64-none-linux-gnu.
>
> OK for master?

OK as-is if paired with a follow-on patch to delete the patterns above
(preapproved if it passes testing).  Also OK without a follow-on patch
if the fix is extended to the patterns above too (but the first option
is better :-)).

Thanks for taking the time to find a test for each pattern.

Richard
Iain Sandoe Aug. 18, 2020, 9:37 a.m. UTC | #2
Richard Sandiford <richard.sandiford@arm.com> wrote:

> Alex Coplan <alex.coplan@arm.com> writes:
>> Note that an obvious omission here is that this patch does not touch the
>> mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found
>> that I couldn't hit these patterns with C code since multiplications by
>> powers of two always get turned into shifts by earlier RTL passes. If
>> there's a way to reliably hit these patterns, then perhaps these should
>> be updated as well.
>
> Hmm.  Feels like we should either update them or delete them.  E.g.:
>
>  *adds_<optab><mode>_multp2
>  *subs_<optab><mode>_multp2

FWIW add_extvdi_multp2 seems to fire for me, building libstdc++ (c++11
cow-wstring-inst) on my [very experimental] initial attempts at a Darwin  
port.

(I see these failures too because the platform assembler is based off the  
LLVM
  backend which complains)

Iain

> were added alongside the adds3.c and subs3.c tests that you're updating,
> so if the tests don't/no longer need the multp2 patterns to pass,
> there's a good chance that the patterns are redundant.
>
> For reasons I never understood, the canonical representation is to use
> (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s.
> So perhaps the patterns were originally for address calculations that had
> been moved outside of a (mem …) and not updated to shifts instead of mults.
>
> AFAICT the full list of affected patterns is:
>
>  *adds_<optab><mode>_multp2
>  *subs_<optab><mode>_multp2
>  *add_<optab><ALLX:mode>_mult_<GPI:mode>
>  *add_uxt<mode>_multp2
>  *sub_uxt<mode>_multp2
>
> Is that right?  If so, I think we should consider a follow-on patch
> to delete them.
>
>> Testing:
>> * New test which checks for the correct syntax in all updated
>>   patterns (fails before and passes after the aarch64.md change).
>> * New test can be assembled by both GAS and llvm-mc following the
>>   change.
>> * Bootstrapped and regtested on aarch64-none-linux-gnu.
>>
>> OK for master?
>
> OK as-is if paired with a follow-on patch to delete the patterns above
> (preapproved if it passes testing).  Also OK without a follow-on patch
> if the fix is extended to the patterns above too (but the first option
> is better :-)).
>
> Thanks for taking the time to find a test for each pattern.
>
> Richard
Alex Coplan Aug. 19, 2020, 8:11 a.m. UTC | #3
Hi Richard,

> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: 18 August 2020 09:35
> To: Alex Coplan <Alex.Coplan@arm.com>
> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw <Richard.Earnshaw@arm.com>;
> Marcus Shawcroft <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov
> <Kyrylo.Tkachov@arm.com>
> Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend
> syntax
> 
> Alex Coplan <alex.coplan@arm.com> writes:
> > Note that an obvious omission here is that this patch does not touch
> the
> > mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found
> > that I couldn't hit these patterns with C code since multiplications by
> > powers of two always get turned into shifts by earlier RTL passes. If
> > there's a way to reliably hit these patterns, then perhaps these should
> > be updated as well.
> 
> Hmm.  Feels like we should either update them or delete them.  E.g.:
> 
>   *adds_<optab><mode>_multp2
>   *subs_<optab><mode>_multp2
> 
> were added alongside the adds3.c and subs3.c tests that you're updating,
> so if the tests don't/no longer need the multp2 patterns to pass,
> there's a good chance that the patterns are redundant.
> 
> For reasons I never understood, the canonical representation is to use
> (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s.
> So perhaps the patterns were originally for address calculations that had
> been moved outside of a (mem …) and not updated to shifts instead of
> mults.

Thanks for the review, and for clarifying this. I tried removing these
together with the *_mul_imm_ patterns (e.g. *adds_mul_imm_<mode>) and
the only failure was gcc/testsuite/gcc.dg/torture/pr34330.c when
compiled with -Os -ftree-vectorize which appears to depend on the
*add_mul_imm_di pattern. Without this pattern, we ICE in LRA on this
input.

In this case, GCC appears to have done exactly what you described: we
have the (plus (mult ...) ...) nodes inside (mem)s prior to register
allocation, and then we end up pulling these out without converting them
to shifts.

Seeing this behaviour (and in particular seeing the ICE) makes me
hesitant to just go ahead and remove the other patterns. That said, I
have a patch to remove the following patterns:

 *adds_<optab><mode>_multp2
 *subs_<optab><mode>_multp2
 *add_<optab><ALLX:mode>_mult_<GPI:mode>
 *add_<optab><SHORT:mode>_mult_si_uxtw
 *add_<optab><mode>_multp2
 *add_<optab>si_multp2_uxtw
 *add_uxt<mode>_multp2
 *add_uxtsi_multp2_uxtw
 *sub_<optab><mode>_multp2
 *sub_<optab>si_multp2_uxtw
 *sub_uxt<mode>_multp2
 *sub_uxtsi_multp2_uxtw
 *sub_uxtsi_multp2_uxtw

(together with the predicate aarch64_pwr_imm3 which is only used in
these patterns) and this bootstraps/regtests just fine.

So, I have a couple of questions:

(1) Should it be considered a bug if we pull (plus (mult (power of 2)
   ...) ...) out of a (mem) RTX without re-writing the (mult) as a
   shift?
(2) If not, can we otherwise justify the removal of the patterns here?

I'm happy to go ahead with this if either (1) or (2) are true.

Thanks,
Alex
Richard Sandiford Aug. 21, 2020, 9:28 a.m. UTC | #4
Alex Coplan <Alex.Coplan@arm.com> writes:
> Hi Richard,
>
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandiford@arm.com>
>> Sent: 18 August 2020 09:35
>> To: Alex Coplan <Alex.Coplan@arm.com>
>> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw <Richard.Earnshaw@arm.com>;
>> Marcus Shawcroft <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov
>> <Kyrylo.Tkachov@arm.com>
>> Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend
>> syntax
>> 
>> Alex Coplan <alex.coplan@arm.com> writes:
>> > Note that an obvious omission here is that this patch does not touch
>> the
>> > mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found
>> > that I couldn't hit these patterns with C code since multiplications by
>> > powers of two always get turned into shifts by earlier RTL passes. If
>> > there's a way to reliably hit these patterns, then perhaps these should
>> > be updated as well.
>> 
>> Hmm.  Feels like we should either update them or delete them.  E.g.:
>> 
>>   *adds_<optab><mode>_multp2
>>   *subs_<optab><mode>_multp2
>> 
>> were added alongside the adds3.c and subs3.c tests that you're updating,
>> so if the tests don't/no longer need the multp2 patterns to pass,
>> there's a good chance that the patterns are redundant.
>> 
>> For reasons I never understood, the canonical representation is to use
>> (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s.
>> So perhaps the patterns were originally for address calculations that had
>> been moved outside of a (mem …) and not updated to shifts instead of
>> mults.
>
> Thanks for the review, and for clarifying this. I tried removing these
> together with the *_mul_imm_ patterns (e.g. *adds_mul_imm_<mode>) and
> the only failure was gcc/testsuite/gcc.dg/torture/pr34330.c when
> compiled with -Os -ftree-vectorize which appears to depend on the
> *add_mul_imm_di pattern. Without this pattern, we ICE in LRA on this
> input.
>
> In this case, GCC appears to have done exactly what you described: we
> have the (plus (mult ...) ...) nodes inside (mem)s prior to register
> allocation, and then we end up pulling these out without converting them
> to shifts.
>
> Seeing this behaviour (and in particular seeing the ICE) makes me
> hesitant to just go ahead and remove the other patterns. That said, I
> have a patch to remove the following patterns:
>
>  *adds_<optab><mode>_multp2
>  *subs_<optab><mode>_multp2
>  *add_<optab><ALLX:mode>_mult_<GPI:mode>
>  *add_<optab><SHORT:mode>_mult_si_uxtw
>  *add_<optab><mode>_multp2
>  *add_<optab>si_multp2_uxtw
>  *add_uxt<mode>_multp2
>  *add_uxtsi_multp2_uxtw
>  *sub_<optab><mode>_multp2
>  *sub_<optab>si_multp2_uxtw
>  *sub_uxt<mode>_multp2
>  *sub_uxtsi_multp2_uxtw
>  *sub_uxtsi_multp2_uxtw
>
> (together with the predicate aarch64_pwr_imm3 which is only used in
> these patterns) and this bootstraps/regtests just fine.
>
> So, I have a couple of questions:
>
> (1) Should it be considered a bug if we pull (plus (mult (power of 2)
>    ...) ...) out of a (mem) RTX without re-writing the (mult) as a
>    shift?

IMO, yes.  But if we have an example in which it happens, we have
to fix it before removing the patterns.  That could end up being
a bit of a rabbit hole, and could affect other targets too.

If we keep the patterns, we should fix the [su]xtw problem in:

  *adds_<optab><mode>_multp2
  *subs_<optab><mode>_multp2
  *add_<optab><ALLX:mode>_mult_<GPI:mode>
  *add_uxt<mode>_multp2
  *sub_uxt<mode>_multp2

too.  (Plus any others I missed, if that isn't the full list.)

Thanks,
Richard
Christophe Lyon Sept. 8, 2020, 8:14 a.m. UTC | #5
On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote:
>
> Hello,
>
> Given the following C function:
>
> double *f(double *p, unsigned x)
> {
>     return p + x;
> }
>
> prior to this patch, GCC at -O2 would generate:
>
> f:
>         add     x0, x0, x1, uxtw 3
>         ret
>
> but this add instruction uses architecturally-invalid syntax: the width
> of the third operand conflicts with the width of the extension
> specifier. The third operand is only permitted to be an x register when
> the extension specifier is (u|s)xtx.
>
> This instruction, and analogous insns for adds, sub, subs, and cmp, are
> rejected by clang, but accepted by binutils. Assembling and
> disassembling such an insn with binutils gives the architecturally-valid
> version in the disassembly:
>
>    0:   8b214c00        add     x0, x0, w1, uxtw #3
>
> This patch fixes several patterns in the AArch64 backend to use the
> standard syntax as specified in the Arm ARM such that GCC's output can
> be assembled by assemblers other than GAS.
>
> Note that an obvious omission here is that this patch does not touch the
> mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found
> that I couldn't hit these patterns with C code since multiplications by
> powers of two always get turned into shifts by earlier RTL passes. If
> there's a way to reliably hit these patterns, then perhaps these should
> be updated as well.
>
> Testing:
>  * New test which checks for the correct syntax in all updated
>    patterns (fails before and passes after the aarch64.md change).
>  * New test can be assembled by both GAS and llvm-mc following the
>    change.
>  * Bootstrapped and regtested on aarch64-none-linux-gnu.
>
> OK for master?
>
> Thanks,
> Alex
>
> ---
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64.md
>         (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
>         agrees with width of extension specifier.
>         (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
>         (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
>         (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
>         (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
>         (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
>         (*add_uxt<mode>_shift2): Likewise.
>         (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
>         (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
>         (*sub_uxt<mode>_shift2): Likewise.
>         (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
>         (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
>
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
>         * gcc.target/aarch64/cmp.c: Likewise.
>         * gcc.target/aarch64/subs3.c: Likewise.
>         * gcc.target/aarch64/subsp.c: Likewise.
>         * gcc.target/aarch64/extend-syntax.c: New test.
>

Hi,

I've noticed some of the new tests fail with -mabi=ilp32:
    gcc.target/aarch64/extend-syntax.c check-function-bodies add1
    gcc.target/aarch64/extend-syntax.c check-function-bodies add3
    gcc.target/aarch64/extend-syntax.c check-function-bodies sub2
    gcc.target/aarch64/extend-syntax.c check-function-bodies sub3
    gcc.target/aarch64/extend-syntax.c scan-assembler-times
subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1
    gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n

Christophe
Alex Coplan Sept. 8, 2020, 3:41 p.m. UTC | #6
Hi Christophe,

> -----Original Message-----
> From: Christophe Lyon <christophe.lyon@linaro.org>
> Sent: 08 September 2020 09:15
> To: Alex Coplan <Alex.Coplan@arm.com>
> Cc: gcc Patches <gcc-patches@gcc.gnu.org>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; Marcus Shawcroft <Marcus.Shawcroft@arm.com>
> Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend
> syntax
> 
> > gcc/ChangeLog:
> >
> >         * config/aarch64/aarch64.md
> >         (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
> >         agrees with width of extension specifier.
> >         (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> >         (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> >         (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> >         (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> >         (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> >         (*add_uxt<mode>_shift2): Likewise.
> >         (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> >         (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> >         (*sub_uxt<mode>_shift2): Likewise.
> >         (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
> >         (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> >
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
> >         * gcc.target/aarch64/cmp.c: Likewise.
> >         * gcc.target/aarch64/subs3.c: Likewise.
> >         * gcc.target/aarch64/subsp.c: Likewise.
> >         * gcc.target/aarch64/extend-syntax.c: New test.
> >
> 
> Hi,
> 
> I've noticed some of the new tests fail with -mabi=ilp32:
>     gcc.target/aarch64/extend-syntax.c check-function-bodies add1
>     gcc.target/aarch64/extend-syntax.c check-function-bodies add3
>     gcc.target/aarch64/extend-syntax.c check-function-bodies sub2
>     gcc.target/aarch64/extend-syntax.c check-function-bodies sub3
>     gcc.target/aarch64/extend-syntax.c scan-assembler-times
> subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1
>     gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw
> 4\n

Thanks for catching these.

The failures in extend-syntax.c just need the assertions tweaking, I have a
patch to fix those.

The failure in subsp.c is more interesting: looks like a missed optimisation on
ILP32, I'm taking a look.

> 
> Christophe

Thanks,
Alex
Alex Coplan Sept. 18, 2020, 4:15 p.m. UTC | #7
Hi Christophe,

On 08/09/2020 10:14, Christophe Lyon wrote:
> On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote:
> >
> > gcc/ChangeLog:
> >
> >         * config/aarch64/aarch64.md
> >         (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
> >         agrees with width of extension specifier.
> >         (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> >         (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> >         (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> >         (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> >         (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> >         (*add_uxt<mode>_shift2): Likewise.
> >         (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> >         (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> >         (*sub_uxt<mode>_shift2): Likewise.
> >         (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
> >         (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> >
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
> >         * gcc.target/aarch64/cmp.c: Likewise.
> >         * gcc.target/aarch64/subs3.c: Likewise.
> >         * gcc.target/aarch64/subsp.c: Likewise.
> >         * gcc.target/aarch64/extend-syntax.c: New test.
> >
> 
> Hi,
> 
> I've noticed some of the new tests fail with -mabi=ilp32:
>     gcc.target/aarch64/extend-syntax.c check-function-bodies add1
>     gcc.target/aarch64/extend-syntax.c check-function-bodies add3
>     gcc.target/aarch64/extend-syntax.c check-function-bodies sub2
>     gcc.target/aarch64/extend-syntax.c check-function-bodies sub3
>     gcc.target/aarch64/extend-syntax.c scan-assembler-times
> subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1
>     gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n
> 
> Christophe

AFAICT the second scan-assembler in that subsp test failed on ILP32
before my commit. This is because we generate slightly suboptimal code
here. On LP64 with -O, we get:

f2:
        stp     x29, x30, [sp, -16]!
        mov     x29, sp
        add     w1, w1, 1
        sub     sp, sp, x1, sxtw 4
        mov     x0, sp
        bl      foo
        mov     sp, x29
        ldp     x29, x30, [sp], 16
        ret

On ILP32, we get:

f2:
        stp     x29, x30, [sp, -16]!
        mov     x29, sp
        add     w1, w1, 1
        lsl     w1, w1, 4
        sub     sp, sp, x1
        mov     w0, wsp
        bl      foo
        mov     sp, x29
        ldp     x29, x30, [sp], 16
        ret

And we see similar results all the way back to GCC 6. So AFAICT this
scan-assembler has never worked. The attached patch disables it on ILP32
since this isn't a code quality regression.

This patch also fixes up the DejaGnu directives in extend-syntax.c to
work on ILP32: we change the check-function-bodies directive to only run
on LP64, adding scan-assembler directives for ILP32 where required.

OK for trunk?

Thanks,
Alex
diff --git a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
index 23fa9f4ffc5..1bfcdb59dde 100644
--- a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
+++ b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
@@ -20,6 +20,7 @@ unsigned long long *add1(unsigned long long *p, unsigned x)
 */
 unsigned long long add2(unsigned long long x, unsigned y)
 {
+  /* { dg-final { scan-assembler-times "add\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */
   return x + y;
 }
 
@@ -34,6 +35,9 @@ double *add3(double *p, int x)
   return p + x;
 }
 
+// add1 and add3 should both generate this on ILP32:
+/* { dg-final { scan-assembler-times "add\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */
+
 // Hits *sub_zero_extendsi_di (*sub_<optab><ALLX:mode>_<GPI:mode>).
 /*
 ** sub1:
@@ -42,6 +46,7 @@ double *add3(double *p, int x)
 */
 unsigned long long sub1(unsigned long long x, unsigned n)
 {
+    /* { dg-final { scan-assembler-times "sub\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */
     return x - n;
 }
 
@@ -67,6 +72,9 @@ double *sub3(double *p, int n)
   return p - n;
 }
 
+// sub2 and sub3 should both generate this on ILP32:
+/* { dg-final { scan-assembler-times "sub\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */
+
 // Hits *adds_zero_extendsi_di (*adds_<optab><ALLX:mode>_<GPI:mode>).
 int adds1(unsigned long long x, unsigned y)
 {
@@ -97,7 +105,8 @@ int subs1(unsigned long long x, unsigned y)
 unsigned long long *w;
 int subs2(unsigned long long *x, int y)
 {
-  /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */
+  /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 { target lp64 } } } */
+  /* { dg-final { scan-assembler-times "subs\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" 1 { target ilp32 } } } */
   unsigned long long *t = x - y;
   w = t;
   return !!t;
@@ -117,4 +126,4 @@ int cmp2(unsigned long long x, int y)
   return x == ((unsigned long long)y << 3);
 }
 
-/* { dg-final { check-function-bodies "**" "" "" } } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/subsp.c b/gcc/testsuite/gcc.target/aarch64/subsp.c
index 341b83dca86..e7f61e0799b 100644
--- a/gcc/testsuite/gcc.target/aarch64/subsp.c
+++ b/gcc/testsuite/gcc.target/aarch64/subsp.c
@@ -16,4 +16,4 @@ f2 (int *x, int y)
 }
 
 /* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*\n" } } */
-/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" } } */
+/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" { target lp64 } } } */
Alex Coplan Sept. 30, 2020, 2:03 p.m. UTC | #8
Ping. Are these testsuite fixes for ILP32 OK?

On 18/09/2020 17:15, Alex Coplan wrote:
> Hi Christophe,
> 
> On 08/09/2020 10:14, Christophe Lyon wrote:
> > On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote:
> > >
> > > gcc/ChangeLog:
> > >
> > >         * config/aarch64/aarch64.md
> > >         (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
> > >         agrees with width of extension specifier.
> > >         (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> > >         (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> > >         (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> > >         (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> > >         (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> > >         (*add_uxt<mode>_shift2): Likewise.
> > >         (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> > >         (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> > >         (*sub_uxt<mode>_shift2): Likewise.
> > >         (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
> > >         (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> > >
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
> > >         * gcc.target/aarch64/cmp.c: Likewise.
> > >         * gcc.target/aarch64/subs3.c: Likewise.
> > >         * gcc.target/aarch64/subsp.c: Likewise.
> > >         * gcc.target/aarch64/extend-syntax.c: New test.
> > >
> > 
> > Hi,
> > 
> > I've noticed some of the new tests fail with -mabi=ilp32:
> >     gcc.target/aarch64/extend-syntax.c check-function-bodies add1
> >     gcc.target/aarch64/extend-syntax.c check-function-bodies add3
> >     gcc.target/aarch64/extend-syntax.c check-function-bodies sub2
> >     gcc.target/aarch64/extend-syntax.c check-function-bodies sub3
> >     gcc.target/aarch64/extend-syntax.c scan-assembler-times
> > subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1
> >     gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n
> > 
> > Christophe
> 
> AFAICT the second scan-assembler in that subsp test failed on ILP32
> before my commit. This is because we generate slightly suboptimal code
> here. On LP64 with -O, we get:
> 
> f2:
>         stp     x29, x30, [sp, -16]!
>         mov     x29, sp
>         add     w1, w1, 1
>         sub     sp, sp, x1, sxtw 4
>         mov     x0, sp
>         bl      foo
>         mov     sp, x29
>         ldp     x29, x30, [sp], 16
>         ret
> 
> On ILP32, we get:
> 
> f2:
>         stp     x29, x30, [sp, -16]!
>         mov     x29, sp
>         add     w1, w1, 1
>         lsl     w1, w1, 4
>         sub     sp, sp, x1
>         mov     w0, wsp
>         bl      foo
>         mov     sp, x29
>         ldp     x29, x30, [sp], 16
>         ret
> 
> And we see similar results all the way back to GCC 6. So AFAICT this
> scan-assembler has never worked. The attached patch disables it on ILP32
> since this isn't a code quality regression.
> 
> This patch also fixes up the DejaGnu directives in extend-syntax.c to
> work on ILP32: we change the check-function-bodies directive to only run
> on LP64, adding scan-assembler directives for ILP32 where required.
> 
> OK for trunk?
> 
> Thanks,
> Alex

> diff --git a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
> index 23fa9f4ffc5..1bfcdb59dde 100644
> --- a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
> +++ b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
> @@ -20,6 +20,7 @@ unsigned long long *add1(unsigned long long *p, unsigned x)
>  */
>  unsigned long long add2(unsigned long long x, unsigned y)
>  {
> +  /* { dg-final { scan-assembler-times "add\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */
>    return x + y;
>  }
>  
> @@ -34,6 +35,9 @@ double *add3(double *p, int x)
>    return p + x;
>  }
>  
> +// add1 and add3 should both generate this on ILP32:
> +/* { dg-final { scan-assembler-times "add\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */
> +
>  // Hits *sub_zero_extendsi_di (*sub_<optab><ALLX:mode>_<GPI:mode>).
>  /*
>  ** sub1:
> @@ -42,6 +46,7 @@ double *add3(double *p, int x)
>  */
>  unsigned long long sub1(unsigned long long x, unsigned n)
>  {
> +    /* { dg-final { scan-assembler-times "sub\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */
>      return x - n;
>  }
>  
> @@ -67,6 +72,9 @@ double *sub3(double *p, int n)
>    return p - n;
>  }
>  
> +// sub2 and sub3 should both generate this on ILP32:
> +/* { dg-final { scan-assembler-times "sub\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */
> +
>  // Hits *adds_zero_extendsi_di (*adds_<optab><ALLX:mode>_<GPI:mode>).
>  int adds1(unsigned long long x, unsigned y)
>  {
> @@ -97,7 +105,8 @@ int subs1(unsigned long long x, unsigned y)
>  unsigned long long *w;
>  int subs2(unsigned long long *x, int y)
>  {
> -  /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */
> +  /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 { target lp64 } } } */
> +  /* { dg-final { scan-assembler-times "subs\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" 1 { target ilp32 } } } */
>    unsigned long long *t = x - y;
>    w = t;
>    return !!t;
> @@ -117,4 +126,4 @@ int cmp2(unsigned long long x, int y)
>    return x == ((unsigned long long)y << 3);
>  }
>  
> -/* { dg-final { check-function-bodies "**" "" "" } } */
> +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/subsp.c b/gcc/testsuite/gcc.target/aarch64/subsp.c
> index 341b83dca86..e7f61e0799b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/subsp.c
> +++ b/gcc/testsuite/gcc.target/aarch64/subsp.c
> @@ -16,4 +16,4 @@ f2 (int *x, int y)
>  }
>  
>  /* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*\n" } } */
> -/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" } } */
> +/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" { target lp64 } } } */
Christophe Lyon Sept. 30, 2020, 3:07 p.m. UTC | #9
On Wed, 30 Sep 2020 at 16:03, Alex Coplan <alex.coplan@arm.com> wrote:
>
> Ping. Are these testsuite fixes for ILP32 OK?
>
LGTM, by looking at the patch (I didn't run it in ilp32 mode)

Thanks
Christophe


> On 18/09/2020 17:15, Alex Coplan wrote:
> > Hi Christophe,
> >
> > On 08/09/2020 10:14, Christophe Lyon wrote:
> > > On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote:
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         * config/aarch64/aarch64.md
> > > >         (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
> > > >         agrees with width of extension specifier.
> > > >         (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> > > >         (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> > > >         (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
> > > >         (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> > > >         (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> > > >         (*add_uxt<mode>_shift2): Likewise.
> > > >         (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
> > > >         (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> > > >         (*sub_uxt<mode>_shift2): Likewise.
> > > >         (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
> > > >         (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
> > > >
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
> > > >         * gcc.target/aarch64/cmp.c: Likewise.
> > > >         * gcc.target/aarch64/subs3.c: Likewise.
> > > >         * gcc.target/aarch64/subsp.c: Likewise.
> > > >         * gcc.target/aarch64/extend-syntax.c: New test.
> > > >
> > >
> > > Hi,
> > >
> > > I've noticed some of the new tests fail with -mabi=ilp32:
> > >     gcc.target/aarch64/extend-syntax.c check-function-bodies add1
> > >     gcc.target/aarch64/extend-syntax.c check-function-bodies add3
> > >     gcc.target/aarch64/extend-syntax.c check-function-bodies sub2
> > >     gcc.target/aarch64/extend-syntax.c check-function-bodies sub3
> > >     gcc.target/aarch64/extend-syntax.c scan-assembler-times
> > > subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1
> > >     gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n
> > >
> > > Christophe
> >
> > AFAICT the second scan-assembler in that subsp test failed on ILP32
> > before my commit. This is because we generate slightly suboptimal code
> > here. On LP64 with -O, we get:
> >
> > f2:
> >         stp     x29, x30, [sp, -16]!
> >         mov     x29, sp
> >         add     w1, w1, 1
> >         sub     sp, sp, x1, sxtw 4
> >         mov     x0, sp
> >         bl      foo
> >         mov     sp, x29
> >         ldp     x29, x30, [sp], 16
> >         ret
> >
> > On ILP32, we get:
> >
> > f2:
> >         stp     x29, x30, [sp, -16]!
> >         mov     x29, sp
> >         add     w1, w1, 1
> >         lsl     w1, w1, 4
> >         sub     sp, sp, x1
> >         mov     w0, wsp
> >         bl      foo
> >         mov     sp, x29
> >         ldp     x29, x30, [sp], 16
> >         ret
> >
> > And we see similar results all the way back to GCC 6. So AFAICT this
> > scan-assembler has never worked. The attached patch disables it on ILP32
> > since this isn't a code quality regression.
> >
> > This patch also fixes up the DejaGnu directives in extend-syntax.c to
> > work on ILP32: we change the check-function-bodies directive to only run
> > on LP64, adding scan-assembler directives for ILP32 where required.
> >
> > OK for trunk?
> >
> > Thanks,
> > Alex
>
> > diff --git a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
> > index 23fa9f4ffc5..1bfcdb59dde 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
> > @@ -20,6 +20,7 @@ unsigned long long *add1(unsigned long long *p, unsigned x)
> >  */
> >  unsigned long long add2(unsigned long long x, unsigned y)
> >  {
> > +  /* { dg-final { scan-assembler-times "add\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */
> >    return x + y;
> >  }
> >
> > @@ -34,6 +35,9 @@ double *add3(double *p, int x)
> >    return p + x;
> >  }
> >
> > +// add1 and add3 should both generate this on ILP32:
> > +/* { dg-final { scan-assembler-times "add\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */
> > +
> >  // Hits *sub_zero_extendsi_di (*sub_<optab><ALLX:mode>_<GPI:mode>).
> >  /*
> >  ** sub1:
> > @@ -42,6 +46,7 @@ double *add3(double *p, int x)
> >  */
> >  unsigned long long sub1(unsigned long long x, unsigned n)
> >  {
> > +    /* { dg-final { scan-assembler-times "sub\tx0, x0, w1, uxtw" 1 { target ilp32 } } } */
> >      return x - n;
> >  }
> >
> > @@ -67,6 +72,9 @@ double *sub3(double *p, int n)
> >    return p - n;
> >  }
> >
> > +// sub2 and sub3 should both generate this on ILP32:
> > +/* { dg-final { scan-assembler-times "sub\tw0, w0, w1, lsl 3" 2 { target ilp32 } } } */
> > +
> >  // Hits *adds_zero_extendsi_di (*adds_<optab><ALLX:mode>_<GPI:mode>).
> >  int adds1(unsigned long long x, unsigned y)
> >  {
> > @@ -97,7 +105,8 @@ int subs1(unsigned long long x, unsigned y)
> >  unsigned long long *w;
> >  int subs2(unsigned long long *x, int y)
> >  {
> > -  /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */
> > +  /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 { target lp64 } } } */
> > +  /* { dg-final { scan-assembler-times "subs\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+, lsl 3" 1 { target ilp32 } } } */
> >    unsigned long long *t = x - y;
> >    w = t;
> >    return !!t;
> > @@ -117,4 +126,4 @@ int cmp2(unsigned long long x, int y)
> >    return x == ((unsigned long long)y << 3);
> >  }
> >
> > -/* { dg-final { check-function-bodies "**" "" "" } } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/subsp.c b/gcc/testsuite/gcc.target/aarch64/subsp.c
> > index 341b83dca86..e7f61e0799b 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/subsp.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/subsp.c
> > @@ -16,4 +16,4 @@ f2 (int *x, int y)
> >  }
> >
> >  /* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*\n" } } */
> > -/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" } } */
> > +/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" { target lp64 } } } */
Richard Sandiford Oct. 1, 2020, 4:08 p.m. UTC | #10
Alex Coplan <alex.coplan@arm.com> writes:
> Hi Christophe,
>
> On 08/09/2020 10:14, Christophe Lyon wrote:
>> On Mon, 17 Aug 2020 at 11:00, Alex Coplan <alex.coplan@arm.com> wrote:
>> >
>> > gcc/ChangeLog:
>> >
>> >         * config/aarch64/aarch64.md
>> >         (*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
>> >         agrees with width of extension specifier.
>> >         (*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
>> >         (*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
>> >         (*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
>> >         (*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
>> >         (*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
>> >         (*add_uxt<mode>_shift2): Likewise.
>> >         (*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
>> >         (*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
>> >         (*sub_uxt<mode>_shift2): Likewise.
>> >         (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
>> >         (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
>> >
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >         * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
>> >         * gcc.target/aarch64/cmp.c: Likewise.
>> >         * gcc.target/aarch64/subs3.c: Likewise.
>> >         * gcc.target/aarch64/subsp.c: Likewise.
>> >         * gcc.target/aarch64/extend-syntax.c: New test.
>> >
>> 
>> Hi,
>> 
>> I've noticed some of the new tests fail with -mabi=ilp32:
>>     gcc.target/aarch64/extend-syntax.c check-function-bodies add1
>>     gcc.target/aarch64/extend-syntax.c check-function-bodies add3
>>     gcc.target/aarch64/extend-syntax.c check-function-bodies sub2
>>     gcc.target/aarch64/extend-syntax.c check-function-bodies sub3
>>     gcc.target/aarch64/extend-syntax.c scan-assembler-times
>> subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1
>>     gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n
>> 
>> Christophe
>
> AFAICT the second scan-assembler in that subsp test failed on ILP32
> before my commit. This is because we generate slightly suboptimal code
> here. On LP64 with -O, we get:
>
> f2:
>         stp     x29, x30, [sp, -16]!
>         mov     x29, sp
>         add     w1, w1, 1
>         sub     sp, sp, x1, sxtw 4
>         mov     x0, sp
>         bl      foo
>         mov     sp, x29
>         ldp     x29, x30, [sp], 16
>         ret
>
> On ILP32, we get:
>
> f2:
>         stp     x29, x30, [sp, -16]!
>         mov     x29, sp
>         add     w1, w1, 1
>         lsl     w1, w1, 4
>         sub     sp, sp, x1
>         mov     w0, wsp
>         bl      foo
>         mov     sp, x29
>         ldp     x29, x30, [sp], 16
>         ret
>
> And we see similar results all the way back to GCC 6. So AFAICT this
> scan-assembler has never worked. The attached patch disables it on ILP32
> since this isn't a code quality regression.
>
> This patch also fixes up the DejaGnu directives in extend-syntax.c to
> work on ILP32: we change the check-function-bodies directive to only run
> on LP64, adding scan-assembler directives for ILP32 where required.
>
> OK for trunk?

OK, thanks.  Sorry for the slow review.

Richard
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 9b20dd0b1a0..b1e83dfda78 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2383,7 +2383,7 @@ 
    (set (match_operand:GPI 0 "register_operand" "=r")
 	(plus:GPI (ANY_EXTEND:GPI (match_dup 1)) (match_dup 2)))]
   ""
-  "adds\\t%<GPI:w>0, %<GPI:w>2, %<GPI:w>1, <su>xt<ALLX:size>"
+  "adds\\t%<GPI:w>0, %<GPI:w>2, %w1, <su>xt<ALLX:size>"
   [(set_attr "type" "alus_ext")]
 )
 
@@ -2397,7 +2397,7 @@ 
    (set (match_operand:GPI 0 "register_operand" "=r")
 	(minus:GPI (match_dup 1) (ANY_EXTEND:GPI (match_dup 2))))]
   ""
-  "subs\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size>"
+  "subs\\t%<GPI:w>0, %<GPI:w>1, %w2, <su>xt<ALLX:size>"
   [(set_attr "type" "alus_ext")]
 )
 
@@ -2415,7 +2415,7 @@ 
 			      (match_dup 2))
 		  (match_dup 3)))]
   ""
-  "adds\\t%<GPI:w>0, %<GPI:w>3, %<GPI:w>1, <su>xt<ALLX:size> %2"
+  "adds\\t%<GPI:w>0, %<GPI:w>3, %w1, <su>xt<ALLX:size> %2"
   [(set_attr "type" "alus_ext")]
 )
 
@@ -2433,7 +2433,7 @@ 
 		   (ashift:GPI (ANY_EXTEND:GPI (match_dup 2))
 			       (match_dup 3))))]
   ""
-  "subs\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size> %3"
+  "subs\\t%<GPI:w>0, %<GPI:w>1, %w2, <su>xt<ALLX:size> %3"
   [(set_attr "type" "alus_ext")]
 )
 
@@ -2549,7 +2549,7 @@ 
 	(plus:GPI (ANY_EXTEND:GPI (match_operand:ALLX 1 "register_operand" "r"))
 		  (match_operand:GPI 2 "register_operand" "r")))]
   ""
-  "add\\t%<GPI:w>0, %<GPI:w>2, %<GPI:w>1, <su>xt<ALLX:size>"
+  "add\\t%<GPI:w>0, %<GPI:w>2, %w1, <su>xt<ALLX:size>"
   [(set_attr "type" "alu_ext")]
 )
 
@@ -2571,7 +2571,7 @@ 
 			      (match_operand 2 "aarch64_imm3" "Ui3"))
 		  (match_operand:GPI 3 "register_operand" "r")))]
   ""
-  "add\\t%<GPI:w>0, %<GPI:w>3, %<GPI:w>1, <su>xt<ALLX:size> %2"
+  "add\\t%<GPI:w>0, %<GPI:w>3, %w1, <su>xt<ALLX:size> %2"
   [(set_attr "type" "alu_ext")]
 )
 
@@ -2819,7 +2819,7 @@ 
   "*
   operands[3] = GEN_INT (aarch64_uxt_size (INTVAL(operands[2]),
 					   INTVAL (operands[3])));
-  return \"add\t%<w>0, %<w>4, %<w>1, uxt%e3 %2\";"
+  return \"add\t%<w>0, %<w>4, %w1, uxt%e3 %2\";"
   [(set_attr "type" "alu_ext")]
 )
 
@@ -3305,7 +3305,7 @@ 
 		   (ANY_EXTEND:GPI
 		    (match_operand:ALLX 2 "register_operand" "r"))))]
   ""
-  "sub\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size>"
+  "sub\\t%<GPI:w>0, %<GPI:w>1, %w2, <su>xt<ALLX:size>"
   [(set_attr "type" "alu_ext")]
 )
 
@@ -3328,7 +3328,7 @@ 
 				(match_operand:ALLX 2 "register_operand" "r"))
 			       (match_operand 3 "aarch64_imm3" "Ui3"))))]
   ""
-  "sub\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size> %3"
+  "sub\\t%<GPI:w>0, %<GPI:w>1, %w2, <su>xt<ALLX:size> %3"
   [(set_attr "type" "alu_ext")]
 )
 
@@ -3607,7 +3607,7 @@ 
   "*
   operands[3] = GEN_INT (aarch64_uxt_size (INTVAL (operands[2]),
 					   INTVAL (operands[3])));
-  return \"sub\t%<w>0, %<w>4, %<w>1, uxt%e3 %2\";"
+  return \"sub\t%<w>0, %<w>4, %w1, uxt%e3 %2\";"
   [(set_attr "type" "alu_ext")]
 )
 
@@ -4054,7 +4054,7 @@ 
 			 (match_operand:ALLX 0 "register_operand" "r"))
 			(match_operand:GPI 1 "register_operand" "r")))]
   ""
-  "cmp\\t%<GPI:w>1, %<GPI:w>0, <su>xt<ALLX:size>"
+  "cmp\\t%<GPI:w>1, %w0, <su>xt<ALLX:size>"
   [(set_attr "type" "alus_ext")]
 )
 
@@ -4066,7 +4066,7 @@ 
 			 (match_operand 1 "aarch64_imm3" "Ui3"))
 	(match_operand:GPI 2 "register_operand" "r")))]
   ""
-  "cmp\\t%<GPI:w>2, %<GPI:w>0, <su>xt<ALLX:size> %1"
+  "cmp\\t%<GPI:w>2, %w0, <su>xt<ALLX:size> %1"
   [(set_attr "type" "alus_ext")]
 )
 
diff --git a/gcc/testsuite/gcc.target/aarch64/adds3.c b/gcc/testsuite/gcc.target/aarch64/adds3.c
index c5518bdcaf2..e938c8049cf 100644
--- a/gcc/testsuite/gcc.target/aarch64/adds3.c
+++ b/gcc/testsuite/gcc.target/aarch64/adds3.c
@@ -58,4 +58,4 @@  int main ()
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "adds\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, sxtw" 2 } } */
+/* { dg-final { scan-assembler-times "adds\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/cmp.c b/gcc/testsuite/gcc.target/aarch64/cmp.c
index ee57dd283bf..a6487a4f77a 100644
--- a/gcc/testsuite/gcc.target/aarch64/cmp.c
+++ b/gcc/testsuite/gcc.target/aarch64/cmp.c
@@ -58,4 +58,5 @@  cmp_di_test4 (int a, s64 b, s64 c)
 }
 
 /* { dg-final { scan-assembler-times "cmp\tw\[0-9\]+, w\[0-9\]+" 2 } } */
-/* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, x\[0-9\]+" 4 } } */
+/* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, x\[0-9\]+" 2 } } */
+/* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, w\[0-9\]+, sxtw" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/extend-syntax.c b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
new file mode 100644
index 00000000000..23fa9f4ffc5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/extend-syntax.c
@@ -0,0 +1,120 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+// Hits *add_uxtdi_shift2 (*add_uxt<mode>_shift2).
+/*
+** add1:
+** 	add	x0, x0, w1, uxtw 3
+** 	ret
+*/
+unsigned long long *add1(unsigned long long *p, unsigned x)
+{
+  return p + x;
+}
+
+// Hits *add_zero_extendsi_di (*add_<optab><ALLX:mode>_<GPI:mode>).
+/*
+** add2:
+** 	add	x0, x0, w1, uxtw
+** 	ret
+*/
+unsigned long long add2(unsigned long long x, unsigned y)
+{
+  return x + y;
+}
+
+// Hits *add_extendsi_shft_di (*add_<optab><ALLX:mode>_shft_<GPI:mode>).
+/*
+** add3:
+** 	add	x0, x0, w1, sxtw 3
+** 	ret
+*/
+double *add3(double *p, int x)
+{
+  return p + x;
+}
+
+// Hits *sub_zero_extendsi_di (*sub_<optab><ALLX:mode>_<GPI:mode>).
+/*
+** sub1:
+** 	sub	x0, x0, w1, uxtw
+** 	ret
+*/
+unsigned long long sub1(unsigned long long x, unsigned n)
+{
+    return x - n;
+}
+
+// Hits *sub_uxtdi_shift2 (*sub_uxt<mode>_shift2).
+/*
+** sub2:
+** 	sub	x0, x0, w1, uxtw 3
+** 	ret
+*/
+double *sub2(double *x, unsigned n)
+{
+  return x - n;
+}
+
+// Hits *sub_extendsi_shft_di (*sub_<optab><ALLX:mode>_shft_<GPI:mode>).
+/*
+** sub3:
+** 	sub	x0, x0, w1, sxtw 3
+** 	ret
+*/
+double *sub3(double *p, int n)
+{
+  return p - n;
+}
+
+// Hits *adds_zero_extendsi_di (*adds_<optab><ALLX:mode>_<GPI:mode>).
+int adds1(unsigned long long x, unsigned y)
+{
+  /* { dg-final { scan-assembler-times "adds\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, uxtw" 1 } } */
+  unsigned long long l = x + y;
+  return !!l;
+}
+
+// Hits *adds_extendsi_shift_di (*adds_<optab><ALLX:mode>_shift_<GPI:mode>).
+int adds2(long long x, int y)
+{
+  /* { dg-final { scan-assembler-times "adds\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */
+  long long t = x + ((long long)y << 3);
+  return !!t;
+}
+
+// Hits *subs_zero_extendsi_di (*subs_<optab><ALLX:mode>_<GPI:mode>).
+unsigned long long z;
+int subs1(unsigned long long x, unsigned y)
+{
+  /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, uxtw" 1 } } */
+  unsigned long long t = x - y;
+  z = t;
+  return !!t;
+}
+
+// Hits *subs_extendsi_shift_di (*subs_<optab><ALLX:mode>_shift_<GPI:mode>).
+unsigned long long *w;
+int subs2(unsigned long long *x, int y)
+{
+  /* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */
+  unsigned long long *t = x - y;
+  w = t;
+  return !!t;
+}
+
+// Hits *cmp_swp_zero_extendsi_regdi (*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>).
+int cmp(unsigned long long x, unsigned y)
+{
+  /* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, w\[0-9\]+, uxtw" 1 } } */
+  return !!(x - y);
+}
+
+// Hits *cmp_swp_extendsi_shft_di (*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>).
+int cmp2(unsigned long long x, int y)
+{
+  /* { dg-final { scan-assembler-times "cmp\tx\[0-9\]+, w\[0-9\]+, sxtw 3" 1 } } */
+  return x == ((unsigned long long)y << 3);
+}
+
+/* { dg-final { check-function-bodies "**" "" "" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/subs3.c b/gcc/testsuite/gcc.target/aarch64/subs3.c
index 59581bf1ab7..0470a3bde34 100644
--- a/gcc/testsuite/gcc.target/aarch64/subs3.c
+++ b/gcc/testsuite/gcc.target/aarch64/subs3.c
@@ -58,4 +58,4 @@  int main ()
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, x\[0-9\]+, sxtw" 2 } } */
+/* { dg-final { scan-assembler-times "subs\tx\[0-9\]+, x\[0-9\]+, w\[0-9\]+, sxtw" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/subsp.c b/gcc/testsuite/gcc.target/aarch64/subsp.c
index 6ef6b2c90ae..341b83dca86 100644
--- a/gcc/testsuite/gcc.target/aarch64/subsp.c
+++ b/gcc/testsuite/gcc.target/aarch64/subsp.c
@@ -16,4 +16,4 @@  f2 (int *x, int y)
 }
 
 /* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*\n" } } */
-/* { dg-final { scan-assembler "sub\tsp, sp, x\[0-9\]*, sxtw 4\n" } } */
+/* { dg-final { scan-assembler "sub\tsp, sp, w\[0-9\]*, sxtw 4\n" } } */