Message ID | f8e3f68e-72ee-0632-db54-5651e66ec3f6@arm.com |
---|---|
State | New |
Headers | show |
Series | [ARM,PR82989] Fix unexpected use of NEON instructions for shifts | expand |
On 14/03/18 10:11, Sudakshina Das wrote: > Hi > > This patch fixes PR82989 so that we avoid NEON instructions when > -mneon-for-64bits is not enabled. This is more of a short term fix for > the real deeper problem of making and early decision of choosing or > rejecting NEON instructions. There is now a new ticket PR84467 to deal > with the longer term solution. > (Please refer to the discussion in the bug report for more details). > > Testing: Bootstrapped and regtested on arm-none-linux-gnueabihf and > added a new test case based on the test given on the bug report. > > Ok for trunk and backports for gcc-7 and gcc-6 branches? > OK for trunk. Please leave it a couple of days before backporting to ensure that the testcase doesn't tickle any multilib issues. R. > Sudi > > > *** gcc/ChangeLog *** > > 2018-03-14 Sudakshina Das <sudi.das@arm.com> > > * config/arm/neon.md (ashldi3_neon): Update ?s for constraints > to favor GPR over NEON registers. > (<shift>di3_neon): Likewise. > > *** gcc/testsuite/ChangeLog *** > > 2018-03-14 Sudakshina Das <sudi.das@arm.com> > > * gcc.target/arm/pr82989.c: New test. > > pr82989.diff > > > diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md > index 6a6f5d7..1646b21 100644 > --- a/gcc/config/arm/neon.md > +++ b/gcc/config/arm/neon.md > @@ -1180,12 +1180,12 @@ > ) > > (define_insn_and_split "ashldi3_neon" > - [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?&r, ?w,w") > - (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r, 0w,w") > - (match_operand:SI 2 "general_operand" "rUm, i, r, i, i,rUm,i"))) > - (clobber (match_scratch:SI 3 "= X, X,?&r, X, X, X,X")) > - (clobber (match_scratch:SI 4 "= X, X,?&r, X, X, X,X")) > - (clobber (match_scratch:DI 5 "=&w, X, X, X, X, &w,X")) > + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, r, &r, ?w,?w") > + (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r, 0w, w") > + (match_operand:SI 2 "general_operand" "rUm, i, r, i, i,rUm, i"))) > + (clobber (match_scratch:SI 3 "= X, X, &r, X, X, X, X")) > + (clobber (match_scratch:SI 4 "= X, X, &r, X, X, X, X")) > + (clobber (match_scratch:DI 5 "=&w, X, X, X, X, &w, X")) > (clobber (reg:CC_C CC_REGNUM))] > "TARGET_NEON" > "#" > @@ -1276,7 +1276,7 @@ > ;; ashrdi3_neon > ;; lshrdi3_neon > (define_insn_and_split "<shift>di3_neon" > - [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?&r,?w,?w") > + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, r, &r,?w,?w") > (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r,0w, w") > (match_operand:SI 2 "reg_or_int_operand" " r, i, r, i, i, r, i"))) > (clobber (match_scratch:SI 3 "=2r, X, &r, X, X,2r, X")) > diff --git a/gcc/testsuite/gcc.target/arm/pr82989.c b/gcc/testsuite/gcc.target/arm/pr82989.c > new file mode 100644 > index 0000000..1295ee6 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/pr82989.c > @@ -0,0 +1,38 @@ > +/* PR target/82989 */ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_neon_ok } */ > +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-a8" } } */ > +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mfpu=*" } { "-mfpu=neon" } } */ > +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ > +/* { dg-options "-O2 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=hard" } */ > +/* { dg-add-options arm_neon } */ > + > +typedef unsigned long long uint64_t; > + > +void f_shr_imm (uint64_t *a ) > +{ > + *a += *a >> 32; > +} > +/* { dg-final { scan-assembler-not "vshr*" } } */ > + > +void f_shr_reg (uint64_t *a, uint64_t b) > +{ > + *a += *a >> b; > +} > +/* { dg-final { scan-assembler-not "vshl*" } } */ > +/* Only 2 times for f_shr_reg. f_shr_imm should not have any. */ > +/* { dg-final { scan-assembler-times {lsr\tr[0-9]+, r[0-9]+, r[0-9]} 2 } } */ > + > +void f_shl_imm (uint64_t *a) > +{ > + *a += *a << 32; > +} > +/* { dg-final { scan-assembler-not "vshl*" } } */ > + > +void f_shl_reg (uint64_t *a, uint64_t b) > +{ > + *a += *a << b; > +} > +/* { dg-final { scan-assembler-not "vshl*" } } */ > +/* Only 2 times for f_shl_reg. f_shl_imm should not have any. */ > +/* { dg-final { scan-assembler-times {lsl\tr[0-9]+, r[0-9]+, r[0-9]} 2 } } */ >
Hi On 20/03/18 10:03, Richard Earnshaw (lists) wrote: > On 14/03/18 10:11, Sudakshina Das wrote: >> Hi >> >> This patch fixes PR82989 so that we avoid NEON instructions when >> -mneon-for-64bits is not enabled. This is more of a short term fix for >> the real deeper problem of making and early decision of choosing or >> rejecting NEON instructions. There is now a new ticket PR84467 to deal >> with the longer term solution. >> (Please refer to the discussion in the bug report for more details). >> >> Testing: Bootstrapped and regtested on arm-none-linux-gnueabihf and >> added a new test case based on the test given on the bug report. >> >> Ok for trunk and backports for gcc-7 and gcc-6 branches? >> > > OK for trunk. Please leave it a couple of days before backporting to > ensure that the testcase doesn't tickle any multilib issues. > > R. Thanks. Committed to trunk as r258677. Will wait a week for backporting. Sudi > >> Sudi >> >> >> *** gcc/ChangeLog *** >> >> 2018-03-14 Sudakshina Das <sudi.das@arm.com> >> >> * config/arm/neon.md (ashldi3_neon): Update ?s for constraints >> to favor GPR over NEON registers. >> (<shift>di3_neon): Likewise. >> >> *** gcc/testsuite/ChangeLog *** >> >> 2018-03-14 Sudakshina Das <sudi.das@arm.com> >> >> * gcc.target/arm/pr82989.c: New test. >> >> pr82989.diff >> >> >> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md >> index 6a6f5d7..1646b21 100644 >> --- a/gcc/config/arm/neon.md >> +++ b/gcc/config/arm/neon.md >> @@ -1180,12 +1180,12 @@ >> ) >> >> (define_insn_and_split "ashldi3_neon" >> - [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?&r, ?w,w") >> - (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r, 0w,w") >> - (match_operand:SI 2 "general_operand" "rUm, i, r, i, i,rUm,i"))) >> - (clobber (match_scratch:SI 3 "= X, X,?&r, X, X, X,X")) >> - (clobber (match_scratch:SI 4 "= X, X,?&r, X, X, X,X")) >> - (clobber (match_scratch:DI 5 "=&w, X, X, X, X, &w,X")) >> + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, r, &r, ?w,?w") >> + (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r, 0w, w") >> + (match_operand:SI 2 "general_operand" "rUm, i, r, i, i,rUm, i"))) >> + (clobber (match_scratch:SI 3 "= X, X, &r, X, X, X, X")) >> + (clobber (match_scratch:SI 4 "= X, X, &r, X, X, X, X")) >> + (clobber (match_scratch:DI 5 "=&w, X, X, X, X, &w, X")) >> (clobber (reg:CC_C CC_REGNUM))] >> "TARGET_NEON" >> "#" >> @@ -1276,7 +1276,7 @@ >> ;; ashrdi3_neon >> ;; lshrdi3_neon >> (define_insn_and_split "<shift>di3_neon" >> - [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?&r,?w,?w") >> + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, r, &r,?w,?w") >> (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r,0w, w") >> (match_operand:SI 2 "reg_or_int_operand" " r, i, r, i, i, r, i"))) >> (clobber (match_scratch:SI 3 "=2r, X, &r, X, X,2r, X")) >> diff --git a/gcc/testsuite/gcc.target/arm/pr82989.c b/gcc/testsuite/gcc.target/arm/pr82989.c >> new file mode 100644 >> index 0000000..1295ee6 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/arm/pr82989.c >> @@ -0,0 +1,38 @@ >> +/* PR target/82989 */ >> +/* { dg-do compile } */ >> +/* { dg-require-effective-target arm_neon_ok } */ >> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-a8" } } */ >> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mfpu=*" } { "-mfpu=neon" } } */ >> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ >> +/* { dg-options "-O2 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=hard" } */ >> +/* { dg-add-options arm_neon } */ >> + >> +typedef unsigned long long uint64_t; >> + >> +void f_shr_imm (uint64_t *a ) >> +{ >> + *a += *a >> 32; >> +} >> +/* { dg-final { scan-assembler-not "vshr*" } } */ >> + >> +void f_shr_reg (uint64_t *a, uint64_t b) >> +{ >> + *a += *a >> b; >> +} >> +/* { dg-final { scan-assembler-not "vshl*" } } */ >> +/* Only 2 times for f_shr_reg. f_shr_imm should not have any. */ >> +/* { dg-final { scan-assembler-times {lsr\tr[0-9]+, r[0-9]+, r[0-9]} 2 } } */ >> + >> +void f_shl_imm (uint64_t *a) >> +{ >> + *a += *a << 32; >> +} >> +/* { dg-final { scan-assembler-not "vshl*" } } */ >> + >> +void f_shl_reg (uint64_t *a, uint64_t b) >> +{ >> + *a += *a << b; >> +} >> +/* { dg-final { scan-assembler-not "vshl*" } } */ >> +/* Only 2 times for f_shl_reg. f_shl_imm should not have any. */ >> +/* { dg-final { scan-assembler-times {lsl\tr[0-9]+, r[0-9]+, r[0-9]} 2 } } */ >> >
On 20 March 2018 at 11:58, Sudakshina Das <sudi.das@arm.com> wrote: > Hi > > On 20/03/18 10:03, Richard Earnshaw (lists) wrote: >> >> On 14/03/18 10:11, Sudakshina Das wrote: >>> >>> Hi >>> >>> This patch fixes PR82989 so that we avoid NEON instructions when >>> -mneon-for-64bits is not enabled. This is more of a short term fix for >>> the real deeper problem of making and early decision of choosing or >>> rejecting NEON instructions. There is now a new ticket PR84467 to deal >>> with the longer term solution. >>> (Please refer to the discussion in the bug report for more details). >>> >>> Testing: Bootstrapped and regtested on arm-none-linux-gnueabihf and >>> added a new test case based on the test given on the bug report. >>> >>> Ok for trunk and backports for gcc-7 and gcc-6 branches? >>> >> >> OK for trunk. Please leave it a couple of days before backporting to >> ensure that the testcase doesn't tickle any multilib issues. >> >> R. > > > Thanks. Committed to trunk as r258677. Will wait a week for backporting. > > Sudi > Hi Sudi, I've noticed that: FAIL: gcc.target/arm/pr82989.c scan-assembler-times lsl\\tr[0-9]+, r[0-9]+, r[0-9] 2 FAIL: gcc.target/arm/pr82989.c scan-assembler-times lsr\\tr[0-9]+, r[0-9]+, r[0-9] 2 on target armeb-none-linux-gnueabihf --with-mode thumb --with-cpu cortex-a9 --with-fpu neon-fp16 The tests pass when using --with-mode arm Can you check? Thanks Christophe > >> >>> Sudi >>> >>> >>> *** gcc/ChangeLog *** >>> >>> 2018-03-14 Sudakshina Das <sudi.das@arm.com> >>> >>> * config/arm/neon.md (ashldi3_neon): Update ?s for constraints >>> to favor GPR over NEON registers. >>> (<shift>di3_neon): Likewise. >>> >>> *** gcc/testsuite/ChangeLog *** >>> >>> 2018-03-14 Sudakshina Das <sudi.das@arm.com> >>> >>> * gcc.target/arm/pr82989.c: New test. >>> >>> pr82989.diff >>> >>> >>> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md >>> index 6a6f5d7..1646b21 100644 >>> --- a/gcc/config/arm/neon.md >>> +++ b/gcc/config/arm/neon.md >>> @@ -1180,12 +1180,12 @@ >>> ) >>> (define_insn_and_split "ashldi3_neon" >>> - [(set (match_operand:DI 0 "s_register_operand" "= w, >>> w,?&r,?r,?&r, ?w,w") >>> - (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, >>> 0, r, 0w,w") >>> - (match_operand:SI 2 "general_operand" "rUm, i, r, >>> i, i,rUm,i"))) >>> - (clobber (match_scratch:SI 3 "= X, >>> X,?&r, X, X, X,X")) >>> - (clobber (match_scratch:SI 4 "= X, >>> X,?&r, X, X, X,X")) >>> - (clobber (match_scratch:DI 5 "=&w, >>> X, X, X, X, &w,X")) >>> + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, >>> r, &r, ?w,?w") >>> + (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, >>> 0, r, 0w, w") >>> + (match_operand:SI 2 "general_operand" "rUm, i, r, >>> i, i,rUm, i"))) >>> + (clobber (match_scratch:SI 3 "= X, >>> X, &r, X, X, X, X")) >>> + (clobber (match_scratch:SI 4 "= X, >>> X, &r, X, X, X, X")) >>> + (clobber (match_scratch:DI 5 "=&w, >>> X, X, X, X, &w, X")) >>> (clobber (reg:CC_C CC_REGNUM))] >>> "TARGET_NEON" >>> "#" >>> @@ -1276,7 +1276,7 @@ >>> ;; ashrdi3_neon >>> ;; lshrdi3_neon >>> (define_insn_and_split "<shift>di3_neon" >>> - [(set (match_operand:DI 0 "s_register_operand" "= w, >>> w,?&r,?r,?&r,?w,?w") >>> + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, >>> r, &r,?w,?w") >>> (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, >>> 0, r,0w, w") >>> (match_operand:SI 2 "reg_or_int_operand" " r, i, r, >>> i, i, r, i"))) >>> (clobber (match_scratch:SI 3 >>> "=2r, X, &r, X, X,2r, X")) >>> diff --git a/gcc/testsuite/gcc.target/arm/pr82989.c >>> b/gcc/testsuite/gcc.target/arm/pr82989.c >>> new file mode 100644 >>> index 0000000..1295ee6 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/arm/pr82989.c >>> @@ -0,0 +1,38 @@ >>> +/* PR target/82989 */ >>> +/* { dg-do compile } */ >>> +/* { dg-require-effective-target arm_neon_ok } */ >>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>> "-mcpu=*" } { "-mcpu=cortex-a8" } } */ >>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>> "-mfpu=*" } { "-mfpu=neon" } } */ >>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>> "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ >>> +/* { dg-options "-O2 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=hard" } */ >>> +/* { dg-add-options arm_neon } */ >>> + >>> +typedef unsigned long long uint64_t; >>> + >>> +void f_shr_imm (uint64_t *a ) >>> +{ >>> + *a += *a >> 32; >>> +} >>> +/* { dg-final { scan-assembler-not "vshr*" } } */ >>> + >>> +void f_shr_reg (uint64_t *a, uint64_t b) >>> +{ >>> + *a += *a >> b; >>> +} >>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>> +/* Only 2 times for f_shr_reg. f_shr_imm should not have any. */ >>> +/* { dg-final { scan-assembler-times {lsr\tr[0-9]+, r[0-9]+, r[0-9]} 2 } >>> } */ >>> + >>> +void f_shl_imm (uint64_t *a) >>> +{ >>> + *a += *a << 32; >>> +} >>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>> + >>> +void f_shl_reg (uint64_t *a, uint64_t b) >>> +{ >>> + *a += *a << b; >>> +} >>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>> +/* Only 2 times for f_shl_reg. f_shl_imm should not have any. */ >>> +/* { dg-final { scan-assembler-times {lsl\tr[0-9]+, r[0-9]+, r[0-9]} 2 } >>> } */ >>> >> >
Hi On 21/03/18 08:51, Christophe Lyon wrote: > On 20 March 2018 at 11:58, Sudakshina Das <sudi.das@arm.com> wrote: >> Hi >> >> On 20/03/18 10:03, Richard Earnshaw (lists) wrote: >>> >>> On 14/03/18 10:11, Sudakshina Das wrote: >>>> >>>> Hi >>>> >>>> This patch fixes PR82989 so that we avoid NEON instructions when >>>> -mneon-for-64bits is not enabled. This is more of a short term fix for >>>> the real deeper problem of making and early decision of choosing or >>>> rejecting NEON instructions. There is now a new ticket PR84467 to deal >>>> with the longer term solution. >>>> (Please refer to the discussion in the bug report for more details). >>>> >>>> Testing: Bootstrapped and regtested on arm-none-linux-gnueabihf and >>>> added a new test case based on the test given on the bug report. >>>> >>>> Ok for trunk and backports for gcc-7 and gcc-6 branches? >>>> >>> >>> OK for trunk. Please leave it a couple of days before backporting to >>> ensure that the testcase doesn't tickle any multilib issues. >>> >>> R. >> >> >> Thanks. Committed to trunk as r258677. Will wait a week for backporting. >> >> Sudi >> > > Hi Sudi, > > I've noticed that: > FAIL: gcc.target/arm/pr82989.c scan-assembler-times lsl\\tr[0-9]+, > r[0-9]+, r[0-9] 2 > FAIL: gcc.target/arm/pr82989.c scan-assembler-times lsr\\tr[0-9]+, > r[0-9]+, r[0-9] 2 > on target armeb-none-linux-gnueabihf > --with-mode thumb > --with-cpu cortex-a9 > --with-fpu neon-fp16 > > The tests pass when using --with-mode arm > > Can you check? Yes I see this as well. Sorry about this. I am testing a quick fix for this at the moment. Thanks Sudi > > Thanks > > Christophe > >> >>> >>>> Sudi >>>> >>>> >>>> *** gcc/ChangeLog *** >>>> >>>> 2018-03-14 Sudakshina Das <sudi.das@arm.com> >>>> >>>> * config/arm/neon.md (ashldi3_neon): Update ?s for constraints >>>> to favor GPR over NEON registers. >>>> (<shift>di3_neon): Likewise. >>>> >>>> *** gcc/testsuite/ChangeLog *** >>>> >>>> 2018-03-14 Sudakshina Das <sudi.das@arm.com> >>>> >>>> * gcc.target/arm/pr82989.c: New test. >>>> >>>> pr82989.diff >>>> >>>> >>>> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md >>>> index 6a6f5d7..1646b21 100644 >>>> --- a/gcc/config/arm/neon.md >>>> +++ b/gcc/config/arm/neon.md >>>> @@ -1180,12 +1180,12 @@ >>>> ) >>>> (define_insn_and_split "ashldi3_neon" >>>> - [(set (match_operand:DI 0 "s_register_operand" "= w, >>>> w,?&r,?r,?&r, ?w,w") >>>> - (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, >>>> 0, r, 0w,w") >>>> - (match_operand:SI 2 "general_operand" "rUm, i, r, >>>> i, i,rUm,i"))) >>>> - (clobber (match_scratch:SI 3 "= X, >>>> X,?&r, X, X, X,X")) >>>> - (clobber (match_scratch:SI 4 "= X, >>>> X,?&r, X, X, X,X")) >>>> - (clobber (match_scratch:DI 5 "=&w, >>>> X, X, X, X, &w,X")) >>>> + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, >>>> r, &r, ?w,?w") >>>> + (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, >>>> 0, r, 0w, w") >>>> + (match_operand:SI 2 "general_operand" "rUm, i, r, >>>> i, i,rUm, i"))) >>>> + (clobber (match_scratch:SI 3 "= X, >>>> X, &r, X, X, X, X")) >>>> + (clobber (match_scratch:SI 4 "= X, >>>> X, &r, X, X, X, X")) >>>> + (clobber (match_scratch:DI 5 "=&w, >>>> X, X, X, X, &w, X")) >>>> (clobber (reg:CC_C CC_REGNUM))] >>>> "TARGET_NEON" >>>> "#" >>>> @@ -1276,7 +1276,7 @@ >>>> ;; ashrdi3_neon >>>> ;; lshrdi3_neon >>>> (define_insn_and_split "<shift>di3_neon" >>>> - [(set (match_operand:DI 0 "s_register_operand" "= w, >>>> w,?&r,?r,?&r,?w,?w") >>>> + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, >>>> r, &r,?w,?w") >>>> (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, >>>> 0, r,0w, w") >>>> (match_operand:SI 2 "reg_or_int_operand" " r, i, r, >>>> i, i, r, i"))) >>>> (clobber (match_scratch:SI 3 >>>> "=2r, X, &r, X, X,2r, X")) >>>> diff --git a/gcc/testsuite/gcc.target/arm/pr82989.c >>>> b/gcc/testsuite/gcc.target/arm/pr82989.c >>>> new file mode 100644 >>>> index 0000000..1295ee6 >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.target/arm/pr82989.c >>>> @@ -0,0 +1,38 @@ >>>> +/* PR target/82989 */ >>>> +/* { dg-do compile } */ >>>> +/* { dg-require-effective-target arm_neon_ok } */ >>>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>>> "-mcpu=*" } { "-mcpu=cortex-a8" } } */ >>>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>>> "-mfpu=*" } { "-mfpu=neon" } } */ >>>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>>> "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ >>>> +/* { dg-options "-O2 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=hard" } */ >>>> +/* { dg-add-options arm_neon } */ >>>> + >>>> +typedef unsigned long long uint64_t; >>>> + >>>> +void f_shr_imm (uint64_t *a ) >>>> +{ >>>> + *a += *a >> 32; >>>> +} >>>> +/* { dg-final { scan-assembler-not "vshr*" } } */ >>>> + >>>> +void f_shr_reg (uint64_t *a, uint64_t b) >>>> +{ >>>> + *a += *a >> b; >>>> +} >>>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>>> +/* Only 2 times for f_shr_reg. f_shr_imm should not have any. */ >>>> +/* { dg-final { scan-assembler-times {lsr\tr[0-9]+, r[0-9]+, r[0-9]} 2 } >>>> } */ >>>> + >>>> +void f_shl_imm (uint64_t *a) >>>> +{ >>>> + *a += *a << 32; >>>> +} >>>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>>> + >>>> +void f_shl_reg (uint64_t *a, uint64_t b) >>>> +{ >>>> + *a += *a << b; >>>> +} >>>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>>> +/* Only 2 times for f_shl_reg. f_shl_imm should not have any. */ >>>> +/* { dg-final { scan-assembler-times {lsl\tr[0-9]+, r[0-9]+, r[0-9]} 2 } >>>> } */ >>>> >>> >>
On 21/03/18 11:40, Sudakshina Das wrote: > Hi > > On 21/03/18 08:51, Christophe Lyon wrote: >> On 20 March 2018 at 11:58, Sudakshina Das <sudi.das@arm.com> wrote: >>> Hi >>> >>> On 20/03/18 10:03, Richard Earnshaw (lists) wrote: >>>> >>>> On 14/03/18 10:11, Sudakshina Das wrote: >>>>> >>>>> Hi >>>>> >>>>> This patch fixes PR82989 so that we avoid NEON instructions when >>>>> -mneon-for-64bits is not enabled. This is more of a short term fix for >>>>> the real deeper problem of making and early decision of choosing or >>>>> rejecting NEON instructions. There is now a new ticket PR84467 to deal >>>>> with the longer term solution. >>>>> (Please refer to the discussion in the bug report for more details). >>>>> >>>>> Testing: Bootstrapped and regtested on arm-none-linux-gnueabihf and >>>>> added a new test case based on the test given on the bug report. >>>>> >>>>> Ok for trunk and backports for gcc-7 and gcc-6 branches? >>>>> >>>> >>>> OK for trunk. Please leave it a couple of days before backporting to >>>> ensure that the testcase doesn't tickle any multilib issues. >>>> >>>> R. >>> >>> >>> Thanks. Committed to trunk as r258677. Will wait a week for backporting. Backported both the commits of trunks to gcc-7 as r258883 and to gcc-6 as r258884 (Reg-tested for both) Thanks Sudi >>> >>> Sudi >>> >> >> Hi Sudi, >> >> I've noticed that: >> FAIL: gcc.target/arm/pr82989.c scan-assembler-times lsl\\tr[0-9]+, >> r[0-9]+, r[0-9] 2 >> FAIL: gcc.target/arm/pr82989.c scan-assembler-times lsr\\tr[0-9]+, >> r[0-9]+, r[0-9] 2 >> on target armeb-none-linux-gnueabihf >> --with-mode thumb >> --with-cpu cortex-a9 >> --with-fpu neon-fp16 >> >> The tests pass when using --with-mode arm >> >> Can you check? > > Yes I see this as well. Sorry about this. I am testing a quick fix for > this at the moment. > > Thanks > Sudi > >> >> Thanks >> >> Christophe >> >>> >>>> >>>>> Sudi >>>>> >>>>> >>>>> *** gcc/ChangeLog *** >>>>> >>>>> 2018-03-14 Sudakshina Das <sudi.das@arm.com> >>>>> >>>>> * config/arm/neon.md (ashldi3_neon): Update ?s for constraints >>>>> to favor GPR over NEON registers. >>>>> (<shift>di3_neon): Likewise. >>>>> >>>>> *** gcc/testsuite/ChangeLog *** >>>>> >>>>> 2018-03-14 Sudakshina Das <sudi.das@arm.com> >>>>> >>>>> * gcc.target/arm/pr82989.c: New test. >>>>> >>>>> pr82989.diff >>>>> >>>>> >>>>> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md >>>>> index 6a6f5d7..1646b21 100644 >>>>> --- a/gcc/config/arm/neon.md >>>>> +++ b/gcc/config/arm/neon.md >>>>> @@ -1180,12 +1180,12 @@ >>>>> ) >>>>> (define_insn_and_split "ashldi3_neon" >>>>> - [(set (match_operand:DI 0 "s_register_operand" "= w, >>>>> w,?&r,?r,?&r, ?w,w") >>>>> - (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, >>>>> w, 0r, >>>>> 0, r, 0w,w") >>>>> - (match_operand:SI 2 "general_operand" "rUm, >>>>> i, r, >>>>> i, i,rUm,i"))) >>>>> - (clobber (match_scratch:SI 3 >>>>> "= X, >>>>> X,?&r, X, X, X,X")) >>>>> - (clobber (match_scratch:SI 4 >>>>> "= X, >>>>> X,?&r, X, X, X,X")) >>>>> - (clobber (match_scratch:DI 5 >>>>> "=&w, >>>>> X, X, X, X, &w,X")) >>>>> + [(set (match_operand:DI 0 "s_register_operand" "= w, >>>>> w, &r, >>>>> r, &r, ?w,?w") >>>>> + (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, >>>>> w, 0r, >>>>> 0, r, 0w, w") >>>>> + (match_operand:SI 2 "general_operand" "rUm, >>>>> i, r, >>>>> i, i,rUm, i"))) >>>>> + (clobber (match_scratch:SI 3 >>>>> "= X, >>>>> X, &r, X, X, X, X")) >>>>> + (clobber (match_scratch:SI 4 >>>>> "= X, >>>>> X, &r, X, X, X, X")) >>>>> + (clobber (match_scratch:DI 5 >>>>> "=&w, >>>>> X, X, X, X, &w, X")) >>>>> (clobber (reg:CC_C CC_REGNUM))] >>>>> "TARGET_NEON" >>>>> "#" >>>>> @@ -1276,7 +1276,7 @@ >>>>> ;; ashrdi3_neon >>>>> ;; lshrdi3_neon >>>>> (define_insn_and_split "<shift>di3_neon" >>>>> - [(set (match_operand:DI 0 "s_register_operand" "= w, >>>>> w,?&r,?r,?&r,?w,?w") >>>>> + [(set (match_operand:DI 0 "s_register_operand" "= w, >>>>> w, &r, >>>>> r, &r,?w,?w") >>>>> (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, >>>>> w, 0r, >>>>> 0, r,0w, w") >>>>> (match_operand:SI 2 "reg_or_int_operand" " r, >>>>> i, r, >>>>> i, i, r, i"))) >>>>> (clobber (match_scratch:SI 3 >>>>> "=2r, X, &r, X, X,2r, X")) >>>>> diff --git a/gcc/testsuite/gcc.target/arm/pr82989.c >>>>> b/gcc/testsuite/gcc.target/arm/pr82989.c >>>>> new file mode 100644 >>>>> index 0000000..1295ee6 >>>>> --- /dev/null >>>>> +++ b/gcc/testsuite/gcc.target/arm/pr82989.c >>>>> @@ -0,0 +1,38 @@ >>>>> +/* PR target/82989 */ >>>>> +/* { dg-do compile } */ >>>>> +/* { dg-require-effective-target arm_neon_ok } */ >>>>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>>>> "-mcpu=*" } { "-mcpu=cortex-a8" } } */ >>>>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>>>> "-mfpu=*" } { "-mfpu=neon" } } */ >>>>> +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { >>>>> "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ >>>>> +/* { dg-options "-O2 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=hard" >>>>> } */ >>>>> +/* { dg-add-options arm_neon } */ >>>>> + >>>>> +typedef unsigned long long uint64_t; >>>>> + >>>>> +void f_shr_imm (uint64_t *a ) >>>>> +{ >>>>> + *a += *a >> 32; >>>>> +} >>>>> +/* { dg-final { scan-assembler-not "vshr*" } } */ >>>>> + >>>>> +void f_shr_reg (uint64_t *a, uint64_t b) >>>>> +{ >>>>> + *a += *a >> b; >>>>> +} >>>>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>>>> +/* Only 2 times for f_shr_reg. f_shr_imm should not have any. */ >>>>> +/* { dg-final { scan-assembler-times {lsr\tr[0-9]+, r[0-9]+, >>>>> r[0-9]} 2 } >>>>> } */ >>>>> + >>>>> +void f_shl_imm (uint64_t *a) >>>>> +{ >>>>> + *a += *a << 32; >>>>> +} >>>>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>>>> + >>>>> +void f_shl_reg (uint64_t *a, uint64_t b) >>>>> +{ >>>>> + *a += *a << b; >>>>> +} >>>>> +/* { dg-final { scan-assembler-not "vshl*" } } */ >>>>> +/* Only 2 times for f_shl_reg. f_shl_imm should not have any. */ >>>>> +/* { dg-final { scan-assembler-times {lsl\tr[0-9]+, r[0-9]+, >>>>> r[0-9]} 2 } >>>>> } */ >>>>> >>>> >>> >
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 6a6f5d7..1646b21 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -1180,12 +1180,12 @@ ) (define_insn_and_split "ashldi3_neon" - [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?&r, ?w,w") - (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r, 0w,w") - (match_operand:SI 2 "general_operand" "rUm, i, r, i, i,rUm,i"))) - (clobber (match_scratch:SI 3 "= X, X,?&r, X, X, X,X")) - (clobber (match_scratch:SI 4 "= X, X,?&r, X, X, X,X")) - (clobber (match_scratch:DI 5 "=&w, X, X, X, X, &w,X")) + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, r, &r, ?w,?w") + (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r, 0w, w") + (match_operand:SI 2 "general_operand" "rUm, i, r, i, i,rUm, i"))) + (clobber (match_scratch:SI 3 "= X, X, &r, X, X, X, X")) + (clobber (match_scratch:SI 4 "= X, X, &r, X, X, X, X")) + (clobber (match_scratch:DI 5 "=&w, X, X, X, X, &w, X")) (clobber (reg:CC_C CC_REGNUM))] "TARGET_NEON" "#" @@ -1276,7 +1276,7 @@ ;; ashrdi3_neon ;; lshrdi3_neon (define_insn_and_split "<shift>di3_neon" - [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?&r,?w,?w") + [(set (match_operand:DI 0 "s_register_operand" "= w, w, &r, r, &r,?w,?w") (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r,0w, w") (match_operand:SI 2 "reg_or_int_operand" " r, i, r, i, i, r, i"))) (clobber (match_scratch:SI 3 "=2r, X, &r, X, X,2r, X")) diff --git a/gcc/testsuite/gcc.target/arm/pr82989.c b/gcc/testsuite/gcc.target/arm/pr82989.c new file mode 100644 index 0000000..1295ee6 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr82989.c @@ -0,0 +1,38 @@ +/* PR target/82989 */ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-a8" } } */ +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mfpu=*" } { "-mfpu=neon" } } */ +/* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-O2 -mcpu=cortex-a8 -mfpu=neon -mfloat-abi=hard" } */ +/* { dg-add-options arm_neon } */ + +typedef unsigned long long uint64_t; + +void f_shr_imm (uint64_t *a ) +{ + *a += *a >> 32; +} +/* { dg-final { scan-assembler-not "vshr*" } } */ + +void f_shr_reg (uint64_t *a, uint64_t b) +{ + *a += *a >> b; +} +/* { dg-final { scan-assembler-not "vshl*" } } */ +/* Only 2 times for f_shr_reg. f_shr_imm should not have any. */ +/* { dg-final { scan-assembler-times {lsr\tr[0-9]+, r[0-9]+, r[0-9]} 2 } } */ + +void f_shl_imm (uint64_t *a) +{ + *a += *a << 32; +} +/* { dg-final { scan-assembler-not "vshl*" } } */ + +void f_shl_reg (uint64_t *a, uint64_t b) +{ + *a += *a << b; +} +/* { dg-final { scan-assembler-not "vshl*" } } */ +/* Only 2 times for f_shl_reg. f_shl_imm should not have any. */ +/* { dg-final { scan-assembler-times {lsl\tr[0-9]+, r[0-9]+, r[0-9]} 2 } } */