Message ID | 20170203091047.GC14051@tucnak |
---|---|
State | New |
Headers | show |
Hi Jakub, On Fri, Feb 03, 2017 at 10:10:47AM +0100, Jakub Jelinek wrote: > As mentioned in the PR, for the following testcase we emit a power9 > instruction even with -mcpu=power8. Similar movsf_hardfloat instruction > uses wb constraint for the stxssp insn source rather than wu, which it > only uses for stxsspx (power7?). > > Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk? Yes please. Thanks! Some testcase stuff below... > --- gcc/testsuite/gcc.target/powerpc/pr79354.c.jj 2017-02-03 02:37:44.147938375 +0100 > +++ gcc/testsuite/gcc.target/powerpc/pr79354.c 2017-02-03 02:38:34.838303987 +0100 > @@ -0,0 +1,23 @@ > +/* PR target/79354 */ > +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */ powerpc*-*-* instead? And why is lp64 needed? > +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ > +/* { dg-require-effective-target powerpc_p8vector_ok } */ > +/* { dg-options "-mcpu=power8 -O2" } */ > +/* { dg-final { scan-assembler-not "stxssp\[^x]" } } */ \M is nicer and more future-proof, but this works. Segher
On Fri, Feb 03, 2017 at 09:59:45AM -0600, Segher Boessenkool wrote: > Hi Jakub, > > On Fri, Feb 03, 2017 at 10:10:47AM +0100, Jakub Jelinek wrote: > > As mentioned in the PR, for the following testcase we emit a power9 > > instruction even with -mcpu=power8. Similar movsf_hardfloat instruction > > uses wb constraint for the stxssp insn source rather than wu, which it > > only uses for stxsspx (power7?). > > > > Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk? > > Yes please. Thanks! > > Some testcase stuff below... > > > > --- gcc/testsuite/gcc.target/powerpc/pr79354.c.jj 2017-02-03 02:37:44.147938375 +0100 > > +++ gcc/testsuite/gcc.target/powerpc/pr79354.c 2017-02-03 02:38:34.838303987 +0100 > > @@ -0,0 +1,23 @@ > > +/* PR target/79354 */ > > +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */ The code that moves SFmode vector registers to GPRs needs -m64 (the convert from scalar format to vector format puts the converted SFmode value in the upper 32-bits of the vector register, and then after the direct move, it does a 32-bit shift right in the GPR). Similarly, going from a GPR to a vector register, we do a shift left 32-bits, direct move, and then convert from vector format to scalar format. So to enable the code that was causing the problem, you need 64-bit. In 32-bit, it does a store, ori 2,2,0, and then a load to move SFmode values between GPR and vector registers.
On Fri, Feb 03, 2017 at 03:05:58PM -0500, Michael Meissner wrote: > > > --- gcc/testsuite/gcc.target/powerpc/pr79354.c.jj 2017-02-03 02:37:44.147938375 +0100 > > > +++ gcc/testsuite/gcc.target/powerpc/pr79354.c 2017-02-03 02:38:34.838303987 +0100 > > > @@ -0,0 +1,23 @@ > > > +/* PR target/79354 */ > > > +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */ > > The code that moves SFmode vector registers to GPRs needs -m64 (the convert > from scalar format to vector format puts the converted SFmode value in the > upper 32-bits of the vector register, and then after the direct move, it does a > 32-bit shift right in the GPR). Similarly, going from a GPR to a vector > register, we do a shift left 32-bits, direct move, and then convert from vector > format to scalar format. The testcase is testing that the particular power9 instruction doesn't appear in -mcpu=power8 output. The test is valid source even for -m32 and should not contain the power9 instruction either. Without the rs6000.md change the test will only FAIL with -m64 indeed. Jakub
On Fri, Feb 03, 2017 at 09:11:38PM +0100, Jakub Jelinek wrote: > On Fri, Feb 03, 2017 at 03:05:58PM -0500, Michael Meissner wrote: > > > > --- gcc/testsuite/gcc.target/powerpc/pr79354.c.jj 2017-02-03 02:37:44.147938375 +0100 > > > > +++ gcc/testsuite/gcc.target/powerpc/pr79354.c 2017-02-03 02:38:34.838303987 +0100 > > > > @@ -0,0 +1,23 @@ > > > > +/* PR target/79354 */ > > > > +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */ > > > > The code that moves SFmode vector registers to GPRs needs -m64 (the convert > > from scalar format to vector format puts the converted SFmode value in the > > upper 32-bits of the vector register, and then after the direct move, it does a > > 32-bit shift right in the GPR). Similarly, going from a GPR to a vector > > register, we do a shift left 32-bits, direct move, and then convert from vector > > format to scalar format. > > The testcase is testing that the particular power9 instruction doesn't > appear in -mcpu=power8 output. The test is valid source even for -m32 and > should not contain the power9 instruction either. > Without the rs6000.md change the test will only FAIL with -m64 indeed. The function being patched (movsi_from_sf) will only be called when -m64 and -mcpu=power9 or -mcpu=power8 are used. So, it doesn't harm anything to omit the && lp64, but the fix being tested would only occur in 64-bit mode.
--- gcc/config/rs6000/rs6000.md.jj 2017-02-02 11:04:27.000000000 +0100 +++ gcc/config/rs6000/rs6000.md 2017-02-03 02:29:42.754962983 +0100 @@ -6814,7 +6814,7 @@ (define_insn_and_split "movsi_from_sf" (unspec:SI [(match_operand:SF 1 "input_operand" "r, m, Z, Z, r, - f, wu, wu, wIwH, r, + f, wb, wu, wIwH, r, wK")] UNSPEC_SI_FROM_SF)) --- gcc/testsuite/gcc.target/powerpc/pr79354.c.jj 2017-02-03 02:37:44.147938375 +0100 +++ gcc/testsuite/gcc.target/powerpc/pr79354.c 2017-02-03 02:38:34.838303987 +0100 @@ -0,0 +1,23 @@ +/* PR target/79354 */ +/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-options "-mcpu=power8 -O2" } */ +/* { dg-final { scan-assembler-not "stxssp\[^x]" } } */ + + +int b, f, g; +float e; +unsigned long d; + +void +foo (int *a) +{ + for (g = 0; g < 32; g++) + if (f) + { + e = d; + __builtin_memcpy (&b, &e, sizeof (float)); + b = *a; + } +} --- gcc/testsuite/gcc.c-torture/execute/pr79354.c.jj 2017-02-03 02:36:36.746781897 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr79354.c 2017-02-03 02:36:07.000000000 +0100 @@ -0,0 +1,30 @@ +/* PR target/79354 */ + +int b, f, g; +float e; +unsigned long d; + +__attribute__((noinline, noclone)) void +foo (int *a) +{ + for (g = 0; g < 32; g++) + if (f) + { + e = d; + __builtin_memcpy (&b, &e, sizeof (float)); + b = *a; + } +} + +int +main () +{ + int h = 5; + f = 1; + asm volatile ("" : : : "memory"); + foo (&h); + asm volatile ("" : : : "memory"); + foo (&b); + asm volatile ("" : : : "memory"); + return 0; +}