Message ID | a1956d89-4226-ca0c-c24a-1f6ae75b1e29@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | rs6000: Generate _Decimal128 to _Decimal32 hardware conversion instructions | expand |
On Fri, Jul 17, 2020 at 02:47:50PM -0500, Peter Bergner wrote: > PR92488 shows we do not generate hardware conversion instructions when > converting from _Decimal128 to _Decimal32. There is no one instruction > that does the conversion, so we currently call the __dpd_trunctdsd2 > function to do the conversion for us. This is slow. Paul Murphy > described a short sequence of dfp hardware instructions that would do > the conversion correctly. The patch below implements that idea. > > The convert-fp-128.c test case uses dg-require-effective-target dfp, > so its !dfp usages are basically disabling those tests completely. > What we really want is to know whether the compiler is generating > hardware instructions or calling the libcalls. For that, we need > to test hard_dfp. > > This patch bootstrapped and regtested with no regressions on powerpc64le-linux. > Segher, you pre-approved the pattern, but I thought I'd have you double > check the test case changes and new test case. > > Still ok for trunk? Yes. Thanks! > gcc/ > PR target/92488 > * config/rs6000/dfp.md (trunctdsd2): New define_insn. > > gcc/testsuite/ > PR target/92488 > * gcc.target/powerpc/convert-fp-128.c (bl, drsp, drdpq): Update counts. > (__dpd_trunctdsd2): Make conditional on !hard_dfp. > (__dpd_extendsddd2, __dpd_extendsdtd2, __dpd_truncddsd2, > __dpd_extendddtd2, __dpd_trunctddd2): Use !hard_dfp. > * gcc.target/powerpc/pr92488.c: New test. > > diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md > index e91d6f581ed..50bfad6beb7 100644 > --- a/gcc/config/rs6000/dfp.md > +++ b/gcc/config/rs6000/dfp.md > @@ -155,6 +155,16 @@ > [(set_attr "type" "dfp") > (set_attr "length" "8")]) > > +(define_insn "trunctdsd2" > + [(set (match_operand:SD 0 "gpc_reg_operand" "=d") > + (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d"))) > + (clobber (match_scratch:TD 2 "=&d")) > + (clobber (match_scratch:DF 3 "=&d"))] > + "TARGET_DFP" > + "mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2" > + [(set_attr "type" "dfp") > + (set_attr "length" "20")]) Side note: On ISA 3.0B and later you can do mffscdrni %3,7 drdpq %2,%1 mffscdrn %3,%3 drsp %0,%2 (saving one insn, and using somewhat cheaper insns). But that is only on newer machines, so is this worth it at all? :-) Segher
On 7/17/20 3:23 PM, Segher Boessenkool wrote: > On ISA 3.0B and later you can do > > mffscdrni %3,7 > drdpq %2,%1 > mffscdrn %3,%3 > drsp %0,%2 > > (saving one insn, and using somewhat cheaper insns). But that is only > on newer machines, so is this worth it at all? :-) So something like the following? Peter (define_insn "trunctdsd2" [(set (match_operand:SD 0 "gpc_reg_operand" "=d") (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d"))) (clobber (match_scratch:TD 2 "=&d")) (clobber (match_scratch:DF 3 "=&d"))] "TARGET_DFP" { if (TARGET_MODULO) return "mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2"; else return "mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2"; } [(set_attr "type" "dfp") (set (attr "length") (if_then_else (match_test "TARGET_MODULO") (const_string "16") (const_string "20")))])
On Fri, Jul 17, 2020 at 04:18:55PM -0500, Peter Bergner wrote: > On 7/17/20 3:23 PM, Segher Boessenkool wrote: > > On ISA 3.0B and later you can do > > > > mffscdrni %3,7 > > drdpq %2,%1 > > mffscdrn %3,%3 > > drsp %0,%2 > > > > (saving one insn, and using somewhat cheaper insns). But that is only > > on newer machines, so is this worth it at all? :-) > > So something like the following? > > Peter > > > (define_insn "trunctdsd2" > [(set (match_operand:SD 0 "gpc_reg_operand" "=d") > (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d"))) > (clobber (match_scratch:TD 2 "=&d")) > (clobber (match_scratch:DF 3 "=&d"))] > "TARGET_DFP" > { > if (TARGET_MODULO) > return "mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2"; > else > return "mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2"; > } > [(set_attr "type" "dfp") > (set (attr "length") > (if_then_else > (match_test "TARGET_MODULO") > (const_string "16") > (const_string "20")))]) Well, just make an "isa" value of "p9"? Then you can just do (define_insn "trunctdsd2" [(set (match_operand:SD 0 "gpc_reg_operand" "=d") (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d"))) (clobber (match_scratch:TD 2 "=&d")) (clobber (match_scratch:DF 3 "=&d"))] "TARGET_DFP" "@ mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2 mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2" [(set_attr "type" "dfp") (set_attr "isa" "p9,*") (set_attr "length" "16,20")]) Segher
On 7/17/20 4:32 PM, Segher Boessenkool wrote: > Well, just make an "isa" value of "p9"? Then you can just do > > (define_insn "trunctdsd2" > [(set (match_operand:SD 0 "gpc_reg_operand" "=d") > (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d"))) > (clobber (match_scratch:TD 2 "=&d")) > (clobber (match_scratch:DF 3 "=&d"))] > "TARGET_DFP" > "@ > mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2 > mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2" > [(set_attr "type" "dfp") > (set_attr "isa" "p9,*") > (set_attr "length" "16,20")]) As we discussed offline, we need the duplicated constraint alternatives like below. I'll test this and commit this if clean with the following additional ChangeLog entry for the p9 addition. Thanks. * config/rs6000/rs6000.md (define_attr "isa"): Add p9. (define_attr "enabled"): Handle p9. Peter diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md index e91d6f581ed..8f822732bac 100644 --- a/gcc/config/rs6000/dfp.md +++ b/gcc/config/rs6000/dfp.md @@ -155,6 +155,19 @@ [(set_attr "type" "dfp") (set_attr "length" "8")]) +(define_insn "trunctdsd2" + [(set (match_operand:SD 0 "gpc_reg_operand" "=d,d") + (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d,d"))) + (clobber (match_scratch:TD 2 "=&d,&d")) + (clobber (match_scratch:DF 3 "=&d,&d"))] + "TARGET_DFP" + "@ + mffscdrni %3,7\;drdpq %2,%1\;mffscdrn %3,%3\;drsp %0,%2 + mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2" + [(set_attr "type" "dfp") + (set_attr "isa" "p9,*") + (set_attr "length" "16,20")]) + (define_insn "add<mode>3" [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d") (plus:DDTD (match_operand:DDTD 1 "gpc_reg_operand" "%d") diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index c0d9877c715..b3fcb845a38 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -322,7 +322,7 @@ (const (symbol_ref "(enum attr_cpu) rs6000_tune"))) ;; The ISA we implement. -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9v,p9kf,p9tf,p10" +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10" (const_string "any")) ;; Is this alternative enabled for the current CPU/ISA/etc.? @@ -351,6 +351,10 @@ (match_test "TARGET_P8_VECTOR")) (const_int 1) + (and (eq_attr "isa" "p9") + (match_test "TARGET_MODULO")) + (const_int 1) + (and (eq_attr "isa" "p9v") (match_test "TARGET_P9_VECTOR")) (const_int 1)
On 7/17/20 5:07 PM, Peter Bergner wrote: > As we discussed offline, we need the duplicated constraint alternatives > like below. I'll test this and commit this if clean with the following > additional ChangeLog entry for the p9 addition. Thanks. Testing was clean, so I pushed the commit. Thanks! Peter
diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md index e91d6f581ed..50bfad6beb7 100644 --- a/gcc/config/rs6000/dfp.md +++ b/gcc/config/rs6000/dfp.md @@ -155,6 +155,16 @@ [(set_attr "type" "dfp") (set_attr "length" "8")]) +(define_insn "trunctdsd2" + [(set (match_operand:SD 0 "gpc_reg_operand" "=d") + (float_truncate:SD (match_operand:TD 1 "gpc_reg_operand" "d"))) + (clobber (match_scratch:TD 2 "=&d")) + (clobber (match_scratch:DF 3 "=&d"))] + "TARGET_DFP" + "mffs %3\;mtfsfi 7,7,1\;drdpq %2,%1\;mtfsf 0xff,%3,1,0\;drsp %0,%2" + [(set_attr "type" "dfp") + (set_attr "length" "20")]) + (define_insn "add<mode>3" [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d") (plus:DDTD (match_operand:DDTD 1 "gpc_reg_operand" "%d") diff --git a/gcc/testsuite/gcc.target/powerpc/convert-fp-128.c b/gcc/testsuite/gcc.target/powerpc/convert-fp-128.c index 67896d92c86..5b0ef3b0d49 100644 --- a/gcc/testsuite/gcc.target/powerpc/convert-fp-128.c +++ b/gcc/testsuite/gcc.target/powerpc/convert-fp-128.c @@ -34,7 +34,7 @@ conv2 /* { dg-final { scan-assembler-times {\mbl\M} 24 { target { ! hard_dfp } } } } */ /* { dg-final { scan-assembler-times {\mbl\M} 19 { target { hard_dfp && { ! ppc_float128 } } } } } */ -/* { dg-final { scan-assembler-times {\mbl\M} 31 { target { hard_dfp && { ppc_float128 && { ! ppc_float128_insns } } } } } } */ +/* { dg-final { scan-assembler-times {\mbl\M} 30 { target { hard_dfp && { ppc_float128 && { ! ppc_float128_insns } } } } } } */ /* { dg-final { scan-assembler-times {\mbl\M} 27 { target { hard_dfp && { ppc_float128 && { ppc_float128_insns } } } } } } */ @@ -60,20 +60,20 @@ conv2 /* { dg-final { scan-assembler-times {\mbl __dpd_extendsddf\M} 1 } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_extendsdtf\M} 1 } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_extendsdkf\M} 1 { target { ppc_float128 } } } } */ -/* { dg-final { scan-assembler-times {\mbl __dpd_extendsddd2\M} 1 { target { ! dfp } } } } */ -/* { dg-final { scan-assembler-times {\mbl __dpd_extendsdtd2\M} 1 { target { ! dfp } } } } */ +/* { dg-final { scan-assembler-times {\mbl __dpd_extendsddd2\M} 1 { target { ! hard_dfp } } } } */ +/* { dg-final { scan-assembler-times {\mbl __dpd_extendsdtd2\M} 1 { target { ! hard_dfp } } } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_truncddsf\M} 1 } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_truncdddf\M} 1 } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_extendddtf\M} 1 } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_extendddkf\M} 1 { target { ppc_float128 } } } } */ -/* { dg-final { scan-assembler-times {\mbl __dpd_truncddsd2\M} 1 { target { ! dfp } } } } */ -/* { dg-final { scan-assembler-times {\mbl __dpd_extendddtd2\M} 1 { target { ! dfp } } } } */ +/* { dg-final { scan-assembler-times {\mbl __dpd_truncddsd2\M} 1 { target { ! hard_dfp } } } } */ +/* { dg-final { scan-assembler-times {\mbl __dpd_extendddtd2\M} 1 { target { ! hard_dfp } } } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_trunctdsf\M} 1 } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_trunctddf\M} 1 } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_trunctdtf\M} 1 } } */ /* { dg-final { scan-assembler-times {\mbl __dpd_trunctdkf\M} 1 { target { ppc_float128 } } } } */ -/* { dg-final { scan-assembler-times {\mbl __dpd_trunctdsd2\M} 1 } } */ -/* { dg-final { scan-assembler-times {\mbl __dpd_trunctddd2\M} 1 { target { ! dfp } } } } */ +/* { dg-final { scan-assembler-times {\mbl __dpd_trunctdsd2\M} 1 { target { ! hard_dfp } } } } */ +/* { dg-final { scan-assembler-times {\mbl __dpd_trunctddd2\M} 1 { target { ! hard_dfp } } } } */ /* { dg-final { scan-assembler-times {\mfrsp|xsrsp\M} 2 { target { ! ppc_float128_insns } } } } */ @@ -88,8 +88,8 @@ conv2 /* { dg-final { scan-assembler-times {\mxxlor|xscpsgndp\M} 3 { target { ppc_float128_insns } } } } */ -/* { dg-final { scan-assembler-times {\mdrsp\M} 1 { target { hard_dfp } } } } */ -/* { dg-final { scan-assembler-times {\mdrdpq\M} 1 { target { hard_dfp } } } } */ +/* { dg-final { scan-assembler-times {\mdrsp\M} 2 { target { hard_dfp } } } } */ +/* { dg-final { scan-assembler-times {\mdrdpq\M} 2 { target { hard_dfp } } } } */ /* { dg-final { scan-assembler-times {\mdctdp\M} 2 { target { hard_dfp } } } } */ /* { dg-final { scan-assembler-times {\mdctqpq\M} 2 { target { hard_dfp } } } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr92488.c b/gcc/testsuite/gcc.target/powerpc/pr92488.c new file mode 100644 index 00000000000..3ca575531ca --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr92488.c @@ -0,0 +1,43 @@ +/* { dg-do run } */ +/* { dg-require-effective-target dfprt } */ +/* { dg-options "-O2" } */ + +#include <stdio.h> +#include <stdlib.h> + +/* Runnable test case for testing _Decimal128 to _Decimal32 rounding. + The value below when rounded to _Decimal64 would result in the value + 1.2345675e+00, which if it were rounded to _Decimal32 would result in + the value 1.234568e+00. However, the correct value when rounding from + _Decimal128 directly to _Decimal32 is 1.234567e+00. */ + +_Decimal128 td = 1.23456749999999999999e+00dl; +_Decimal32 sd_expected = 1.234567e+00df; + +_Decimal32 __attribute__((noinline)) +td2sd (_Decimal128 td) +{ + return td; +} + +int +main (void) +{ + _Decimal32 sd = td2sd (td); + if (sd != sd_expected) + { + union { + _Decimal32 sd; + unsigned int i; + } u; + + printf ("cast to _Decimal32 failed:\n"); + u.sd = sd; + printf (" actual = 0x%x\n", u.i); + u.sd = sd_expected; + printf (" expected = 0x%x\n", u.i); + abort (); + } + + return 0; +}