Message ID | c2ce8cc1-cbd4-1eb9-5d58-97136d5a9603@linux.ibm.com |
---|---|
State | New |
Series | [v5,rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605] |

Hi!

On Mon, Jun 20, 2022 at 11:12:50AM +0800, HAO CHEN GUI wrote:
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def

You don't have this in the changelog.  Please fix.

> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md

This, too.  And match.pd isn't in the patch.

> +(define_insn "f<minmax_op><mode>3"
> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
> +        (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
> +                      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
> +                     FMINMAX))]
> +  "TARGET_VSX && !flag_finite_math_only"

&& !flag_trapping_math

and/or whatever else is needed as well here.

> +  "xs<minmax_op>dp %x0,%x1,%x2"
> +  [(set_attr "type" "fp")]
> +)

Are things like
  fmin(4.0, 2.0);
(still) optimised correctly?

> new file mode 100644
> index 00000000000..e43ac40c2d1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c

> +/* { dg-options "-O1 -mvsx" } */

Please use -O2 instead.  That way, it will catch it if any of the
optimisations that are normally done (and not with just -O1) sabotage
us here.

Thanks,

Segher

Hi,

On 21/6/2022 7:08 AM, Segher Boessenkool wrote:
> && !flag_trapping_math
>
> and/or whatever else is needed as well here.
>

I have a question here.  fmin/fmax are folded to MIN_EXPR/MAX_EXPR when
flag_finite_math_only is set.  It seems that -fno-trapping-math is not
needed for fmin/fmax?  Also, xs[min|max]dp do raise exceptions (traps).

/* Convert fmin/fmax to MIN_EXPR/MAX_EXPR.  C99 requires these
   functions to return the numeric arg if the other one is NaN.
   MIN and MAX don't honor that, so only transform if -ffinite-math-only
   is set.  C99 doesn't require -0.0 to be handled, so we don't have to
   worry about it either.  */
(if (flag_finite_math_only)
 (simplify
  (FMIN_ALL @0 @1)
  (min @0 @1))
 (simplify
  (FMAX_ALL @0 @1)
  (max @0 @1)))

> Are things like
>   fmin(4.0, 2.0);
> (still) optimised correctly?

I have tested it.  fmin(4.0, 2.0) is converted to "2.0" in the front end,
so my patch doesn't touch it.

Thanks a lot.
Gui Haochen
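
The NaN point in the match.pd comment quoted above can be illustrated with a small, target-independent C sketch. It is not part of the patch: `naive_min` and `check_nan_handling` are hypothetical names introduced here, and `naive_min` only approximates what a plain MIN-style comparison computes. It assumes default floating-point options (no -ffinite-math-only or -ffast-math).

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper (not from GCC): the plain comparison that a
   MIN-style "min" boils down to.  */
double
naive_min (double a, double b)
{
  return a < b ? a : b;
}

/* Hypothetical self-check contrasting the two behaviours.  */
void
check_nan_handling (void)
{
  double x = 1.0;

  /* C99 fmin returns the numeric argument when the other one is a
     quiet NaN, whichever side the NaN is on.  */
  assert (fmin (x, NAN) == x);
  assert (fmin (NAN, x) == x);

  /* Every ordered comparison with NaN is false, so the naive version
     always falls through to its second operand, NaN or not.  */
  assert (naive_min (NAN, x) == x);
  assert (isnan (naive_min (x, NAN)));
}
```

The asymmetry in `naive_min` is why the fold is only safe when NaNs are assumed not to occur, which is what the `flag_finite_math_only` guard expresses.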

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f4a9f24bcc5..8b735493b40 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1613,10 +1613,10 @@
     XSCVSPDP vsx_xscvspdp {}
 
   const double __builtin_vsx_xsmaxdp (double, double);
-    XSMAXDP smaxdf3 {}
+    XSMAXDP fmaxdf3 {}
 
   const double __builtin_vsx_xsmindp (double, double);
-    XSMINDP smindf3 {}
+    XSMINDP fmindf3 {}
 
   const double __builtin_vsx_xsrdpi (double);
     XSRDPI vsx_xsrdpi {}
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bf85baa5370..ae0dd98f0f9 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -158,6 +158,8 @@ (define_c_enum "unspec"
    UNSPEC_HASHCHK
    UNSPEC_XXSPLTIDP_CONST
    UNSPEC_XXSPLTIW_CONST
+   UNSPEC_FMAX
+   UNSPEC_FMIN
   ])
 
 ;;
@@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
   DONE;
 })
 
+
+(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
+
+(define_int_attr minmax_op [(UNSPEC_FMAX "max")
+                            (UNSPEC_FMIN "min")])
+
+(define_insn "f<minmax_op><mode>3"
+  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
+        (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
+                      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
+                     FMINMAX))]
+  "TARGET_VSX && !flag_finite_math_only"
+  "xs<minmax_op>dp %x0,%x1,%x2"
+  [(set_attr "type" "fp")]
+)
+
 (define_expand "mov<mode>cc"
    [(set (match_operand:GPR 0 "gpc_reg_operand")
         (if_then_else:GPR (match_operand 1 "comparison_operator")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
new file mode 100644
index 00000000000..e43ac40c2d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O1 -mvsx" } */
+/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
+
+#include <math.h>
+
+double test1 (double d0, double d1)
+{
+  return fmin (d0, d1);
+}
+
+float test2 (float d0, float d1)
+{
+  return fmin (d0, d1);
+}
+
+double test3 (double d0, double d1)
+{
+  return fmax (d0, d1);
+}
+
+float test4 (float d0, float d1)
+{
+  return fmax (d0, d1);
+}
+
+double test5 (double d0, double d1)
+{
+  return __builtin_vsx_xsmindp (d0, d1);
+}
+
+double test6 (double d0, double d1)
+{
+  return __builtin_vsx_xsmaxdp (d0, d1);
+}