diff mbox series

[1/2] Add IEEE 128-bit min/max support on PowerPC.

Message ID 20210518202606.GA14382@ibm-toto.the-meissners.org
State New
Headers show
Series Add power10 IEEE 128-bit min/max/conditional move support | expand

Commit Message

Michael Meissner May 18, 2021, 8:26 p.m. UTC
[PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.

This patch adds the support for the IEEE 128-bit floating point C minimum and
maximum instructions.  The next patch will add the support for using the
compare and set mask instruction to implement conditional moves.

This patch does not try to re-use the code used for SF/DF min/max
support.  It defines a separate insn for the IEEE 128-bit support.  It
uses the code iterator <minmax> to simplify adding both operations.

GCC will not convert ?: operations into using min/max instructions provided in
this patch unless the user uses -Ofast or similar switches due to issues with
NaNs.  The next patch that adds conditional move instructions will enable the
?: conversion in many cases.

I have done bootstrap builds with this patch on the following 3 systems:
    1)	power9 running LE Linux using --with-cpu=power9
    2)	power8 running BE Linux using --with-cpu=power8, testing both
	32/64-bit.
    3)	power10 prototype running LE Linux using --with-cpu=power10.

There were no regressions to the tests, and the new test added passed.  Can I
check these patches into trunk branch for GCC 12?

I would like to check these patches into GCC 11 after a cooling off period, but
I can also not do the backport if desired.

gcc/
2021-05-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
	instructions.
	* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
	New insns.

gcc/testsuite/
2021-05-18  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-minmax-2.c: New test.
	* gcc.target/powerpc/float128-minmax.c: Turn off power10 code
	generation.
---
 gcc/config/rs6000/rs6000.c                        |  3 ++-
 gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
 .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
 .../gcc.target/powerpc/float128-minmax.c          |  7 +++++++
 4 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

Comments

will schmidt May 20, 2021, 7:25 p.m. UTC | #1
On Tue, 2021-05-18 at 16:26 -0400, Michael Meissner wrote:
> [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
> 

Hi,


> This patch adds the support for the IEEE 128-bit floating point C minimum and
> maximum instructions.  The next patch will add the support for using the
> compare and set mask instruction to implement conditional moves.
> 
> This patch does not try to re-use the code used for SF/DF min/max
> support.  It defines a separate insn for the IEEE 128-bit support.  It
> uses the code iterator <minmax> to simplify adding both operations.
> 
> GCC will not convert ?: operations into using min/max instructions provided in

I'd throw the ternary term in there, easier to search for later. 
s/?: operations/ternary (?:) operations /

> this patch unless the user uses -Ofast or similar switches due to issues with
> NaNs.  The next patch that adds conditional move instructions will enable the
> ?: conversion in many cases.
> 
> I have done bootstrap builds with this patch on the following 3 systems:
>     1)	power9 running LE Linux using --with-cpu=power9
>     2)	power8 running BE Linux using --with-cpu=power8, testing both
> 	32/64-bit.
>     3)	power10 prototype running LE Linux using --with-cpu=power10.
> 
> There were no regressions to the tests, and the new test added passed.  Can I
> check these patches into trunk branch for GCC 12?
> 
> I would like to check these patches into GCC 11 after a cooling off period, but
> I can also not do the backport if desired.
> 
> gcc/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> 	instructions.
> 	* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
> 	New insns.

ok

> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* gcc.target/powerpc/float128-minmax-2.c: New test.
> 	* gcc.target/powerpc/float128-minmax.c: Turn off power10 code
> 	generation.

So, presumably the float128-minmax-2.c test adds/replaces the power10
code gen tests that were removed or disabled from float128-minmax.c. 



> ---
>  gcc/config/rs6000/rs6000.c                        |  3 ++-
>  gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
>  .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
>  .../gcc.target/powerpc/float128-minmax.c          |  7 +++++++
>  4 files changed, 35 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 0d0595dddd6..fdaf12aeda0 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
>    /* VSX/altivec have direct min/max insns.  */
>    if ((code == SMAX || code == SMIN)
>        && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
> -	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
> +	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
> +	  || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
>      {
>        emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
>        return;
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 0bfeb24d9e8..3a1bc1f8547 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -5196,6 +5196,17 @@ (define_insn "*s<minmax><mode>3_vsx"
>  }
>    [(set_attr "type" "fp")])
> 
> +;; Min/max for ISA 3.1 IEEE 128-bit floating point
> +(define_insn "s<minmax><mode>3"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(fp_minmax:IEEE128
> +	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
> +	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10"
> +  "xs<minmax>cqp %0,%1,%2"
> +  [(set_attr "type" "vecfloat")
> +   (set_attr "size" "128")])
> +
>  ;; The conditional move instructions allow us to perform max and min operations
>  ;; even when we don't have the appropriate max/min instruction using the FSEL
>  ;; instruction.

ok


> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> new file mode 100644
> index 00000000000..c71ba08c9f8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */
> +TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
> +TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
> +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */

ok

> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> index fe397518f2f..c3af759c0b9 100644
> --- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> @@ -3,6 +3,13 @@
>  /* { dg-require-effective-target float128 } */
>  /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
> 
> +/* If the compiler was configured to automatically generate power10 support with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
> +   XXMINCQP instructions.  */
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif

Probably fine..  It's good to exercise the pragma target stuff, thoguh
I wonder if it would be better to just specify -mcpu=power9 in the
options since we are already specifying (redundant?) -mpower9-vector. 

I see similar changes in a later patch, probably OK there since those
tests do not appear to be specifying -mcpu=foo options that are already
pointed at power9 features...


> +
>  #ifndef TYPE
>  #define TYPE _Float128
>  #endif
> -- 
> 2.31.1

lgtm, 
thanks
-Will

> 
>
Michael Meissner May 21, 2021, 1:38 a.m. UTC | #2
On Thu, May 20, 2021 at 02:25:58PM -0500, will schmidt wrote:
> I'd throw the ternary term in there, easier to search for later. 
> s/?: operations/ternary (?:) operations /

Thanks.

> So, presumably the float128-minmax-2.c test adds/replaces the power10
> code gen tests that were removed or disabled from float128-minmax.c. 

Yes.

> Probably fine..  It's good to exercise the pragma target stuff, thoguh
> I wonder if it would be better to just specify -mcpu=power9 in the
> options since we are already specifying (redundant?) -mpower9-vector. 
> 
> I see similar changes in a later patch, probably OK there since those
> tests do not appear to be specifying -mcpu=foo options that are already
> pointed at power9 features...

I think we really want a better solution than #pragma, since some systems (AIX
if memory serves) might not support #pragma to change code generation models,
because they don't have the assembler/linker support for it.

Basically for code generation tests, I see the following cases:

1) Test code targetting precisley power8 (or power9, power10), etc.  Hopefully
these are rare.

2) Test code targetting at least power8.  But as these tests show, that a lot
of the code won't generate the appropriate instructions on power10.  This is
what we have now.  It relies on undocumented switches like -mpower9-vector to
add the necessary support.

3) Test code targetting at least power8 but go to power9 at the maximum.
Michael Meissner May 24, 2021, 3:47 p.m. UTC | #3
Ping patch:

| Subject: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
| Message-ID: <20210518202606.GA14382@ibm-toto.the-meissners.org>
Michael Meissner June 1, 2021, 11:38 p.m. UTC | #4
Ping again.

Original patch (Add IEEE 128-bit min/max support on PowerPC):

| Date: Tue, 18 May 2021 16:26:06 -0400
| Subject: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
| Message-ID: <20210518202606.GA14382@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570675.html
Segher Boessenkool June 7, 2021, 8:08 p.m. UTC | #5
Hi!

On Thu, May 20, 2021 at 09:38:49PM -0400, Michael Meissner wrote:
> Basically for code generation tests, I see the following cases:
> 
> 1) Test code targetting precisley power8 (or power9, power10), etc.  Hopefully
> these are rare.

-mdejagnu-cpu= works perfectly for this.  You may need a *_ok or a *_hw
as well (and/or other selectors).

> 2) Test code targetting at least power8.  But as these tests show, that a lot
> of the code won't generate the appropriate instructions on power10.  This is
> what we have now.  It relies on undocumented switches like -mpower9-vector to
> add the necessary support.

You should simply not run this test on too new systems.  You can use
dg-skip-if or similar.

> 3) Test code targetting at least power8 but go to power9 at the maximum.

But why?  We will keep testing all interesting CPU / OS combos as long
as they are interesting.


Segher
Segher Boessenkool June 7, 2021, 8:25 p.m. UTC | #6
On Tue, May 18, 2021 at 04:26:06PM -0400, Michael Meissner wrote:
> This patch adds the support for the IEEE 128-bit floating point C minimum and
> maximum instructions.

> gcc/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> 	instructions.

3.1 fits on the previous line (it is better to not split numbers to a
new line).  What is up with the weird multiple spaces?  We don't align
the right border in changelogs :-)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */

Is this needed?  And, why is ppc_float128_hw needed?  That combination
does not seem to make sense.

> --- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> @@ -3,6 +3,13 @@
>  /* { dg-require-effective-target float128 } */
>  /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
>  
> +/* If the compiler was configured to automatically generate power10 support with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
> +   XXMINCQP instructions.  */
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif

Yeah, don't.  Add a dg-skip-if if that is what you want.  That
-mpower9-vector shouldn't be there either.


Segher
Michael Meissner June 8, 2021, 11:52 p.m. UTC | #7
On Mon, Jun 07, 2021 at 03:25:06PM -0500, Segher Boessenkool wrote:
> On Tue, May 18, 2021 at 04:26:06PM -0400, Michael Meissner wrote:
> > This patch adds the support for the IEEE 128-bit floating point C minimum and
> > maximum instructions.
> 
> > gcc/
> > 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> > 
> > 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> > 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> > 	instructions.
> 
> 3.1 fits on the previous line (it is better to not split numbers to a
> new line).  What is up with the weird multiple spaces?  We don't align
> the right border in changelogs :-)
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-require-effective-target ppc_float128_hw } */
> > +/* { dg-require-effective-target power10_ok } */
> 
> Is this needed?  And, why is ppc_float128_hw needed?  That combination
> does not seem to make sense.

Basically it is there to make sure that we are actually generating IEEE 128-bit
instructions.
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0d0595dddd6..fdaf12aeda0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16111,7 +16111,8 @@  rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
       && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
-	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+	  || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
     {
       emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
       return;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 0bfeb24d9e8..3a1bc1f8547 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5196,6 +5196,17 @@  (define_insn "*s<minmax><mode>3_vsx"
 }
   [(set_attr "type" "fp")])
 
+;; Min/max for ISA 3.1 IEEE 128-bit floating point
+(define_insn "s<minmax><mode>3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(fp_minmax:IEEE128
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10"
+  "xs<minmax>cqp %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 ;; The conditional move instructions allow us to perform max and min operations
 ;; even when we don't have the appropriate max/min instruction using the FSEL
 ;; instruction.
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
new file mode 100644
index 00000000000..c71ba08c9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
@@ -0,0 +1,15 @@ 
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
+TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
index fe397518f2f..c3af759c0b9 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
@@ -3,6 +3,13 @@ 
 /* { dg-require-effective-target float128 } */
 /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
 
+/* If the compiler was configured to automatically generate power10 support with
+   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
+   XXMINCQP instructions.  */
+#ifdef _ARCH_PWR10
+#pragma GCC target ("cpu=power9")
+#endif
+
 #ifndef TYPE
 #define TYPE _Float128
 #endif