diff mbox series

Add _Float<N>/_Float<N>X rounding built-ins & improve gimple optimization of _Float<N>/_Float<N>X built-in functions

Message ID 20171117050445.GA13927@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner Nov. 17, 2017, 5:04 a.m. UTC
This patch is an enhancement of a previous patch that never got approved.
https://gcc.gnu.org/ml/gcc-patches/2017-10/threads.html#02124

In the original patch, I added support to the machine independent
infrastructure for the rounding built-in functions for _Float<N> and
_Float<N>X types (e.g. roundf128, ceilf128, etc.).  I also added PowerPC ISA
3.0 support to generate these built-in functions.

In addition to the previous changes, this patch now adds more optimizations for
_Float<N> and _Float<N>X types in match.pd.

I modified the gencfn-macros generator to emit <BUILTIN>_ALL case statements and
match.pd operator lists.  These are in addition to <BUILTIN>, which covers the
normal float, double, and long double variants, and <BUILTIN>_FN, which covers
the _Float<N> and _Float<N>X variants.  The <BUILTIN>_ALL version covers float,
double, long double, and all of the _Float<N> and _Float<N>X versions.
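
As a rough illustration of the three macro families, the following standalone
C sketch formats a CASE_CFN_<NAME><SUFFIX> macro from a suffix table.  It is
not the gencfn-macros.c code: format_case_cfn, append, the buffer handling,
and the contents of the three suffix tables are assumptions made for this
example, and the internal-function case label that the real generator can
also emit is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative suffix tables (assumed contents, not copied from GCC).  */
static const char *const flt_suffixes[] = { "F", "", "L", NULL };
static const char *const fltfn_suffixes[] = {
  "F16", "F32", "F64", "F128", "F32X", "F64X", "F128X", NULL
};
static const char *const fltall_suffixes[] = {
  "F", "", "L",
  "F16", "F32", "F64", "F128", "F32X", "F64X", "F128X", NULL
};

/* Append TEXT to the NUL-terminated string in BUF, never using more
   than SIZE bytes in total; text that does not fit is dropped.  */
static void
append (char *buf, size_t size, const char *text)
{
  size_t used = strlen (buf);
  if (used + 1 < size)
    strncat (buf, text, size - used - 1);
}

/* Write "#define CASE_CFN_<NAME><FLOATN>" followed by one case label
   per suffix into BUF.  FLOATN is "", "_FN" or "_ALL".  The last label
   has no trailing colon, so the macro is used as "CASE_CFN_CEIL_ALL:".  */
static void
format_case_cfn (char *buf, size_t size, const char *name,
		 const char *const *suffixes, const char *floatn)
{
  assert (buf != NULL && size > 0);
  buf[0] = '\0';
  append (buf, size, "#define CASE_CFN_");
  append (buf, size, name);
  append (buf, size, floatn);
  for (size_t i = 0; suffixes[i] != NULL; ++i)
    {
      /* Every label except the first is preceded by the colon that
	 terminates the previous label.  */
      append (buf, size, i > 0 ? ": \\\n  case CFN_BUILT_IN_"
			       : " \\\n  case CFN_BUILT_IN_");
      append (buf, size, name);
      append (buf, size, suffixes[i]);
    }
  append (buf, size, "\n");
}
```

Calling format_case_cfn with "CEIL", fltall_suffixes, and "_ALL" yields one
macro whose labels cover both the traditional and the _Float<N>/_Float<N>X
built-ins, which is the property the match.pd and fold-const changes rely on.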

I also modified match.pd where possible to use COPYSIGN_ALL, SQRT_ALL,
FMIN_ALL, FMAX_ALL, TRUNC_ALL, FLOOR_ALL, CEIL_ALL, ROUND_ALL, NEARBYINT_ALL,
and RINT_ALL instead of the normal operator lists.  Sometimes this wasn't
possible.  For example, we don't currently have _Float<N> pow support, so we
can't optimize powf128 (x, 0.5q) into sqrtf128 (x).

I added PowerPC tests for many of the optimizations when the compiler is
targeting ISA 3.0 (power9).

I checked this on an x86-64 system and little endian power8 system, doing a
bootstrap build and running make check.  There were no regressions.  Can I check
this patch into the trunk?

[gcc]
2017-11-16  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* builtins.def: (_Float<N> and _Float<N>X BUILT_IN_CEIL): Add
	_Float<N> and _Float<N>X variants for rounding built-in
	functions.
	(_Float<N> and _Float<N>X BUILT_IN_FLOOR): Likewise.
	(_Float<N> and _Float<N>X BUILT_IN_NEARBYINT): Likewise.
	(_Float<N> and _Float<N>X BUILT_IN_RINT): Likewise.
	(_Float<N> and _Float<N>X BUILT_IN_ROUND): Likewise.
	(_Float<N> and _Float<N>X BUILT_IN_TRUNC): Likewise.
	* builtins.c (mathfn_built_in_2): Likewise.
	* internal-fn.def (CEIL): Likewise.
	(FLOOR): Likewise.
	(NEARBYINT): Likewise.
	(RINT): Likewise.
	(ROUND): Likewise.
	(TRUNC): Likewise.
	* convert.c (convert_to_integer_1): Likewise.
	* fold-const.c (tree_call_nonnegative_warnv_p): Likewise.
	(integer_valued_real_call_p): Likewise.
	* fold-const-call.c (fold_const_call_ss): Likewise.
	* gencfn-macros.c (print_case_cfn): Change CFN and operator
	printers to take a const char * suffix instead of a bool.
	(print_define_operator_list): Likewise.
	(fltall_suffixes): New list of suffixes, that include the
	traditional suffixes as well as all of the _Float<N> and
	_Float<N>X suffixes.
	(main): For _Float<N> and _Float<N>X functions, emit both
	<name>_FN and <name>_ALL variants.  The <name>_FN variant only
	has the _Float<N> and _Float<N>X case names or operators.  The
	<name>_ALL variant has both the traditional and the
	_Float<N>/_Float<N>X case names or operators.
	* match.pd (COPYSIGN optimizations): Provide optimizations for
	_Float<N> and _Float<N>X types where possible.
	(MIN/MAX optimizations): Likewise.
	(sqrt optimizations): Likewise.
	(rounding optimizations): Likewise.
	* config/rs6000/rs6000.md (floor<mode>2): Add support for IEEE
	128-bit round to integer instructions.
	(ceil<mode>2): Likewise.
	(btrunc<mode>2): Likewise.
	(round<mode>2): Likewise.

[gcc/c]
2017-11-16  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* c-decl.c (header_for_builtin_fn): Add integer rounding _Float<N>
	and _Float<N>X built-in functions.

[gcc/testsuite]
2017-11-16  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/float128-hw2.c: Add tests for ceilf128,
	floorf128, truncf128, and roundf128.
	* gcc.target/powerpc/float128-hw5.c: New tests for _Float128
	optimizations added in match.pd.
	* gcc.target/powerpc/float128-hw6.c: Likewise.
	* gcc.target/powerpc/float128-hw7.c: Likewise.
	* gcc.target/powerpc/float128-hw8.c: Likewise.
	* gcc.target/powerpc/float128-hw9.c: Likewise.
	* gcc.target/powerpc/float128-hw10.c: Likewise.

Comments

Segher Boessenkool Nov. 17, 2017, 2:06 p.m. UTC | #1
Hi!

On Fri, Nov 17, 2017 at 12:04:45AM -0500, Michael Meissner wrote:
> This patch is an enhancement of a previous patch that never got approved.
> https://gcc.gnu.org/ml/gcc-patches/2017-10/threads.html#02124
> 
> In the original patch, I added support to the machine independent
> infrastructure to support the rounding built-in functions for _Float<N> and
> _Float<N>X types (i.e. roundf128, ceilf128, etc.).  I also added PowerPC ISA
> 3.0 support to generate these built-in functions.
> 
> In addition to the previous changes, this patch now adds more optimizations for
> _Float<N> and _Float<N>X types in match.pd.

> +(define_insn "btrunc<mode>2"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(unspec:IEEE128
> +	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> +	 UNSPEC_FRIZ))]
> +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +  "xsrqpi 1,%0,%1,0"
> +  [(set_attr "type" "vecfloat")
> +   (set_attr "size" "128")])

Is this one correct?  Truncate is RMC=1, not RMC=0, I think?

The rest of the rs6000 part looks fine.  Thanks!


Segher
Michael Meissner Nov. 17, 2017, 9:29 p.m. UTC | #2
On Fri, Nov 17, 2017 at 08:06:09AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Nov 17, 2017 at 12:04:45AM -0500, Michael Meissner wrote:
> > This patch is an enhancement of a previous patch that never got approved.
> > https://gcc.gnu.org/ml/gcc-patches/2017-10/threads.html#02124
> > 
> > In the original patch, I added support to the machine independent
> > infrastructure to support the rounding built-in functions for _Float<N> and
> > _Float<N>X types (i.e. roundf128, ceilf128, etc.).  I also added PowerPC ISA
> > 3.0 support to generate these built-in functions.
> > 
> > In addition to the previous changes, this patch now adds more optimizations for
> > _Float<N> and _Float<N>X types in match.pd.
> 
> > +(define_insn "btrunc<mode>2"
> > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > +	(unspec:IEEE128
> > +	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> > +	 UNSPEC_FRIZ))]
> > +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> > +  "xsrqpi 1,%0,%1,0"
> > +  [(set_attr "type" "vecfloat")
> > +   (set_attr "size" "128")])
> 
> Is this one correct?  Truncate is RMC=1, not RMC=0, I think?

Good catch.  I just tried it, and you were right.

> The rest of the rs6000 part looks fine.  Thanks!

I will re-submit a patch once the bootstraps finish.  Just to be sure, I'm
adding a new test that tests it at runtime.
Michael Meissner Nov. 18, 2017, 12:35 a.m. UTC | #3
On Fri, Nov 17, 2017 at 08:06:09AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Nov 17, 2017 at 12:04:45AM -0500, Michael Meissner wrote:
> > This patch is an enhancement of a previous patch that never got approved.
> > https://gcc.gnu.org/ml/gcc-patches/2017-10/threads.html#02124
> > 
> > In the original patch, I added support to the machine independent
> > infrastructure to support the rounding built-in functions for _Float<N> and
> > _Float<N>X types (i.e. roundf128, ceilf128, etc.).  I also added PowerPC ISA
> > 3.0 support to generate these built-in functions.
> > 
> > In addition to the previous changes, this patch now adds more optimizations for
> > _Float<N> and _Float<N>X types in match.pd.
> 
> > +(define_insn "btrunc<mode>2"
> > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > +	(unspec:IEEE128
> > +	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> > +	 UNSPEC_FRIZ))]
> > +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> > +  "xsrqpi 1,%0,%1,0"
> > +  [(set_attr "type" "vecfloat")
> > +   (set_attr "size" "128")])
> 
> Is this one correct?  Truncate is RMC=1, not RMC=0, I think?
> 
> The rest of the rs6000 part looks fine.  Thanks!

Here is the fixed patch.  It fixes the btrunc<mode>2 insn to use the correct
XSRQPI variant for truncf128.  I added the float128-hw11.c test as a runtime
test to make sure round, trunc, ceil, and floor return the correct values.  The
machine independent portions are the same.

Assuming the machine independent versions are approved, can I check in the
PowerPC bits?

[gcc]
2017-11-17  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* builtins.def: (_Float<N> and _Float<N>X BUILT_IN_CEIL): Add
	_Float<N> and _Float<N>X variants for rounding built-in
	functions.
	(_Float<N> and _Float<N>X BUILT_IN_FLOOR): Likewise.
	(_Float<N> and _Float<N>X BUILT_IN_NEARBYINT): Likewise.
	(_Float<N> and _Float<N>X BUILT_IN_RINT): Likewise.
	(_Float<N> and _Float<N>X BUILT_IN_ROUND): Likewise.
	(_Float<N> and _Float<N>X BUILT_IN_TRUNC): Likewise.
	* builtins.c (mathfn_built_in_2): Likewise.
	* internal-fn.def (CEIL): Likewise.
	(FLOOR): Likewise.
	(NEARBYINT): Likewise.
	(RINT): Likewise.
	(ROUND): Likewise.
	(TRUNC): Likewise.
	* convert.c (convert_to_integer_1): Likewise.
	* fold-const.c (tree_call_nonnegative_warnv_p): Likewise.
	(integer_valued_real_call_p): Likewise.
	* fold-const-call.c (fold_const_call_ss): Likewise.
	* gencfn-macros.c (print_case_cfn): Change CFN and operator
	printers to take a const char * suffix instead of a bool.
	(print_define_operator_list): Likewise.
	(fltall_suffixes): New list of suffixes, that include the
	traditional suffixes as well as all of the _Float<N> and
	_Float<N>X suffixes.
	(main): For _Float<N> and _Float<N>X functions, emit both
	<name>_FN and <name>_ALL variants.  The <name>_FN variant only
	has the _Float<N> and _Float<N>X case names or operators.  The
	<name>_ALL variant has both the traditional and the
	_Float<N>/_Float<N>X case names or operators.
	* match.pd (COPYSIGN optimizations): Provide optimizations for
	_Float<N> and _Float<N>X types where possible.
	(MIN/MAX optimizations): Likewise.
	(sqrt optimizations): Likewise.
	(rounding optimizations): Likewise.
	* config/rs6000/rs6000.md (floor<mode>2): Add support for IEEE
	128-bit round to integer instructions.
	(ceil<mode>2): Likewise.
	(btrunc<mode>2): Likewise.
	(round<mode>2): Likewise.

[gcc/c]
2017-11-17  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* c-decl.c (header_for_builtin_fn): Add integer rounding _Float<N>
	and _Float<N>X built-in functions.

[gcc/testsuite]
2017-11-17  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/float128-hw2.c: Add tests for ceilf128,
	floorf128, truncf128, and roundf128.
	* gcc.target/powerpc/float128-hw5.c: New tests for _Float128
	optimizations added in match.pd.
	* gcc.target/powerpc/float128-hw6.c: Likewise.
	* gcc.target/powerpc/float128-hw7.c: Likewise.
	* gcc.target/powerpc/float128-hw8.c: Likewise.
	* gcc.target/powerpc/float128-hw9.c: Likewise.
	* gcc.target/powerpc/float128-hw10.c: Likewise.
	* gcc.target/powerpc/float128-hw11.c: Likewise.
Segher Boessenkool Nov. 20, 2017, 5:32 p.m. UTC | #4
Hi!

On Fri, Nov 17, 2017 at 07:35:05PM -0500, Michael Meissner wrote:
> Here is the fixed patch.  It fixes the btrunc<mode>2 insn to use the correct
> XSRQPI variant for truncf128.  I added the float128-hw11.c test as a runtime
> test to make sure round, trunc, ceil, and floor return the correct values.  The
> machine independent portions are the same.
> 
> Assuming the machine independent versions are approved, can I check in the
> PowerPC bits?

The rs6000 parts are okay, with a trivial fix:

> --- gcc/testsuite/gcc.target/powerpc/float128-hw11.c	(revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/float128-hw11.c	(revision 0)
> @@ -0,0 +1,59 @@
> +/* { dg-do run { target { powerpc*-*-* && lp64 } } } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-mpower9-vector -O2" } */
> +/* { dg-options "-mvsx -O2" } */

This last line should not be there.

Thanks,


Segher
Michael Meissner Nov. 20, 2017, 6:22 p.m. UTC | #5
On Mon, Nov 20, 2017 at 11:32:08AM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Nov 17, 2017 at 07:35:05PM -0500, Michael Meissner wrote:
> > Here is the fixed patch.  It fixes the btrunc<mode>2 insn to use the correct
> > XSRQPI variant for truncf128.  I added the float128-hw11.c test as a runtime
> > test to make sure round, trunc, ceil, and floor return the correct values.  The
> > machine independent portions are the same.
> > 
> > Assuming the machine independent versions are approved, can I check in the
> > PowerPC bits?
> 
> The rs6000 parts are okay, with a trivial fix:
> 
> > --- gcc/testsuite/gcc.target/powerpc/float128-hw11.c	(revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/float128-hw11.c	(revision 0)
> > @@ -0,0 +1,59 @@
> > +/* { dg-do run { target { powerpc*-*-* && lp64 } } } */
> > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > +/* { dg-options "-mpower9-vector -O2" } */
> > +/* { dg-options "-mvsx -O2" } */
> 
> This last line should not be there.

Ok.  I will also substitute p9vector_hw for powerpc_p9vector_ok, since the
former says we have power9 hardware to run tests, and the latter says the
compiler supports compiling for a power9 target.  Sorry about that.
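
With both review comments applied, the directive header of float128-hw11.c
would presumably look like the sketch below.  It is reconstructed from this
discussion only (the duplicate dg-options line dropped and p9vector_hw
substituted for powerpc_p9vector_ok); it is not taken from the committed
test, and the test body is omitted.

```c
/* { dg-do run { target { powerpc*-*-* && lp64 } } } */
/* { dg-require-effective-target p9vector_hw } */
/* { dg-options "-mpower9-vector -O2" } */

/* Test body omitted in this sketch.  */
```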
Joseph Myers Dec. 21, 2017, 6:16 p.m. UTC | #6
On Fri, 17 Nov 2017, Michael Meissner wrote:

> Here is the fixed patch.  It fixes the btrunc<mode>2 insn to use the correct
> XSRQPI variant for truncf128.  I added the float128-hw11.c test as a runtime
> test to make sure round, trunc, ceil, and floor return the correct values.  The
> machine independent portions are the same.

The architecture-independent changes are OK.  However, I have a comment on 
the target parts:

> +(define_insn "round<mode>2"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(unspec:IEEE128
> +	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> +	 UNSPEC_FRIN))]
> +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +  "xsrqpi 0,%0,%1,3"
> +  [(set_attr "type" "vecfloat")
> +   (set_attr "size" "128")])

My reading of Power ISA 3.0B documentation is that 0,%0,%1,3 means round 
in the mode specified by FPSCR and you need 0,%0,%1,0 for 
round-to-nearest-away semantics which are what the round<mode>2 
instruction has (i.e., what you've written here is actually correct for 
nearbyint<mode>2, and would be rint<mode>2 if xsrqpix were used instead).  
Note that the testcase

> Index: gcc/testsuite/gcc.target/powerpc/float128-hw11.c

> +static const struct {
> +  _Float128 value;
> +  _Float128 exp_round;
> +  _Float128 exp_floor;
> +  _Float128 exp_ceil;
> +  _Float128 exp_trunc;
> +} a[] = {
> +  { -2.0Q, -2.0Q, -2.0Q, -2.0Q, -2.0Q },
> +  { -1.7Q, -2.0Q, -2.0Q, -1.0Q, -1.0Q },
> +  { -1.5Q, -2.0Q, -2.0Q, -1.0Q, -1.0Q },
> +  { -1.3Q, -1.0Q, -2.0Q, -1.0Q, -1.0Q },
> +  { +0.0Q, +0.0Q, +0.0Q, +0.0Q, +0.0Q },
> +  { +1.3Q, +1.0Q, +1.0Q, +2.0Q, +1.0Q },
> +  { +1.5Q, +2.0Q, +1.0Q, +2.0Q, +1.0Q },
> +  { +1.7Q, +2.0Q, +1.0Q, +2.0Q, +1.0Q },
> +  { +2.0Q, +2.0Q, +2.0Q, +2.0Q, +2.0Q }
> +};

has only -1.5Q and +1.5Q as half-way inputs, and both of those inputs have 
the same results for round-to-nearest-even and round-to-nearest-away, and 
the test doesn't test changing the rounding mode from the default 
round-to-nearest-even.  So getting this case wrong would not have shown up 
with a test failure; you'd need to add a further test input such as 2.5 
for which round-to-nearest-even and round-to-nearest-away differ, or test 
in multiple rounding modes to verify that the results of these built-in 
functions do not depend on the rounding mode, or both.
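
The point about half-way inputs can be checked with a small model of the two
tie-breaking rules.  The sketch below is plain C arithmetic on double, not
the _Float128 built-ins or the xsrqpi instruction.  The helper names
round_half_away, round_half_even, and rules_differ are hypothetical, and the
helpers assume a non-NaN input whose magnitude fits in long long.

```c
#include <assert.h>

/* Round X to an integer, ties away from zero (the semantics of round).
   Assumes X is not a NaN and fits in long long.  */
static double
round_half_away (double x)
{
  assert (x == x);			/* NaN is not handled.  */
  double t = (double) (long long) x;	/* Truncate toward zero.  */
  double frac = x - t;
  if (frac >= 0.5)
    return t + 1.0;
  if (frac <= -0.5)
    return t - 1.0;
  return t;
}

/* Round X to an integer, ties to even (what nearbyint returns in the
   default round-to-nearest-even mode).  Same assumptions as above.  */
static double
round_half_even (double x)
{
  assert (x == x);			/* NaN is not handled.  */
  long long i = (long long) x;		/* Truncate toward zero.  */
  double t = (double) i;
  double frac = x - t;
  /* Move away from zero past the half-way point, or exactly at it when
     the truncated value is odd.  */
  if (frac > 0.5 || (frac == 0.5 && i % 2 != 0))
    return t + 1.0;
  if (frac < -0.5 || (frac == -0.5 && i % 2 != 0))
    return t - 1.0;
  return t;
}

/* Return nonzero if X is an input on which the two rules disagree.  */
static int
rules_differ (double x)
{
  return round_half_away (x) != round_half_even (x);
}
```

For the inputs in the table above, rules_differ is zero for -1.5 and +1.5
(both rules give -2.0 and +2.0) and nonzero for 2.5 (3.0 versus 2.0), which
is why an input such as 2.5 is needed to tell the two behaviours apart.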

(Of course xsrqpi can also implement roundevenf128 using 1,%0,%1,0 but GCC 
doesn't currently have any built-in function or machine description 
support for roundeven for any type.)
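
The xsrqpi operand combinations mentioned so far in this thread can be
collected into a small lookup table.  The C sketch below records only the
encodings stated in the discussion above; the struct and function names are
hypothetical, and the entries reflect the reviewers' reading of the ISA
document rather than an independent check against it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct xsrqpi_variant
{
  const char *operation;	/* Operation name as used in the thread.  */
  int r;			/* First operand of xsrqpi.  */
  int rmc;			/* Last operand of xsrqpi.  */
  const char *rounding;		/* Rounding behaviour described.  */
};

/* Only the combinations named in the review comments are listed.  */
static const struct xsrqpi_variant xsrqpi_variants[] = {
  { "round",     0, 0, "to nearest, ties away from zero" },
  { "nearbyint", 0, 3, "mode currently selected in FPSCR" },
  { "roundeven", 1, 0, "to nearest, ties to even" },
  { "btrunc",    1, 1, "toward zero" },
};

/* Return the table entry for OPERATION, or NULL if the thread does not
   state an encoding for it.  */
static const struct xsrqpi_variant *
find_xsrqpi_variant (const char *operation)
{
  assert (operation != NULL);
  size_t count = sizeof xsrqpi_variants / sizeof xsrqpi_variants[0];
  for (size_t i = 0; i < count; ++i)
    if (strcmp (xsrqpi_variants[i].operation, operation) == 0)
      return &xsrqpi_variants[i];
  return NULL;
}
```

Looking up "round" gives R=0 and RMC=0, so the pattern quoted above, which
uses 0,%0,%1,3, matches the "nearbyint" row instead.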
Segher Boessenkool Dec. 21, 2017, 7:03 p.m. UTC | #7
On Thu, Dec 21, 2017 at 06:16:16PM +0000, Joseph Myers wrote:
> On Fri, 17 Nov 2017, Michael Meissner wrote:
> The architecture-independent changes are OK.  However, I have a comment on 
> the target parts:
> 
> > +(define_insn "round<mode>2"
> > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > +	(unspec:IEEE128
> > +	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> > +	 UNSPEC_FRIN))]
> > +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> > +  "xsrqpi 0,%0,%1,3"
> > +  [(set_attr "type" "vecfloat")
> > +   (set_attr "size" "128")])
> 
> My reading of Power ISA 3.0B documentation is that 0,%0,%1,3 means round 
> in the mode specified by FPSCR and you need 0,%0,%1,0 for 
> round-to-nearest-away semantics which are what the round<mode>2 
> instruction has (i.e., what you've written here is actually correct for 
> nearbyint<mode>2, and would be rint<mode>2 if xsrqpix were used instead).  

Ah yes, the roundM2 insn is round-away-from-zero, so you are right.
Tricky, from the name I assumed it would be "current rounding mode" :-/
Not that "frin" would make sense if that were true.

Thanks!  And thanks for all the reviews in general.


Segher
Michael Meissner Dec. 28, 2017, 9:21 p.m. UTC | #8
On Thu, Dec 21, 2017 at 06:16:16PM +0000, Joseph Myers wrote:
> On Fri, 17 Nov 2017, Michael Meissner wrote:
> 
> > Here is the fixed patch.  It fixes the btrunc<mode>2 insn to use the correct
> > XSRQPI variant for truncf128.  I added the float128-hw11.c test as a runtime
> > test to make sure round, trunc, ceil, and floor return the correct values.  The
> > machine independent portions are the same.
> 
> The architecture-independent changes are OK.  However, I have a comment on 
> the target parts:

Ok, I have committed the machine independent patches, and I will revise the
machine dependent patches, and add more tests.  Thanks for the review.
Michael Meissner Dec. 29, 2017, 5:35 a.m. UTC | #9
On Thu, Dec 21, 2017 at 01:03:26PM -0600, Segher Boessenkool wrote:
> On Thu, Dec 21, 2017 at 06:16:16PM +0000, Joseph Myers wrote:
> > On Fri, 17 Nov 2017, Michael Meissner wrote:
> > The architecture-independent changes are OK.  However, I have a comment on 
> > the target parts:
> > 
> > > +(define_insn "round<mode>2"
> > > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > > +	(unspec:IEEE128
> > > +	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
> > > +	 UNSPEC_FRIN))]
> > > +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> > > +  "xsrqpi 0,%0,%1,3"
> > > +  [(set_attr "type" "vecfloat")
> > > +   (set_attr "size" "128")])
> > 
> > My reading of Power ISA 3.0B documentation is that 0,%0,%1,3 means round 
> > in the mode specified by FPSCR and you need 0,%0,%1,0 for 
> > round-to-nearest-away semantics which are what the round<mode>2 
> > instruction has (i.e., what you've written here is actually correct for 
> > nearbyint<mode>2, and would be rint<mode>2 if xsrqpix were used instead).  
> 
> Ah yes, the roundM2 insn is round-away-from-zero, so you are right.
> Tricky, from the name I assumed it would be "current rounding mode" :-/
> Not that "frin" would make sense if that were true.
> 
> Thanks!  And thanks for all the reviews in general.

Here is the corrected rs6000 part of the patch.  I added more round tests and I
checked it on a power9 prototype machine.  Roundf128 now produces the correct
answer.  Can I check this into the trunk?

[gcc]
2017-12-29  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.md (floor<mode>2): Add support for IEEE
	128-bit round to integer instructions.
	(ceil<mode>2): Likewise.
	(btrunc<mode>2): Likewise.
	(round<mode>2): Likewise.

[gcc/testsuite]
2017-12-29  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/float128-hw2.c: Add tests for ceilf128,
	floorf128, truncf128, and roundf128.
	* gcc.target/powerpc/float128-hw5.c: New tests for _Float128
	optimizations added in match.pd.
	* gcc.target/powerpc/float128-hw6.c: Likewise.
	* gcc.target/powerpc/float128-hw7.c: Likewise.
	* gcc.target/powerpc/float128-hw8.c: Likewise.
	* gcc.target/powerpc/float128-hw9.c: Likewise.
	* gcc.target/powerpc/float128-hw10.c: Likewise.
	* gcc.target/powerpc/float128-hw11.c: Likewise.

Patch

Index: gcc/builtins.def
===================================================================
--- gcc/builtins.def	(revision 254846)
+++ gcc/builtins.def	(working copy)
@@ -343,6 +343,9 @@  DEF_C99_BUILTIN        (BUILT_IN_CBRTL, 
 DEF_LIB_BUILTIN        (BUILT_IN_CEIL, "ceil", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_CEILF, "ceilf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_CEILL, "ceill", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define CEIL_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_CEIL, "ceil", CEIL_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef CEIL_TYPE
 DEF_C99_BUILTIN        (BUILT_IN_COPYSIGN, "copysign", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_COPYSIGNF, "copysignf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_COPYSIGNL, "copysignl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
@@ -402,6 +405,9 @@  DEF_C99_BUILTIN        (BUILT_IN_FEUPDAT
 DEF_LIB_BUILTIN        (BUILT_IN_FLOOR, "floor", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_FLOORF, "floorf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_FLOORL, "floorl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define FLOOR_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_FLOOR, "floor", FLOOR_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef FLOOR_TYPE
 DEF_C99_BUILTIN        (BUILT_IN_FMA, "fma", BT_FN_DOUBLE_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN        (BUILT_IN_FMAF, "fmaf", BT_FN_FLOAT_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN        (BUILT_IN_FMAL, "fmal", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
@@ -539,6 +545,9 @@  DEF_GCC_FLOATN_NX_BUILTINS (BUILT_IN_NAN
 DEF_C99_BUILTIN        (BUILT_IN_NEARBYINT, "nearbyint", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_NEARBYINTF, "nearbyintf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_NEARBYINTL, "nearbyintl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define NEARBYINT_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_NEARBYINT, "nearbyint", NEARBYINT_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef NEARBYINT_TYPE
 DEF_C99_BUILTIN        (BUILT_IN_NEXTAFTER, "nextafter", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_BUILTIN        (BUILT_IN_NEXTAFTERF, "nextafterf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_BUILTIN        (BUILT_IN_NEXTAFTERL, "nextafterl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
@@ -563,9 +572,15 @@  DEF_C99_BUILTIN        (BUILT_IN_REMQUOL
 DEF_C99_BUILTIN        (BUILT_IN_RINT, "rint", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN        (BUILT_IN_RINTF, "rintf", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING)
 DEF_C99_BUILTIN        (BUILT_IN_RINTL, "rintl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING)
+#define RINT_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_RINT, "rint", RINT_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef RINT_TYPE
 DEF_C99_BUILTIN        (BUILT_IN_ROUND, "round", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_ROUNDF, "roundf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_ROUNDL, "roundl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define ROUND_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_ROUND, "round", ROUND_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef ROUND_TYPE
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_SCALB, "scalb", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_SCALBF, "scalbf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_SCALBL, "scalbl", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
@@ -611,6 +626,9 @@  DEF_C99_BUILTIN        (BUILT_IN_TGAMMAL
 DEF_C99_BUILTIN        (BUILT_IN_TRUNC, "trunc", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_TRUNCF, "truncf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN        (BUILT_IN_TRUNCL, "truncl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#define TRUNC_TYPE(F) BT_FN_##F##_##F
+DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_TRUNC, "trunc", TRUNC_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
+#undef TRUNC_TYPE
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_Y0, "y0", BT_FN_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_Y0F, "y0f", BT_FN_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_Y0L, "y0l", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 254846)
+++ gcc/builtins.c	(working copy)
@@ -1872,7 +1872,7 @@  mathfn_built_in_2 (tree type, combined_f
     CASE_MATHFN (ATAN2)
     CASE_MATHFN (ATANH)
     CASE_MATHFN (CBRT)
-    CASE_MATHFN (CEIL)
+    CASE_MATHFN_FLOATN (CEIL)
     CASE_MATHFN (CEXPI)
     CASE_MATHFN_FLOATN (COPYSIGN)
     CASE_MATHFN (COS)
@@ -1886,7 +1886,7 @@  mathfn_built_in_2 (tree type, combined_f
     CASE_MATHFN (EXPM1)
     CASE_MATHFN (FABS)
     CASE_MATHFN (FDIM)
-    CASE_MATHFN (FLOOR)
+    CASE_MATHFN_FLOATN (FLOOR)
     CASE_MATHFN_FLOATN (FMA)
     CASE_MATHFN_FLOATN (FMAX)
     CASE_MATHFN_FLOATN (FMIN)
@@ -1925,7 +1925,7 @@  mathfn_built_in_2 (tree type, combined_f
     CASE_MATHFN (MODF)
     CASE_MATHFN (NAN)
     CASE_MATHFN (NANS)
-    CASE_MATHFN (NEARBYINT)
+    CASE_MATHFN_FLOATN (NEARBYINT)
     CASE_MATHFN (NEXTAFTER)
     CASE_MATHFN (NEXTTOWARD)
     CASE_MATHFN (POW)
@@ -1933,8 +1933,8 @@  mathfn_built_in_2 (tree type, combined_f
     CASE_MATHFN (POW10)
     CASE_MATHFN (REMAINDER)
     CASE_MATHFN (REMQUO)
-    CASE_MATHFN (RINT)
-    CASE_MATHFN (ROUND)
+    CASE_MATHFN_FLOATN (RINT)
+    CASE_MATHFN_FLOATN (ROUND)
     CASE_MATHFN (SCALB)
     CASE_MATHFN (SCALBLN)
     CASE_MATHFN (SCALBN)
@@ -1947,7 +1947,7 @@  mathfn_built_in_2 (tree type, combined_f
     CASE_MATHFN (TAN)
     CASE_MATHFN (TANH)
     CASE_MATHFN (TGAMMA)
-    CASE_MATHFN (TRUNC)
+    CASE_MATHFN_FLOATN (TRUNC)
     CASE_MATHFN (Y0)
     CASE_MATHFN (Y1)
     CASE_MATHFN (YN)
Index: gcc/internal-fn.def
===================================================================
--- gcc/internal-fn.def	(revision 254846)
+++ gcc/internal-fn.def	(working copy)
@@ -118,12 +118,12 @@  DEF_INTERNAL_FLT_FLOATN_FN (SQRT, ECF_CO
 DEF_INTERNAL_FLT_FN (TAN, ECF_CONST, tan, unary)
 
 /* FP rounding.  */
-DEF_INTERNAL_FLT_FN (CEIL, ECF_CONST, ceil, unary)
-DEF_INTERNAL_FLT_FN (FLOOR, ECF_CONST, floor, unary)
-DEF_INTERNAL_FLT_FN (NEARBYINT, ECF_CONST, nearbyint, unary)
-DEF_INTERNAL_FLT_FN (RINT, ECF_CONST, rint, unary)
-DEF_INTERNAL_FLT_FN (ROUND, ECF_CONST, round, unary)
-DEF_INTERNAL_FLT_FN (TRUNC, ECF_CONST, btrunc, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (CEIL, ECF_CONST, ceil, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (FLOOR, ECF_CONST, floor, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (NEARBYINT, ECF_CONST, nearbyint, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (RINT, ECF_CONST, rint, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (ROUND, ECF_CONST, round, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (TRUNC, ECF_CONST, btrunc, unary)
 
 /* Binary math functions.  */
 DEF_INTERNAL_FLT_FN (ATAN2, ECF_CONST, atan2, binary)
Index: gcc/convert.c
===================================================================
--- gcc/convert.c	(revision 254846)
+++ gcc/convert.c	(working copy)
@@ -554,6 +554,7 @@  convert_to_integer_1 (tree type, tree ex
       switch (fcode)
         {
 	CASE_FLT_FN (BUILT_IN_CEIL):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_CEIL):
 	  /* Only convert in ISO C99 mode.  */
 	  if (!targetm.libc_has_function (function_c99_misc))
 	    break;
@@ -570,6 +571,7 @@  convert_to_integer_1 (tree type, tree ex
 	  break;
 
 	CASE_FLT_FN (BUILT_IN_FLOOR):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_FLOOR):
 	  /* Only convert in ISO C99 mode.  */
 	  if (!targetm.libc_has_function (function_c99_misc))
 	    break;
@@ -586,6 +588,7 @@  convert_to_integer_1 (tree type, tree ex
 	  break;
 
 	CASE_FLT_FN (BUILT_IN_ROUND):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_ROUND):
 	  /* Only convert in ISO C99 mode and with -fno-math-errno.  */
 	  if (!targetm.libc_has_function (function_c99_misc) || flag_errno_math)
 	    break;
@@ -602,11 +605,13 @@  convert_to_integer_1 (tree type, tree ex
 	  break;
 
 	CASE_FLT_FN (BUILT_IN_NEARBYINT):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_NEARBYINT):
 	  /* Only convert nearbyint* if we can ignore math exceptions.  */
 	  if (flag_trapping_math)
 	    break;
 	  gcc_fallthrough ();
 	CASE_FLT_FN (BUILT_IN_RINT):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_RINT):
 	  /* Only convert in ISO C99 mode and with -fno-math-errno.  */
 	  if (!targetm.libc_has_function (function_c99_misc) || flag_errno_math)
 	    break;
@@ -623,6 +628,7 @@  convert_to_integer_1 (tree type, tree ex
 	  break;
 
 	CASE_FLT_FN (BUILT_IN_TRUNC):
+	CASE_FLT_FN_FLOATN_NX (BUILT_IN_TRUNC):
 	  return convert_to_integer_1 (type, CALL_EXPR_ARG (s_expr, 0), dofold);
 
 	default:
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 254846)
+++ gcc/fold-const.c	(working copy)
@@ -12784,9 +12784,11 @@  tree_call_nonnegative_warnv_p (tree type
     CASE_CFN_ATANH:
     CASE_CFN_CBRT:
     CASE_CFN_CEIL:
+    CASE_CFN_CEIL_FN:
     CASE_CFN_ERF:
     CASE_CFN_EXPM1:
     CASE_CFN_FLOOR:
+    CASE_CFN_FLOOR_FN:
     CASE_CFN_FMOD:
     CASE_CFN_FREXP:
     CASE_CFN_ICEIL:
@@ -12804,8 +12806,11 @@  tree_call_nonnegative_warnv_p (tree type
     CASE_CFN_LROUND:
     CASE_CFN_MODF:
     CASE_CFN_NEARBYINT:
+    CASE_CFN_NEARBYINT_FN:
     CASE_CFN_RINT:
+    CASE_CFN_RINT_FN:
     CASE_CFN_ROUND:
+    CASE_CFN_ROUND_FN:
     CASE_CFN_SCALB:
     CASE_CFN_SCALBLN:
     CASE_CFN_SCALBN:
@@ -12814,6 +12819,7 @@  tree_call_nonnegative_warnv_p (tree type
     CASE_CFN_SINH:
     CASE_CFN_TANH:
     CASE_CFN_TRUNC:
+    CASE_CFN_TRUNC_FN:
       /* True if the 1st argument is nonnegative.  */
       return RECURSE (arg0);
 
@@ -13319,11 +13325,17 @@  integer_valued_real_call_p (combined_fn 
   switch (fn)
     {
     CASE_CFN_CEIL:
+    CASE_CFN_CEIL_FN:
     CASE_CFN_FLOOR:
+    CASE_CFN_FLOOR_FN:
     CASE_CFN_NEARBYINT:
+    CASE_CFN_NEARBYINT_FN:
     CASE_CFN_RINT:
+    CASE_CFN_RINT_FN:
     CASE_CFN_ROUND:
+    CASE_CFN_ROUND_FN:
     CASE_CFN_TRUNC:
+    CASE_CFN_TRUNC_FN:
       return true;
 
     CASE_CFN_FMIN:
Index: gcc/fold-const-call.c
===================================================================
--- gcc/fold-const-call.c	(revision 254846)
+++ gcc/fold-const-call.c	(working copy)
@@ -699,6 +699,7 @@  fold_const_call_ss (real_value *result, 
 	      && do_mpfr_arg1 (result, mpfr_y1, arg, format));
 
     CASE_CFN_FLOOR:
+    CASE_CFN_FLOOR_FN:
       if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
 	{
 	  real_floor (result, format, arg);
@@ -707,6 +708,7 @@  fold_const_call_ss (real_value *result, 
       return false;
 
     CASE_CFN_CEIL:
+    CASE_CFN_CEIL_FN:
       if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
 	{
 	  real_ceil (result, format, arg);
@@ -715,10 +717,12 @@  fold_const_call_ss (real_value *result, 
       return false;
 
     CASE_CFN_TRUNC:
+    CASE_CFN_TRUNC_FN:
       real_trunc (result, format, arg);
       return true;
 
     CASE_CFN_ROUND:
+    CASE_CFN_ROUND_FN:
       if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
 	{
 	  real_round (result, format, arg);
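
The guards visible in the hunks above can be summarized with a small sketch:
trunc is folded unconditionally, while floor, ceil and round are folded unless
the argument is a NaN and errno-setting math is enabled.  The names
toy_round_fn, toy_nan and toy_can_fold are hypothetical, and plain double
stands in for real_value; this only models the decision, not the folding
itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the CFN codes handled in fold_const_call_ss.  */
enum toy_round_fn { TOY_FLOOR, TOY_CEIL, TOY_TRUNC, TOY_ROUND };

/* Produce a quiet NaN without <math.h>; volatile blocks constant folding.  */
double
toy_nan (void)
{
  volatile double zero = 0.0;
  return zero / zero;
}

/* Return true if a call to FN with constant ARG would be folded.  ERRNO_MATH
   plays the role of flag_errno_math.  */
bool
toy_can_fold (enum toy_round_fn fn, double arg, bool errno_math)
{
  bool is_nan = arg != arg;	/* true only for NaN under IEEE semantics.  */

  if (fn == TOY_TRUNC)
    return true;
  return !is_nan || !errno_math;
}
```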
Index: gcc/gencfn-macros.c
===================================================================
--- gcc/gencfn-macros.c	(revision 254846)
+++ gcc/gencfn-macros.c	(working copy)
@@ -94,13 +94,15 @@  is_group (string_set *builtins, const ch
 
 /* Print a macro for all combined functions related to NAME, with the
    null-terminated list of suffixes in SUFFIXES.  INTERNAL_P says whether
-   CFN_<NAME> also exists.  */
+   CFN_<NAME> also exists.  FLOATN is a suffix to the operator name, blank
+   for normal operators, "_FN" for _Float<N>/_Float<N>X operators only, and
+   "_ALL" for both the traditional operators and the _Float<N>/_Float<N>X
+   operators.  */
 
 static void
 print_case_cfn (const char *name, bool internal_p,
-		const char *const *suffixes, bool floatn_p)
+		const char *const *suffixes, const char *floatn)
 {
-  const char *floatn = (floatn_p) ? "_FN" : "";
   printf ("#define CASE_CFN_%s%s", name, floatn);
   if (internal_p)
     printf (" \\\n  case CFN_%s%s", name, floatn);
@@ -110,15 +112,18 @@  print_case_cfn (const char *name, bool i
   printf ("\n");
 }
 
-/* Print an operator list for all combined functions related to NAME,
-   with the null-terminated list of suffixes in SUFFIXES.  INTERNAL_P
-   says whether CFN_<NAME> also exists.  */
+/* Print an operator list for all combined functions related to NAME, with the
+   null-terminated list of suffixes in SUFFIXES.  INTERNAL_P says whether
+   CFN_<NAME> also exists.  FLOATN is a suffix to the operator name, blank
+   for normal operators, "_FN" for _Float<N>/_Float<N>X operators only, and
+   "_ALL" for both the traditional operators and the _Float<N>/_Float<N>X
+   operators.  */
 
 static void
 print_define_operator_list (const char *name, bool internal_p,
-			    const char *const *suffixes, bool floatn_p)
+			    const char *const *suffixes,
+			    const char *floatn)
 {
-  const char *floatn = (floatn_p) ? "_FN" : "";
   printf ("(define_operator_list %s%s\n", name, floatn);
   for (unsigned int i = 0; suffixes[i]; ++i)
     printf ("    BUILT_IN_%s%s\n", name, suffixes[i]);
@@ -152,6 +157,9 @@  const char *const internal_fn_int_names[
 static const char *const flt_suffixes[] = { "F", "", "L", NULL };
 static const char *const fltfn_suffixes[] = { "F16", "F32", "F64", "F128",
 					      "F32X", "F64X", "F128X", NULL };
+static const char *const fltall_suffixes[] = { "F", "", "L", "F16", "F32",
+					       "F64", "F128", "F32X", "F64X",
+					       "F128X", NULL };
 static const char *const int_suffixes[] = { "", "L", "LL", "IMAX", NULL };
 
 static const char *const *const suffix_lists[] = {
@@ -212,22 +220,31 @@  main (int argc, char **argv)
 		  bool internal_p = internal_fns.contains (root);
 
 		  if (type == 'c')
-		    print_case_cfn (root, internal_p, suffix, false);
+		    print_case_cfn (root, internal_p, suffix, "");
 		  else
-		    print_define_operator_list (root, internal_p,
-						suffix, false);
+		    print_define_operator_list (root, internal_p, suffix, "");
 
 		      /* Support the _Float<N> and _Float<N>X math functions if
-			 they exist.  We put these out as a separate CFN macro,
-			 so code can add support or not as needed.  */
+			 they exist.  We put these out as a separate CFN or
+			 operator macro, so code can add support or not as
+			 needed.  We also put out a combined CFN or operator
+			 macro that includes both the traditional names and the
+			 _Float<N> and _Float<N>X versions.  */
 		  if (suffix == flt_suffixes
 		      && is_group (&builtins, root, fltfn_suffixes))
 		    {
 		      if (type == 'c')
-			print_case_cfn (root, false, fltfn_suffixes, true);
+			{
+			  print_case_cfn (root, false, fltfn_suffixes, "_FN");
+			  print_case_cfn (root, false, fltall_suffixes, "_ALL");
+			}
 		      else
-			print_define_operator_list (root, false, fltfn_suffixes,
-						    true);
+			{
+			  print_define_operator_list (root, false,
+						      fltfn_suffixes, "_FN");
+			  print_define_operator_list (root, internal_p,
+						      fltall_suffixes, "_ALL");
+			}
 		    }
 		}
 	    }
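
To make the three macro flavors concrete, the following stand-alone sketch
builds the text of a CASE_CFN_<NAME><suffix> definition in a buffer.  The two
suffix tables are copied from the patch.  The function sketch_case_cfn is
hypothetical, and the body of its loop is my assumption about the part of
print_case_cfn that is not visible in the hunk (it behaves as if internal_p
were false).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Suffix tables as they appear in the patch.  */
const char *const fltfn_suffixes[] = { "F16", "F32", "F64", "F128",
				       "F32X", "F64X", "F128X", NULL };
const char *const fltall_suffixes[] = { "F", "", "L", "F16", "F32",
					"F64", "F128", "F32X", "F64X",
					"F128X", NULL };

/* Return the text of "#define CASE_CFN_<NAME><FLOATN> ..." in a static
   buffer.  One case label is emitted per suffix; every label except the
   last is followed by a colon.  */
const char *
sketch_case_cfn (const char *name, const char *const *suffixes,
		 const char *floatn)
{
  static char buf[512];
  size_t used;
  int n = snprintf (buf, sizeof buf, "#define CASE_CFN_%s%s", name, floatn);

  assert (n >= 0 && (size_t) n < sizeof buf);
  used = (size_t) n;
  for (unsigned int i = 0; suffixes[i]; ++i)
    {
      n = snprintf (buf + used, sizeof buf - used,
		    "%s \\\n  case CFN_BUILT_IN_%s%s",
		    i > 0 ? ":" : "", name, suffixes[i]);
      assert (n >= 0 && (size_t) n < sizeof buf - used);
      used += (size_t) n;
    }
  return buf;
}
```

Calling sketch_case_cfn ("FLOOR", fltall_suffixes, "_ALL") yields a definition
whose labels start with CFN_BUILT_IN_FLOORF, CFN_BUILT_IN_FLOOR and
CFN_BUILT_IN_FLOORL and then continue with the _Float<N> and _Float<N>X
variants, while passing fltfn_suffixes and "_FN" yields only the latter.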
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 254846)
+++ gcc/match.pd	(working copy)
@@ -191,21 +191,21 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Transform X * copysign (1.0, X) into abs(X). */
 (simplify
- (mult:c @0 (COPYSIGN real_onep @0))
+ (mult:c @0 (COPYSIGN_ALL real_onep @0))
  (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type))
   (abs @0)))
 
 /* Transform X * copysign (1.0, -X) into -abs(X). */
 (simplify
- (mult:c @0 (COPYSIGN real_onep (negate @0)))
+ (mult:c @0 (COPYSIGN_ALL real_onep (negate @0)))
  (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type))
   (negate (abs @0))))
 
 /* Transform copysign (CST, X) into copysign (ABS(CST), X). */
 (simplify
- (COPYSIGN REAL_CST@0 @1)
+ (COPYSIGN_ALL REAL_CST@0 @1)
  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@0)))
-  (COPYSIGN (negate @0) @1)))
+  (COPYSIGN_ALL (negate @0) @1)))
 
 /* X * 1, X / 1 -> X.  */
 (for op (mult trunc_div ceil_div floor_div round_div exact_div)
@@ -534,7 +534,7 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (hypots @0 (op @1))
    (hypots @0 @1)))
  /* copysign(-x, y) and copysign(abs(x), y) -> copysign(x, y).  */
- (for copysigns (COPYSIGN)
+ (for copysigns (COPYSIGN_ALL)
   (simplify
    (copysigns (op @0) @1)
    (copysigns @0 @1))))
@@ -579,7 +579,7 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (hypots @0 @1)))
 
 /* copysign(x, CST) -> [-]abs (x).  */
-(for copysigns (COPYSIGN)
+(for copysigns (COPYSIGN_ALL)
  (simplify
   (copysigns @0 REAL_CST@1)
   (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))
@@ -587,13 +587,13 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (abs @0))))
 
 /* copysign(copysign(x, y), z) -> copysign(x, z).  */
-(for copysigns (COPYSIGN)
+(for copysigns (COPYSIGN_ALL)
  (simplify
   (copysigns (copysigns @0 @1) @2)
   (copysigns @0 @2)))
 
 /* copysign(x,y)*copysign(x,y) -> x*x.  */
-(for copysigns (COPYSIGN)
+(for copysigns (COPYSIGN_ALL)
  (simplify
   (mult (copysigns@2 @0 @1) @2)
   (mult @0 @0)))
@@ -1802,7 +1802,7 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* Simplifications of MIN_EXPR, MAX_EXPR, fmin() and fmax().  */
 
-(for minmax (min max FMIN FMIN_FN FMAX FMAX_FN)
+(for minmax (min max FMIN_ALL FMAX_ALL)
  (simplify
   (minmax @0 @0)
   @0))
@@ -1880,7 +1880,7 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
        && TYPE_PRECISION (TREE_TYPE (@0)) > TYPE_PRECISION (type))
    (minmax @1 (convert @2)))))
 
-(for minmax (FMIN FMIN_FN FMAX FMAX_FN)
+(for minmax (FMIN_ALL FMAX_ALL)
  /* If either argument is NaN, return the other one.  Avoid the
     transformation if we get (and honor) a signalling NaN.  */
  (simplify
@@ -1895,20 +1895,14 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    worry about it either.  */
 (if (flag_finite_math_only)
  (simplify
-  (FMIN @0 @1)
+  (FMIN_ALL @0 @1)
   (min @0 @1))
  (simplify
-  (FMIN_FN @0 @1)
-  (min @0 @1))
- (simplify
-  (FMAX @0 @1)
-  (max @0 @1))
- (simplify
-  (FMAX_FN @0 @1)
+  (FMAX_ALL @0 @1)
   (max @0 @1)))
 /* min (-A, -B) -> -max (A, B)  */
-(for minmax (min max FMIN FMIN_FN FMAX FMAX_FN)
-     maxmin (max min FMAX FMAX_FN FMIN FMAX_FN)
+(for minmax (min max FMIN_ALL FMAX_ALL)
+     maxmin (max min FMAX_ALL FMIN_ALL)
  (simplify
   (minmax (negate:s@2 @0) (negate:s@3 @1))
   (if (FLOAT_TYPE_P (TREE_TYPE (@0))
@@ -3697,7 +3691,7 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (flag_unsafe_math_optimizations)
  /* Simplify sqrt(x) * sqrt(x) -> x.  */
  (simplify
-  (mult (SQRT@1 @0) @1)
+  (mult (SQRT_ALL@1 @0) @1)
   (if (!HONOR_SNANS (type))
    @0))
 
@@ -3850,12 +3844,12 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (abs @0))
 
 /* trunc(trunc(x)) -> trunc(x), etc.  */
-(for fns (TRUNC FLOOR CEIL ROUND NEARBYINT RINT)
+(for fns (TRUNC_ALL FLOOR_ALL CEIL_ALL ROUND_ALL NEARBYINT_ALL RINT_ALL)
  (simplify
   (fns (fns @0))
   (fns @0)))
 /* f(x) -> x if x is integer valued and f does nothing for such values.  */
-(for fns (TRUNC FLOOR CEIL ROUND NEARBYINT RINT)
+(for fns (TRUNC_ALL FLOOR_ALL CEIL_ALL ROUND_ALL NEARBYINT_ALL RINT_ALL)
  (simplify
   (fns integer_valued_real_p@0)
   @0))
@@ -3872,12 +3866,12 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 (simplify
  /* copysign(x,x) -> x.  */
- (COPYSIGN @0 @0)
+ (COPYSIGN_ALL @0 @0)
  @0)
 
 (simplify
  /* copysign(x,y) -> fabs(x) if y is nonnegative.  */
- (COPYSIGN @0 tree_expr_nonnegative_p@1)
+ (COPYSIGN_ALL @0 tree_expr_nonnegative_p@1)
  (abs @0))
 
 (for scale (LDEXP SCALBN SCALBLN)
@@ -4028,8 +4022,8 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 (if (canonicalize_math_p ())
  /* floor(x) -> trunc(x) if x is nonnegative.  */
- (for floors (FLOOR)
-      truncs (TRUNC)
+ (for floors (FLOOR_ALL)
+      truncs (TRUNC_ALL)
   (simplify
    (floors tree_expr_nonnegative_p@0)
    (truncs @0))))
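
The copysign rewrites that now also apply to the _Float<N> variants can be
sanity-checked numerically.  The sketch below uses a bit-level copysign on
double so that it needs neither libm nor _Float128 support; toy_copysign and
toy_copysign_identities_hold are hypothetical helpers, and the code assumes
the host uses IEEE binary64 for double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return MAG with its sign bit replaced by the sign bit of SGN.  Assumes
   double is IEEE binary64 with the sign in bit 63.  */
double
toy_copysign (double mag, double sgn)
{
  const uint64_t sign_bit = UINT64_C (1) << 63;
  uint64_t m, s;

  memcpy (&m, &mag, sizeof m);
  memcpy (&s, &sgn, sizeof s);
  m = (m & ~sign_bit) | (s & sign_bit);
  memcpy (&mag, &m, sizeof m);
  return mag;
}

/* Return nonzero if the rewrites hold for X, Y and Z (NaNs not expected).  */
int
toy_copysign_identities_hold (double x, double y, double z)
{
  /* copysign(x,x) -> x.  */
  return (toy_copysign (x, x) == x
	  /* copysign(-x, y) -> copysign(x, y).  */
	  && toy_copysign (-x, y) == toy_copysign (x, y)
	  /* copysign(copysign(x, y), z) -> copysign(x, z).  */
	  && toy_copysign (toy_copysign (x, y), z) == toy_copysign (x, z)
	  /* copysign(x,y)*copysign(x,y) -> x*x.  */
	  && toy_copysign (x, y) * toy_copysign (x, y) == x * x);
}
```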
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 254846)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -14733,6 +14733,47 @@  (define_insn_and_split "floatuns<QHI:mod
    (set_attr "type" "vecfloat")
    (set_attr "size" "128")])
 
+;; IEEE 128-bit round to integer built-in functions
+(define_insn "floor<mode>2"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
+	 UNSPEC_FRIM))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsrqpi 1,%0,%1,3"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "ceil<mode>2"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
+	 UNSPEC_FRIP))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsrqpi 1,%0,%1,2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "btrunc<mode>2"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
+	 UNSPEC_FRIZ))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsrqpi 1,%0,%1,1"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "round<mode>2"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
+	 UNSPEC_FRIN))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsrqpi 0,%0,%1,0"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 ;; IEEE 128-bit instructions with round to odd semantics
 (define_insn "add<mode>3_odd"
   [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
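
The first and last operands of xsrqpi (R and RMC) select the rounding mode.
As I understand the Power ISA 3.0 description, R=0/RMC=0 is round to nearest
with ties away from zero, and R=1 with RMC=0, 1, 2, 3 selects nearest-even,
toward zero, toward +infinity and toward -infinity respectively.  The sketch
below is a toy model of that table on double, written without libm; toy_trunc
and toy_xsrqpi are hypothetical names, the model is only valid for values
that fit in long long, and the FPSCR-directed encoding is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Truncate toward zero; only valid while |x| fits in long long.  */
static double
toy_trunc (double x)
{
  return (double) (long long) x;
}

/* Round X to an integral value using the mode selected by R and RMC.  */
double
toy_xsrqpi (int r, int rmc, double x)
{
  double t = toy_trunc (x);		/* integer part, toward zero.  */
  double frac = x - t;			/* same sign as x, |frac| < 1.  */
  double away = frac > 0 ? t + 1.0 : frac < 0 ? t - 1.0 : t;
  bool tie = frac == 0.5 || frac == -0.5;
  bool past_half = frac > 0.5 || frac < -0.5;

  if (r == 0 && rmc == 0)		/* nearest, ties away (C round).  */
    return (tie || past_half) ? away : t;
  if (r == 1 && rmc == 0)		/* nearest, ties to even.  */
    {
      if (tie)
	return ((long long) t % 2 == 0) ? t : away;
      return past_half ? away : t;
    }
  if (r == 1 && rmc == 1)		/* toward zero (C trunc).  */
    return t;
  if (r == 1 && rmc == 2)		/* toward +infinity (C ceil).  */
    return frac > 0 ? t + 1.0 : t;
  if (r == 1 && rmc == 3)		/* toward -infinity (C floor).  */
    return frac < 0 ? t - 1.0 : t;

  assert (!"encoding not modeled in this sketch");
  return x;
}
```

For example, toy_xsrqpi (1, 0, 2.5) gives 2.0 (ties to even) whereas
toy_xsrqpi (0, 0, 2.5) gives 3.0 (ties away), which is the difference between
rint-style and round-style behavior on a tie.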
Index: gcc/c/c-decl.c
===================================================================
--- gcc/c/c-decl.c	(revision 254846)
+++ gcc/c/c-decl.c	(working copy)
@@ -3162,6 +3162,7 @@  header_for_builtin_fn (enum built_in_fun
     CASE_FLT_FN (BUILT_IN_ATAN2):
     CASE_FLT_FN (BUILT_IN_CBRT):
     CASE_FLT_FN (BUILT_IN_CEIL):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_CEIL):
     CASE_FLT_FN (BUILT_IN_COPYSIGN):
     CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
     CASE_FLT_FN (BUILT_IN_COS):
@@ -3175,6 +3176,7 @@  header_for_builtin_fn (enum built_in_fun
     CASE_FLT_FN_FLOATN_NX (BUILT_IN_FABS):
     CASE_FLT_FN (BUILT_IN_FDIM):
     CASE_FLT_FN (BUILT_IN_FLOOR):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_FLOOR):
     CASE_FLT_FN (BUILT_IN_FMA):
     CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
     CASE_FLT_FN (BUILT_IN_FMAX):
@@ -3199,13 +3201,16 @@  header_for_builtin_fn (enum built_in_fun
     CASE_FLT_FN (BUILT_IN_MODF):
     CASE_FLT_FN (BUILT_IN_NAN):
     CASE_FLT_FN (BUILT_IN_NEARBYINT):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_NEARBYINT):
     CASE_FLT_FN (BUILT_IN_NEXTAFTER):
     CASE_FLT_FN (BUILT_IN_NEXTTOWARD):
     CASE_FLT_FN (BUILT_IN_POW):
     CASE_FLT_FN (BUILT_IN_REMAINDER):
     CASE_FLT_FN (BUILT_IN_REMQUO):
     CASE_FLT_FN (BUILT_IN_RINT):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_RINT):
     CASE_FLT_FN (BUILT_IN_ROUND):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_ROUND):
     CASE_FLT_FN (BUILT_IN_SCALBLN):
     CASE_FLT_FN (BUILT_IN_SCALBN):
     CASE_FLT_FN (BUILT_IN_SIN):
@@ -3217,6 +3222,7 @@  header_for_builtin_fn (enum built_in_fun
     CASE_FLT_FN (BUILT_IN_TANH):
     CASE_FLT_FN (BUILT_IN_TGAMMA):
     CASE_FLT_FN (BUILT_IN_TRUNC):
+    CASE_FLT_FN_FLOATN_NX (BUILT_IN_TRUNC):
     case BUILT_IN_ISINF:
     case BUILT_IN_ISNAN:
       return "<math.h>";
Index: gcc/testsuite/gcc.target/powerpc/float128-hw2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw2.c	(revision 254846)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw2.c	(working copy)
@@ -14,6 +14,10 @@ 
 extern _Float128 copysignf128 (_Float128, _Float128);
 extern _Float128 sqrtf128 (_Float128);
 extern _Float128 fmaf128 (_Float128, _Float128, _Float128);
+extern _Float128 ceilf128 (_Float128);
+extern _Float128 floorf128 (_Float128);
+extern _Float128 truncf128 (_Float128);
+extern _Float128 roundf128 (_Float128);
 
 _Float128
 do_copysign (_Float128 a, _Float128 b)
@@ -51,10 +55,35 @@  do_nfms (_Float128 a, _Float128 b, _Floa
   return -fmaf128 (a, b, -c);
 }
 
+_Float128
+do_ceil (_Float128 a)
+{
+  return ceilf128 (a);
+}
+
+_Float128
+do_floor (_Float128 a)
+{
+  return floorf128 (a);
+}
+
+_Float128
+do_trunc (_Float128 a)
+{
+  return truncf128 (a);
+}
+
+_Float128
+do_round (_Float128 a)
+{
+  return roundf128 (a);
+}
+
 /* { dg-final { scan-assembler     {\mxscpsgnqp\M} } } */
 /* { dg-final { scan-assembler     {\mxssqrtqp\M}  } } */
 /* { dg-final { scan-assembler     {\mxsmaddqp\M}  } } */
 /* { dg-final { scan-assembler     {\mxsmsubqp\M}  } } */
 /* { dg-final { scan-assembler     {\mxsnmaddqp\M} } } */
 /* { dg-final { scan-assembler     {\mxsnmsubqp\M} } } */
+/* { dg-final { scan-assembler     {\mxsrqpi\M}    } } */
 /* { dg-final { scan-assembler-not {\mbl\M}        } } */
Index: gcc/testsuite/gcc.target/powerpc/float128-hw5.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw5.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw5.c	(revision 0)
@@ -0,0 +1,33 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2 -ffast-math" } */
+
+extern _Float128 copysignf128 (_Float128, _Float128);
+
+/* Check copysign optimizations that are done for double are also done for
+   _Float128.  */
+
+_Float128
+x_times_cs_one_negx (_Float128 x)
+{
+  return x * copysignf128 (1.0Q, -x);	/* XSNABSQP  */
+}
+
+_Float128
+x_times_cs_one_x (_Float128 x)
+{
+  return x * copysignf128 (1.0Q, x);	/* XSABSQP  */
+}
+
+_Float128
+cs_x_x (_Float128 x)
+{
+  return copysignf128 (x, x);		/* no operation.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxsabsqp\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mxsnabsqp\M} 1 } } */
+/* { dg-final { scan-assembler-not   {\mxscpsgnqp\M}  } } */
+/* { dg-final { scan-assembler-not   {\mlxvx\M}       } } */
+/* { dg-final { scan-assembler-not   {\mlxv\M}        } } */
+/* { dg-final { scan-assembler-not   {\mbl\M}         } } */
Index: gcc/testsuite/gcc.target/powerpc/float128-hw6.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw6.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw6.c	(revision 0)
@@ -0,0 +1,26 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+extern _Float128 fabsf128 (_Float128);
+extern _Float128 copysignf128 (_Float128, _Float128);
+
+/* Check copysign optimizations that are done for double are also done for
+   _Float128.  */
+
+_Float128
+cs_negx_y (_Float128 x, _Float128 y)
+{
+  return copysignf128 (-x, y);			/* eliminate negation.  */
+}
+
+_Float128
+cs_absx_y (_Float128 x, _Float128 y)
+{
+  return copysignf128 (fabsf128 (x), y);	/* eliminate fabsf128.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxscpsgnqp\M} 2 } } */
+/* { dg-final { scan-assembler-not   {\mxsnegqp\M}     } } */
+/* { dg-final { scan-assembler-not   {\mxsabsqp\M}     } } */
+/* { dg-final { scan-assembler-not   {\mbl\M}          } } */
Index: gcc/testsuite/gcc.target/powerpc/float128-hw7.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw7.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw7.c	(revision 0)
@@ -0,0 +1,27 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+extern _Float128 fabsf128 (_Float128);
+extern _Float128 copysignf128 (_Float128, _Float128);
+
+/* Check copysign optimizations that are done for double are also done for
+   _Float128.  */
+
+_Float128
+cs_x_pos1 (_Float128 x)
+{
+  return copysignf128 (x, 1.0Q);		/* XSABSQP.  */
+}
+
+_Float128 cs_x_neg2 (_Float128 x)
+{
+  return copysignf128 (x, -2.0Q);		/* XSNABSQP.  */
+}
+
+/* { dg-final { scan-assembler-times {\mxsabsqp\M}  1 } } */
+/* { dg-final { scan-assembler-times {\mxsnabsqp\M} 1 } } */
+/* { dg-final { scan-assembler-not   {\mxscpsgnqp\M}  } } */
+/* { dg-final { scan-assembler-not   {\mlxvx\M}       } } */
+/* { dg-final { scan-assembler-not   {\mlxv\M}        } } */
+/* { dg-final { scan-assembler-not   {\mbl\M}         } } */
Index: gcc/testsuite/gcc.target/powerpc/float128-hw8.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw8.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw8.c	(revision 0)
@@ -0,0 +1,24 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+extern _Float128 fminf128 (_Float128, _Float128);
+extern _Float128 fmaxf128 (_Float128, _Float128);
+
+/* Check min/max optimizations that are done for double are also done for
+   _Float128.  */
+
+_Float128
+min_x_x (_Float128 x)
+{
+  return fminf128 (x, x);
+}
+
+_Float128
+max_x_x (_Float128 x)
+{
+  return fmaxf128 (x, x);
+}
+
+/* { dg-final { scan-assembler-not {\mxscmpuqp\M} } } */
+/* { dg-final { scan-assembler-not {\mbl\M}       } } */
Index: gcc/testsuite/gcc.target/powerpc/float128-hw9.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw9.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw9.c	(revision 0)
@@ -0,0 +1,17 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2 -ffast-math" } */
+
+extern _Float128 sqrtf128 (_Float128);
+
+/* Check sqrt optimizations that are done for double are also done for
+   _Float128.  */
+
+_Float128
+sqrt_x_times_sqrt_x (_Float128 x)
+{
+  return sqrtf128 (x) * sqrtf128 (x);
+}
+
+/* { dg-final { scan-assembler-not {\mxssqrtqp\M} } } */
+/* { dg-final { scan-assembler-not {\mbl\M}       } } */
Index: gcc/testsuite/gcc.target/powerpc/float128-hw10.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw10.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw10.c	(revision 0)
@@ -0,0 +1,38 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+extern _Float128 floorf128 (_Float128);
+extern _Float128 ceilf128 (_Float128);
+extern _Float128 roundf128 (_Float128);
+extern _Float128 truncf128 (_Float128);
+
+/* Check rounding optimizations that are done for double are also done for
+   _Float128.  */
+
+_Float128
+floor_floor_x (_Float128 x)
+{
+  return floorf128 (floorf128 (x));
+}
+
+_Float128
+ceil_ceil_x (_Float128 x)
+{
+  return ceilf128 (ceilf128 (x));
+}
+
+_Float128
+trunc_trunc_x (_Float128 x)
+{
+  return truncf128 (truncf128 (x));
+}
+
+_Float128
+round_round_x (_Float128 x)
+{
+  return roundf128 (roundf128 (x));
+}
+
+/* { dg-final { scan-assembler-times {\mxsrqpi\M} 4 } } */
+/* { dg-final { scan-assembler-not   {\mbl\M}       } } */