
PowerPC: Update IEEE 128-bit built-ins for long double is IEEE 128-bit.

Message ID 20201022220938.GA12075@ibm-toto.the-meissners.org
State New

Commit Message

Michael Meissner Oct. 22, 2020, 10:09 p.m. UTC
PowerPC: Update IEEE 128-bit built-ins for long double is IEEE 128-bit.

I have split all of these patches into separate patches to hopefully get them
into the tree.

This patch adds long double variants of the power10 __float128 built-in
functions.  This is needed when long double uses IEEE 128-bit because
__float128 uses TFmode in this case instead of KFmode.  If this patch is not
applied, these built-in functions can't be used when long double is IEEE
128-bit.

I have tested this patch with bootstrap builds on a little endian power9 system
running Linux.  With the other patches, I have built two full bootstrap builds
using this patch and the patches after this patch.  One build used the current
default for long double (IBM extended double) and the other build switched the
default to IEEE 128-bit.  I used the Advance Toolchain AT 14.0 compiler as the
library used by this compiler.  There are no regressions between the tests.
There are 3 Fortran benchmarks (ieee/large_2.f90, default_format_2.f90, and
default_format_denormal_2.f90) that now pass.

Can I install this into the trunk?

We have gotten some requests to back port these changes to GCC 10.x.  At the
moment, I am not planning to do the back port, but I may need to in the future.

gcc/
2020-10-22  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-call.c (altivec_overloaded_builtins): Add
	built-in functions for long double built-ins that use IEEE
	128-bit.
	(rs6000_expand_builtin): Change the KF IEEE 128-bit comparison
	insns to TF if long double is IEEE 128-bit.
	* config/rs6000/rs6000-builtin.def (scalar_extract_exptf): Add
	support for long double being IEEE 128-bit built-in functions.
	(scalar_extract_sigtf): Likewise.
	(scalar_test_neg_tf): Likewise.
	(scalar_insert_exp_tf): Likewise.
	(scalar_insert_exp_tfp): Likewise.
	(scalar_cmp_exp_tf_gt): Likewise.
	(scalar_cmp_exp_tf_lt): Likewise.
	(scalar_cmp_exp_tf_eq): Likewise.
	(scalar_cmp_exp_tf_unordered): Likewise.
	(scalar_test_data_class_tf): Likewise.
---
 gcc/config/rs6000/rs6000-builtin.def | 11 ++++++++
 gcc/config/rs6000/rs6000-call.c      | 40 ++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

Comments

will schmidt Oct. 27, 2020, 2:38 p.m. UTC | #1
On Thu, 2020-10-22 at 18:09 -0400, Michael Meissner via Gcc-patches wrote:
> PowerPC: Update IEEE 128-bit built-ins for long double is IEEE 128-bit.

"for when .."

> 
> I have split all of these patches into separate patches to hopefully get them
> into the tree.
> 
> This patch adds long double variants of the power10 __float128 built-in
> functions.  This is needed when long double uses IEEE 128-bit because
> __float128 uses TFmode in this case instead of KFmode.  If this patch is not
> applied, these built-in functions can't be used when long double is IEEE
> 128-bit.
> 
> I have tested this patch with bootstrap builds on a little endian power9 system
> running Linux.  With the other patches, I have built two full bootstrap builds
> using this patch and the patches after this patch.  One build used the current
> default for long double (IBM extended double) and the other build switched the
> default to IEEE 128-bit.  I used the Advance Toolchain AT 14.0 compiler as the
> library used by this compiler.  There are no regressions between the tests.
> There are 3 Fortran benchmarks (ieee/large_2.f90, default_format_2.f90, and
> default_format_denormal_2.f90) that now pass.
> 
> Can I install this into the trunk?
> 
> We have gotten some requests to back port these changes to GCC 10.x.  At the
> moment, I am not planning to do the back port, but I may need to in the future.
> 



> gcc/
> 2020-10-22  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000-call.c (altivec_overloaded_builtins): Add
> 	built-in functions for long double built-ins that use IEEE
> 	128-bit.
> 	(rs6000_expand_builtin): Change the KF IEEE 128-bit comparison
> 	insns to TF if long double is IEEE 128-bit.
> 	* config/rs6000/rs6000-builtin.def (scalar_extract_exptf): Add
> 	support for long double being IEEE 128-bit built-in functions.
> 	(scalar_extract_sigtf): Likewise.
> 	(scalar_test_neg_tf): Likewise.
> 	(scalar_insert_exp_tf): Likewise.
> 	(scalar_insert_exp_tfp): Likewise.
> 	(scalar_cmp_exp_tf_gt): Likewise.
> 	(scalar_cmp_exp_tf_lt): Likewise.
> 	(scalar_cmp_exp_tf_eq): Likewise.
> 	(scalar_cmp_exp_tf_unordered): Likewise.
> 	(scalar_test_data_class_tf): Likewise.
> ---
>  gcc/config/rs6000/rs6000-builtin.def | 11 ++++++++
>  gcc/config/rs6000/rs6000-call.c      | 40 ++++++++++++++++++++++++++++
>  2 files changed, 51 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
> index 3eb55f0ae43..6f5685bf697 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -2401,8 +2401,11 @@ BU_P9V_64BIT_VSX_1 (VSESDP,	"scalar_extract_sig",	CONST,	xsxsigdp)
> 
>  BU_FLOAT128_HW_VSX_1 (VSEEQP,	"scalar_extract_expq",	CONST,	xsxexpqp_kf)
>  BU_FLOAT128_HW_VSX_1 (VSESQP,	"scalar_extract_sigq",	CONST,	xsxsigqp_kf)
> +BU_FLOAT128_HW_VSX_1 (VSEETF,	"scalar_extract_exptf",	CONST,	xsxexpqp_tf)
> +BU_FLOAT128_HW_VSX_1 (VSESTF,	"scalar_extract_sigtf",	CONST,	xsxsigqp_tf)
> 
>  BU_FLOAT128_HW_VSX_1 (VSTDCNQP, "scalar_test_neg_qp",	CONST,	xststdcnegqp_kf)
> +BU_FLOAT128_HW_VSX_1 (VSTDCNTF, "scalar_test_neg_tf",	CONST,	xststdcnegqp_tf)
>  BU_P9V_VSX_1 (VSTDCNDP,	"scalar_test_neg_dp",	CONST,	xststdcnegdp)
>  BU_P9V_VSX_1 (VSTDCNSP,	"scalar_test_neg_sp",	CONST,	xststdcnegsp)
> 
> @@ -2420,6 +2423,8 @@ BU_P9V_64BIT_VSX_2 (VSIEDPF,	"scalar_insert_exp_dp",	CONST,	xsiexpdpf)
> 
>  BU_FLOAT128_HW_VSX_2 (VSIEQP,	"scalar_insert_exp_q",	CONST,	xsiexpqp_kf)
>  BU_FLOAT128_HW_VSX_2 (VSIEQPF,	"scalar_insert_exp_qp",	CONST,	xsiexpqpf_kf)
> +BU_FLOAT128_HW_VSX_2 (VSIETF,	"scalar_insert_exp_tf",	CONST,	xsiexpqp_tf)
> +BU_FLOAT128_HW_VSX_2 (VSIETFF,	"scalar_insert_exp_tfp", CONST,	xsiexpqpf_tf)

Ok if it's ok, but the pattern catches my eye.  Should that be VSIETFP ?
(or named "scalar_insert_exp_tff")?


> 
>  BU_P9V_VSX_2 (VSCEDPGT,	"scalar_cmp_exp_dp_gt",	CONST,	xscmpexpdp_gt)
>  BU_P9V_VSX_2 (VSCEDPLT,	"scalar_cmp_exp_dp_lt",	CONST,	xscmpexpdp_lt)
> @@ -2431,7 +2436,13 @@ BU_P9V_VSX_2 (VSCEQPLT,	"scalar_cmp_exp_qp_lt",	CONST,	xscmpexpqp_lt_kf)
>  BU_P9V_VSX_2 (VSCEQPEQ,	"scalar_cmp_exp_qp_eq",	CONST,	xscmpexpqp_eq_kf)
>  BU_P9V_VSX_2 (VSCEQPUO,	"scalar_cmp_exp_qp_unordered",	CONST,	xscmpexpqp_unordered_kf)
> 
> +BU_P9V_VSX_2 (VSCETFGT,	"scalar_cmp_exp_tf_gt",	CONST,	xscmpexpqp_gt_tf)
> +BU_P9V_VSX_2 (VSCETFLT,	"scalar_cmp_exp_tf_lt",	CONST,	xscmpexpqp_lt_tf)
> +BU_P9V_VSX_2 (VSCETFEQ,	"scalar_cmp_exp_tf_eq",	CONST,	xscmpexpqp_eq_tf)
> +BU_P9V_VSX_2 (VSCETFUO,	"scalar_cmp_exp_tf_unordered", CONST, xscmpexpqp_unordered_tf)
> +
>  BU_FLOAT128_HW_VSX_2 (VSTDCQP, "scalar_test_data_class_qp",	CONST,	xststdcqp_kf)
> +BU_FLOAT128_HW_VSX_2 (VSTDCTF, "scalar_test_data_class_tf",	CONST,	xststdcqp_tf)
>  BU_P9V_VSX_2 (VSTDCDP,	"scalar_test_data_class_dp",	CONST,	xststdcdp)
>  BU_P9V_VSX_2 (VSTDCSP,	"scalar_test_data_class_sp",	CONST,	xststdcsp)
> 
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 9fdf97bc803..15dd99ac68d 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -4585,6 +4585,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>      RS6000_BTI_bool_int, RS6000_BTI_double, RS6000_BTI_INTSI, 0 },
>    { P9V_BUILTIN_VEC_VSTDC, P9V_BUILTIN_VSTDCQP,
>      RS6000_BTI_bool_int, RS6000_BTI_ieee128_float, RS6000_BTI_INTSI, 0 },
> +  { P9V_BUILTIN_VEC_VSTDC, P9V_BUILTIN_VSTDCTF,
> +    RS6000_BTI_bool_int, RS6000_BTI_long_double, RS6000_BTI_INTSI, 0 },
> 
>    { P9V_BUILTIN_VEC_VSTDCSP, P9V_BUILTIN_VSTDCSP,
>      RS6000_BTI_bool_int, RS6000_BTI_float, RS6000_BTI_INTSI, 0 },
> @@ -4592,6 +4594,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>      RS6000_BTI_bool_int, RS6000_BTI_double, RS6000_BTI_INTSI, 0 },
>    { P9V_BUILTIN_VEC_VSTDCQP, P9V_BUILTIN_VSTDCQP,
>      RS6000_BTI_bool_int, RS6000_BTI_ieee128_float, RS6000_BTI_INTSI, 0 },
> +  { P9V_BUILTIN_VEC_VSTDCQP, P9V_BUILTIN_VSTDCTF,
> +    RS6000_BTI_bool_int, RS6000_BTI_long_double, RS6000_BTI_INTSI, 0 },
> 
>    { P9V_BUILTIN_VEC_VSTDCN, P9V_BUILTIN_VSTDCNSP,
>      RS6000_BTI_bool_int, RS6000_BTI_float, 0, 0 },
> @@ -4599,6 +4603,8 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>      RS6000_BTI_bool_int, RS6000_BTI_double, 0, 0 },
>    { P9V_BUILTIN_VEC_VSTDCN, P9V_BUILTIN_VSTDCNQP,
>      RS6000_BTI_bool_int, RS6000_BTI_ieee128_float, 0, 0 },
> +  { P9V_BUILTIN_VEC_VSTDCN, P9V_BUILTIN_VSTDCNTF,
> +    RS6000_BTI_bool_int, RS6000_BTI_long_double, 0, 0 },
> 
>    { P9V_BUILTIN_VEC_VSTDCNSP, P9V_BUILTIN_VSTDCNSP,
>      RS6000_BTI_bool_int, RS6000_BTI_float, 0, 0 },
> @@ -4606,16 +4612,22 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>      RS6000_BTI_bool_int, RS6000_BTI_double, 0, 0 },
>    { P9V_BUILTIN_VEC_VSTDCNQP, P9V_BUILTIN_VSTDCNQP,
>      RS6000_BTI_bool_int, RS6000_BTI_ieee128_float, 0, 0 },
> +  { P9V_BUILTIN_VEC_VSTDCNQP, P9V_BUILTIN_VSTDCNTF,
> +    RS6000_BTI_bool_int, RS6000_BTI_long_double, 0, 0 },
> 
>    { P9V_BUILTIN_VEC_VSEEDP, P9V_BUILTIN_VSEEDP,
>      RS6000_BTI_UINTSI, RS6000_BTI_double, 0, 0 },
>    { P9V_BUILTIN_VEC_VSEEDP, P9V_BUILTIN_VSEEQP,
>      RS6000_BTI_UINTDI, RS6000_BTI_ieee128_float, 0, 0 },
> +  { P9V_BUILTIN_VEC_VSEEDP, P9V_BUILTIN_VSEETF,
> +    RS6000_BTI_UINTDI, RS6000_BTI_long_double, 0, 0 },
> 
>    { P9V_BUILTIN_VEC_VSESDP, P9V_BUILTIN_VSESDP,
>      RS6000_BTI_UINTDI, RS6000_BTI_double, 0, 0 },
>    { P9V_BUILTIN_VEC_VSESDP, P9V_BUILTIN_VSESQP,
>      RS6000_BTI_UINTTI, RS6000_BTI_ieee128_float, 0, 0 },
> +  { P9V_BUILTIN_VEC_VSESDP, P9V_BUILTIN_VSESTF,
> +    RS6000_BTI_UINTTI, RS6000_BTI_long_double, 0, 0 },
> 
>    { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIEDP,
>      RS6000_BTI_double, RS6000_BTI_UINTDI, RS6000_BTI_UINTDI, 0 },
> @@ -4624,25 +4636,37 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
> 
>    { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIEQP,
>      RS6000_BTI_ieee128_float, RS6000_BTI_UINTTI, RS6000_BTI_UINTDI, 0 },
> +  { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIETF,
> +    RS6000_BTI_long_double, RS6000_BTI_UINTTI, RS6000_BTI_UINTDI, 0 },
>    { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIEQPF,
>      RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, RS6000_BTI_UINTDI, 0 },
> +  { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIETFF,
> +    RS6000_BTI_long_double, RS6000_BTI_long_double, RS6000_BTI_UINTDI, 0 },
> 
>    { P9V_BUILTIN_VEC_VSCEGT, P9V_BUILTIN_VSCEDPGT,
>      RS6000_BTI_INTSI, RS6000_BTI_double, RS6000_BTI_double, 0 },
>    { P9V_BUILTIN_VEC_VSCEGT, P9V_BUILTIN_VSCEQPGT,
>      RS6000_BTI_INTSI, RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, 0 },
> +  { P9V_BUILTIN_VEC_VSCEGT, P9V_BUILTIN_VSCETFGT,
> +    RS6000_BTI_INTSI, RS6000_BTI_long_double, RS6000_BTI_long_double, 0 },
>    { P9V_BUILTIN_VEC_VSCELT, P9V_BUILTIN_VSCEDPLT,
>      RS6000_BTI_INTSI, RS6000_BTI_double, RS6000_BTI_double, 0 },
>    { P9V_BUILTIN_VEC_VSCELT, P9V_BUILTIN_VSCEQPLT,
>      RS6000_BTI_INTSI, RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, 0 },
> +  { P9V_BUILTIN_VEC_VSCELT, P9V_BUILTIN_VSCETFLT,
> +    RS6000_BTI_INTSI, RS6000_BTI_long_double, RS6000_BTI_long_double, 0 },
>    { P9V_BUILTIN_VEC_VSCEEQ, P9V_BUILTIN_VSCEDPEQ,
>      RS6000_BTI_INTSI, RS6000_BTI_double, RS6000_BTI_double, 0 },
>    { P9V_BUILTIN_VEC_VSCEEQ, P9V_BUILTIN_VSCEQPEQ,
>      RS6000_BTI_INTSI, RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, 0 },
> +  { P9V_BUILTIN_VEC_VSCEEQ, P9V_BUILTIN_VSCETFEQ,
> +    RS6000_BTI_INTSI, RS6000_BTI_long_double, RS6000_BTI_long_double, 0 },
>    { P9V_BUILTIN_VEC_VSCEUO, P9V_BUILTIN_VSCEDPUO,
>      RS6000_BTI_INTSI, RS6000_BTI_double, RS6000_BTI_double, 0 },
>    { P9V_BUILTIN_VEC_VSCEUO, P9V_BUILTIN_VSCEQPUO,
>      RS6000_BTI_INTSI, RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, 0 },
> +  { P9V_BUILTIN_VEC_VSCEUO, P9V_BUILTIN_VSCETFUO,
> +    RS6000_BTI_INTSI, RS6000_BTI_long_double, RS6000_BTI_long_double, 0 },
> 
>    { P9V_BUILTIN_VEC_XL_LEN_R, P9V_BUILTIN_XL_LEN_R,
>      RS6000_BTI_unsigned_V16QI, ~RS6000_BTI_UINTQI,
> @@ -12532,6 +12556,22 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
>        case CODE_FOR_xsiexpqp_kf:	icode = CODE_FOR_xsiexpqp_tf;	break;
>        case CODE_FOR_xsiexpqpf_kf:	icode = CODE_FOR_xsiexpqpf_tf;	break;
>        case CODE_FOR_xststdcqp_kf:	icode = CODE_FOR_xststdcqp_tf;	break;
> +
> +      case CODE_FOR_xscmpexpqp_eq_kf:
> +	icode = CODE_FOR_xscmpexpqp_eq_tf;
> +	break;
> +
> +      case CODE_FOR_xscmpexpqp_lt_kf:
> +	icode = CODE_FOR_xscmpexpqp_lt_tf;
> +	break;
> +
> +      case CODE_FOR_xscmpexpqp_gt_kf:
> +	icode = CODE_FOR_xscmpexpqp_gt_tf;
> +	break;
> +
> +      case CODE_FOR_xscmpexpqp_unordered_kf:
> +	icode = CODE_FOR_xscmpexpqp_unordered_tf;
> +	break;
>        }

ok

lgtm, thanks
-Will


> 
>    if (TARGET_DEBUG_BUILTIN)
> -- 
> 2.22.0
> 
>

Segher Boessenkool Oct. 28, 2020, 11:27 p.m. UTC | #2
On Thu, Oct 22, 2020 at 06:09:38PM -0400, Michael Meissner wrote:
> This patch adds long double variants of the power10 __float128 built-in
> functions.  This is needed when long double uses IEEE 128-bit because
> __float128 uses TFmode in this case instead of KFmode.  If this patch is not
> applied, these built-in functions can't be used when long double is IEEE
> 128-bit.

But now they still cannot, you need new builtins, instead.

TFmode is an implementation detail at this level (functions use types,
not modes), so you do not need new builtins at all afaics?  Just define
the existing ones with TFmode as well (if that is the same as KFmode)?


Segher

Michael Meissner Oct. 29, 2020, 4:47 p.m. UTC | #3
On Tue, Oct 27, 2020 at 09:38:20AM -0500, will schmidt wrote:
> > @@ -2420,6 +2423,8 @@ BU_P9V_64BIT_VSX_2 (VSIEDPF,	"scalar_insert_exp_dp",	CONST,	xsiexpdpf)
> > 
> >  BU_FLOAT128_HW_VSX_2 (VSIEQP,	"scalar_insert_exp_q",	CONST,	xsiexpqp_kf)
> >  BU_FLOAT128_HW_VSX_2 (VSIEQPF,	"scalar_insert_exp_qp",	CONST,	xsiexpqpf_kf)
> > +BU_FLOAT128_HW_VSX_2 (VSIETF,	"scalar_insert_exp_tf",	CONST,	xsiexpqp_tf)
> > +BU_FLOAT128_HW_VSX_2 (VSIETFF,	"scalar_insert_exp_tfp", CONST,	xsiexpqpf_tf)
> 
> Ok if it's ok, but the pattern catches my eye.  Should that be VSIETFP ?
> (or named "scalar_insert_exp_tff")?

That is the existing function in the library.  All I'm doing is adding TF
versions of the existing functions.

Michael Meissner Oct. 29, 2020, 4:50 p.m. UTC | #4
On Wed, Oct 28, 2020 at 06:27:42PM -0500, Segher Boessenkool wrote:
> On Thu, Oct 22, 2020 at 06:09:38PM -0400, Michael Meissner wrote:
> > This patch adds long double variants of the power10 __float128 built-in
> > functions.  This is needed when long double uses IEEE 128-bit because
> > __float128 uses TFmode in this case instead of KFmode.  If this patch is not
> > applied, these built-in functions can't be used when long double is IEEE
> > 128-bit.
> 
> But now they still cannot, you need new builtins, instead.
> 
> TFmode is an implementation detail at this level (functions use types,
> not modes), so you do not need new builtins at all afaics?  Just define
> the existing ones with TFmode as well (if that is the same as KFmode)?

In order to add new overloaded built-ins, you have to add a new built-in with a
new name.  Hence I have to add TF variants for these functions when __float128
is the same as long double.

Maybe when Bill finally reorganizes the built-in functions, we can do away
with having to create new named functions.  But for now, in order to add them,
you need a name.

Segher Boessenkool Oct. 29, 2020, 6:25 p.m. UTC | #5
On Thu, Oct 29, 2020 at 12:47:20PM -0400, Michael Meissner wrote:
> On Tue, Oct 27, 2020 at 09:38:20AM -0500, will schmidt wrote:
> > > @@ -2420,6 +2423,8 @@ BU_P9V_64BIT_VSX_2 (VSIEDPF,	"scalar_insert_exp_dp",	CONST,	xsiexpdpf)
> > > 
> > >  BU_FLOAT128_HW_VSX_2 (VSIEQP,	"scalar_insert_exp_q",	CONST,	xsiexpqp_kf)
> > >  BU_FLOAT128_HW_VSX_2 (VSIEQPF,	"scalar_insert_exp_qp",	CONST,	xsiexpqpf_kf)
> > > +BU_FLOAT128_HW_VSX_2 (VSIETF,	"scalar_insert_exp_tf",	CONST,	xsiexpqp_tf)
> > > +BU_FLOAT128_HW_VSX_2 (VSIETFF,	"scalar_insert_exp_tfp", CONST,	xsiexpqpf_tf)
> > 
> > Ok if it's ok, but the pattern catches my eye.  Should that be VSIETFP ?
> > (or named "scalar_insert_exp_tff")?
> 
> That is the existing function in the library.  All I'm doing is adding TF
> versions of the existing functions.

Sure, but logically the macro for scalar_insert_exp_tfp would be VSIETFP
(instead of VSIETFF) (and that is a new macro name fwiw).  So please fix
that?


Segher

Segher Boessenkool Oct. 29, 2020, 6:32 p.m. UTC | #6
On Thu, Oct 29, 2020 at 12:50:10PM -0400, Michael Meissner wrote:
> On Wed, Oct 28, 2020 at 06:27:42PM -0500, Segher Boessenkool wrote:
> > On Thu, Oct 22, 2020 at 06:09:38PM -0400, Michael Meissner wrote:
> > > This patch adds long double variants of the power10 __float128 built-in
> > > functions.  This is needed when long double uses IEEE 128-bit because
> > > __float128 uses TFmode in this case instead of KFmode.  If this patch is not
> > > applied, these built-in functions can't be used when long double is IEEE
> > > 128-bit.
> > 
> > But now they still cannot, you need new builtins, instead.
> > 
> > TFmode is an implementation detail at this level (functions use types,
> > not modes), so you do not need new builtins at all afaics?  Just define
> > the existing ones with TFmode as well (if that is the same as KFmode)?
> 
> In order to add new overloaded built-ins, you have to add a new built-in with a
> new name.

I do not follow?  Just delete the old non-overloaded one and add the
overloaded one with that same old name at the same time.

TF is a nasty name, it means a different thing externally (in the libgcc
function names, say: always IFmode) and internally (it varies what it
means).

> Maybe when Bill finally reorganizes the built-in functions, we can do away
> with having to create new named functions.  But for now, in order to add them,
> you need a name.

Of course.  And there already is a name :-)


Segher

Michael Meissner Oct. 29, 2020, 8:50 p.m. UTC | #7
On Thu, Oct 29, 2020 at 01:32:53PM -0500, Segher Boessenkool wrote:
> On Thu, Oct 29, 2020 at 12:50:10PM -0400, Michael Meissner wrote:
> > On Wed, Oct 28, 2020 at 06:27:42PM -0500, Segher Boessenkool wrote:
> > > On Thu, Oct 22, 2020 at 06:09:38PM -0400, Michael Meissner wrote:
> > > > This patch adds long double variants of the power10 __float128 built-in
> > > > functions.  This is needed when long double uses IEEE 128-bit because
> > > > __float128 uses TFmode in this case instead of KFmode.  If this patch is not
> > > > applied, these built-in functions can't be used when long double is IEEE
> > > > 128-bit.
> > > 
> > > But now they still cannot, you need new builtins, instead.
> > > 
> > > TFmode is an implementation detail at this level (functions use types,
> > > not modes), so you do not need new builtins at all afaics?  Just define
> > > the existing ones with TFmode as well (if that is the same as KFmode)?
> > 
> > In order to add new overloaded built-ins, you have to add a new built-in with a
> > new name.
> 
> I do not follow?  Just delete the old non-overloaded one and add the
> overloaded one with that same old name at the same time.

You have to have 3 names for the built-in function:

  * A name specific to KFmode;
  * A name specific to long double with IEEE 128-bit; (and)
  * The generic name.

And the generic name (from the user point of view, not the insn name) needs to
be the generic name.
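
As a rough illustration of the three-name scheme (a standalone model, not
GCC's actual code), the sketch below maps one generic name to a type-specific
built-in according to the argument type.  The strings VEC_VSESDP, VSESDP,
VSESQP, VSESTF, VEC_VSTDCN, VSTDCNSP, VSTDCNQP and VSTDCNTF come from the rows
in the patch; the enum, the struct and resolve_overload are hypothetical names
invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the RS6000_BTI_* argument type codes.  */
enum arg_type
{
  ARG_FLOAT,
  ARG_DOUBLE,
  ARG_IEEE128_FLOAT,	/* __float128 when it is a separate type (KFmode).  */
  ARG_LONG_DOUBLE	/* long double when it is IEEE 128-bit (TFmode).  */
};

/* One row per (generic name, argument type) pair, loosely modelled on the
   rows the patch adds to altivec_overloaded_builtins.  */
struct overload_row
{
  const char *generic;	/* The overloaded (generic) built-in.  */
  enum arg_type arg;	/* Type of the floating-point argument.  */
  const char *specific;	/* The type-specific built-in it resolves to.  */
};

static const struct overload_row overload_table[] = {
  { "VEC_VSESDP", ARG_DOUBLE,        "VSESDP" },
  { "VEC_VSESDP", ARG_IEEE128_FLOAT, "VSESQP" },
  { "VEC_VSESDP", ARG_LONG_DOUBLE,   "VSESTF" },
  { "VEC_VSTDCN", ARG_FLOAT,         "VSTDCNSP" },
  { "VEC_VSTDCN", ARG_IEEE128_FLOAT, "VSTDCNQP" },
  { "VEC_VSTDCN", ARG_LONG_DOUBLE,   "VSTDCNTF" },
};

/* Return the specific built-in for GENERIC called with an argument of type
   ARG, or NULL if the table has no matching row.  */
static const char *
resolve_overload (const char *generic, enum arg_type arg)
{
  size_t n = sizeof overload_table / sizeof overload_table[0];

  for (size_t i = 0; i < n; i++)
    if (strcmp (overload_table[i].generic, generic) == 0
	&& overload_table[i].arg == arg)
      return overload_table[i].specific;

  return NULL;
}
```

In this model the user only ever writes the generic name; the KF and TF names
exist so that each row has something distinct to point at.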

> TF is a nasty name, it means a different thing externally (in the libgcc
> function names, say: always IFmode) and internally (it varies what it
> means).
> 
> > Maybe when Bill finally reorganizes the built-in functions, we can do away
> > with having to create new named functions.  But for now, in order to add them,
> > you need a name.
> 
> Of course.  And there already is a name :-)

One name, but as I said, we will now need three insn names, and these names
spill out to the built-in names.  Even if we don't document the names, they
still exist and determined users can call them.

Patch

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 3eb55f0ae43..6f5685bf697 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2401,8 +2401,11 @@  BU_P9V_64BIT_VSX_1 (VSESDP,	"scalar_extract_sig",	CONST,	xsxsigdp)
 
 BU_FLOAT128_HW_VSX_1 (VSEEQP,	"scalar_extract_expq",	CONST,	xsxexpqp_kf)
 BU_FLOAT128_HW_VSX_1 (VSESQP,	"scalar_extract_sigq",	CONST,	xsxsigqp_kf)
+BU_FLOAT128_HW_VSX_1 (VSEETF,	"scalar_extract_exptf",	CONST,	xsxexpqp_tf)
+BU_FLOAT128_HW_VSX_1 (VSESTF,	"scalar_extract_sigtf",	CONST,	xsxsigqp_tf)
 
 BU_FLOAT128_HW_VSX_1 (VSTDCNQP, "scalar_test_neg_qp",	CONST,	xststdcnegqp_kf)
+BU_FLOAT128_HW_VSX_1 (VSTDCNTF, "scalar_test_neg_tf",	CONST,	xststdcnegqp_tf)
 BU_P9V_VSX_1 (VSTDCNDP,	"scalar_test_neg_dp",	CONST,	xststdcnegdp)
 BU_P9V_VSX_1 (VSTDCNSP,	"scalar_test_neg_sp",	CONST,	xststdcnegsp)
 
@@ -2420,6 +2423,8 @@  BU_P9V_64BIT_VSX_2 (VSIEDPF,	"scalar_insert_exp_dp",	CONST,	xsiexpdpf)
 
 BU_FLOAT128_HW_VSX_2 (VSIEQP,	"scalar_insert_exp_q",	CONST,	xsiexpqp_kf)
 BU_FLOAT128_HW_VSX_2 (VSIEQPF,	"scalar_insert_exp_qp",	CONST,	xsiexpqpf_kf)
+BU_FLOAT128_HW_VSX_2 (VSIETF,	"scalar_insert_exp_tf",	CONST,	xsiexpqp_tf)
+BU_FLOAT128_HW_VSX_2 (VSIETFF,	"scalar_insert_exp_tfp", CONST,	xsiexpqpf_tf)
 
 BU_P9V_VSX_2 (VSCEDPGT,	"scalar_cmp_exp_dp_gt",	CONST,	xscmpexpdp_gt)
 BU_P9V_VSX_2 (VSCEDPLT,	"scalar_cmp_exp_dp_lt",	CONST,	xscmpexpdp_lt)
@@ -2431,7 +2436,13 @@  BU_P9V_VSX_2 (VSCEQPLT,	"scalar_cmp_exp_qp_lt",	CONST,	xscmpexpqp_lt_kf)
 BU_P9V_VSX_2 (VSCEQPEQ,	"scalar_cmp_exp_qp_eq",	CONST,	xscmpexpqp_eq_kf)
 BU_P9V_VSX_2 (VSCEQPUO,	"scalar_cmp_exp_qp_unordered",	CONST,	xscmpexpqp_unordered_kf)
 
+BU_P9V_VSX_2 (VSCETFGT,	"scalar_cmp_exp_tf_gt",	CONST,	xscmpexpqp_gt_tf)
+BU_P9V_VSX_2 (VSCETFLT,	"scalar_cmp_exp_tf_lt",	CONST,	xscmpexpqp_lt_tf)
+BU_P9V_VSX_2 (VSCETFEQ,	"scalar_cmp_exp_tf_eq",	CONST,	xscmpexpqp_eq_tf)
+BU_P9V_VSX_2 (VSCETFUO,	"scalar_cmp_exp_tf_unordered", CONST, xscmpexpqp_unordered_tf)
+
 BU_FLOAT128_HW_VSX_2 (VSTDCQP, "scalar_test_data_class_qp",	CONST,	xststdcqp_kf)
+BU_FLOAT128_HW_VSX_2 (VSTDCTF, "scalar_test_data_class_tf",	CONST,	xststdcqp_tf)
 BU_P9V_VSX_2 (VSTDCDP,	"scalar_test_data_class_dp",	CONST,	xststdcdp)
 BU_P9V_VSX_2 (VSTDCSP,	"scalar_test_data_class_sp",	CONST,	xststdcsp)
 
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 9fdf97bc803..15dd99ac68d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -4585,6 +4585,8 @@  const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_bool_int, RS6000_BTI_double, RS6000_BTI_INTSI, 0 },
   { P9V_BUILTIN_VEC_VSTDC, P9V_BUILTIN_VSTDCQP,
     RS6000_BTI_bool_int, RS6000_BTI_ieee128_float, RS6000_BTI_INTSI, 0 },
+  { P9V_BUILTIN_VEC_VSTDC, P9V_BUILTIN_VSTDCTF,
+    RS6000_BTI_bool_int, RS6000_BTI_long_double, RS6000_BTI_INTSI, 0 },
 
   { P9V_BUILTIN_VEC_VSTDCSP, P9V_BUILTIN_VSTDCSP,
     RS6000_BTI_bool_int, RS6000_BTI_float, RS6000_BTI_INTSI, 0 },
@@ -4592,6 +4594,8 @@  const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_bool_int, RS6000_BTI_double, RS6000_BTI_INTSI, 0 },
   { P9V_BUILTIN_VEC_VSTDCQP, P9V_BUILTIN_VSTDCQP,
     RS6000_BTI_bool_int, RS6000_BTI_ieee128_float, RS6000_BTI_INTSI, 0 },
+  { P9V_BUILTIN_VEC_VSTDCQP, P9V_BUILTIN_VSTDCTF,
+    RS6000_BTI_bool_int, RS6000_BTI_long_double, RS6000_BTI_INTSI, 0 },
 
   { P9V_BUILTIN_VEC_VSTDCN, P9V_BUILTIN_VSTDCNSP,
     RS6000_BTI_bool_int, RS6000_BTI_float, 0, 0 },
@@ -4599,6 +4603,8 @@  const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_bool_int, RS6000_BTI_double, 0, 0 },
   { P9V_BUILTIN_VEC_VSTDCN, P9V_BUILTIN_VSTDCNQP,
     RS6000_BTI_bool_int, RS6000_BTI_ieee128_float, 0, 0 },
+  { P9V_BUILTIN_VEC_VSTDCN, P9V_BUILTIN_VSTDCNTF,
+    RS6000_BTI_bool_int, RS6000_BTI_long_double, 0, 0 },
 
   { P9V_BUILTIN_VEC_VSTDCNSP, P9V_BUILTIN_VSTDCNSP,
     RS6000_BTI_bool_int, RS6000_BTI_float, 0, 0 },
@@ -4606,16 +4612,22 @@  const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_bool_int, RS6000_BTI_double, 0, 0 },
   { P9V_BUILTIN_VEC_VSTDCNQP, P9V_BUILTIN_VSTDCNQP,
     RS6000_BTI_bool_int, RS6000_BTI_ieee128_float, 0, 0 },
+  { P9V_BUILTIN_VEC_VSTDCNQP, P9V_BUILTIN_VSTDCNTF,
+    RS6000_BTI_bool_int, RS6000_BTI_long_double, 0, 0 },
 
   { P9V_BUILTIN_VEC_VSEEDP, P9V_BUILTIN_VSEEDP,
     RS6000_BTI_UINTSI, RS6000_BTI_double, 0, 0 },
   { P9V_BUILTIN_VEC_VSEEDP, P9V_BUILTIN_VSEEQP,
     RS6000_BTI_UINTDI, RS6000_BTI_ieee128_float, 0, 0 },
+  { P9V_BUILTIN_VEC_VSEEDP, P9V_BUILTIN_VSEETF,
+    RS6000_BTI_UINTDI, RS6000_BTI_long_double, 0, 0 },
 
   { P9V_BUILTIN_VEC_VSESDP, P9V_BUILTIN_VSESDP,
     RS6000_BTI_UINTDI, RS6000_BTI_double, 0, 0 },
   { P9V_BUILTIN_VEC_VSESDP, P9V_BUILTIN_VSESQP,
     RS6000_BTI_UINTTI, RS6000_BTI_ieee128_float, 0, 0 },
+  { P9V_BUILTIN_VEC_VSESDP, P9V_BUILTIN_VSESTF,
+    RS6000_BTI_UINTTI, RS6000_BTI_long_double, 0, 0 },
 
   { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIEDP,
     RS6000_BTI_double, RS6000_BTI_UINTDI, RS6000_BTI_UINTDI, 0 },
@@ -4624,25 +4636,37 @@  const struct altivec_builtin_types altivec_overloaded_builtins[] = {
 
   { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIEQP,
     RS6000_BTI_ieee128_float, RS6000_BTI_UINTTI, RS6000_BTI_UINTDI, 0 },
+  { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIETF,
+    RS6000_BTI_long_double, RS6000_BTI_UINTTI, RS6000_BTI_UINTDI, 0 },
   { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIEQPF,
     RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, RS6000_BTI_UINTDI, 0 },
+  { P9V_BUILTIN_VEC_VSIEDP, P9V_BUILTIN_VSIETFF,
+    RS6000_BTI_long_double, RS6000_BTI_long_double, RS6000_BTI_UINTDI, 0 },
 
   { P9V_BUILTIN_VEC_VSCEGT, P9V_BUILTIN_VSCEDPGT,
     RS6000_BTI_INTSI, RS6000_BTI_double, RS6000_BTI_double, 0 },
   { P9V_BUILTIN_VEC_VSCEGT, P9V_BUILTIN_VSCEQPGT,
     RS6000_BTI_INTSI, RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, 0 },
+  { P9V_BUILTIN_VEC_VSCEGT, P9V_BUILTIN_VSCETFGT,
+    RS6000_BTI_INTSI, RS6000_BTI_long_double, RS6000_BTI_long_double, 0 },
   { P9V_BUILTIN_VEC_VSCELT, P9V_BUILTIN_VSCEDPLT,
     RS6000_BTI_INTSI, RS6000_BTI_double, RS6000_BTI_double, 0 },
   { P9V_BUILTIN_VEC_VSCELT, P9V_BUILTIN_VSCEQPLT,
     RS6000_BTI_INTSI, RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, 0 },
+  { P9V_BUILTIN_VEC_VSCELT, P9V_BUILTIN_VSCETFLT,
+    RS6000_BTI_INTSI, RS6000_BTI_long_double, RS6000_BTI_long_double, 0 },
   { P9V_BUILTIN_VEC_VSCEEQ, P9V_BUILTIN_VSCEDPEQ,
     RS6000_BTI_INTSI, RS6000_BTI_double, RS6000_BTI_double, 0 },
   { P9V_BUILTIN_VEC_VSCEEQ, P9V_BUILTIN_VSCEQPEQ,
     RS6000_BTI_INTSI, RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, 0 },
+  { P9V_BUILTIN_VEC_VSCEEQ, P9V_BUILTIN_VSCETFEQ,
+    RS6000_BTI_INTSI, RS6000_BTI_long_double, RS6000_BTI_long_double, 0 },
   { P9V_BUILTIN_VEC_VSCEUO, P9V_BUILTIN_VSCEDPUO,
     RS6000_BTI_INTSI, RS6000_BTI_double, RS6000_BTI_double, 0 },
   { P9V_BUILTIN_VEC_VSCEUO, P9V_BUILTIN_VSCEQPUO,
     RS6000_BTI_INTSI, RS6000_BTI_ieee128_float, RS6000_BTI_ieee128_float, 0 },
+  { P9V_BUILTIN_VEC_VSCEUO, P9V_BUILTIN_VSCETFUO,
+    RS6000_BTI_INTSI, RS6000_BTI_long_double, RS6000_BTI_long_double, 0 },
 
   { P9V_BUILTIN_VEC_XL_LEN_R, P9V_BUILTIN_XL_LEN_R,
     RS6000_BTI_unsigned_V16QI, ~RS6000_BTI_UINTQI,
@@ -12532,6 +12556,22 @@  rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
       case CODE_FOR_xsiexpqp_kf:	icode = CODE_FOR_xsiexpqp_tf;	break;
       case CODE_FOR_xsiexpqpf_kf:	icode = CODE_FOR_xsiexpqpf_tf;	break;
       case CODE_FOR_xststdcqp_kf:	icode = CODE_FOR_xststdcqp_tf;	break;
+
+      case CODE_FOR_xscmpexpqp_eq_kf:
+	icode = CODE_FOR_xscmpexpqp_eq_tf;
+	break;
+
+      case CODE_FOR_xscmpexpqp_lt_kf:
+	icode = CODE_FOR_xscmpexpqp_lt_tf;
+	break;
+
+      case CODE_FOR_xscmpexpqp_gt_kf:
+	icode = CODE_FOR_xscmpexpqp_gt_tf;
+	break;
+
+      case CODE_FOR_xscmpexpqp_unordered_kf:
+	icode = CODE_FOR_xscmpexpqp_unordered_tf;
+	break;
       }
 
   if (TARGET_DEBUG_BUILTIN)
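
The KF-to-TF remapping that the last hunk extends in rs6000_expand_builtin can
be modelled outside of GCC roughly as follows.  This is only a sketch: the
enum is a hypothetical stand-in for a handful of the real CODE_FOR_* values,
remap_ieee128_icode is an invented helper, and the long_double_is_ieee128 flag
stands in for whatever condition guards the switch in the real function.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a few of GCC's CODE_FOR_* insn codes.  */
enum insn_code
{
  CODE_FOR_xsxsigdp,
  CODE_FOR_xststdcqp_kf,
  CODE_FOR_xststdcqp_tf,
  CODE_FOR_xscmpexpqp_eq_kf,
  CODE_FOR_xscmpexpqp_eq_tf,
  CODE_FOR_xscmpexpqp_lt_kf,
  CODE_FOR_xscmpexpqp_lt_tf,
  CODE_FOR_xscmpexpqp_gt_kf,
  CODE_FOR_xscmpexpqp_gt_tf,
  CODE_FOR_xscmpexpqp_unordered_kf,
  CODE_FOR_xscmpexpqp_unordered_tf
};

/* If long double is IEEE 128-bit, swap a KFmode insn code for its TFmode
   twin; every other insn code is returned unchanged.  */
static enum insn_code
remap_ieee128_icode (enum insn_code icode, bool long_double_is_ieee128)
{
  if (!long_double_is_ieee128)
    return icode;

  switch (icode)
    {
    case CODE_FOR_xststdcqp_kf:
      return CODE_FOR_xststdcqp_tf;
    case CODE_FOR_xscmpexpqp_eq_kf:
      return CODE_FOR_xscmpexpqp_eq_tf;
    case CODE_FOR_xscmpexpqp_lt_kf:
      return CODE_FOR_xscmpexpqp_lt_tf;
    case CODE_FOR_xscmpexpqp_gt_kf:
      return CODE_FOR_xscmpexpqp_gt_tf;
    case CODE_FOR_xscmpexpqp_unordered_kf:
      return CODE_FOR_xscmpexpqp_unordered_tf;
    default:
      return icode;
    }
}
```

Codes without a _kf/_tf pair, such as the double-precision xsxsigdp, fall
through the default case untouched, which matches the shape of the switch in
the patch.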