diff mbox series

PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.

Message ID 20201022221037.GA12168@ibm-toto.the-meissners.org
State New
Headers show
Series PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit. | expand

Commit Message

Michael Meissner Oct. 22, 2020, 10:10 p.m. UTC
PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.

I have split all of these patches into separate patches to hopefully get them
into the tree.

This patch changes the __ibm128 emulator to use __builtin_pack_ieee128
instead of __builtin_pack_longdouble if long double is IEEE 128-bit, and
we need to use the __ibm128 type.  The code will run without this patch,
but this patch slightly optimizes it better.

I have tested this patch with bootstrap builds on a little endian power9 system
running Linux.  With the other patches, I have built two full bootstrap builds
using this patch and the patches after this patch.  One build used the current
default for long double (IBM extended double) and the other build switched the
default to IEEE 128-bit.  I used the Advance Toolchain AT 14.0 compiler as the
library used by this compiler.  There are no regressions between the tests.
There are 3 fortran benchmarks (ieee/large_2.f90, default_format_2.f90, and
default_format_denormal_2.f90) that now pass.

Can I install this into the trunk?

We have gotten some requests to back port these changes to GCC 10.x.  At the
moment, I am not planning to do the back port, but I may need to in the future.

libgcc/
2020-10-22  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/ibm-ldouble.c (pack_ldouble): Use
	__builtin_pack_ieee128 if long double is IEEE 128-bit.
---
 libgcc/config/rs6000/ibm-ldouble.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

will schmidt Oct. 27, 2020, 2:30 p.m. UTC | #1
On Thu, 2020-10-22 at 18:10 -0400, Michael Meissner via Gcc-patches wrote:
> PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.
> 
> I have split all of these patches into separate patches to hopefully get them
> into the tree.
> 
> This patch changes the __ibm128 emulator to use __builtin_pack_ieee128
> instead of __builtin_pack_longdouble if long double is IEEE 128-bit, and
> we need to use the __ibm128 type.  The code will run without this patch,
> but this patch slightly optimizes it better.
> 
> I have tested this patch with bootstrap builds on a little endian power9 system
> running Linux.  With the other patches, I have built two full bootstrap builds
> using this patch and the patches after this patch.  One build used the current
> default for long double (IBM extended double) and the other build switched the
> default to IEEE 128-bit.  I used the Advance Toolchain AT 14.0 compiler as the
> library used by this compiler.  There are no regressions between the tests.
> There are 3 fortran benchmarks (ieee/large_2.f90, default_format_2.f90, and
> default_format_denormal_2.f90) that now pass.

good. :-)    A quick search of gcc bugzilla shows there is an existing
PR 67531 that includes ieee rounding support for powerpc long double. 
Does this (partially?) address that? 
  
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67531


> 
> Can I install this into the trunk?
> 
> We have gotten some requests to back port these changes to GCC 10.x.  At the
> moment, I am not planning to do the back port, but I may need to in the future.
> 
> libgcc/
> 2020-10-22  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/ibm-ldouble.c (pack_ldouble): Use
> 	__builtin_pack_ieee128 if long double is IEEE 128-bit.
> ---
>  libgcc/config/rs6000/ibm-ldouble.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/libgcc/config/rs6000/ibm-ldouble.c b/libgcc/config/rs6000/ibm-ldouble.c
> index dd2a02373f2..767fdd72683 100644
> --- a/libgcc/config/rs6000/ibm-ldouble.c
> +++ b/libgcc/config/rs6000/ibm-ldouble.c
> @@ -102,9 +102,17 @@ __asm__ (".symver __gcc_qadd,_xlqadd@GCC_3.4\n\t"
>  static inline IBM128_TYPE
>  pack_ldouble (double dh, double dl)
>  {
> +  /* If we are building on a non-VSX system, the __ibm128 type is not defined.
> +     This means we can't always use __builtin_pack_ibm128.  Instead, we use
> +     __builtin_pack_longdouble if long double uses the IBM extended double
> +     128-bit format, and use the explicit __builtin_pack_ibm128 if long double
> +     is IEEE 128-bit.  */
>  #if defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IBM128__)	\
>      && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
>    return __builtin_pack_longdouble (dh, dl);
> +#elif defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IEEE128__) \
> +    && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> +  return __builtin_pack_ibm128 (dh, dl);

ok

lgtm,
thanks
-Will


>  #else
>    union
>    {
> -- 
> 2.22.0
> 
>
Segher Boessenkool Oct. 28, 2020, 9:58 p.m. UTC | #2
Hi Mike,

On Thu, Oct 22, 2020 at 06:10:37PM -0400, Michael Meissner wrote:
> PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.

This title makes no sense, and thankfully is not what the patch does :-)

> This patch changes the __ibm128 emulator to use __builtin_pack_ieee128
> instead of __builtin_pack_longdouble if long double is IEEE 128-bit, and
> we need to use the __ibm128 type.  The code will run without this patch,
> but this patch slightly optimizes it better.

It uses __builtin_pack_ibm128, instead?

> libgcc/
> 2020-10-22  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/ibm-ldouble.c (pack_ldouble): Use
> 	__builtin_pack_ieee128 if long double is IEEE 128-bit.

Here, too.

> ---
>  libgcc/config/rs6000/ibm-ldouble.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/libgcc/config/rs6000/ibm-ldouble.c b/libgcc/config/rs6000/ibm-ldouble.c
> index dd2a02373f2..767fdd72683 100644
> --- a/libgcc/config/rs6000/ibm-ldouble.c
> +++ b/libgcc/config/rs6000/ibm-ldouble.c
> @@ -102,9 +102,17 @@ __asm__ (".symver __gcc_qadd,_xlqadd@GCC_3.4\n\t"
>  static inline IBM128_TYPE
>  pack_ldouble (double dh, double dl)
>  {
> +  /* If we are building on a non-VSX system, the __ibm128 type is not defined.

"Building on" does not matter in the least.  The compiler should
generate the same code, no matter what it runs on.  Target matters, not
host (and build not at all).

> +     This means we can't always use __builtin_pack_ibm128.  Instead, we use
> +     __builtin_pack_longdouble if long double uses the IBM extended double
> +     128-bit format, and use the explicit __builtin_pack_ibm128 if long double
> +     is IEEE 128-bit.  */

And this comment is about the *next* case?

>  #if defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IBM128__)	\
>      && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
>    return __builtin_pack_longdouble (dh, dl);
> +#elif defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IEEE128__) \
> +    && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> +  return __builtin_pack_ibm128 (dh, dl);

Given the above, _SOFT_FLOAT etc. are wrong.

Just use some more portable thing to repack?  Is __builtin_pack_ibm128
not defined always here anyway?

/* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that
   __ibm128 is available).  */
#define BU_IBM128_2(ENUM, NAME, ATTR, ICODE)                            \
  RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,              /* ENUM */      \
                    "__builtin_" NAME,                  /* NAME */      \
                    (RS6000_BTM_HARD_FLOAT              /* MASK */      \
                     | RS6000_BTM_FLOAT128),                            \
                    (RS6000_BTC_ ## ATTR                /* ATTR */      \
                     | RS6000_BTC_BINARY),                              \
                    CODE_FOR_ ## ICODE)                 /* ICODE */

(so just HARD_FLOAT and FLOAT128 are needed)

What am I missing?


Segher
Michael Meissner Oct. 29, 2020, 4:56 p.m. UTC | #3
On Wed, Oct 28, 2020 at 04:58:46PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Thu, Oct 22, 2020 at 06:10:37PM -0400, Michael Meissner wrote:
> > PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.
> 
> This title makes no sense, and thankfully is not what the patch does :-)

Thanks, every so often I accidently type __ieee128 instead of __ibm128.

> > This patch changes the __ibm128 emulator to use __builtin_pack_ieee128
> > instead of __builtin_pack_longdouble if long double is IEEE 128-bit, and
> > we need to use the __ibm128 type.  The code will run without this patch,
> > but this patch slightly optimizes it better.
> 
> It uses __builtin_pack_ibm128, instead?

Yes.

> > libgcc/
> > 2020-10-22  Michael Meissner  <meissner@linux.ibm.com>
> > 
> > 	* config/rs6000/ibm-ldouble.c (pack_ldouble): Use
> > 	__builtin_pack_ieee128 if long double is IEEE 128-bit.
> 
> Here, too.
> 
> > ---
> >  libgcc/config/rs6000/ibm-ldouble.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/libgcc/config/rs6000/ibm-ldouble.c b/libgcc/config/rs6000/ibm-ldouble.c
> > index dd2a02373f2..767fdd72683 100644
> > --- a/libgcc/config/rs6000/ibm-ldouble.c
> > +++ b/libgcc/config/rs6000/ibm-ldouble.c
> > @@ -102,9 +102,17 @@ __asm__ (".symver __gcc_qadd,_xlqadd@GCC_3.4\n\t"
> >  static inline IBM128_TYPE
> >  pack_ldouble (double dh, double dl)
> >  {
> > +  /* If we are building on a non-VSX system, the __ibm128 type is not defined.
> 
> "Building on" does not matter in the least.  The compiler should
> generate the same code, no matter what it runs on.  Target matters, not
> host (and build not at all).

Yes.

> > +     This means we can't always use __builtin_pack_ibm128.  Instead, we use
> > +     __builtin_pack_longdouble if long double uses the IBM extended double
> > +     128-bit format, and use the explicit __builtin_pack_ibm128 if long double
> > +     is IEEE 128-bit.  */
> 
> And this comment is about the *next* case?
> 
> >  #if defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IBM128__)	\
> >      && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> >    return __builtin_pack_longdouble (dh, dl);
> > +#elif defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IEEE128__) \
> > +    && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> > +  return __builtin_pack_ibm128 (dh, dl);
> 
> Given the above, _SOFT_FLOAT etc. are wrong.
> 
> Just use some more portable thing to repack?  Is __builtin_pack_ibm128
> not defined always here anyway?

That is the problem.  If you build a big endian PowerPC compiler where VSX is
not default, the __ibm128 stuff is not defined.  It is only defined when
__float128 is a possibility.  Hence __builtin_pack_ibm128 and
__builtin_unpack_ibm128 are not defined.

> /* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that
>    __ibm128 is available).  */
> #define BU_IBM128_2(ENUM, NAME, ATTR, ICODE)                            \
>   RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,              /* ENUM */      \
>                     "__builtin_" NAME,                  /* NAME */      \
>                     (RS6000_BTM_HARD_FLOAT              /* MASK */      \
>                      | RS6000_BTM_FLOAT128),                            \
>                     (RS6000_BTC_ ## ATTR                /* ATTR */      \
>                      | RS6000_BTC_BINARY),                              \
>                     CODE_FOR_ ## ICODE)                 /* ICODE */
> 
> (so just HARD_FLOAT and FLOAT128 are needed)
> 
> What am I missing?

As I said, the __ibm128 keyword is not enabled on non-VSX systems.
Michael Meissner Oct. 29, 2020, 4:58 p.m. UTC | #4
On Tue, Oct 27, 2020 at 09:30:03AM -0500, will schmidt wrote:
> On Thu, 2020-10-22 at 18:10 -0400, Michael Meissner via Gcc-patches wrote:
> > PowerPC: Use __builtin_pack_ieee128 if long double is IEEE 128-bit.
> > 
> > I have split all of these patches into separate patches to hopefully get them
> > into the tree.
> > 
> > This patch changes the __ibm128 emulator to use __builtin_pack_ieee128
> > instead of __builtin_pack_longdouble if long double is IEEE 128-bit, and
> > we need to use the __ibm128 type.  The code will run without this patch,
> > but this patch slightly optimizes it better.
> > 
> > I have tested this patch with bootstrap builds on a little endian power9 system
> > running Linux.  With the other patches, I have built two full bootstrap builds
> > using this patch and the patches after this patch.  One build used the current
> > default for long double (IBM extended double) and the other build switched the
> > default to IEEE 128-bit.  I used the Advance Toolchain AT 14.0 compiler as the
> > library used by this compiler.  There are no regressions between the tests.
> > There are 3 fortran benchmarks (ieee/large_2.f90, default_format_2.f90, and
> > default_format_denormal_2.f90) that now pass.
> 
> good. :-)    A quick search of gcc bugzilla shows there is an existing
> PR 67531 that includes ieee rounding support for powerpc long double. 
> Does this (partially?) address that? 
>   
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67531

In theory, once the full system uses IEEE 128-bit floating point for long
double, all of the various rounding issues will be fixed.

However, we have to get to that step, and this is just one of a long line of
intermediate steps to get to that goal.
Segher Boessenkool Oct. 29, 2020, 7:10 p.m. UTC | #5
On Thu, Oct 29, 2020 at 12:56:03PM -0400, Michael Meissner wrote:
> On Wed, Oct 28, 2020 at 04:58:46PM -0500, Segher Boessenkool wrote:
> > >  #if defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IBM128__)	\
> > >      && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> > >    return __builtin_pack_longdouble (dh, dl);
> > > +#elif defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IEEE128__) \
> > > +    && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
> > > +  return __builtin_pack_ibm128 (dh, dl);
> > 
> > Given the above, _SOFT_FLOAT etc. are wrong.
> > 
> > Just use some more portable thing to repack?  Is __builtin_pack_ibm128
> > not defined always here anyway?
> 
> That is the problem.  If you build a big endian PowerPC compiler where VSX is
> not default, the __ibm128 stuff is not defined.  It is only defined when
> __float128 is a possibility.  Hence __builtin_pack_ibm128 and
> __builtin_unpack_ibm128 are not defined.

So fix that?  When ibm128 is the only thing supported there is no reason
why __builtin_{un,}pack_ibm128 should not be supported (the ieee128
functions of course not, but there is no reason to not define the normal
names for the one supported thing).

> > /* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that
> >    __ibm128 is available).  */
> > #define BU_IBM128_2(ENUM, NAME, ATTR, ICODE)                            \
> >   RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,              /* ENUM */      \
> >                     "__builtin_" NAME,                  /* NAME */      \
> >                     (RS6000_BTM_HARD_FLOAT              /* MASK */      \
> >                      | RS6000_BTM_FLOAT128),                            \
> >                     (RS6000_BTC_ ## ATTR                /* ATTR */      \
> >                      | RS6000_BTC_BINARY),                              \
> >                     CODE_FOR_ ## ICODE)                 /* ICODE */
> > 
> > (so just HARD_FLOAT and FLOAT128 are needed)
> > 
> > What am I missing?
> 
> As I said, the __ibm128 keyword is not enabled on non-VSX systems.

So fix that?  It can easily be supported everywhere, after all.


Segher
diff mbox series

Patch

diff --git a/libgcc/config/rs6000/ibm-ldouble.c b/libgcc/config/rs6000/ibm-ldouble.c
index dd2a02373f2..767fdd72683 100644
--- a/libgcc/config/rs6000/ibm-ldouble.c
+++ b/libgcc/config/rs6000/ibm-ldouble.c
@@ -102,9 +102,17 @@  __asm__ (".symver __gcc_qadd,_xlqadd@GCC_3.4\n\t"
 static inline IBM128_TYPE
 pack_ldouble (double dh, double dl)
 {
+  /* If we are building on a non-VSX system, the __ibm128 type is not defined.
+     This means we can't always use __builtin_pack_ibm128.  Instead, we use
+     __builtin_pack_longdouble if long double uses the IBM extended double
+     128-bit format, and use the explicit __builtin_pack_ibm128 if long double
+     is IEEE 128-bit.  */
 #if defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IBM128__)	\
     && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
   return __builtin_pack_longdouble (dh, dl);
+#elif defined (__LONG_DOUBLE_128__) && defined (__LONG_DOUBLE_IEEE128__) \
+    && !(defined (_SOFT_FLOAT) || defined (__NO_FPRS__))
+  return __builtin_pack_ibm128 (dh, dl);
 #else
   union
   {