diff mbox series

[13/13] Use GCC builtins for copysign functions if desired.

Message ID 1575297977-2589-14-git-send-email-stli@linux.ibm.com
State New
Headers show
Series Use GCC builtins for some math functions if desired. | expand

Commit Message

Stefan Liebler Dec. 2, 2019, 2:46 p.m. UTC
This patch is using the corresponding GCC builtin for copysignf, copysign,
copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined to one
in math-use-builtins.h.

This is the case for s390 if build with at least --march=z196 --mzarch.
Otherwise the generic implementation is used.
Note: z196 is no typo but a bug in GCC!
---
 sysdeps/generic/math-use-builtins.h         |  5 +++++
 sysdeps/ieee754/dbl-64/s_copysign.c         |  8 ++++++-
 sysdeps/ieee754/float128/float128_private.h |  3 +++
 sysdeps/ieee754/flt-32/s_copysignf.c        | 21 ++++++++++++-------
 sysdeps/ieee754/ldbl-128/s_copysignl.c      | 23 ++++++++++++++-------
 sysdeps/s390/fpu/math-use-builtins.h        |  9 ++++++++
 6 files changed, 53 insertions(+), 16 deletions(-)

Comments

Joseph Myers Dec. 2, 2019, 9 p.m. UTC | #1
On Mon, 2 Dec 2019, Stefan Liebler wrote:

> This patch is using the corresponding GCC builtin for copysignf, copysign,
> copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined to one
> in math-use-builtins.h.

I believe this is always safe for these implementations (the only case 
where GCC might not expand copysign functions inline is copysignl for IBM 
long double, in the soft-float case).
Stefan Liebler Dec. 3, 2019, 8:27 a.m. UTC | #2
On 12/2/19 10:00 PM, Joseph Myers wrote:
> On Mon, 2 Dec 2019, Stefan Liebler wrote:
> 
>> This patch is using the corresponding GCC builtin for copysignf, copysign,
>> copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined to one
>> in math-use-builtins.h.
> 
> I believe this is always safe for these implementations (the only case
> where GCC might not expand copysign functions inline is copysignl for IBM
> long double, in the soft-float case).
> 
Thus you mean we can do the following preset in 
sysdeps/generic/math-use-builtins.h?
#define USE_COPYSIGN_BUILTIN 1
#define USE_COPYSIGNF_BUILTIN 1
#define USE_COPYSIGNL_BUILTIN 0
#define USE_COPYSIGNF128_BUILTIN 0

Or even also set USE_COPYSIGNL_BUILTIN to one as IBM long double has its 
own implementation in ./sysdeps/ieee754/ldbl-128ibm/s_copysignl.c.

On the other hand, I'm not able to run tests on the different 
architectures. Thus the safest way would be to leave all those macros 
set to zero and let the architectures decide.

In each case, I would prefer to set those macros to one in a separate 
patch. Then this patch could be reverted in case of failing on one 
architecture.
Joseph Myers Dec. 3, 2019, 4:50 p.m. UTC | #3
On Tue, 3 Dec 2019, Stefan Liebler wrote:

> On 12/2/19 10:00 PM, Joseph Myers wrote:
> > On Mon, 2 Dec 2019, Stefan Liebler wrote:
> > 
> > > This patch is using the corresponding GCC builtin for copysignf, copysign,
> > > copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined
> > > to one
> > > in math-use-builtins.h.
> > 
> > I believe this is always safe for these implementations (the only case
> > where GCC might not expand copysign functions inline is copysignl for IBM
> > long double, in the soft-float case).
> > 
> Thus you mean we can do the following preset in
> sysdeps/generic/math-use-builtins.h?
> #define USE_COPYSIGN_BUILTIN 1
> #define USE_COPYSIGNF_BUILTIN 1
> #define USE_COPYSIGNL_BUILTIN 0
> #define USE_COPYSIGNF128_BUILTIN 0
> 
> Or even also set USE_COPYSIGNL_BUILTIN to one as IBM long double has its own
> implementation in ./sysdeps/ieee754/ldbl-128ibm/s_copysignl.c.

I think they can all be 1, given that IBM long double has its own 
implementation.  Except that you shouldn't need the indirection through 
these macros at all.  Just use __builtin_copysign etc. directly in the 
implementations (other than IBM long double), unconditionally, the same 
way that sysdeps/ieee754/dbl-64/s_fabs.c uses __builtin_fabs directly 
without such macros being needed.
Stefan Liebler Dec. 4, 2019, 1:15 p.m. UTC | #4
On 12/3/19 5:50 PM, Joseph Myers wrote:
> On Tue, 3 Dec 2019, Stefan Liebler wrote:
> 
>> On 12/2/19 10:00 PM, Joseph Myers wrote:
>>> On Mon, 2 Dec 2019, Stefan Liebler wrote:
>>>
>>>> This patch is using the corresponding GCC builtin for copysignf, copysign,
>>>> copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined
>>>> to one
>>>> in math-use-builtins.h.
>>>
>>> I believe this is always safe for these implementations (the only case
>>> where GCC might not expand copysign functions inline is copysignl for IBM
>>> long double, in the soft-float case).
>>>
>> Thus you mean we can do the following preset in
>> sysdeps/generic/math-use-builtins.h?
>> #define USE_COPYSIGN_BUILTIN 1
>> #define USE_COPYSIGNF_BUILTIN 1
>> #define USE_COPYSIGNL_BUILTIN 0
>> #define USE_COPYSIGNF128_BUILTIN 0
>>
>> Or even also set USE_COPYSIGNL_BUILTIN to one as IBM long double has its own
>> implementation in ./sysdeps/ieee754/ldbl-128ibm/s_copysignl.c.
> 
> I think they can all be 1, given that IBM long double has its own
> implementation.  Except that you shouldn't need the indirection through
> these macros at all.  Just use __builtin_copysign etc. directly in the
> implementations (other than IBM long double), unconditionally, the same
> way that sysdeps/ieee754/dbl-64/s_fabs.c uses __builtin_fabs directly
> without such macros being needed.
> 

Okay. I've set them to 1 and also used build-many-glibcs.py and had a 
look into the s_copysign*.os files. There is no function call to 
copysign* itself.

If one architecture encounters issues with those builtins, an 
architecture specific math-use-builtins.sh file can set the macros to zero.

Bye,
Stefan
Joseph Myers Dec. 4, 2019, 1:20 p.m. UTC | #5
On Wed, 4 Dec 2019, Stefan Liebler wrote:

> > I think they can all be 1, given that IBM long double has its own
> > implementation.  Except that you shouldn't need the indirection through
> > these macros at all.  Just use __builtin_copysign etc. directly in the
> > implementations (other than IBM long double), unconditionally, the same
> > way that sysdeps/ieee754/dbl-64/s_fabs.c uses __builtin_fabs directly
> > without such macros being needed.
> > 
> 
> Okay. I've set them to 1 and also used build-many-glibcs.py and had a look
> into the s_copysign*.os files. There is no function call to copysign* itself.

I think we should handle this like fabs and not have the macros at all, 
just call __builtin_copysign etc. unconditionally.
Stefan Liebler Dec. 4, 2019, 4:34 p.m. UTC | #6
On 12/4/19 2:20 PM, Joseph Myers wrote:
> On Wed, 4 Dec 2019, Stefan Liebler wrote:
> 
>>> I think they can all be 1, given that IBM long double has its own
>>> implementation.  Except that you shouldn't need the indirection through
>>> these macros at all.  Just use __builtin_copysign etc. directly in the
>>> implementations (other than IBM long double), unconditionally, the same
>>> way that sysdeps/ieee754/dbl-64/s_fabs.c uses __builtin_fabs directly
>>> without such macros being needed.
>>>
>>
>> Okay. I've set them to 1 and also used build-many-glibcs.py and had a look
>> into the s_copysign*.os files. There is no function call to copysign* itself.
> 
> I think we should handle this like fabs and not have the macros at all,
> just call __builtin_copysign etc. unconditionally.
> 

Okay. Then I've removed the macros for float and double and call the 
builtins unconditionally.
But I still need the long double version as the f128 builtin is not 
available with every GCC version.

Bye,
Stefan
Joseph Myers Dec. 4, 2019, 8:43 p.m. UTC | #7
On Wed, 4 Dec 2019, Stefan Liebler wrote:

> Okay. Then I've removed the macros for float and double and call the builtins
> unconditionally.
> But I still need the long double version as the f128 builtin is not available
> with every GCC version.

I think the _Float128 version could be defined to 1 for GCC 7 and later 
(so both long double and _Float128 versions could be removed once we 
require GCC 7 or later to build glibc).
Stefan Liebler Dec. 5, 2019, 3:40 p.m. UTC | #8
On 12/4/19 9:43 PM, Joseph Myers wrote:
> On Wed, 4 Dec 2019, Stefan Liebler wrote:
> 
>> Okay. Then I've removed the macros for float and double and call the builtins
>> unconditionally.
>> But I still need the long double version as the f128 builtin is not available
>> with every GCC version.
> 
> I think the _Float128 version could be defined to 1 for GCC 7 and later
> (so both long double and _Float128 versions could be removed once we
> require GCC 7 or later to build glibc).
> 
Okay. I've defined it to 1 for GCC 7 and later.

Bye
Stefan
diff mbox series

Patch

diff --git a/sysdeps/generic/math-use-builtins.h b/sysdeps/generic/math-use-builtins.h
index 34ca438a8c..a22d787791 100644
--- a/sysdeps/generic/math-use-builtins.h
+++ b/sysdeps/generic/math-use-builtins.h
@@ -51,4 +51,9 @@ 
 #define USE_ROUNDL_BUILTIN 0
 #define USE_ROUNDF128_BUILTIN 0
 
+#define USE_COPYSIGN_BUILTIN 0
+#define USE_COPYSIGNF_BUILTIN 0
+#define USE_COPYSIGNL_BUILTIN 0
+#define USE_COPYSIGNF128_BUILTIN 0
+
 #endif /* math-use-builtins.h */
diff --git a/sysdeps/ieee754/dbl-64/s_copysign.c b/sysdeps/ieee754/dbl-64/s_copysign.c
index 589b088c95..0be1a6420f 100644
--- a/sysdeps/ieee754/dbl-64/s_copysign.c
+++ b/sysdeps/ieee754/dbl-64/s_copysign.c
@@ -10,7 +10,9 @@ 
  * ====================================================
  */
 
-#if defined(LIBM_SCCS) && !defined(lint)
+#include <math-use-builtins.h>
+
+#if ! USE_COPYSIGN_BUILTIN && defined (LIBM_SCCS) && ! defined (lint)
 static char rcsid[] = "$NetBSD: s_copysign.c,v 1.8 1995/05/10 20:46:57 jtc Exp $";
 #endif
 
@@ -28,10 +30,14 @@  static char rcsid[] = "$NetBSD: s_copysign.c,v 1.8 1995/05/10 20:46:57 jtc Exp $
 double
 __copysign (double x, double y)
 {
+#if USE_COPYSIGN_BUILTIN
+  return __builtin_copysign (x, y);
+#else
   uint32_t hx, hy;
   GET_HIGH_WORD (hx, x);
   GET_HIGH_WORD (hy, y);
   SET_HIGH_WORD (x, (hx & 0x7fffffff) | (hy & 0x80000000));
   return x;
+#endif
 }
 libm_alias_double (__copysign, copysign)
diff --git a/sysdeps/ieee754/float128/float128_private.h b/sysdeps/ieee754/float128/float128_private.h
index a6c76ce364..7f7f904152 100644
--- a/sysdeps/ieee754/float128/float128_private.h
+++ b/sysdeps/ieee754/float128/float128_private.h
@@ -152,6 +152,8 @@ 
 #define USE_TRUNCL_BUILTIN USE_TRUNCF128_BUILTIN
 #undef USE_ROUNDL_BUILTIN
 #define USE_ROUNDL_BUILTIN USE_ROUNDF128_BUILTIN
+#undef USE_COPYSIGNL_BUILTIN
+#define USE_COPYSIGNL_BUILTIN USE_COPYSIGNF128_BUILTIN
 
 /* IEEE function renames.  */
 #define __ieee754_acoshl __ieee754_acoshf128
@@ -361,6 +363,7 @@ 
 #define __builtin_ceill __builtin_ceilf128
 #define __builtin_truncl __builtin_truncf128
 #define __builtin_roundl __builtin_roundf128
+#define __builtin_copysignl __builtin_copysignf128
 
 /* Get the constant suffix from bits/floatn-compat.h.  */
 #define L(x) __f128 (x)
diff --git a/sysdeps/ieee754/flt-32/s_copysignf.c b/sysdeps/ieee754/flt-32/s_copysignf.c
index 77d1d90e92..9a9e6389cd 100644
--- a/sysdeps/ieee754/flt-32/s_copysignf.c
+++ b/sysdeps/ieee754/flt-32/s_copysignf.c
@@ -13,7 +13,9 @@ 
  * ====================================================
  */
 
-#if defined(LIBM_SCCS) && !defined(lint)
+#include <math-use-builtins.h>
+
+#if ! USE_COPYSIGNF_BUILTIN && defined (LIBM_SCCS) && ! defined (lint)
 static char rcsid[] = "$NetBSD: s_copysignf.c,v 1.4 1995/05/10 20:46:59 jtc Exp $";
 #endif
 
@@ -28,12 +30,17 @@  static char rcsid[] = "$NetBSD: s_copysignf.c,v 1.4 1995/05/10 20:46:59 jtc Exp
 #include <math_private.h>
 #include <libm-alias-float.h>
 
-float __copysignf(float x, float y)
+float
+__copysignf (float x, float y)
 {
-	uint32_t ix,iy;
-	GET_FLOAT_WORD(ix,x);
-	GET_FLOAT_WORD(iy,y);
-	SET_FLOAT_WORD(x,(ix&0x7fffffff)|(iy&0x80000000));
-        return x;
+#if USE_COPYSIGNF_BUILTIN
+  return __builtin_copysignf (x, y);
+#else
+  uint32_t ix, iy;
+  GET_FLOAT_WORD (ix, x);
+  GET_FLOAT_WORD (iy, y);
+  SET_FLOAT_WORD (x, (ix & 0x7fffffff) | (iy & 0x80000000));
+  return x;
+#endif
 }
 libm_alias_float (__copysign, copysign)
diff --git a/sysdeps/ieee754/ldbl-128/s_copysignl.c b/sysdeps/ieee754/ldbl-128/s_copysignl.c
index a501139f71..6095b7fc73 100644
--- a/sysdeps/ieee754/ldbl-128/s_copysignl.c
+++ b/sysdeps/ieee754/ldbl-128/s_copysignl.c
@@ -13,7 +13,9 @@ 
  * ====================================================
  */
 
-#if defined(LIBM_SCCS) && !defined(lint)
+#include <math-use-builtins.h>
+
+#if ! USE_COPYSIGNL_BUILTIN && defined (LIBM_SCCS) && ! defined (lint)
 static char rcsid[] = "$NetBSD: $";
 #endif
 
@@ -28,13 +30,18 @@  static char rcsid[] = "$NetBSD: $";
 #include <math_private.h>
 #include <libm-alias-ldouble.h>
 
-_Float128 __copysignl(_Float128 x, _Float128 y)
+_Float128
+__copysignl (_Float128 x, _Float128 y)
 {
-	uint64_t hx,hy;
-	GET_LDOUBLE_MSW64(hx,x);
-	GET_LDOUBLE_MSW64(hy,y);
-	SET_LDOUBLE_MSW64(x,(hx&0x7fffffffffffffffULL)
-			    |(hy&0x8000000000000000ULL));
-        return x;
+#if USE_COPYSIGNL_BUILTIN
+  return __builtin_copysignl (x, y);
+#else
+  uint64_t hx, hy;
+  GET_LDOUBLE_MSW64 (hx, x);
+  GET_LDOUBLE_MSW64 (hy, y);
+  SET_LDOUBLE_MSW64 (x, (hx & 0x7fffffffffffffffULL)
+		     | (hy & 0x8000000000000000ULL));
+  return x;
+#endif
 }
 libm_alias_ldouble (__copysign, copysign)
diff --git a/sysdeps/s390/fpu/math-use-builtins.h b/sysdeps/s390/fpu/math-use-builtins.h
index 5838a31c50..ed2a05775f 100644
--- a/sysdeps/s390/fpu/math-use-builtins.h
+++ b/sysdeps/s390/fpu/math-use-builtins.h
@@ -50,6 +50,13 @@ 
 # define USE_ROUNDF_BUILTIN 1
 # define USE_ROUNDL_BUILTIN 1
 
+/* GCC emits the z9-ec zarch "copy sign" instruction for these
+   builtins if build with at least --march=z196 -mzarch.
+   Note: z196 is no typo but a bug in GCC!  */
+# define USE_COPYSIGN_BUILTIN 1
+# define USE_COPYSIGNF_BUILTIN 1
+# define USE_COPYSIGNL_BUILTIN 1
+
 # if __GNUC_PREREQ (8, 1)
 #  define USE_NEARBYINTF128_BUILTIN 1
 #  define USE_RINTF128_BUILTIN 1
@@ -57,6 +64,7 @@ 
 #  define USE_CEILF128_BUILTIN 1
 #  define USE_TRUNCF128_BUILTIN 1
 #  define USE_ROUNDF128_BUILTIN 1
+#  define USE_COPYSIGNF128_BUILTIN 1
 # else
 #  define USE_NEARBYINTF128_BUILTIN 0
 #  define USE_RINTF128_BUILTIN 0
@@ -64,6 +72,7 @@ 
 #  define USE_CEILF128_BUILTIN 0
 #  define USE_TRUNCF128_BUILTIN 0
 #  define USE_ROUNDF128_BUILTIN 0
+#  define USE_COPYSIGNF128_BUILTIN 0
 # endif
 
 #else