[rs6000] Improve Documentation of Built-In Functions Part 1

Message ID a411010a-b268-acd7-3262-88a3a55e4ec7@linux.ibm.com
State New

Commit Message

Kelvin Nilsen April 24, 2018, 2:12 p.m. UTC
This is the first of several patches to address shortcomings in existing
documentation of PowerPC built-in functions.  The focus of this particular
patch is to improve documentation of low-level built-in functions that do
not require special include headers.

A summary of this patch follows:

1. Change the name of the first PowerPC built-in section from "PowerPC
   Built-in Functions" to "Low-Level PowerPC Built-in Functions".  This
   section has never described all PowerPC built-in functions.

2. Introduce subsubsections within this section to independently describe
   built-in functions that target particular ISA levels.  Sort function
   descriptions into appropriate subsubsections.

3. Add descriptions of three new features that can be tested with the
   __builtin_cpu_supports function: darn, htm-no-suspend, and scv.

4. Remove descriptions of built-in functions that do not belong in this
   section because they are generic (not specific to PowerPC):
   __builtin_fabsq, __builtin_copysignq, __builtin_infq,
   __builtin_huge_valq, __builtin_nanq, __builtin_nansq,
   __builtin_sqrtf128, __builtin_fmaf128.

5. Correct the spellings of several built-in functions:
   __builtin_fmaf128_round_to_odd, __builtin_addg6s, __builtin_cbcdtd,
   __builtin_cdtbcd.
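
For reference, a minimal sketch of how the three feature names from item 3
might be queried at run time.  The helper names (have_darn,
have_htm_no_suspend, have_scv) are hypothetical and not part of the patch.
The sketch assumes a PowerPC GNU/Linux target and a compiler recent enough
to accept these feature strings; on any other target the helpers simply
report false.

```c
#include <stdbool.h>

/* Hypothetical helpers, not part of the patch.  The argument to
   __builtin_cpu_supports must be a string literal, so each feature
   gets its own wrapper.  */
#if defined (__GNUC__) && defined (__powerpc__) && defined (__linux__)
/* The built-in returns a nonzero value when the feature is present;
   normalize that to a bool.  */
# define CPU_HAS(feature) (__builtin_cpu_supports (feature) != 0)
#else
/* Assumption: on other targets, report the feature as absent.  */
# define CPU_HAS(feature) false
#endif

static bool
have_darn (void)
{
  return CPU_HAS ("darn");
}

static bool
have_htm_no_suspend (void)
{
  return CPU_HAS ("htm-no-suspend");
}

static bool
have_scv (void)
{
  return CPU_HAS ("scv");
}
```

A caller would typically test have_darn () once at startup before selecting
a code path that uses __builtin_darn, and fall back to another entropy
source otherwise.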

This patch is limited in scope in order to manage complexity of the diffs.
Subsequent patches will address different sections of the documentation
and will also add new function descriptions into these sections.

This patch affects only extend.texi.  The gcc.pdf file has been built and
reviewed.
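
As a review aid for the __builtin_cmpb text that this patch moves into the
ISA 2.05 subsubsection, here is a portable model of the documented
semantics: each result byte is 0xff where the corresponding input bytes are
equal and 0 otherwise.  The names cmpb_model and cmpb_model32 are
hypothetical; this is a sketch of the described behavior, not the code GCC
generates.

```c
#include <stdint.h>

/* Hypothetical reference model of the 64-bit form of __builtin_cmpb,
   written from the description in extend.texi.  */
static uint64_t
cmpb_model (uint64_t a, uint64_t b)
{
  uint64_t result = 0;

  for (int i = 0; i < 8; i++)
    {
      /* Select byte i of each operand.  */
      uint64_t mask = (uint64_t) 0xff << (8 * i);

      /* Equal bytes produce 0xff in the result; unequal bytes leave 0.  */
      if ((a & mask) == (b & mask))
        result |= mask;
    }
  return result;
}

/* Model of the 32-bit form: the same comparison over four bytes.  */
static uint32_t
cmpb_model32 (uint32_t a, uint32_t b)
{
  return (uint32_t) cmpb_model (a, b);
}
```

For example, comparing 0x11223344 with 0x11ff33ff in the 32-bit form yields
0xff00ff00, because only the first and third bytes match.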

Is this ok for the trunk?

gcc/ChangeLog:

2018-04-24  Kelvin Nilsen  <kelvin@gcc.gnu.org>

    * doc/extend.texi: Tidy documentation of PowerPC built-in functions.

solely
 to maintain API compatibility with the x86 builtins.
@@ -15633,6 +15638,8 @@ CPU supports the set of compatible performance mon
 CPU supports the Embedded ISA category.
 @item cellbe
 CPU has a CELL broadband engine.
+@item darn
+CPU supports the darn (deliver a random number) instruction.
 @item dfp
 CPU has a decimal floating point unit.
 @item dscr
@@ -15649,6 +15656,8 @@ CPU has a floating point unit.
 CPU has hardware transaction memory instructions.
 @item htm-nosc
 Kernel aborts hardware transactions when a syscall is made.
+@item htm-no-suspend
+Kernel aborts hardware transactions when the thread is suspended.
 @item ic_snoop
 CPU supports icache snooping capabilities.
 @item ieee128
@@ -15677,6 +15686,8 @@ CPU supports the old POWER ISA (eg, 601)
 CPU supports 64-bit mode execution.
 @item ppcle
 CPU supports a little-endian mode that uses address swizzling.
+@item scv
+Kernel supports system call vectored.
 @item smt
 CPU support simultaneous multi-threading.
 @item spe
@@ -15708,19 +15719,81 @@ Here is an example:
 @end smallexample
 @end deftypefn
 
-These built-in functions are available for the PowerPC family of
+The following built-in functions are also available on all PowerPC
 processors:
 @smallexample
-float __builtin_recipdivf (float, float);
-float __builtin_rsqrtf (float);
-double __builtin_recipdiv (double, double);
-double __builtin_rsqrt (double);
 uint64_t __builtin_ppc_get_timebase ();
 unsigned long __builtin_ppc_mftb ();
-double __builtin_unpack_longdouble (long double, int);
-long double __builtin_pack_longdouble (double, double);
 @end smallexample
 
+The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
+functions generate instructions to read the Time Base Register.  The
+@code{__builtin_ppc_get_timebase} function may generate multiple
+instructions and always returns the 64 bits of the Time Base Register.
+The @code{__builtin_ppc_mftb} function always generates one instruction and
+returns the Time Base Register value as an unsigned long, throwing away
+the most significant word on 32-bit environments.
+
+@node Low-Level PowerPC Built-in Functions Available on ISA 2.05
+@subsubsection Low-Level PowerPC Built-in Functions Available on ISA 2.05
+
+The low-level built-in functions described in this section are
+available on the PowerPC family of processors starting with ISA 2.05
+or later.  Unless specific options are explicitly disabled on the
+command line, specifying option (@option{-mcpu=power6}) has the effect of
+enabling the (@option{-mpowerpc64}), (@option{-mpowerpc-gpopt}),
+(@option{-mpowerpc-gfxopt}), (@option{-mmfcrf}), (@option{-mpopcntb}),
+(@option{-mfprnd}), (@option{-mcmpb}), (@option{-mhard-dfp}), and
+(@option{-mrecip-precision}) options.  Specify the
+(@option{-maltivec}) and (@option{-mfpgpr}) options explicitly in
+combination with the above options if they are desired.
+
+The following functions require option (@option{-mcmpb}).
+@smallexample
+unsigned long long __builtin_cmpb (unsigned long long int, unsigned long long int);
+unsigned int __builtin_cmpb (unsigned int, unsigned int);
+@end smallexample
+
+The @code{__builtin_cmpb} function
+performs a byte-wise compare on the contents of its two arguments,
+returning the result of the byte-wise comparison as the returned
+value.  For each byte comparison, the corresponding byte of the return
+value holds 0xff if the input bytes are equal and 0 if the input bytes
+are not equal.  If either of the arguments to this built-in function
+is wider than 32 bits, the function call expands into the form that
+expects @code{unsigned long long int} arguments
+which is only available on 64-bit targets.
+
+The following built-in functions are available
+when hardware decimal floating point
+(@option{-mhard-dfp}) is available:
+@smallexample
+_Decimal64 __builtin_ddedpd (int, _Decimal64);
+_Decimal128 __builtin_ddedpdq (int, _Decimal128);
+_Decimal64 __builtin_denbcd (int, _Decimal64);
+_Decimal128 __builtin_denbcdq (int, _Decimal128);
+_Decimal64 __builtin_diex (long long, _Decimal64);
+_Decimal128 __builtin_diexq (long long, _Decimal128);
+_Decimal64 __builtin_dscli (_Decimal64, int);
+_Decimal128 __builtin_dscliq (_Decimal128, int);
+_Decimal64 __builtin_dscri (_Decimal64, int);
+_Decimal128 __builtin_dscriq (_Decimal128, int);
+long long __builtin_dxex (_Decimal64);
+long long __builtin_dxexq (_Decimal128);
+_Decimal128 __builtin_pack_dec128 (unsigned long long, unsigned long long);
+unsigned long long __builtin_unpack_dec128 (_Decimal128, int);
+@end smallexample
+
+The following functions require (@option{-mhard-float}),
+(@option{-mpowerpc-gfxopt}), and (@option{-mpopcntb}) options.
+
+@smallexample
+double __builtin_recipdiv (double, double);
+float __builtin_recipdivf (float, float);
+double __builtin_rsqrt (double);
+float __builtin_rsqrtf (float);
+@end smallexample
+
 The @code{vec_rsqrt}, @code{__builtin_rsqrt}, and
 @code{__builtin_rsqrtf} functions generate multiple instructions to
 implement the reciprocal sqrt functionality using reciprocal sqrt
@@ -15730,43 +15803,68 @@ The @code{__builtin_recipdiv}, and @code{__builtin
 functions generate multiple instructions to implement division using
 the reciprocal estimate instructions.
 
-The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
-functions generate instructions to read the Time Base Register.  The
-@code{__builtin_ppc_get_timebase} function may generate multiple
-instructions and always returns the 64 bits of the Time Base Register.
-The @code{__builtin_ppc_mftb} function always generates one instruction and
-returns the Time Base Register value as an unsigned long, throwing away
-the most significant word on 32-bit environments.
+The following functions require (@option{-mhard-float}) and
+(@option{-mmultiple}) options.
 
-Additional built-in functions are available for the 64-bit PowerPC
-family of processors, for efficient use of 128-bit floating point
-(@code{__float128}) values.
+@smallexample
+long double __builtin_pack_longdouble (double, double);
+double __builtin_unpack_longdouble (long double, int);
+@end smallexample
 
-Previous versions of GCC supported some 'q' builtins for IEEE 128-bit
-floating point.  These functions are now mapped into the equivalent
-'f128' builtin functions.
+@node Low-Level PowerPC Built-in Functions Available on ISA 2.06
+@subsubsection Low-Level PowerPC Built-in Functions Available on ISA 2.06
 
+The low-level built-in functions described in this section are
+available on the PowerPC family of processors starting with ISA 2.06
+or later.  Unless specific options are explicitly disabled on the
+command line, specifying option (@option{-mcpu=power7}) has the effect of
+enabling all the same options as for (@option{-mcpu=power6}) in
+addition to the (@option{-maltivec}), (@option{-mpopcntd}), and
+(@option{-mvsx}) options.
+
+The following low-level built-in functions require (@option{-mpopcntd}):
 @smallexample
-__builtin_fabsq is mapped into __builtin_fabsf128
-__builtin_copysignq is mapped into __builtin_copysignf128
-__builtin_infq is mapped into __builtin_inff128
-__builtin_huge_valq is mapped into __builtin_huge_valf128
-__builtin_nanq is mapped into __builtin_nanf128
-__builtin_nansq is mapped into __builtin_nansf128
+unsigned int __builtin_addg6s (unsigned int, unsigned int);
+long long __builtin_bpermd (long long, long long);
+unsigned int __builtin_cbcdtd (unsigned int);
+unsigned int __builtin_cdtbcd (unsigned int);
+long long __builtin_divde (long long, long long);
+unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
+int __builtin_divwe (int, int);
+unsigned int __builtin_divweu (unsigned int, unsigned int);
+vector __int128_t __builtin_pack_vector_int128 (long long, long long);
+void __builtin_rs6000_speculation_barrier (void);
+long long __builtin_unpack_vector_int128 (vector __int128_t, signed char);
 @end smallexample
 
+@node Low-Level PowerPC Built-in Functions Available on ISA 2.07
+@subsubsection Low-Level PowerPC Built-in Functions Available on ISA 2.07
+
+The low-level built-in functions described in this section are
+available on the PowerPC family of processors starting with ISA 2.07
+or later.  Unless specific options are explicitly disabled on the
+command line, specifying option (@option{-mcpu=power8}) has the effect of
+enabling all the same options as for (@option{-mcpu=power7}) in
+addition to the (@option{-mpower8-fusion}), (@option{-mpower8-vector}),
+(@option{-mcrypto}), (@option{-mhtm}), (@option{-mquad-memory}), and
+(@option{-mquad-memory-atomic}) options.
+
+This section intentionally empty.
+
+@node Low-Level PowerPC Built-in Functions Available on ISA 3.0
+@subsubsection Low-Level PowerPC Built-in Functions Available on ISA 3.0
+
+The low-level built-in functions described in this section are
+available on the PowerPC family of processors starting with ISA 3.0
+or later.  Unless specific options are explicitly disabled on the
+command line, specifying option (@option{-mcpu=power9}) has the effect of
+enabling all the same options as for (@option{-mcpu=power8}) in
+addition to the (@option{-misel}) option.
+
 The following built-in functions are available on Linux 64-bit systems
-that use the ISA 3.0 instruction set.
+that use the ISA 3.0 instruction set (@option{-mcpu=power9}):
 
 @table @code
-@item __float128 __builtin_sqrtf128 (__float128)
-Perform a 128-bit IEEE floating point square root operation.
-@findex __builtin_sqrtf128
-
-@item __float128 __builtin_fmaf128 (__float128, __float128, __float128)
-Perform a 128-bit IEEE floating point fused multiply and add operation.
-@findex __builtin_fmaf128
-
 @item __float128 __builtin_addf128_round_to_odd (__float128, __float128)
 Perform a 128-bit IEEE floating point add using round to odd as the
 rounding mode.
@@ -15792,7 +15890,7 @@ Perform a 128-bit IEEE floating point square root
 as the rounding mode.
 @findex __builtin_sqrtf128_round_to_odd
 
-@item __float128 __builtin_fmaf128 (__float128, __float128, __float128)
+@item __float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128)
 Perform a 128-bit IEEE floating point fused multiply and add operation
 using round to odd as the rounding mode.
 @findex __builtin_fmaf128_round_to_odd
@@ -15803,78 +15901,26 @@ round to odd as the rounding mode.
 @findex __builtin_truncf128_round_to_odd
 @end table
 
-The following built-in functions are available for the PowerPC family
-of processors, starting with ISA 2.05 or later (@option{-mcpu=power6}
-or @option{-mcmpb}):
+The following additional built-in functions are also available for the
+PowerPC family of processors, starting with ISA 3.0 or later:
 @smallexample
-unsigned long long __builtin_cmpb (unsigned long long int, unsigned long long int);
-unsigned int __builtin_cmpb (unsigned int, unsigned int);
-@end smallexample
-
-The @code{__builtin_cmpb} function
-performs a byte-wise compare on the contents of its two arguments,
-returning the result of the byte-wise comparison as the returned
-value.  For each byte comparison, the corresponding byte of the return
-value holds 0xff if the input bytes are equal and 0 if the input bytes
-are not equal.  If either of the arguments to this built-in function
-is wider than 32 bits, the function call expands into the form that
-expects @code{unsigned long long int} arguments
-which is only available on 64-bit targets.
-
-The following built-in functions are available for the PowerPC family
-of processors, starting with ISA 2.06 or later (@option{-mcpu=power7}
-or @option{-mpopcntd}):
-@smallexample
-long __builtin_bpermd (long, long);
-int __builtin_divwe (int, int);
-unsigned int __builtin_divweu (unsigned int, unsigned int);
-long __builtin_divde (long, long);
-unsigned long __builtin_divdeu (unsigned long, unsigned long);
-unsigned int cdtbcd (unsigned int);
-unsigned int cbcdtd (unsigned int);
-unsigned int addg6s (unsigned int, unsigned int);
-void __builtin_rs6000_speculation_barrier (void);
-@end smallexample
-
-The @code{__builtin_divde} and @code{__builtin_divdeu} functions
-require a 64-bit environment supporting ISA 2.06 or later.
-
-The following built-in functions are available for the PowerPC family
-of processors, starting with ISA 3.0 or later (@option{-mcpu=power9}):
-@smallexample
 long long __builtin_darn (void);
 long long __builtin_darn_raw (void);
 int __builtin_darn_32 (void);
+@end smallexample
 
-unsigned int scalar_extract_exp (double source);
-unsigned long long int scalar_extract_exp (__ieee128 source);
+The @code{__builtin_darn} and @code{__builtin_darn_raw}
+functions require a
+64-bit environment supporting ISA 3.0 or later.
+The @code{__builtin_darn} function provides a 64-bit conditioned
+random number.  The @code{__builtin_darn_raw} function provides a
+64-bit raw random number.  The @code{__builtin_darn_32} function
+provides a 32-bit random number.
 
-unsigned long long int scalar_extract_sig (double source);
-unsigned __int128 scalar_extract_sig (__ieee128 source);
+The following additional built-in functions are also available for the
+PowerPC family of processors, starting with ISA 3.0 or later:
 
-double
-scalar_insert_exp (unsigned long long int significand, unsigned long long int exponent);
-double
-scalar_insert_exp (double significand, unsigned long long int exponent);
-
-ieee_128
-scalar_insert_exp (unsigned __int128 significand, unsigned long long int exponent);
-ieee_128
-scalar_insert_exp (ieee_128 significand, unsigned long long int exponent);
-
-int scalar_cmp_exp_gt (double arg1, double arg2);
-int scalar_cmp_exp_lt (double arg1, double arg2);
-int scalar_cmp_exp_eq (double arg1, double arg2);
-int scalar_cmp_exp_unordered (double arg1, double arg2);
-
-bool scalar_test_data_class (float source, const int condition);
-bool scalar_test_data_class (double source, const int condition);
-bool scalar_test_data_class (__ieee128 source, const int condition);
-
-bool scalar_test_neg (float source);
-bool scalar_test_neg (double source);
-bool scalar_test_neg (__ieee128 source);
-
+@smallexample
 int __builtin_byte_in_set (unsigned char u, unsigned long long set);
 int __builtin_byte_in_range (unsigned char u, unsigned int range);
 int __builtin_byte_in_either_range (unsigned char u, unsigned int ranges);
@@ -15899,81 +15945,6 @@ int __builtin_dfp_dtstsfi_ov (unsigned int compari
 int __builtin_dfp_dtstsfi_ov_dd (unsigned int comparison, _Decimal64 value);
 int __builtin_dfp_dtstsfi_ov_td (unsigned int comparison, _Decimal128 value);
 @end smallexample
-
-The @code{__builtin_darn} and @code{__builtin_darn_raw}
-functions require a
-64-bit environment supporting ISA 3.0 or later.
-The @code{__builtin_darn} function provides a 64-bit conditioned
-random number.  The @code{__builtin_darn_raw} function provides a
-64-bit raw random number.  The @code{__builtin_darn_32} function
-provides a 32-bit random number.
-
-The @code{scalar_extract_exp} and @code{scalar_extract_sig}
-functions require a 64-bit environment supporting ISA 3.0 or later.
-The @code{scalar_extract_exp} and @code{scalar_extract_sig} built-in
-functions return the significand and the biased exponent value
-respectively of their @code{source} arguments.
-When supplied with a 64-bit @code{source} argument, the
-result returned by @code{scalar_extract_sig} has
-the @code{0x0010000000000000} bit set if the
-function's @code{source} argument is in normalized form.
-Otherwise, this bit is set to 0.
-When supplied with a 128-bit @code{source} argument, the
-@code{0x00010000000000000000000000000000} bit of the result is
-treated similarly.
-Note that the sign of the significand is not represented in the result
-returned from the @code{scalar_extract_sig} function.  Use the
-@code{scalar_test_neg} function to test the sign of its @code{double}
-argument.
-
-The @code{scalar_insert_exp}
-functions require a 64-bit environment supporting ISA 3.0 or later.
-When supplied with a 64-bit first argument, the
-@code{scalar_insert_exp} built-in function returns a double-precision
-floating point value that is constructed by assembling the values of its
-@code{significand} and @code{exponent} arguments.  The sign of the
-result is copied from the most significant bit of the
-@code{significand} argument.  The significand and exponent components
-of the result are composed of the least significant 11 bits of the
-@code{exponent} argument and the least significant 52 bits of the
-@code{significand} argument respectively.
-
-When supplied with a 128-bit first argument, the
-@code{scalar_insert_exp} built-in function returns a quad-precision
-ieee floating point value.  The sign bit of the result is copied from
-the most significant bit of the @code{significand} argument.
-The significand and exponent components of the result are composed of
-the least significant 15 bits of the @code{exponent} argument and the
-least significant 112 bits of the @code{significand} argument respectively.
-
-The @code{scalar_cmp_exp_gt}, @code{scalar_cmp_exp_lt},
-@code{scalar_cmp_exp_eq}, and @code{scalar_cmp_exp_unordered} built-in
-functions return a non-zero value if @code{arg1} is greater than, less
-than, equal to, or not comparable to @code{arg2} respectively.  The
-arguments are not comparable if one or the other equals NaN (not a
-number).
-
-The @code{scalar_test_data_class} built-in function returns 1
-if any of the condition tests enabled by the value of the
-@code{condition} variable are true, and 0 otherwise.  The
-@code{condition} argument must be a compile-time constant integer with
-value not exceeding 127.  The
-@code{condition} argument is encoded as a bitmask with each bit
-enabling the testing of a different condition, as characterized by the
-following:
-@smallexample
-0x40    Test for NaN
-0x20    Test for +Infinity
-0x10    Test for -Infinity
-0x08    Test for +Zero
-0x04    Test for -Zero
-0x02    Test for +Denormal
-0x01    Test for -Denormal
-@end smallexample
-
-The @code{scalar_test_neg} built-in function returns 1 if its
-@code{source} argument holds a negative value, 0 otherwise.
-
 The @code{__builtin_byte_in_set} function requires a
 64-bit environment supporting ISA 3.0 or later.  This function returns
 a non-zero value if and only if its @code{u} argument exactly equals one of
@@ -16024,241 +15995,8 @@ The @code{__builtin_dfp_dtstsfi_ov_dd} and
 require that the type of the @code{value} argument be
 @code{__Decimal64} and @code{__Decimal128} respectively.
 
-The following built-in functions are also available for the PowerPC family
-of processors, starting with ISA 3.0 or later
-(@option{-mcpu=power9}).  These string functions are described
-separately in order to group the descriptions closer to the function
-prototypes:
-@smallexample
-int vec_all_nez (vector signed char, vector signed char);
-int vec_all_nez (vector unsigned char, vector unsigned char);
-int vec_all_nez (vector signed short, vector signed short);
-int vec_all_nez (vector unsigned short, vector unsigned short);
-int vec_all_nez (vector signed int, vector signed int);
-int vec_all_nez (vector unsigned int, vector unsigned int);
 
-int vec_any_eqz (vector signed char, vector signed char);
-int vec_any_eqz (vector unsigned char, vector unsigned char);
-int vec_any_eqz (vector signed short, vector signed short);
-int vec_any_eqz (vector unsigned short, vector unsigned short);
-int vec_any_eqz (vector signed int, vector signed int);
-int vec_any_eqz (vector unsigned int, vector unsigned int);
 
-vector bool char vec_cmpnez (vector signed char arg1, vector signed char arg2);
-vector bool char vec_cmpnez (vector unsigned char arg1, vector unsigned char arg2);
-vector bool short vec_cmpnez (vector signed short arg1, vector signed short arg2);
-vector bool short vec_cmpnez (vector unsigned short arg1, vector unsigned short arg2);
-vector bool int vec_cmpnez (vector signed int arg1, vector signed int arg2);
-vector bool int vec_cmpnez (vector unsigned int, vector unsigned int);
-
-vector signed char vec_cnttz (vector signed char);
-vector unsigned char vec_cnttz (vector unsigned char);
-vector signed short vec_cnttz (vector signed short);
-vector unsigned short vec_cnttz (vector unsigned short);
-vector signed int vec_cnttz (vector signed int);
-vector unsigned int vec_cnttz (vector unsigned int);
-vector signed long long vec_cnttz (vector signed long long);
-vector unsigned long long vec_cnttz (vector unsigned long long);
-
-signed int vec_cntlz_lsbb (vector signed char);
-signed int vec_cntlz_lsbb (vector unsigned char);
-
-signed int vec_cnttz_lsbb (vector signed char);
-signed int vec_cnttz_lsbb (vector unsigned char);
-
-unsigned int vec_first_match_index (vector signed char, vector signed char);
-unsigned int vec_first_match_index (vector unsigned char,
-                                    vector unsigned char);
-unsigned int vec_first_match_index (vector signed int, vector signed int);
-unsigned int vec_first_match_index (vector unsigned int, vector unsigned int);
-unsigned int vec_first_match_index (vector signed short, vector signed short);
-unsigned int vec_first_match_index (vector unsigned short,
-                                    vector unsigned short);
-unsigned int vec_first_match_or_eos_index (vector signed char,
-                                           vector signed char);
-unsigned int vec_first_match_or_eos_index (vector unsigned char,
-                                           vector unsigned char);
-unsigned int vec_first_match_or_eos_index (vector signed int,
-                                           vector signed int);
-unsigned int vec_first_match_or_eos_index (vector unsigned int,
-                                           vector unsigned int);
-unsigned int vec_first_match_or_eos_index (vector signed short,
-                                           vector signed short);
-unsigned int vec_first_match_or_eos_index (vector unsigned short,
-                                           vector unsigned short);
-unsigned int vec_first_mismatch_index (vector signed char,
-                                       vector signed char);
-unsigned int vec_first_mismatch_index (vector unsigned char,
-                                       vector unsigned char);
-unsigned int vec_first_mismatch_index (vector signed int,
-                                       vector signed int);
-unsigned int vec_first_mismatch_index (vector unsigned int,
-                                       vector unsigned int);
-unsigned int vec_first_mismatch_index (vector signed short,
-                                       vector signed short);
-unsigned int vec_first_mismatch_index (vector unsigned short,
-                                       vector unsigned short);
-unsigned int vec_first_mismatch_or_eos_index (vector signed char,
-                                              vector signed char);
-unsigned int vec_first_mismatch_or_eos_index (vector unsigned char,
-                                              vector unsigned char);
-unsigned int vec_first_mismatch_or_eos_index (vector signed int,
-                                              vector signed int);
-unsigned int vec_first_mismatch_or_eos_index (vector unsigned int,
-                                              vector unsigned int);
-unsigned int vec_first_mismatch_or_eos_index (vector signed short,
-                                              vector signed short);
-unsigned int vec_first_mismatch_or_eos_index (vector unsigned short,
-                                              vector unsigned short);
-
-vector unsigned short vec_pack_to_short_fp32 (vector float, vector float);
-
-vector signed char vec_xl_be (signed long long, signed char *);
-vector unsigned char vec_xl_be (signed long long, unsigned char *);
-vector signed int vec_xl_be (signed long long, signed int *);
-vector unsigned int vec_xl_be (signed long long, unsigned int *);
-vector signed __int128 vec_xl_be (signed long long, signed __int128 *);
-vector unsigned __int128 vec_xl_be (signed long long, unsigned __int128 *);
-vector signed long long vec_xl_be (signed long long, signed long long *);
-vector unsigned long long vec_xl_be (signed long long, unsigned long long *);
-vector signed short vec_xl_be (signed long long, signed short *);
-vector unsigned short vec_xl_be (signed long long, unsigned short *);
-vector double vec_xl_be (signed long long, double *);
-vector float vec_xl_be (signed long long, float *);
-
-vector signed char vec_xl_len (signed char *addr, size_t len);
-vector unsigned char vec_xl_len (unsigned char *addr, size_t len);
-vector signed int vec_xl_len (signed int *addr, size_t len);
-vector unsigned int vec_xl_len (unsigned int *addr, size_t len);
-vector signed __int128 vec_xl_len (signed __int128 *addr, size_t len);
-vector unsigned __int128 vec_xl_len (unsigned __int128 *addr, size_t len);
-vector signed long long vec_xl_len (signed long long *addr, size_t len);
-vector unsigned long long vec_xl_len (unsigned long long *addr, size_t len);
-vector signed short vec_xl_len (signed short *addr, size_t len);
-vector unsigned short vec_xl_len (unsigned short *addr, size_t len);
-vector double vec_xl_len (double *addr, size_t len);
-vector float vec_xl_len (float *addr, size_t len);
-
-vector unsigned char vec_xl_len_r (unsigned char *addr, size_t len);
-
-void vec_xst_len (vector signed char data, signed char *addr, size_t len);
-void vec_xst_len (vector unsigned char data, unsigned char *addr, size_t len);
-void vec_xst_len (vector signed int data, signed int *addr, size_t len);
-void vec_xst_len (vector unsigned int data, unsigned int *addr, size_t len);
-void vec_xst_len (vector unsigned __int128 data, unsigned __int128 *addr, size_t len);
-void vec_xst_len (vector signed long long data, signed long long *addr, size_t len);
-void vec_xst_len (vector unsigned long long data, unsigned long long *addr, size_t len);
-void vec_xst_len (vector signed short data, signed short *addr, size_t len);
-void vec_xst_len (vector unsigned short data, unsigned short *addr, size_t len);
-void vec_xst_len (vector signed __int128 data, signed __int128 *addr, size_t len);
-void vec_xst_len (vector double data, double *addr, size_t len);
-void vec_xst_len (vector float data, float *addr, size_t len);
-
-void vec_xst_len_r (vector unsigned char data, unsigned char *addr, size_t len);
-
-signed char vec_xlx (unsigned int index, vector signed char data);
-unsigned char vec_xlx (unsigned int index, vector unsigned char data);
-signed short vec_xlx (unsigned int index, vector signed short data);
-unsigned short vec_xlx (unsigned int index, vector unsigned short data);
-signed int vec_xlx (unsigned int index, vector signed int data);
-unsigned int vec_xlx (unsigned int index, vector unsigned int data);
-float vec_xlx (unsigned int index, vector float data);
-
-signed char vec_xrx (unsigned int index, vector signed char data);
-unsigned char vec_xrx (unsigned int index, vector unsigned char data);
-signed short vec_xrx (unsigned int index, vector signed short data);
-unsigned short vec_xrx (unsigned int index, vector unsigned short data);
-signed int vec_xrx (unsigned int index, vector signed int data);
-unsigned int vec_xrx (unsigned int index, vector unsigned int data);
-float vec_xrx (unsigned int index, vector float data);
-@end smallexample
-
-The @code{vec_all_nez}, @code{vec_any_eqz}, and @code{vec_cmpnez}
-perform pairwise comparisons between the elements at the same
-positions within their two vector arguments.
-The @code{vec_all_nez} function returns a
-non-zero value if and only if all pairwise comparisons are not
-equal and no element of either vector argument contains a zero.
-The @code{vec_any_eqz} function returns a
-non-zero value if and only if at least one pairwise comparison is equal
-or if at least one element of either vector argument contains a zero.
-The @code{vec_cmpnez} function returns a vector of the same type as
-its two arguments, within which each element consists of all ones to
-denote that either the corresponding elements of the incoming arguments are
-not equal or that at least one of the corresponding elements contains
-zero.  Otherwise, the element of the returned vector contains all zeros.
-
-The @code{vec_cntlz_lsbb} function returns the count of the number of
-consecutive leading byte elements (starting from position 0 within the
-supplied vector argument) for which the least-significant bit
-equals zero.  The @code{vec_cnttz_lsbb} function returns the count of
-the number of consecutive trailing byte elements (starting from
-position 15 and counting backwards within the supplied vector
-argument) for which the least-significant bit equals zero.
-
-The @code{vec_xl_len} and @code{vec_xst_len} functions require a
-64-bit environment supporting ISA 3.0 or later.  The @code{vec_xl_len}
-function loads a variable length vector from memory.  The
-@code{vec_xst_len} function stores a variable length vector to memory.
-With both the @code{vec_xl_len} and @code{vec_xst_len} functions, the
-@code{addr} argument represents the memory address to or from which
-data will be transferred, and the
-@code{len} argument represents the number of bytes to be
-transferred, as computed by the C expression @code{min((len & 0xff), 16)}.
-If this expression's value is not a multiple of the vector element's
-size, the behavior of this function is undefined.
-In the case that the underlying computer is configured to run in
-big-endian mode, the data transfer moves bytes 0 to @code{(len - 1)} of
-the corresponding vector.  In little-endian mode, the data transfer
-moves bytes @code{(16 - len)} to @code{15} of the corresponding
-vector.  For the load function, any bytes of the result vector that
-are not loaded from memory are set to zero.
-The value of the @code{addr} argument need not be aligned on a
-multiple of the vector's element size.
-
-The @code{vec_xlx} and @code{vec_xrx} functions extract the single
-element selected by the @code{index} argument from the vector
-represented by the @code{data} argument.  The @code{index} argument
-always specifies a byte offset, regardless of the size of the vector
-element.  With @code{vec_xlx}, @code{index} is the offset of the first
-byte of the element to be extracted.  With @code{vec_xrx}, @code{index}
-represents the last byte of the element to be extracted, measured
-from the right end of the vector.  In other words, the last byte of
-the element to be extracted is found at position @code{(15 - index)}.
-There is no requirement that @code{index} be a multiple of the vector
-element size.  However, if the size of the vector element added to
-@code{index} is greater than 15, the content of the returned value is
-undefined.
-
-The following built-in functions are available for the PowerPC family
-of processors when hardware decimal floating point
-(@option{-mhard-dfp}) is available:
-@smallexample
-long long __builtin_dxex (_Decimal64);
-long long __builtin_dxexq (_Decimal128);
-_Decimal64 __builtin_ddedpd (int, _Decimal64);
-_Decimal128 __builtin_ddedpdq (int, _Decimal128);
-_Decimal64 __builtin_denbcd (int, _Decimal64);
-_Decimal128 __builtin_denbcdq (int, _Decimal128);
-_Decimal64 __builtin_diex (long long, _Decimal64);
-_Decimal128 _builtin_diexq (long long, _Decimal128);
-_Decimal64 __builtin_dscli (_Decimal64, int);
-_Decimal128 __builtin_dscliq (_Decimal128, int);
-_Decimal64 __builtin_dscri (_Decimal64, int);
-_Decimal128 __builtin_dscriq (_Decimal128, int);
-unsigned long long __builtin_unpack_dec128 (_Decimal128, int);
-_Decimal128 __builtin_pack_dec128 (unsigned long long, unsigned long long);
-@end smallexample
-
-The following built-in functions are available for the PowerPC family
-of processors when the Vector Scalar (vsx) instruction set is
-available:
-@smallexample
-unsigned long long __builtin_unpack_vector_int128 (vector __int128_t, int);
-vector __int128_t __builtin_pack_vector_int128 (unsigned long long,
-                                                unsigned long long);
-@end smallexample
-
 @node PowerPC AltiVec/VSX Built-in Functions
 @subsection PowerPC AltiVec Built-in Functions
 
@@ -19030,6 +18768,312 @@ int __builtin_bcdsub_gt (vector __int128_t, vector
 int __builtin_bcdsub_ov (vector __int128_t, vector __int128_t);
 @end smallexample
 
+The following additional built-in functions are also available for the
+PowerPC family of processors, starting with ISA 3.0
+(@option{-mcpu=power9}) or later:
+@smallexample
+unsigned int scalar_extract_exp (double source);
+unsigned long long int scalar_extract_exp (__ieee128 source);
+
+unsigned long long int scalar_extract_sig (double source);
+unsigned __int128 scalar_extract_sig (__ieee128 source);
+
+double
+scalar_insert_exp (unsigned long long int significand, unsigned long long int exponent);
+double
+scalar_insert_exp (double significand, unsigned long long int exponent);
+
+__ieee128
+scalar_insert_exp (unsigned __int128 significand, unsigned long long int exponent);
+__ieee128
+scalar_insert_exp (__ieee128 significand, unsigned long long int exponent);
+
+int scalar_cmp_exp_gt (double arg1, double arg2);
+int scalar_cmp_exp_lt (double arg1, double arg2);
+int scalar_cmp_exp_eq (double arg1, double arg2);
+int scalar_cmp_exp_unordered (double arg1, double arg2);
+
+bool scalar_test_data_class (float source, const int condition);
+bool scalar_test_data_class (double source, const int condition);
+bool scalar_test_data_class (__ieee128 source, const int condition);
+
+bool scalar_test_neg (float source);
+bool scalar_test_neg (double source);
+bool scalar_test_neg (__ieee128 source);
+@end smallexample
+
+The @code{scalar_extract_exp} and @code{scalar_extract_sig}
+functions require a 64-bit environment supporting ISA 3.0 or later.
+The @code{scalar_extract_exp} and @code{scalar_extract_sig} built-in
+functions return the biased exponent value and the significand
+respectively of their @code{source} arguments.
+When supplied with a 64-bit @code{source} argument, the
+result returned by @code{scalar_extract_sig} has
+the @code{0x0010000000000000} bit set if the
+function's @code{source} argument is in normalized form.
+Otherwise, this bit is set to 0.
+When supplied with a 128-bit @code{source} argument, the
+@code{0x00010000000000000000000000000000} bit of the result is
+treated similarly.
+Note that the sign of the significand is not represented in the result
+returned from the @code{scalar_extract_sig} function.  Use the
+@code{scalar_test_neg} function to test the sign of its @code{double}
+argument.
+
+The @code{scalar_insert_exp}
+functions require a 64-bit environment supporting ISA 3.0 or later.
+When supplied with a 64-bit first argument, the
+@code{scalar_insert_exp} built-in function returns a double-precision
+floating point value that is constructed by assembling the values of its
+@code{significand} and @code{exponent} arguments.  The sign of the
+result is copied from the most significant bit of the
+@code{significand} argument.  The exponent and significand components
+of the result are composed of the least significant 11 bits of the
+@code{exponent} argument and the least significant 52 bits of the
+@code{significand} argument respectively.
+
+When supplied with a 128-bit first argument, the
+@code{scalar_insert_exp} built-in function returns a quad-precision
+IEEE floating point value.  The sign bit of the result is copied from
+the most significant bit of the @code{significand} argument.
+The exponent and significand components of the result are composed of
+the least significant 15 bits of the @code{exponent} argument and the
+least significant 112 bits of the @code{significand} argument respectively.
+
+The @code{scalar_cmp_exp_gt}, @code{scalar_cmp_exp_lt},
+@code{scalar_cmp_exp_eq}, and @code{scalar_cmp_exp_unordered} built-in
+functions return a non-zero value if @code{arg1} is greater than, less
+than, equal to, or not comparable to @code{arg2} respectively.  The
+arguments are not comparable if one or the other equals NaN (not a
+number).
+
+The @code{scalar_test_data_class} built-in function returns 1
+if any of the condition tests enabled by the value of the
+@code{condition} variable are true, and 0 otherwise.  The
+@code{condition} argument must be a compile-time constant integer with
+value not exceeding 127.  The
+@code{condition} argument is encoded as a bitmask with each bit
+enabling the testing of a different condition, as characterized by the
+following:
+@smallexample
+0x40    Test for NaN
+0x20    Test for +Infinity
+0x10    Test for -Infinity
+0x08    Test for +Zero
+0x04    Test for -Zero
+0x02    Test for +Denormal
+0x01    Test for -Denormal
+@end smallexample
+
+The @code{scalar_test_neg} built-in function returns 1 if its
+@code{source} argument holds a negative value, 0 otherwise.
+
+The following built-in functions are also available for the PowerPC family
+of processors, starting with ISA 3.0 or later
+(@option{-mcpu=power9}).  These string functions are described
+separately in order to group the descriptions closer to the function
+prototypes:
+@smallexample
+int vec_all_nez (vector signed char, vector signed char);
+int vec_all_nez (vector unsigned char, vector unsigned char);
+int vec_all_nez (vector signed short, vector signed short);
+int vec_all_nez (vector unsigned short, vector unsigned short);
+int vec_all_nez (vector signed int, vector signed int);
+int vec_all_nez (vector unsigned int, vector unsigned int);
+
+int vec_any_eqz (vector signed char, vector signed char);
+int vec_any_eqz (vector unsigned char, vector unsigned char);
+int vec_any_eqz (vector signed short, vector signed short);
+int vec_any_eqz (vector unsigned short, vector unsigned short);
+int vec_any_eqz (vector signed int, vector signed int);
+int vec_any_eqz (vector unsigned int, vector unsigned int);
+
+vector bool char vec_cmpnez (vector signed char arg1, vector signed char arg2);
+vector bool char vec_cmpnez (vector unsigned char arg1, vector unsigned char arg2);
+vector bool short vec_cmpnez (vector signed short arg1, vector signed short arg2);
+vector bool short vec_cmpnez (vector unsigned short arg1, vector unsigned short arg2);
+vector bool int vec_cmpnez (vector signed int arg1, vector signed int arg2);
+vector bool int vec_cmpnez (vector unsigned int, vector unsigned int);
+
+vector signed char vec_cnttz (vector signed char);
+vector unsigned char vec_cnttz (vector unsigned char);
+vector signed short vec_cnttz (vector signed short);
+vector unsigned short vec_cnttz (vector unsigned short);
+vector signed int vec_cnttz (vector signed int);
+vector unsigned int vec_cnttz (vector unsigned int);
+vector signed long long vec_cnttz (vector signed long long);
+vector unsigned long long vec_cnttz (vector unsigned long long);
+
+signed int vec_cntlz_lsbb (vector signed char);
+signed int vec_cntlz_lsbb (vector unsigned char);
+
+signed int vec_cnttz_lsbb (vector signed char);
+signed int vec_cnttz_lsbb (vector unsigned char);
+
+unsigned int vec_first_match_index (vector signed char, vector signed char);
+unsigned int vec_first_match_index (vector unsigned char,
+                                    vector unsigned char);
+unsigned int vec_first_match_index (vector signed int, vector signed int);
+unsigned int vec_first_match_index (vector unsigned int, vector unsigned int);
+unsigned int vec_first_match_index (vector signed short, vector signed short);
+unsigned int vec_first_match_index (vector unsigned short,
+                                    vector unsigned short);
+unsigned int vec_first_match_or_eos_index (vector signed char,
+                                           vector signed char);
+unsigned int vec_first_match_or_eos_index (vector unsigned char,
+                                           vector unsigned char);
+unsigned int vec_first_match_or_eos_index (vector signed int,
+                                           vector signed int);
+unsigned int vec_first_match_or_eos_index (vector unsigned int,
+                                           vector unsigned int);
+unsigned int vec_first_match_or_eos_index (vector signed short,
+                                           vector signed short);
+unsigned int vec_first_match_or_eos_index (vector unsigned short,
+                                           vector unsigned short);
+unsigned int vec_first_mismatch_index (vector signed char,
+                                       vector signed char);
+unsigned int vec_first_mismatch_index (vector unsigned char,
+                                       vector unsigned char);
+unsigned int vec_first_mismatch_index (vector signed int,
+                                       vector signed int);
+unsigned int vec_first_mismatch_index (vector unsigned int,
+                                       vector unsigned int);
+unsigned int vec_first_mismatch_index (vector signed short,
+                                       vector signed short);
+unsigned int vec_first_mismatch_index (vector unsigned short,
+                                       vector unsigned short);
+unsigned int vec_first_mismatch_or_eos_index (vector signed char,
+                                              vector signed char);
+unsigned int vec_first_mismatch_or_eos_index (vector unsigned char,
+                                              vector unsigned char);
+unsigned int vec_first_mismatch_or_eos_index (vector signed int,
+                                              vector signed int);
+unsigned int vec_first_mismatch_or_eos_index (vector unsigned int,
+                                              vector unsigned int);
+unsigned int vec_first_mismatch_or_eos_index (vector signed short,
+                                              vector signed short);
+unsigned int vec_first_mismatch_or_eos_index (vector unsigned short,
+                                              vector unsigned short);
+
+vector unsigned short vec_pack_to_short_fp32 (vector float, vector float);
+
+vector signed char vec_xl_be (signed long long, signed char *);
+vector unsigned char vec_xl_be (signed long long, unsigned char *);
+vector signed int vec_xl_be (signed long long, signed int *);
+vector unsigned int vec_xl_be (signed long long, unsigned int *);
+vector signed __int128 vec_xl_be (signed long long, signed __int128 *);
+vector unsigned __int128 vec_xl_be (signed long long, unsigned __int128 *);
+vector signed long long vec_xl_be (signed long long, signed long long *);
+vector unsigned long long vec_xl_be (signed long long, unsigned long long *);
+vector signed short vec_xl_be (signed long long, signed short *);
+vector unsigned short vec_xl_be (signed long long, unsigned short *);
+vector double vec_xl_be (signed long long, double *);
+vector float vec_xl_be (signed long long, float *);
+
+vector signed char vec_xl_len (signed char *addr, size_t len);
+vector unsigned char vec_xl_len (unsigned char *addr, size_t len);
+vector signed int vec_xl_len (signed int *addr, size_t len);
+vector unsigned int vec_xl_len (unsigned int *addr, size_t len);
+vector signed __int128 vec_xl_len (signed __int128 *addr, size_t len);
+vector unsigned __int128 vec_xl_len (unsigned __int128 *addr, size_t len);
+vector signed long long vec_xl_len (signed long long *addr, size_t len);
+vector unsigned long long vec_xl_len (unsigned long long *addr, size_t len);
+vector signed short vec_xl_len (signed short *addr, size_t len);
+vector unsigned short vec_xl_len (unsigned short *addr, size_t len);
+vector double vec_xl_len (double *addr, size_t len);
+vector float vec_xl_len (float *addr, size_t len);
+
+vector unsigned char vec_xl_len_r (unsigned char *addr, size_t len);
+
+void vec_xst_len (vector signed char data, signed char *addr, size_t len);
+void vec_xst_len (vector unsigned char data, unsigned char *addr, size_t len);
+void vec_xst_len (vector signed int data, signed int *addr, size_t len);
+void vec_xst_len (vector unsigned int data, unsigned int *addr, size_t len);
+void vec_xst_len (vector unsigned __int128 data, unsigned __int128 *addr, size_t len);
+void vec_xst_len (vector signed long long data, signed long long *addr, size_t len);
+void vec_xst_len (vector unsigned long long data, unsigned long long *addr, size_t len);
+void vec_xst_len (vector signed short data, signed short *addr, size_t len);
+void vec_xst_len (vector unsigned short data, unsigned short *addr, size_t len);
+void vec_xst_len (vector signed __int128 data, signed __int128 *addr, size_t len);
+void vec_xst_len (vector double data, double *addr, size_t len);
+void vec_xst_len (vector float data, float *addr, size_t len);
+
+void vec_xst_len_r (vector unsigned char data, unsigned char *addr, size_t len);
+
+signed char vec_xlx (unsigned int index, vector signed char data);
+unsigned char vec_xlx (unsigned int index, vector unsigned char data);
+signed short vec_xlx (unsigned int index, vector signed short data);
+unsigned short vec_xlx (unsigned int index, vector unsigned short data);
+signed int vec_xlx (unsigned int index, vector signed int data);
+unsigned int vec_xlx (unsigned int index, vector unsigned int data);
+float vec_xlx (unsigned int index, vector float data);
+
+signed char vec_xrx (unsigned int index, vector signed char data);
+unsigned char vec_xrx (unsigned int index, vector unsigned char data);
+signed short vec_xrx (unsigned int index, vector signed short data);
+unsigned short vec_xrx (unsigned int index, vector unsigned short data);
+signed int vec_xrx (unsigned int index, vector signed int data);
+unsigned int vec_xrx (unsigned int index, vector unsigned int data);
+float vec_xrx (unsigned int index, vector float data);
+@end smallexample
+
+The @code{vec_all_nez}, @code{vec_any_eqz}, and @code{vec_cmpnez} functions
+perform pairwise comparisons between the elements at the same
+positions within their two vector arguments.
+The @code{vec_all_nez} function returns a
+non-zero value if and only if all pairwise comparisons are not
+equal and no element of either vector argument contains a zero.
+The @code{vec_any_eqz} function returns a
+non-zero value if and only if at least one pairwise comparison is equal
+or if at least one element of either vector argument contains a zero.
+The @code{vec_cmpnez} function returns a vector of the same type as
+its two arguments, within which each element consists of all ones to
+denote that either the corresponding elements of the incoming arguments are
+not equal or that at least one of the corresponding elements contains
+zero.  Otherwise, the element of the returned vector contains all zeros.
+
+The @code{vec_cntlz_lsbb} function returns the count of the number of
+consecutive leading byte elements (starting from position 0 within the
+supplied vector argument) for which the least-significant bit
+equals zero.  The @code{vec_cnttz_lsbb} function returns the count of
+the number of consecutive trailing byte elements (starting from
+position 15 and counting backwards within the supplied vector
+argument) for which the least-significant bit equals zero.
+
+The @code{vec_xl_len} and @code{vec_xst_len} functions require a
+64-bit environment supporting ISA 3.0 or later.  The @code{vec_xl_len}
+function loads a variable length vector from memory.  The
+@code{vec_xst_len} function stores a variable length vector to memory.
+With both the @code{vec_xl_len} and @code{vec_xst_len} functions, the
+@code{addr} argument represents the memory address to or from which
+data will be transferred, and the
+@code{len} argument represents the number of bytes to be
+transferred, as computed by the C expression @code{min((len & 0xff), 16)}.
+If this expression's value is not a multiple of the vector element's
+size, the behavior of this function is undefined.
+In the case that the underlying computer is configured to run in
+big-endian mode, the data transfer moves bytes 0 to @code{(len - 1)} of
+the corresponding vector.  In little-endian mode, the data transfer
+moves bytes @code{(16 - len)} to @code{15} of the corresponding
+vector.  For the load function, any bytes of the result vector that
+are not loaded from memory are set to zero.
+The value of the @code{addr} argument need not be aligned on a
+multiple of the vector's element size.
+
+The @code{vec_xlx} and @code{vec_xrx} functions extract the single
+element selected by the @code{index} argument from the vector
+represented by the @code{data} argument.  The @code{index} argument
+always specifies a byte offset, regardless of the size of the vector
+element.  With @code{vec_xlx}, @code{index} is the offset of the first
+byte of the element to be extracted.  With @code{vec_xrx}, @code{index}
+represents the last byte of the element to be extracted, measured
+from the right end of the vector.  In other words, the last byte of
+the element to be extracted is found at position @code{(15 - index)}.
+There is no requirement that @code{index} be a multiple of the vector
+element size.  However, if the size of the vector element added to
+@code{index} is greater than 15, the content of the returned value is
+undefined.
+
 If the ISA 3.0 instruction set additions (@option{-mcpu=power9})
 are available:
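
The scalar_extract_exp, scalar_extract_sig and scalar_test_data_class
descriptions in the hunk above can be cross-checked against a portable
model.  The sketch below is not part of the patch.  It assumes that double
is IEEE 754 binary64, the double_bits and model_* names are hypothetical,
and it mirrors only what the documentation text states for the double
overloads.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a double as its 64-bit pattern.
   Assumes double is IEEE 754 binary64.  */
static uint64_t double_bits (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* Model of scalar_extract_exp (double): the 11-bit biased exponent.  */
static unsigned int model_extract_exp (double x)
{
  return (unsigned int) ((double_bits (x) >> 52) & 0x7ff);
}

/* Model of scalar_extract_sig (double): the 52 fraction bits, with the
   0x0010000000000000 bit set only for normalized inputs.  The sign is
   not represented in the result.  */
static uint64_t model_extract_sig (double x)
{
  uint64_t bits = double_bits (x);
  uint64_t frac = bits & 0x000fffffffffffffULL;
  unsigned int exp = (unsigned int) ((bits >> 52) & 0x7ff);

  if (exp != 0 && exp != 0x7ff)
    frac |= 0x0010000000000000ULL;
  return frac;
}

/* Model of scalar_test_data_class (double, condition): classify the
   input into at most one of the seven documented mask bits, then
   report whether that bit is enabled in the condition mask.  */
static int model_test_data_class (double x, int condition)
{
  uint64_t bits = double_bits (x);
  int negative = (int) (bits >> 63);
  unsigned int exp = (unsigned int) ((bits >> 52) & 0x7ff);
  uint64_t frac = bits & 0x000fffffffffffffULL;
  int cls = 0;                          /* Normal numbers match no bit.  */

  if (exp == 0x7ff)
    cls = frac ? 0x40 : (negative ? 0x10 : 0x20);   /* NaN, -Inf, +Inf */
  else if (exp == 0)
    cls = frac ? (negative ? 0x01 : 0x02)           /* -/+ Denormal */
               : (negative ? 0x04 : 0x08);          /* -/+ Zero */
  return (condition & cls) != 0;
}
```

A condition of 0x0c, for example, asks whether the argument is a zero of
either sign, and 0x7f accepts everything except normal numbers.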

Comments

Kelvin Nilsen April 24, 2018, 7:25 p.m. UTC | #1
I'm updating this patch to make two improvements to what was submitted
earlier today:

1. Correct the description of the htm-no-suspend CPU feature.

2. Add a comment to clarify that the __builtin_divde and __builtin_divdeu
   built-in functions require 64-bit targets.

Everything else is the same as submitted previously.
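
To illustrate item 2, here is a minimal sketch (not part of the patch) of
how a caller might guard __builtin_divdeu.  The wrapper name
divide_extended_u is hypothetical, the _ARCH_PWR7 test is an assumption
standing in for "ISA 2.06 with -mpopcntd", and the fallback assumes a
compiler that provides unsigned __int128.  It models the divide-extended
operation as (hi << 64) / divisor, which is only meaningful when
hi < divisor.

```c
#include <stdint.h>

/* Hypothetical wrapper around the unsigned divide-extended operation.
   The dividend is HI followed by 64 zero bits.  The result is
   meaningful only when HI < DIVISOR; DIVISOR must not be zero.  */
static uint64_t divide_extended_u (uint64_t hi, uint64_t divisor)
{
#if defined (__powerpc64__) && defined (_ARCH_PWR7)
  /* 64-bit PowerPC target with ISA 2.06: use the built-in directly.  */
  return __builtin_divdeu (hi, divisor);
#else
  /* Portable model for other targets (assumes unsigned __int128).  */
  return (uint64_t) (((unsigned __int128) hi << 64) / divisor);
#endif
}
```

On a 32-bit PowerPC target the first branch is never selected, so the
built-in that needs a 64-bit environment is not referenced there.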

On 4/24/18 9:12 AM, Kelvin Nilsen wrote:
> This is the first of several patches to address shortcomings in existing
> documentation of PowerPC built-in functions.  The focus of this particular
> patch is to improve documentation of low-level built-in functions that do
> not require special include headers.
>
> A summary of this patch follows:
>
> 1. Change the name of the first PowerPC built-in section from "PowerPC
>    Built-in Functions" to "Low-Level PowerPC Built-in Functions".  This
>    section has never described all PowerPC built-in functions.
>
> 2. Introduce subsubsections within this section to independently describe
>    built-in functions that target particular ISA levels.  Sort function
>    descriptions into appropriate subsubsections.
>
> 3. Add descriptions of three new features that can be tested with the
>    __builtin_cpu_supports function: darn, htm-no-suspend, and scv.
>
> 4. Remove descriptions of built-in function that do not belong in this
>    section because the built-in functions are generic (not specific to
>    PowerPC): __builtin_fabsq, __builtin_copysignq, __builtin_infq,
>    __builtin_huge_valq, __builtin_nanq, __builtin_nansq,
>    __builtin_sqrtf128, __builtin_fmaf128.
>
> 5. Corrected the spellings of several built-in functions:
>    __builtin_fmaf128_round_to_odd, __builtin_addg6s, __builtin_cbctdt,
>    __builtin_cdtbcd.
>
> This patch is limited in scope in order to manage complexity of the
> diffs.  Subsequent patches will address different sections of the
> documentation.  Subsequent patches will also add new function
> descriptions into these sections.
>
> This patch affects only extend.texi.  The gcc.pdf file has been built
> and reviewed.
>
> Is this ok for the trunk?
gcc/ChangeLog:

2018-04-24  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* doc/extend.texi: Tidy documentation of PowerPC built-in functions.

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 259504)
+++ gcc/doc/extend.texi	(working copy)
@@ -15524,12 +15524,17 @@ implementing assertions.
 
 @end table
 
-@node PowerPC Built-in Functions
-@subsection PowerPC Built-in Functions
+@node Low-Level PowerPC Built-in Functions
+@subsection Low-Level PowerPC Built-in Functions
 
-The following built-in functions are always available and can be used to
-check the PowerPC target platform type:
+This section describes PowerPC built-in functions that do not require
+the inclusion of any special header files to declare prototypes or
+provide macro definitions.  The sections that follow describe
+additional PowerPC built-in functions.
 
+@node Low-Level PowerPC Built-in Functions Available on all Targets
+@subsubsection Low-Level PowerPC Built-in Functions Available on all Targets
+
 @deftypefn {Built-in Function} void __builtin_cpu_init (void)
 This function is a @code{nop} on the PowerPC platform and is included solely
 to maintain API compatibility with the x86 builtins.
@@ -15633,6 +15638,8 @@ CPU supports the set of compatible performance mon
 CPU supports the Embedded ISA category.
 @item cellbe
 CPU has a CELL broadband engine.
+@item darn
+CPU supports the darn (deliver a random number) instruction.
 @item dfp
 CPU has a decimal floating point unit.
 @item dscr
@@ -15649,6 +15656,9 @@ CPU has a floating point unit.
 CPU has hardware transaction memory instructions.
 @item htm-nosc
 Kernel aborts hardware transactions when a syscall is made.
+@item htm-no-suspend
+CPU supports hardware transaction memory but does not support the
+tsuspend. instruction.
 @item ic_snoop
 CPU supports icache snooping capabilities.
 @item ieee128
@@ -15677,6 +15687,8 @@ CPU supports the old POWER ISA (eg, 601)
 CPU supports 64-bit mode execution.
 @item ppcle
 CPU supports a little-endian mode that uses address swizzling.
+@item scv
+Kernel supports system call vectored.
 @item smt
 CPU support simultaneous multi-threading.
 @item spe
@@ -15708,19 +15720,81 @@ Here is an example:
 @end smallexample
 @end deftypefn
 
-These built-in functions are available for the PowerPC family of
+The following built-in functions are also available on all PowerPC
 processors:
 @smallexample
-float __builtin_recipdivf (float, float);
-float __builtin_rsqrtf (float);
-double __builtin_recipdiv (double, double);
-double __builtin_rsqrt (double);
 uint64_t __builtin_ppc_get_timebase ();
 unsigned long __builtin_ppc_mftb ();
-double __builtin_unpack_longdouble (long double, int);
-long double __builtin_pack_longdouble (double, double);
 @end smallexample
 
+The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
+functions generate instructions to read the Time Base Register.  The
+@code{__builtin_ppc_get_timebase} function may generate multiple
+instructions and always returns the 64 bits of the Time Base Register.
+The @code{__builtin_ppc_mftb} function always generates one instruction and
+returns the Time Base Register value as an unsigned long, throwing away
+the most significant word on 32-bit environments.
+
+@node Low-Level PowerPC Built-in Functions Available on ISA 2.05
+@subsubsection Low-Level PowerPC Built-in Functions Available on ISA 2.05
+
+The low-level built-in functions described in this section are
+available on the PowerPC family of processors starting with ISA 2.05
+or later.  Unless specific options are explicitly disabled on the
+command line, specifying option (@option{-mcpu=power6}) has the effect of
+enabling the (@option{-mpowerpc64}), (@option{-mpowerpc-gpopt}),
+(@option{-mpowerpc-gfxopt}), (@option{-mmfcrf}), (@option{-mpopcntb}),
+(@option{-mfprnd}), (@option{-mcmpb}), (@option{-mhard-dfp}), and
+(@option{-mrecip-precision}) options.  Specify the
+(@option{-maltivec}) and (@option{-mfpgpr}) options explicitly in
+combination with the above options if they are desired.
+
+The following functions require option (@option{-mcmpb}).
+@smallexample
+unsigned long long __builtin_cmpb (unsigned long long int, unsigned long long int);
+unsigned int __builtin_cmpb (unsigned int, unsigned int);
+@end smallexample
+
+The @code{__builtin_cmpb} function
+performs a byte-wise compare on the contents of its two arguments,
+returning the result of the byte-wise comparison as the returned
+value.  For each byte comparison, the corresponding byte of the return
+value holds 0xff if the input bytes are equal and 0 if the input bytes
+are not equal.  If either of the arguments to this built-in function
+is wider than 32 bits, the function call expands into the form that
+expects @code{unsigned long long int} arguments
+which is only available on 64-bit targets.
+
+The following built-in functions are available
+when hardware decimal floating point
+(@option{-mhard-dfp}) is available:
+@smallexample
+_Decimal64 __builtin_ddedpd (int, _Decimal64);
+_Decimal128 __builtin_ddedpdq (int, _Decimal128);
+_Decimal64 __builtin_denbcd (int, _Decimal64);
+_Decimal128 __builtin_denbcdq (int, _Decimal128);
+_Decimal64 __builtin_diex (long long, _Decimal64);
+_Decimal128 __builtin_diexq (long long, _Decimal128);
+_Decimal64 __builtin_dscli (_Decimal64, int);
+_Decimal128 __builtin_dscliq (_Decimal128, int);
+_Decimal64 __builtin_dscri (_Decimal64, int);
+_Decimal128 __builtin_dscriq (_Decimal128, int);
+long long __builtin_dxex (_Decimal64);
+long long __builtin_dxexq (_Decimal128);
+_Decimal128 __builtin_pack_dec128 (unsigned long long, unsigned long long);
+unsigned long long __builtin_unpack_dec128 (_Decimal128, int);
+@end smallexample
+
+The following functions require (@option{-mhard-float}),
+(@option{-mpowerpc-gfxopt}), and (@option{-mpopcntb}) options.
+
+@smallexample
+double __builtin_recipdiv (double, double);
+float __builtin_recipdivf (float, float);
+double __builtin_rsqrt (double);
+float __builtin_rsqrtf (float);
+@end smallexample
+
 The @code{vec_rsqrt}, @code{__builtin_rsqrt}, and
 @code{__builtin_rsqrtf} functions generate multiple instructions to
 implement the reciprocal sqrt functionality using reciprocal sqrt
@@ -15730,43 +15804,71 @@ The @code{__builtin_recipdiv}, and @code{__builtin
 functions generate multiple instructions to implement division using
 the reciprocal estimate instructions.
 
-The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
-functions generate instructions to read the Time Base Register.  The
-@code{__builtin_ppc_get_timebase} function may generate multiple
-instructions and always returns the 64 bits of the Time Base Register.
-The @code{__builtin_ppc_mftb} function always generates one instruction and
-returns the Time Base Register value as an unsigned long, throwing away
-the most significant word on 32-bit environments.
+The following functions require (@option{-mhard-float}) and
+(@option{-mmultiple}) options.
 
-Additional built-in functions are available for the 64-bit PowerPC
-family of processors, for efficient use of 128-bit floating point
-(@code{__float128}) values.
+@smallexample
+long double __builtin_pack_longdouble (double, double);
+double __builtin_unpack_longdouble (long double, int);
+@end smallexample
 
-Previous versions of GCC supported some 'q' builtins for IEEE 128-bit
-floating point.  These functions are now mapped into the equivalent
-'f128' builtin functions.
+@node Low-Level PowerPC Built-in Functions Available on ISA 2.06
+@subsubsection Low-Level PowerPC Built-in Functions Available on ISA 2.06
 
+The low-level built-in functions described in this section are
+available on the PowerPC family of processors starting with ISA 2.06
+or later.  Unless specific options are explicitly disabled on the
+command line, specifying option (@option{-mcpu=power7}) has the effect of
+enabling all the same options as for (@option{-mcpu=power6}) in
+addition to the (@option{-maltivec}), (@option{-mpopcntd}), and
+(@option{-mvsx}) options.
+
+The following low-level built-in functions require (@option{-mpopcntd}):
 @smallexample
-__builtin_fabsq is mapped into __builtin_fabsf128
-__builtin_copysignq is mapped into __builtin_copysignf128
-__builtin_infq is mapped into __builtin_inff128
-__builtin_huge_valq is mapped into __builtin_huge_valf128
-__builtin_nanq is mapped into __builtin_nanf128
-__builtin_nansq is mapped into __builtin_nansf128
+unsigned int __builtin_addg6s (unsigned int, unsigned int);
+long long __builtin_bpermd (long long, long long);
+unsigned int __builtin_cbcdtd (unsigned int);
+unsigned int __builtin_cdtbcd (unsigned int);
+long long __builtin_divde (long long, long long);
+unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
+int __builtin_divwe (int, int);
+unsigned int __builtin_divweu (unsigned int, unsigned int);
+vector __int128_t __builtin_pack_vector_int128 (long long, long long);
+void __builtin_rs6000_speculation_barrier (void);
+long long __builtin_unpack_vector_int128 (vector __int128_t, signed char);
 @end smallexample
 
+Of these, the @code{__builtin_divde} and @code{__builtin_divdeu} functions
+require a 64-bit environment.
+
+@node Low-Level PowerPC Built-in Functions Available on ISA 2.07
+@subsubsection Low-Level PowerPC Built-in Functions Available on ISA 2.07
+
+The low-level built-in functions described in this section are
+available on the PowerPC family of processors starting with ISA 2.07
+or later.  Unless specific options are explicitly disabled on the
+command line, specifying option (@option{-mcpu=power8}) has the effect of
+enabling all the same options as for (@option{-mcpu=power7}) in
+addition to the (@option{-mpower8-fusion}), (@option{-mpower8-vector}),
+(@option{-mcrypto}), (@option{-mhtm}), (@option{-mquad-memory}), and
+(@option{-mquad-memory-atomic}) options.
+
+This section is intentionally empty.
+
+@node Low-Level PowerPC Built-in Functions Available on ISA 3.0
+@subsubsection Low-Level PowerPC Built-in Functions Available on ISA 3.0
+
+The low-level built-in functions described in this section are
+available on the PowerPC family of processors starting with ISA 3.0
+or later.  Unless specific options are explicitly disabled on the
+command line, specifying option (@option{-mcpu=power9}) has the effect of
+enabling all the same options as for (@option{-mcpu=power8}) in
+addition to the (@option{-misel}) option.
+
 The following built-in functions are available on Linux 64-bit systems
-that use the ISA 3.0 instruction set.
+that use the ISA 3.0 instruction set (@option{-mcpu=power9}):
 
 @table @code
-@item __float128 __builtin_sqrtf128 (__float128)
-Perform a 128-bit IEEE floating point square root operation.
-@findex __builtin_sqrtf128
-
-@item __float128 __builtin_fmaf128 (__float128, __float128, __float128)
-Perform a 128-bit IEEE floating point fused multiply and add operation.
-@findex __builtin_fmaf128
-
 @item __float128 __builtin_addf128_round_to_odd (__float128, __float128)
 Perform a 128-bit IEEE floating point add using round to odd as the
 rounding mode.
@@ -15792,7 +15894,7 @@ Perform a 128-bit IEEE floating point square root
 as the rounding mode.
 @findex __builtin_sqrtf128_round_to_odd
 
-@item __float128 __builtin_fmaf128 (__float128, __float128, __float128)
+@item __float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128)
 Perform a 128-bit IEEE floating point fused multiply and add operation
 using round to odd as the rounding mode.
 @findex __builtin_fmaf128_round_to_odd
@@ -15803,78 +15905,26 @@ round to odd as the rounding mode.
 @findex __builtin_truncf128_round_to_odd
 @end table
 
-The following built-in functions are available for the PowerPC family
-of processors, starting with ISA 2.05 or later (@option{-mcpu=power6}
-or @option{-mcmpb}):
+The following additional built-in functions are available for the
+PowerPC family of processors that implement ISA 3.0 or later:
 @smallexample
-unsigned long long __builtin_cmpb (unsigned long long int, unsigned long long int);
-unsigned int __builtin_cmpb (unsigned int, unsigned int);
-@end smallexample
-
-The @code{__builtin_cmpb} function
-performs a byte-wise compare on the contents of its two arguments,
-returning the result of the byte-wise comparison as the returned
-value.  For each byte comparison, the corresponding byte of the return
-value holds 0xff if the input bytes are equal and 0 if the input bytes
-are not equal.  If either of the arguments to this built-in function
-is wider than 32 bits, the function call expands into the form that
-expects @code{unsigned long long int} arguments
-which is only available on 64-bit targets.
-
-The following built-in functions are available for the PowerPC family
-of processors, starting with ISA 2.06 or later (@option{-mcpu=power7}
-or @option{-mpopcntd}):
-@smallexample
-long __builtin_bpermd (long, long);
-int __builtin_divwe (int, int);
-unsigned int __builtin_divweu (unsigned int, unsigned int);
-long __builtin_divde (long, long);
-unsigned long __builtin_divdeu (unsigned long, unsigned long);
-unsigned int cdtbcd (unsigned int);
-unsigned int cbcdtd (unsigned int);
-unsigned int addg6s (unsigned int, unsigned int);
-void __builtin_rs6000_speculation_barrier (void);
-@end smallexample
-
-The @code{__builtin_divde} and @code{__builtin_divdeu} functions
-require a 64-bit environment supporting ISA 2.06 or later.
-
-The following built-in functions are available for the PowerPC family
-of processors, starting with ISA 3.0 or later (@option{-mcpu=power9}):
-@smallexample
 long long __builtin_darn (void);
 long long __builtin_darn_raw (void);
 int __builtin_darn_32 (void);
+@end smallexample
 
-unsigned int scalar_extract_exp (double source);
-unsigned long long int scalar_extract_exp (__ieee128 source);
+The @code{__builtin_darn} and @code{__builtin_darn_raw}
+functions require a
+64-bit environment supporting ISA 3.0 or later.
+The @code{__builtin_darn} function provides a 64-bit conditioned
+random number.  The @code{__builtin_darn_raw} function provides a
+64-bit raw random number.  The @code{__builtin_darn_32} function
+provides a 32-bit random number.
 
-unsigned long long int scalar_extract_sig (double source);
-unsigned __int128 scalar_extract_sig (__ieee128 source);
+The following additional built-in functions are available for the
+PowerPC family of processors that implement ISA 3.0 or later:
 
-double
-scalar_insert_exp (unsigned long long int significand, unsigned long long int exponent);
-double
-scalar_insert_exp (double significand, unsigned long long int exponent);
-
-ieee_128
-scalar_insert_exp (unsigned __int128 significand, unsigned long long int exponent);
-ieee_128
-scalar_insert_exp (ieee_128 significand, unsigned long long int exponent);
-
-int scalar_cmp_exp_gt (double arg1, double arg2);
-int scalar_cmp_exp_lt (double arg1, double arg2);
-int scalar_cmp_exp_eq (double arg1, double arg2);
-int scalar_cmp_exp_unordered (double arg1, double arg2);
-
-bool scalar_test_data_class (float source, const int condition);
-bool scalar_test_data_class (double source, const int condition);
-bool scalar_test_data_class (__ieee128 source, const int condition);
-
-bool scalar_test_neg (float source);
-bool scalar_test_neg (double source);
-bool scalar_test_neg (__ieee128 source);
-
+@smallexample
 int __builtin_byte_in_set (unsigned char u, unsigned long long set);
 int __builtin_byte_in_range (unsigned char u, unsigned int range);
 int __builtin_byte_in_either_range (unsigned char u, unsigned int ranges);
@@ -15899,81 +15949,6 @@ int __builtin_dfp_dtstsfi_ov (unsigned int compari
 int __builtin_dfp_dtstsfi_ov_dd (unsigned int comparison, _Decimal64 value);
 int __builtin_dfp_dtstsfi_ov_td (unsigned int comparison, _Decimal128 value);
 @end smallexample
-
-The @code{__builtin_darn} and @code{__builtin_darn_raw}
-functions require a
-64-bit environment supporting ISA 3.0 or later.
-The @code{__builtin_darn} function provides a 64-bit conditioned
-random number.  The @code{__builtin_darn_raw} function provides a
-64-bit raw random number.  The @code{__builtin_darn_32} function
-provides a 32-bit random number.
-
-The @code{scalar_extract_exp} and @code{scalar_extract_sig}
-functions require a 64-bit environment supporting ISA 3.0 or later.
-The @code{scalar_extract_exp} and @code{scalar_extract_sig} built-in
-functions return the significand and the biased exponent value
-respectively of their @code{source} arguments.
-When supplied with a 64-bit @code{source} argument, the
-result returned by @code{scalar_extract_sig} has
-the @code{0x0010000000000000} bit set if the
-function's @code{source} argument is in normalized form.
-Otherwise, this bit is set to 0.
-When supplied with a 128-bit @code{source} argument, the
-@code{0x00010000000000000000000000000000} bit of the result is
-treated similarly.
-Note that the sign of the significand is not represented in the result
-returned from the @code{scalar_extract_sig} function.  Use the
-@code{scalar_test_neg} function to test the sign of its @code{double}
-argument.
-
-The @code{scalar_insert_exp}
-functions require a 64-bit environment supporting ISA 3.0 or later.
-When supplied with a 64-bit first argument, the
-@code{scalar_insert_exp} built-in function returns a double-precision
-floating point value that is constructed by assembling the values of its
-@code{significand} and @code{exponent} arguments.  The sign of the
-result is copied from the most significant bit of the
-@code{significand} argument.  The significand and exponent components
-of the result are composed of the least significant 11 bits of the
-@code{exponent} argument and the least significant 52 bits of the
-@code{significand} argument respectively.
-
-When supplied with a 128-bit first argument, the
-@code{scalar_insert_exp} built-in function returns a quad-precision
-ieee floating point value.  The sign bit of the result is copied from
-the most significant bit of the @code{significand} argument.
-The significand and exponent components of the result are composed of
-the least significant 15 bits of the @code{exponent} argument and the
-least significant 112 bits of the @code{significand} argument respectively.
-
-The @code{scalar_cmp_exp_gt}, @code{scalar_cmp_exp_lt},
-@code{scalar_cmp_exp_eq}, and @code{scalar_cmp_exp_unordered} built-in
-functions return a non-zero value if @code{arg1} is greater than, less
-than, equal to, or not comparable to @code{arg2} respectively.  The
-arguments are not comparable if one or the other equals NaN (not a
-number). 
-
-The @code{scalar_test_data_class} built-in function returns 1
-if any of the condition tests enabled by the value of the
-@code{condition} variable are true, and 0 otherwise.  The
-@code{condition} argument must be a compile-time constant integer with
-value not exceeding 127.  The
-@code{condition} argument is encoded as a bitmask with each bit
-enabling the testing of a different condition, as characterized by the
-following:
-@smallexample
-0x40    Test for NaN
-0x20    Test for +Infinity
-0x10    Test for -Infinity
-0x08    Test for +Zero
-0x04    Test for -Zero
-0x02    Test for +Denormal
-0x01    Test for -Denormal
-@end smallexample
-
-The @code{scalar_test_neg} built-in function returns 1 if its
-@code{source} argument holds a negative value, 0 otherwise.
-
 The @code{__builtin_byte_in_set} function requires a
 64-bit environment supporting ISA 3.0 or later.  This function returns
 a non-zero value if and only if its @code{u} argument exactly equals one of
@@ -16024,241 +15999,8 @@ The @code{__builtin_dfp_dtstsfi_ov_dd} and
 require that the type of the @code{value} argument be
 @code{__Decimal64} and @code{__Decimal128} respectively.
 
-The following built-in functions are also available for the PowerPC family
-of processors, starting with ISA 3.0 or later
-(@option{-mcpu=power9}).  These string functions are described
-separately in order to group the descriptions closer to the function
-prototypes:
-@smallexample
-int vec_all_nez (vector signed char, vector signed char);
-int vec_all_nez (vector unsigned char, vector unsigned char);
-int vec_all_nez (vector signed short, vector signed short);
-int vec_all_nez (vector unsigned short, vector unsigned short);
-int vec_all_nez (vector signed int, vector signed int);
-int vec_all_nez (vector unsigned int, vector unsigned int);
 
-int vec_any_eqz (vector signed char, vector signed char);
-int vec_any_eqz (vector unsigned char, vector unsigned char);
-int vec_any_eqz (vector signed short, vector signed short);
-int vec_any_eqz (vector unsigned short, vector unsigned short);
-int vec_any_eqz (vector signed int, vector signed int);
-int vec_any_eqz (vector unsigned int, vector unsigned int);
 
-vector bool char vec_cmpnez (vector signed char arg1, vector signed char arg2);
-vector bool char vec_cmpnez (vector unsigned char arg1, vector unsigned char arg2);
-vector bool short vec_cmpnez (vector signed short arg1, vector signed short arg2);
-vector bool short vec_cmpnez (vector unsigned short arg1, vector unsigned short arg2);
-vector bool int vec_cmpnez (vector signed int arg1, vector signed int arg2);
-vector bool int vec_cmpnez (vector unsigned int, vector unsigned int);
-
-vector signed char vec_cnttz (vector signed char);
-vector unsigned char vec_cnttz (vector unsigned char);
-vector signed short vec_cnttz (vector signed short);
-vector unsigned short vec_cnttz (vector unsigned short);
-vector signed int vec_cnttz (vector signed int);
-vector unsigned int vec_cnttz (vector unsigned int);
-vector signed long long vec_cnttz (vector signed long long);
-vector unsigned long long vec_cnttz (vector unsigned long long);
-
-signed int vec_cntlz_lsbb (vector signed char);
-signed int vec_cntlz_lsbb (vector unsigned char);
-
-signed int vec_cnttz_lsbb (vector signed char);
-signed int vec_cnttz_lsbb (vector unsigned char);
-
-unsigned int vec_first_match_index (vector signed char, vector signed char);
-unsigned int vec_first_match_index (vector unsigned char,
-                                    vector unsigned char);
-unsigned int vec_first_match_index (vector signed int, vector signed int);
-unsigned int vec_first_match_index (vector unsigned int, vector unsigned int);
-unsigned int vec_first_match_index (vector signed short, vector signed short);
-unsigned int vec_first_match_index (vector unsigned short,
-                                    vector unsigned short);
-unsigned int vec_first_match_or_eos_index (vector signed char,
-                                           vector signed char);
-unsigned int vec_first_match_or_eos_index (vector unsigned char,
-                                           vector unsigned char);
-unsigned int vec_first_match_or_eos_index (vector signed int,
-                                           vector signed int);
-unsigned int vec_first_match_or_eos_index (vector unsigned int,
-                                           vector unsigned int);
-unsigned int vec_first_match_or_eos_index (vector signed short,
-                                           vector signed short);
-unsigned int vec_first_match_or_eos_index (vector unsigned short,
-                                           vector unsigned short);
-unsigned int vec_first_mismatch_index (vector signed char,
-                                       vector signed char);
-unsigned int vec_first_mismatch_index (vector unsigned char,
-                                       vector unsigned char);
-unsigned int vec_first_mismatch_index (vector signed int,
-                                       vector signed int);
-unsigned int vec_first_mismatch_index (vector unsigned int,
-                                       vector unsigned int);
-unsigned int vec_first_mismatch_index (vector signed short,
-                                       vector signed short);
-unsigned int vec_first_mismatch_index (vector unsigned short,
-                                       vector unsigned short);
-unsigned int vec_first_mismatch_or_eos_index (vector signed char,
-                                              vector signed char);
-unsigned int vec_first_mismatch_or_eos_index (vector unsigned char,
-                                              vector unsigned char);
-unsigned int vec_first_mismatch_or_eos_index (vector signed int,
-                                              vector signed int);
-unsigned int vec_first_mismatch_or_eos_index (vector unsigned int,
-                                              vector unsigned int);
-unsigned int vec_first_mismatch_or_eos_index (vector signed short,
-                                              vector signed short);
-unsigned int vec_first_mismatch_or_eos_index (vector unsigned short,
-                                              vector unsigned short);
-
-vector unsigned short vec_pack_to_short_fp32 (vector float, vector float);
-
-vector signed char vec_xl_be (signed long long, signed char *);
-vector unsigned char vec_xl_be (signed long long, unsigned char *);
-vector signed int vec_xl_be (signed long long, signed int *);
-vector unsigned int vec_xl_be (signed long long, unsigned int *);
-vector signed __int128 vec_xl_be (signed long long, signed __int128 *);
-vector unsigned __int128 vec_xl_be (signed long long, unsigned __int128 *);
-vector signed long long vec_xl_be (signed long long, signed long long *);
-vector unsigned long long vec_xl_be (signed long long, unsigned long long *);
-vector signed short vec_xl_be (signed long long, signed short *);
-vector unsigned short vec_xl_be (signed long long, unsigned short *);
-vector double vec_xl_be (signed long long, double *);
-vector float vec_xl_be (signed long long, float *);
-
-vector signed char vec_xl_len (signed char *addr, size_t len);
-vector unsigned char vec_xl_len (unsigned char *addr, size_t len);
-vector signed int vec_xl_len (signed int *addr, size_t len);
-vector unsigned int vec_xl_len (unsigned int *addr, size_t len);
-vector signed __int128 vec_xl_len (signed __int128 *addr, size_t len);
-vector unsigned __int128 vec_xl_len (unsigned __int128 *addr, size_t len);
-vector signed long long vec_xl_len (signed long long *addr, size_t len);
-vector unsigned long long vec_xl_len (unsigned long long *addr, size_t len);
-vector signed short vec_xl_len (signed short *addr, size_t len);
-vector unsigned short vec_xl_len (unsigned short *addr, size_t len);
-vector double vec_xl_len (double *addr, size_t len);
-vector float vec_xl_len (float *addr, size_t len);
-
-vector unsigned char vec_xl_len_r (unsigned char *addr, size_t len);
-
-void vec_xst_len (vector signed char data, signed char *addr, size_t len);
-void vec_xst_len (vector unsigned char data, unsigned char *addr, size_t len);
-void vec_xst_len (vector signed int data, signed int *addr, size_t len);
-void vec_xst_len (vector unsigned int data, unsigned int *addr, size_t len);
-void vec_xst_len (vector unsigned __int128 data, unsigned __int128 *addr, size_t len);
-void vec_xst_len (vector signed long long data, signed long long *addr, size_t len);
-void vec_xst_len (vector unsigned long long data, unsigned long long *addr, size_t len);
-void vec_xst_len (vector signed short data, signed short *addr, size_t len);
-void vec_xst_len (vector unsigned short data, unsigned short *addr, size_t len);
-void vec_xst_len (vector signed __int128 data, signed __int128 *addr, size_t len);
-void vec_xst_len (vector double data, double *addr, size_t len);
-void vec_xst_len (vector float data, float *addr, size_t len);
-
-void vec_xst_len_r (vector unsigned char data, unsigned char *addr, size_t len);
-
-signed char vec_xlx (unsigned int index, vector signed char data);
-unsigned char vec_xlx (unsigned int index, vector unsigned char data);
-signed short vec_xlx (unsigned int index, vector signed short data);
-unsigned short vec_xlx (unsigned int index, vector unsigned short data);
-signed int vec_xlx (unsigned int index, vector signed int data);
-unsigned int vec_xlx (unsigned int index, vector unsigned int data);
-float vec_xlx (unsigned int index, vector float data);
-
-signed char vec_xrx (unsigned int index, vector signed char data);
-unsigned char vec_xrx (unsigned int index, vector unsigned char data);
-signed short vec_xrx (unsigned int index, vector signed short data);
-unsigned short vec_xrx (unsigned int index, vector unsigned short data);
-signed int vec_xrx (unsigned int index, vector signed int data);
-unsigned int vec_xrx (unsigned int index, vector unsigned int data);
-float vec_xrx (unsigned int index, vector float data);
-@end smallexample
-
-The @code{vec_all_nez}, @code{vec_any_eqz}, and @code{vec_cmpnez}
-perform pairwise comparisons between the elements at the same
-positions within their two vector arguments.
-The @code{vec_all_nez} function returns a
-non-zero value if and only if all pairwise comparisons are not
-equal and no element of either vector argument contains a zero.
-The @code{vec_any_eqz} function returns a
-non-zero value if and only if at least one pairwise comparison is equal
-or if at least one element of either vector argument contains a zero.
-The @code{vec_cmpnez} function returns a vector of the same type as
-its two arguments, within which each element consists of all ones to
-denote that either the corresponding elements of the incoming arguments are
-not equal or that at least one of the corresponding elements contains
-zero.  Otherwise, the element of the returned vector contains all zeros.
-
-The @code{vec_cntlz_lsbb} function returns the count of the number of
-consecutive leading byte elements (starting from position 0 within the
-supplied vector argument) for which the least-significant bit
-equals zero.  The @code{vec_cnttz_lsbb} function returns the count of
-the number of consecutive trailing byte elements (starting from
-position 15 and counting backwards within the supplied vector
-argument) for which the least-significant bit equals zero.
-
-The @code{vec_xl_len} and @code{vec_xst_len} functions require a
-64-bit environment supporting ISA 3.0 or later.  The @code{vec_xl_len}
-function loads a variable length vector from memory.  The
-@code{vec_xst_len} function stores a variable length vector to memory.
-With both the @code{vec_xl_len} and @code{vec_xst_len} functions, the
-@code{addr} argument represents the memory address to or from which
-data will be transferred, and the
-@code{len} argument represents the number of bytes to be
-transferred, as computed by the C expression @code{min((len & 0xff), 16)}.
-If this expression's value is not a multiple of the vector element's
-size, the behavior of this function is undefined.
-In the case that the underlying computer is configured to run in
-big-endian mode, the data transfer moves bytes 0 to @code{(len - 1)} of
-the corresponding vector.  In little-endian mode, the data transfer
-moves bytes @code{(16 - len)} to @code{15} of the corresponding
-vector.  For the load function, any bytes of the result vector that
-are not loaded from memory are set to zero.
-The value of the @code{addr} argument need not be aligned on a
-multiple of the vector's element size.
-
-The @code{vec_xlx} and @code{vec_xrx} functions extract the single
-element selected by the @code{index} argument from the vector
-represented by the @code{data} argument.  The @code{index} argument
-always specifies a byte offset, regardless of the size of the vector
-element.  With @code{vec_xlx}, @code{index} is the offset of the first
-byte of the element to be extracted.  With @code{vec_xrx}, @code{index}
-represents the last byte of the element to be extracted, measured
-from the right end of the vector.  In other words, the last byte of
-the element to be extracted is found at position @code{(15 - index)}.
-There is no requirement that @code{index} be a multiple of the vector
-element size.  However, if the size of the vector element added to
-@code{index} is greater than 15, the content of the returned value is
-undefined.
-
-The following built-in functions are available for the PowerPC family
-of processors when hardware decimal floating point
-(@option{-mhard-dfp}) is available:
-@smallexample
-long long __builtin_dxex (_Decimal64);
-long long __builtin_dxexq (_Decimal128);
-_Decimal64 __builtin_ddedpd (int, _Decimal64);
-_Decimal128 __builtin_ddedpdq (int, _Decimal128);
-_Decimal64 __builtin_denbcd (int, _Decimal64);
-_Decimal128 __builtin_denbcdq (int, _Decimal128);
-_Decimal64 __builtin_diex (long long, _Decimal64);
-_Decimal128 _builtin_diexq (long long, _Decimal128);
-_Decimal64 __builtin_dscli (_Decimal64, int);
-_Decimal128 __builtin_dscliq (_Decimal128, int);
-_Decimal64 __builtin_dscri (_Decimal64, int);
-_Decimal128 __builtin_dscriq (_Decimal128, int);
-unsigned long long __builtin_unpack_dec128 (_Decimal128, int);
-_Decimal128 __builtin_pack_dec128 (unsigned long long, unsigned long long);
-@end smallexample
-
-The following built-in functions are available for the PowerPC family
-of processors when the Vector Scalar (vsx) instruction set is
-available:
-@smallexample
-unsigned long long __builtin_unpack_vector_int128 (vector __int128_t, int);
-vector __int128_t __builtin_pack_vector_int128 (unsigned long long,
-                                                unsigned long long);
-@end smallexample
-
 @node PowerPC AltiVec/VSX Built-in Functions
 @subsection PowerPC AltiVec Built-in Functions
 
@@ -19030,6 +18772,312 @@ int __builtin_bcdsub_gt (vector __int128_t, vector
 int __builtin_bcdsub_ov (vector __int128_t, vector __int128_t);
 @end smallexample
 
+The following additional built-in functions are also available for the
+PowerPC family of processors, starting with ISA 3.0
+(@option{-mcpu=power9}) or later:
+@smallexample
+unsigned int scalar_extract_exp (double source);
+unsigned long long int scalar_extract_exp (__ieee128 source);
+
+unsigned long long int scalar_extract_sig (double source);
+unsigned __int128 scalar_extract_sig (__ieee128 source);
+
+double
+scalar_insert_exp (unsigned long long int significand, unsigned long long int exponent);
+double
+scalar_insert_exp (double significand, unsigned long long int exponent);
+
+__ieee128
+scalar_insert_exp (unsigned __int128 significand, unsigned long long int exponent);
+__ieee128
+scalar_insert_exp (__ieee128 significand, unsigned long long int exponent);
+
+int scalar_cmp_exp_gt (double arg1, double arg2);
+int scalar_cmp_exp_lt (double arg1, double arg2);
+int scalar_cmp_exp_eq (double arg1, double arg2);
+int scalar_cmp_exp_unordered (double arg1, double arg2);
+
+bool scalar_test_data_class (float source, const int condition);
+bool scalar_test_data_class (double source, const int condition);
+bool scalar_test_data_class (__ieee128 source, const int condition);
+
+bool scalar_test_neg (float source);
+bool scalar_test_neg (double source);
+bool scalar_test_neg (__ieee128 source);
+@end smallexample
+
+The @code{scalar_extract_exp} and @code{scalar_extract_sig}
+functions require a 64-bit environment supporting ISA 3.0 or later.
+The @code{scalar_extract_exp} and @code{scalar_extract_sig} built-in
+functions return the biased exponent value and the significand
+respectively of their @code{source} arguments.
+When supplied with a 64-bit @code{source} argument, the
+result returned by @code{scalar_extract_sig} has
+the @code{0x0010000000000000} bit set if the
+function's @code{source} argument is in normalized form.
+Otherwise, this bit is set to 0.
+When supplied with a 128-bit @code{source} argument, the
+@code{0x00010000000000000000000000000000} bit of the result is
+treated similarly.
+Note that the sign of the significand is not represented in the result
+returned from the @code{scalar_extract_sig} function.  Use the
+@code{scalar_test_neg} function to test the sign of the @code{source}
+argument.
+
+The @code{scalar_insert_exp}
+functions require a 64-bit environment supporting ISA 3.0 or later.
+When supplied with a 64-bit first argument, the
+@code{scalar_insert_exp} built-in function returns a double-precision
+floating point value that is constructed by assembling the values of its
+@code{significand} and @code{exponent} arguments.  The sign of the
+result is copied from the most significant bit of the
+@code{significand} argument.  The exponent and significand components
+of the result are composed of the least significant 11 bits of the
+@code{exponent} argument and the least significant 52 bits of the
+@code{significand} argument respectively.
+
+When supplied with a 128-bit first argument, the
+@code{scalar_insert_exp} built-in function returns a quad-precision
+IEEE floating point value.  The sign bit of the result is copied from
+the most significant bit of the @code{significand} argument.
+The exponent and significand components of the result are composed of
+the least significant 15 bits of the @code{exponent} argument and the
+least significant 112 bits of the @code{significand} argument respectively.
+
+The @code{scalar_cmp_exp_gt}, @code{scalar_cmp_exp_lt},
+@code{scalar_cmp_exp_eq}, and @code{scalar_cmp_exp_unordered} built-in
+functions return a non-zero value if the exponent of @code{arg1} is
+greater than, less than, equal to, or not comparable to the exponent of
+@code{arg2} respectively.  The arguments are not comparable if either
+one is a NaN (not a number).
+
+The @code{scalar_test_data_class} built-in function returns 1
+if any of the condition tests enabled by the value of the
+@code{condition} argument are true, and 0 otherwise.  The
+@code{condition} argument must be a compile-time constant integer with
+value not exceeding 127.  The
+@code{condition} argument is encoded as a bitmask with each bit
+enabling the testing of a different condition, as characterized by the
+following:
+@smallexample
+0x40    Test for NaN
+0x20    Test for +Infinity
+0x10    Test for -Infinity
+0x08    Test for +Zero
+0x04    Test for -Zero
+0x02    Test for +Denormal
+0x01    Test for -Denormal
+@end smallexample
+
+The @code{scalar_test_neg} built-in function returns 1 if its
+@code{source} argument holds a negative value, 0 otherwise.
+
+The following built-in functions are also available for the PowerPC family
+of processors, starting with ISA 3.0 or later
+(@option{-mcpu=power9}).  These string functions are described
+separately in order to group the descriptions closer to the function
+prototypes:
+@smallexample
+int vec_all_nez (vector signed char, vector signed char);
+int vec_all_nez (vector unsigned char, vector unsigned char);
+int vec_all_nez (vector signed short, vector signed short);
+int vec_all_nez (vector unsigned short, vector unsigned short);
+int vec_all_nez (vector signed int, vector signed int);
+int vec_all_nez (vector unsigned int, vector unsigned int);
+
+int vec_any_eqz (vector signed char, vector signed char);
+int vec_any_eqz (vector unsigned char, vector unsigned char);
+int vec_any_eqz (vector signed short, vector signed short);
+int vec_any_eqz (vector unsigned short, vector unsigned short);
+int vec_any_eqz (vector signed int, vector signed int);
+int vec_any_eqz (vector unsigned int, vector unsigned int);
+
+vector bool char vec_cmpnez (vector signed char arg1, vector signed char arg2);
+vector bool char vec_cmpnez (vector unsigned char arg1, vector unsigned char arg2);
+vector bool short vec_cmpnez (vector signed short arg1, vector signed short arg2);
+vector bool short vec_cmpnez (vector unsigned short arg1, vector unsigned short arg2);
+vector bool int vec_cmpnez (vector signed int arg1, vector signed int arg2);
+vector bool int vec_cmpnez (vector unsigned int, vector unsigned int);
+
+vector signed char vec_cnttz (vector signed char);
+vector unsigned char vec_cnttz (vector unsigned char);
+vector signed short vec_cnttz (vector signed short);
+vector unsigned short vec_cnttz (vector unsigned short);
+vector signed int vec_cnttz (vector signed int);
+vector unsigned int vec_cnttz (vector unsigned int);
+vector signed long long vec_cnttz (vector signed long long);
+vector unsigned long long vec_cnttz (vector unsigned long long);
+
+signed int vec_cntlz_lsbb (vector signed char);
+signed int vec_cntlz_lsbb (vector unsigned char);
+
+signed int vec_cnttz_lsbb (vector signed char);
+signed int vec_cnttz_lsbb (vector unsigned char);
+
+unsigned int vec_first_match_index (vector signed char, vector signed char);
+unsigned int vec_first_match_index (vector unsigned char,
+                                    vector unsigned char);
+unsigned int vec_first_match_index (vector signed int, vector signed int);
+unsigned int vec_first_match_index (vector unsigned int, vector unsigned int);
+unsigned int vec_first_match_index (vector signed short, vector signed short);
+unsigned int vec_first_match_index (vector unsigned short,
+                                    vector unsigned short);
+unsigned int vec_first_match_or_eos_index (vector signed char,
+                                           vector signed char);
+unsigned int vec_first_match_or_eos_index (vector unsigned char,
+                                           vector unsigned char);
+unsigned int vec_first_match_or_eos_index (vector signed int,
+                                           vector signed int);
+unsigned int vec_first_match_or_eos_index (vector unsigned int,
+                                           vector unsigned int);
+unsigned int vec_first_match_or_eos_index (vector signed short,
+                                           vector signed short);
+unsigned int vec_first_match_or_eos_index (vector unsigned short,
+                                           vector unsigned short);
+unsigned int vec_first_mismatch_index (vector signed char,
+                                       vector signed char);
+unsigned int vec_first_mismatch_index (vector unsigned char,
+                                       vector unsigned char);
+unsigned int vec_first_mismatch_index (vector signed int,
+                                       vector signed int);
+unsigned int vec_first_mismatch_index (vector unsigned int,
+                                       vector unsigned int);
+unsigned int vec_first_mismatch_index (vector signed short,
+                                       vector signed short);
+unsigned int vec_first_mismatch_index (vector unsigned short,
+                                       vector unsigned short);
+unsigned int vec_first_mismatch_or_eos_index (vector signed char,
+                                              vector signed char);
+unsigned int vec_first_mismatch_or_eos_index (vector unsigned char,
+                                              vector unsigned char);
+unsigned int vec_first_mismatch_or_eos_index (vector signed int,
+                                              vector signed int);
+unsigned int vec_first_mismatch_or_eos_index (vector unsigned int,
+                                              vector unsigned int);
+unsigned int vec_first_mismatch_or_eos_index (vector signed short,
+                                              vector signed short);
+unsigned int vec_first_mismatch_or_eos_index (vector unsigned short,
+                                              vector unsigned short);
+
+vector unsigned short vec_pack_to_short_fp32 (vector float, vector float);
+
+vector signed char vec_xl_be (signed long long, signed char *);
+vector unsigned char vec_xl_be (signed long long, unsigned char *);
+vector signed int vec_xl_be (signed long long, signed int *);
+vector unsigned int vec_xl_be (signed long long, unsigned int *);
+vector signed __int128 vec_xl_be (signed long long, signed __int128 *);
+vector unsigned __int128 vec_xl_be (signed long long, unsigned __int128 *);
+vector signed long long vec_xl_be (signed long long, signed long long *);
+vector unsigned long long vec_xl_be (signed long long, unsigned long long *);
+vector signed short vec_xl_be (signed long long, signed short *);
+vector unsigned short vec_xl_be (signed long long, unsigned short *);
+vector double vec_xl_be (signed long long, double *);
+vector float vec_xl_be (signed long long, float *);
+
+vector signed char vec_xl_len (signed char *addr, size_t len);
+vector unsigned char vec_xl_len (unsigned char *addr, size_t len);
+vector signed int vec_xl_len (signed int *addr, size_t len);
+vector unsigned int vec_xl_len (unsigned int *addr, size_t len);
+vector signed __int128 vec_xl_len (signed __int128 *addr, size_t len);
+vector unsigned __int128 vec_xl_len (unsigned __int128 *addr, size_t len);
+vector signed long long vec_xl_len (signed long long *addr, size_t len);
+vector unsigned long long vec_xl_len (unsigned long long *addr, size_t len);
+vector signed short vec_xl_len (signed short *addr, size_t len);
+vector unsigned short vec_xl_len (unsigned short *addr, size_t len);
+vector double vec_xl_len (double *addr, size_t len);
+vector float vec_xl_len (float *addr, size_t len);
+
+vector unsigned char vec_xl_len_r (unsigned char *addr, size_t len);
+
+void vec_xst_len (vector signed char data, signed char *addr, size_t len);
+void vec_xst_len (vector unsigned char data, unsigned char *addr, size_t len);
+void vec_xst_len (vector signed int data, signed int *addr, size_t len);
+void vec_xst_len (vector unsigned int data, unsigned int *addr, size_t len);
+void vec_xst_len (vector unsigned __int128 data, unsigned __int128 *addr, size_t len);
+void vec_xst_len (vector signed long long data, signed long long *addr, size_t len);
+void vec_xst_len (vector unsigned long long data, unsigned long long *addr, size_t len);
+void vec_xst_len (vector signed short data, signed short *addr, size_t len);
+void vec_xst_len (vector unsigned short data, unsigned short *addr, size_t len);
+void vec_xst_len (vector signed __int128 data, signed __int128 *addr, size_t len);
+void vec_xst_len (vector double data, double *addr, size_t len);
+void vec_xst_len (vector float data, float *addr, size_t len);
+
+void vec_xst_len_r (vector unsigned char data, unsigned char *addr, size_t len);
+
+signed char vec_xlx (unsigned int index, vector signed char data);
+unsigned char vec_xlx (unsigned int index, vector unsigned char data);
+signed short vec_xlx (unsigned int index, vector signed short data);
+unsigned short vec_xlx (unsigned int index, vector unsigned short data);
+signed int vec_xlx (unsigned int index, vector signed int data);
+unsigned int vec_xlx (unsigned int index, vector unsigned int data);
+float vec_xlx (unsigned int index, vector float data);
+
+signed char vec_xrx (unsigned int index, vector signed char data);
+unsigned char vec_xrx (unsigned int index, vector unsigned char data);
+signed short vec_xrx (unsigned int index, vector signed short data);
+unsigned short vec_xrx (unsigned int index, vector unsigned short data);
+signed int vec_xrx (unsigned int index, vector signed int data);
+unsigned int vec_xrx (unsigned int index, vector unsigned int data);
+float vec_xrx (unsigned int index, vector float data);
+@end smallexample
+
+The @code{vec_all_nez}, @code{vec_any_eqz}, and @code{vec_cmpnez}
+functions perform pairwise comparisons between the elements at the same
+positions within their two vector arguments.
+The @code{vec_all_nez} function returns a
+non-zero value if and only if all pairwise comparisons are not
+equal and no element of either vector argument contains a zero.
+The @code{vec_any_eqz} function returns a
+non-zero value if and only if at least one pairwise comparison is equal
+or if at least one element of either vector argument contains a zero.
+The @code{vec_cmpnez} function returns a vector of the same type as
+its two arguments, within which each element consists of all ones to
+denote that either the corresponding elements of the incoming arguments are
+not equal or that at least one of the corresponding elements contains
+zero.  Otherwise, the element of the returned vector contains all zeros.
+
+The @code{vec_cntlz_lsbb} function returns the number of
+consecutive leading byte elements (starting from position 0 within the
+supplied vector argument) for which the least-significant bit
+equals zero.  The @code{vec_cnttz_lsbb} function returns the number of
+consecutive trailing byte elements (starting from
+position 15 and counting backwards within the supplied vector
+argument) for which the least-significant bit equals zero.
+
+The @code{vec_xl_len} and @code{vec_xst_len} functions require a
+64-bit environment supporting ISA 3.0 or later.  The @code{vec_xl_len}
+function loads a variable-length vector from memory.  The
+@code{vec_xst_len} function stores a variable-length vector to memory.
+With both the @code{vec_xl_len} and @code{vec_xst_len} functions, the
+@code{addr} argument represents the memory address to or from which
+data will be transferred, and the
+@code{len} argument represents the number of bytes to be
+transferred, as computed by the C expression @code{min((len & 0xff), 16)}.
+If this expression's value is not a multiple of the vector element's
+size, the behavior of this function is undefined.
+In the case that the underlying computer is configured to run in
+big-endian mode, the data transfer moves bytes 0 to @code{(len - 1)} of
+the corresponding vector.  In little-endian mode, the data transfer
+moves bytes @code{(16 - len)} to @code{15} of the corresponding
+vector.  For the load function, any bytes of the result vector that
+are not loaded from memory are set to zero.
+The value of the @code{addr} argument need not be aligned on a
+multiple of the vector's element size.
+
+The @code{vec_xlx} and @code{vec_xrx} functions extract the single
+element selected by the @code{index} argument from the vector
+represented by the @code{data} argument.  The @code{index} argument
+always specifies a byte offset, regardless of the size of the vector
+element.  With @code{vec_xlx}, @code{index} is the offset of the first
+byte of the element to be extracted.  With @code{vec_xrx}, @code{index}
+represents the last byte of the element to be extracted, measured
+from the right end of the vector.  In other words, the last byte of
+the element to be extracted is found at position @code{(15 - index)}.
+There is no requirement that @code{index} be a multiple of the vector
+element size.  However, if the size of the vector element added to
+@code{index} is greater than 15, the content of the returned value is
+undefined.
+
 If the ISA 3.0 instruction set additions (@option{-mcpu=power9})
 are available:
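
For readers without POWER9 hardware, the vec_cntlz_lsbb, vec_cnttz_lsbb,
and vec_xl_len descriptions added by the patch above can be sketched as
portable scalar C.  The model_* names are hypothetical helpers, not GCC
built-ins.  The sketch follows only the wording of the patch text (for
example, that len is reduced with min((len & 0xff), 16)) and makes no
claim about how the hardware orders memory bytes within the vector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar models, not GCC built-ins.  v[0] stands for byte
   position 0 of the vector and v[15] for byte position 15.  */

/* Model of vec_cntlz_lsbb: count bytes from position 0 upward whose
   least-significant bit is zero, stopping at the first set bit.  */
static unsigned int
model_cntlz_lsbb (const unsigned char v[16])
{
  unsigned int n = 0;
  while (n < 16 && (v[n] & 1) == 0)
    n++;
  return n;
}

/* Model of vec_cnttz_lsbb: same count, starting at position 15 and
   walking backwards.  */
static unsigned int
model_cnttz_lsbb (const unsigned char v[16])
{
  unsigned int n = 0;
  while (n < 16 && (v[15 - n] & 1) == 0)
    n++;
  return n;
}

/* Number of bytes vec_xl_len / vec_xst_len transfer according to the
   patch text: min ((len & 0xff), 16).  */
static size_t
model_xl_len_bytes (size_t len)
{
  size_t n = len & 0xff;
  return n < 16 ? n : 16;
}

/* First vector byte position involved in the transfer: 0 in big-endian
   mode, (16 - count) in little-endian mode, as the text states.  */
static size_t
model_xl_len_first_byte (size_t len, int big_endian)
{
  return big_endian ? 0 : 16 - model_xl_len_bytes (len);
}

/* Nonzero when the byte count is a multiple of the element size; the
   text calls any other case undefined behavior.  */
static int
model_xl_len_is_defined (size_t len, size_t elem_size)
{
  assert (elem_size != 0);
  return model_xl_len_bytes (len) % elem_size == 0;
}
```

Under this model, a length of 20 is clamped to 16 bytes, and a length of
0x105 transfers 5 bytes because only the low 8 bits of len are used.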
Segher Boessenkool April 24, 2018, 9:45 p.m. UTC | #2
Hi!

On Tue, Apr 24, 2018 at 02:25:58PM -0500, Kelvin Nilsen wrote:
> > 4. Remove descriptions of built-in functions that do not belong in this
> >    section because the built-in functions are generic (not specific to
> >    PowerPC): __builtin_fabsq, __builtin_copysignq, __builtin_infq,
> >    __builtin_huge_valq, __builtin_nanq, __builtin_nansq,
> >    __builtin_sqrtf128, __builtin_fmaf128.

Are these described in a generic place, then?  I don't see it?

> +@node Low-Level PowerPC Built-in Functions Available on all Targets
> +@subsubsection Low-Level PowerPC Built-in Functions Available on all Targets

"Targets" is not such a great name.  "Configurations", maybe?

>  CPU supports the Embedded ISA category.
>  @item cellbe
>  CPU has a CELL broadband engine.
> +@item darn
> +CPU supports the darn (deliver a random number) instruction.

"the @code{darn}" etc.

>  CPU has hardware transaction memory instructions.
>  @item htm-nosc
>  Kernel aborts hardware transactions when a syscall is made.
> +@item htm-no-suspend
> +CPU supports hardware transaction memory but does not support the
> +tsuspend. instruction.

"@code{tsuspend.} instruction"

> +The following functions require (@option{-mhard-float}),
> +(@option{-mpowerpc-gfxopt}), and (@option{-mpopcntb}) options.

Why the parentheses?  The text within parens is not an explanation of
something that came before, it's just part of the main sentence.

Similar in many places.

> +The @code{__builtin_darn} and @code{__builtin_darn_raw}
> +functions require a
> +64-bit environment supporting ISA 3.0 or later.
> +The @code{__builtin_darn} function provides a 64-bit conditioned
> +random number.  The @code{__builtin_darn_raw} function provides a
> +64-bit raw random number.  The @code{__builtin_darn_32} function
> +provides a 32-bit random number.

Is darn_32 conditioned or raw?  (I realise you didn't change this text :-) )

The rest looks great, thanks!  If you fix those details, and probably not
delete the q/f128 things yet, it is okay for trunk.

Cheers,


Segher
Kelvin Nilsen April 25, 2018, 4:28 p.m. UTC | #3
Thank you for the prompt review and careful feedback.  I didn't notice
your message until this morning.  At this point, I'll wait a few days before
committing these changes as I understand we are still in the "RC phase of GCC 8".


On 4/24/18 4:45 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Apr 24, 2018 at 02:25:58PM -0500, Kelvin Nilsen wrote:
>>> 4. Remove descriptions of built-in functions that do not belong in this
>>>    section because the built-in functions are generic (not specific to
>>>    PowerPC): __builtin_fabsq, __builtin_copysignq, __builtin_infq,
>>>    __builtin_huge_valq, __builtin_nanq, __builtin_nansq,
>>>    __builtin_sqrtf128, __builtin_fmaf128.
> 
> Are these described in a generic place, then?  I don't see it?
> 
>> +@node Low-Level PowerPC Built-in Functions Available on all Targets
>> +@subsubsection Low-Level PowerPC Built-in Functions Available on all Targets

Regarding your question about "q functions", the existing gcc.pdf document
is a bit confusing.  Here's what I can figure out.

The following are mentioned only in "Section 6.59.33: x86 Built-in Functions"

  __float128 __builtin_fabsq (__float128)
  __float128 __builtin_copysignq (__float128, __float128)
  __float128 __builtin_infq (void)
  __float128 __builtin_huge_valq (void)
  __float128 __builtin_nanq (void)
  __float128 __builtin_nansq (void)

As far as I can tell, these should not be documented as specific to x86, but
should be documented as generic across all platforms.  This is an issue outside
the realm of PowerPC maintenance.

If we want to preserve mention of these "q" functions, I would recommend
changing the text that introduces them.  Currently, it says:

  "Previous versions of GCC supported some 'q' builtins for IEEE 128-bit
   floating point.  These functions are now mapped into the equivalent
   'f128' builtin functions."

If the description of these built-ins is not moved to a more generic context,
I would prefer to replace this section with something like:

The following functions, which are also supported on x86 targets, are supported
if the -mfloat128 option is specified:

  __float128 __builtin_fabsq (__float128)
  __float128 __builtin_copysignq (__float128, __float128)
  __float128 __builtin_infq (void)
  __float128 __builtin_huge_valq (void)
  __float128 __builtin_nanq (void)
  __float128 __builtin_nansq (void)

Regarding your question about f128 functions, these are "supposed to be"
documented in "Section 6.58: Other Built-in Functions Provided by GCC".
Search for the phrase "corresponding to the TS 18661-3 functions".  We
should add __builtin_sqrtf128 and __builtin_fmaf128 to the list of functions
described this way.  These may not be the only omissions.  Should we push
for fixing this documentation in Section 6.58 instead of keeping it in
the PowerPC section?

It is difficult to find the official TS 18661-3 document, and
I'm not sure where to look for a list of which of the functions are
currently implemented by gcc.  I found this "diff" document, which provides
some hints.  Given that this standard is not easily accessible, perhaps the
generic built-in documentation should provide a little more information?

See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1945.pdf
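
A minimal sketch of what some of the 'q' built-ins listed above compute,
assuming GCC (not Clang) on x86-64, where __float128 needs no extra
option.  q_builtins_behave and HAVE_Q_BUILTINS are hypothetical names
used only for illustration.  On PowerPC the same calls would need
-mfloat128, which this sketch does not exercise, and the NaN built-ins
are left out.

```c
#include <assert.h>

/* Assumption: only GCC targeting x86-64 is checked here.  Other
   compilers and targets take the "not checked" path below.  */
#if defined (__GNUC__) && !defined (__clang__) && defined (__x86_64__)
#define HAVE_Q_BUILTINS 1
#else
#define HAVE_Q_BUILTINS 0
#endif

/* Returns 1 if the built-ins behave as expected, 0 if they do not, and
   -1 when they are not checked on this compiler or target.  */
static int
q_builtins_behave (void)
{
#if HAVE_Q_BUILTINS
  __float128 minus_two = -2.0;
  __float128 three = 3.0;

  /* Absolute value of -2 is 2.  */
  if (__builtin_fabsq (minus_two) != 2.0)
    return 0;
  /* Magnitude of the first argument, sign of the second.  */
  if (__builtin_copysignq (three, minus_two) != -3.0)
    return 0;
  /* Infinity compares greater than any finite value.  */
  if (!(__builtin_infq () > three))
    return 0;
  /* HUGE_VAL for this type is positive infinity.  */
  if (__builtin_huge_valq () != __builtin_infq ())
    return 0;
  return 1;
#else
  return -1;
#endif
}
```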
Segher Boessenkool May 2, 2018, 1:43 p.m. UTC | #4
Hi Kelvin,

On Tue, Apr 24, 2018 at 09:12:52AM -0500, Kelvin Nilsen wrote:
>     * doc/extend.texi: Tidy documentation of PowerPC built-in functions.

You should note the sections you renamed:
	(PowerPC Built-in Functions): Rename to ...
	(Low-Level PowerPC Built-in Functions): ... this.
(but see below).

> --- gcc/doc/extend.texi    (revision 259504)
> +++ gcc/doc/extend.texi    (working copy)
> @@ -15524,12 +15524,17 @@ implementing assertions.
> -@node PowerPC Built-in Functions
> -@subsection PowerPC Built-in Functions
> +@node Low-Level PowerPC Built-in Functions
> +@subsection Low-Level PowerPC Built-in Functions

If you change the name of a node you need to also change it wherever
it is referred to.  In this case, importantly the menu of per-target
built-in function documentation.  We probably should keep this
subsection's name, maybe do a subsubsection or such?

There is nothing particularly low-level about these builtins; maybe
call it "basic builtins" or similar?

Or we probably should start the "PowerPC Built-in Functions" node
with a menu of its subsubsections?  Like what FR-V has.

Sorry for the late review.


Segher
Segher Boessenkool May 2, 2018, 3:46 p.m. UTC | #5
Hi!

On Wed, Apr 25, 2018 at 11:28:44AM -0500, Kelvin Nilsen wrote:
> >>> 4. Remove descriptions of built-in functions that do not belong in this
> >>>    section because the built-in functions are generic (not specific to
> >>>    PowerPC): __builtin_fabsq, __builtin_copysignq, __builtin_infq,
> >>>    __builtin_huge_valq, __builtin_nanq, __builtin_nansq,
> >>>    __builtin_sqrtf128, __builtin_fmaf128.
> > 
> > Are these described in a generic place, then?  I don't see it?
> > 
> >> +@node Low-Level PowerPC Built-in Functions Available on all Targets
> >> +@subsubsection Low-Level PowerPC Built-in Functions Available on all Targets
> 
> Regarding your question about "q functions", the existing gcc.pdf document
> is a bit confusing.  Here's what I can figure out.
> 
> The following are mentioned only in "Section 6.59.33: x86 Built-in Functions"
> 
>   __float128 __builtin_fabsq (__float128)
>   __float128 __builtin_copysignq (__float128, __float128)
>   __float128 __builtin_infq (void)
>   __float128 __builtin_huge_valq (void)
>   __float128 __builtin_nanq (void)
>   __float128 __builtin_nansq (void)
> 
> As far as I can tell, these should not be documented as specific to x86, but
> should be documented as generic across all platforms.  This is an issue outside
> the realm of PowerPC maintenance.
> 
> If we want to preserve mention of these "q" functions, I would recommend
> changing the text that introduces them.  Currently, it says:
> 
>   "Previous versions of GCC supported some 'q' builtins for IEEE 128-bit
>    floating point.  These functions are now mapped into the equivalent
>    'f128' builtin functions."

I think that is a bit confusing, especially if you are not familiar with
those builtins already.

> If the description of these built-ins is not moved to a more generic context,
> I would prefer to replace this section with something like:
> 
> The following functions, which are also supported on x86 targets, are supported
> if the -mfloat128 option is specified:
> 
>   __float128 __builtin_fabsq (__float128)
>   __float128 __builtin_copysignq (__float128, __float128)
>   __float128 __builtin_infq (void)
>   __float128 __builtin_huge_valq (void)
>   __float128 __builtin_nanq (void)
>   __float128 __builtin_nansq (void)

That looks fine.

> Regarding your question about f128 functions, these are "supposed to be"
> documented in "Section 6.58: Other Built-in Functions Provided by GCC".
> Search for the phrase "corresponding to the TS 18661-3 functions".  We
> should add __builtin_sqrtf128 and __builtin_fmaf128 to the list of functions
> described this way.  These may not be the only omissions.  Should we push
> for fixing this documentation in Section 6.58 instead of keeping it in
> the PowerPC section?

Well, are they supported on other targets?

> It is difficult to find the official TS 18661-3 document, and
> I'm not sure where to look for a list of which of the functions are
> currently implemented by gcc.  I found this "diff" document, which provides
> some hints.  Given that this standard is not easily accessible, perhaps the
> generic built-in documentation should provide a little more information?
> 
> See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1945.pdf

That is the document I use.  See other mail for other resources :-)


Segher

Patch

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi    (revision 259504)
+++ gcc/doc/extend.texi    (working copy)
@@ -15524,12 +15524,17 @@  implementing assertions.
 
 @end table
 
-@node PowerPC Built-in Functions
-@subsection PowerPC Built-in Functions
+@node Low-Level PowerPC Built-in Functions
+@subsection Low-Level PowerPC Built-in Functions
 
-The following built-in functions are always available and can be used to
-check the PowerPC target platform type:
+This section describes PowerPC built-in functions that do not require
+the inclusion of any special header files to declare prototypes or
+provide macro definitions.  The sections that follow describe
+additional PowerPC built-in functions.
 
+@node Low-Level PowerPC Built-in Functions Available on all Targets
+@subsubsection Low-Level PowerPC Built-in Functions Available on all Targets
+
 @deftypefn {Built-in Function} void __builtin_cpu_init (void)
 This function is a @code{nop} on the PowerPC platform and is included