[rs6000] Add built-in function support for Power9 byte instructions

Message ID ab3a654d-f151-a37c-3c52-5c667f19dc34@linux.vnet.ibm.com
State New

Commit Message

Kelvin Nilsen Nov. 14, 2016, 11:43 p.m. UTC
This patch adds built-in function support for the new setb, cmprb, and
cmpeqb Power9 instructions.

The patch has been bootstrapped and tested on
powerpc64le-unknown-linux and powerpc-unknown-linux (big-endian, with
both -m32 and -m64 target options) with no regressions.

Is this ok for the trunk?

gcc/testsuite/ChangeLog:

2016-11-14  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* gcc.target/powerpc/byte-in-either-range-0.c: New test.
	* gcc.target/powerpc/byte-in-either-range-1.c: New test.
	* gcc.target/powerpc/byte-in-range-0.c: New test.
	* gcc.target/powerpc/byte-in-range-1.c: New test.
	* gcc.target/powerpc/byte-in-set-0.c: New test.
	* gcc.target/powerpc/byte-in-set-1.c: New test.
	* gcc.target/powerpc/byte-in-set-2.c: New test.


gcc/ChangeLog:

2016-11-14  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* config/rs6000/altivec.md (UNSPEC_CMPRB): New unspec value.
	(UNSPEC_CMPRB2): New unspec value.
	(UNSPEC_CMPEQB): New unspec value.
	(UNSPEC_SETB): New unspec value.
	(cmprb_p): New expansion.
	(*cmprb): New insn.
	(*setb): New insn.
	(cmprb2_p): New expansion.
	(*cmprb2): New insn.
	(cmpeqb_p): New expansion.
	(*cmpeqb): New insn.
	* config/rs6000/rs6000-builtin.def (BU_P9V_64BIT_AV_2): New macro.
	(BU_P9_OVERLOAD_2): Likewise.
	(CMPRB): Add byte-in-range built-in function.
	(CMPRB2): Add byte-in-either_range built-in function.
	(CMPEQB): Add byte-in-set builtin-in function.
	(CMPRB): Add overload support for byte-in-range function.
	(CMPRB2): Add overload support for byte-in-either-range function.
	(CMPEQB): Add overload support for byte-in-set built-in function.
	* config/rs6000/rs6000-c.c (P9V_BUILTIN_SCALAR_CMPRB): Macro
	expansion to define argument types for new builtin.
	(P9V_BUILTIN_SCALAR_CMPRB2): Macro expansion to define argument
	types for new builtin.
	(P9V_BUILTIN_SCALAR_CMPEQB): Macro expansion to define argument
	types for new builtin.
	* doc/extend.texi (PowerPC AltiVec Built-in Functions): Rearrange
	the order of presentation for certain built-in functions
	(scalar_extract_exp, scalar_extract_sig, scalar_insert_exp)
	(scalar_cmp_exp_gt, scalar_cmp_exp_lt, scalar_cmp_exp_eq)
	(scalar_cmp_exp_unordered, scalar_test_data_class)
	(scalar_test_neg) to improve locality and flow.  Document
	the new __builtin_scalar_byte_in_set,
	__builtin_scalar_byte_in_range, and
	__builtin_scalar_byte_in_either_range functions.

Comments

Segher Boessenkool Nov. 15, 2016, 11:19 a.m. UTC | #1
Hi!

On Mon, Nov 14, 2016 at 04:43:35PM -0700, Kelvin Nilsen wrote:
> 	* config/rs6000/altivec.md (UNSPEC_CMPRB): New unspec value.
> 	(UNSPEC_CMPRB2): New unspec value.

I wonder if you really need both?  The number of arguments will tell
which is which, anyway?

> 	(cmprb_p): New expansion.

Not such a great name (now you get a gen_cmprb_p function which isn't
a predicate itself).

> 	(CMPRB): Add byte-in-range built-in function.
> 	(CMBRB2): Add byte-in-either_range built-in function.
> 	(CMPEQB): Add byte-in-set builtin-in function.

"builtin-in", and you typoed an underscore?

> +;; Predicate: test byte within range.
> +;; Return in target register operand 0 a non-zero value iff the byte
> +;; held in bits 24:31 of operand 1 is within the inclusive range
> +;; bounded below by operand 2's bits 0:7 and above by operand 2's
> +;; bits 8:15.
> +(define_expand "cmprb_p"

It seems you got the bit numbers mixed up.  Maybe just call it the low
byte, and the byte just above?

(And it always sets 0 or 1 here, you might want to make that more explicit).

> +;; Set bit 1 (the GT bit, 0x2) of CR register operand 0 to 1 iff the

That's 4, i.e. 0b0100.

> +;; Set operand 0 register to non-zero value iff the CR register named
> +;; by operand 1 has its GT bit (0x2) or its LT bit (0x1) set.
> +(define_insn "*setb"

LT is 8, GT is 4.  If LT is set it returns -1, otherwise if GT is set it
returns 1, otherwise it returns 0.
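
As a minimal sketch of the setb behavior described above, the following C model maps a 4-bit CR field to the value placed in the target register.  The names setb_model and the CR_* constants are hypothetical and exist only for illustration; they are not part of the patch or of GCC.

```c
#include <stdint.h>

/* Bit values within a 4-bit CR field, most significant bit first.
   These names are illustrative only.  */
enum { CR_LT = 0x8, CR_GT = 0x4, CR_EQ = 0x2, CR_SO = 0x1 };

/* Hypothetical model of setb: return -1 if LT is set, otherwise 1 if
   GT is set, otherwise 0.  EQ and SO do not affect the result.  */
static int64_t
setb_model (unsigned int cr_field)
{
  if (cr_field & CR_LT)
    return -1;
  if (cr_field & CR_GT)
    return 1;
  return 0;
}
```

Since cmprb and cmpeqb only ever set the GT bit, a setb that follows them yields 0 or 1 in this model, which matches the remark that the expansion always sets 0 or 1.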

> +;; Predicate: test byte within two ranges.
> +;; Return in target register operand 0 a non-zero value iff the byte
> +;; held in bits 24:31 of operand 1 is within the inclusive range
> +;; bounded below by operand 2's bits 0:7 and above by operand 2's
> +;; bits 8:15 or if the byte is within the inclusive range bounded
> +;; below by operand 2's bits 16:23 and above by operand 2's bits 24:31.
> +(define_expand "cmprb2_p"

The high bound is higher in the reg than the low bound.  See the example
where 0x3930 is used to do isdigit (and yes 0x3039 would be much more
fun, but alas).
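
To make the byte layout concrete, here is a small C reference model of the single-range comparison, assuming (as stated above) that the low bound sits in the lowest byte of the range operand and the high bound in the byte just above it.  The function names cmprb_model and isdigit_model are hypothetical, and only the single-range form is modeled.

```c
#include <stdint.h>

/* Hypothetical model (not GCC code) of the single-range cmprb test:
   return 1 iff BYTE lies in the inclusive range [lo, hi], where lo is
   the lowest byte of RANGE and hi is the byte just above it.  */
static int
cmprb_model (uint8_t byte, uint32_t range)
{
  uint8_t lo = range & 0xff;
  uint8_t hi = (range >> 8) & 0xff;
  return lo <= byte && byte <= hi;
}

/* isdigit expressed with the 0x3930 encoding: hi = '9' (0x39),
   lo = '0' (0x30).  */
static int
isdigit_model (uint8_t c)
{
  return cmprb_model (c, 0x3930);
}
```

With the bounds swapped (0x3039) the range is empty in this model, because the low bound 0x39 exceeds the high bound 0x30.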

> +;; Predicate: test byte membership within set of 8 bytes.
> +;; Return in target register operand 0 a non-zero value iff the byte
> +;; held in bits 24:31 of operand 1 equals at least one of the eight
> +;; byte values represented by the 64-bit register supplied as operand
> +;; 2.  Note that the 8 byte values held within operand 2 need not be
> +;; unique. 

(trailing space)

I wonder if we really need all these predicate expanders, if it wouldn't
be easier if the builtin handling code did the setb itself?


Segher
Kelvin Nilsen Nov. 15, 2016, 6:05 p.m. UTC | #2
Thank you very much for the prompt and thorough review.  There are a few
points below where I'd like to seek further clarification.

On 11/15/2016 04:19 AM, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Nov 14, 2016 at 04:43:35PM -0700, Kelvin Nilsen wrote:
>> 	* config/rs6000/altivec.md (UNSPEC_CMPRB): New unspec value.
>> 	(UNSPEC_CMPRB2): New unspec value.
> 
> I wonder if you really need both?  The number of arguments will tell
> which is which, anyway?

I appreciate your preference to avoid proliferation of special-case
unspec constants.  However, it is not so straightforward to combine
these two cases under the same constant value.  The issue is that though
the two encodings conceptually represent different "numbers of
arguments", the arguments are all packed inside of a 32-bit register.
At the RTL level, it looks like the two different forms have the same
number of arguments (the same number of register operands).  The
difference is which bits serve relevant purposes within the incoming
register operands.

So I'm inclined to keep this as is if that's ok with you.

> 
>> 	(cmprb_p): New expansion.
> 
> Not such a great name (now you get a gen_cmprb_p function which isn't
> a predicate itself).

I'll change these names.

> 
>> 	(CMPRB): Add byte-in-range built-in function.
>> 	(CMBRB2): Add byte-in-either_range built-in function.
>> 	(CMPEQB): Add byte-in-set builtin-in function.
> 
> "builtin-in", and you typoed an underscore?

Thanks.


> 
>> +;; Predicate: test byte within range.
>> +;; Return in target register operand 0 a non-zero value iff the byte
>> +;; held in bits 24:31 of operand 1 is within the inclusive range
>> +;; bounded below by operand 2's bits 0:7 and above by operand 2's
>> +;; bits 8:15.
>> +(define_expand "cmprb_p"
> 
> It seems you got the bit numbers mixed up.  Maybe just call it the low
> byte, and the byte just above?
> 
> (And it always sets 0 or 1 here, you might want to make that more explicit).
> 
>> +;; Set bit 1 (the GT bit, 0x2) of CR register operand 0 to 1 iff the
> 
> That's 4, i.e. 0b0100.
> 
>> +;; Set operand 0 register to non-zero value iff the CR register named
>> +;; by operand 1 has its GT bit (0x2) or its LT bit (0x1) set.
>> +(define_insn "*setb"
> 
> LT is 8, GT is 4.  If LT is set it returns -1, otherwise if GT is set it
> returns 1, otherwise it returns 0.
> 

Thanks for catching this.  I think I got endian confusion inside my head
while I was writing the above.  I will rewrite these comments, below also.

>> +;; Predicate: test byte within two ranges.
>> +;; Return in target register operand 0 a non-zero value iff the byte
>> +;; held in bits 24:31 of operand 1 is within the inclusive range
>> +;; bounded below by operand 2's bits 0:7 and above by operand 2's
>> +;; bits 8:15 or if the byte is within the inclusive range bounded
>> +;; below by operand 2's bits 16:23 and above by operand 2's bits 24:31.
>> +(define_expand "cmprb2_p"
> 
> The high bound is higher in the reg than the low bound.  See the example
> where 0x3930 is used to do isdigit (and yes 0x3039 would be much more
> fun, but alas).
> 
>> +;; Predicate: test byte membership within set of 8 bytes.
>> +;; Return in target register operand 0 a non-zero value iff the byte
>> +;; held in bits 24:31 of operand 1 equals at least one of the eight
>> +;; byte values represented by the 64-bit register supplied as operand
>> +;; 2.  Note that the 8 byte values held within operand 2 need not be
>> +;; unique. 
> 
> (trailing space)
> 
> I wonder if we really need all these predicate expanders, if it wouldn't
> be easier if the builtin handling code did the setb itself?
> 

The reason it seems most "natural" to me to use the expanders is because I
need to introduce a temporary CR scratch register between expansion and
insn matching.  Also, it seems that the *setb pattern may be of more
general use in the future implementation of other built-in functions.
I'm inclined to keep this as is, but if you still feel otherwise, I'll
figure out how to avoid the expansion.
Segher Boessenkool Nov. 15, 2016, 6:22 p.m. UTC | #3
On Tue, Nov 15, 2016 at 11:05:07AM -0700, Kelvin Nilsen wrote:
> >> 	* config/rs6000/altivec.md (UNSPEC_CMPRB): New unspec value.
> >> 	(UNSPEC_CMPRB2): New unspec value.
> > 
> > I wonder if you really need both?  The number of arguments will tell
> > which is which, anyway?
> 
> I appreciate your preference to avoid proliferation of special-case
> unspec constants.  However, it is not so straightforward to combine
> these two cases under the same constant value.  The issue is that though
> the two encodings conceptually represent different "numbers of
> arguments", the arguments are all packed inside of a 32-bit register.
> At the RTL level, it looks like the two different forms have the same
> number of arguments (the same number of register operands).  The
> difference is which bits serve relevant purposes within the incoming
> register operands.
> 
> So I'm inclined to keep this as is if that's ok with you.

Ah right, for some reason I thought the unspec had all the bounds as
separate args.  -ENOTENOUGHCOFFEE.

[ snip ]

> Thanks for catching this.  I think I got endian confusion inside my head
> while I was writing the above.  I will rewrite these comments, below also.

Note the ISA calls the bits in 32-bit registers 32..63, so that 63 is
the rightmost bit in all registers.
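
A short C sketch of that numbering convention may help when rewriting the comments: bit 0 is the most significant bit of a 64-bit register and bit 63 the least significant, so a 32-bit value occupies bits 32..63.  The helper name isa_bits is hypothetical and is not part of the patch.

```c
#include <stdint.h>

/* Hypothetical helper: extract the field FIRST:LAST from REG using ISA
   bit numbering (bit 0 is the most significant bit of the 64-bit
   register, bit 63 the least significant).
   Requires 0 <= FIRST <= LAST <= 63.  */
static uint64_t
isa_bits (uint64_t reg, unsigned int first, unsigned int last)
{
  unsigned int width = last - first + 1;
  uint64_t mask = width == 64 ? ~UINT64_C (0) : (UINT64_C (1) << width) - 1;
  return (reg >> (63 - last)) & mask;
}
```

Under this convention, bits 56:63 of a register holding 0x3930 are 0x30 and bits 48:55 are 0x39.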

> > I wonder if we really need all these predicate expanders, if it wouldn't
> > be easier if the builtin handling code did the setb itself?
> > 
> 
> The reason it seems most "natural" to me to use the expanders is because I
> need to introduce a temporary CR scratch register between expansion and
> insn matching.  Also, it seems that the *setb pattern may be of more
> general use in the future implementation of other built-in functions.
> I'm inclined to keep this as is, but if you still feel otherwise, I'll
> figure out how to avoid the expansion.

The code (in rs6000.c) expanding the builtin can create two insns directly,
so that you do not need to repeat this over and over in define_expands?


Segher
Kelvin Nilsen Nov. 15, 2016, 7:16 p.m. UTC | #4
> 
>> Thanks for catching this.  I think I got endian confusion inside my head
>> while I was writing the above.  I will rewrite these comments, below also.
> 
> Note the ISA calls the bits in 32-bit registers 32..63, so that 63 is
> the rightmost bit in all registers.
> 

True, but the ISA only uses the lower half of the 64-bit register, so I
have described my patterns using SI mode instead of DI mode, which is
part of the reason I was numbering my bits differently than the ISA
document.

The reason I am using SI mode is so that I don't have to disqualify the
use of these functions on a 32-bit big-endian configuration.

Do you want me to switch to DI mode for all the operands?

>>> I wonder if we really need all these predicate expanders, if it wouldn't
>>> be easier if the builtin handling code did the setb itself?
>>>
>>
>> The reason it seems most "natural" to me to use the expanders is because I
>> need to introduce a temporary CR scratch register between expansion and
>> insn matching.  Also, it seems that the *setb pattern may be of more
>> general use in the future implementation of other built-in functions.
>> I'm inclined to keep this as is, but if you still feel otherwise, I'll
>> figure out how to avoid the expansion.
> 
> The code (in rs6000.c) expanding the builtin can create two insns directly,
> so that you do not need to repeat this over and over in define_expands?
> 

The pattern I'm familiar with is to allocate the temporary scratch
register during expansion, and to use the allocated temporary at insn
match time.  I'll have to teach myself a new pattern to do all of this
at insn match time.  Feel free to point me to an example of define_insn
code that does this.

Thanks again.
Segher Boessenkool Nov. 15, 2016, 8:18 p.m. UTC | #5
On Tue, Nov 15, 2016 at 12:16:19PM -0700, Kelvin Nilsen wrote:
> The reason I am using SI mode is so that I don't have to disqualify the
> use of these functions on a 32-bit big-endian configuration.
> 
> Do you want me to switch to DI mode for all the operands?

SI is fine, and can give slightly better code in some cases (the machine
instructions work fine with garbage in the upper half of the regs, so GCC
can avoid a zero extend in some cases if you use SImode).  Marginal
advantage here, we have much bigger suboptimalities with extensions, don't
worry too much about it :-)

> > The code (in rs6000.c) expanding the builtin can create two insns directly,
> > so that you do not need to repeat this over and over in define_expands?
> 
> The pattern I'm familiar with is to allocate the temporary scratch
> register during expansion, and to use the allocated temporary at insn
> match time.  I'll have to teach myself a new pattern to do all of this
> at insn match time.  Feel free to point me to an example of define_insn
> code that does this.

I meant not the define_insn, but the actual builtin expander code, like
for example how altivec_expand_predicate_builtin is hooked up.


Segher

Patch

Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(revision 241245)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -153,6 +153,10 @@ 
    UNSPEC_BCDADD
    UNSPEC_BCDSUB
    UNSPEC_BCD_OVERFLOW
+   UNSPEC_CMPRB
+   UNSPEC_CMPRB2
+   UNSPEC_CMPEQB
+   UNSPEC_SETB
 ])
 
 (define_c_enum "unspecv"
@@ -3709,6 +3713,116 @@ 
   "darn %0,1"
   [(set_attr "type" "integer")])
 
+;; Predicate: test byte within range.
+;; Return in target register operand 0 a non-zero value iff the byte
+;; held in bits 24:31 of operand 1 is within the inclusive range
+;; bounded below by operand 2's bits 0:7 and above by operand 2's
+;; bits 8:15.
+(define_expand "cmprb_p"
+  [(set (match_dup 3)
+	(unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
+		    (match_operand:SI 2 "gpc_reg_operand" "r")]
+	 UNSPEC_CMPRB))
+   (set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (unspec:SI [(match_dup 3)]
+         UNSPEC_SETB))
+  ]
+  "TARGET_P9_MISC"
+{
+  operands[3] = gen_reg_rtx (CCmode);
+})
+
+;; Set bit 1 (the GT bit, 0x2) of CR register operand 0 to 1 iff the
+;; byte found in bits 24:31 of register operand 1 is within the
+;; inclusive range bounded below by operand 2's bits 0:7 and above by
+;; operand 2's bits 8:15.  The other 3 bits of the target CR register
+;; are set to 0.
+(define_insn "*cmprb"
+  [(set (match_operand:CC 0 "cc_reg_operand" "=y")
+	(unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
+		    (match_operand:SI 2 "gpc_reg_operand" "r")]
+	 UNSPEC_CMPRB))]
+  "TARGET_P9_MISC"
+  "cmprb %0,0,%1,%2"
+  [(set_attr "type" "logical")])
+
+;; Set operand 0 register to non-zero value iff the CR register named
+;; by operand 1 has its GT bit (0x2) or its LT bit (0x1) set.
+(define_insn "*setb"
+   [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
+	 (unspec:SI [(match_operand:CC 1 "cc_reg_operand" "y")]
+	  UNSPEC_SETB))]
+  "TARGET_P9_MISC"
+  "setb %0,%1"
+  [(set_attr "type" "logical")])
+
+;; Predicate: test byte within two ranges.
+;; Return in target register operand 0 a non-zero value iff the byte
+;; held in bits 24:31 of operand 1 is within the inclusive range
+;; bounded below by operand 2's bits 0:7 and above by operand 2's
+;; bits 8:15 or if the byte is within the inclusive range bounded
+;; below by operand 2's bits 16:23 and above by operand 2's bits 24:31.
+(define_expand "cmprb2_p"
+  [(set (match_dup 3)
+	(unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
+		    (match_operand:SI 2 "gpc_reg_operand" "r")]
+	 UNSPEC_CMPRB2))
+   (set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (unspec:SI [(match_dup 3)]
+         UNSPEC_SETB))
+  ]
+  "TARGET_P9_MISC"
+{
+  operands[3] = gen_reg_rtx (CCmode);
+})
+
+;; Set bit 1 (the GT bit, 0x2) of CR register operand 0 to 1 iff the
+;; byte found in bits 24:31 of register operand 1 is within the
+;; inclusive range bounded below by operand 2's bits 0:7 and above by
+;; operand 2's bits 8:15 or within the inclusive range bounded below
+;; by operand 2's bits 16:23 and above by operand 2's bits 24:31.  The
+;; other 3  bits of the target CR register are set to 0.
+(define_insn "*cmprb2"
+  [(set (match_operand:CC 0 "cc_reg_operand" "=y")
+	(unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
+		    (match_operand:SI 2 "gpc_reg_operand" "r")]
+	 UNSPEC_CMPRB2))]
+  "TARGET_P9_MISC"
+  "cmprb %0,1,%1,%2"
+  [(set_attr "type" "logical")])
+
+;; Predicate: test byte membership within set of 8 bytes.
+;; Return in target register operand 0 a non-zero value iff the byte
+;; held in bits 24:31 of operand 1 equals at least one of the eight
+;; byte values represented by the 64-bit register supplied as operand
+;; 2.  Note that the 8 byte values held within operand 2 need not be
+;; unique. 
+(define_expand "cmpeqb_p"
+  [(set (match_dup 3)
+	(unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
+		    (match_operand:DI 2 "gpc_reg_operand" "r")]
+	 UNSPEC_CMPEQB))
+   (set (match_operand:SI 0 "gpc_reg_operand" "=r")
+        (unspec:SI [(match_dup 3)]
+         UNSPEC_SETB))
+  ]
+  "TARGET_P9_MISC && TARGET_64BIT"
+{
+  operands[3] = gen_reg_rtx (CCmode);
+})
+
+;; Set bit 1 (the GT bit, 0x2) of CR register operand 0 to 1 iff the
+;; byte found in bits 24:31 of register operand 1 equals one of the 8
+;; bytes found within register operand 2.
+(define_insn "*cmpeqb"
+  [(set (match_operand:CC 0 "cc_reg_operand" "=y")
+	 (unspec:CC [(match_operand:SI 1 "gpc_reg_operand" "r")
+		     (match_operand:DI 2 "gpc_reg_operand" "r")]
+	  UNSPEC_CMPEQB))]
+  "TARGET_P9_MISC && TARGET_64BIT"
+  "cmpeqb %0,%1,%2"
+  [(set_attr "type" "logical")])
+
 (define_expand "bcd<bcd_add_sub>_<code>"
   [(parallel [(set (reg:CCFP CR6_REGNO)
 		   (compare:CCFP
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(revision 241245)
+++ gcc/config/rs6000/rs6000-builtin.def	(working copy)
@@ -773,6 +773,15 @@ 
 		     | RS6000_BTC_BINARY),				\
 		    CODE_FOR_ ## ICODE)			/* ICODE */
 
+#define BU_P9V_64BIT_AV_2(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_2 (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_altivec_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR				\
+		    | RS6000_BTM_64BIT,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
 #define BU_P9V_AV_3(ENUM, NAME, ATTR, ICODE)				\
   RS6000_BUILTIN_3 (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
 		    "__builtin_altivec_" NAME,		/* NAME */	\
@@ -848,6 +857,15 @@ 
 		    (RS6000_BTC_OVERLOADED		/* ATTR */	\
 		     | RS6000_BTC_TERNARY),				\
 		    CODE_FOR_nothing)			/* ICODE */
+
+#define BU_P9_OVERLOAD_2(ENUM, NAME)					\
+  RS6000_BUILTIN_2 (P9V_BUILTIN_SCALAR_ ## ENUM,	/* ENUM */	\
+		    "__builtin_scalar_" NAME,		/* NAME */	\
+		    RS6000_BTM_P9_VECTOR,		/* MASK */	\
+		    (RS6000_BTC_OVERLOADED		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_nothing)			/* ICODE */
+
 #endif
 
 
@@ -2004,6 +2022,16 @@  BU_P9V_OVERLOAD_1 (VPRTYBD,	"vprtybd")
 BU_P9V_OVERLOAD_1 (VPRTYBQ,	"vprtybq")
 BU_P9V_OVERLOAD_1 (VPRTYBW,	"vprtybw")
 
+/* 2 argument functions added in ISA 3.0 (power9).  */
+BU_P9V_AV_2 (CMPRB,	"byte_in_range",	CONST,	cmprb_p)
+BU_P9V_AV_2 (CMPRB2,	"byte_in_either_range",	CONST,	cmprb2_p)
+BU_P9V_64BIT_AV_2 (CMPEQB,	"byte_in_set",	CONST,	cmpeqb_p)
+
+/* 2 argument overloaded functions added in ISA 3.0 (power9).  */
+BU_P9_OVERLOAD_2 (CMPRB,	"byte_in_range")
+BU_P9_OVERLOAD_2 (CMPRB2,	"byte_in_either_range")
+BU_P9_OVERLOAD_2 (CMPEQB,	"byte_in_set")
+
 /* 1 argument IEEE 128-bit floating-point functions.  */
 BU_FLOAT128_1 (FABSQ,		"fabsq",       CONST, abskf2)
 
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c	(revision 241245)
+++ gcc/config/rs6000/rs6000-c.c	(working copy)
@@ -4556,6 +4556,13 @@  const struct altivec_builtin_types altivec_overloa
   { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
     RS6000_BTI_UINTTI, RS6000_BTI_UINTTI, 0, 0 },
 
+  { P9V_BUILTIN_SCALAR_CMPRB, P9V_BUILTIN_CMPRB,
+    RS6000_BTI_INTSI, RS6000_BTI_UINTQI, RS6000_BTI_UINTSI, 0 },
+  { P9V_BUILTIN_SCALAR_CMPRB2, P9V_BUILTIN_CMPRB2,
+    RS6000_BTI_INTSI, RS6000_BTI_UINTQI, RS6000_BTI_UINTSI, 0 },
+  { P9V_BUILTIN_SCALAR_CMPEQB, P9V_BUILTIN_CMPEQB,
+    RS6000_BTI_INTSI, RS6000_BTI_UINTQI, RS6000_BTI_UINTDI, 0 },
+
   { P8V_BUILTIN_VEC_VPKUDUM, P8V_BUILTIN_VPKUDUM,
     RS6000_BTI_V4SI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
   { P8V_BUILTIN_VEC_VPKUDUM, P8V_BUILTIN_VPKUDUM,
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 241245)
+++ gcc/doc/extend.texi	(working copy)
@@ -15015,6 +15015,27 @@  long long __builtin_darn (void);
 long long __builtin_darn_raw (void);
 int __builtin_darn_32 (void);
 
+unsigned int scalar_extract_exp (double source);
+unsigned long long int scalar_extract_sig (double source);
+
+double
+scalar_insert_exp (unsigned long long int significand, unsigned long long int exponent);
+
+int scalar_cmp_exp_gt (double arg1, double arg2);
+int scalar_cmp_exp_lt (double arg1, double arg2);
+int scalar_cmp_exp_eq (double arg1, double arg2);
+int scalar_cmp_exp_unordered (double arg1, double arg2);
+
+int scalar_test_data_class (float source, unsigned int condition);
+int scalar_test_data_class (double source, unsigned int condition);
+
+int scalar_test_neg (float source);
+int scalar_test_neg (double source);
+
+int __builtin_scalar_byte_in_set (unsigned char u, unsigned long long set);
+int __builtin_scalar_byte_in_range (unsigned char u, unsigned int range);
+int __builtin_scalar_byte_in_either_range (unsigned char u, unsigned int ranges);
+
 int __builtin_dfp_dtstsfi_lt (unsigned int comparison, _Decimal64 value);
 int __builtin_dfp_dtstsfi_lt (unsigned int comparison, _Decimal128 value);
 int __builtin_dfp_dtstsfi_lt_dd (unsigned int comparison, _Decimal64 value);
@@ -15034,23 +15055,6 @@  int __builtin_dfp_dtstsfi_ov (unsigned int compari
 int __builtin_dfp_dtstsfi_ov (unsigned int comparison, _Decimal128 value);
 int __builtin_dfp_dtstsfi_ov_dd (unsigned int comparison, _Decimal64 value);
 int __builtin_dfp_dtstsfi_ov_td (unsigned int comparison, _Decimal128 value);
-
-unsigned int scalar_extract_exp (double source);
-unsigned long long int scalar_extract_sig (double source);
-
-double
-scalar_insert_exp (unsigned long long int significand, unsigned long long int exponent);
-
-int scalar_cmp_exp_gt (double arg1, double arg2);
-int scalar_cmp_exp_lt (double arg1, double arg2);
-int scalar_cmp_exp_eq (double arg1, double arg2);
-int scalar_cmp_exp_unordered (double arg1, double arg2);
-
-int scalar_test_data_class (float source, unsigned int condition);
-int scalar_test_data_class (double source, unsigned int condition);
-
-int scalar_test_neg (float source);
-int scalar_test_neg (double source);
 @end smallexample
 
 The @code{__builtin_darn} and @code{__builtin_darn_raw}
@@ -15105,6 +15109,22 @@  If all of the enabled test conditions are false, t
 The @code{scalar_test_neg} built-in functions return a non-zero value
 if their @code{source} argument holds a negative value.
 
+The @code{__builtin_scalar_byte_in_set} function requires a
+64-bit environment supporting ISA 3.0 or later.  This function returns
+a non-zero value if and only if its @code{u} argument exactly equals one of
+the eight bytes contained within its 64-bit @code{set} argument.
+
+The @code{__builtin_scalar_byte_in_range} and
+@code{__builtin_scalar_byte_in_either_range} require an environment
+supporting ISA 3.0 or later.  The first of these functions returns a
+non-zero value if and only if its @code{u} argument is within the
+range bounded between @code{(range >> 24)} and @code{((range >> 16) & 0xff)}
+inclusive.  The second of these functions returns non-zero if and only
+if its @code{u} argument is either within the range bounded between
+@code{(range >> 24)} and @code{((range >> 16) & 0xff)}
+inclusive or is within the range bounded between
+@code{((range >> 8) & 0xff)} and @code{(range & 0xff)} inclusive.
+
 The @code{__builtin_dfp_dtstsfi_lt} function returns a non-zero value
 if and only if the number of signficant digits of its @code{value} argument
 is less than its @code{comparison} argument.  The
Index: gcc/testsuite/gcc.target/powerpc/byte-in-either-range-0.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/byte-in-either-range-0.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/byte-in-either-range-0.c	(working copy)
@@ -0,0 +1,25 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-mcpu=power9" } */
+
+/* This test should succeed on both 32- and 64-bit configurations.  */
+#include <altivec.h>
+
+int
+test_byte_in_either_range (unsigned char b,
+			   unsigned char first_lo_bound,
+			   unsigned char first_hi_bound,
+			   unsigned char second_lo_bound,
+			   unsigned char second_hi_bound)
+{
+  unsigned int range_encoding;
+  range_encoding = ((first_hi_bound << 24) | (first_lo_bound << 16)
+		    | (second_hi_bound << 8) | second_lo_bound);
+
+  return __builtin_scalar_byte_in_either_range (b, range_encoding);
+}
+
+/* { dg-final { scan-assembler "cmprb" } } */
+/* { dg-final { scan-assembler "setb" } } */
Index: gcc/testsuite/gcc.target/powerpc/byte-in-either-range-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/byte-in-either-range-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/byte-in-either-range-1.c	(working copy)
@@ -0,0 +1,22 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-mcpu=power8" } */
+
+/* This test should succeed on both 32- and 64-bit configurations.  */
+#include <altivec.h>
+
+int
+test_byte_in_either_range (unsigned char b,
+			   unsigned char first_lo_bound,
+			   unsigned char first_hi_bound,
+			   unsigned char second_lo_bound,
+			   unsigned char second_hi_bound)
+{
+  unsigned int range_encoding;
+  range_encoding = ((first_hi_bound << 24) | (first_lo_bound << 16)
+		    | (second_hi_bound << 8) | second_lo_bound);
+
+  return __builtin_scalar_byte_in_either_range (b, range_encoding); /* { dg-error "Builtin function __builtin_altivec_byte_in_either_range requires" } */
+}
Index: gcc/testsuite/gcc.target/powerpc/byte-in-range-0.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/byte-in-range-0.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/byte-in-range-0.c	(working copy)
@@ -0,0 +1,19 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-mcpu=power9" } */
+
+/* This test should succeed on both 32- and 64-bit configurations.  */
+#include <altivec.h>
+
+int
+test_byte_in_range (unsigned char b,
+		    unsigned char low_range, unsigned char high_range)
+{
+  unsigned int range_encoding = (high_range << 24) | (low_range << 16);
+  return __builtin_scalar_byte_in_range (b, range_encoding);
+}
+
+/* { dg-final { scan-assembler "cmprb" } } */
+/* { dg-final { scan-assembler "setb" } } */
Index: gcc/testsuite/gcc.target/powerpc/byte-in-range-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/byte-in-range-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/byte-in-range-1.c	(working copy)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-mcpu=power8" } */
+
+#include <altivec.h>
+
+int
+test_byte_in_range (unsigned char b,
+		    unsigned char low_range, unsigned char high_range)
+{
+  unsigned int range_encoding = (high_range << 24) | (low_range << 16);
+  return __builtin_scalar_byte_in_range (b, range_encoding); /* { dg-error "Builtin function __builtin_altivec_byte_in_range requires" } */
+}
+
Index: gcc/testsuite/gcc.target/powerpc/byte-in-set-0.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/byte-in-set-0.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/byte-in-set-0.c	(working copy)
@@ -0,0 +1,18 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-mcpu=power9" } */
+
+/* This test should succeed only on 64-bit configurations.  */
+#include <altivec.h>
+
+int
+test_byte_in_set (unsigned char b, unsigned long long set_members)
+{
+  return __builtin_scalar_byte_in_set (b, set_members);
+}
+
+/* { dg-final { scan-assembler "cmpeqb" } } */
+/* { dg-final { scan-assembler "setb" } } */
Index: gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/byte-in-set-1.c	(working copy)
@@ -0,0 +1,14 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-mcpu=power8" } */
+
+#include <altivec.h>
+
+int
+test_byte_in_set (unsigned char b, unsigned long long set_members)
+{
+  return __builtin_scalar_byte_in_set (b, set_members); /* { dg-error "Builtin function __builtin_altivec_byte_in_set requires" } */
+}
Index: gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c	(working copy)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-mcpu=power9" } */
+
+#include <altivec.h>
+
+/* This test should succeed only on 32-bit configurations.  */
+
+int
+test_byte_in_set (unsigned char b, unsigned long long set_members)
+{
+  return __builtin_scalar_byte_in_set (b, set_members); /* { dg-error "Builtin function __builtin_scalar_byte_in_set not supported in this compiler configuration" } */
+}