RS6000, add VSX mask manipulation support

Message ID 233966dd2d0070a68126dd86a7ed98d063de92c8.camel@us.ibm.com
State New

Commit Message

Carl Love May 22, 2020, 8:27 p.m. UTC
GCC maintainers:

The following patch adds support for the builtins vec_genbm(), vec_genhm(),
vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(), and
vec_extractm().  It also adds support for the instructions mtvsrbm, mtvsrhm,
mtvsrwm, mtvsrdm, mtvsrqm, cntm, vexpandm, and vextractm.
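
As a rough illustration of what the byte variants of these builtins compute, here is a portable scalar sketch in C. It assumes that the mask-generate operation turns each of the low 16 bits of its argument into an all-ones or all-zeros byte, and that the extract operation gathers the most significant bit of each byte back into an integer. The names model_genbm, model_extractbm and model_roundtrip_check are made up for this sketch, the mapping of bit i to element i is an assumption, and none of this is code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model: bit i of MASK selects byte i of OUT.  */
static void model_genbm(uint64_t mask, uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = ((mask >> i) & 1u) ? 0xFF : 0x00;
}

/* Hypothetical scalar model: collect the top bit of each byte.  */
static unsigned model_extractbm(const uint8_t in[16])
{
    unsigned result = 0;
    for (int i = 0; i < 16; i++)
        result |= (unsigned)(in[i] >> 7) << i;
    return result;
}

/* Generating a mask and extracting it again returns the low 16 bits.  */
static void model_roundtrip_check(void)
{
    uint8_t v[16];
    model_genbm(0xA5C3u, v);
    assert(model_extractbm(v) == 0xA5C3u);
}
```

Under these assumptions the two models are inverses on 16-bit inputs, which is the kind of property a runnable test can check.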

The patch has been tested on:

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for mainline.

Thanks.

               Carl Love
-------------------------------------------------------------------

RS6000 RFC 2629, add VSX mask manipulation support

gcc/ChangeLog

2020-05-22  Carl Love  <cel@us.ibm.com>

	* config/rs6000/vsx.md  (VSX_MM): New define_mode_iterator.
	(VSX_MM4): New define_mode_iterator.
	(VSX_MM_SUFFIX4): New define_mode_attr.
	(vec_mtvsrbm): New define_expand.
	(vec_mtvsrbmi): New define_insn.
	(vec_mtvsr_<mode>): New define_insn.
	(vec_cntmb_<mode>): New define_insn.
	(vec_extract_<mode>): New define_insn.
	(vec_expand_<mode>): New define_insn.
	(define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
	UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
	* config/rs6000/altivec.h: Add defines vec_genbm, vec_genhm, vec_genwm,
	vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm.
	* config/rs6000/rs6000-builtin.c: Add defines BU_FUTURE_2, BU_FUTURE_1.
	(BU_FUTURE_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
	mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd, vexpandmq,
	vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
	(BU_FUTURE_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
	(BU_FUTURE_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
	mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
	(BU_FUTURE_OVERLOAD_2): Add definition for cntm.
	* config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
	checks for CODE_FOR_vec_cntmb_v16qi, CODE_FOR_vec_cntmb_v8hi,
	CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
	(altivec_overloaded_builtins): Add overloaded argument entries for
	FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_VEC_MTVSRHM, FUTURE_BUILTIN_VEC_MTVSRWM,
	FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_VEC_MTVSRQM, FUTURE_BUILTIN_VEC_VCNTMBB,
	FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH, FUTURE_BUILTIN_VCNTMBW,
	FUTURE_BUILTIN_VCNTMBD, FUTURE_BUILTIN_VEXPANDMB, FUTURE_BUILTIN_VEXPANDMH,
	FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD, FUTURE_BUILTIN_VEXPANDMQ,
	FUTURE_BUILTIN_VEXTRACTMB, FUTURE_BUILTIN_VEXTRACTMH, FUTURE_BUILTIN_VEXTRACTMW,
	FUTURE_BUILTIN_VEXTRACTMD, FUTURE_BUILTIN_VEXTRACTMQ.
	(builtin_function_type): Add case entries for FUTURE_BUILTIN_MTVSRBM,
	FUTURE_BUILTIN_MTVSRHM, FUTURE_BUILTIN_MTVSRWM, FUTURE_BUILTIN_MTVSRDM,
	FUTURE_BUILTIN_MTVSRQM, FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH,
	FUTURE_BUILTIN_VCNTMBW, FUTURE_BUILTIN_VCNTMBD, FUTURE_BUILTIN_VEXPANDMB,
	FUTURE_BUILTIN_VEXPANDMH, FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD,
	FUTURE_BUILTIN_VEXPANDMQ.
	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add entries
	for MTVSRBM, MTVSRHM, MTVSRWM, MTVSRDM, MTVSRQM, VCNTM, VEXPANDM, VEXTRACTM.
	* testsuite/gcc.target/powerpc/vsx_mask-runnable.c:  Add runnable test case.
---
 gcc/config/rs6000/altivec.h                   |  10 +
 gcc/config/rs6000/rs6000-builtin.def          |  45 ++
 gcc/config/rs6000/rs6000-call.c               |  66 +-
 gcc/config/rs6000/vsx.md                      |  67 ++
 .../gcc.target/powerpc/vsx_mask-runnable.c    | 614 ++++++++++++++++++
 5 files changed, 801 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c

Comments

will schmidt May 26, 2020, 5:29 p.m. UTC | #1
On Fri, 2020-05-22 at 13:27 -0700, Carl Love wrote:
> GCC maintainers:
> 
> The following patch adds support for builtins
> vec_genbm(),  vec_genhm(),
> vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(),
> vec_extractm().  Support for instructions mtvsrbm, mtvsrhm, mtvsrwm,
> mtvsrdm, mtvsrqm, cntm, vexpandm, vextractm.
> 
> The test has been tested on:
> 
>   powerpc64le-unknown-linux-gnu (Power 9 LE)
> 
> and mambo with no regression errors.
> 
> Please let me know if this patch is acceptable for mainline.
> 
> Thanks.
> 
>                Carl Love
> -------------------------------------------------------------------
> 
> RS6000 RFC 2629, add VSX mask manipulation support
> 
> gcc/ChangeLog
> 
> 2020-05-22  Carl Love  <cel@us.ibm.com>
> 
> 	* config/rs6000/vsx.md  (VSX_MM): New define_mode_iterator.
> 	(VSX_MM4): New define_mode_iterator.
> 	(VSX_MM_SUFFIX4): New define_mode_attr.
> 	(vec_mtvsrbm): New define_expand.
> 	(vec_mtvsrbmi): New define_insn.
> 	(vec_mtvsr_<mode>): New define_insn.
> 	(vec_cntmb_<mode>): New define_insn.
> 	(vec_extract_<mode>): New define_insn.
> 	(vec_expand_<mode>): New define_insn.
> 	(define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
> 	UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
> 	* config/rs6000/altivec.h: Add defines vec_genbm, vec_genhm, vec_genwm,
> 	vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm.

Nit (?): name/symbol first, i.e.
(vec_genbm, vec_genhm, ...): Add definitions.

> 	* config/rs6000/rs6000-builtin.c: Add defines BU_FUTURE_2, BU_FUTURE_1.
> 	(BU_FUTURE_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
> 	mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd, vexpandmq,
> 	vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
> 	(BU_FUTURE_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
> 	(BU_FUTURE_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
> 	mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
> 	(BU_FUTURE_OVERLOAD_2): Add defition for cntm.
> 	* config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
> 	checks for CODE_FOR_vec_cntmbb_v16qi, CODE_FOR_vec_cntmb_v8hi,
> 	CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
> 	(altivec_overloaded_builtins): Add overloaded argument entries for
> 	FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_VEC_MTVSRHM, FUTURE_BUILTIN_VEC_MTVSRWM,
> 	FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_VEC_MTVSRQM, FUTURE_BUILTIN_VEC_VCNTMBB,
> 	FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH, FUTURE_BUILTIN_VCNTMBW,
> 	FUTURE_BUILTIN_VCNTMBD, FUTURE_BUILTIN_VEXPANDMB, FUTURE_BUILTIN_VEXPANDMH,
> 	FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD, FUTURE_BUILTIN_VEXPANDMQ,

> 	FUTURE_BUILTIN_VEXTRACTMB, FUTURE_BUILTIN_VEXTRACTMH, FUTURE_BUILTIN_VEXTRACTMW,
> 	FUTURE_BUILTIN_VEXTRACTMD, FUTURE_BUILTIN_VEXTRACTMQ.

> 	(builtin_function_type): Add case entries for FUTURE_BUILTIN_MTVSRBM,
> 	FUTURE_BUILTIN_MTVSRHM, FUTURE_BUILTIN_MTVSRWM, FUTURE_BUILTIN_MTVSRDM,
> 	FUTURE_BUILTIN_MTVSRQM, FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH,
> 	FUTURE_BUILTIN_VCNTMBW, FUTURE_BUILTIN_VCNTMBD, FUTURE_BUILTIN_VEXPANDMB,
> 	FUTURE_BUILTIN_VEXPANDMH, FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD,
> 	FUTURE_BUILTIN_VEXPANDMQ.
> 	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add entries
> 	for MTVSRBM, MTVSRHM, MTVSRWM, MTVSRDM, MTVSRQM, VCNTM, VEXPANDM, VEXTRACTM.

The rs6000-c.c reference here ^ doesn't exist below.  Looks like that
was moved to rs6000-builtin.def.



> 	* testsuite/gcc.target/powerpc/vsx_mask-runnable.c:  Add runnable test case.


> ---
>  gcc/config/rs6000/altivec.h                   |  10 +
>  gcc/config/rs6000/rs6000-builtin.def          |  45 ++
>  gcc/config/rs6000/rs6000-call.c               |  66 +-
>  gcc/config/rs6000/vsx.md                      |  67 ++
>  .../gcc.target/powerpc/vsx_mask-runnable.c    | 614 ++++++++++++++++++



>  5 files changed, 801 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
> 
> diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
> index 0a7e8ab3647..5917d3a2b76 100644
> --- a/gcc/config/rs6000/altivec.h
> +++ b/gcc/config/rs6000/altivec.h
> @@ -710,6 +710,16 @@ __altivec_scalar_pred(vec_any_nle,
> 
>  #define vec_strir_p(a)	__builtin_vec_strir_p (a)
>  #define vec_stril_p(a)	__builtin_vec_stril_p (a)
> +
> +/* VSX Mask Manipulation builtin. */
> +#define vec_genbm __builtin_vec_mtvsrbm
> +#define vec_genhm __builtin_vec_mtvsrhm
> +#define vec_genwm __builtin_vec_mtvsrwm
> +#define vec_gendm __builtin_vec_mtvsrdm
> +#define vec_genqm __builtin_vec_mtvsrqm
> +#define vec_cntm __builtin_vec_cntm
> +#define vec_expandm __builtin_vec_vexpandm
> +#define vec_extractm __builtin_vec_vextractm
>  #endif

ok

> 
>  #endif /* _ALTIVEC_H */
> diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
> index 8b1ddb00045..7cab5097aeb 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -1049,6 +1049,22 @@
>  		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
>  		     | RS6000_BTC_TERNARY),				\
>  		    CODE_FOR_ ## ICODE)			/* ICODE */
> +
> +#define BU_FUTURE_1(ENUM, NAME, ATTR, ICODE)			\
> +  RS6000_BUILTIN_1 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
> +		    "__builtin_vec" NAME,		/* NAME */	\
> +		    RS6000_BTM_FUTURE,			/* MASK */	\
> +		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
> +		     | RS6000_BTC_UNARY),				\
> +		    CODE_FOR_ ## ICODE)			/* ICODE */
> +
> +#define BU_FUTURE_2(ENUM, NAME, ATTR, ICODE)			\
> +  RS6000_BUILTIN_2 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
> +		    "__builtin_vec" NAME,		/* NAME */	\
> +		    RS6000_BTM_FUTURE,			/* MASK */	\
> +		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
> +		     | RS6000_BTC_BINARY),				\
> +		    CODE_FOR_ ## ICODE)			/* ICODE */
>  #endif
> 

ok
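
For readers less familiar with this macro style, the sketch below shows the two preprocessor features BU_FUTURE_1 relies on: token pasting (##) to build the enum name, and adjacent string literal concatenation to build the builtin name. DEMO_BU_1, demo_entry and the DEMO_BUILTIN_* enum are simplified stand-ins invented for this example, not the real RS6000_BUILTIN_1 machinery.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the GCC enum and table entry.  */
enum demo_builtin { DEMO_BUILTIN_MTVSRBM, DEMO_BUILTIN_MTVSRHM };

struct demo_entry {
    enum demo_builtin code;
    const char *name;
};

/* Same shape as BU_FUTURE_1: paste the enum suffix, and let adjacent
   string literals concatenate to form the builtin name.  */
#define DEMO_BU_1(ENUM, NAME) \
    { DEMO_BUILTIN_ ## ENUM, "__builtin_vec" NAME },

static const struct demo_entry demo_table[] = {
    DEMO_BU_1 (MTVSRBM, "mtvsrbm")
    DEMO_BU_1 (MTVSRHM, "mtvsrhm")
};

/* The prefix has no trailing underscore, so none appears in the name.  */
static void demo_check(void)
{
    assert(strcmp(demo_table[0].name, "__builtin_vecmtvsrbm") == 0);
}
```

In this sketch the concatenated name is "__builtin_vecmtvsrbm", because the prefix is written as "__builtin_vec" with no trailing underscore; the overloaded names that altivec.h refers to use the "__builtin_vec_" form.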

>  
> @@ -2637,6 +2653,26 @@ BU_FUTURE_V_1 (VSTRIHR_P, "vstrihr_p", CONST, vstrir_p_v8hi)
>  BU_FUTURE_V_1 (VSTRIBL_P, "vstribl_p", CONST, vstril_p_v16qi)
>  BU_FUTURE_V_1 (VSTRIHL_P, "vstrihl_p", CONST, vstril_p_v8hi)
> 
> +BU_FUTURE_1 (MTVSRBM, "mtvsrbm", CONST, vec_mtvsrbm)
> +BU_FUTURE_1 (MTVSRHM, "mtvsrhm", CONST, vec_mtvsr_v8hi)
> +BU_FUTURE_1 (MTVSRWM, "mtvsrwm", CONST, vec_mtvsr_v4si)
> +BU_FUTURE_1 (MTVSRDM, "mtvsrdm", CONST, vec_mtvsr_v2di)
> +BU_FUTURE_1 (MTVSRQM, "mtvsrqm", CONST, vec_mtvsr_v1ti)
> +BU_FUTURE_2 (VCNTMBB, "cntmbb", CONST, vec_cntmb_v16qi)
> +BU_FUTURE_2 (VCNTMBH, "cntmbh", CONST, vec_cntmb_v8hi)
> +BU_FUTURE_2 (VCNTMBW, "cntmbw", CONST, vec_cntmb_v4si)
> +BU_FUTURE_2 (VCNTMBD, "cntmbd", CONST, vec_cntmb_v2di)
> +BU_FUTURE_1 (VEXPANDMB, "vexpandmb", CONST, vec_expand_v16qi)
> +BU_FUTURE_1 (VEXPANDMH, "vexpandmh", CONST, vec_expand_v8hi)
> +BU_FUTURE_1 (VEXPANDMW, "vexpandmw", CONST, vec_expand_v4si)
> +BU_FUTURE_1 (VEXPANDMD, "vexpandmd", CONST, vec_expand_v2di)
> +BU_FUTURE_1 (VEXPANDMQ, "vexpandmq", CONST, vec_expand_v1ti)
> +BU_FUTURE_1 (VEXTRACTMB, "vextractmb", CONST, vec_extract_v16qi)
> +BU_FUTURE_1 (VEXTRACTMH, "vextractmh", CONST, vec_extract_v8hi)
> +BU_FUTURE_1 (VEXTRACTMW, "vextractmw", CONST, vec_extract_v4si)
> +BU_FUTURE_1 (VEXTRACTMD, "vextractmd", CONST, vec_extract_v2di)
> +BU_FUTURE_1 (VEXTRACTMQ, "vextractmq", CONST, vec_extract_v1ti)
> +
>  /* Future architecture overloaded vector built-ins.  */
>  BU_FUTURE_OVERLOAD_2 (CLRL, "clrl")
>  BU_FUTURE_OVERLOAD_2 (CLRR, "clrr")
> @@ -2652,6 +2688,15 @@ BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril")
> 
>  BU_FUTURE_OVERLOAD_1 (VSTRIR_P, "strir_p")
>  BU_FUTURE_OVERLOAD_1 (VSTRIL_P, "stril_p")
> +
> +BU_FUTURE_OVERLOAD_1 (MTVSRBM, "mtvsrbm")
> +BU_FUTURE_OVERLOAD_1 (MTVSRHM, "mtvsrhm")
> +BU_FUTURE_OVERLOAD_1 (MTVSRWM, "mtvsrwm")
> +BU_FUTURE_OVERLOAD_1 (MTVSRDM, "mtvsrdm")
> +BU_FUTURE_OVERLOAD_1 (MTVSRQM, "mtvsrqm")
> +BU_FUTURE_OVERLOAD_2 (VCNTM, "cntm")
> +BU_FUTURE_OVERLOAD_1 (VEXPANDM, "vexpandm")
> +BU_FUTURE_OVERLOAD_1 (VEXTRACTM, "vextractm")
> 

ok

>  
>  /* 1 argument crypto functions.  */
>  BU_CRYPTO_1 (VSBOX,		"vsbox",	  CONST, crypto_vsbox_v2di)
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 0ac8054d030..f50c859b807 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -5618,6 +5618,52 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
>    { FUTURE_BUILTIN_VEC_VSTRIR_P, FUTURE_BUILTIN_VSTRIHR_P,
>      RS6000_BTI_INTSI, RS6000_BTI_V8HI, 0, 0 },
> 
> +  { FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_MTVSRBM,
> +    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_MTVSRHM, FUTURE_BUILTIN_MTVSRHM,
> +    RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTDI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_MTVSRWM, FUTURE_BUILTIN_MTVSRWM,
> +    RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTDI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_MTVSRDM,
> +    RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTDI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_MTVSRQM, FUTURE_BUILTIN_MTVSRQM,
> +    RS6000_BTI_unsigned_V1TI, RS6000_BTI_UINTDI, 0, 0 },
> +
> +  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBB,
> +    RS6000_BTI_unsigned_long_long,
> +    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI, 0 },
> +  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBH,
> +    RS6000_BTI_unsigned_long_long,
> +    RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTQI, 0 },
> +  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBW,
> +    RS6000_BTI_unsigned_long_long,
> +    RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI, 0 },
> +  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBD,
> +    RS6000_BTI_unsigned_long_long,
> +    RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI, 0 },
> +
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMB,
> +    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMH,
> +    RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMW,
> +    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMD,
> +    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMQ,
> +    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0, 0 },
> +
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMB,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMH,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V8HI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMW,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V4SI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMD,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMQ,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, 0, 0 },
> +
>    { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 }
>  };

ok
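
The overload table above can be read as a list of (generic builtin, specific builtin, return type, argument types) rows ending in a sentinel. A minimal sketch of how such a table might be searched follows, assuming the resolver picks the first row whose argument type matches. All names here (demo_overload, demo_resolve, the T_* and C_* enums) are hypothetical and much simpler than GCC's actual resolver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type and builtin codes; the real ones are the
   RS6000_BTI_* and FUTURE_BUILTIN_* values in GCC.  */
enum demo_type { T_NONE, T_V16QI, T_V8HI, T_INT };
enum demo_code { C_NONE, C_VEC_EXTRACTM, C_EXTRACTMB, C_EXTRACTMH };

struct demo_overload {
    enum demo_code overloaded;  /* generic name the user calls */
    enum demo_code specific;    /* instance chosen for these types */
    enum demo_type ret, arg1;
};

static const struct demo_overload demo_overloads[] = {
    { C_VEC_EXTRACTM, C_EXTRACTMB, T_INT, T_V16QI },
    { C_VEC_EXTRACTM, C_EXTRACTMH, T_INT, T_V8HI },
    { C_NONE, C_NONE, T_NONE, T_NONE }  /* sentinel row */
};

/* Return the specific builtin for (overloaded, arg1), or C_NONE.  */
static enum demo_code demo_resolve(enum demo_code overloaded,
                                   enum demo_type arg1)
{
    for (const struct demo_overload *p = demo_overloads;
         p->overloaded != C_NONE; p++)
        if (p->overloaded == overloaded && p->arg1 == arg1)
            return p->specific;
    return C_NONE;
}

/* A halfword vector argument selects the halfword instance.  */
static void demo_resolve_check(void)
{
    assert(demo_resolve(C_VEC_EXTRACTM, T_V8HI) == C_EXTRACTMH);
}
```

The sentinel row plays the same role as the { RS6000_BUILTIN_NONE, ... } entry that closes the real table.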

>  
> @@ -8968,7 +9014,11 @@ rs6000_expand_binop_builtin (enum insn_code icode, tree exp, rtx target)
>  	   || icode == CODE_FOR_unpackkf
>  	   || icode == CODE_FOR_unpacktf
>  	   || icode == CODE_FOR_unpackif
> -	   || icode == CODE_FOR_unpacktd)
> +	   || icode == CODE_FOR_unpacktd
> +	   || icode == CODE_FOR_vec_cntmb_v16qi
> +	   || icode == CODE_FOR_vec_cntmb_v8hi
> +	   || icode == CODE_FOR_vec_cntmb_v4si
> +	   || icode == CODE_FOR_vec_cntmb_v2di)
>      {
>        /* Only allow 1-bit unsigned literals. */
>        STRIP_NOPS (arg1);
> @@ -13170,6 +13220,20 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
>      case P8V_BUILTIN_VGBBD:
>      case MISC_BUILTIN_CDTBCD:
>      case MISC_BUILTIN_CBCDTD:
> +    case FUTURE_BUILTIN_MTVSRBM:
> +    case FUTURE_BUILTIN_MTVSRHM:
> +    case FUTURE_BUILTIN_MTVSRWM:
> +    case FUTURE_BUILTIN_MTVSRDM:
> +    case FUTURE_BUILTIN_MTVSRQM:
> +    case FUTURE_BUILTIN_VCNTMBB:
> +    case FUTURE_BUILTIN_VCNTMBH:
> +    case FUTURE_BUILTIN_VCNTMBW:
> +    case FUTURE_BUILTIN_VCNTMBD:
> +    case FUTURE_BUILTIN_VEXPANDMB:
> +    case FUTURE_BUILTIN_VEXPANDMH:
> +    case FUTURE_BUILTIN_VEXPANDMW:
> +    case FUTURE_BUILTIN_VEXPANDMD:
> +    case FUTURE_BUILTIN_VEXPANDMQ:
>        h.uns_p[0] = 1;
>        h.uns_p[1] = 1;
>        break;

ok
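
The "Only allow 1-bit unsigned literals" branch that the vec_cntmb icodes are added to restricts the second argument to the constants 0 and 1. A small sketch of that kind of check is shown below, assuming the argument has already been classified as constant or not. demo_is_1bit_literal is a hypothetical helper, not the GCC code, which works on tree nodes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check: accept only a compile-time constant whose value
   has no bits set other than bit 0.  */
static bool demo_is_1bit_literal(bool is_constant, unsigned long long value)
{
    return is_constant && (value & ~1ULL) == 0;
}

/* 0 and 1 pass; anything wider, or a non-constant, is rejected.  */
static void demo_literal_check(void)
{
    assert(demo_is_1bit_literal(true, 1));
    assert(!demo_is_1bit_literal(true, 2));
}
```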

> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 2a28215ac5b..96b6ad22812 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -263,6 +263,13 @@
>  ;; Mode attribute to give the suffix for the splat instruction
>  (define_mode_attr VSX_SPLAT_SUFFIX [(V16QI "b") (V8HI "h")])
> 
> +;; Iterator for the move to mask instructions
> +(define_mode_iterator VSX_MM [V16QI V8HI V4SI V2DI V1TI])
> +(define_mode_iterator VSX_MM4 [V16QI V8HI V4SI V2DI])
> +
> +;; Mode attribute to give the suffix for the mask instruction
> +(define_mode_attr VSX_MM_SUFFIX [(V16QI "b") (V8HI "h") (V4SI "w") (V2DI "d") (V1TI "q")])
> +
>  ;; Constants for creating unspecs
>  (define_c_enum "unspec"
>    [UNSPEC_VSX_CONCAT
> @@ -344,6 +351,10 @@
>     UNSPEC_VSX_FIRST_MISMATCH_INDEX
>     UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX
>     UNSPEC_XXGENPCV
> +   UNSPEC_MTVSBM
> +   UNSPEC_VCNTMB
> +   UNSPEC_VEXPAND
> +   UNSPEC_VEXTRACT
>    ])
> 
>  ;; VSX moves
> @@ -5676,3 +5687,59 @@
>    DONE;
>  })
> 
> +;; VSX mask manipulation instructions
> +(define_expand "vec_mtvsrbm"
> +  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
> +        (unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand" "b")]
> +       UNSPEC_MTVSBM))]
> +   "TARGET_FUTURE"
> + {
> +  if (IN_RANGE (INTVAL (operands[1]), 0, 63))
> +     /* This is the vec_mtvsrbmi inst with six bit constant.  */

It is either the vec_mtvsrbmi built-in, or the mtvsrbmi instruction.


> +    emit_insn (gen_vec_mtvsrbmi (operands[0], operands[1]));
> +  else
> +    emit_insn (gen_vec_mtvsr_v16qi (operands[0], operands[1]));
> +
> +  DONE;
> +})
> +
> +(define_insn "vec_mtvsrbmi"
> +  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
> +        (unspec:V16QI [(match_operand:QI 1 "u6bit_cint_operand" "n")]
> +        UNSPEC_MTVSBM))]
> +  "TARGET_FUTURE"
> +  "mtvsrbmi %0,%1"
> +)
> +
> +(define_insn "vec_mtvsr_<mode>"
> +  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
> +        (unspec:VSX_MM [(match_operand:DI 1 "gpc_reg_operand" "b")]
> +        UNSPEC_MTVSBM))]
> +  "TARGET_FUTURE"
> +  "mtvsr<VSX_MM_SUFFIX>m %0,%1";
> +  [(set_attr "type" "vecsimple")])
> +
> +(define_insn "vec_cntmb_<mode>"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (unspec:DI [(match_operand:VSX_MM4 1 "vsx_register_operand" "v")
> +                    (match_operand:QI 2 "const_0_to_1_operand" "n")]
> +        UNSPEC_VCNTMB))]
> +  "TARGET_FUTURE"
> +  "vcntmb<VSX_MM_SUFFIX> %0,%1,%2"
> +  [(set_attr "type" "vecsimple")])
> +
> +(define_insn "vec_extract_<mode>"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +	(unspec:SI [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
> +	UNSPEC_VEXTRACT))]
> +  "TARGET_FUTURE"
> +  "vextract<VSX_MM_SUFFIX>m %0,%1"
> +  [(set_attr "type" "vecsimple")])
> +
> +(define_insn "vec_expand_<mode>"
> +  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
> +        (unspec:VSX_MM [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
> +        UNSPEC_VEXPAND))]
> +  "TARGET_FUTURE"
> +  "vexpand<VSX_MM_SUFFIX>m %0,%1"
> +  [(set_attr "type" "vecsimple")])

ok
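
To make the count and expand patterns easier to follow, here is a scalar C sketch of the byte variants. It assumes the count operation counts elements whose most significant bit equals the 0/1 operand, and that the expand operation replicates each element's most significant bit across the element. The function names are invented for this example, and the sketch does not model where the hardware places the count within the destination register.

```c
#include <assert.h>
#include <stdint.h>

/* Count the bytes whose top bit equals MP (0 or 1).  */
static unsigned model_cntm_bytes(const uint8_t in[16], unsigned mp)
{
    unsigned count = 0;
    for (int i = 0; i < 16; i++)
        if ((unsigned)(in[i] >> 7) == (mp & 1u))
            count++;
    return count;
}

/* Replicate the top bit of each byte across the whole byte.  */
static void model_expandm_bytes(const uint8_t in[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (in[i] & 0x80) ? 0xFF : 0x00;
}

/* The counts for MP = 0 and MP = 1 always add up to 16 bytes.  */
static void model_count_check(const uint8_t in[16])
{
    assert(model_cntm_bytes(in, 0) + model_cntm_bytes(in, 1) == 16);
}
```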

> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
>  b/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
> new file mode 100644
> index 00000000000..8eab7107b15
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
> 
<snip testcase>

I'd probably chop the test up into a few smaller tests, no issue or
concern with the test itself.

Aside from the cosmetic nits mentioned above, lgtm.

Thanks
-Will


Carl Love May 27, 2020, 3:50 p.m. UTC | #2
GCC maintainers:

I have addressed the following comments on the patch from Will:

  - ChangeLog: fixed name/symbol order;
    changed reference from rs6000-c.c to rs6000-builtin.def.

  - define_expand "vec_mtvsrbm": changed name to vec_mtvsrbm_mtvsrbmi,
    updated comment.

  - vsx_mask-runnable.c: divided it up into four smaller test cases,
    vsx_mask-count-runnable.c, vsx_mask-expand-runnable.c,
    vsx_mask-extract-runnable.c, vsx_mask-move-runnable.c.

Please let me know if there are additional concerns.  Thanks.

                       Carl Love

-------------------------------------------------------
RS6000 RFC 2629, add VSX mask manipulation support

The following patch adds support for the builtins vec_genbm(), vec_genhm(),
vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(), and
vec_extractm().  It also adds support for the instructions mtvsrbm, mtvsrhm,
mtvsrwm, mtvsrdm, mtvsrqm, cntm, vexpandm, and vextractm.

The patch has been tested on:

  powerpc64le-unknown-linux-gnu (Power 9 LE)

and mambo with no regression errors.

Please let me know if this patch is acceptable for inclusion in the pu
branch.  Thanks.

               Carl Love
-------------------------------------------------------------------

RS6000 RFC 2629, add VSX mask manipulation support

gcc/ChangeLog

2020-05-27  Carl Love  <cel@us.ibm.com>

	* config/rs6000/vsx.md  (VSX_MM): New define_mode_iterator.
	(VSX_MM4): New define_mode_iterator.
	(VSX_MM_SUFFIX): New define_mode_attr.
	(vec_mtvsrbm_mtvsrbmi): New define_expand.
	(vec_mtvsrbmi): New define_insn.
	(vec_mtvsr_<mode>): New define_insn.
	(vec_cntmb_<mode>): New define_insn.
	(vec_extract_<mode>): New define_insn.
	(vec_expand_<mode>): New define_insn.
	(define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
	UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
	* config/rs6000/altivec.h (vec_genbm, vec_genhm, vec_genwm,
	vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm): Add defines.
	* config/rs6000/rs6000-builtin.c: Add defines BU_FUTURE_2, BU_FUTURE_1.
	(BU_FUTURE_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
	mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd, vexpandmq,
	vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
	(BU_FUTURE_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
	(BU_FUTURE_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
	mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
	(BU_FUTURE_OVERLOAD_2): Add definition for cntm.
	* config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
	checks for CODE_FOR_vec_cntmb_v16qi, CODE_FOR_vec_cntmb_v8hi,
	CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
	(altivec_overloaded_builtins): Add overloaded argument entries for
	FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_VEC_MTVSRHM, FUTURE_BUILTIN_VEC_MTVSRWM,
	FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_VEC_MTVSRQM, FUTURE_BUILTIN_VEC_VCNTMBB,
	FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH, FUTURE_BUILTIN_VCNTMBW,
	FUTURE_BUILTIN_VCNTMBD, FUTURE_BUILTIN_VEXPANDMB, FUTURE_BUILTIN_VEXPANDMH,
	FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD, FUTURE_BUILTIN_VEXPANDMQ,
	FUTURE_BUILTIN_VEXTRACTMB, FUTURE_BUILTIN_VEXTRACTMH, FUTURE_BUILTIN_VEXTRACTMW,
	FUTURE_BUILTIN_VEXTRACTMD, FUTURE_BUILTIN_VEXTRACTMQ.
	(builtin_function_type): Add case entries for FUTURE_BUILTIN_MTVSRBM,
	FUTURE_BUILTIN_MTVSRHM, FUTURE_BUILTIN_MTVSRWM, FUTURE_BUILTIN_MTVSRDM,
	FUTURE_BUILTIN_MTVSRQM, FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH,
	FUTURE_BUILTIN_VCNTMBW, FUTURE_BUILTIN_VCNTMBD, FUTURE_BUILTIN_VEXPANDMB,
	FUTURE_BUILTIN_VEXPANDMH, FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD,
	FUTURE_BUILTIN_VEXPANDMQ.
	* config/rs6000/rs6000-builtin.def (altivec_overloaded_builtins): Add entries
	for MTVSRBM, MTVSRHM, MTVSRWM, MTVSRDM, MTVSRQM, VCNTM, VEXPANDM, VEXTRACTM.
	* testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c:  New test case.
	* testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c:  New test case.
	* testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c:  New test case.
	* testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c:  New test case.
---
 gcc/config/rs6000/altivec.h                   |  10 +
 gcc/config/rs6000/rs6000-builtin.def          |  45 ++++
 gcc/config/rs6000/rs6000-call.c               |  66 ++++-
 gcc/config/rs6000/vsx.md                      |  68 ++++++
 .../powerpc/vsx_mask-count-runnable.c         | 149 ++++++++++++
 .../powerpc/vsx_mask-expand-runnable.c        | 194 +++++++++++++++
 .../powerpc/vsx_mask-extract-runnable.c       | 162 +++++++++++++
 .../powerpc/vsx_mask-move-runnable.c          | 225 ++++++++++++++++++
 8 files changed, 918 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 0a7e8ab3647..5917d3a2b76 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -710,6 +710,16 @@ __altivec_scalar_pred(vec_any_nle,
 
 #define vec_strir_p(a)	__builtin_vec_strir_p (a)
 #define vec_stril_p(a)	__builtin_vec_stril_p (a)
+
+/* VSX Mask Manipulation builtin. */
+#define vec_genbm __builtin_vec_mtvsrbm
+#define vec_genhm __builtin_vec_mtvsrhm
+#define vec_genwm __builtin_vec_mtvsrwm
+#define vec_gendm __builtin_vec_mtvsrdm
+#define vec_genqm __builtin_vec_mtvsrqm
+#define vec_cntm __builtin_vec_cntm
+#define vec_expandm __builtin_vec_vexpandm
+#define vec_extractm __builtin_vec_vextractm
 #endif
 
 #endif /* _ALTIVEC_H */
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 8b1ddb00045..44cf0dec2f2 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1049,6 +1049,22 @@
 		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
 		     | RS6000_BTC_TERNARY),				\
 		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FUTURE_1(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_1 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vec" NAME,		/* NAME */	\
+		    RS6000_BTM_FUTURE,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FUTURE_2(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_2 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vec" NAME,		/* NAME */	\
+		    RS6000_BTM_FUTURE,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
 #endif
 
 
@@ -2637,6 +2653,26 @@ BU_FUTURE_V_1 (VSTRIHR_P, "vstrihr_p", CONST, vstrir_p_v8hi)
 BU_FUTURE_V_1 (VSTRIBL_P, "vstribl_p", CONST, vstril_p_v16qi)
 BU_FUTURE_V_1 (VSTRIHL_P, "vstrihl_p", CONST, vstril_p_v8hi)
 
+BU_FUTURE_1 (MTVSRBM, "mtvsrbm", CONST, vec_mtvsrbm_mtvsrbmi)
+BU_FUTURE_1 (MTVSRHM, "mtvsrhm", CONST, vec_mtvsr_v8hi)
+BU_FUTURE_1 (MTVSRWM, "mtvsrwm", CONST, vec_mtvsr_v4si)
+BU_FUTURE_1 (MTVSRDM, "mtvsrdm", CONST, vec_mtvsr_v2di)
+BU_FUTURE_1 (MTVSRQM, "mtvsrqm", CONST, vec_mtvsr_v1ti)
+BU_FUTURE_2 (VCNTMBB, "cntmbb", CONST, vec_cntmb_v16qi)
+BU_FUTURE_2 (VCNTMBH, "cntmbh", CONST, vec_cntmb_v8hi)
+BU_FUTURE_2 (VCNTMBW, "cntmbw", CONST, vec_cntmb_v4si)
+BU_FUTURE_2 (VCNTMBD, "cntmbd", CONST, vec_cntmb_v2di)
+BU_FUTURE_1 (VEXPANDMB, "vexpandmb", CONST, vec_expand_v16qi)
+BU_FUTURE_1 (VEXPANDMH, "vexpandmh", CONST, vec_expand_v8hi)
+BU_FUTURE_1 (VEXPANDMW, "vexpandmw", CONST, vec_expand_v4si)
+BU_FUTURE_1 (VEXPANDMD, "vexpandmd", CONST, vec_expand_v2di)
+BU_FUTURE_1 (VEXPANDMQ, "vexpandmq", CONST, vec_expand_v1ti)
+BU_FUTURE_1 (VEXTRACTMB, "vextractmb", CONST, vec_extract_v16qi)
+BU_FUTURE_1 (VEXTRACTMH, "vextractmh", CONST, vec_extract_v8hi)
+BU_FUTURE_1 (VEXTRACTMW, "vextractmw", CONST, vec_extract_v4si)
+BU_FUTURE_1 (VEXTRACTMD, "vextractmd", CONST, vec_extract_v2di)
+BU_FUTURE_1 (VEXTRACTMQ, "vextractmq", CONST, vec_extract_v1ti)
+
 /* Future architecture overloaded vector built-ins.  */
 BU_FUTURE_OVERLOAD_2 (CLRL, "clrl")
 BU_FUTURE_OVERLOAD_2 (CLRR, "clrr")
@@ -2652,6 +2688,15 @@ BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril")
 
 BU_FUTURE_OVERLOAD_1 (VSTRIR_P, "strir_p")
 BU_FUTURE_OVERLOAD_1 (VSTRIL_P, "stril_p")
+
+BU_FUTURE_OVERLOAD_1 (MTVSRBM, "mtvsrbm")
+BU_FUTURE_OVERLOAD_1 (MTVSRHM, "mtvsrhm")
+BU_FUTURE_OVERLOAD_1 (MTVSRWM, "mtvsrwm")
+BU_FUTURE_OVERLOAD_1 (MTVSRDM, "mtvsrdm")
+BU_FUTURE_OVERLOAD_1 (MTVSRQM, "mtvsrqm")
+BU_FUTURE_OVERLOAD_2 (VCNTM, "cntm")
+BU_FUTURE_OVERLOAD_1 (VEXPANDM, "vexpandm")
+BU_FUTURE_OVERLOAD_1 (VEXTRACTM, "vextractm")
 
 /* 1 argument crypto functions.  */
 BU_CRYPTO_1 (VSBOX,		"vsbox",	  CONST, crypto_vsbox_v2di)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 0ac8054d030..f50c859b807 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -5618,6 +5618,52 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { FUTURE_BUILTIN_VEC_VSTRIR_P, FUTURE_BUILTIN_VSTRIHR_P,
     RS6000_BTI_INTSI, RS6000_BTI_V8HI, 0, 0 },
 
+  { FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_MTVSRBM,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_MTVSRHM, FUTURE_BUILTIN_MTVSRHM,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTDI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_MTVSRWM, FUTURE_BUILTIN_MTVSRWM,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTDI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_MTVSRDM,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTDI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_MTVSRQM, FUTURE_BUILTIN_MTVSRQM,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_UINTDI, 0, 0 },
+
+  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBB,
+    RS6000_BTI_unsigned_long_long,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI, 0 },
+  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBH,
+    RS6000_BTI_unsigned_long_long,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTQI, 0 },
+  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBW,
+    RS6000_BTI_unsigned_long_long,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI, 0 },
+  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBD,
+    RS6000_BTI_unsigned_long_long,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI, 0 },
+
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMB,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMH,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMQ,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0, 0 },
+
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMB,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMH,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V8HI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMW,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMD,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMQ,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, 0, 0 },
+
   { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 }
 };
 
@@ -8968,7 +9014,11 @@ rs6000_expand_binop_builtin (enum insn_code icode, tree exp, rtx target)
 	   || icode == CODE_FOR_unpackkf
 	   || icode == CODE_FOR_unpacktf
 	   || icode == CODE_FOR_unpackif
-	   || icode == CODE_FOR_unpacktd)
+	   || icode == CODE_FOR_unpacktd
+	   || icode == CODE_FOR_vec_cntmb_v16qi
+	   || icode == CODE_FOR_vec_cntmb_v8hi
+	   || icode == CODE_FOR_vec_cntmb_v4si
+	   || icode == CODE_FOR_vec_cntmb_v2di)
     {
       /* Only allow 1-bit unsigned literals. */
       STRIP_NOPS (arg1);
@@ -13170,6 +13220,20 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
     case P8V_BUILTIN_VGBBD:
     case MISC_BUILTIN_CDTBCD:
     case MISC_BUILTIN_CBCDTD:
+    case FUTURE_BUILTIN_MTVSRBM:
+    case FUTURE_BUILTIN_MTVSRHM:
+    case FUTURE_BUILTIN_MTVSRWM:
+    case FUTURE_BUILTIN_MTVSRDM:
+    case FUTURE_BUILTIN_MTVSRQM:
+    case FUTURE_BUILTIN_VCNTMBB:
+    case FUTURE_BUILTIN_VCNTMBH:
+    case FUTURE_BUILTIN_VCNTMBW:
+    case FUTURE_BUILTIN_VCNTMBD:
+    case FUTURE_BUILTIN_VEXPANDMB:
+    case FUTURE_BUILTIN_VEXPANDMH:
+    case FUTURE_BUILTIN_VEXPANDMW:
+    case FUTURE_BUILTIN_VEXPANDMD:
+    case FUTURE_BUILTIN_VEXPANDMQ:
       h.uns_p[0] = 1;
       h.uns_p[1] = 1;
       break;
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 2a28215ac5b..2c3da61ed17 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -263,6 +263,13 @@
 ;; Mode attribute to give the suffix for the splat instruction
 (define_mode_attr VSX_SPLAT_SUFFIX [(V16QI "b") (V8HI "h")])
 
+;; Iterator for the move to mask instructions
+(define_mode_iterator VSX_MM [V16QI V8HI V4SI V2DI V1TI])
+(define_mode_iterator VSX_MM4 [V16QI V8HI V4SI V2DI])
+
+;; Mode attribute to give the suffix for the mask instruction
+(define_mode_attr VSX_MM_SUFFIX [(V16QI "b") (V8HI "h") (V4SI "w") (V2DI "d") (V1TI "q")])
+
 ;; Constants for creating unspecs
 (define_c_enum "unspec"
   [UNSPEC_VSX_CONCAT
@@ -344,6 +351,10 @@
    UNSPEC_VSX_FIRST_MISMATCH_INDEX
    UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX
    UNSPEC_XXGENPCV
+   UNSPEC_MTVSBM
+   UNSPEC_VCNTMB
+   UNSPEC_VEXPAND
+   UNSPEC_VEXTRACT
   ])
 
 ;; VSX moves
@@ -5676,3 +5687,60 @@
   DONE;
 })
 
+;; VSX mask manipulation instructions
+(define_expand "vec_mtvsrbm_mtvsrbmi"
+
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
+        (unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand" "b")]
+       UNSPEC_MTVSBM))]
+   "TARGET_FUTURE"
+ {
+  /* Six bit constant operand.  */
+  if (IN_RANGE (INTVAL (operands[1]), 0, 63))
+    emit_insn (gen_vec_mtvsrbmi (operands[0], operands[1]));
+  else
+    emit_insn (gen_vec_mtvsr_v16qi (operands[0], operands[1]));
+
+  DONE;
+})
+
+(define_insn "vec_mtvsrbmi"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
+        (unspec:V16QI [(match_operand:QI 1 "u6bit_cint_operand" "n")]
+        UNSPEC_MTVSBM))]
+  "TARGET_FUTURE"
+  "mtvsrbmi %0,%1"
+)
+
+(define_insn "vec_mtvsr_<mode>"
+  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
+        (unspec:VSX_MM [(match_operand:DI 1 "gpc_reg_operand" "b")]
+        UNSPEC_MTVSBM))]
+  "TARGET_FUTURE"
+  "mtvsr<VSX_MM_SUFFIX>m %0,%1";
+  [(set_attr "type" "vecsimple")])
+
+(define_insn "vec_cntmb_<mode>"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (unspec:DI [(match_operand:VSX_MM4 1 "vsx_register_operand" "v")
+                    (match_operand:QI 2 "const_0_to_1_operand" "n")]
+        UNSPEC_VCNTMB))]
+  "TARGET_FUTURE"
+  "vcntmb<VSX_MM_SUFFIX> %0,%1,%2"
+  [(set_attr "type" "vecsimple")])
+
+(define_insn "vec_extract_<mode>"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
+	UNSPEC_VEXTRACT))]
+  "TARGET_FUTURE"
+  "vextract<VSX_MM_SUFFIX>m %0,%1"
+  [(set_attr "type" "vecsimple")])
+
+(define_insn "vec_expand_<mode>"
+  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
+        (unspec:VSX_MM [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
+        UNSPEC_VEXPAND))]
+  "TARGET_FUTURE"
+  "vexpand<VSX_MM_SUFFIX>m %0,%1"
+  [(set_attr "type" "vecsimple")])
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
new file mode 100644
index 00000000000..258179e3d53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
@@ -0,0 +1,149 @@
+/* { dg-do run } */
+/* { dg-options "-mcpu=future -O2 -save-temps" } */
+/* { dg-require-effective-target powerpc_future_hw } */
+
+/* Check that the expected vector mask count instructions are generated when
+   the target supports them.  */
+/* { dg-final { scan-assembler-times {\mvcntmbb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvcntmbh\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvcntmbw\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvcntmbd\M} 1 } } */
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+#include <altivec.h>
+
+void abort (void);
+
+int main ()
+{
+  int i, num_elements;
+  unsigned long long arg1;
+  
+  vector unsigned char  vbc_result_bi, vbc_expected_result_bi;
+  vector unsigned short vbc_result_hi, vbc_expected_result_hi;
+  vector unsigned int  vbc_result_wi, vbc_expected_result_wi;
+  vector unsigned long long vbc_result_di, vbc_expected_result_di;
+  vector __uint128_t vbc_result_qi, vbc_expected_result_qi;
+
+  unsigned int result_wi, expected_result_wi;
+  unsigned long long result, expected_result;
+  const unsigned char mp=1;
+  vector unsigned char vbc_bi_src;
+  vector unsigned short vbc_hi_src;
+  vector unsigned int vbc_wi_src;
+  vector unsigned long long vbc_di_src;
+  vector __uint128_t vbc_qi_src;
+  
+  /* vcntmbb */
+  num_elements = 16;
+  vbc_bi_src[0] = 0xFF;
+  vbc_bi_src[1] = 0xFF;
+  vbc_bi_src[2] = 0x0;
+  vbc_bi_src[3] = 0x0;
+  vbc_bi_src[4] = 0x0;
+  vbc_bi_src[5] = 0x0;
+  vbc_bi_src[6] = 0xFF;
+  vbc_bi_src[7] = 0xFF;
+  vbc_bi_src[8] = 0x0;
+  vbc_bi_src[9] = 0x0;
+  vbc_bi_src[10] = 0x0;
+  vbc_bi_src[11] = 0x0;
+  vbc_bi_src[12] = 0x0;
+  vbc_bi_src[13] = 0xFF;
+  vbc_bi_src[14] = 0xFF;
+  vbc_bi_src[15] = 0xFF;
+
+  expected_result = 7;
+
+  result = vec_cntm (vbc_bi_src, 1);
+  /* Note count is put in bits[0:7], IBM numbering, of the 64-bit result */
+  result = result >> (64-8);
+  
+  if (result != expected_result) {
+#if DEBUG
+    printf("ERROR: char vec_cntm(arg) ");
+    printf("count %llu does not match expected count = %llu\n",
+	   result, expected_result);
+#else
+    abort();
+#endif
+  }
+
+  /* vcntmbh */
+  num_elements = 8;
+  vbc_hi_src[0] = 0xFFFF;
+  vbc_hi_src[1] = 0xFFFF;
+  vbc_hi_src[2] = 0x0;
+  vbc_hi_src[3] = 0x0;
+  vbc_hi_src[4] = 0x0;
+  vbc_hi_src[5] = 0x0;
+  vbc_hi_src[6] = 0xFFFF;
+  vbc_hi_src[7] = 0xFFFF;
+
+  expected_result = 4;
+
+  result = vec_cntm (vbc_hi_src, 1);
+  /* Note count is put in bits[0:6], IBM numbering, of the 64-bit result */
+  result = result >> (64-7);
+  
+  if (result != expected_result) {
+#if DEBUG
+    printf("ERROR: short vec_cntm(arg) ");
+    printf("count %llu does not match expected count = %llu\n",
+	   result, expected_result);
+#else
+    abort();
+#endif
+  }
+
+  /* vcntmbw */
+  num_elements = 4;
+  vbc_wi_src[0] = 0xFFFFFFFF;
+  vbc_wi_src[1] = 0xFFFFFFFF;
+  vbc_wi_src[2] = 0x0;
+  vbc_wi_src[3] = 0x0;
+
+  expected_result = 2;
+
+  result = vec_cntm (vbc_wi_src, 1);
+  /* Note count is put in bits[0:5], IBM numbering, of the 64-bit result */
+  result = result >> (64-6);
+  
+  if (result != expected_result) {
+#if DEBUG
+    printf("ERROR: word vec_cntm(arg) ");
+    printf("count %llu does not match expected count = %llu\n",
+	   result, expected_result);
+#else
+    abort();
+#endif
+  }
+
+  /* vcntmbd */
+  num_elements = 2;
+  vbc_di_src[0] = 0xFFFFFFFFFFFFFFFFULL;
+  vbc_di_src[1] = 0xFFFFFFFFFFFFFFFFULL;
+
+  expected_result = 2;
+
+  result = vec_cntm (vbc_di_src, 1);
+  /* Note count is put in bits[0:4], IBM numbering, of the 64-bit result */
+  result = result >> (64-5);
+
+  if (result != expected_result) {
+#if DEBUG
+    printf("ERROR: double vec_cntm(arg) ");
+    printf("count %llu does not match expected count = %llu\n",
+	   result, expected_result);
+#else
+    abort();
+#endif
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
new file mode 100644
index 00000000000..aec4d06ab01
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-expand-runnable.c
@@ -0,0 +1,194 @@
+/* { dg-do run } */
+/* { dg-options "-mcpu=future -O2 -save-temps" } */
+/* { dg-require-effective-target powerpc_future_hw } */
+
+/* Check that the expected vector mask expand instructions are generated when
+   the target supports them.  */
+/* { dg-final { scan-assembler-times {\mvexpandbm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvexpandhm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvexpandwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvexpanddm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvexpandqm\M} 1 } } */
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+#include <altivec.h>
+
+void abort (void);
+
+int main ()
+{
+  int i, num_elements;
+  unsigned long long arg1;
+  
+  vector unsigned char  vbc_result_bi, vbc_expected_result_bi;
+  vector unsigned short vbc_result_hi, vbc_expected_result_hi;
+  vector unsigned int  vbc_result_wi, vbc_expected_result_wi;
+  vector unsigned long long vbc_result_di, vbc_expected_result_di;
+  vector __uint128_t vbc_result_qi, vbc_expected_result_qi;
+
+  unsigned int result_wi, expected_result_wi;
+  unsigned long long result, expected_result;
+  const unsigned char mp=1;
+  vector unsigned char vbc_bi_src;
+  vector unsigned short vbc_hi_src;
+  vector unsigned int vbc_wi_src;
+  vector unsigned long long vbc_di_src;
+  vector __uint128_t vbc_qi_src;
+  
+  /* vexpandbm */
+  num_elements = 16;
+  vbc_bi_src[0] = 0xFF;
+  vbc_bi_src[1] = 0xFF;
+  vbc_bi_src[2] = 0x0;
+  vbc_bi_src[3] = 0x0;
+  vbc_bi_src[4] = 0x0;
+  vbc_bi_src[5] = 0x0;
+  vbc_bi_src[6] = 0xFF;
+  vbc_bi_src[7] = 0xFF;
+  vbc_bi_src[8] = 0x0;
+  vbc_bi_src[9] = 0x0;
+  vbc_bi_src[10] = 0x0;
+  vbc_bi_src[11] = 0x0;
+  vbc_bi_src[12] = 0x0;
+  vbc_bi_src[13] = 0xFF;
+  vbc_bi_src[14] = 0xFF;
+  vbc_bi_src[15] = 0xFF;
+
+  vbc_expected_result_bi[0] = 0xFF;
+  vbc_expected_result_bi[1] = 0xFF;
+  vbc_expected_result_bi[2] = 0x0;
+  vbc_expected_result_bi[3] = 0x0;
+  vbc_expected_result_bi[4] = 0x0;
+  vbc_expected_result_bi[5] = 0x0;
+  vbc_expected_result_bi[6] = 0xFF;
+  vbc_expected_result_bi[7] = 0xFF;
+  vbc_expected_result_bi[8] = 0x0;
+  vbc_expected_result_bi[9] = 0x0;
+  vbc_expected_result_bi[10] = 0x0;
+  vbc_expected_result_bi[11] = 0x0;
+  vbc_expected_result_bi[12] = 0x0;
+  vbc_expected_result_bi[13] = 0xFF;
+  vbc_expected_result_bi[14] = 0xFF;
+  vbc_expected_result_bi[15] = 0xFF;
+
+  vbc_result_bi = vec_expandm (vbc_bi_src);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_bi[i] != vbc_expected_result_bi[i]) {
+#if DEBUG
+      printf("ERROR: char vec_expandm(arg) ");
+      printf("element %d, 0x%x does not match expected value = 0x%x\n",
+	     i, vbc_result_bi[i], vbc_expected_result_bi[i]);
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* vexpandhm */
+  num_elements = 8;
+  vbc_hi_src[0] = 0x0;
+  vbc_hi_src[1] = 0xFFFF;
+  vbc_hi_src[2] = 0x0;
+  vbc_hi_src[3] = 0xFFFF;
+  vbc_hi_src[4] = 0x0;
+  vbc_hi_src[5] = 0x0;
+  vbc_hi_src[6] = 0xFFFF;
+  vbc_hi_src[7] = 0xFFFF;
+
+  vbc_expected_result_hi[0] = 0x0;
+  vbc_expected_result_hi[1] = 0xFFFF;
+  vbc_expected_result_hi[2] = 0x0;
+  vbc_expected_result_hi[3] = 0xFFFF;
+  vbc_expected_result_hi[4] = 0x0;
+  vbc_expected_result_hi[5] = 0x0;
+  vbc_expected_result_hi[6] = 0xFFFF;
+  vbc_expected_result_hi[7] = 0xFFFF;
+
+  vbc_result_hi = vec_expandm (vbc_hi_src);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_hi[i] != vbc_expected_result_hi[i]) {
+#if DEBUG
+      printf("ERROR: short vec_expandm(arg) ");
+      printf("element %d, 0x%x does not match expected value = 0x%x\n",
+	     i, vbc_result_hi[i], vbc_expected_result_hi[i]);
+#else
+    abort();
+#endif
+    }
+  }
+  
+  /* vexpandwm */
+  num_elements = 4;
+  vbc_wi_src[0] = 0x0;
+  vbc_wi_src[1] = 0xFFFFFFFF;
+  vbc_wi_src[2] = 0x0;
+  vbc_wi_src[3] = 0xFFFFFFFF;
+
+  vbc_expected_result_wi[0] = 0x0;
+  vbc_expected_result_wi[1] = 0xFFFFFFFF;
+  vbc_expected_result_wi[2] = 0x0;
+  vbc_expected_result_wi[3] = 0xFFFFFFFF;
+
+  vbc_result_wi = vec_expandm (vbc_wi_src);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_wi[i] != vbc_expected_result_wi[i]) {
+#if DEBUG
+      printf("ERROR: int vec_expandm(arg) ");
+      printf("element %d, 0x%x does not match expected value = 0x%x\n",
+	     i, vbc_result_wi[i], vbc_expected_result_wi[i]);
+#else
+    abort();
+#endif
+    }
+  }
+  
+  /* vexpanddm */
+  num_elements = 2;
+  vbc_di_src[0] = 0x0;
+  vbc_di_src[1] = 0xFFFFFFFFFFFFFFFFULL;
+
+  vbc_expected_result_di[0] = 0x0;
+  vbc_expected_result_di[1] = 0xFFFFFFFFFFFFFFFFULL;
+
+  vbc_result_di = vec_expandm (vbc_di_src);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_di[i] != vbc_expected_result_di[i]) {
+#if DEBUG
+      printf("ERROR: double vec_expandm(arg) ");
+      printf("element %d, 0x%llx does not match expected value = 0x%llx\n",
+	     i, vbc_result_di[i], vbc_expected_result_di[i]);
+#else
+    abort();
+#endif
+    }
+  }
+  
+  /* vexpandqm */
+  num_elements = 1;
+  vbc_qi_src[0] = 0x0;
+
+  vbc_expected_result_qi[0] = 0x0;
+
+  vbc_result_qi = vec_expandm (vbc_qi_src);
+  
+  if (vbc_result_qi[0] != vbc_expected_result_qi[0]) {
+#if DEBUG
+    printf("ERROR: quad vec_expandm(arg) low 64 bits 0x%llx != 0x%llx\n",
+	   (unsigned long long) vbc_result_qi[0],
+	   (unsigned long long) vbc_expected_result_qi[0]);
+#else
+    abort();
+#endif
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
new file mode 100644
index 00000000000..4df7de292ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-extract-runnable.c
@@ -0,0 +1,162 @@
+/* { dg-do run } */
+/* { dg-options "-mcpu=future -O2 -save-temps" } */
+/* { dg-require-effective-target powerpc_future_hw } */
+
+/* Check that the expected vector mask extract instructions are generated when
+   the target supports them.  */
+/* { dg-final { scan-assembler-times {\mvextractbm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextracthm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextractwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextractdm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextractqm\M} 1 } } */
+
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+#include <altivec.h>
+
+void abort (void);
+
+int main ()
+{
+  int i, num_elements;
+  unsigned long long arg1;
+  
+  vector unsigned char  vbc_result_bi, vbc_expected_result_bi;
+  vector unsigned short vbc_result_hi, vbc_expected_result_hi;
+  vector unsigned int  vbc_result_wi, vbc_expected_result_wi;
+  vector unsigned long long vbc_result_di, vbc_expected_result_di;
+  vector __uint128_t vbc_result_qi, vbc_expected_result_qi;
+
+  unsigned int result_wi, expected_result_wi;
+  unsigned long long result, expected_result;
+  const unsigned char mp=1;
+  vector unsigned char vbc_bi_src;
+  vector unsigned short vbc_hi_src;
+  vector unsigned int vbc_wi_src;
+  vector unsigned long long vbc_di_src;
+  vector __uint128_t vbc_qi_src;
+  
+  /* vextractbm */
+  num_elements = 16;
+  vbc_bi_src[0] = 0xFF;
+  vbc_bi_src[1] = 0xFF;
+  vbc_bi_src[2] = 0x0;
+  vbc_bi_src[3] = 0x0;
+  vbc_bi_src[4] = 0x0;
+  vbc_bi_src[5] = 0x0;
+  vbc_bi_src[6] = 0xFF;
+  vbc_bi_src[7] = 0xFF;
+  vbc_bi_src[8] = 0xFF;
+  vbc_bi_src[9] = 0xFF;
+  vbc_bi_src[10] = 0xFF;
+  vbc_bi_src[11] = 0xFF;
+  vbc_bi_src[12] = 0xFF;
+  vbc_bi_src[13] = 0x0;
+  vbc_bi_src[14] = 0xFF;
+  vbc_bi_src[15] = 0xFF;
+
+  expected_result_wi = 0b1101111111000011;
+
+  result_wi = vec_extractm (vbc_bi_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: char vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  /* vextracthm */
+  num_elements = 8;
+  vbc_hi_src[0] = 0xFFFF;
+  vbc_hi_src[1] = 0xFFFF;
+  vbc_hi_src[2] = 0x0;
+  vbc_hi_src[3] = 0x0;
+  vbc_hi_src[4] = 0x0;
+  vbc_hi_src[5] = 0x0;
+  vbc_hi_src[6] = 0xFFFF;
+  vbc_hi_src[7] = 0xFFFF;
+
+  expected_result_wi = 0b11000011;
+
+  result_wi = vec_extractm (vbc_hi_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: short vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  /* vextractwm */
+  num_elements = 4;
+  vbc_wi_src[0] = 0xFFFFFFFF;
+  vbc_wi_src[1] = 0xFFFFFFFF;
+  vbc_wi_src[2] = 0x0;
+  vbc_wi_src[3] = 0x0;
+
+  expected_result_wi = 0b0011;
+
+  result_wi = vec_extractm (vbc_wi_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: word vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  /* vextractdm */
+  num_elements = 2;
+  vbc_di_src[0] = 0xFFFFFFFFFFFFFFFF;
+  vbc_di_src[1] = 0xFFFFFFFFFFFFFFFF;
+
+  expected_result_wi = 0b11;
+
+  result_wi = vec_extractm (vbc_di_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: double vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  /* vextractqm */
+  num_elements = 1;
+  vbc_qi_src[0] = 0x1;
+  vbc_qi_src[0] = vbc_qi_src[0] << 127;
+  
+  expected_result_wi = 1;
+
+  result_wi = vec_extractm (vbc_qi_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: quad vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
new file mode 100644
index 00000000000..73bf1a3b142
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-move-runnable.c
@@ -0,0 +1,225 @@
+/* { dg-do run } */
+/* { dg-options "-mcpu=future -O2 -save-temps" } */
+/* { dg-require-effective-target powerpc_future_hw } */
+
+/* Check that the expected move to VSR mask instructions are generated when
+   the target supports them.  */
+/* { dg-final { scan-assembler-times {\mmtvsrbm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrhm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrdm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrqm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrbmi\M} 2 } } */
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+#include <altivec.h>
+
+void abort (void);
+
+int main ()
+{
+  int i, num_elements;
+  unsigned long long arg1;
+  
+  vector unsigned char  vbc_result_bi, vbc_expected_result_bi;
+  vector unsigned short vbc_result_hi, vbc_expected_result_hi;
+  vector unsigned int  vbc_result_wi, vbc_expected_result_wi;
+  vector unsigned long long vbc_result_di, vbc_expected_result_di;
+  vector __uint128_t vbc_result_qi, vbc_expected_result_qi;
+
+  unsigned int result_wi, expected_result_wi;
+  unsigned long long result, expected_result;
+  const unsigned char mp=1;
+  vector unsigned char vbc_bi_src;
+  vector unsigned short vbc_hi_src;
+  vector unsigned int vbc_wi_src;
+  vector unsigned long long vbc_di_src;
+  vector __uint128_t vbc_qi_src;
+  
+  /* mtvsrbmi */
+  num_elements = 16;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_bi[i] = 0x0;
+
+  vbc_expected_result_bi[0] = 0xFF;
+  vbc_expected_result_bi[2] = 0xFF;
+
+  vbc_result_bi = vec_genbm(5);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_bi[i] != vbc_expected_result_bi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genbm(const 5) ");
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_bi[i], vbc_expected_result_bi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrbm */
+  num_elements = 16;
+  /* -O2 should generate mtvsrbmi as argument will fit in 6-bit field. */
+  arg1 = 3;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_bi[i] = 0x0;
+
+  vbc_expected_result_bi[1] = 0xFF;
+  vbc_expected_result_bi[0] = 0xFF;
+
+  vbc_result_bi = vec_genbm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_bi[i] != vbc_expected_result_bi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genbm(%llu) ", arg1);
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_bi[i], vbc_expected_result_bi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  num_elements = 16;
+  /* Should generate mtvsrbm as argument will not fit in 6-bit field. */
+  arg1 = 0xEA;   // 234 decimal
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_bi[i] = 0x0;
+
+  vbc_expected_result_bi[7] = 0xFF;
+  vbc_expected_result_bi[6] = 0xFF;
+  vbc_expected_result_bi[5] = 0xFF;
+  vbc_expected_result_bi[3] = 0xFF;
+  vbc_expected_result_bi[1] = 0xFF;
+
+  vbc_result_bi = vec_genbm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_bi[i] != vbc_expected_result_bi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genbm(%llu) ", arg1);
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_bi[i], vbc_expected_result_bi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrhm */
+  num_elements = 8;
+  arg1 = 5;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_hi[i] = 0x0;
+
+  vbc_expected_result_hi[2] = 0xFFFF;
+  vbc_expected_result_hi[0] = 0xFFFF;
+
+  vbc_result_hi = vec_genhm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_hi[i] != vbc_expected_result_hi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genhm(%llu) ", arg1);
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_hi[i], vbc_expected_result_hi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrwm */
+  num_elements = 4;
+  arg1 = 7;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_wi[i] = 0x0;
+
+  vbc_expected_result_wi[2] = 0xFFFFFFFF;
+  vbc_expected_result_wi[1] = 0xFFFFFFFF;
+  vbc_expected_result_wi[0] = 0xFFFFFFFF;
+
+  vbc_result_wi = vec_genwm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_wi[i] != vbc_expected_result_wi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genwm(%llu) ", arg1);
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_wi[i], vbc_expected_result_wi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrdm */
+  num_elements = 2;
+  arg1 = 1;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_di[i] = 0x0;
+
+  vbc_expected_result_di[1] = 0x0;
+  vbc_expected_result_di[0] = 0xFFFFFFFFFFFFFFFF;
+
+  vbc_result_di = vec_gendm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_di[i] != vbc_expected_result_di[i]) {
+#if DEBUG
+      printf("ERROR: vec_gendm(%llu) ", arg1);
+      printf("element %d equals 0x%llx does not match expected_result = ",
+	     i, vbc_result_di[i]);
+      printf("0x%llx\n\n", vbc_expected_result_di[i]);
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrqm */
+  num_elements = 1;
+  arg1 = 1;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_qi[i] = 0x0;
+
+  vbc_expected_result_qi[0] = 0xFFFFFFFFFFFFFFFFULL;
+  vbc_expected_result_qi[0] = (vbc_expected_result_qi[0] << 64)
+    | 0xFFFFFFFFFFFFFFFFULL;
+
+  vbc_result_qi = vec_genqm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_qi[i] != vbc_expected_result_qi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genqm(%llu) ", arg1);
+      printf("element %d low 64 bits 0x%llx do not match expected_result = ",
+	     i, (unsigned long long) vbc_result_qi[i]);
+      printf("0x%llx\n\n", (unsigned long long) vbc_expected_result_qi[i]);
+#else
+    abort();
+#endif
+    }
+  }
+
+  return 0;
+}
Segher Boessenkool July 1, 2020, 10:28 p.m. UTC | #3
On Wed, May 27, 2020 at 08:50:43AM -0700, Carl Love wrote:
> The following patch adds support for builtins vec_genbm(),  vec_genhm(),
> vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(),
> vec_extractm().  Support for instructions mtvsrbm, mtvsrhm, mtvsrwm,
> mtvsrdm, mtvsrqm, cntm, vexpandm, vextractm.
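
For reference while reading the tests, here is a minimal sketch in portable C of what the byte-element forms of three of these builtins appear to compute. It is inferred only from the expected values in the runnable tests in this patch, not from the GCC sources. The names `v16_model`, `model_genbm`, `model_expandm` and `model_extractm` are hypothetical, and elements are numbered the way the little-endian tests index them.

```c
#include <stdint.h>

/* Hypothetical scalar model, not GCC code: a 16-byte vector is a plain
   array indexed the same way the tests index vector elements.  */
typedef struct { uint8_t b[16]; } v16_model;

/* Model of vec_genbm: bit i of the mask selects byte element i, which
   becomes 0xFF when the bit is set and 0x00 otherwise.  */
static v16_model model_genbm (uint64_t mask)
{
  v16_model r;
  for (int i = 0; i < 16; i++)
    r.b[i] = ((mask >> i) & 1) ? 0xFF : 0x00;
  return r;
}

/* Model of vec_expandm: replicate the top bit of each byte element
   across the whole element.  */
static v16_model model_expandm (v16_model src)
{
  v16_model r;
  for (int i = 0; i < 16; i++)
    r.b[i] = (src.b[i] & 0x80) ? 0xFF : 0x00;
  return r;
}

/* Model of vec_extractm: gather the top bit of byte element i into
   bit i of the scalar result.  */
static unsigned int model_extractm (v16_model src)
{
  unsigned int r = 0;
  for (int i = 0; i < 16; i++)
    r |= (unsigned int) (src.b[i] >> 7) << i;
  return r;
}
```

Under this model, `model_genbm (0xEA)` sets elements 1, 3, 5, 6 and 7, which is what the mtvsrbm test expects, and `model_extractm` applied to that vector gives back 0xEA.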

> +;; Mode attribute to give the suffix for the mask instruction
> +(define_mode_attr VSX_MM_SUFFIX [(V16QI "b") (V8HI "h") (V4SI "w") (V2DI "d") (V1TI "q")])

Please shorten that line?  It doesn't have to be one line ;-)

> +(define_expand "vec_mtvsrbm_mtvsrbmi"
> +
> +  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
> +        (unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand" "b")]
> +       UNSPEC_MTVSBM))]
> +   "TARGET_FUTURE"
> + {
> +  /* Six bit constant operand.  */
> +  if (IN_RANGE (INTVAL (operands[1]), 0, 63))
> +    emit_insn (gen_vec_mtvsrbmi (operands[0], operands[1]));
> +  else
> +    emit_insn (gen_vec_mtvsr_v16qi (operands[0], operands[1]));

operands[1] isn't a CONST_INT (it is a REG), so this won't work (INTVAL
on it will ICE with checking, and do something nonsensical otherwise).

So it needs a test first?  Could just use u6bit_cint_operand even, and lose
the explicit IN_RANGE.
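
The point about testing the operand kind before reading its value can be illustrated with a small standalone C model. This is only a sketch: `struct operand`, `enum operand_kind`, `enum mask_insn` and `choose_byte_mask_insn` are hypothetical stand-ins, not GCC's rtx, CONST_INT_P or INTVAL, and the 0..63 range is the six-bit limit from the expander's own comment.

```c
#include <stdint.h>

/* Hypothetical stand-ins for RTL operand kinds; these are not GCC types.  */
enum operand_kind { OPERAND_CONST_INT, OPERAND_REG };

struct operand
{
  enum operand_kind kind;
  int64_t value;   /* Meaningful only when kind == OPERAND_CONST_INT.  */
};

enum mask_insn { INSN_MTVSRBMI, INSN_MTVSRBM };

/* Choose the immediate form only for a constant that fits in six bits.
   The kind is checked first, so the value of a register operand is
   never read.  */
static enum mask_insn choose_byte_mask_insn (const struct operand *op)
{
  if (op->kind == OPERAND_CONST_INT && op->value >= 0 && op->value <= 63)
    return INSN_MTVSRBMI;
  return INSN_MTVSRBM;
}
```

In this model a constant 5 selects the immediate form, while a constant 0xEA or any register operand falls back to the register form, matching what the move test expects for its three vec_genbm calls.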

> +(define_insn "vec_mtvsr_<mode>"
> +  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
> +        (unspec:VSX_MM [(match_operand:DI 1 "gpc_reg_operand" "b")]
> +        UNSPEC_MTVSBM))]
> +  "TARGET_FUTURE"
> +  "mtvsr<VSX_MM_SUFFIX>m %0,%1";
> +  [(set_attr "type" "vecsimple")])

vsx_register_operand together with a "v" constraint is curious, btw.
It is used in a few more places, and it probably works, but would
altivec_register_operand be better?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-count-runnable.c
> @@ -0,0 +1,149 @@
> +/* { dg-do run } */
> +/* { dg-options "-mcpu=future -O2 -save-temps" } */
> +/* { dg-require-effective-target powerpc_future_hw } */

Drop the -save-temps?  (Same in the other tests.)
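
As an aside on the count test quoted above: it shifts each vec_cntm result right by 64-8, 64-7, 64-6 or 64-5 before comparing, because the count sits in the high-order bits of the 64-bit result. A minimal C sketch of that behaviour follows, inferred from the test's expectations. `model_cntm` is a hypothetical name, and the meaning of mp equal to 0 (count elements whose top bit is clear) is an assumption, since the test only uses mp equal to 1.

```c
#include <stdint.h>

/* Hypothetical scalar model of vec_cntm, not GCC code.  ELEMS holds one
   element per array entry, ELEM_BITS is 8, 16, 32 or 64, and MP selects
   which top-bit value is counted.  */
static uint64_t model_cntm (const uint64_t *elems, unsigned int n_elems,
			    unsigned int elem_bits, unsigned int mp)
{
  uint64_t count = 0;
  for (unsigned int i = 0; i < n_elems; i++)
    if (((elems[i] >> (elem_bits - 1)) & 1) == (uint64_t) (mp & 1))
      count++;

  /* The tests shift right by 56, 57, 58 and 59 for byte, halfword, word
     and doubleword elements, so place the count at that bit offset.  */
  unsigned int shift = 56;
  for (unsigned int w = elem_bits; w > 8; w >>= 1)
    shift++;

  return count << shift;
}
```

With the halfword source vector from the test, the model returns 4 shifted left by 57, so the test's `result >> (64-7)` recovers the expected count of 4.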

Looks good otherwise.


Segher

Patch

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 0a7e8ab3647..5917d3a2b76 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -710,6 +710,16 @@  __altivec_scalar_pred(vec_any_nle,
 
 #define vec_strir_p(a)	__builtin_vec_strir_p (a)
 #define vec_stril_p(a)	__builtin_vec_stril_p (a)
+
+/* VSX Mask Manipulation builtin. */
+#define vec_genbm __builtin_vec_mtvsrbm
+#define vec_genhm __builtin_vec_mtvsrhm
+#define vec_genwm __builtin_vec_mtvsrwm
+#define vec_gendm __builtin_vec_mtvsrdm
+#define vec_genqm __builtin_vec_mtvsrqm
+#define vec_cntm __builtin_vec_cntm
+#define vec_expandm __builtin_vec_vexpandm
+#define vec_extractm __builtin_vec_vextractm
 #endif
 
 #endif /* _ALTIVEC_H */
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 8b1ddb00045..7cab5097aeb 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1049,6 +1049,22 @@ 
 		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
 		     | RS6000_BTC_TERNARY),				\
 		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FUTURE_1(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_1 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vec" NAME,		/* NAME */	\
+		    RS6000_BTM_FUTURE,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FUTURE_2(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_2 (FUTURE_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vec" NAME,		/* NAME */	\
+		    RS6000_BTM_FUTURE,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
 #endif
 
 
@@ -2637,6 +2653,26 @@  BU_FUTURE_V_1 (VSTRIHR_P, "vstrihr_p", CONST, vstrir_p_v8hi)
 BU_FUTURE_V_1 (VSTRIBL_P, "vstribl_p", CONST, vstril_p_v16qi)
 BU_FUTURE_V_1 (VSTRIHL_P, "vstrihl_p", CONST, vstril_p_v8hi)
 
+BU_FUTURE_1 (MTVSRBM, "mtvsrbm", CONST, vec_mtvsrbm)
+BU_FUTURE_1 (MTVSRHM, "mtvsrhm", CONST, vec_mtvsr_v8hi)
+BU_FUTURE_1 (MTVSRWM, "mtvsrwm", CONST, vec_mtvsr_v4si)
+BU_FUTURE_1 (MTVSRDM, "mtvsrdm", CONST, vec_mtvsr_v2di)
+BU_FUTURE_1 (MTVSRQM, "mtvsrqm", CONST, vec_mtvsr_v1ti)
+BU_FUTURE_2 (VCNTMBB, "cntmbb", CONST, vec_cntmb_v16qi)
+BU_FUTURE_2 (VCNTMBH, "cntmbh", CONST, vec_cntmb_v8hi)
+BU_FUTURE_2 (VCNTMBW, "cntmbw", CONST, vec_cntmb_v4si)
+BU_FUTURE_2 (VCNTMBD, "cntmbd", CONST, vec_cntmb_v2di)
+BU_FUTURE_1 (VEXPANDMB, "vexpandmb", CONST, vec_expand_v16qi)
+BU_FUTURE_1 (VEXPANDMH, "vexpandmh", CONST, vec_expand_v8hi)
+BU_FUTURE_1 (VEXPANDMW, "vexpandmw", CONST, vec_expand_v4si)
+BU_FUTURE_1 (VEXPANDMD, "vexpandmd", CONST, vec_expand_v2di)
+BU_FUTURE_1 (VEXPANDMQ, "vexpandmq", CONST, vec_expand_v1ti)
+BU_FUTURE_1 (VEXTRACTMB, "vextractmb", CONST, vec_extract_v16qi)
+BU_FUTURE_1 (VEXTRACTMH, "vextractmh", CONST, vec_extract_v8hi)
+BU_FUTURE_1 (VEXTRACTMW, "vextractmw", CONST, vec_extract_v4si)
+BU_FUTURE_1 (VEXTRACTMD, "vextractmd", CONST, vec_extract_v2di)
+BU_FUTURE_1 (VEXTRACTMQ, "vextractmq", CONST, vec_extract_v1ti)
+
 /* Future architecture overloaded vector built-ins.  */
 BU_FUTURE_OVERLOAD_2 (CLRL, "clrl")
 BU_FUTURE_OVERLOAD_2 (CLRR, "clrr")
@@ -2652,6 +2688,15 @@  BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril")
 
 BU_FUTURE_OVERLOAD_1 (VSTRIR_P, "strir_p")
 BU_FUTURE_OVERLOAD_1 (VSTRIL_P, "stril_p")
+
+BU_FUTURE_OVERLOAD_1 (MTVSRBM, "mtvsrbm")
+BU_FUTURE_OVERLOAD_1 (MTVSRHM, "mtvsrhm")
+BU_FUTURE_OVERLOAD_1 (MTVSRWM, "mtvsrwm")
+BU_FUTURE_OVERLOAD_1 (MTVSRDM, "mtvsrdm")
+BU_FUTURE_OVERLOAD_1 (MTVSRQM, "mtvsrqm")
+BU_FUTURE_OVERLOAD_2 (VCNTM, "cntm")
+BU_FUTURE_OVERLOAD_1 (VEXPANDM, "vexpandm")
+BU_FUTURE_OVERLOAD_1 (VEXTRACTM, "vextractm")
 
 /* 1 argument crypto functions.  */
 BU_CRYPTO_1 (VSBOX,		"vsbox",	  CONST, crypto_vsbox_v2di)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 0ac8054d030..f50c859b807 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -5618,6 +5618,52 @@  const struct altivec_builtin_types altivec_overloaded_builtins[] = {
   { FUTURE_BUILTIN_VEC_VSTRIR_P, FUTURE_BUILTIN_VSTRIHR_P,
     RS6000_BTI_INTSI, RS6000_BTI_V8HI, 0, 0 },
 
+  { FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_MTVSRBM,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_MTVSRHM, FUTURE_BUILTIN_MTVSRHM,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTDI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_MTVSRWM, FUTURE_BUILTIN_MTVSRWM,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTDI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_MTVSRDM,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTDI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_MTVSRQM, FUTURE_BUILTIN_MTVSRQM,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_UINTDI, 0, 0 },
+
+  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBB,
+    RS6000_BTI_unsigned_long_long,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI, 0 },
+  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBH,
+    RS6000_BTI_unsigned_long_long,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTQI, 0 },
+  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBW,
+    RS6000_BTI_unsigned_long_long,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI, 0 },
+  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBD,
+    RS6000_BTI_unsigned_long_long,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI, 0 },
+
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMB,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMH,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMQ,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0, 0 },
+
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMB,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMH,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V8HI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMW,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMD,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMQ,
+    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, 0, 0 },
+
   { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 }
 };
 
@@ -8968,7 +9014,11 @@  rs6000_expand_binop_builtin (enum insn_code icode, tree exp, rtx target)
 	   || icode == CODE_FOR_unpackkf
 	   || icode == CODE_FOR_unpacktf
 	   || icode == CODE_FOR_unpackif
-	   || icode == CODE_FOR_unpacktd)
+	   || icode == CODE_FOR_unpacktd
+	   || icode == CODE_FOR_vec_cntmb_v16qi
+	   || icode == CODE_FOR_vec_cntmb_v8hi
+	   || icode == CODE_FOR_vec_cntmb_v4si
+	   || icode == CODE_FOR_vec_cntmb_v2di)
     {
       /* Only allow 1-bit unsigned literals. */
       STRIP_NOPS (arg1);
@@ -13170,6 +13220,20 @@  builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
     case P8V_BUILTIN_VGBBD:
     case MISC_BUILTIN_CDTBCD:
     case MISC_BUILTIN_CBCDTD:
+    case FUTURE_BUILTIN_MTVSRBM:
+    case FUTURE_BUILTIN_MTVSRHM:
+    case FUTURE_BUILTIN_MTVSRWM:
+    case FUTURE_BUILTIN_MTVSRDM:
+    case FUTURE_BUILTIN_MTVSRQM:
+    case FUTURE_BUILTIN_VCNTMBB:
+    case FUTURE_BUILTIN_VCNTMBH:
+    case FUTURE_BUILTIN_VCNTMBW:
+    case FUTURE_BUILTIN_VCNTMBD:
+    case FUTURE_BUILTIN_VEXPANDMB:
+    case FUTURE_BUILTIN_VEXPANDMH:
+    case FUTURE_BUILTIN_VEXPANDMW:
+    case FUTURE_BUILTIN_VEXPANDMD:
+    case FUTURE_BUILTIN_VEXPANDMQ:
       h.uns_p[0] = 1;
       h.uns_p[1] = 1;
       break;
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 2a28215ac5b..96b6ad22812 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -263,6 +263,13 @@ 
 ;; Mode attribute to give the suffix for the splat instruction
 (define_mode_attr VSX_SPLAT_SUFFIX [(V16QI "b") (V8HI "h")])
 
+;; Iterator for the move to mask instructions
+(define_mode_iterator VSX_MM [V16QI V8HI V4SI V2DI V1TI])
+(define_mode_iterator VSX_MM4 [V16QI V8HI V4SI V2DI])
+
+;; Mode attribute to give the suffix for the mask instruction
+(define_mode_attr VSX_MM_SUFFIX [(V16QI "b") (V8HI "h") (V4SI "w") (V2DI "d") (V1TI "q")])
+
 ;; Constants for creating unspecs
 (define_c_enum "unspec"
   [UNSPEC_VSX_CONCAT
@@ -344,6 +351,10 @@ 
    UNSPEC_VSX_FIRST_MISMATCH_INDEX
    UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX
    UNSPEC_XXGENPCV
+   UNSPEC_MTVSBM
+   UNSPEC_VCNTMB
+   UNSPEC_VEXPAND
+   UNSPEC_VEXTRACT
   ])
 
 ;; VSX moves
@@ -5676,3 +5687,59 @@ 
   DONE;
 })
 
+;; VSX mask manipulation instructions
+(define_expand "vec_mtvsrbm"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
+        (unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand" "b")]
+       UNSPEC_MTVSBM))]
+   "TARGET_FUTURE"
+ {
+  if (CONST_INT_P (operands[1]) && IN_RANGE (INTVAL (operands[1]), 0, 63))
+     /* This is the vec_mtvsrbmi inst with six bit constant.  */
+    emit_insn (gen_vec_mtvsrbmi (operands[0], operands[1]));
+  else
+    emit_insn (gen_vec_mtvsr_v16qi (operands[0], operands[1]));
+
+  DONE;
+})
+
+(define_insn "vec_mtvsrbmi"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
+        (unspec:V16QI [(match_operand:QI 1 "u6bit_cint_operand" "n")]
+        UNSPEC_MTVSBM))]
+  "TARGET_FUTURE"
+  "mtvsrbmi %0,%1"
+)
+
+(define_insn "vec_mtvsr_<mode>"
+  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
+        (unspec:VSX_MM [(match_operand:DI 1 "gpc_reg_operand" "b")]
+        UNSPEC_MTVSBM))]
+  "TARGET_FUTURE"
+  "mtvsr<VSX_MM_SUFFIX>m %0,%1"
+  [(set_attr "type" "vecsimple")])
+
+(define_insn "vec_cntmb_<mode>"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+        (unspec:DI [(match_operand:VSX_MM4 1 "vsx_register_operand" "v")
+                    (match_operand:QI 2 "const_0_to_1_operand" "n")]
+        UNSPEC_VCNTMB))]
+  "TARGET_FUTURE"
+  "vcntmb<VSX_MM_SUFFIX> %0,%1,%2"
+  [(set_attr "type" "vecsimple")])
+
+(define_insn "vec_extract_<mode>"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
+	UNSPEC_VEXTRACT))]
+  "TARGET_FUTURE"
+  "vextract<VSX_MM_SUFFIX>m %0,%1"
+  [(set_attr "type" "vecsimple")])
+
+(define_insn "vec_expand_<mode>"
+  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
+        (unspec:VSX_MM [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
+        UNSPEC_VEXPAND))]
+  "TARGET_FUTURE"
+  "vexpand<VSX_MM_SUFFIX>m %0,%1"
+  [(set_attr "type" "vecsimple")])
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c b/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
new file mode 100644
index 00000000000..8eab7107b15
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
@@ -0,0 +1,614 @@ 
+/* { dg-do run } */
+/* { dg-options "-mcpu=future -O2 -save-temps" } */
+/* { dg-require-effective-target powerpc_future_hw } */
+
+/* Check that the expected mask manipulation instructions are generated and
+   that they produce the expected results on hardware that supports them.  */
+/* { dg-final { scan-assembler-times {\mmtvsrbm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrhm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrdm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrqm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmtvsrbmi\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvexpandbm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvexpandhm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvexpandwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvexpanddm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvexpandqm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvcntmbb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvcntmbh\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvcntmbw\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvcntmbd\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextractbm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextracthm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextractwm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextractdm\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextractqm\M} 1 } } */
+
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+#include <altivec.h>
+
+void abort (void);
+
+int main ()
+{
+  int i, num_elements;
+  unsigned long long arg1;
+  
+  vector unsigned char  vbc_result_bi, vbc_expected_result_bi;
+  vector unsigned short vbc_result_hi, vbc_expected_result_hi;
+  vector unsigned int  vbc_result_wi, vbc_expected_result_wi;
+  vector unsigned long long vbc_result_di, vbc_expected_result_di;
+  vector __uint128_t vbc_result_qi, vbc_expected_result_qi;
+
+  unsigned int result_wi, expected_result_wi;
+  unsigned long long result, expected_result;
+  const unsigned char mp=1;
+  vector unsigned char vbc_bi_src;
+  vector unsigned short vbc_hi_src;
+  vector unsigned int vbc_wi_src;
+  vector unsigned long long vbc_di_src;
+  vector __uint128_t vbc_qi_src;
+  
+ /* mtvsrbmi */
+  num_elements = 16;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_bi[i] = 0x0;
+
+  vbc_expected_result_bi[0] = 0xFF;
+  vbc_expected_result_bi[2] = 0xFF;
+
+  vbc_result_bi = vec_genbm(5);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_bi[i] != vbc_expected_result_bi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genbm(const 5) ");
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_bi[i], vbc_expected_result_bi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrbm */
+  num_elements = 16;
+  /* -O2 should generate mtvsrbmi as argument will fit in 6-bit field. */
+  arg1 = 3;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_bi[i] = 0x0;
+
+  vbc_expected_result_bi[1] = 0xFF;
+  vbc_expected_result_bi[0] = 0xFF;
+
+  vbc_result_bi = vec_genbm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_bi[i] != vbc_expected_result_bi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genbm(%llu) ", arg1);
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_bi[i], vbc_expected_result_bi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  num_elements = 16;
+  /* Should generate mtvsrbm as argument will not fit in 6-bit field. */
+  arg1 = 0xEA;   // 234 decimal
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_bi[i] = 0x0;
+
+  vbc_expected_result_bi[7] = 0xFF;
+  vbc_expected_result_bi[6] = 0xFF;
+  vbc_expected_result_bi[5] = 0xFF;
+  vbc_expected_result_bi[3] = 0xFF;
+  vbc_expected_result_bi[1] = 0xFF;
+
+  vbc_result_bi = vec_genbm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_bi[i] != vbc_expected_result_bi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genbm(%llu) ", arg1);
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_bi[i], vbc_expected_result_bi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrhm */
+  num_elements = 8;
+  arg1 = 5;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_hi[i] = 0x0;
+
+  vbc_expected_result_hi[2] = 0xFFFF;
+  vbc_expected_result_hi[0] = 0xFFFF;
+
+  vbc_result_hi = vec_genhm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_hi[i] != vbc_expected_result_hi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genhm(%llu) ", arg1);
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_hi[i], vbc_expected_result_hi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrwm */
+  num_elements = 4;
+  arg1 = 7;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_wi[i] = 0x0;
+
+  vbc_expected_result_wi[2] = 0xFFFFFFFF;
+  vbc_expected_result_wi[1] = 0xFFFFFFFF;
+  vbc_expected_result_wi[0] = 0xFFFFFFFF;
+
+  vbc_result_wi = vec_genwm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_wi[i] != vbc_expected_result_wi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genwm(%llu) ", arg1);
+      printf("element %d equals 0x%x does not match expected_result = 0x%x",
+	     i, vbc_result_wi[i], vbc_expected_result_wi[i]);
+      printf("\n\n");
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrdm */
+  num_elements = 2;
+  arg1 = 1;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_di[i] = 0x0;
+
+  vbc_expected_result_di[1] = 0x0;
+  vbc_expected_result_di[0] = 0xFFFFFFFFFFFFFFFF;
+
+  vbc_result_di = vec_gendm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_di[i] != vbc_expected_result_di[i]) {
+#if DEBUG
+      printf("ERROR: vec_gendm(%llu) ", arg1);
+      printf("element %d equals 0x%llx does not match expected_result = ",
+	     i, vbc_result_di[i]);
+      printf("0x%llx\n\n", vbc_expected_result_di[i]);
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* mtvsrqm */
+  num_elements = 1;
+  arg1 = 1;
+  
+  for (i = 0; i<num_elements; i++)
+    vbc_expected_result_qi[i] = 0x0;
+
+  vbc_expected_result_qi[0] = 0xFFFFFFFFFFFFFFFFULL;
+  vbc_expected_result_qi[0] = (vbc_expected_result_qi[0] << 64)
+    | 0xFFFFFFFFFFFFFFFFULL;
+
+  vbc_result_qi = vec_genqm(arg1);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_qi[i] != vbc_expected_result_qi[i]) {
+#if DEBUG
+      printf("ERROR: vec_genqm(%llu) ", arg1);
+      printf("element %d low 64 bits 0x%llx does not match expected_result = ",
+	     i, (unsigned long long) vbc_result_qi[i]);
+      printf("0x%llx\n\n", (unsigned long long) vbc_expected_result_qi[i]);
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* vexpandbm */
+  num_elements = 16;
+  vbc_bi_src[0] = 0xFF;
+  vbc_bi_src[1] = 0xFF;
+  vbc_bi_src[2] = 0x0;
+  vbc_bi_src[3] = 0x0;
+  vbc_bi_src[4] = 0x0;
+  vbc_bi_src[5] = 0x0;
+  vbc_bi_src[6] = 0xFF;
+  vbc_bi_src[7] = 0xFF;
+  vbc_bi_src[8] = 0x0;
+  vbc_bi_src[9] = 0x0;
+  vbc_bi_src[10] = 0x0;
+  vbc_bi_src[11] = 0x0;
+  vbc_bi_src[12] = 0x0;
+  vbc_bi_src[13] = 0xFF;
+  vbc_bi_src[14] = 0xFF;
+  vbc_bi_src[15] = 0xFF;
+
+  vbc_expected_result_bi[0] = 0xFF;
+  vbc_expected_result_bi[1] = 0xFF;
+  vbc_expected_result_bi[2] = 0x0;
+  vbc_expected_result_bi[3] = 0x0;
+  vbc_expected_result_bi[4] = 0x0;
+  vbc_expected_result_bi[5] = 0x0;
+  vbc_expected_result_bi[6] = 0xFF;
+  vbc_expected_result_bi[7] = 0xFF;
+  vbc_expected_result_bi[8] = 0x0;
+  vbc_expected_result_bi[9] = 0x0;
+  vbc_expected_result_bi[10] = 0x0;
+  vbc_expected_result_bi[11] = 0x0;
+  vbc_expected_result_bi[12] = 0x0;
+  vbc_expected_result_bi[13] = 0xFF;
+  vbc_expected_result_bi[14] = 0xFF;
+  vbc_expected_result_bi[15] = 0xFF;
+
+  vbc_result_bi = vec_expandm (vbc_bi_src);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_bi[i] != vbc_expected_result_bi[i]) {
+#if DEBUG
+      printf("ERROR: char vec_expandm(arg) ");
+      printf("element %d, 0x%x does not match expected value = 0x%x\n",
+	     i, vbc_result_bi[i], vbc_expected_result_bi[i]);
+#else
+    abort();
+#endif
+    }
+  }
+
+  /* vexpandhm */
+  num_elements = 8;
+  vbc_hi_src[0] = 0x0;
+  vbc_hi_src[1] = 0xFFFF;
+  vbc_hi_src[2] = 0x0;
+  vbc_hi_src[3] = 0xFFFF;
+  vbc_hi_src[4] = 0x0;
+  vbc_hi_src[5] = 0x0;
+  vbc_hi_src[6] = 0xFFFF;
+  vbc_hi_src[7] = 0xFFFF;
+
+  vbc_expected_result_hi[0] = 0x0;
+  vbc_expected_result_hi[1] = 0xFFFF;
+  vbc_expected_result_hi[2] = 0x0;
+  vbc_expected_result_hi[3] = 0xFFFF;
+  vbc_expected_result_hi[4] = 0x0;
+  vbc_expected_result_hi[5] = 0x0;
+  vbc_expected_result_hi[6] = 0xFFFF;
+  vbc_expected_result_hi[7] = 0xFFFF;
+
+  vbc_result_hi = vec_expandm (vbc_hi_src);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_hi[i] != vbc_expected_result_hi[i]) {
+#if DEBUG
+      printf("ERROR: short vec_expandm(arg) ");
+      printf("element %d, 0x%x does not match expected value = 0x%x\n",
+	     i, vbc_result_hi[i], vbc_expected_result_hi[i]);
+#else
+    abort();
+#endif
+    }
+  }
+  
+  /* vexpandwm */
+  num_elements = 4;
+  vbc_wi_src[0] = 0x0;
+  vbc_wi_src[1] = 0xFFFFFFFF;
+  vbc_wi_src[2] = 0x0;
+  vbc_wi_src[3] = 0xFFFFFFFF;
+
+  vbc_expected_result_wi[0] = 0x0;
+  vbc_expected_result_wi[1] = 0xFFFFFFFF;
+  vbc_expected_result_wi[2] = 0x0;
+  vbc_expected_result_wi[3] = 0xFFFFFFFF;
+
+  vbc_result_wi = vec_expandm (vbc_wi_src);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_wi[i] != vbc_expected_result_wi[i]) {
+#if DEBUG
+      printf("ERROR: int vec_expandm(arg) ");
+      printf("element %d, 0x%x does not match expected value = 0x%x\n",
+	     i, vbc_result_wi[i], vbc_expected_result_wi[i]);
+#else
+    abort();
+#endif
+    }
+  }
+  
+  /* vexpanddm */
+  num_elements = 2;
+  vbc_di_src[0] = 0x0;
+  vbc_di_src[1] = 0xFFFFFFFFFFFFFFFFULL;
+
+  vbc_expected_result_di[0] = 0x0;
+  vbc_expected_result_di[1] = 0xFFFFFFFFFFFFFFFFULL;
+
+  vbc_result_di = vec_expandm (vbc_di_src);
+  
+  for (i = 0; i<num_elements; i++) {
+    if (vbc_result_di[i] != vbc_expected_result_di[i]) {
+#if DEBUG
+      printf("ERROR: double vec_expandm(arg) ");
+      printf("element %d, 0x%llx does not match expected value = 0x%llx\n",
+	     i, vbc_result_di[i], vbc_expected_result_di[i]);
+#else
+    abort();
+#endif
+    }
+  }
+  
+  /* vexpandqm */
+  num_elements = 1;
+  vbc_qi_src[0] = 0x0;
+
+  vbc_expected_result_qi[0] = 0x0;
+
+  vbc_result_qi = vec_expandm (vbc_qi_src);
+  
+  if (vbc_result_qi[0] != vbc_expected_result_qi[0]) {
+#if DEBUG
+    printf("ERROR: quad vec_expandm(arg) low 64 bits 0x%llx, expected 0x%llx\n",
+	   (unsigned long long) vbc_result_qi[0],
+	   (unsigned long long) vbc_expected_result_qi[0]);
+#else
+    abort();
+#endif
+  }
+
+  /* vcntmbb */
+  num_elements = 16;
+  vbc_bi_src[0] = 0xFF;
+  vbc_bi_src[1] = 0xFF;
+  vbc_bi_src[2] = 0x0;
+  vbc_bi_src[3] = 0x0;
+  vbc_bi_src[4] = 0x0;
+  vbc_bi_src[5] = 0x0;
+  vbc_bi_src[6] = 0xFF;
+  vbc_bi_src[7] = 0xFF;
+  vbc_bi_src[8] = 0x0;
+  vbc_bi_src[9] = 0x0;
+  vbc_bi_src[10] = 0x0;
+  vbc_bi_src[11] = 0x0;
+  vbc_bi_src[12] = 0x0;
+  vbc_bi_src[13] = 0xFF;
+  vbc_bi_src[14] = 0xFF;
+  vbc_bi_src[15] = 0xFF;
+
+  expected_result = 7;
+
+  result = vec_cntm (vbc_bi_src, 1);
+  /* Note count is put in bits[0:7], IBM numbering, of the 64-bit result */
+  result = result >> (64-8);
+  
+  if (result != expected_result) {
+#if DEBUG
+    printf("ERROR: char vec_cntm(arg) ");
+    printf("count %llu does not match expected count = %llu\n",
+	   result, expected_result);
+#else
+    abort();
+#endif
+  }
+
+  /* vcntmbh */
+  num_elements = 8;
+  vbc_hi_src[0] = 0xFFFF;
+  vbc_hi_src[1] = 0xFFFF;
+  vbc_hi_src[2] = 0x0;
+  vbc_hi_src[3] = 0x0;
+  vbc_hi_src[4] = 0x0;
+  vbc_hi_src[5] = 0x0;
+  vbc_hi_src[6] = 0xFFFF;
+  vbc_hi_src[7] = 0xFFFF;
+
+  expected_result = 4;
+
+  result = vec_cntm (vbc_hi_src, 1);
+  /* Note count is put in bits[0:6], IBM numbering, of the 64-bit result */
+  result = result >> (64-7);
+  
+  if (result != expected_result) {
+#if DEBUG
+    printf("ERROR: short vec_cntm(arg) ");
+    printf("count %llu does not match expected count = %llu\n",
+	   result, expected_result);
+#else
+    abort();
+#endif
+  }
+
+  /* vcntmbw */
+  num_elements = 4;
+  vbc_wi_src[0] = 0xFFFFFFFF;
+  vbc_wi_src[1] = 0xFFFFFFFF;
+  vbc_wi_src[2] = 0x0;
+  vbc_wi_src[3] = 0x0;
+
+  expected_result = 2;
+
+  result = vec_cntm (vbc_wi_src, 1);
+  /* Note count is put in bits[0:5], IBM numbering, of the 64-bit result */
+  result = result >> (64-6);
+  
+  if (result != expected_result) {
+#if DEBUG
+    printf("ERROR: word vec_cntm(arg) ");
+    printf("count %llu does not match expected count = %llu\n",
+	   result, expected_result);
+#else
+    abort();
+#endif
+  }
+
+  /* vcntmbd */
+  num_elements = 2;
+  vbc_di_src[0] = 0xFFFFFFFFFFFFFFFFULL;
+  vbc_di_src[1] = 0xFFFFFFFFFFFFFFFFULL;
+
+  expected_result = 2;
+
+  result = vec_cntm (vbc_di_src, 1);
+  /* Note count is put in bits[0:4], IBM numbering, of the 64-bit result */
+  result = result >> (64-5);
+
+  if (result != expected_result) {
+#if DEBUG
+    printf("ERROR: double vec_cntm(arg) ");
+    printf("count %llu does not match expected count = %llu\n",
+	   result, expected_result);
+#else
+    abort();
+#endif
+  }
+
+  /* vextractbm */
+  num_elements = 16;
+  vbc_bi_src[0] = 0xFF;
+  vbc_bi_src[1] = 0xFF;
+  vbc_bi_src[2] = 0x0;
+  vbc_bi_src[3] = 0x0;
+  vbc_bi_src[4] = 0x0;
+  vbc_bi_src[5] = 0x0;
+  vbc_bi_src[6] = 0xFF;
+  vbc_bi_src[7] = 0xFF;
+  vbc_bi_src[8] = 0xFF;
+  vbc_bi_src[9] = 0xFF;
+  vbc_bi_src[10] = 0xFF;
+  vbc_bi_src[11] = 0xFF;
+  vbc_bi_src[12] = 0xFF;
+  vbc_bi_src[13] = 0x0;
+  vbc_bi_src[14] = 0xFF;
+  vbc_bi_src[15] = 0xFF;
+
+  expected_result_wi = 0b1101111111000011;
+
+  result_wi = vec_extractm (vbc_bi_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: char vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  /* vextracthm */
+  num_elements = 8;
+  vbc_hi_src[0] = 0xFFFF;
+  vbc_hi_src[1] = 0xFFFF;
+  vbc_hi_src[2] = 0x0;
+  vbc_hi_src[3] = 0x0;
+  vbc_hi_src[4] = 0x0;
+  vbc_hi_src[5] = 0x0;
+  vbc_hi_src[6] = 0xFFFF;
+  vbc_hi_src[7] = 0xFFFF;
+
+  expected_result_wi = 0b11000011;
+
+  result_wi = vec_extractm (vbc_hi_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: short vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  /* vextractwm */
+  num_elements = 4;
+  vbc_wi_src[0] = 0xFFFFFFFF;
+  vbc_wi_src[1] = 0xFFFFFFFF;
+  vbc_wi_src[2] = 0x0;
+  vbc_wi_src[3] = 0x0;
+
+  expected_result_wi = 0b0011;
+
+  result_wi = vec_extractm (vbc_wi_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: word vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  /* vextractdm */
+  num_elements = 2;
+  vbc_di_src[0] = 0xFFFFFFFFFFFFFFFF;
+  vbc_di_src[1] = 0xFFFFFFFFFFFFFFFF;
+
+  expected_result_wi = 0b11;
+
+  result_wi = vec_extractm (vbc_di_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: double vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  /* vextractqm */
+  num_elements = 1;
+  vbc_qi_src[0] = 0x1;
+  vbc_qi_src[0] = vbc_qi_src[0] << 127;
+  
+  expected_result_wi = 1;
+
+  result_wi = vec_extractm (vbc_qi_src);
+  
+  if (result_wi != expected_result_wi) {
+#if DEBUG
+    printf("ERROR: quad vec_extractm(arg) ");
+    printf("result 0x%x does not match expected result = 0x%x\n",
+	   result_wi, expected_result_wi);
+#else
+    abort();
+#endif
+  }
+
+  return 0;
+}
+
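For reviewers without access to future hardware or mambo, the expected values
in the test above can be cross-checked against a scalar model. The following
is only a sketch inferred from the test's own expected results, not from the
ISA pseudocode. The function names (model_genbm, model_extractbm,
model_cntmbb, model_selfcheck) are hypothetical and are not part of GCC or
altivec.h. Element i is assumed to be the C-level vector subscript used by
the test on little endian.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar models of the byte variants exercised by the test.
   These names are illustrative only.  */

/* Model of vec_genbm: bit i of the mask selects an all-ones byte i.  */
static void
model_genbm (uint64_t mask, uint8_t out[16])
{
  for (int i = 0; i < 16; i++)
    out[i] = ((mask >> i) & 1) ? 0xFF : 0x00;
}

/* Model of vec_extractm on bytes: gather the top bit of each element
   into bit i of the result.  */
static unsigned int
model_extractbm (const uint8_t in[16])
{
  unsigned int r = 0;
  for (int i = 0; i < 16; i++)
    r |= (unsigned int) (in[i] >> 7) << i;
  return r;
}

/* Model of vec_cntm on bytes: count the elements whose top bit equals mp.
   The test recovers this count by shifting the raw result right by 64-8.  */
static unsigned int
model_cntmbb (const uint8_t in[16], unsigned int mp)
{
  unsigned int n = 0;
  for (int i = 0; i < 16; i++)
    n += ((unsigned int) (in[i] >> 7) == mp);
  return n;
}

/* Usage example mirroring the vec_genbm (0xEA) case in the test.
   Returns 0 when every check passes.  */
int
model_selfcheck (void)
{
  uint8_t v[16];

  model_genbm (0xEA, v);	/* 0xEA = 0b11101010.  */
  assert (v[1] == 0xFF && v[3] == 0xFF && v[5] == 0xFF);
  assert (v[6] == 0xFF && v[7] == 0xFF);
  assert (v[0] == 0 && v[2] == 0 && v[4] == 0 && v[8] == 0);
  assert (model_extractbm (v) == 0xEA);
  assert (model_cntmbb (v, 1) == 5);
  assert (model_cntmbb (v, 0) == 11);
  return 0;
}
```

In this model extract is the inverse of gen for byte masks, which is why the
0xEA case expects elements 1, 3, 5, 6 and 7 to be set.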