RS6000 Add 128-bit Binary Integer sign extend operations

Message ID 17c8bde2ddc366511864446e7de8b873366945b4.camel@us.ibm.com
State New
Series RS6000 Add 128-bit Binary Integer sign extend operations

Commit Message

Carl Love April 28, 2021, 5:39 p.m. UTC
Segher, Will:

The agreement for the sign extension builtin was to just make it
endian-aware rather than go with a more complex definition.  The prior
patch has been updated with this new functionality.

This patch adds support for the 128-bit extension instruction and
corresponding builtin support for the various sign extensions.

This was originally part of the "Add 128-bit Integer operations" patch
series.  The patch logically goes with that earlier five-patch series.

The LE support testing was done on a Power 10.  The regression testing
for LE passes with no regressions.  

The BE support testing was done by generating the BE code sequence and
then manually using gdb and visual inspection to make sure the elements
were correctly reversed and the expected elements were sign extended.

Please let me know if the patch is acceptable for mainline.

                   Carl Love

-------------------------------------------

gcc/ChangeLog

2021-04-28  Carl Love  <cel@us.ibm.com>
	* config/rs6000/altivec.h (vec_signextll, vec_signexti, vec_signextq):
	Add define for new builtins.
	* config/rs6000/altivec.md (altivec_vreveti2): Add define_expand.
	* config/rs6000/rs6000-builtin.def (VSIGNEXTI, VSIGNEXTLL): Add
	overloaded builtin definitions.
	(VSIGNEXTSB2W, VSIGNEXTSH2W, VSIGNEXTSB2D, VSIGNEXTSH2D, VSIGNEXTSW2D,
	VSIGNEXTSD2Q): Add builtin expansions.
	(SIGNEXT): Add P10 overload definition.
	* config/rs6000/rs6000-call.c (P9V_BUILTIN_VEC_VSIGNEXTI,
	P9V_BUILTIN_VEC_VSIGNEXTLL, P10_BUILTIN_VEC_SIGNEXT): Add overloaded
	argument definitions.
	* config/rs6000/vsx.md (vsx_sign_extend_v2di_v1ti): Add define_insn.
	(vsignextend_v2di_v1ti, vsignextend_qi_<mode>, vsignextend_hi_<mode>,
	vsignextend_si_v2di)[VIlong]: Add define_expand.
	Make define_insn vsx_sign_extend_si_v2di visible.
	* doc/extend.texi: Add documentation for the vec_signexti,
	vec_signextll and vec_signextq builtins.

gcc/testsuite/ChangeLog

2021-04-28  Carl Love  <cel@us.ibm.com>
	* gcc.target/powerpc/int_128bit-runnable.c (extsd2q): Update expected
	count.
	Add tests for vec_signextq.
	* gcc.target/powerpc/p9-sign_extend-runnable.c:  New test case.
---
 gcc/config/rs6000/altivec.h                   |   3 +
 gcc/config/rs6000/altivec.md                  |  24 ++++
 gcc/config/rs6000/rs6000-builtin.def          |  12 ++
 gcc/config/rs6000/rs6000-call.c               |  16 +++
 gcc/config/rs6000/vsx.md                      |  83 +++++++++++-
 gcc/doc/extend.texi                           |  16 +++
 .../gcc.target/powerpc/int_128bit-runnable.c  |  41 +++++-
 .../powerpc/p9-sign_extend-runnable.c         | 128 ++++++++++++++++++
 8 files changed, 321 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c

Comments

Segher Boessenkool June 7, 2021, 7:03 p.m. UTC | #1
Hi Carl,

On Wed, Apr 28, 2021 at 10:39:14AM -0700, Carl Love wrote:
> The agreement for the sign extension builtin was to just make it endian
> aware rather than go with a more complex definition.  The prior patch
> has been updated with this new functionality.
> 
> This patch adds support for the 128-bit extension instruction and
> corresponding builtin support for the various sign extensions.
> 
> This was originally part of the Add 128-bit Integer operations patch
> series.  The patch logically goes with the earlier 5 patch series.

But it has nothing to do with patch 5/5, so it confused me that you
posted it as a reply to that.

> +(define_expand "vsignextend_v2di_v1ti"
> +  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> +	(unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "v")]
> +		     UNSPEC_VSX_SIGN_EXTEND))]
> +  "TARGET_POWER10"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +       rtx tmp = gen_reg_rtx (V2DImode);
> +
> +       emit_insn (gen_altivec_vrevev2di2(tmp, operands[1]));
> +       emit_insn (gen_vsx_sign_extend_v2di_v1ti(operands[0], tmp));
> +     }
> +  else
> +    emit_insn (gen_vsx_sign_extend_v2di_v1ti(operands[0], operands[1]));
> +  DONE;
> +})

The indentation is broken -- everything should be indented in steps of
two columns.

In cases like this where the pattern in the expand agrees with a
define_insn you have, you can write just

{
  if (BYTES_BIG_ENDIAN)
    {
      ...
      DONE;
    }
}

This is the normal way of writing it.  When a define_expand does not
call DONE or FAIL, the pattern is inserted in the insn stream.


Okay for trunk with the indents fixed, and maybe the expand thing.  Also
okay for GCC 11 if it survived testing on all targets and OSes.  That
goes for all patches in this series btw.  Thanks!


Segher

Patch

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 314695a43ca..5b631c7ebaf 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -497,6 +497,8 @@ 
 
 #define vec_xlx __builtin_vec_vextulx
 #define vec_xrx __builtin_vec_vexturx
+#define vec_signexti  __builtin_vec_vsignexti
+#define vec_signextll __builtin_vec_vsignextll
 
 #endif
 
@@ -715,6 +717,7 @@  __altivec_scalar_pred(vec_any_nle,
 #define vec_step(x) __builtin_vec_step (* (__typeof__ (x) *) 0)
 
 #ifdef _ARCH_PWR10
+#define vec_signextq  __builtin_vec_vsignextq
 #define vec_dive __builtin_vec_dive
 #define vec_mod  __builtin_vec_mod
 
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index c7d2cd0aa88..61a0905789f 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -4291,6 +4291,30 @@ 
 })
 
 ;; Vector reverse elements
+(define_expand "altivec_vreveti2"
+  [(set (match_operand:TI 0 "register_operand" "=v")
+	(unspec:TI [(match_operand:TI 1 "register_operand" "v")]
+		      UNSPEC_VREVEV))]
+  "TARGET_ALTIVEC"
+{
+  int i, j, size, num_elements;
+  rtvec v = rtvec_alloc (16);
+  rtx mask = gen_reg_rtx (V16QImode);
+
+  size = GET_MODE_UNIT_SIZE (TImode);
+  num_elements = GET_MODE_NUNITS (TImode);
+
+  for (j = 0; j < num_elements; j++)
+    for (i = 0; i < size; i++)
+      RTVEC_ELT (v, i + j * size)
+	= GEN_INT (i + (num_elements - 1 - j) * size);
+
+  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+  emit_insn (gen_altivec_vperm_ti (operands[0], operands[1],
+	     operands[1], mask));
+  DONE;
+})
+
 (define_expand "altivec_vreve<mode>2"
   [(set (match_operand:VEC_A 0 "register_operand" "=v")
 	(unspec:VEC_A [(match_operand:VEC_A 1 "register_operand" "v")]
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index dba22825b79..d55095b01bb 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2877,6 +2877,8 @@  BU_P9V_OVERLOAD_1 (VPRTYBD,	"vprtybd")
 BU_P9V_OVERLOAD_1 (VPRTYBQ,	"vprtybq")
 BU_P9V_OVERLOAD_1 (VPRTYBW,	"vprtybw")
 BU_P9V_OVERLOAD_1 (VPARITY_LSBB,	"vparity_lsbb")
+BU_P9V_OVERLOAD_1 (VSIGNEXTI,	"vsignexti")
+BU_P9V_OVERLOAD_1 (VSIGNEXTLL,	"vsignextll")
 
 /* 2 argument functions added in ISA 3.0 (power9).  */
 BU_P9_2 (CMPRB,	"byte_in_range",	CONST,	cmprb)
@@ -2888,6 +2890,13 @@  BU_P9_OVERLOAD_2 (CMPRB,	"byte_in_range")
 BU_P9_OVERLOAD_2 (CMPRB2,	"byte_in_either_range")
 BU_P9_OVERLOAD_2 (CMPEQB,	"byte_in_set")
 
+
+BU_P9V_AV_1 (VSIGNEXTSB2W,	"vsignextsb2w",		CONST,  vsignextend_qi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSH2W,	"vsignextsh2w",		CONST,  vsignextend_hi_v4si)
+BU_P9V_AV_1 (VSIGNEXTSB2D,	"vsignextsb2d",		CONST,  vsignextend_qi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSH2D,	"vsignextsh2d",		CONST,  vsignextend_hi_v2di)
+BU_P9V_AV_1 (VSIGNEXTSW2D,	"vsignextsw2d",		CONST,  vsignextend_si_v2di)
+
 /* Builtins for scalar instructions added in ISA 3.1 (power10).  */
 BU_P10V_AV_P (VCMPEQUT_P,	"vcmpequt_p",	CONST,	vector_eq_v1ti_p)
 BU_P10V_AV_P (VCMPGTST_P,	"vcmpgtst_p",	CONST,	vector_gt_v1ti_p)
@@ -2926,6 +2935,8 @@  BU_P10V_AV_2 (VNOR_V1TI,	"vnor_v1ti",	CONST,	norv1ti3)
 BU_P10V_AV_2 (VCMPNET_P,	"vcmpnet_p",	CONST,	vector_ne_v1ti_p)
 BU_P10V_AV_2 (VCMPAET_P,	"vcmpaet_p",	CONST,	vector_ae_v1ti_p)
 
+BU_P10V_AV_1 (VSIGNEXTSD2Q,	"vsignext",     CONST,  vsignextend_v2di_v1ti)
+
 BU_P10V_AV_2 (VMULEUD,	"vmuleud",	CONST,	vec_widen_umult_even_v2di)
 BU_P10V_AV_2 (VMULESD,	"vmulesd",	CONST,	vec_widen_smult_even_v2di)
 BU_P10V_AV_2 (VMULOUD,	"vmuloud",	CONST,	vec_widen_umult_odd_v2di)
@@ -3145,6 +3156,7 @@  BU_CRYPTO_OVERLOAD_2A (VPMSUM,	 "vpmsum")
 BU_CRYPTO_OVERLOAD_3A (VPERMXOR,	 "vpermxor")
 BU_CRYPTO_OVERLOAD_3 (VSHASIGMA, "vshasigma")
 
+BU_P10_OVERLOAD_1 (SIGNEXT, "vsignextq")
 
 /* HTM functions.  */
 BU_HTM_1  (TABORT,	"tabort",	CR,	tabort)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 32a8af92458..25beed5f0e4 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -5821,6 +5821,19 @@  const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI,
     RS6000_BTI_INTSI, RS6000_BTI_INTSI },
 
+  /* Sign extend builtins that work on ISA 3.0, not added until ISA 3.1 */
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSB2W,
+    RS6000_BTI_V4SI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTI, P9V_BUILTIN_VSIGNEXTSH2W,
+    RS6000_BTI_V4SI, RS6000_BTI_V8HI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSB2D,
+    RS6000_BTI_V2DI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSH2D,
+    RS6000_BTI_V2DI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VSIGNEXTLL, P9V_BUILTIN_VSIGNEXTSW2D,
+    RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
+
   /* Overloaded built-in functions for ISA3.1 (power10). */
   { P10_BUILTIN_VEC_CLRL, P10V_BUILTIN_VCLRLB,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
@@ -6184,6 +6197,9 @@  const struct altivec_builtin_types altivec_overloaded_builtins[] = {
  { P10_BUILTIN_VEC_XVTLSBB_ONES, P10V_BUILTIN_XVTLSBB_ONES,
     RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 },
 
+  { P10_BUILTIN_VEC_SIGNEXT, P10V_BUILTIN_VSIGNEXTSD2Q,
+     RS6000_BTI_V1TI, RS6000_BTI_V2DI, 0, 0 },
+
   { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 }
 };
 
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 587011081e1..c99f80e83dd 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4883,6 +4883,33 @@ 
    (set_attr "type" "vecload")])
 
 
+;; ISA 3.1 vector extend sign support
+(define_insn "vsx_sign_extend_v2di_v1ti"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
+	(unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "v")]
+		     UNSPEC_VSX_SIGN_EXTEND))]
+  "TARGET_POWER10"
+ "vextsd2q %0,%1"
+[(set_attr "type" "vecexts")])
+
+(define_expand "vsignextend_v2di_v1ti"
+  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
+	(unspec:V1TI [(match_operand:V2DI 1 "vsx_register_operand" "v")]
+		     UNSPEC_VSX_SIGN_EXTEND))]
+  "TARGET_POWER10"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+       rtx tmp = gen_reg_rtx (V2DImode);
+
+       emit_insn (gen_altivec_vrevev2di2(tmp, operands[1]));
+       emit_insn (gen_vsx_sign_extend_v2di_v1ti(operands[0], tmp));
+     }
+  else
+    emit_insn (gen_vsx_sign_extend_v2di_v1ti(operands[0], operands[1]));
+  DONE;
+})
+
 ;; ISA 3.0 vector extend sign support
 
 (define_insn "vsx_sign_extend_qi_<mode>"
@@ -4894,6 +4921,24 @@ 
   "vextsb2<wd> %0,%1"
   [(set_attr "type" "vecexts")])
 
+(define_expand "vsignextend_qi_<mode>"
+  [(set (match_operand:VIlong 0 "vsx_register_operand" "=v")
+	(unspec:VIlong
+	 [(match_operand:V16QI 1 "vsx_register_operand" "v")]
+	 UNSPEC_VSX_SIGN_EXTEND))]
+  "TARGET_P9_VECTOR"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      rtx tmp = gen_reg_rtx (V16QImode);
+      emit_insn (gen_altivec_vrevev16qi2(tmp, operands[1]));
+      emit_insn (gen_vsx_sign_extend_qi_<mode>(operands[0], tmp));
+    }
+  else
+    emit_insn (gen_vsx_sign_extend_qi_<mode>(operands[0], operands[1]));
+  DONE;
+})
+
 (define_insn "vsx_sign_extend_hi_<mode>"
   [(set (match_operand:VSINT_84 0 "vsx_register_operand" "=v")
 	(unspec:VSINT_84
@@ -4903,7 +4948,25 @@ 
   "vextsh2<wd> %0,%1"
   [(set_attr "type" "vecexts")])
 
-(define_insn "*vsx_sign_extend_si_v2di"
+(define_expand "vsignextend_hi_<mode>"
+  [(set (match_operand:VIlong 0 "vsx_register_operand" "=v")
+	(unspec:VIlong
+	 [(match_operand:V8HI 1 "vsx_register_operand" "v")]
+	 UNSPEC_VSX_SIGN_EXTEND))]
+  "TARGET_P9_VECTOR"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      rtx tmp = gen_reg_rtx (V8HImode);
+      emit_insn (gen_altivec_vrevev8hi2(tmp, operands[1]));
+      emit_insn (gen_vsx_sign_extend_hi_<mode>(operands[0], tmp));
+    }
+  else
+     emit_insn (gen_vsx_sign_extend_hi_<mode>(operands[0], operands[1]));
+  DONE;
+})
+
+(define_insn "vsx_sign_extend_si_v2di"
   [(set (match_operand:V2DI 0 "vsx_register_operand" "=v")
 	(unspec:V2DI [(match_operand:V4SI 1 "vsx_register_operand" "v")]
 		     UNSPEC_VSX_SIGN_EXTEND))]
@@ -4911,6 +4974,24 @@ 
   "vextsw2d %0,%1"
   [(set_attr "type" "vecexts")])
 
+(define_expand "vsignextend_si_v2di"
+  [(set (match_operand:V2DI 0 "vsx_register_operand" "=v")
+	(unspec:V2DI [(match_operand:V4SI 1 "vsx_register_operand" "v")]
+		     UNSPEC_VSX_SIGN_EXTEND))]
+  "TARGET_P9_VECTOR"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+       rtx tmp = gen_reg_rtx (V4SImode);
+
+       emit_insn (gen_altivec_vrevev4si2(tmp, operands[1]));
+       emit_insn (gen_vsx_sign_extend_si_v2di(operands[0], tmp));
+    }
+  else
+     emit_insn (gen_vsx_sign_extend_si_v2di(operands[0], operands[1]));
+  DONE;
+})
+
 ;; ISA 3.1 vector sign extend
 ;; Move DI value from GPR to TI mode in VSX register, word 1.
 (define_insn "mtvsrdd_diti_w1"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d1c56edbaa8..3f6956276b3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -19510,6 +19510,22 @@  The second argument to @var{__builtin_crypto_vshasigmad} and
 integer that is 0 or 1.  The third argument to these built-in functions
 must be a constant integer in the range of 0 to 15.
 
+The following sign extension builtins are provided:
+
+@smallexample
+vector signed int vec_signexti (vector signed char a)
+vector signed long long vec_signextll (vector signed char a)
+vector signed int vec_signexti (vector signed short a)
+vector signed long long vec_signextll (vector signed short a)
+vector signed long long vec_signextll (vector signed int a)
+vector signed long long vec_signextq (vector signed long long a)
+@end smallexample
+
+Each element of the result is produced by sign-extending the element of the
+input vector that would fall in the least significant portion of the result
+element. For example, a sign-extension of a vector signed char to a vector
+signed long long will sign extend the rightmost byte of each doubleword.
+
 @node PowerPC AltiVec Built-in Functions Available on ISA 3.1
 @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 3.1
 
diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
index 94dbd2b6230..1255ee9f0ab 100644
--- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
@@ -4,7 +4,7 @@ 
 
 /* Check that the expected 128-bit instructions are generated if the processor
    supports the 128-bit integer instructions. */
-/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvextsd2q\M} 6 } } */
 /* { dg-final { scan-assembler-times {\mvslq\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mvsrq\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mvsraq\M} 2 } } */
@@ -85,6 +85,45 @@  int main ()
   vector __int128 vec_arg1, vec_arg2, vec_result;
   vector unsigned __int128 vec_uarg1, vec_uarg2, vec_uarg3, vec_uresult;
   vector bool __int128  vec_result_bool;
+
+  /* sign extend double to 128-bit integer  */
+  vec_arg1_di[0] = 1000;
+  vec_arg1_di[1] = -123456;
+
+  expected_result = 1000;
+
+  vec_result = vec_signextq (vec_arg1_di);
+
+  if (vec_result[0] != expected_result) {
+#if DEBUG
+    printf("ERROR: vec_signextq ((long long) %lld) =  ",  vec_arg1_di[0]);
+    print_i128(vec_result[0]);
+    printf("\n does not match expected_result = ");
+    print_i128(expected_result);
+    printf("\n\n");
+#else
+    abort();
+#endif
+  }
+
+  vec_arg1_di[0] = -123456;
+  vec_arg1_di[1] = 1000;
+
+  expected_result = -123456;
+
+  vec_result = vec_signextq (vec_arg1_di);
+
+  if (vec_result[0] != expected_result) {
+#if DEBUG
+    printf("ERROR: vec_signextq ((long long) %lld) =  ",  vec_arg1_di[0]);
+    print_i128(vec_result[0]);
+    printf("\n does not match expected_result = ");
+    print_i128(expected_result);
+    printf("\n\n");
+#else
+    abort();
+#endif
+  }
   
   /* test shift 128-bit integers.
      Note, shift amount is given by the lower 7-bits of the shift amount. */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c b/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c
new file mode 100644
index 00000000000..fdcad019b96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p9-sign_extend-runnable.c
@@ -0,0 +1,128 @@ 
+/* { dg-do run { target { *-*-linux* && { lp64 && p9vector_hw } } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9 -save-temps" } */
+
+/* These builtins were not defined until ISA 3.1 but only require ISA 3.0
+   support.  */
+
+/* { dg-final { scan-assembler-times {\mvextsb2w\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsh2w\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */
+
+#include <altivec.h>
+
+#define DEBUG 0
+
+#if DEBUG
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+
+void abort (void);
+
+int main ()
+{
+  int i;
+
+  vector signed char vec_arg_qi, vec_result_qi;
+  vector signed short int vec_arg_hi, vec_result_hi, vec_expected_hi;
+  vector signed int vec_arg_wi, vec_result_wi, vec_expected_wi;
+  vector signed long long vec_result_di, vec_expected_di;
+
+  /* test sign extend byte to word */
+  vec_arg_qi = (vector signed char) {1, 2, 3, 4, 5, 6, 7, 8,
+				     -1, -2, -3, -4, -5, -6, -7, -8};
+  vec_expected_wi = (vector signed int) {1, 5, -1, -5};
+
+  vec_result_wi = vec_signexti (vec_arg_qi);
+
+  for (i = 0; i < 4; i++)
+    if (vec_result_wi[i] != vec_expected_wi[i]) {
+#if DEBUG
+      printf("ERROR: vec_signexti(char, int):  ");
+      printf("vec_result_wi[%d] != vec_expected_wi[%d]\n",
+	     i, i);
+      printf("vec_result_wi[%d] = %d\n", i, vec_result_wi[i]);
+      printf("vec_expected_wi[%d] = %d\n", i, vec_expected_wi[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* test sign extend byte to double */
+  vec_arg_qi = (vector signed char){1, 2, 3, 4, 5, 6, 7, 8,
+				    -1, -2, -3, -4, -5, -6, -7, -8};
+  vec_expected_di = (vector signed long long int){1, -1};
+
+  vec_result_di = vec_signextll(vec_arg_qi);
+
+  for (i = 0; i < 2; i++)
+    if (vec_result_di[i] != vec_expected_di[i]) {
+#if DEBUG
+      printf("ERROR: vec_signextll(byte, long long int):  ");
+      printf("vec_result_di[%d] != vec_expected_di[%d]\n", i, i);
+      printf("vec_result_di[%d] = %lld\n", i, vec_result_di[i]);
+      printf("vec_expected_di[%d] = %lld\n", i, vec_expected_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* test sign extend short to word */
+  vec_arg_hi = (vector signed short int){1, 2, 3, 4, -1, -2, -3, -4};
+  vec_expected_wi = (vector signed int){1, 3, -1, -3};
+
+  vec_result_wi = vec_signexti(vec_arg_hi);
+
+  for (i = 0; i < 4; i++)
+    if (vec_result_wi[i] != vec_expected_wi[i]) {
+#if DEBUG
+      printf("ERROR: vec_signexti(short, int):  ");
+      printf("vec_result_wi[%d] != vec_expected_wi[%d]\n", i, i);
+      printf("vec_result_wi[%d] = %d\n", i, vec_result_wi[i]);
+      printf("vec_expected_wi[%d] = %d\n", i, vec_expected_wi[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* test sign extend short to double word */
+  vec_arg_hi = (vector signed short int ){1, 3, 5, 7,  -1, -3, -5, -7};
+  vec_expected_di = (vector signed long long int){1, -1};
+
+  vec_result_di = vec_signextll(vec_arg_hi);
+
+  for (i = 0; i < 2; i++)
+    if (vec_result_di[i] != vec_expected_di[i]) {
+#if DEBUG
+      printf("ERROR: vec_signextll(short, double):  ");
+      printf("vec_result_di[%d] != vec_expected_di[%d]\n", i, i);
+      printf("vec_result_di[%d] = %lld\n", i, vec_result_di[i]);
+      printf("vec_expected_di[%d] = %lld\n", i, vec_expected_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  /* test sign extend word to double word */
+  vec_arg_wi = (vector signed int ){1, 3, -1, -3};
+  vec_expected_di = (vector signed long long int){1, -1};
+
+  vec_result_di = vec_signextll(vec_arg_wi);
+
+  for (i = 0; i < 2; i++)
+    if (vec_result_di[i] != vec_expected_di[i]) {
+#if DEBUG
+      printf("ERROR: vec_signextll(word, double):  ");
+      printf("vec_result_di[%d] != vec_expected_di[%d]\n", i, i);
+      printf("vec_result_di[%d] = %lld\n", i, vec_result_di[i]);
+      printf("vec_expected_di[%d] = %lld\n", i, vec_expected_di[i]);
+#else
+      abort();
+#endif
+    }
+
+  return 0;
+}