From patchwork Tue Dec 13 18:16:56 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 705491
Date: Tue, 13 Dec 2016 13:16:56 -0500
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Segher Boessenkool, David Edelsohn, Bill Schmidt
Subject: [PATCH], Add PowerPC ISA 3.0 vec_vinsert4b and vec_vextract4b built-in functions
Message-Id: <20161213181655.GA22420@ibm-tiger.the-meissners.org>

This patch adds support for the vec_vinsert4b and vec_vextract4b built-in
functions that generate the ISA 3.0 XXINSERTW and XXEXTRACTUW/VEXTUW{L,R}X
instructions.  These functions are part of the OpenPOWER 64-bit ELF V2 ABI.

While doing the work, I noticed that the P9V ternary built-in functions were
incorrectly declared to be binary.  I have fixed these functions.

The built-ins added are:

long long vec_vextract4b (const vector signed char, const int);
long long vec_vextract4b (const vector unsigned char, const int);

vector signed char vec_insert4b (vector int, vector signed char, const int);
vector unsigned char vec_insert4b (vector unsigned int, vector unsigned char,
                                   const int);
vector signed char vec_insert4b (long long, vector signed char, const int);
vector unsigned char vec_insert4b (long long, vector unsigned char, const int);

Note, the ABI only adds the form of vec_insert4b that takes a vector int as
the first argument.  On little endian systems, you have to swap double words
to get the desired element into the scalar position for the XXINSERTW
instruction.  I have added a GCC extension that alternatively takes a long
long (or long in 64-bit) for the value to be inserted, since IMHO it makes
the built-in much easier to use.

I have done bootstrap builds on a 64-bit power8 little endian system and a
32/64-bit power7 big endian system.  There were no regressions.  Can I check
this into the GCC trunk?

[gcc]
2016-12-13  Michael Meissner

	* config/rs6000/predicates.md (const_0_to_11_operand): New
	predicate, match 0..11.
	* config/rs6000/rs6000-builtin.def (BU_P9V_VSX_3): Set built-in
	type to ternary, not binary.
	(BU_P9V_64BIT_VSX_3): Likewise.
	(P9V_BUILTIN_VEXTRACT4B): Add support for vec_vinsert4b and
	vec_extract4b non-overloaded built-in functions.
	(P9V_BUILTIN_VINSERT4B): Likewise.
	(P9V_BUILTIN_VINSERT4B_DI): Likewise.
	(P9V_BUILTIN_VEC_VEXTULX): Move to section that adds 2 operand ISA
	3.0 built-in functions.
	(P9V_BUILTIN_VEC_VEXTURX): Likewise.
	(P9V_BUILTIN_VEC_VEXTRACT4B): Add support for overloaded
	vec_insert4b and vec_extract4b built-in functions.
	(P9V_BUILTIN_VEC_VINSERT4B): Likewise.
	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
	overloaded support for vec_vinsert4b and vec_extract4b.
	* config/rs6000/rs6000.c (altivec_expand_builtin): Add checks for
	the vec_insert4b and vec_extract4b byte number being a constant in
	the range 0..11.
	* config/rs6000/vsx.md (UNSPEC_XXEXTRACTUW): New unspec.
	(UNSPEC_XXINSERTW): Likewise.
	(vextract4b): Add support for the vec_vextract4b built-in
	function.
	(vextract4b_internal): Likewise.
	(vinsert4b): Add support for the vec_insert4b built-in function.
	Include both a version that inserts element 1 from a V4SI object
	and one that inserts a DI object.
	(vinsert4b_internal): Likewise.
	(vinsert4b_di): Likewise.
	(vinsert4b_di_internal): Likewise.
	* config/rs6000/altivec.h (vec_vinsert4b): Support vec_vinsert4b
	and vec_extract4b built-in functions.
	* doc/extend.texi (PowerPC VSX built-in functions): Document
	vec_insert4b and vec_extract4b.

[gcc/testsuite]
2016-12-13  Michael Meissner

	* gcc.target/powerpc/p9-vinsert4b-1.c: New test.
	* gcc.target/powerpc/p9-vinsert4b-2.c: Likewise.

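As a rough illustration of what the two built-ins compute (this is not part
of the patch), the portable C sketch below models a vector register as 16
bytes in the big endian byte numbering that XXEXTRACTUW and XXINSERTW use.
The names vreg_model, model_extract4b and model_insert4b are hypothetical,
the 0..11 limit mirrors the range the patch accepts, and the model ignores
the little endian adjustments made by the expanders.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit vector register: b[0] is byte 0 in the
   big endian (ISA) byte numbering, i.e. the most significant byte.  */
typedef struct { uint8_t b[16]; } vreg_model;

/* Model of the extract: take bytes n..n+3 and return them as a
   zero-extended 32-bit value, most significant byte first.  */
static uint64_t
model_extract4b (const vreg_model *v, int n)
{
  assert (n >= 0 && n <= 11);	/* range accepted by the patch */
  return ((uint64_t) v->b[n] << 24)
	 | ((uint64_t) v->b[n + 1] << 16)
	 | ((uint64_t) v->b[n + 2] << 8)
	 | (uint64_t) v->b[n + 3];
}

/* Model of the insert: overwrite bytes n..n+3 of V with the 32-bit WORD,
   most significant byte first, and return the updated register.  */
static vreg_model
model_insert4b (uint32_t word, vreg_model v, int n)
{
  assert (n >= 0 && n <= 11);	/* range accepted by the patch */
  v.b[n] = (uint8_t) (word >> 24);
  v.b[n + 1] = (uint8_t) (word >> 16);
  v.b[n + 2] = (uint8_t) (word >> 8);
  v.b[n + 3] = (uint8_t) word;
  return v;
}
```

In this model, inserting a word at byte n and then extracting at the same n
returns the inserted word, and the other twelve bytes are left unchanged.
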
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 243590)
+++ gcc/config/rs6000/predicates.md	(.../gcc/config/rs6000)	(working copy)
@@ -210,6 +210,11 @@ (define_predicate "const_0_to_7_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
 
+;; Match op = 0..11
+(define_predicate "const_0_to_11_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 11)")))
+
 ;; Match op = 0..15
 (define_predicate "const_0_to_15_operand"
   (and (match_code "const_int")
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 243590)
+++ gcc/config/rs6000/rs6000-builtin.def	(.../gcc/config/rs6000)	(working copy)
@@ -877,7 +877,16 @@
                     "__builtin_vsx_" NAME,              /* NAME */      \
                     RS6000_BTM_P9_VECTOR,               /* MASK */      \
                     (RS6000_BTC_ ## ATTR                /* ATTR */      \
-                     | RS6000_BTC_BINARY),                              \
+                     | RS6000_BTC_TERNARY),                             \
+                    CODE_FOR_ ## ICODE)                 /* ICODE */
+
+#define BU_P9V_64BIT_VSX_3(ENUM, NAME, ATTR, ICODE)                     \
+  RS6000_BUILTIN_2 (P9V_BUILTIN_ ## ENUM,               /* ENUM */      \
+                    "__builtin_vsx_" NAME,              /* NAME */      \
+                    (RS6000_BTM_64BIT                                   \
+                     | RS6000_BTM_P9_VECTOR),           /* MASK */      \
+                    (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                     | RS6000_BTC_TERNARY),                             \
                     CODE_FOR_ ## ICODE)                 /* ICODE */
 
 /* See the comment on BU_ALTIVEC_P.  */
@@ -1967,6 +1976,11 @@ BU_P9V_AV_2 (VEXTUHRX, "vextuhrx", CONS
 BU_P9V_AV_2 (VEXTUWLX,          "vextuwlx",             CONST,  vextuwlx)
 BU_P9V_AV_2 (VEXTUWRX,          "vextuwrx",             CONST,  vextuwrx)
 
+/* Insert/extract 4 byte word into a vector.  */
+BU_P9V_VSX_2 (VEXTRACT4B,       "vextract4b",           CONST,  vextract4b)
+BU_P9V_VSX_3 (VINSERT4B,        "vinsert4b",            CONST,  vinsert4b)
+BU_P9V_VSX_3 (VINSERT4B_DI,     "vinsert4b_di",         CONST,  vinsert4b_di)
+
 /* 3 argument vector functions returning void, treated as SPECIAL,
    added in ISA 3.0 (power9).  */
 BU_P9V_64BIT_AV_X (STXVL,       "stxvl",        MISC)
@@ -2008,12 +2022,13 @@ BU_P9V_AV_P (VCMPNEZW_P, "vcmpnezw_p", C
 
 /* ISA 3.0 Vector scalar overloaded 2 argument functions */
 BU_P9V_OVERLOAD_2 (LXVL,        "lxvl")
+BU_P9V_OVERLOAD_2 (VEXTULX,     "vextulx")
+BU_P9V_OVERLOAD_2 (VEXTURX,     "vexturx")
+BU_P9V_OVERLOAD_2 (VEXTRACT4B,  "vextract4b")
 
 /* ISA 3.0 Vector scalar overloaded 3 argument functions */
 BU_P9V_OVERLOAD_3 (STXVL,       "stxvl")
-
-BU_P9V_OVERLOAD_2 (VEXTULX,     "vextulx")
-BU_P9V_OVERLOAD_2 (VEXTURX,     "vexturx")
+BU_P9V_OVERLOAD_3 (VINSERT4B,   "vinsert4b")
 
 /* Overloaded CMPNE support was implemented prior to Power 9,
    so is not mentioned here.  */
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 243590)
+++ gcc/config/rs6000/rs6000-c.c	(.../gcc/config/rs6000)	(working copy)
@@ -4682,6 +4682,11 @@ const struct altivec_builtin_types altiv
   { P9V_BUILTIN_VEC_VCTZLSBB, P9V_BUILTIN_VCTZLSBB,
     RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 },
 
+  { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
+    RS6000_BTI_INTDI, RS6000_BTI_V16QI, RS6000_BTI_UINTSI, 0 },
+  { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
+    RS6000_BTI_INTDI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI, 0 },
+
   { P9V_BUILTIN_VEC_VEXTULX, P9V_BUILTIN_VEXTUBLX,
     RS6000_BTI_INTQI, RS6000_BTI_UINTSI,
     RS6000_BTI_V16QI, 0 },
@@ -4735,6 +4740,28 @@ const struct altivec_builtin_types altiv
   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
 
+  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B,
+    RS6000_BTI_V16QI, RS6000_BTI_V4SI,
+    RS6000_BTI_V16QI, RS6000_BTI_UINTSI },
+  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B,
+    RS6000_BTI_V16QI, RS6000_BTI_unsigned_V4SI,
+    RS6000_BTI_V16QI, RS6000_BTI_UINTSI },
+  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V4SI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI },
+  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
+    RS6000_BTI_V16QI, RS6000_BTI_INTDI,
+    RS6000_BTI_V16QI, RS6000_BTI_UINTDI },
+  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
+    RS6000_BTI_V16QI, RS6000_BTI_UINTDI,
+    RS6000_BTI_V16QI, RS6000_BTI_UINTDI },
+  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTDI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI },
+  { P9V_BUILTIN_VEC_VINSERT4B, P9V_BUILTIN_VINSERT4B_DI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI },
+
   { P8V_BUILTIN_VEC_VADDECUQ, P8V_BUILTIN_VADDECUQ,
     RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, RS6000_BTI_V1TI },
   { P8V_BUILTIN_VEC_VADDECUQ, P8V_BUILTIN_VADDECUQ,
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 243590)
+++ gcc/config/rs6000/rs6000.c	(.../gcc/config/rs6000)	(working copy)
@@ -15546,7 +15546,7 @@ altivec_expand_builtin (tree exp, rtx ta
   size_t i;
   enum insn_code icode;
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
-  tree arg0;
+  tree arg0, arg1, arg2;
   rtx op0, pat;
   machine_mode tmode, mode0;
   enum rs6000_builtins fcode
@@ -15766,6 +15766,40 @@ altivec_expand_builtin (tree exp, rtx ta
     case VSX_BUILTIN_VEC_EXT_V1TI:
       return altivec_expand_vec_ext_builtin (exp, target);
 
+    case P9V_BUILTIN_VEXTRACT4B:
+    case P9V_BUILTIN_VEC_VEXTRACT4B:
+      arg1 = CALL_EXPR_ARG (exp, 1);
+      STRIP_NOPS (arg1);
+
+      /* Generate a normal call if it is invalid.  */
+      /* If we got invalid arguments bail out before generating bad rtl.  */
+      if (arg1 == error_mark_node)
+        return expand_call (exp, target, false);
+
+      if (TREE_CODE (arg1) != INTEGER_CST || TREE_INT_CST_LOW (arg1) > 11)
+        {
+          error ("second argument to vec_vextract4b must be 0..11");
+          return expand_call (exp, target, false);
+        }
+      break;
+
+    case P9V_BUILTIN_VINSERT4B:
+    case P9V_BUILTIN_VINSERT4B_DI:
+    case P9V_BUILTIN_VEC_VINSERT4B:
+      arg2 = CALL_EXPR_ARG (exp, 2);
+      STRIP_NOPS (arg2);
+
+      /* If we got invalid arguments bail out before generating bad rtl.  */
+      if (arg2 == error_mark_node)
+        return expand_call (exp, target, false);
+
+      if (TREE_CODE (arg2) != INTEGER_CST || TREE_INT_CST_LOW (arg2) > 11)
+        {
+          error ("third argument to vec_vinsert4b must be 0..11");
+          return expand_call (exp, target, false);
+        }
+      break;
+
     default:
       break;
       /* Fall through.  */
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 243590)
+++ gcc/config/rs6000/vsx.md	(.../gcc/config/rs6000)	(working copy)
@@ -366,6 +366,8 @@ (define_c_enum "unspec"
    UNSPEC_VCMPNEZH
    UNSPEC_VCMPNEW
    UNSPEC_VCMPNEZW
+   UNSPEC_XXEXTRACTUW
+   UNSPEC_XXINSERTW
   ])
 
 ;; VSX moves
@@ -3686,3 +3688,94 @@ (define_insn "vextuwrx"
   "TARGET_P9_VECTOR"
   "vextuwrx %0,%1,%2"
   [(set_attr "type" "vecsimple")])
+
+;; Vector insert/extract word at arbitrary byte values.  Note, the little
+;; endian version needs to adjust the byte number, and the V4SI element in
+;; vinsert4b.
+(define_expand "vextract4b"
+  [(set (match_operand:DI 0 "gpc_reg_operand")
+        (unspec:DI [(match_operand:V16QI 1 "vsx_register_operand")
+                    (match_operand:QI 2 "const_0_to_11_operand")]
+                   UNSPEC_XXEXTRACTUW))]
+  "TARGET_P9_VECTOR"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (12 - INTVAL (operands[2]));
+})
+
+(define_insn_and_split "*vextract4b_internal"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=wj,r")
+        (unspec:DI [(match_operand:V16QI 1 "vsx_register_operand" "wa,v")
+                    (match_operand:QI 2 "const_0_to_11_operand" "n,n")]
+                   UNSPEC_XXEXTRACTUW))]
+  "TARGET_P9_VECTOR"
+  "@
+   xxextractuw %x0,%x1,%2
+   #"
+  "&& reload_completed && int_reg_operand (operands[0], DImode)"
+  [(const_int 0)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  rtx op2 = operands[2];
+  rtx op0_si = gen_rtx_REG (SImode, REGNO (op0));
+  rtx op1_v4si = gen_rtx_REG (V4SImode, REGNO (op1));
+
+  emit_move_insn (op0, op2);
+  if (VECTOR_ELT_ORDER_BIG)
+    emit_insn (gen_vextuwlx (op0_si, op0_si, op1_v4si));
+  else
+    emit_insn (gen_vextuwrx (op0_si, op0_si, op1_v4si));
+  DONE;
+}
+  [(set_attr "type" "vecperm")])
+
+(define_expand "vinsert4b"
+  [(set (match_operand:V16QI 0 "vsx_register_operand")
+        (unspec:V16QI [(match_operand:V4SI 1 "vsx_register_operand")
+                       (match_operand:V16QI 2 "vsx_register_operand")
+                       (match_operand:QI 3 "const_0_to_11_operand")]
+                      UNSPEC_XXINSERTW))]
+  "TARGET_P9_VECTOR"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    {
+      rtx op1 = operands[1];
+      rtx v4si_tmp = gen_reg_rtx (V4SImode);
+      emit_insn (gen_vsx_xxpermdi_v4si (v4si_tmp, op1, op1, const1_rtx));
+      operands[1] = v4si_tmp;
+      operands[3] = GEN_INT (12 - INTVAL (operands[3]));
+    }
+})
+
+(define_insn "*vinsert4b_internal"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
+        (unspec:V16QI [(match_operand:V4SI 1 "vsx_register_operand" "wa")
+                       (match_operand:V16QI 2 "vsx_register_operand" "0")
+                       (match_operand:QI 3 "const_0_to_11_operand" "n")]
+                      UNSPEC_XXINSERTW))]
+  "TARGET_P9_VECTOR"
+  "xxinsertw %x0,%x1,%3"
+  [(set_attr "type" "vecperm")])
+
+(define_expand "vinsert4b_di"
+  [(set (match_operand:V16QI 0 "vsx_register_operand")
+        (unspec:V16QI [(match_operand:DI 1 "vsx_register_operand")
+                       (match_operand:V16QI 2 "vsx_register_operand")
+                       (match_operand:QI 3 "const_0_to_11_operand")]
+                      UNSPEC_XXINSERTW))]
+  "TARGET_P9_VECTOR"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    operands[3] = GEN_INT (12 - INTVAL (operands[3]));
+})
+
+(define_insn "*vinsert4b_di_internal"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
+        (unspec:V16QI [(match_operand:DI 1 "vsx_register_operand" "wj")
+                       (match_operand:V16QI 2 "vsx_register_operand" "0")
+                       (match_operand:QI 3 "const_0_to_11_operand" "n")]
+                      UNSPEC_XXINSERTW))]
+  "TARGET_P9_VECTOR"
+  "xxinsertw %x0,%x1,%3"
+  [(set_attr "type" "vecperm")])
Index: gcc/config/rs6000/altivec.h
===================================================================
--- gcc/config/rs6000/altivec.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 243590)
+++ gcc/config/rs6000/altivec.h	(.../gcc/config/rs6000)	(working copy)
@@ -394,6 +394,8 @@
 #define vec_vctzd __builtin_vec_vctzd
 #define vec_vctzh __builtin_vec_vctzh
 #define vec_vctzw __builtin_vec_vctzw
+#define vec_vextract4b __builtin_vec_vextract4b
+#define vec_vinsert4b __builtin_vec_vinsert4b
 #define vec_vprtyb __builtin_vec_vprtyb
 #define vec_vprtybd __builtin_vec_vprtybd
 #define vec_vprtybw __builtin_vec_vprtybw
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/doc)	(revision 243590)
+++ gcc/doc/extend.texi	(.../gcc/doc)	(working copy)
@@ -17988,6 +17988,15 @@ vector unsigned short vec_vctzh (vector
 vector int vec_vctzw (vector int);
 vector unsigned int vec_vctzw (vector int);
 
+long long vec_vextract4b (const vector signed char, const int);
+long long vec_vextract4b (const vector unsigned char, const int);
+
+vector signed char vec_insert4b (vector int, vector signed char, const int);
+vector unsigned char vec_insert4b (vector unsigned int, vector unsigned char,
+                                   const int);
+vector signed char vec_insert4b (long long, vector signed char, const int);
+vector unsigned char vec_insert4b (long long, vector unsigned char, const int);
+
 vector int vec_vprtyb (vector int);
 vector unsigned int vec_vprtyb (vector unsigned int);
 vector long long vec_vprtyb (vector long long);
Index: gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-1.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 243618)
@@ -0,0 +1,39 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+#include <altivec.h>
+
+vector signed char
+vins_v4si (vector int *vi, vector signed char *vc)
+{
+  return vec_vinsert4b (*vi, *vc, 1);
+}
+
+vector unsigned char
+vins_di (long di, vector unsigned char *vc)
+{
+  return vec_vinsert4b (di, *vc, 2);
+}
+
+vector char
+vins_di2 (long *p_di, vector char *vc)
+{
+  return vec_vinsert4b (*p_di, *vc, 3);
+}
+
+vector unsigned char
+vins_di0 (vector unsigned char *vc)
+{
+  return vec_vinsert4b (0, *vc, 4);
+}
+
+long
+vext (vector signed char *vc)
+{
+  return vec_vextract4b (*vc, 5);
+}
+
+/* { dg-final { scan-assembler "xxextractuw\|vextuw\[lr\]x" } } */
+/* { dg-final { scan-assembler "xxinsertw" } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-vinsert4b-2.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 243618)
@@ -0,0 +1,30 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+#include <altivec.h>
+
+vector signed char
+ins_v4si (vector int vi, vector signed char vc)
+{
+  return vec_vinsert4b (vi, vc, 12);	/* { dg-error "vec_vinsert4b" } */
+}
+
+vector unsigned char
+ins_di (long di, vector unsigned char vc, long n)
+{
+  return vec_vinsert4b (di, vc, n);	/* { dg-error "vec_vinsert4b" } */
+}
+
+long
+vext1 (vector signed char vc)
+{
+  return vec_vextract4b (vc, 12);	/* { dg-error "vec_vextract4b" } */
+}
+
+long
+vextn (vector unsigned char vc, long n)
+{
+  return vec_vextract4b (vc, n);	/* { dg-error "vec_vextract4b" } */
+}
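
The errors that p9-vinsert4b-2.c expects for an index of 12 come from the
0..11 check in altivec_expand_builtin, and the expanders in vsx.md remap the
index to 12 - n when the vector element order is little endian.  A minimal
sketch of that arithmetic in plain C follows.  The helper names
byte_index_ok and hw_byte_index are hypothetical and do not appear in the
patch, and a run-time model like this cannot express the additional
requirement that the index be a compile-time constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: the range test applied to the byte-number
   argument (the patch accepts 0..11 only).  */
static bool
byte_index_ok (long n)
{
  return n >= 0 && n <= 11;
}

/* Hypothetical helper: the byte number handed to the instruction.
   ELT_ORDER_BIG stands in for the VECTOR_ELT_ORDER_BIG test in the
   expanders; with little endian element order they use 12 - n.  */
static int
hw_byte_index (int n, bool elt_order_big)
{
  assert (byte_index_ok (n));
  return elt_order_big ? n : 12 - n;
}
```

For example, a source-level index of 5 stays 5 with big endian element
order and becomes 7 with little endian element order.
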