From patchwork Thu Jun 18 22:20:01 2020
X-Patchwork-Submitter: Carl Love
X-Patchwork-Id: 1312426
Message-ID: <93a6f2f9341ea239c6d7076a9a3e5a9c6e76e963.camel@us.ibm.com>
Subject: [PATCH 1/6 ver 3] rs6000, Update support for vec_extract
To: segher@gcc.gnu.org, dje.gcc@gmail.com, gcc-patches@gcc.gnu.org
Date: Thu, 18 Jun 2020 15:20:01 -0700
From: Carl Love

V3 changes
Redo ChangeLog for code move.
Replace spaces with tabs in ChangeLog.
Replaced instruction names using * with the actual list of names. For example, vextdu*vrx is replaced with the explicit instruction names vextdubvrx, vextduhvrx, etc.
-------------------------
v2 changes
config/rs6000/altivec.md log entry for move from changed as suggested.
config/rs6000/vsx.md log entry for moved to here changed as suggested.
define_mode_iterator VI2 also moved, included in both change log entries
--------------------------------------------

GCC maintainers:

Move the existing vector extract support in altivec.md to vsx.md so all of the vector insert and extract support is in the same file.

The patch also updates the name of the builtins and descriptions for the builtins in the documentation file so they match the approved builtin names and descriptions.

The patch does not make any functional changes.

Please let me know if the changes are acceptable for mainline. Thanks.

                  Carl Love

------------------------------------------------------

gcc/ChangeLog

2020-06-18  Carl Love

	* config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
	(vextractl, vextractr)
	(vextractl_internal, vextractr_internal)
	(VI2): Move to ...
	* config/rs6000/vsx.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
	(vextractl, vextractr)
	(vextractl_internal, vextractr_internal)
	(VI2): ..here.
	* gcc/doc/extend.texi: Update documentation for vec_extractl.
	Replace builtin name vec_extractr with vec_extracth.  Update
	description of vec_extracth.
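
For reviewers who want a quick mental model of the updated vec_extractl description in extend.texi, the following is a minimal scalar sketch in portable C. It is not part of the patch and is not how GCC implements the builtin. The function name model_vec_extractl, the explicit elem_size parameter, and the assumption that the first argument supplies bytes 0..15 of the concatenation are all illustrative. It models only the extracted, zero-extended value, not which doubleword of the result vector receives it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the documented vec_extractl semantics.
   This is NOT the GCC implementation; the name, the elem_size parameter
   and the assumption that A supplies bytes 0..15 of the concatenation
   are illustrative only.  */
uint64_t
model_vec_extractl (const uint8_t a[16], const uint8_t b[16],
		    unsigned int byte_index, unsigned int elem_size)
{
  uint8_t cat[32];
  uint16_t half;
  uint32_t word;
  uint64_t result = 0;

  /* The documentation says out-of-range indexes are rejected; the model
     simply asserts that the element fits inside the 32-byte image.  */
  assert (elem_size == 1 || elem_size == 2
	  || elem_size == 4 || elem_size == 8);
  assert (byte_index <= 32 - elem_size);

  /* Concatenate the two vectors as a 32-byte memory image.  */
  memcpy (cat, a, 16);
  memcpy (cat + 16, b, 16);

  /* Read the element in host ("natural") byte order and zero-extend.  */
  switch (elem_size)
    {
    case 1:
      result = cat[byte_index];
      break;
    case 2:
      memcpy (&half, cat + byte_index, sizeof half);
      result = half;
      break;
    case 4:
      memcpy (&word, cat + byte_index, sizeof word);
      result = word;
      break;
    default:
      memcpy (&result, cat + byte_index, sizeof result);
      break;
    }
  return result;
}
```

With byte elements, an index of 16 or more in this model selects a byte from the second vector. An index that is not a multiple of the element size reads across an element boundary, which is the case the documentation warns can give anomalous results.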
--- gcc/config/rs6000/altivec.md | 64 ------------------------------ gcc/config/rs6000/vsx.md | 66 +++++++++++++++++++++++++++++++ gcc/doc/extend.texi | 77 ++++++++++++++++++------------------ 3 files changed, 105 insertions(+), 102 deletions(-) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 159f24ebc10..0b0b49ee056 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -171,8 +171,6 @@ UNSPEC_XXEVAL UNSPEC_VSTRIR UNSPEC_VSTRIL - UNSPEC_EXTRACTL - UNSPEC_EXTRACTR ]) (define_c_enum "unspecv" @@ -183,8 +181,6 @@ UNSPECV_DSS ]) -;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops -(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI]) ;; Short vec int modes (define_mode_iterator VIshort [V8HI V16QI]) ;; Longer vec int modes for rotate/mask ops @@ -785,66 +781,6 @@ DONE; }) -(define_expand "vextractl" - [(set (match_operand:V2DI 0 "altivec_register_operand") - (unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand") - (match_operand:VI2 2 "altivec_register_operand") - (match_operand:SI 3 "register_operand")] - UNSPEC_EXTRACTL))] - "TARGET_FUTURE" -{ - if (BYTES_BIG_ENDIAN) - { - emit_insn (gen_vextractl_internal (operands[0], operands[1], - operands[2], operands[3])); - emit_insn (gen_xxswapd_v2di (operands[0], operands[0])); - } - else - emit_insn (gen_vextractr_internal (operands[0], operands[2], - operands[1], operands[3])); - DONE; -}) - -(define_insn "vextractl_internal" - [(set (match_operand:V2DI 0 "altivec_register_operand" "=v") - (unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v") - (match_operand:VEC_I 2 "altivec_register_operand" "v") - (match_operand:SI 3 "register_operand" "r")] - UNSPEC_EXTRACTL))] - "TARGET_FUTURE" - "vextvlx %0,%1,%2,%3" - [(set_attr "type" "vecsimple")]) - -(define_expand "vextractr" - [(set (match_operand:V2DI 0 "altivec_register_operand") - (unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand") - (match_operand:VI2 2 
"altivec_register_operand") - (match_operand:SI 3 "register_operand")] - UNSPEC_EXTRACTR))] - "TARGET_FUTURE" -{ - if (BYTES_BIG_ENDIAN) - { - emit_insn (gen_vextractr_internal (operands[0], operands[1], - operands[2], operands[3])); - emit_insn (gen_xxswapd_v2di (operands[0], operands[0])); - } - else - emit_insn (gen_vextractl_internal (operands[0], operands[2], - operands[1], operands[3])); - DONE; -}) - -(define_insn "vextractr_internal" - [(set (match_operand:V2DI 0 "altivec_register_operand" "=v") - (unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v") - (match_operand:VEC_I 2 "altivec_register_operand" "v") - (match_operand:SI 3 "register_operand" "r")] - UNSPEC_EXTRACTR))] - "TARGET_FUTURE" - "vextvrx %0,%1,%2,%3" - [(set_attr "type" "vecsimple")]) - (define_expand "vstrir_" [(set (match_operand:VIshort 0 "altivec_register_operand") (unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")] diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 2a28215ac5b..51ffe2d2000 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -344,8 +344,13 @@ UNSPEC_VSX_FIRST_MISMATCH_INDEX UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX UNSPEC_XXGENPCV + UNSPEC_EXTRACTL + UNSPEC_EXTRACTR ]) +;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops +(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI]) + ;; VSX moves ;; The patterns for LE permuted loads and stores come before the general @@ -3781,6 +3786,67 @@ } [(set_attr "type" "load")]) +;; ISA 3.1 extract +(define_expand "vextractl" + [(set (match_operand:V2DI 0 "altivec_register_operand") + (unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand") + (match_operand:VI2 2 "altivec_register_operand") + (match_operand:SI 3 "register_operand")] + UNSPEC_EXTRACTL))] + "TARGET_FUTURE" +{ + if (BYTES_BIG_ENDIAN) + { + emit_insn (gen_vextractl_internal (operands[0], operands[1], + operands[2], operands[3])); + emit_insn (gen_xxswapd_v2di (operands[0], operands[0])); + 
} + else + emit_insn (gen_vextractr_internal (operands[0], operands[2], + operands[1], operands[3])); + DONE; +}) + +(define_insn "vextractl_internal" + [(set (match_operand:V2DI 0 "altivec_register_operand" "=v") + (unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v") + (match_operand:VEC_I 2 "altivec_register_operand" "v") + (match_operand:SI 3 "register_operand" "r")] + UNSPEC_EXTRACTL))] + "TARGET_FUTURE" + "vextvlx %0,%1,%2,%3" + [(set_attr "type" "vecsimple")]) + +(define_expand "vextractr" + [(set (match_operand:V2DI 0 "altivec_register_operand") + (unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand") + (match_operand:VI2 2 "altivec_register_operand") + (match_operand:SI 3 "register_operand")] + UNSPEC_EXTRACTR))] + "TARGET_FUTURE" +{ + if (BYTES_BIG_ENDIAN) + { + emit_insn (gen_vextractr_internal (operands[0], operands[1], + operands[2], operands[3])); + emit_insn (gen_xxswapd_v2di (operands[0], operands[0])); + } + else + emit_insn (gen_vextractl_internal (operands[0], operands[2], + operands[1], operands[3])); + DONE; +}) + +(define_insn "vextractr_internal" + [(set (match_operand:V2DI 0 "altivec_register_operand" "=v") + (unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v") + (match_operand:VEC_I 2 "altivec_register_operand" "v") + (match_operand:SI 3 "register_operand" "r")] + UNSPEC_EXTRACTR))] + "TARGET_FUTURE" + "vextvrx %0,%1,%2,%3" + [(set_attr "type" "vecsimple")]) + ;; VSX_EXTRACT optimizations ;; Optimize double d = (double) vec_extract (vi, ) ;; Get the element into the top position and use XVCVSWDP/XVCVUWDP diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index e656e66a80c..116ae4b8378 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -20919,6 +20919,9 @@ Perform a 128-bit vector gather operation, as if implemented by the Future integer value between 2 and 7 inclusive. 
@findex vec_gnb + +Vector Extract + @smallexample @exdent vector unsigned long long int @exdent vec_extractl (vector unsigned char, vector unsigned char, unsigned int) @@ -20929,51 +20932,49 @@ integer value between 2 and 7 inclusive. @exdent vector unsigned long long int @exdent vec_extractl (vector unsigned long long, vector unsigned long long, unsigned int) @end smallexample -Extract a single element from the vector formed by catenating this function's -first two arguments at the byte offset specified by this function's -third argument. On big-endian targets, this function behaves as if -implemented by the Future @code{vextdubvlx}, @code{vextduhvlx}, -@code{vextduwvlx}, or @code{vextddvlx} instructions, depending on the -types of the function's first two arguments. On little-endian -targets, this function behaves as if implemented by the Future -@code{vextdubvrx}, @code{vextduhvrx}, -@code{vextduwvrx}, or @code{vextddvrx} instructions. -The byte offset of the element to be extracted is calculated -by computing the remainder of dividing the third argument by 32. -If this reminader value is not a multiple of the vector element size, -or if its value added to the vector element size exceeds 32, the -result is undefined. +Extract an element from two concatenated vectors starting at the given byte index +in natural-endian order, and place it zero-extended in doubleword 1 of the result +according to natural element order. If the byte index is out of range for the +data type, the intrinsic will be rejected. +For little-endian, this output will match the placement by the hardware +instruction, i.e., dword[0] in RTL notation. For big-endian, an additional +instruction is needed to move it from the "left" doubleword to the "right" one. +For little-endian, semantics matching the vextdubvrx, vextduhvrx, +vextduwvrx instructions will be generated, while for big-endian, semantics +matching the vextdubvlx, vextduhvlx, vextduwvlx instructions +will be generated. 
Note that some fairly anomalous results can be generated if +the byte index is not aligned on an element boundary for the element being +extracted. This is a limitation of the bi-endian vector programming model +consistent with the limitation on vec_perm, for example. @findex vec_extractl @smallexample @exdent vector unsigned long long int -@exdent vec_extractr (vector unsigned char, vector unsigned char, unsigned int) +@exdent vec_extracth (vector unsigned char, vector unsigned char, unsigned int) @exdent vector unsigned long long int -@exdent vec_extractr (vector unsigned short, vector unsigned short, unsigned int) +@exdent vec_extracth (vector unsigned short, vector unsigned short, +unsigned int) @exdent vector unsigned long long int -@exdent vec_extractr (vector unsigned int, vector unsigned int, unsigned int) +@exdent vec_extracth (vector unsigned int, vector unsigned int, unsigned int) @exdent vector unsigned long long int -@exdent vec_extractr (vector unsigned long long, vector unsigned long long, unsigned int) -@end smallexample -Extract a single element from the vector formed by catenating this function's -first two arguments at the byte offset calculated by subtracting this -function's third argument from 31. On big-endian targets, this -function behaves as if -implemented by the Future -@code{vextdubvrx}, @code{vextduhvrx}, -@code{vextduwvrx}, or @code{vextddvrx} instructions, depending on the -types of the function's first two arguments. -On little-endian -targets, this function behaves as if implemented by the Future -@code{vextdubvlx}, @code{vextduhvlx}, -@code{vextduwvlx}, or @code{vextddvlx} instructions. -The byte offset of the element to be extracted, measured from the -right end of the catenation of the two vector arguments, is calculated -by computing the remainder of dividing the third argument by 32. 
-If this reminader value is not a multiple of the vector element size, -or if its value added to the vector element size exceeds 32, the -result is undefined. -@findex vec_extractr +@exdent vec_extracth (vector unsigned long long, vector unsigned long long, +unsigned int) +@end smallexample +Extract an element from two concatenated vectors starting at the given byte +index in opposite-endian order, and place it zero-extended in doubleword 1 +according to natural element order. If the byte index is out of range for the +data type, the intrinsic will be rejected. For little-endian, this output +will match the placement by the hardware instruction, i.e., dword[0] in RTL +notation. For big-endian, an additional instruction is needed to move it +from the "left" doubleword to the "right" one. For little-endian, semantics +matching the vextdubvlx, vextduhvlx, vextduwvlx instructions will be generated, +while for big-endian, semantics matching the vextdubvrx, vextduhvrx, +vextduwvrx instructions will be generated. Note that some fairly anomalous +results can be generated if the byte index is not aligned on the +element boundary for the element being extracted. This is a +limitation of the bi-endian vector programming model consistent with the +limitation on vec_perm, for example. 
+@findex vec_extracth
 @smallexample
 @exdent vector unsigned long long int

From patchwork Thu Jun 18 22:20:05 2020
X-Patchwork-Submitter: Carl Love
X-Patchwork-Id: 1312427
Message-ID: <6bdc912a7c0b6565923c3c4cc1162f818fbe77a6.camel@us.ibm.com>
Subject: [PATCH 2/6 ver 3] rs6000 Add vector insert builtin support
To: segher@gcc.gnu.org, dje.gcc@gmail.com, gcc-patches@gcc.gnu.org
Date: Thu, 18 Jun 2020 15:20:05 -0700
From: Carl Love

V3 changes
Replace spaces with tabs in ChangeLog.
Ditto in gcc/config/rs6000/vsx.md.
Updated description for vec_insertl() builtin.
Cleaned up vec_insert description.
-----------------------------------------------------------------
v2 changes
Fix change log entry for config/rs6000/altivec.h
Fix change log entry for config/rs6000/rs6000-builtin.def
Fix change log entry for config/rs6000/rs6000-call.c
vsx.md: Fixed if (BYTES_BIG_ENDIAN) else statements. Porting error from pu branch.
---------------------------------------------------------------

GCC maintainers:

This patch adds support for vec_insertl and vec_inserth builtins.

The patch has been compiled and tested on powerpc64le-unknown-linux-gnu (Power 9 LE) and mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline branch. Thanks.

                         Carl Love

--------------------------------------------------------------

gcc/ChangeLog

2020-06-18  Carl Love

	* config/rs6000/altivec.h (vec_insertl, vec_inserth): New defines.
	* config/rs6000/rs6000-builtin.def (VINSERTGPRBL, VINSERTGPRHL,
	VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL, VINSERTVPRHL, VINSERTVPRWL,
	VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR, VINSERTGPRDR, VINSERTVPRBR,
	VINSERTVPRHR, VINSERTVPRWR): New builtins.
	(INSERTL, INSERTH): New builtins.
	* config/rs6000/rs6000-call.c (FUTURE_BUILTIN_VEC_INSERTL,
	FUTURE_BUILTIN_VEC_INSERTH): New overloaded definitions.
	(FUTURE_BUILTIN_VINSERTGPRBL, FUTURE_BUILTIN_VINSERTGPRHL,
	FUTURE_BUILTIN_VINSERTGPRWL, FUTURE_BUILTIN_VINSERTGPRDL,
	FUTURE_BUILTIN_VINSERTVPRBL, FUTURE_BUILTIN_VINSERTVPRHL,
	FUTURE_BUILTIN_VINSERTVPRWL): Add case entries.
	* config/rs6000/vsx.md (define_c_enum): Add UNSPEC_INSERTL,
	UNSPEC_INSERTR.
	(define_expand): Add vinsertvl_, vinsertvr_, vinsertgl_,
	vinsertgr_, mode is VI2.
	(define_insn): Add vinsertvl_internal_, vinsertvr_internal_,
	vinsertgl_internal_, vinsertgr_internal_, mode VEC_I.
	* doc/extend.texi: Add documentation for vec_insertl, vec_inserth.

gcc/testsuite/ChangeLog

2020-06-18  Carl Love

	* gcc.target/powerpc/vec-insert-word-runnable.c: New test case.
--- gcc/config/rs6000/altivec.h | 2 + gcc/config/rs6000/rs6000-builtin.def | 18 + gcc/config/rs6000/rs6000-call.c | 51 +++ gcc/config/rs6000/vsx.md | 110 ++++++ gcc/doc/extend.texi | 71 ++++ .../powerpc/vec-insert-word-runnable.c | 345 ++++++++++++++++++ 6 files changed, 597 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h index 0a7e8ab3647..936aeb1ee09 100644 --- a/gcc/config/rs6000/altivec.h +++ b/gcc/config/rs6000/altivec.h @@ -699,6 +699,8 @@ __altivec_scalar_pred(vec_any_nle, /* Overloaded built-in functions for future architecture. */ #define vec_extractl(a, b, c) __builtin_vec_extractl (a, b, c) #define vec_extracth(a, b, c) __builtin_vec_extracth (a, b, c) +#define vec_insertl(a, b, c) __builtin_vec_insertl (a, b, c) +#define vec_inserth(a, b, c) __builtin_vec_inserth (a, b, c) #define vec_gnb(a, b) __builtin_vec_gnb (a, b) #define vec_clrl(a, b) __builtin_vec_clrl (a, b) diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 8b1ddb00045..c5bd4f86555 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -2627,6 +2627,22 @@ BU_FUTURE_V_3 (VEXTRACTHR, "vextduhvhx", CONST, vextractrv8hi) BU_FUTURE_V_3 (VEXTRACTWR, "vextduwvhx", CONST, vextractrv4si) BU_FUTURE_V_3 (VEXTRACTDR, "vextddvhx", CONST, vextractrv2di) +BU_FUTURE_V_3 (VINSERTGPRBL, "vinsgubvlx", CONST, vinsertgl_v16qi) +BU_FUTURE_V_3 (VINSERTGPRHL, "vinsguhvlx", CONST, vinsertgl_v8hi) +BU_FUTURE_V_3 (VINSERTGPRWL, "vinsguwvlx", CONST, vinsertgl_v4si) +BU_FUTURE_V_3 (VINSERTGPRDL, "vinsgudvlx", CONST, vinsertgl_v2di) +BU_FUTURE_V_3 (VINSERTVPRBL, "vinsvubvlx", CONST, vinsertvl_v16qi) +BU_FUTURE_V_3 (VINSERTVPRHL, "vinsvuhvlx", CONST, vinsertvl_v8hi) +BU_FUTURE_V_3 (VINSERTVPRWL, "vinsvuwvlx", CONST, vinsertvl_v4si) + +BU_FUTURE_V_3 (VINSERTGPRBR, "vinsgubvrx", CONST, vinsertgr_v16qi) +BU_FUTURE_V_3 
(VINSERTGPRHR, "vinsguhvrx", CONST, vinsertgr_v8hi) +BU_FUTURE_V_3 (VINSERTGPRWR, "vinsguwvrx", CONST, vinsertgr_v4si) +BU_FUTURE_V_3 (VINSERTGPRDR, "vinsgudvrx", CONST, vinsertgr_v2di) +BU_FUTURE_V_3 (VINSERTVPRBR, "vinsvubvrx", CONST, vinsertvr_v16qi) +BU_FUTURE_V_3 (VINSERTVPRHR, "vinsvuhvrx", CONST, vinsertvr_v8hi) +BU_FUTURE_V_3 (VINSERTVPRWR, "vinsvuwvrx", CONST, vinsertvr_v4si) + BU_FUTURE_V_1 (VSTRIBR, "vstribr", CONST, vstrir_v16qi) BU_FUTURE_V_1 (VSTRIHR, "vstrihr", CONST, vstrir_v8hi) BU_FUTURE_V_1 (VSTRIBL, "vstribl", CONST, vstril_v16qi) @@ -2646,6 +2662,8 @@ BU_FUTURE_OVERLOAD_2 (XXGENPCVM, "xxgenpcvm") BU_FUTURE_OVERLOAD_3 (EXTRACTL, "extractl") BU_FUTURE_OVERLOAD_3 (EXTRACTH, "extracth") +BU_FUTURE_OVERLOAD_3 (INSERTL, "insertl") +BU_FUTURE_OVERLOAD_3 (INSERTH, "inserth") BU_FUTURE_OVERLOAD_1 (VSTRIR, "strir") BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril") diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 817a14c9c0d..abbe00030ea 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -5567,6 +5567,28 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_INSERTL, FUTURE_BUILTIN_VINSERTGPRBL, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_INSERTL, FUTURE_BUILTIN_VINSERTGPRHL, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTHI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_INSERTL, FUTURE_BUILTIN_VINSERTGPRWL, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTSI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_INSERTL, FUTURE_BUILTIN_VINSERTGPRDL, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTDI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_INSERTL, FUTURE_BUILTIN_VINSERTVPRBL, + RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_unsigned_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_INSERTL, FUTURE_BUILTIN_VINSERTVPRHL, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_INSERTL, FUTURE_BUILTIN_VINSERTVPRWL, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_EXTRACTH, FUTURE_BUILTIN_VEXTRACTBR, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI }, @@ -5580,6 +5602,28 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_INSERTH, FUTURE_BUILTIN_VINSERTGPRBR, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_INSERTH, FUTURE_BUILTIN_VINSERTGPRHR, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTHI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_INSERTH, FUTURE_BUILTIN_VINSERTGPRWR, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTSI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_INSERTH, FUTURE_BUILTIN_VINSERTGPRDR, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTDI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_INSERTH, FUTURE_BUILTIN_VINSERTVPRBR, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_INSERTH, FUTURE_BUILTIN_VINSERTVPRHR, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_INSERTH, FUTURE_BUILTIN_VINSERTVPRWR, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_VSTRIL, FUTURE_BUILTIN_VSTRIBL, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 }, 
{ FUTURE_BUILTIN_VEC_VSTRIL, FUTURE_BUILTIN_VSTRIBL, @@ -13291,6 +13335,13 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case FUTURE_BUILTIN_VEXTRACTHR: case FUTURE_BUILTIN_VEXTRACTWR: case FUTURE_BUILTIN_VEXTRACTDR: + case FUTURE_BUILTIN_VINSERTGPRBL: + case FUTURE_BUILTIN_VINSERTGPRHL: + case FUTURE_BUILTIN_VINSERTGPRWL: + case FUTURE_BUILTIN_VINSERTGPRDL: + case FUTURE_BUILTIN_VINSERTVPRBL: + case FUTURE_BUILTIN_VINSERTVPRHL: + case FUTURE_BUILTIN_VINSERTVPRWL: h.uns_p[0] = 1; h.uns_p[1] = 1; h.uns_p[2] = 1; diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 51ffe2d2000..358898a03fb 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -346,6 +346,8 @@ UNSPEC_XXGENPCV UNSPEC_EXTRACTL UNSPEC_EXTRACTR + UNSPEC_INSERTL + UNSPEC_INSERTR ]) ;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops @@ -3847,6 +3849,114 @@ "vextvrx %0,%1,%2,%3" [(set_attr "type" "vecsimple")]) +(define_expand "vinsertvl_" + [(set (match_operand:VI2 0 "altivec_register_operand") + (unspec:VI2 [(match_operand:VI2 1 "altivec_register_operand") + (match_operand:VI2 2 "altivec_register_operand") + (match_operand:SI 3 "register_operand" "r")] + UNSPEC_INSERTL))] + "TARGET_FUTURE" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_vinsertvl_internal_ (operands[0], operands[3], + operands[1], operands[2])); + else + emit_insn (gen_vinsertvr_internal_ (operands[0], operands[3], + operands[1], operands[2])); + DONE; +}) + +(define_insn "vinsertvl_internal_" + [(set (match_operand:VEC_I 0 "altivec_register_operand" "=v") + (unspec:VEC_I [(match_operand:SI 1 "register_operand" "r") + (match_operand:VEC_I 2 "altivec_register_operand" "v") + (match_operand:VEC_I 3 "altivec_register_operand" "0")] + UNSPEC_INSERTL))] + "TARGET_FUTURE" + "vinsvlx %0,%1,%2" + [(set_attr "type" "vecsimple")]) + +(define_expand "vinsertvr_" + [(set (match_operand:VI2 0 "altivec_register_operand") + (unspec:VI2 [(match_operand:VI2 1 
"altivec_register_operand") + (match_operand:VI2 2 "altivec_register_operand") + (match_operand:SI 3 "register_operand" "r")] + UNSPEC_INSERTR))] + "TARGET_FUTURE" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_vinsertvr_internal_ (operands[0], operands[3], + operands[1], operands[2])); + else + emit_insn (gen_vinsertvl_internal_ (operands[0], operands[3], + operands[1], operands[2])); + DONE; +}) + +(define_insn "vinsertvr_internal_" + [(set (match_operand:VEC_I 0 "altivec_register_operand" "=v") + (unspec:VEC_I [(match_operand:SI 1 "register_operand" "r") + (match_operand:VEC_I 2 "altivec_register_operand" "v") + (match_operand:VEC_I 3 "altivec_register_operand" "0")] + UNSPEC_INSERTR))] + "TARGET_FUTURE" + "vinsvrx %0,%1,%2" + [(set_attr "type" "vecsimple")]) + +(define_expand "vinsertgl_" + [(set (match_operand:VI2 0 "altivec_register_operand") + (unspec:VI2 [(match_operand:SI 1 "register_operand") + (match_operand:VI2 2 "altivec_register_operand") + (match_operand:SI 3 "register_operand")] + UNSPEC_INSERTL))] + "TARGET_FUTURE" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_vinsertgl_internal_ (operands[0], operands[3], + operands[1], operands[2])); + else + emit_insn (gen_vinsertgr_internal_ (operands[0], operands[3], + operands[1], operands[2])); + DONE; + }) + +(define_insn "vinsertgl_internal_" + [(set (match_operand:VEC_I 0 "altivec_register_operand" "=v") + (unspec:VEC_I [(match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "r") + (match_operand:VEC_I 3 "altivec_register_operand" "0")] + UNSPEC_INSERTL))] + "TARGET_FUTURE" + "vinslx %0,%1,%2" + [(set_attr "type" "vecsimple")]) + +(define_expand "vinsertgr_" + [(set (match_operand:VI2 0 "altivec_register_operand") + (unspec:VI2 [(match_operand:SI 1 "register_operand") + (match_operand:VI2 2 "altivec_register_operand") + (match_operand:SI 3 "register_operand")] + UNSPEC_INSERTR))] + "TARGET_FUTURE" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_vinsertgr_internal_ (operands[0], 
operands[3], + operands[1], operands[2])); + else + emit_insn (gen_vinsertgl_internal_ (operands[0], operands[3], + operands[1], operands[2])); + DONE; + }) + +(define_insn "vinsertgr_internal_" + [(set (match_operand:VEC_I 0 "altivec_register_operand" "=v") + (unspec:VEC_I [(match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "r") + (match_operand:VEC_I 3 "altivec_register_operand" "0")] + UNSPEC_INSERTR))] + "TARGET_FUTURE" + "vinsrx %0,%1,%2" + [(set_attr "type" "vecsimple")]) + ;; VSX_EXTRACT optimizations ;; Optimize double d = (double) vec_extract (vi, ) ;; Get the element into the top position and use XVCVSWDP/XVCVUWDP diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 116ae4b8378..0a970299b34 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -20976,6 +20976,77 @@ limitation of the bi-endian vector programming model consistent with the limitation on vec_perm, for example. @findex vec_extracth +Vector Insert + +@smallexample +@exdent vector unsigned char +@exdent vec_insertl (unsigned char, vector unsigned char, unsigned int); +@exdent vector unsigned short +@exdent vec_insertl (unsigned short, vector unsigned short, unsigned int); +@exdent vector unsigned int +@exdent vec_insertl (unsigned int, vector unsigned int, unsigned int); +@exdent vector unsigned long long +@exdent vec_insertl (unsigned long long, vector unsigned long long, +unsigned int); +@exdent vector unsigned char +@exdent vec_insertl (vector unsigned char, vector unsigned char, unsigned int); +@exdent vector unsigned short +@exdent vec_insertl (vector unsigned short, vector unsigned short, +unsigned int); +@exdent vector unsigned int +@exdent vec_insertl (vector unsigned int, vector unsigned int, unsigned int); +@end smallexample + +Let src be the first argument, when the first argument is a scalar, or the +rightmost element of the left doubleword of the first argument, when the first +argument is a vector.
Insert the source into the destination at the position +given by the third argument, using natural element order in the second +argument. The rest of the second argument is unchanged. If the byte +index is greater than 14 for halfwords, greater than 12 for words, or +greater than 8 for doublewords, the result is undefined. For little-endian, +the generated code will be semantically equivalent to vinsbrx, vinshrx, +or vinswrx instructions. Similarly, for big-endian it will be semantically +equivalent to vinsblx, vinshlx, or vinswlx. Note that some +fairly anomalous results can be generated if the byte index is not aligned +on an element boundary for the sort of element being inserted. This is a +limitation of the bi-endian vector programming model. +@findex vec_insertl + +@smallexample +@exdent vector unsigned char +@exdent vec_inserth (unsigned char, vector unsigned char, unsigned int); +@exdent vector unsigned short +@exdent vec_inserth (unsigned short, vector unsigned short, unsigned int); +@exdent vector unsigned int +@exdent vec_inserth (unsigned int, vector unsigned int, unsigned int); +@exdent vector unsigned long long +@exdent vec_inserth (unsigned long long, vector unsigned long long, +unsigned int); +@exdent vector unsigned char +@exdent vec_inserth (vector unsigned char, vector unsigned char, unsigned int); +@exdent vector unsigned short +@exdent vec_inserth (vector unsigned short, vector unsigned short, +unsigned int); +@exdent vector unsigned int +@exdent vec_inserth (vector unsigned int, vector unsigned int, unsigned int); +@end smallexample + +Let src be the first argument, when the first argument is a scalar, or the +rightmost element of the first argument, when the first argument is a vector. +Insert src into the second argument at the position identified by the third +argument, using opposite element order in the second argument, and leaving the +rest of the second argument unchanged.
If the byte index is greater than 14 +for halfwords, 12 for words, or 8 for doublewords, the intrinsic will be +rejected. Note that the underlying hardware instruction uses the same register +for the second argument and the result, but this is hidden by the built-in. +For little-endian, the code generation will be semantically equivalent to +vins*lx, while for big-endian it will be semantically equivalent to vins*rx. +Note that some fairly anomalous results can be generated if the byte index is +not aligned on an element boundary for the sort of element being inserted. +This is a limitation of the bi-endian vector programming model consistent with +the limitation on vec_perm, for example. +@findex vec_inserth + @smallexample @exdent vector unsigned long long int @exdent vec_pdep (vector unsigned long long int, vector unsigned long long int) diff --git a/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c new file mode 100644 index 00000000000..3fc68e9d7c7 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-insert-word-runnable.c @@ -0,0 +1,345 @@ +/* { dg-do run } */ +/* { dg-require-effective-target powerpc_future_hw } */ +/* { dg-options "-mdejagnu-cpu=future" } */ +#include <altivec.h> + +#define DEBUG 1 + +#ifdef DEBUG +#include <stdio.h> +#endif + +extern void abort (void); + +int +main (int argc, char *argv []) +{ + int i; + unsigned int index; + vector unsigned char vresult_ch; + vector unsigned char expected_vresult_ch; + vector unsigned char src_va_ch; + vector unsigned char src_vb_ch; + unsigned char src_a_ch; + + vector unsigned short vresult_sh; + vector unsigned short expected_vresult_sh; + vector unsigned short src_va_sh; + vector unsigned short src_vb_sh; + unsigned short int src_a_sh; + + vector unsigned int vresult_int; + vector unsigned int expected_vresult_int; + vector unsigned int src_va_int; + vector unsigned int src_vb_int; + unsigned int src_a_int; + + vector unsigned long long vresult_ll;
+ vector unsigned long long expected_vresult_ll; + vector unsigned long long src_va_ll; + unsigned long long int src_a_ll; + + /* Vector insert, low index, from GPR */ + src_a_ch = 79; + index = 2; + src_va_ch = (vector unsigned char) { 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14, 15 }; + vresult_ch = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_ch = (vector unsigned char) { 0, 1, 79, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14, 15 }; + + vresult_ch = vec_insertl (src_a_ch, src_va_ch, index); + + if (!vec_all_eq (vresult_ch, expected_vresult_ch)) { +#if DEBUG + printf("ERROR, vec_insertl (src_a_ch, src_va_ch, index)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_ch[%d] = %d, expected_vresult_ch[%d] = %d\n", + i, vresult_ch[i], i, expected_vresult_ch[i]); +#else + abort(); +#endif + } + + src_a_sh = 79; + index = 10; + src_va_sh = (vector unsigned short int) { 0, 1, 2, 3, 4, 5, 6, 7 }; + vresult_sh = (vector unsigned short int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_sh = (vector unsigned short int) { 0, 1, 2, 3, + 4, 79, 6, 7 }; + + vresult_sh = vec_insertl (src_a_sh, src_va_sh, index); + + if (!vec_all_eq (vresult_sh, expected_vresult_sh)) { +#if DEBUG + printf("ERROR, vec_insertl (src_a_sh, src_va_sh, index)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_sh[%d] = %d, expected_vresult_sh[%d] = %d\n", + i, vresult_sh[i], i, expected_vresult_sh[i]); +#else + abort(); +#endif + } + + src_a_int = 79; + index = 8; + src_va_int = (vector unsigned int) { 0, 1, 2, 3 }; + vresult_int = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector unsigned int) { 0, 1, 79, 3 }; + + vresult_int = vec_insertl (src_a_int, src_va_int, index); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_insertl (src_a_int, src_va_int, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, 
expected_vresult_int[i]); +#else + abort(); +#endif + } + + src_a_ll = 79; + index = 8; + src_va_ll = (vector unsigned long long) { 0, 1 }; + vresult_ll = (vector unsigned long long) { 0, 0 }; + expected_vresult_ll = (vector unsigned long long) { 0, 79 }; + + vresult_ll = vec_insertl (src_a_ll, src_va_ll, index); + + if (!vec_all_eq (vresult_ll, expected_vresult_ll)) { +#if DEBUG + printf("ERROR, vec_insertl (src_a_ll, src_va_ll, index)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ll[%d] = %llu, expected_vresult_ll[%d] = %llu\n", + i, vresult_ll[i], i, expected_vresult_ll[i]); +#else + abort(); +#endif + } + + /* Vector insert, low index, from vector */ + index = 2; + src_va_ch = (vector unsigned char) { 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14, 15 }; + src_vb_ch = (vector unsigned char) { 10, 11, 12, 13, 14, 15, 16, 17, + 18, 19, 20, 21, 22, 23, 24, 25 }; + vresult_ch = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_ch = (vector unsigned char) { 0, 1, 18, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14, 15 }; + + vresult_ch = vec_insertl (src_vb_ch, src_va_ch, index); + + if (!vec_all_eq (vresult_ch, expected_vresult_ch)) { +#if DEBUG + printf("ERROR, vec_insertl (src_vb_ch, src_va_ch, index)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_ch[%d] = %d, expected_vresult_ch[%d] = %d\n", + i, vresult_ch[i], i, expected_vresult_ch[i]); +#else + abort(); +#endif + } + + index = 4; + src_va_sh = (vector unsigned short) { 0, 1, 2, 3, 4, 5, 6, 7 }; + src_vb_sh = (vector unsigned short) { 10, 11, 12, 13, 14, 15, 16, 17 }; + vresult_sh = (vector unsigned short) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_sh = (vector unsigned short) { 0, 1, 14, 3, 4, 5, 6, 7 }; + + vresult_sh = vec_insertl (src_vb_sh, src_va_sh, index); + + if (!vec_all_eq (vresult_sh, expected_vresult_sh)) { +#if DEBUG + printf("ERROR, vec_insertl (src_vb_sh, src_va_sh, index)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_sh[%d] = %d, 
expected_vresult_sh[%d] = %d\n", + i, vresult_sh[i], i, expected_vresult_sh[i]); +#else + abort(); +#endif + } + + index = 8; + src_va_int = (vector unsigned int) { 0, 1, 2, 3 }; + src_vb_int = (vector unsigned int) { 10, 11, 12, 13 }; + vresult_int = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector unsigned int) { 0, 1, 12, 3 }; + + vresult_int = vec_insertl (src_vb_int, src_va_int, index); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_insertl (src_vb_int, src_va_int, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + /* Vector insert, high index, from GPR */ + src_a_ch = 79; + index = 2; + src_va_ch = (vector unsigned char) { 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14, 15 }; + vresult_ch = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_ch = (vector unsigned char) { 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 79, 14, 15 }; + + vresult_ch = vec_inserth (src_a_ch, src_va_ch, index); + + if (!vec_all_eq (vresult_ch, expected_vresult_ch)) { +#if DEBUG + printf("ERROR, vec_inserth (src_a_ch, src_va_ch, index)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_ch[%d] = %d, expected_vresult_ch[%d] = %d\n", + i, vresult_ch[i], i, expected_vresult_ch[i]); +#else + abort(); +#endif + } + + src_a_sh = 79; + index = 10; + src_va_sh = (vector unsigned short int) { 0, 1, 2, 3, 4, 5, 6, 7 }; + vresult_sh = (vector unsigned short int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_sh = (vector unsigned short int) { 0, 1, 79, 3, + 4, 5, 6, 7 }; + + vresult_sh = vec_inserth (src_a_sh, src_va_sh, index); + + if (!vec_all_eq (vresult_sh, expected_vresult_sh)) { +#if DEBUG + printf("ERROR, vec_inserth (src_a_sh, src_va_sh, index)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_sh[%d] = %d, expected_vresult_sh[%d] = %d\n", + i, 
vresult_sh[i], i, expected_vresult_sh[i]); +#else + abort(); +#endif + } + + src_a_int = 79; + index = 8; + src_va_int = (vector unsigned int) { 0, 1, 2, 3 }; + vresult_int = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector unsigned int) { 0, 79, 2, 3 }; + + vresult_int = vec_inserth (src_a_int, src_va_int, index); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_inserth (src_a_int, src_va_int, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + src_a_ll = 79; + index = 8; + src_va_ll = (vector unsigned long long) { 0, 1 }; + vresult_ll = (vector unsigned long long) { 0, 0 }; + expected_vresult_ll = (vector unsigned long long) { 79, 1 }; + + vresult_ll = vec_inserth (src_a_ll, src_va_ll, index); + + if (!vec_all_eq (vresult_ll, expected_vresult_ll)) { +#if DEBUG + printf("ERROR, vec_inserth (src_a_ll, src_va_ll, index)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ll[%d] = %llu, expected_vresult_ll[%d] = %llu\n", + i, vresult_ll[i], i, expected_vresult_ll[i]); +#else + abort(); +#endif + } + + /* Vector insert, high index, from vector */ + index = 2; + src_va_ch = (vector unsigned char) { 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14, 15 }; + src_vb_ch = (vector unsigned char) { 10, 11, 12, 13, 14, 15, 16, 17, + 18, 19, 20, 21, 22, 23, 24, 25 }; + vresult_ch = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_ch = (vector unsigned char) { 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 18, 14, 15 }; + + vresult_ch = vec_inserth (src_vb_ch, src_va_ch, index); + + if (!vec_all_eq (vresult_ch, expected_vresult_ch)) { +#if DEBUG + printf("ERROR, vec_inserth (src_vb_ch, src_va_ch, index)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_ch[%d] = %d, expected_vresult_ch[%d] = %d\n", + i, vresult_ch[i], i, expected_vresult_ch[i]); 
+#else + abort(); +#endif + } + + index = 4; + src_va_sh = (vector unsigned short) { 0, 1, 2, 3, 4, 5, 6, 7 }; + src_vb_sh = (vector unsigned short) { 10, 11, 12, 13, 14, 15, 16, 17 }; + vresult_sh = (vector unsigned short) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_sh = (vector unsigned short) { 0, 1, 2, 3, 4, 14, 6, 7 }; + + vresult_sh = vec_inserth (src_vb_sh, src_va_sh, index); + + if (!vec_all_eq (vresult_sh, expected_vresult_sh)) { +#if DEBUG + printf("ERROR, vec_inserth (src_vb_sh, src_va_sh, index)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_sh[%d] = %d, expected_vresult_sh[%d] = %d\n", + i, vresult_sh[i], i, expected_vresult_sh[i]); +#else + abort(); +#endif + } + + index = 8; + src_va_int = (vector unsigned int) { 0, 1, 2, 3 }; + src_vb_int = (vector unsigned int) { 10, 11, 12, 13 }; + vresult_int = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector unsigned int) { 0, 12, 2, 3 }; + + vresult_int = vec_inserth (src_vb_int, src_va_int, index); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_inserth (src_vb_int, src_va_int, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + return 0; +} + +/* { dg-final { scan-assembler {\mvinsblx\M} } } */ +/* { dg-final { scan-assembler {\mvinshlx\M} } } */ +/* { dg-final { scan-assembler {\mvinswlx\M} } } */ +/* { dg-final { scan-assembler {\mvinsdlx\M} } } */ +/* { dg-final { scan-assembler {\mvinsbvlx\M} } } */ +/* { dg-final { scan-assembler {\mvinshvlx\M} } } */ +/* { dg-final { scan-assembler {\mvinswvlx\M} } } */ + +/* { dg-final { scan-assembler {\mvinsbrx\M} } } */ +/* { dg-final { scan-assembler {\mvinshrx\M} } } */ +/* { dg-final { scan-assembler {\mvinswrx\M} } } */ +/* { dg-final { scan-assembler {\mvinsdrx\M} } } */ +/* { dg-final { scan-assembler {\mvinsbvrx\M} } } */ +/* { dg-final { scan-assembler 
{\mvinshvrx\M} } } */ +/* { dg-final { scan-assembler {\mvinswvrx\M} } } */ +

From patchwork Thu Jun 18 22:20:10 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Carl Love
X-Patchwork-Id: 1312428
Message-ID: <395471e1f12f67d9d066d28a23d1f993d25ae5e0.camel@us.ibm.com>
Subject: [PATCH 3/6 ver 3] rs6000, Add vector replace builtin support
To: segher@gcc.gnu.org, dje.gcc@gmail.com, gcc-patches@gcc.gnu.org
Date: Thu, 18 Jun 2020 15:20:10 -0700
List-Id: Gcc-patches mailing list
From: Carl Love
Reply-To: Carl Love

V3 fixes:
  Fixed bad word breaks in ChangeLog.
  Replace spaces with tabs in ChangeLog.
------------------------------------
v2 fixes:

change log entries config/rs6000/vsx.md, config/rs6000/rs6000-builtin.def,
config/rs6000/rs6000-call.c.

gcc/config/rs6000/rs6000-call.c:
  fixed if check for 3rd arg between 0 and 3
  fixed if check for 3rd arg between 0 and 12

gcc/config/rs6000/vsx.md:
  removed REPLACE_ELT_atr definition and used VS_scalar instead.
  removed REPLACE_ELT_inst definition and used instead
  fixed spelling mistake on Endianness.
  fixed indenting for vreplace_elt_<mode>
-----------------------------------

GCC maintainers:

The following patch adds support for builtins vec_replace_elt and
vec_replace_unaligned.

The patch has been compiled and tested on
  powerpc64le-unknown-linux-gnu (Power 9 LE)
and mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline branch.
Thanks.

                         Carl Love

-------------------------------------------------------
gcc/ChangeLog

2020-06-18  Carl Love

	* config/rs6000/altivec.h: Add define for vec_replace_elt and
	vec_replace_unaligned.
	* config/rs6000/vsx.md (UNSPEC_REPLACE_ELT, UNSPEC_REPLACE_UN): New.
	(REPLACE_ELT): New mode iterator.
	(REPLACE_ELT_atr, REPLACE_ELT_inst, REPLACE_ELT_char,
	REPLACE_ELT_sh, REPLACE_ELT_max): New mode attributes.
	(vreplace_un_<mode>, vreplace_elt_<mode>_inst): New.
	* config/rs6000/rs6000-builtin.def (VREPLACE_ELT_V4SI,
	VREPLACE_ELT_UV4SI, VREPLACE_ELT_V4SF, VREPLACE_ELT_UV2DI,
	VREPLACE_ELT_V2DF, VREPLACE_UN_V4SI, VREPLACE_UN_UV4SI,
	VREPLACE_UN_V4SF, VREPLACE_UN_V2DI, VREPLACE_UN_UV2DI,
	VREPLACE_UN_V2DF): New.
	(REPLACE_ELT, REPLACE_UN): New.
	* config/rs6000/rs6000-call.c (FUTURE_BUILTIN_VEC_REPLACE_ELT,
	FUTURE_BUILTIN_VEC_REPLACE_UN): New.
	(rs6000_expand_ternop_builtin): Add 3rd argument checks for
	CODE_FOR_vreplace_elt_v4si, CODE_FOR_vreplace_elt_v4sf,
	CODE_FOR_vreplace_un_v4si, CODE_FOR_vreplace_un_v4sf.
	(builtin_function_type) [FUTURE_BUILTIN_VREPLACE_ELT_UV4SI,
	FUTURE_BUILTIN_VREPLACE_ELT_UV2DI, FUTURE_BUILTIN_VREPLACE_UN_UV4SI,
	FUTURE_BUILTIN_VREPLACE_UN_UV2DI]: New cases.
	* doc/extend.texi: Add description for vec_replace_elt and
	vec_replace_unaligned builtins.

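Illustration only, not part of the patch: a minimal scalar sketch of what vec_replace_elt is meant to do for 32-bit elements, going by the description this patch adds to doc/extend.texi and by the byte-index arithmetic in the vreplace_elt expander (shift by REPLACE_ELT_sh, subtract from REPLACE_ELT_max on little-endian). The type model_v4si and both function names are hypothetical and exist only for this sketch. Element numbering is modelled as a plain array index, and the assert stands in for the compile-time range check on argument 3.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a vector of four 32-bit words.
   This is not a GCC type; it only models element order.  */
typedef struct { uint32_t w[4]; } model_v4si;

/* Model of vec_replace_elt for word elements: element N (0..3, natural
   element order) is replaced by VAL, the other elements pass through.  */
model_v4si
model_replace_elt_w (model_v4si v, uint32_t val, unsigned n)
{
  assert (n <= 3);	/* the builtin rejects other values when compiling */
  v.w[n] = val;
  return v;
}

/* Byte index handed to the insert instruction for word element N, as the
   expander appears to compute it: N << 2 on big-endian targets and
   12 - (N << 2) on little-endian targets.  */
unsigned
model_word_byte_index (unsigned n, int big_endian)
{
  assert (n <= 3);
  return big_endian ? (n << 2) : 12u - (n << 2);
}
```

In this model, replacing element 2 of { 0, 1, 2, 3 } with 79 gives { 0, 1, 79, 3 }, and word element 1 maps to byte index 4 on big-endian and 8 on little-endian.
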
gcc/testsuite/ChangeLog 2020-06-18 Carl Love * gcc.target/powerpc/vec-replace-word.c: Add new test. --- gcc/config/rs6000/altivec.h | 2 + gcc/config/rs6000/rs6000-builtin.def | 16 + gcc/config/rs6000/rs6000-call.c | 61 ++++ gcc/config/rs6000/vsx.md | 60 ++++ gcc/doc/extend.texi | 50 +++ .../powerpc/vec-replace-word-runnable.c | 289 ++++++++++++++++++ 6 files changed, 478 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h index 936aeb1ee09..435ffb8158f 100644 --- a/gcc/config/rs6000/altivec.h +++ b/gcc/config/rs6000/altivec.h @@ -701,6 +701,8 @@ __altivec_scalar_pred(vec_any_nle, #define vec_extracth(a, b, c) __builtin_vec_extracth (a, b, c) #define vec_insertl(a, b, c) __builtin_vec_insertl (a, b, c) #define vec_inserth(a, b, c) __builtin_vec_inserth (a, b, c) +#define vec_replace_elt(a, b, c) __builtin_vec_replace_elt (a, b, c) +#define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b, c) #define vec_gnb(a, b) __builtin_vec_gnb (a, b) #define vec_clrl(a, b) __builtin_vec_clrl (a, b) diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index c5bd4f86555..91821f29a6f 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -2643,6 +2643,20 @@ BU_FUTURE_V_3 (VINSERTVPRBR, "vinsvubvrx", CONST, vinsertvr_v16qi) BU_FUTURE_V_3 (VINSERTVPRHR, "vinsvuhvrx", CONST, vinsertvr_v8hi) BU_FUTURE_V_3 (VINSERTVPRWR, "vinsvuwvrx", CONST, vinsertvr_v4si) +BU_FUTURE_V_3 (VREPLACE_ELT_V4SI, "vreplace_v4si", CONST, vreplace_elt_v4si) +BU_FUTURE_V_3 (VREPLACE_ELT_UV4SI, "vreplace_uv4si", CONST, vreplace_elt_v4si) +BU_FUTURE_V_3 (VREPLACE_ELT_V4SF, "vreplace_v4sf", CONST, vreplace_elt_v4sf) +BU_FUTURE_V_3 (VREPLACE_ELT_V2DI, "vreplace_v2di", CONST, vreplace_elt_v2di) +BU_FUTURE_V_3 (VREPLACE_ELT_UV2DI, "vreplace_uv2di", CONST, vreplace_elt_v2di) +BU_FUTURE_V_3 (VREPLACE_ELT_V2DF, 
"vreplace_v2df", CONST, vreplace_elt_v2df) + +BU_FUTURE_V_3 (VREPLACE_UN_V4SI, "vreplace_un_v4si", CONST, vreplace_un_v4si) +BU_FUTURE_V_3 (VREPLACE_UN_UV4SI, "vreplace_un_uv4si", CONST, vreplace_un_v4si) +BU_FUTURE_V_3 (VREPLACE_UN_V4SF, "vreplace_un_v4sf", CONST, vreplace_un_v4sf) +BU_FUTURE_V_3 (VREPLACE_UN_V2DI, "vreplace_un_v2di", CONST, vreplace_un_v2di) +BU_FUTURE_V_3 (VREPLACE_UN_UV2DI, "vreplace_un_uv2di", CONST, vreplace_un_v2di) +BU_FUTURE_V_3 (VREPLACE_UN_V2DF, "vreplace_un_v2df", CONST, vreplace_un_v2df) + BU_FUTURE_V_1 (VSTRIBR, "vstribr", CONST, vstrir_v16qi) BU_FUTURE_V_1 (VSTRIHR, "vstrihr", CONST, vstrir_v8hi) BU_FUTURE_V_1 (VSTRIBL, "vstribl", CONST, vstril_v16qi) @@ -2664,6 +2678,8 @@ BU_FUTURE_OVERLOAD_3 (EXTRACTL, "extractl") BU_FUTURE_OVERLOAD_3 (EXTRACTH, "extracth") BU_FUTURE_OVERLOAD_3 (INSERTL, "insertl") BU_FUTURE_OVERLOAD_3 (INSERTH, "inserth") +BU_FUTURE_OVERLOAD_3 (REPLACE_ELT, "replace_elt") +BU_FUTURE_OVERLOAD_3 (REPLACE_UN, "replace_un") BU_FUTURE_OVERLOAD_1 (VSTRIR, "strir") BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril") diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index abbe00030ea..2653222ced0 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -5624,6 +5624,36 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_ELT, FUTURE_BUILTIN_VREPLACE_ELT_UV4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_UINTSI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_ELT, FUTURE_BUILTIN_VREPLACE_ELT_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_INTSI, RS6000_BTI_INTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_ELT, FUTURE_BUILTIN_VREPLACE_ELT_V4SF, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_float, RS6000_BTI_INTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_ELT, FUTURE_BUILTIN_VREPLACE_ELT_UV2DI, + 
RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_UINTDI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_ELT, FUTURE_BUILTIN_VREPLACE_ELT_V2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_INTDI, RS6000_BTI_INTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_ELT, FUTURE_BUILTIN_VREPLACE_ELT_V2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_double, RS6000_BTI_INTQI }, + + { FUTURE_BUILTIN_VEC_REPLACE_UN, FUTURE_BUILTIN_VREPLACE_UN_UV4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_UINTSI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_UN, FUTURE_BUILTIN_VREPLACE_UN_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_INTSI, RS6000_BTI_INTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_UN, FUTURE_BUILTIN_VREPLACE_UN_V4SF, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_float, RS6000_BTI_INTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_UN, FUTURE_BUILTIN_VREPLACE_UN_UV2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_UINTDI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_UN, FUTURE_BUILTIN_VREPLACE_UN_V2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_INTDI, RS6000_BTI_INTQI }, + { FUTURE_BUILTIN_VEC_REPLACE_UN, FUTURE_BUILTIN_VREPLACE_UN_V2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_double, RS6000_BTI_INTQI }, + { FUTURE_BUILTIN_VEC_VSTRIL, FUTURE_BUILTIN_VSTRIBL, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 }, { FUTURE_BUILTIN_VEC_VSTRIL, FUTURE_BUILTIN_VSTRIBL, @@ -9987,6 +10017,33 @@ rs6000_expand_ternop_builtin (enum insn_code icode, tree exp, rtx target) return CONST0_RTX (tmode); } } + else if (icode == CODE_FOR_vreplace_elt_v4si + || icode == CODE_FOR_vreplace_elt_v4sf) + { + /* Check whether the 3rd argument is an integer constant in the range + 0 to 3 inclusive. 
*/ + STRIP_NOPS (arg2); + if (TREE_CODE (arg2) != INTEGER_CST + || !IN_RANGE (TREE_INT_CST_LOW (arg2), 0, 3)) + { + error ("argument 3 must be in the range 0 to 3"); + return CONST0_RTX (tmode); + } + } + + else if (icode == CODE_FOR_vreplace_un_v4si + || icode == CODE_FOR_vreplace_un_v4sf) + { + /* Check whether the 3rd argument is an integer constant in the range + 0 to 12 inclusive. */ + STRIP_NOPS (arg2); + if (TREE_CODE (arg2) != INTEGER_CST + || !IN_RANGE(TREE_INT_CST_LOW (arg2), 0, 12)) + { + error ("argument 3 must be in the range 0 to 12"); + return CONST0_RTX (tmode); + } + } if (target == 0 || GET_MODE (target) != tmode @@ -13342,6 +13399,10 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case FUTURE_BUILTIN_VINSERTVPRBL: case FUTURE_BUILTIN_VINSERTVPRHL: case FUTURE_BUILTIN_VINSERTVPRWL: + case FUTURE_BUILTIN_VREPLACE_ELT_UV4SI: + case FUTURE_BUILTIN_VREPLACE_ELT_UV2DI: + case FUTURE_BUILTIN_VREPLACE_UN_UV4SI: + case FUTURE_BUILTIN_VREPLACE_UN_UV2DI: h.uns_p[0] = 1; h.uns_p[1] = 1; h.uns_p[2] = 1; diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 358898a03fb..cfdf616ba0b 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -348,11 +348,22 @@ UNSPEC_EXTRACTR UNSPEC_INSERTL UNSPEC_INSERTR + UNSPEC_REPLACE_ELT + UNSPEC_REPLACE_UN ]) ;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops (define_mode_iterator VI2 [V4SI V8HI V16QI V2DI]) +;; Vector extract_elt iterator/attr for 32-bit and 64-bit elements +(define_mode_iterator REPLACE_ELT [V4SI V4SF V2DI V2DF]) +(define_mode_attr REPLACE_ELT_char [(V4SI "w") (V4SF "w") + (V2DI "d") (V2DF "d")]) +(define_mode_attr REPLACE_ELT_sh [(V4SI "2") (V4SF "2") + (V2DI "3") (V2DF "3")]) +(define_mode_attr REPLACE_ELT_max [(V4SI "12") (V4SF "12") + (V2DI "8") (V2DF "8")]) + ;; VSX moves ;; The patterns for LE permuted loads and stores come before the general @@ -3957,6 +3968,55 @@ "vinsrx %0,%1,%2" [(set_attr "type" "vecsimple")]) 
+(define_expand "vreplace_elt_" + [(set (match_operand:REPLACE_ELT 0 "register_operand") + (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1 "register_operand") + (match_operand: 2 "register_operand") + (match_operand:QI 3 "const_0_to_3_operand")] + UNSPEC_REPLACE_ELT))] + "TARGET_FUTURE" +{ + int index; + /* Immediate value is the word index, convert to byte index and adjust for + Endianness if needed. */ + if (BYTES_BIG_ENDIAN) + index = INTVAL (operands[3]) << ; + + else + index = - (INTVAL (operands[3]) << ); + + emit_insn (gen_vreplace_elt__inst (operands[0], operands[1], + operands[2], + GEN_INT (index))); + DONE; + } +[(set_attr "type" "vecsimple")]) + +(define_expand "vreplace_un_" + [(set (match_operand:REPLACE_ELT 0 "register_operand") + (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1 "register_operand") + (match_operand: 2 "register_operand") + (match_operand:QI 3 "const_0_to_12_operand")] + UNSPEC_REPLACE_UN))] + "TARGET_FUTURE" +{ + /* Immediate value is the byte index Big Endian numbering. */ + emit_insn (gen_vreplace_elt__inst (operands[0], operands[1], + operands[2], operands[3])); + DONE; + } +[(set_attr "type" "vecsimple")]) + +(define_insn "vreplace_elt__inst" + [(set (match_operand:REPLACE_ELT 0 "register_operand" "=v") + (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1 "register_operand" "0") + (match_operand: 2 "register_operand" "r") + (match_operand:QI 3 "const_0_to_12_operand" "n")] + UNSPEC_REPLACE_ELT))] + "TARGET_FUTURE" + "vins %0,%2,%3" + [(set_attr "type" "vecsimple")]) + ;; VSX_EXTRACT optimizations ;; Optimize double d = (double) vec_extract (vi, ) ;; Get the element into the top position and use XVCVSWDP/XVCVUWDP diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 0a970299b34..54220ca52ee 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -21047,6 +21047,56 @@ This is a limitation of the bi-endian vector programming model consistent with the limitation on vec_perm, for example. 
@findex vec_inserth +Vector Replace Element +@smallexample +@exdent vector signed int vec_replace_elt (vector signed int, signed int, +const int); +@exdent vector unsigned int vec_replace_elt (vector unsigned int, +unsigned int, const int); +@exdent vector float vec_replace_elt (vector float, float, const int); +@exdent vector signed long long vec_replace_elt (vector signed long long, +signed long long, const int); +@exdent vector unsigned long long vec_replace_elt (vector unsigned long long, +unsigned long long, const int); +@exdent vector double vec_replace_elt (vector double, double, const int); +@end smallexample +The third argument (constrained to [0,3]) identifies the natural-endian +element number of the first argument that will be replaced by the second +argument to produce the result. The other elements of the first argument will +remain unchanged in the result. + +If it's desirable to insert a word at an unaligned position, use +vec_replace_unaligned instead. + +@findex vec_replace_elt + +Vector Replace Unaligned +@smallexample +@exdent vector unsigned char vec_replace_unaligned (vector unsigned char, +signed int, const int); +@exdent vector unsigned char vec_replace_unaligned (vector unsigned char, +unsigned int, const int); +@exdent vector unsigned char vec_replace_unaligned (vector unsigned char, +float, const int); +@exdent vector unsigned char vec_replace_unaligned (vector unsigned char, +signed long long, const int); +@exdent vector unsigned char vec_replace_unaligned (vector unsigned char, +unsigned long long, const int); +@exdent vector unsigned char vec_replace_unaligned (vector unsigned char, +double, const int); +@end smallexample + +The second argument replaces a portion of the first argument to produce the +result, with the rest of the first argument unchanged in the result.
The +third argument identifies the byte index (using left-to-right, or big-endian +order) where the high-order byte of the second argument will be placed, with +the remaining bytes of the second argument placed naturally "to the right" +of the high-order byte. + +The programmer is responsible for understanding the endianness issues involved +with the first argument and the result. +@findex vec_replace_unaligned + @smallexample @exdent vector unsigned long long int @exdent vec_pdep (vector unsigned long long int, vector unsigned long long int) diff --git a/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c new file mode 100644 index 00000000000..1fe23d5f912 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-replace-word-runnable.c @@ -0,0 +1,289 @@ +/* { dg-do run } */ +/* { dg-require-effective-target powerpc_future_hw } */ +/* { dg-options "-mdejagnu-cpu=future" } */ + +#include <altivec.h> + +#define DEBUG 1 + +#ifdef DEBUG +#include <stdio.h> +#endif + +extern void abort (void); + +int +main (int argc, char *argv []) +{ + int i; + unsigned char ch; + unsigned int index; + + vector unsigned int vresult_uint; + vector unsigned int expected_vresult_uint; + vector unsigned int src_va_uint; + vector unsigned int src_vb_uint; + unsigned int src_a_uint; + + vector int vresult_int; + vector int expected_vresult_int; + vector int src_va_int; + vector int src_vb_int; + int src_a_int; + + vector unsigned long long int vresult_ullint; + vector unsigned long long int expected_vresult_ullint; + vector unsigned long long int src_va_ullint; + vector unsigned long long int src_vb_ullint; + unsigned int long long src_a_ullint; + + vector long long int vresult_llint; + vector long long int expected_vresult_llint; + vector long long int src_va_llint; + vector long long int src_vb_llint; + long long int src_a_llint; + + vector float vresult_float; + vector float expected_vresult_float; + vector float src_va_float; + float
src_a_float; + + vector double vresult_double; + vector double expected_vresult_double; + vector double src_va_double; + double src_a_double; + + /* Vector replace 32-bit element */ + src_a_uint = 345; + src_va_uint = (vector unsigned int) { 0, 1, 2, 3 }; + vresult_uint = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_uint = (vector unsigned int) { 0, 1, 345, 3 }; + + vresult_uint = vec_replace_elt (src_va_uint, src_a_uint, 2); + + if (!vec_all_eq (vresult_uint, expected_vresult_uint)) { +#if DEBUG + printf("ERROR, vec_replace_elt (src_vb_uint, src_va_uint, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_uint[%d] = %d, expected_vresult_uint[%d] = %d\n", + i, vresult_uint[i], i, expected_vresult_uint[i]); +#else + abort(); +#endif + } + + src_a_int = 234; + src_va_int = (vector int) { 0, 1, 2, 3 }; + vresult_int = (vector int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector int) { 0, 234, 2, 3 }; + + vresult_int = vec_replace_elt (src_va_int, src_a_int, 1); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_replace_elt (src_vb_int, src_va_int, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + src_a_float = 34.0; + src_va_float = (vector float) { 0.0, 10.0, 20.0, 30.0 }; + vresult_float = (vector float) { 0.0, 0.0, 0.0, 0.0 }; + expected_vresult_float = (vector float) { 0.0, 34.0, 20.0, 30.0 }; + + vresult_float = vec_replace_elt (src_va_float, src_a_float, 1); + + if (!vec_all_eq (vresult_float, expected_vresult_float)) { +#if DEBUG + printf("ERROR, vec_replace_elt (src_vb_float, src_va_float, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_float[%d] = %f, expected_vresult_float[%d] = %f\n", + i, vresult_float[i], i, expected_vresult_float[i]); +#else + abort(); +#endif + } + + /* Vector replace 64-bit element */ + src_a_ullint = 456; + src_va_ullint = (vector 
unsigned long long int) { 0, 1 }; + vresult_ullint = (vector unsigned long long int) { 0, 0 }; + expected_vresult_ullint = (vector unsigned long long int) { 0, 456 }; + + vresult_ullint = vec_replace_elt (src_va_ullint, src_a_ullint, 1); + + if (!vec_all_eq (vresult_ullint, expected_vresult_ullint)) { +#if DEBUG + printf("ERROR, vec_replace_elt (src_vb_ullint, src_va_ullint, index)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ullint[%d] = %d, expected_vresult_ullint[%d] = %d\n", + i, vresult_ullint[i], i, expected_vresult_ullint[i]); +#else + abort(); +#endif + } + + src_a_llint = 678; + src_va_llint = (vector long long int) { 0, 1 }; + vresult_llint = (vector long long int) { 0, 0 }; + expected_vresult_llint = (vector long long int) { 0, 678 }; + + vresult_llint = vec_replace_elt (src_va_llint, src_a_llint, 1); + + if (!vec_all_eq (vresult_llint, expected_vresult_llint)) { +#if DEBUG + printf("ERROR, vec_replace_elt (src_vb_llint, src_va_llint, index)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_llint[%d] = %d, expected_vresult_llint[%d] = %d\n", + i, vresult_llint[i], i, expected_vresult_llint[i]); +#else + abort(); +#endif + } + + src_a_double = 678.0; + src_va_double = (vector double) { 0.0, 50.0 }; + vresult_double = (vector double) { 0.0, 0.0 }; + expected_vresult_double = (vector double) { 0.0, 678.0 }; + + vresult_double = vec_replace_elt (src_va_double, src_a_double, 1); + + if (!vec_all_eq (vresult_double, expected_vresult_double)) { +#if DEBUG + printf("ERROR, vec_replace_elt (src_vb_double, src_va_double, index)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_double[%d] = %f, expected_vresult_double[%d] = %f\n", + i, vresult_double[i], i, expected_vresult_double[i]); +#else + abort(); +#endif + } + + + /* Vector replace 32-bit element, unaligned */ + src_a_uint = 345; + src_va_uint = (vector unsigned int) { 1, 2, 0, 0 }; + vresult_uint = (vector unsigned int) { 0, 0, 0, 0 }; + /* Byte index 7 will overwrite part of elements 2 and 3 */ + 
expected_vresult_uint = (vector unsigned int) { 1, 2, 345*256, 0 }; + + vresult_uint = vec_replace_unaligned (src_va_uint, src_a_uint, 3); + + if (!vec_all_eq (vresult_uint, expected_vresult_uint)) { +#if DEBUG + printf("ERROR, vec_replace_unaligned (src_vb_uint, src_va_uint, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_uint[%d] = %d, expected_vresult_uint[%d] = %d\n", + i, vresult_uint[i], i, expected_vresult_uint[i]); +#else + abort(); +#endif + } + + src_a_int = 234; + src_va_int = (vector int) { 1, 0, 3, 4 }; + vresult_int = (vector int) { 0, 0, 0, 0 }; + /* Byte index 7 will over write part of elements 1 and 2 */ + expected_vresult_int = (vector int) { 1, 234*256, 0, 4 }; + + vresult_int = vec_replace_unaligned (src_va_int, src_a_int, 7); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_replace_unaligned (src_vb_int, src_va_int, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + src_a_float = 34.0; + src_va_float = (vector float) { 0.0, 10.0, 20.0, 30.0 }; + vresult_float = (vector float) { 0.0, 0.0, 0.0, 0.0 }; + expected_vresult_float = (vector float) { 0.0, 34.0, 20.0, 30.0 }; + + vresult_float = vec_replace_unaligned (src_va_float, src_a_float, 8); + + if (!vec_all_eq (vresult_float, expected_vresult_float)) { +#if DEBUG + printf("ERROR, vec_replace_unaligned (src_vb_float, src_va_float, index)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_float[%d] = %f, expected_vresult_float[%d] = %f\n", + i, vresult_float[i], i, expected_vresult_float[i]); +#else + abort(); +#endif + } + + /* Vector replace 64-bit element, unaligned */ + src_a_ullint = 456; + src_va_ullint = (vector unsigned long long int) { 0, 0x222 }; + vresult_ullint = (vector unsigned long long int) { 0, 0 }; + expected_vresult_ullint = (vector unsigned long long int) { 456*256, + 0x200 }; + + /* Byte index 7 
will overwrite least significant byte of element 0 */ + vresult_ullint = vec_replace_unaligned (src_va_ullint, src_a_ullint, 7); + + if (!vec_all_eq (vresult_ullint, expected_vresult_ullint)) { +#if DEBUG + printf("ERROR, vec_replace_unaligned (src_vb_ullint, src_va_ullint, index)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ullint[%d] = %d, expected_vresult_ullint[%d] = %d\n", + i, vresult_ullint[i], i, expected_vresult_ullint[i]); +#else + abort(); +#endif + } + + src_a_llint = 678; + src_va_llint = (vector long long int) { 0, 0x101 }; + vresult_llint = (vector long long int) { 0, 0 }; + /* Byte index 7 will overwrite least significant byte of element 0 */ + expected_vresult_llint = (vector long long int) { 678*256, 0x100 }; + + vresult_llint = vec_replace_unaligned (src_va_llint, src_a_llint, 7); + + if (!vec_all_eq (vresult_llint, expected_vresult_llint)) { +#if DEBUG + printf("ERROR, vec_replace_unaligned (src_vb_llint, src_va_llint, index)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_llint[%d] = %d, expected_vresult_llint[%d] = %d\n", + i, vresult_llint[i], i, expected_vresult_llint[i]); +#else + abort(); +#endif + } + + src_a_double = 678.0; + src_va_double = (vector double) { 0.0, 50.0 }; + vresult_double = (vector double) { 0.0, 0.0 }; + expected_vresult_double = (vector double) { 0.0, 678.0 }; + + vresult_double = vec_replace_unaligned (src_va_double, src_a_double, 0); + + if (!vec_all_eq (vresult_double, expected_vresult_double)) { +#if DEBUG + printf("ERROR, vec_replace_unaligned (src_vb_double, src_va_double, index)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_double[%d] = %f, expected_vresult_double[%d] = %f\n", + i, vresult_double[i], i, expected_vresult_double[i]); +#else + abort(); +#endif + } + + return 0; +} + +/* { dg-final { scan-assembler-times {\mvinsw\M} 6 } } */ +/* { dg-final { scan-assembler-times {\mvinsd\M} 6 } } */ + + From patchwork Thu Jun 18 22:20:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Carl Love X-Patchwork-Id: 1312429 Message-ID:
<64e643b82b85891ca0bcfeac8a3266c837d1fd3e.camel@us.ibm.com>
Subject: [PATCH 4/6 ver 3] rs6000, Add vector shift double builtin support
To: segher@gcc.gnu.org, dje.gcc@gmail.com, gcc-patches@gcc.gnu.org
Date: Thu, 18 Jun 2020 15:20:14 -0700
From: Carl Love
Reply-To: Carl Love

V3 Fixes
Replace spaces with tabs in ChangeLog. Minor edits to ChangeLog entry. Minor edits to vec_sldb description in gcc/doc/extend.texi.
----------------------------------------------------
v2 fixes:
change logs redone
gcc/config/rs6000/rs6000-call.c - added spaces before parenthesis around args.
-----------------------------------------------------------------
GCC maintainers:

The following patch adds support for the vector shift double builtins.
The patch has been compiled and tested on powerpc64le-unknown-linux-gnu (Power 9 LE) and Mambo with no regression errors.

Please let me know if this patch is acceptable for the mainline branch. Thanks.

Carl Love
-------------------------------------------------------
gcc/ChangeLog

2020-06-18 Carl Love
* config/rs6000/altivec.h (vec_sldb, vec_srdb): New defines.
* config/rs6000/altivec.md (UNSPEC_SLDB, UNSPEC_SRDB): New.
(SLDB_LR): New attribute.
(VSHIFT_DBL_LR): New iterator.
(vsdb_): New define_insn.
* config/rs6000/rs6000-builtin.def (VSLDB_V16QI, VSLDB_V8HI, VSLDB_V4SI, VSLDB_V2DI, VSRDB_V16QI, VSRDB_V8HI, VSRDB_V4SI, VSRDB_V2DI): New BU_FUTURE_V_3 definitions.
(SLDB, SRDB): New BU_FUTURE_OVERLOAD_3 definitions.
* config/rs6000/rs6000-call.c (FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VEC_SRDB): New definitions.
(rs6000_expand_ternop_builtin) [CODE_FOR_vsldb_v16qi, CODE_FOR_vsldb_v8hi, CODE_FOR_vsldb_v4si, CODE_FOR_vsldb_v2di, CODE_FOR_vsrdb_v16qi, CODE_FOR_vsrdb_v8hi, CODE_FOR_vsrdb_v4si, CODE_FOR_vsrdb_v2di]: Add clauses.
* doc/extend.texi: Add description for vec_sldb and vec_srdb.

gcc/testsuite/ChangeLog

2020-06-18 Carl Love
* gcc.target/powerpc/vec-shift-double-runnable.c: New test file.
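
For reviewers who want to check expected values without future hardware, here is a rough portable sketch of what the two built-ins are understood to compute. It is not part of the patch. The type vec128_model and the functions model_sldb and model_srdb are hypothetical names made up for illustration. The sketch assumes bytes are stored in big-endian (left-to-right) order, that only the low-order three bits of the shift count are used, and that the right shift returns the rightmost 128 bits of the shifted 256-bit value; that last point follows my reading of the instruction description and should be checked against the ISA document.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reference model, not part of the patch.  Bytes are kept in
   big-endian (left-to-right) order, so b[0] is the leftmost byte.  */
typedef struct { uint8_t b[16]; } vec128_model;

/* Assumed model of vec_sldb: shift the 256-bit value (a || b) left by the
   low-order three bits of sh and keep the leftmost 128 bits.  */
static vec128_model
model_sldb (vec128_model a, vec128_model b, unsigned int sh)
{
  vec128_model r;
  sh &= 7;
  for (int i = 0; i < 16; i++)
    {
      /* The byte to the right supplies the bits shifted in; for the last
	 byte of a that is the first byte of b.  */
      uint8_t next = (i < 15) ? a.b[i + 1] : b.b[0];
      r.b[i] = (uint8_t) ((a.b[i] << sh) | (next >> (8 - sh)));
    }
  return r;
}

/* Assumed model of vec_srdb: shift (a || b) right by the low-order three
   bits of sh and keep the rightmost 128 bits.  */
static vec128_model
model_srdb (vec128_model a, vec128_model b, unsigned int sh)
{
  vec128_model r;
  sh &= 7;
  for (int i = 0; i < 16; i++)
    {
      /* The byte to the left supplies the bits shifted in; for the first
	 byte of b that is the last byte of a.  */
      uint8_t prev = (i > 0) ? b.b[i - 1] : a.b[15];
      r.b[i] = (uint8_t) ((b.b[i] >> sh) | (prev << (8 - sh)));
    }
  return r;
}
```

To compare against the hardware result on a little-endian target, the vector would first have to be copied into the model in left-to-right byte order, which is the endian awareness the documentation text asks of the programmer.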
--- gcc/config/rs6000/altivec.h | 2 + gcc/config/rs6000/altivec.md | 18 + gcc/config/rs6000/rs6000-builtin.def | 11 + gcc/config/rs6000/rs6000-call.c | 70 ++++ gcc/doc/extend.texi | 53 +++ .../powerpc/vec-shift-double-runnable.c | 384 ++++++++++++++++++ 6 files changed, 538 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h index 435ffb8158f..0be68892aad 100644 --- a/gcc/config/rs6000/altivec.h +++ b/gcc/config/rs6000/altivec.h @@ -703,6 +703,8 @@ __altivec_scalar_pred(vec_any_nle, #define vec_inserth(a, b, c) __builtin_vec_inserth (a, b, c) #define vec_replace_elt(a, b, c) __builtin_vec_replace_elt (a, b, c) #define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b, c) +#define vec_sldb(a, b, c) __builtin_vec_sldb (a, b, c) +#define vec_srdb(a, b, c) __builtin_vec_srdb (a, b, c) #define vec_gnb(a, b) __builtin_vec_gnb (a, b) #define vec_clrl(a, b) __builtin_vec_clrl (a, b) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 0b0b49ee056..832a35cdaa9 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -171,6 +171,8 @@ UNSPEC_XXEVAL UNSPEC_VSTRIR UNSPEC_VSTRIL + UNSPEC_SLDB + UNSPEC_SRDB ]) (define_c_enum "unspecv" @@ -781,6 +783,22 @@ DONE; }) +;; Map UNSPEC_SLDB to "l" and UNSPEC_SRDB to "r". 
+(define_int_attr SLDB_LR [(UNSPEC_SLDB "l") + (UNSPEC_SRDB "r")]) + +(define_int_iterator VSHIFT_DBL_LR [UNSPEC_SLDB UNSPEC_SRDB]) + +(define_insn "vsdb_" + [(set (match_operand:VI2 0 "register_operand" "=v") + (unspec:VI2 [(match_operand:VI2 1 "register_operand" "v") + (match_operand:VI2 2 "register_operand" "v") + (match_operand:QI 3 "const_0_to_12_operand" "n")] + VSHIFT_DBL_LR))] + "TARGET_FUTURE" + "vsdbi %0,%1,%2,%3" + [(set_attr "type" "vecsimple")]) + (define_expand "vstrir_" [(set (match_operand:VIshort 0 "altivec_register_operand") (unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")] diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 91821f29a6f..2b198177ef0 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -2657,6 +2657,15 @@ BU_FUTURE_V_3 (VREPLACE_UN_V2DI, "vreplace_un_v2di", CONST, vreplace_un_v2di) BU_FUTURE_V_3 (VREPLACE_UN_UV2DI, "vreplace_un_uv2di", CONST, vreplace_un_v2di) BU_FUTURE_V_3 (VREPLACE_UN_V2DF, "vreplace_un_v2df", CONST, vreplace_un_v2df) +BU_FUTURE_V_3 (VSLDB_V16QI, "vsldb_v16qi", CONST, vsldb_v16qi) +BU_FUTURE_V_3 (VSLDB_V8HI, "vsldb_v8hi", CONST, vsldb_v8hi) +BU_FUTURE_V_3 (VSLDB_V4SI, "vsldb_v4si", CONST, vsldb_v4si) +BU_FUTURE_V_3 (VSLDB_V2DI, "vsldb_v2di", CONST, vsldb_v2di) + +BU_FUTURE_V_3 (VSRDB_V16QI, "vsrdb_v16qi", CONST, vsrdb_v16qi) +BU_FUTURE_V_3 (VSRDB_V8HI, "vsrdb_v8hi", CONST, vsrdb_v8hi) +BU_FUTURE_V_3 (VSRDB_V4SI, "vsrdb_v4si", CONST, vsrdb_v4si) +BU_FUTURE_V_3 (VSRDB_V2DI, "vsrdb_v2di", CONST, vsrdb_v2di) BU_FUTURE_V_1 (VSTRIBR, "vstribr", CONST, vstrir_v16qi) BU_FUTURE_V_1 (VSTRIHR, "vstrihr", CONST, vstrir_v8hi) BU_FUTURE_V_1 (VSTRIBL, "vstribl", CONST, vstril_v16qi) @@ -2680,6 +2689,8 @@ BU_FUTURE_OVERLOAD_3 (INSERTL, "insertl") BU_FUTURE_OVERLOAD_3 (INSERTH, "inserth") BU_FUTURE_OVERLOAD_3 (REPLACE_ELT, "replace_elt") BU_FUTURE_OVERLOAD_3 (REPLACE_UN, "replace_un") +BU_FUTURE_OVERLOAD_3 (SLDB, "sldb") 
+BU_FUTURE_OVERLOAD_3 (SRDB, "srdb") BU_FUTURE_OVERLOAD_1 (VSTRIR, "strir") BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril") diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 2653222ced0..092e6c1cc2c 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -5654,6 +5654,56 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { FUTURE_BUILTIN_VEC_REPLACE_UN, FUTURE_BUILTIN_VREPLACE_UN_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_double, RS6000_BTI_INTQI }, + { FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VSLDB_V16QI, + RS6000_BTI_V16QI, RS6000_BTI_V16QI, + RS6000_BTI_V16QI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VSLDB_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VSLDB_V8HI, + RS6000_BTI_V8HI, RS6000_BTI_V8HI, + RS6000_BTI_V8HI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VSLDB_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VSLDB_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VSLDB_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VSLDB_V2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, + RS6000_BTI_V2DI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SLDB, FUTURE_BUILTIN_VSLDB_V2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI }, + + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V16QI, + RS6000_BTI_V16QI, RS6000_BTI_V16QI, + RS6000_BTI_V16QI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, + RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V8HI, + RS6000_BTI_V8HI, RS6000_BTI_V8HI, + RS6000_BTI_V8HI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, + RS6000_BTI_V2DI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_VSTRIL, FUTURE_BUILTIN_VSTRIBL, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 }, { FUTURE_BUILTIN_VEC_VSTRIL, FUTURE_BUILTIN_VSTRIBL, @@ -10045,6 +10095,26 @@ rs6000_expand_ternop_builtin (enum insn_code icode, tree exp, rtx target) } } + else if (icode == CODE_FOR_vsldb_v16qi + || icode == CODE_FOR_vsldb_v8hi + || icode == CODE_FOR_vsldb_v4si + || icode == CODE_FOR_vsldb_v2di + || icode == CODE_FOR_vsrdb_v16qi + || icode == CODE_FOR_vsrdb_v8hi + || icode == CODE_FOR_vsrdb_v4si + || icode == CODE_FOR_vsrdb_v2di) + { + /* Check whether the 3rd argument is an integer constant in the range + 0 to 7 inclusive. */ + STRIP_NOPS (arg2); + if (TREE_CODE (arg2) != INTEGER_CST + || !IN_RANGE (TREE_INT_CST_LOW (arg2), 0, 7)) + { + error ("argument 3 must be in the range 0 to 7"); + return CONST0_RTX (tmode); + } + } + if (target == 0 || GET_MODE (target) != tmode || ! 
(*insn_data[icode].operand[0].predicate) (target, tmode)) diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 54220ca52ee..3d159d5ea8f 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -21097,6 +21097,59 @@ The programmer is responsible for understanding the endianness issues involved with the first argument and the result. @findex vec_replace_unaligned +Vector Shift Left Double Bit Immediate +@smallexample +@exdent vector signed char vec_sldb (vector signed char, vector signed char, +const unsigned int); +@exdent vector unsigned char vec_sldb (vector unsigned char, +vector unsigned char, const unsigned int); +@exdent vector signed short vec_sldb (vector signed short, vector signed short, +const unsigned int); +@exdent vector unsigned short vec_sldb (vector unsigned short, +vector unsigned short, const unsigned int); +@exdent vector signed int vec_sldb (vector signed int, vector signed int, +const unsigned int); +@exdent vector unsigned int vec_sldb (vector unsigned int, vector unsigned int, +const unsigned int); +@exdent vector signed long long vec_sldb (vector signed long long, +vector signed long long, const unsigned int); +@exdent vector unsigned long long vec_sldb (vector unsigned long long, +vector unsigned long long, const unsigned int); +@end smallexample + +Shift the combined input vectors left by the amount specified by the low-order +three bits of the third argument, and return the leftmost remaining 128 bits. +Code using this instruction must be endian-aware. 
+ +@findex vec_sldb + +Vector Shift Right Double Bit Immediate + +@smallexample +@exdent vector signed char vec_srdb (vector signed char, vector signed char, +const unsigned int); +@exdent vector unsigned char vec_srdb (vector unsigned char, vector unsigned char, +const unsigned int); +@exdent vector signed short vec_srdb (vector signed short, vector signed short, +const unsigned int); +@exdent vector unsigned short vec_srdb (vector unsigned short, vector unsigned short, +const unsigned int); +@exdent vector signed int vec_srdb (vector signed int, vector signed int, +const unsigned int); +@exdent vector unsigned int vec_srdb (vector unsigned int, vector unsigned int, +const unsigned int); +@exdent vector signed long long vec_srdb (vector signed long long, +vector signed long long, const unsigned int); +@exdent vector unsigned long long vec_srdb (vector unsigned long long, +vector unsigned long long, const unsigned int); +@end smallexample + +Shift the combined input vectors right by the amount specified by the low-order +three bits of the third argument, and return the remaining 128 bits. Code +using this built-in must be endian-aware. 
+ +@findex vec_srdb + + @smallexample @exdent vector unsigned long long int @exdent vec_pdep (vector unsigned long long int, vector unsigned long long int) diff --git a/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c new file mode 100644 index 00000000000..8093c33ba1d --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable.c @@ -0,0 +1,384 @@ +/* { dg-do run } */ +/* { dg-require-effective-target powerpc_future_hw } */ +/* { dg-options "-mdejagnu-cpu=future" } */ +#include <altivec.h> + +#define DEBUG 0 + +#ifdef DEBUG +#include <stdio.h> +#endif + +extern void abort (void); + +int +main (int argc, char *argv []) +{ + int i; + + vector signed char vresult_char; + vector signed char expected_vresult_char; + vector signed char src_va_char; + vector signed char src_vb_char; + + vector unsigned char vresult_uchar; + vector unsigned char expected_vresult_uchar; + vector unsigned char src_va_uchar; + vector unsigned char src_vb_uchar; + + vector short int vresult_sh; + vector short int expected_vresult_sh; + vector short int src_va_sh; + vector short int src_vb_sh; + + vector short unsigned int vresult_ush; + vector short unsigned int expected_vresult_ush; + vector short unsigned int src_va_ush; + vector short unsigned int src_vb_ush; + + vector int vresult_int; + vector int expected_vresult_int; + vector int src_va_int; + vector int src_vb_int; + int src_a_int; + + vector unsigned int vresult_uint; + vector unsigned int expected_vresult_uint; + vector unsigned int src_va_uint; + vector unsigned int src_vb_uint; + unsigned int src_a_uint; + + vector long long int vresult_llint; + vector long long int expected_vresult_llint; + vector long long int src_va_llint; + vector long long int src_vb_llint; + long long int src_a_llint; + + vector unsigned long long int vresult_ullint; + vector unsigned long long int expected_vresult_ullint; + vector unsigned long long int src_va_ullint; + vector unsigned
long long int src_vb_ullint; + unsigned int long long src_a_ullint; + + /* Vector shift double left */ + src_va_char = (vector signed char) { 0, 2, 4, 6, 8, 10, 12, 14, + 16, 18, 20, 22, 24, 26, 28, 30 }; + src_vb_char = (vector signed char) { 10, 20, 30, 40, 50, 60, 70, 80, 90, + 100, 110, 120, 130, 140, 150, 160 }; + vresult_char = (vector signed char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_char = (vector signed char) { 80, 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14 }; + + vresult_char = vec_sldb (src_va_char, src_vb_char, 7); + + if (!vec_all_eq (vresult_char, expected_vresult_char)) { +#if DEBUG + printf("ERROR, vec_sldb (src_va_char_, src_vb_char, 7)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_char[%d] = %d, expected_vresult_char[%d] = %d\n", + i, vresult_char[i], i, expected_vresult_char[i]); +#else + abort(); +#endif + } + + src_va_uchar = (vector unsigned char) { 0, 2, 4, 6, 8, 10, 12, 14, + 16, 18, 20, 22, 24, 26, 28, 30 }; + src_vb_uchar = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + vresult_uchar = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_uchar = (vector unsigned char) { 0, 0, 1, 2, 3, 4, 5, 6, 7, + 8, 9, 10, 11, 12, 13, 14 }; + + vresult_uchar = vec_sldb (src_va_uchar, src_vb_uchar, 7); + + if (!vec_all_eq (vresult_uchar, expected_vresult_uchar)) { +#if DEBUG + printf("ERROR, vec_sldb (src_va_uchar_, src_vb_uchar, 7)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_uchar[%d] = %d, expected_vresult_uchar[%d] = %d\n", + i, vresult_uchar[i], i, expected_vresult_uchar[i]); +#else + abort(); +#endif + } + + src_va_sh = (vector short int) { 0, 2, 4, 6, 8, 10, 12, 14 }; + src_vb_sh = (vector short int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + vresult_sh = (vector short int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_sh = (vector short int) { 0, 2*128, 4*128, 6*128, + 8*128, 10*128, 12*128, 14*128 }; + + vresult_sh = vec_sldb 
(src_va_sh, src_vb_sh, 7); + + if (!vec_all_eq (vresult_sh, expected_vresult_sh)) { +#if DEBUG + printf("ERROR, vec_sldb (src_va_sh_, src_vb_sh, 7)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_sh[%d] = %d, expected_vresult_sh[%d] = %d\n", + i, vresult_sh[i], i, expected_vresult_sh[i]); +#else + abort(); +#endif + } + + src_va_ush = (vector short unsigned int) { 0, 2, 4, 6, 8, 10, 12, 14 }; + src_vb_ush = (vector short unsigned int) { 10, 20, 30, 40, 50, 60, 70, 80 }; + vresult_ush = (vector short unsigned int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_ush = (vector short unsigned int) { 0, 2*128, 4*128, 6*128, + 8*128, 10*128, 12*128, + 14*128 }; + + vresult_ush = vec_sldb (src_va_ush, src_vb_ush, 7); + + if (!vec_all_eq (vresult_ush, expected_vresult_ush)) { +#if DEBUG + printf("ERROR, vec_sldb (src_va_ush_, src_vb_ush, 7)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_ush[%d] = %d, expected_vresult_ush[%d] = %d\n", + i, vresult_ush[i], i, expected_vresult_ush[i]); +#else + abort(); +#endif + } + + src_va_int = (vector signed int) { 0, 2, 3, 1 }; + src_vb_int = (vector signed int) { 0, 0, 0, 0 }; + vresult_int = (vector signed int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector signed int) { 0, 2*128, 3*128, 1*128 }; + + vresult_int = vec_sldb (src_va_int, src_vb_int, 7); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_sldb (src_va_int_, src_vb_int, 7)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + src_va_uint = (vector unsigned int) { 0, 2, 4, 6 }; + src_vb_uint = (vector unsigned int) { 10, 20, 30, 40 }; + vresult_uint = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_uint = (vector unsigned int) { 0, 2*128, 4*128, 6*128 }; + + vresult_uint = vec_sldb (src_va_uint, src_vb_uint, 7); + + if (!vec_all_eq (vresult_uint, expected_vresult_uint)) { +#if DEBUG + 
printf("ERROR, vec_sldb (src_va_uint_, src_vb_uint, 7)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_uint[%d] = %d, expected_vresult_uint[%d] = %d\n", + i, vresult_uint[i], i, expected_vresult_uint[i]); +#else + abort(); +#endif + } + + src_va_llint = (vector signed long long int) { 5, 6 }; + src_vb_llint = (vector signed long long int) { 0, 0 }; + vresult_llint = (vector signed long long int) { 0, 0 }; + expected_vresult_llint = (vector signed long long int) { 5*128, 6*128 }; + + vresult_llint = vec_sldb (src_va_llint, src_vb_llint, 7); + + if (!vec_all_eq (vresult_llint, expected_vresult_llint)) { +#if DEBUG + printf("ERROR, vec_sldb (src_va_llint_, src_vb_llint, 7)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_llint[%d] = %d, expected_vresult_llint[%d] = %d\n", + i, vresult_llint[i], i, expected_vresult_llint[i]); +#else + abort(); +#endif + } + + src_va_ullint = (vector unsigned long long int) { 54, 26 }; + src_vb_ullint = (vector unsigned long long int) { 10, 20 }; + vresult_ullint = (vector unsigned long long int) { 0, 0 }; + expected_vresult_ullint = (vector unsigned long long int) { 54*128, + 26*128 }; + + vresult_ullint = vec_sldb (src_va_ullint, src_vb_ullint, 7); + + if (!vec_all_eq (vresult_ullint, expected_vresult_ullint)) { +#if DEBUG + printf("ERROR, vec_sldb (src_va_ullint_, src_vb_ullint, 7)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ullint[%d] = %d, expected_vresult_ullint[%d] = %d\n", + i, vresult_ullint[i], i, expected_vresult_ullint[i]); +#else + abort(); +#endif + } + + /* Vector shift double right */ + src_va_char = (vector signed char) { 0, 2, 4, 6, 8, 10, 12, 14, + 16, 18, 20, 22, 24, 26, 28, 30 }; + src_vb_char = (vector signed char) { 10, 12, 14, 16, 18, 20, 22, 24, 26, + 28, 30, 32, 34, 36, 38, 40 }; + vresult_char = (vector signed char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_char = (vector signed char) { 24, 28, 32, 36, 40, 44, 48, + 52, 56, 60, 64, 68, 72, 76, + 80, 0 }; + + vresult_char = 
vec_srdb (src_va_char, src_vb_char, 7); + + if (!vec_all_eq (vresult_char, expected_vresult_char)) { +#if DEBUG + printf("ERROR, vec_srdb (src_va_char_, src_vb_char, 7)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_char[%d] = %d, expected_vresult_char[%d] = %d\n", + i, vresult_char[i], i, expected_vresult_char[i]); +#else + abort(); +#endif + } + + src_va_uchar = (vector unsigned char) { 100, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + src_vb_uchar = (vector unsigned char) { 0, 2, 4, 6, 8, 10, 12, 14, + 16, 18, 20, 22, 24, 26, 28, 30 }; + vresult_uchar = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_uchar = (vector unsigned char) { 4, 8, 12, 16, 20, 24, 28, + 32, 36, 40, 44, 48, 52, + 56, 60, 200 }; + + vresult_uchar = vec_srdb (src_va_uchar, src_vb_uchar, 7); + + if (!vec_all_eq (vresult_uchar, expected_vresult_uchar)) { +#if DEBUG + printf("ERROR, vec_srdb (src_va_uchar_, src_vb_uchar, 7)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_uchar[%d] = %d, expected_vresult_uchar[%d] = %d\n", + i, vresult_uchar[i], i, expected_vresult_uchar[i]); +#else + abort(); +#endif + } + + src_va_sh = (vector short int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + src_vb_sh = (vector short int) { 0, 2*128, 4*128, 6*128, + 8*128, 10*128, 12*128, 14*128 }; + vresult_sh = (vector short int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_sh = (vector short int) { 0, 2, 4, 6, 8, 10, 12, 14 }; + + vresult_sh = vec_srdb (src_va_sh, src_vb_sh, 7); + + if (!vec_all_eq (vresult_sh, expected_vresult_sh)) { +#if DEBUG + printf("ERROR, vec_srdb (src_va_sh_, src_vb_sh, 7)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_sh[%d] = %d, expected_vresult_sh[%d] = %d\n", + i, vresult_sh[i], i, expected_vresult_sh[i]); +#else + abort(); +#endif + } + + src_va_ush = (vector short unsigned int) { 0, 20, 30, 40, 50, 60, 70, 80 }; + src_vb_ush = (vector short unsigned int) { 0, 2*128, 4*128, 6*128, + 8*128, 10*128, 12*128, 14*128 }; + vresult_ush = (vector 
short unsigned int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_ush = (vector short unsigned int) { 0, 2, 4, 6, 8, 10, + 12, 14 }; + + vresult_ush = vec_srdb (src_va_ush, src_vb_ush, 7); + + if (!vec_all_eq (vresult_ush, expected_vresult_ush)) { +#if DEBUG + printf("ERROR, vec_srdb (src_va_ush_, src_vb_ush, 7)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_ush[%d] = %d, expected_vresult_ush[%d] = %d\n", + i, vresult_ush[i], i, expected_vresult_ush[i]); +#else + abort(); +#endif + } + + src_va_int = (vector signed int) { 0, 0, 0, 0 }; + src_vb_int = (vector signed int) { 0, 2*128, 3*128, 1*128 }; + vresult_int = (vector signed int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector signed int) { 0, 2, 3, 1 }; + + vresult_int = vec_srdb (src_va_int, src_vb_int, 7); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_srdb (src_va_int_, src_vb_int, 7)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + src_va_uint = (vector unsigned int) { 0, 20, 30, 40 }; + src_vb_uint = (vector unsigned int) { 128, 2*128, 4*128, 6*128 }; + vresult_uint = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_uint = (vector unsigned int) { 1, 2, 4, 6 }; + + vresult_uint = vec_srdb (src_va_uint, src_vb_uint, 7); + + if (!vec_all_eq (vresult_uint, expected_vresult_uint)) { +#if DEBUG + printf("ERROR, vec_srdb (src_va_uint_, src_vb_uint, 7)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_uint[%d] = %d, expected_vresult_uint[%d] = %d\n", + i, vresult_uint[i], i, expected_vresult_uint[i]); +#else + abort(); +#endif + } + + src_va_llint = (vector signed long long int) { 0, 0 }; + src_vb_llint = (vector signed long long int) { 5*128, 6*128 }; + vresult_llint = (vector signed long long int) { 0, 0 }; + expected_vresult_llint = (vector signed long long int) { 5, 6 }; + + vresult_llint = vec_srdb (src_va_llint, 
src_vb_llint, 7); + + if (!vec_all_eq (vresult_llint, expected_vresult_llint)) { +#if DEBUG + printf("ERROR, vec_srdb (src_va_llint_, src_vb_llint, 7)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_llint[%d] = %d, expected_vresult_llint[%d] = %d\n", + i, vresult_llint[i], i, expected_vresult_llint[i]); +#else + abort(); +#endif + } + + src_va_ullint = (vector unsigned long long int) { 0, 0 }; + src_vb_ullint = (vector unsigned long long int) { 54*128, 26*128 }; + vresult_ullint = (vector unsigned long long int) { 0, 0 }; + expected_vresult_ullint = (vector unsigned long long int) { 54, 26 }; + + vresult_ullint = vec_srdb (src_va_ullint, src_vb_ullint, 7); + + if (!vec_all_eq (vresult_ullint, expected_vresult_ullint)) { +#if DEBUG + printf("ERROR, vec_srdb (src_va_ullint_, src_vb_ullint, 7)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ullint[%d] = %d, expected_vresult_ullint[%d] = %d\n", + i, vresult_ullint[i], i, expected_vresult_ullint[i]); +#else + abort(); +#endif + } + + return 0; +} + +/* { dg-final { scan-assembler-times {\msldbi\M} 6 } } */ +/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */ + + From patchwork Thu Jun 18 22:20:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Carl Love X-Patchwork-Id: 1312430 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=gcc.gnu.org Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=EdGf+QXW; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with 
cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49nxGP3bv6z9sRW for ; Fri, 19 Jun 2020 08:20:45 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AACC83954C11; Thu, 18 Jun 2020 22:20:26 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AACC83954C11 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1592518826; bh=AozxxW6d2D5b+0cVaK+P40O6kwJGW4L4hHHzaEIKTHA=; h=Subject:To:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=EdGf+QXWKEtNKasF0Se7ONaKl6MqjSsfP31rqRbbvBNLhF3fH9cWjcdOT03JVd/My hNWB/QBdLcCNdOYQHr4ipBiaNITNnG9FX97gKpmZk6Wf/VO8sBx57evxO6oLOswh1u vaYIS14fHsrIw2oB15hpDkFVlWydAMAa8SwDtfd0= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id B572E389040F; Thu, 18 Jun 2020 22:20:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org B572E389040F Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 05IM1fC7090462; Thu, 18 Jun 2020 18:20:23 -0400 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 31ra0w5q00-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Jun 2020 18:20:23 -0400 Received: from m0098421.ppops.net (m0098421.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 05IM1lGR091103; Thu, 18 Jun 2020 18:20:22 -0400 Received: from ppma03wdc.us.ibm.com (ba.79.3fa9.ip4.static.sl-reverse.com [169.63.121.186]) by mx0a-001b2d01.pphosted.com with ESMTP id 31ra0w5pyt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 
verify=NOT); Thu, 18 Jun 2020 18:20:22 -0400 Received: from pps.filterd (ppma03wdc.us.ibm.com [127.0.0.1]) by ppma03wdc.us.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 05IMJvsE001822; Thu, 18 Jun 2020 22:20:22 GMT Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by ppma03wdc.us.ibm.com with ESMTP id 31q8kkxhvs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Jun 2020 22:20:22 +0000 Received: from b03ledav001.gho.boulder.ibm.com (b03ledav001.gho.boulder.ibm.com [9.17.130.232]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 05IMKL6f52101416 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 18 Jun 2020 22:20:21 GMT Received: from b03ledav001.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5C0D56E04C; Thu, 18 Jun 2020 22:20:21 +0000 (GMT) Received: from b03ledav001.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4C94D6E04E; Thu, 18 Jun 2020 22:20:20 +0000 (GMT) Received: from sig-9-65-250-81.ibm.com (unknown [9.65.250.81]) by b03ledav001.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 18 Jun 2020 22:20:20 +0000 (GMT) Message-ID: Subject: [PATCH 5/6 ver 3] rs6000, Add vector splat builtin support To: segher@gcc.gnu.org, dje.gcc@gmail.com, gcc-patches@gcc.gnu.org Date: Thu, 18 Jun 2020 15:20:18 -0700 X-Mailer: Evolution 3.28.5 (3.28.5-5.el7) Mime-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.216, 18.0.687 definitions=2020-06-18_21:2020-06-18, 2020-06-18 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 clxscore=1015 impostorscore=0 bulkscore=0 cotscore=-2147483648 lowpriorityscore=0 mlxscore=0 spamscore=0 malwarescore=0 suspectscore=4 adultscore=0 phishscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2006180164 
X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Carl Love via Gcc-patches From: Carl Love Reply-To: Carl Love Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches"

v3 fixes:

Minor cleanup in the ChangeLog description.

-------------------------------------------------
v2 fixes:

change log fixes

gcc/config/rs6000/altivec.md: changed the names of the define_insn and
define_expand patterns from vxxspltiw... to xxspltiw...
Fixed spaces in gen_xxsplti32dx_v4sf_inst (operands[0], GEN_INT

gcc/config/rs6000/rs6000-builtin.def: propagated the name changes above to
where they are used.

Updated the s32bit_cint_operand, c32bit_cint_operand and
f32bit_const_operand predicate definitions.

Changed the name of rs6000_constF32toI32 to rs6000_const_f32_to_i32 and
propagated the name change as needed.  Replaced the if test with gcc_assert().

Fixed the description of vec_splatid() in the documentation.

-----------------------
GCC maintainers:

The following patch adds support for the vec_splati, vec_splatid and
vec_splati_ins builtins.

This patch adds support for instructions that take a 32-bit immediate
value that represents a floating point value.  The support adds new
predicates and a support function to properly handle the immediate value.

The patch has been compiled and tested on powerpc64le-unknown-linux-gnu
(Power 9 LE) with no regression errors.

The test case was compiled on a Power 9 system and then tested on
Mambo.

Please let me know if this patch is acceptable for the mainline branch.  Thanks.

                         Carl Love
--------------------------------------------------------
gcc/ChangeLog

2020-06-18  Carl Love

	* config/rs6000/altivec.h (vec_splati, vec_splatid, vec_splati_ins):
	Add defines.
	* config/rs6000/altivec.md (UNSPEC_XXSPLTIW, UNSPEC_XXSPLTID,
	UNSPEC_XXSPLTI32DX): New.
	(xxspltiw_v4si, xxspltiw_v4sf_inst, xxspltidp_v2df_inst,
	xxsplti32dx_v4si_inst, xxsplti32dx_v4sf_inst): New define_insn.
	(xxspltiw_v4sf, xxspltidp_v2df, xxsplti32dx_v4si,
	xxsplti32dx_v4sf): New define_expands.
	* config/rs6000/predicates.md (u1bit_cint_operand,
	s32bit_cint_operand, c32bit_cint_operand,
	f32bit_const_operand): New predicates.
	* config/rs6000/rs6000-builtin.def (VXXSPLTIW_V4SI, VXXSPLTIW_V4SF,
	VXXSPLTID): New definitions.
	(VXXSPLTI32DX_V4SI, VXXSPLTI32DX_V4SF): New BU_FUTURE_V_3
	definitions.
	(XXSPLTIW, XXSPLTID): New definitions.
	(XXSPLTI32DX): Add definitions.
	* config/rs6000/rs6000-call.c (FUTURE_BUILTIN_VEC_XXSPLTIW,
	FUTURE_BUILTIN_VEC_XXSPLTID, FUTURE_BUILTIN_VEC_XXSPLTI32DX):
	New definitions.
	* config/rs6000/rs6000-protos.h (rs6000_const_f32_to_i32): New extern
	declaration.
	* config/rs6000/rs6000.c (rs6000_const_f32_to_i32): New function.
	* doc/extend.texi: Add documentation for vec_splati, vec_splatid,
	and vec_splati_ins.

gcc/testsuite/ChangeLog

2020-06-18  Carl Love

	* gcc.target/powerpc/vec-splati-runnable.c: New test.
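
For anyone reading along without the ISA document to hand, the behaviour described in the extend.texi text of this patch can be modelled in plain C. The sketch below is not part of the patch. The struct types and the model_* names are hypothetical, word numbering follows the expected values in vec-splati-runnable.c, and model_f32_bits only assumes that float is IEEE-754 binary32 on the host (it illustrates the kind of value the float-constant helper hands to the instruction, not how GCC computes it).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-ins for the vector types; illustration only.  */
typedef struct { int32_t w[4]; } v4si_model;
typedef struct { double d[2]; } v2df_model;

/* Model of vec_splati (const signed int): copy the 32-bit immediate
   into all four words of the result.  */
static v4si_model
model_splati_int (int32_t imm)
{
  v4si_model r;
  for (int i = 0; i < 4; i++)
    r.w[i] = imm;
  return r;
}

/* Model of vec_splatid (const float): widen the single-precision value
   to double precision and copy it into both doublewords.  */
static v2df_model
model_splatid (float imm)
{
  v2df_model r;
  r.d[0] = (double) imm;
  r.d[1] = (double) imm;
  return r;
}

/* Model of vec_splati_ins: in each doubleword (a pair of words) replace
   the word selected by INDEX with IMM and leave the other word alone.
   INDEX must be 0 or 1, as the documentation requires.  */
static v4si_model
model_splati_ins (v4si_model src, unsigned int index, int32_t imm)
{
  v4si_model r = src;
  r.w[index] = imm;       /* first doubleword: words 0 and 1 */
  r.w[2 + index] = imm;   /* second doubleword: words 2 and 3 */
  return r;
}

/* Bit pattern of a float, assuming IEEE-754 binary32 on the host.  The
   float variants of the builtins need the constant in this form because
   the instructions take a 32-bit immediate.  */
static uint32_t
model_f32_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}
```

With this model, model_splati_ins applied to { 2, 3, 4, 5 } with index 1 and value 20 gives { 2, 20, 4, 20 }, the same values the new test case expects from vec_splati_ins (vsrc_a_int, 1, 20).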
--- gcc/config/rs6000/altivec.h | 3 + gcc/config/rs6000/altivec.md | 109 +++++++++++++ gcc/config/rs6000/predicates.md | 33 ++++ gcc/config/rs6000/rs6000-builtin.def | 13 ++ gcc/config/rs6000/rs6000-call.c | 19 +++ gcc/config/rs6000/rs6000-protos.h | 1 + gcc/config/rs6000/rs6000.c | 11 ++ gcc/doc/extend.texi | 35 +++++ .../gcc.target/powerpc/vec-splati-runnable.c | 145 ++++++++++++++++++ 9 files changed, 369 insertions(+) create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h index 0be68892aad..9ed41b1cbf1 100644 --- a/gcc/config/rs6000/altivec.h +++ b/gcc/config/rs6000/altivec.h @@ -705,6 +705,9 @@ __altivec_scalar_pred(vec_any_nle, #define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b, c) #define vec_sldb(a, b, c) __builtin_vec_sldb (a, b, c) #define vec_srdb(a, b, c) __builtin_vec_srdb (a, b, c) +#define vec_splati(a) __builtin_vec_xxspltiw (a) +#define vec_splatid(a) __builtin_vec_xxspltid (a) +#define vec_splati_ins(a, b, c) __builtin_vec_xxsplti32dx (a, b, c) #define vec_gnb(a, b) __builtin_vec_gnb (a, b) #define vec_clrl(a, b) __builtin_vec_clrl (a, b) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 832a35cdaa9..25f6b9b2f07 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -173,6 +173,9 @@ UNSPEC_VSTRIL UNSPEC_SLDB UNSPEC_SRDB + UNSPEC_XXSPLTIW + UNSPEC_XXSPLTID + UNSPEC_XXSPLTI32DX ]) (define_c_enum "unspecv" @@ -799,6 +802,112 @@ "vsdbi %0,%1,%2,%3" [(set_attr "type" "vecsimple")]) +(define_insn "xxspltiw_v4si" + [(set (match_operand:V4SI 0 "register_operand" "=wa") + (unspec:V4SI [(match_operand:SI 1 "s32bit_cint_operand" "n")] + UNSPEC_XXSPLTIW))] + "TARGET_FUTURE" + "xxspltiw %x0,%1" + [(set_attr "type" "vecsimple")]) + +(define_expand "xxspltiw_v4sf" + [(set (match_operand:V4SF 0 "register_operand" "=wa") + (unspec:V4SF [(match_operand:SF 1 "f32bit_const_operand" "n")] + 
UNSPEC_XXSPLTIW))] + "TARGET_FUTURE" +{ + long long value = rs6000_const_f32_to_i32 (operands[1]); + emit_insn (gen_xxspltiw_v4sf_inst (operands[0], GEN_INT (value))); + DONE; +}) + +(define_insn "xxspltiw_v4sf_inst" + [(set (match_operand:V4SF 0 "register_operand" "=wa") + (unspec:V4SF [(match_operand:SI 1 "c32bit_cint_operand" "n")] + UNSPEC_XXSPLTIW))] + "TARGET_FUTURE" + "xxspltiw %x0,%c1" + [(set_attr "type" "vecsimple")]) + +(define_expand "xxspltidp_v2df" + [(set (match_operand:V2DF 0 "register_operand" ) + (unspec:V2DF [(match_operand:SF 1 "f32bit_const_operand")] + UNSPEC_XXSPLTID))] + "TARGET_FUTURE" +{ + long value = rs6000_const_f32_to_i32 (operands[1]); + emit_insn (gen_xxspltidp_v2df_inst (operands[0], GEN_INT (value))); + DONE; +}) + +(define_insn "xxspltidp_v2df_inst" + [(set (match_operand:V2DF 0 "register_operand" "=wa") + (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")] + UNSPEC_XXSPLTID))] + "TARGET_FUTURE" + "xxspltidp %x0,%c1" + [(set_attr "type" "vecsimple")]) + +(define_expand "xxsplti32dx_v4si" + [(set (match_operand:V4SI 0 "register_operand" "=wa") + (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "wa") + (match_operand:QI 2 "u1bit_cint_operand" "n") + (match_operand:SI 3 "s32bit_cint_operand" "n")] + UNSPEC_XXSPLTI32DX))] + "TARGET_FUTURE" +{ + int index = INTVAL (operands[2]); + + if (!BYTES_BIG_ENDIAN) + index = 1 - index; + + /* Instruction uses destination as a source. Do not overwrite source. 
*/ + emit_move_insn (operands[0], operands[1]); + + emit_insn (gen_xxsplti32dx_v4si_inst (operands[0], GEN_INT (index), + operands[3])); + DONE; +} + [(set_attr "type" "vecsimple")]) + +(define_insn "xxsplti32dx_v4si_inst" + [(set (match_operand:V4SI 0 "register_operand" "+wa") + (unspec:V4SI [(match_operand:QI 1 "u1bit_cint_operand" "n") + (match_operand:SI 2 "s32bit_cint_operand" "n")] + UNSPEC_XXSPLTI32DX))] + "TARGET_FUTURE" + "xxsplti32dx %x0,%1,%2" + [(set_attr "type" "vecsimple")]) + +(define_expand "xxsplti32dx_v4sf" + [(set (match_operand:V4SF 0 "register_operand" "=wa") + (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "wa") + (match_operand:QI 2 "u1bit_cint_operand" "n") + (match_operand:SF 3 "f32bit_const_operand" "n")] + UNSPEC_XXSPLTI32DX))] + "TARGET_FUTURE" +{ + int index = INTVAL (operands[2]); + long value = rs6000_const_f32_to_i32 (operands[3]); + if (!BYTES_BIG_ENDIAN) + index = 1 - index; + + /* Instruction uses destination as a source. Do not overwrite source. */ + emit_move_insn (operands[0], operands[1]); + emit_insn (gen_xxsplti32dx_v4sf_inst (operands[0], GEN_INT (index), + GEN_INT (value))); + DONE; +}) + +(define_insn "xxsplti32dx_v4sf_inst" + [(set (match_operand:V4SF 0 "register_operand" "+wa") + (unspec:V4SF [(match_operand:QI 1 "u1bit_cint_operand" "n") + (match_operand:SI 2 "s32bit_cint_operand" "n")] + UNSPEC_XXSPLTI32DX))] + "TARGET_FUTURE" + "xxsplti32dx %x0,%1,%2" + [(set_attr "type" "vecsimple")]) + (define_expand "vstrir_" [(set (match_operand:VIshort 0 "altivec_register_operand") (unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")] diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index c3f460face2..48f913c5718 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -214,6 +214,11 @@ (and (match_code "const_int") (match_test "INTVAL (op) >= -16 && INTVAL (op) <= 15"))) +;; Return 1 if op is a unsigned 1-bit constant integer. 
+(define_predicate "u1bit_cint_operand" + (and (match_code "const_int") + (match_test "INTVAL (op) >= 0 && INTVAL (op) <= 1"))) + ;; Return 1 if op is a unsigned 3-bit constant integer. (define_predicate "u3bit_cint_operand" (and (match_code "const_int") @@ -272,6 +277,34 @@ (match_test "(unsigned HOST_WIDE_INT) (INTVAL (op) + 0x8000) >= 0x10000"))) +;; Return 1 if op is a 32-bit constant signed integer +(define_predicate "s32bit_cint_operand" + (and (match_code "const_int") + (match_test "(unsigned HOST_WIDE_INT) + (0x80000000 + UINTVAL (op)) >> 32 == 0"))) + +;; Return 1 if op is a constant 32-bit unsigned +(define_predicate "c32bit_cint_operand" + (and (match_code "const_int") + (match_test "((UINTVAL (op) >> 32) == 0)"))) + +;; Return 1 if op is a constant 32-bit floating point value +(define_predicate "f32bit_const_operand" + (match_code "const_double") +{ + if (GET_MODE (op) == SFmode) + return 1; + + else if ((GET_MODE (op) == DFmode) && ((UINTVAL (op) >> 32) == 0)) + { + /* Value fits in 32-bits */ + return 1; + } + else + /* Not the expected mode. */ + return 0; +}) + ;; Return 1 if op is a positive constant integer that is an exact power of 2. 
(define_predicate "exact_log2_cint_operand" (and (match_code "const_int") diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 2b198177ef0..c85326de7f2 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -2666,6 +2666,15 @@ BU_FUTURE_V_3 (VSRDB_V16QI, "vsrdb_v16qi", CONST, vsrdb_v16qi) BU_FUTURE_V_3 (VSRDB_V8HI, "vsrdb_v8hi", CONST, vsrdb_v8hi) BU_FUTURE_V_3 (VSRDB_V4SI, "vsrdb_v4si", CONST, vsrdb_v4si) BU_FUTURE_V_3 (VSRDB_V2DI, "vsrdb_v2di", CONST, vsrdb_v2di) + +BU_FUTURE_V_1 (VXXSPLTIW_V4SI, "vxxspltiw_v4si", CONST, xxspltiw_v4si) +BU_FUTURE_V_1 (VXXSPLTIW_V4SF, "vxxspltiw_v4sf", CONST, xxspltiw_v4sf) + +BU_FUTURE_V_1 (VXXSPLTID, "vxxspltidp", CONST, xxspltidp_v2df) + +BU_FUTURE_V_3 (VXXSPLTI32DX_V4SI, "vxxsplti32dx_v4si", CONST, xxsplti32dx_v4si) +BU_FUTURE_V_3 (VXXSPLTI32DX_V4SF, "vxxsplti32dx_v4sf", CONST, xxsplti32dx_v4sf) + BU_FUTURE_V_1 (VSTRIBR, "vstribr", CONST, vstrir_v16qi) BU_FUTURE_V_1 (VSTRIHR, "vstrihr", CONST, vstrir_v8hi) BU_FUTURE_V_1 (VSTRIBL, "vstribl", CONST, vstril_v16qi) @@ -2697,6 +2706,10 @@ BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril") BU_FUTURE_OVERLOAD_1 (VSTRIR_P, "strir_p") BU_FUTURE_OVERLOAD_1 (VSTRIL_P, "stril_p") + +BU_FUTURE_OVERLOAD_1 (XXSPLTIW, "xxspltiw") +BU_FUTURE_OVERLOAD_1 (XXSPLTID, "xxspltid") +BU_FUTURE_OVERLOAD_3 (XXSPLTI32DX, "xxsplti32dx") /* 1 argument crypto functions. 
*/ BU_CRYPTO_1 (VSBOX, "vsbox", CONST, crypto_vsbox_v2di) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 092e6c1cc2c..e36aafaf71c 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -5679,6 +5679,22 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI }, + { FUTURE_BUILTIN_VEC_XXSPLTIW, FUTURE_BUILTIN_VXXSPLTIW_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_INTSI, 0, 0 }, + { FUTURE_BUILTIN_VEC_XXSPLTIW, FUTURE_BUILTIN_VXXSPLTIW_V4SF, + RS6000_BTI_V4SF, RS6000_BTI_float, 0, 0 }, + + { FUTURE_BUILTIN_VEC_XXSPLTID, FUTURE_BUILTIN_VXXSPLTID, + RS6000_BTI_V2DF, RS6000_BTI_float, 0, 0 }, + + { FUTURE_BUILTIN_VEC_XXSPLTI32DX, FUTURE_BUILTIN_VXXSPLTI32DX_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_UINTQI, RS6000_BTI_INTSI }, + { FUTURE_BUILTIN_VEC_XXSPLTI32DX, FUTURE_BUILTIN_VXXSPLTI32DX_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI, + RS6000_BTI_UINTSI }, + { FUTURE_BUILTIN_VEC_XXSPLTI32DX, FUTURE_BUILTIN_VXXSPLTI32DX_V4SF, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_UINTQI, RS6000_BTI_float }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_UINTQI }, @@ -13539,6 +13555,9 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case ALTIVEC_BUILTIN_VSRH: case ALTIVEC_BUILTIN_VSRW: case P8V_BUILTIN_VSRD: + /* Vector splat immediate insert */ + case FUTURE_BUILTIN_VXXSPLTI32DX_V4SI: + case FUTURE_BUILTIN_VXXSPLTI32DX_V4SF: h.uns_p[2] = 1; break; diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 5508484ba19..c6158874ce9 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -274,6 +274,7 @@ extern void rs6000_asm_output_dwarf_pcrel (FILE *file, int size, const char *label); extern void 
rs6000_asm_output_dwarf_datarel (FILE *file, int size, const char *label); +extern long long rs6000_const_f32_to_i32 (rtx operand); /* Declare functions in rs6000-c.c */ diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 58f5d780603..89fcc99df0a 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -26494,6 +26494,17 @@ rs6000_cannot_substitute_mem_equiv_p (rtx mem) return false; } +long long +rs6000_const_f32_to_i32 (rtx operand) +{ + long long value; + const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (operand); + + gcc_assert (GET_MODE (operand) == SFmode); + REAL_VALUE_TO_TARGET_SINGLE (*rv, value); + return value; +} + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-rs6000.h" diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 3d159d5ea8f..37d11e1ef41 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -21150,6 +21150,41 @@ using this built-in must be endian-aware. @findex vec_srdb +Vector Splat + +@smallexample +@exdent vector signed int vec_splati (const signed int); +@exdent vector float vec_splati (const float); +@end smallexample + +Splat a 32-bit immediate into a vector of words. + +@findex vec_splati + +@smallexample +@exdent vector double vec_splatid (const float); +@end smallexample + +Convert a single precision floating-point value to double-precision and splat +the result to a vector of double-precision floats. + +@findex vec_splatid + +@smallexample +@exdent vector signed int vec_splati_ins (vector signed int, +const unsigned int, const signed int); +@exdent vector unsigned int vec_splati_ins (vector unsigned int, +const unsigned int, const unsigned int); +@exdent vector float vec_splati_ins (vector float, const unsigned int, +const float); +@end smallexample + +Argument 2 must be either 0 or 1. Splat the value of argument 3 into the word +identified by argument 2 of each doubleword of argument 1 and return the +result. The other words of argument 1 are unchanged. 
+ +@findex vec_splati_ins + @smallexample @exdent vector unsigned long long int @exdent vec_pdep (vector unsigned long long int, vector unsigned long long int) diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c new file mode 100644 index 00000000000..f9fa55ae0d4 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splati-runnable.c @@ -0,0 +1,145 @@ +/* { dg-do run } */ +/* { dg-require-effective-target powerpc_future_hw } */ +/* { dg-options "-mdejagnu-cpu=future" } */ +#include + +#define DEBUG 0 + +#ifdef DEBUG +#include +#endif + +extern void abort (void); + +int +main (int argc, char *argv []) +{ + int i; + vector int vsrc_a_int; + vector int vresult_int; + vector int expected_vresult_int; + int src_a_int = 13; + + vector unsigned int vsrc_a_uint; + vector unsigned int vresult_uint; + vector unsigned int expected_vresult_uint; + unsigned int src_a_uint = 7; + + vector float vresult_f; + vector float expected_vresult_f; + vector float vsrc_a_f; + float src_a_f = 23.0; + + vector double vsrc_a_d; + vector double vresult_d; + vector double expected_vresult_d; + + /* Vector splati word */ + vresult_int = (vector signed int) { 1, 2, 3, 4 }; + expected_vresult_int = (vector signed int) { -13, -13, -13, -13 }; + + vresult_int = vec_splati ( -13 ); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_splati (src_a_int)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + vresult_f = (vector float) { 1.0, 2.0, 3.0, 4.0 }; + expected_vresult_f = (vector float) { 23.0, 23.0, 23.0, 23.0 }; + + vresult_f = vec_splati (23.0f); + + if (!vec_all_eq (vresult_f, expected_vresult_f)) { +#if DEBUG + printf("ERROR, vec_splati (src_a_f)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_f[%d] = %f, expected_vresult_f[%d] = %f\n", + i, 
vresult_f[i], i, expected_vresult_f[i]); +#else + abort(); +#endif + } + + /* Vector splati double */ + vresult_d = (vector double) { 2.0, 3.0 }; + expected_vresult_d = (vector double) { -31.0, -31.0 }; + + vresult_d = vec_splatid (-31.0f); + + if (!vec_all_eq (vresult_d, expected_vresult_d)) { +#if DEBUG + printf("ERROR, vec_splati (-31.0f)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_d[%i] = %f, expected_vresult_d[%i] = %f\n", + i, vresult_d[i], i, expected_vresult_d[i]); +#else + abort(); +#endif + } + + /* Vector splat immediate */ + vsrc_a_int = (vector int) { 2, 3, 4, 5 }; + vresult_int = (vector int) { 1, 1, 1, 1 }; + expected_vresult_int = (vector int) { 2, 20, 4, 20 }; + + vresult_int = vec_splati_ins (vsrc_a_int, 1, 20); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_splati_ins (vsrc_a_int, 1, 20)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%i] = %d, expected_vresult_int[%i] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + vsrc_a_uint = (vector unsigned int) { 4, 5, 6, 7 }; + vresult_uint = (vector unsigned int) { 1, 1, 1, 1 }; + expected_vresult_uint = (vector unsigned int) { 4, 40, 6, 40 }; + + vresult_uint = vec_splati_ins (vsrc_a_uint, 1, 40); + + if (!vec_all_eq (vresult_uint, expected_vresult_uint)) { +#if DEBUG + printf("ERROR, vec_splati_ins (vsrc_a_uint, 1, 40)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_uint[%i] = %d, expected_vresult_uint[%i] = %d\n", + i, vresult_uint[i], i, expected_vresult_uint[i]); +#else + abort(); +#endif + } + + vsrc_a_f = (vector float) { 2.0, 3.0, 4.0, 5.0 }; + vresult_f = (vector float) { 1.0, 1.0, 1.0, 1.0 }; + expected_vresult_f = (vector float) { 2.0, 20.1, 4.0, 20.1 }; + + vresult_f = vec_splati_ins (vsrc_a_f, 1, 20.1f); + + if (!vec_all_eq (vresult_f, expected_vresult_f)) { +#if DEBUG + printf("ERROR, vec_splati_ins (vsrc_a_f, 1, 20.1)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_f[%i] = %f, 
expected_vresult_f[%i] = %f\n", + i, vresult_f[i], i, expected_vresult_f[i]); +#else + abort(); +#endif + } + + return 0; +} + +/* { dg-final { scan-assembler-times {\msplati\M} 6 } } */ +/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */ + + From patchwork Thu Jun 18 22:20:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Carl Love X-Patchwork-Id: 1312431 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=gcc.gnu.org Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=tuhYKaOJ; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49nxGV4Cjjz9sRW for ; Fri, 19 Jun 2020 08:20:50 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 463483948A8F; Thu, 18 Jun 2020 22:20:37 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 463483948A8F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1592518837; bh=uOVaJwzV6cmy1QTtLHlgdipWxQrDLQkKjOLbFr4xAGU=; h=Subject:To:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=tuhYKaOJY0z/XDesC0ZXoosuK6iKTYN2YuHmJXPxC7Uz+rI7gqJ1brywnYYBkmJI0 OY2tkLZ4Ymtb3ZfBGl9n8QGzwOlTrBGRmDZada5u6eU4Z7I/ohoBcURuJ4C394R+1V 
FY1cvCof+v7OfMobT903d3F3vy+P8Db4ZpOMLd9E= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 0082D3948A8F; Thu, 18 Jun 2020 22:20:31 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 0082D3948A8F Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 05IM26RD190089; Thu, 18 Jun 2020 18:20:31 -0400 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 31r8rvrbxx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Jun 2020 18:20:30 -0400 Received: from m0098409.ppops.net (m0098409.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 05IM2OOA192046; Thu, 18 Jun 2020 18:20:30 -0400 Received: from ppma05wdc.us.ibm.com (1b.90.2fa9.ip4.static.sl-reverse.com [169.47.144.27]) by mx0a-001b2d01.pphosted.com with ESMTP id 31r8rvrbxn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Jun 2020 18:20:30 -0400 Received: from pps.filterd (ppma05wdc.us.ibm.com [127.0.0.1]) by ppma05wdc.us.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 05IMJxS1030845; Thu, 18 Jun 2020 22:20:29 GMT Received: from b03cxnp08025.gho.boulder.ibm.com (b03cxnp08025.gho.boulder.ibm.com [9.17.130.17]) by ppma05wdc.us.ibm.com with ESMTP id 31qu27rese-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 18 Jun 2020 22:20:27 +0000 Received: from b03ledav003.gho.boulder.ibm.com (b03ledav003.gho.boulder.ibm.com [9.17.130.234]) by b03cxnp08025.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 05IMKPRX30408972 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 18 Jun 2020 22:20:25 GMT Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 
665256A04F; Thu, 18 Jun 2020 22:20:26 +0000 (GMT) Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 35EA26A047; Thu, 18 Jun 2020 22:20:25 +0000 (GMT) Received: from sig-9-65-250-81.ibm.com (unknown [9.65.250.81]) by b03ledav003.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 18 Jun 2020 22:20:25 +0000 (GMT) Message-ID: <5f85bbec1e27339ce510f92ffccd6cdf0b79e068.camel@us.ibm.com> Subject: [PATCH 6/6 ver 3] rs6000 Add vector blend, permute builtin support To: segher@gcc.gnu.org, dje.gcc@gmail.com, gcc-patches@gcc.gnu.org Date: Thu, 18 Jun 2020 15:20:23 -0700 X-Mailer: Evolution 3.28.5 (3.28.5-5.el7) Mime-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.216, 18.0.687 definitions=2020-06-18_21:2020-06-18, 2020-06-18 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 suspectscore=4 bulkscore=0 adultscore=0 mlxlogscore=999 phishscore=0 clxscore=1015 lowpriorityscore=0 spamscore=0 cotscore=-2147483648 mlxscore=0 impostorscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2006180164 X-Spam-Status: No, score=-9.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SCC_10_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Carl Love via Gcc-patches From: Carl Love Reply-To: Carl Love Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" v3 fixes: Replace spaces with tabs in ChangeLog description. 
Fix implementation comments for define_expand "xxpermx" in file gcc/config/rs6000/altivec.md. Fix minor typos in the comments for the changes in gcc/config/rs6000/rs6000-call.c. -------------------- v2 changes: Updated ChangeLog per comments. Updated implementation of the define_expand "xxpermx". Fixed the comments and check for 3-bit immediate field for the CODE_FOR_xxpermx check. gcc/doc/extend.texi: comment "Maybe it should say it is related to vsel/xxsel, but per bigger element?", added comment. I took the description directly from the spec. I don't really want to mess with the approved description. Fixed typo for Vector Permute Extended. ---------- GCC maintainers: The following patch adds support for the vec_blendv and vec_permx builtins. The patch has been compiled and tested on powerpc64le-unknown-linux-gnu (Power 9 LE) with no regression errors. The test cases were compiled on a Power 9 system and then tested on Mambo. Carl Love --------------------------------------------------------------- rs6000 RFC2609 vector blend, permute instructions gcc/ChangeLog 2020-06-18 Carl Love * config/rs6000/altivec.h (vec_blendv, vec_permx): Add define. * config/rs6000/altivec.md (UNSPEC_XXBLEND, UNSPEC_XXPERMX): New unspecs. (VM3): New define_mode_iterator. (VM3_char): New define_mode_attr. (xxblend_<mode> mode VM3): New define_insn. (xxpermx): New define_expand. (xxpermx_inst): New define_insn. * config/rs6000/rs6000-builtin.def (VXXBLEND_V16QI, VXXBLEND_V8HI, VXXBLEND_V4SI, VXXBLEND_V2DI, VXXBLEND_V4SF, VXXBLEND_V2DF): New BU_FUTURE_V_3 definitions. (XXBLEND): New BU_FUTURE_OVERLOAD_3 definition. (XXPERMX): New BU_FUTURE_OVERLOAD_4 definition. * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): (FUTURE_BUILTIN_VXXPERMX): Add if case support.
* config/rs6000/rs6000-call.c (FUTURE_BUILTIN_VXXBLEND_V16QI, FUTURE_BUILTIN_VXXBLEND_V8HI, FUTURE_BUILTIN_VXXBLEND_V4SI, FUTURE_BUILTIN_VXXBLEND_V2DI, FUTURE_BUILTIN_VXXBLEND_V4SF, FUTURE_BUILTIN_VXXBLEND_V2DF, FUTURE_BUILTIN_VXXPERMX): Define overloaded arguments. (rs6000_expand_quaternop_builtin): Add if case for CODE_FOR_xxpermx. (builtin_quaternary_function_type): Add v16uqi_type and xxpermx_type variables, add case statement for FUTURE_BUILTIN_VXXPERMX. (builtin_function_type)[FUTURE_BUILTIN_VXXBLEND_V16QI, FUTURE_BUILTIN_VXXBLEND_V8HI, FUTURE_BUILTIN_VXXBLEND_V4SI, FUTURE_BUILTIN_VXXBLEND_V2DI]: Add case statements. * doc/extend.texi: Add documentation for vec_blendv and vec_permx. gcc/testsuite/ChangeLog 2020-06-18 Carl Love gcc.target/powerpc/vec-blend-runnable.c: New test. gcc.target/powerpc/vec-permute-ext-runnable.c: New test. --- gcc/config/rs6000/altivec.h | 2 + gcc/config/rs6000/altivec.md | 71 +++++ gcc/config/rs6000/rs6000-builtin.def | 13 + gcc/config/rs6000/rs6000-c.c | 28 +- gcc/config/rs6000/rs6000-call.c | 94 ++++++ gcc/doc/extend.texi | 63 ++++ .../gcc.target/powerpc/vec-blend-runnable.c | 276 ++++++++++++++++ .../powerpc/vec-permute-ext-runnable.c | 294 ++++++++++++++++++ 8 files changed, 834 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h index 9ed41b1cbf1..1b532effebe 100644 --- a/gcc/config/rs6000/altivec.h +++ b/gcc/config/rs6000/altivec.h @@ -708,6 +708,8 @@ __altivec_scalar_pred(vec_any_nle, #define vec_splati(a) __builtin_vec_xxspltiw (a) #define vec_splatid(a) __builtin_vec_xxspltid (a) #define vec_splati_ins(a, b, c) __builtin_vec_xxsplti32dx (a, b, c) +#define vec_blendv(a, b, c) __builtin_vec_xxblend (a, b, c) +#define vec_permx(a, b, c, d) __builtin_vec_xxpermx (a, b, c, d) #define vec_gnb(a, b) __builtin_vec_gnb (a, b) #define 
vec_clrl(a, b) __builtin_vec_clrl (a, b) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 25f6b9b2f07..6280586fb2c 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -176,6 +176,8 @@ UNSPEC_XXSPLTIW UNSPEC_XXSPLTID UNSPEC_XXSPLTI32DX + UNSPEC_XXBLEND + UNSPEC_XXPERMX ]) (define_c_enum "unspecv" @@ -218,6 +220,21 @@ (KF "FLOAT128_VECTOR_P (KFmode)") (TF "FLOAT128_VECTOR_P (TFmode)")]) +;; Like VM2, just do char, short, int, long, float and double +(define_mode_iterator VM3 [V4SI + V8HI + V16QI + V4SF + V2DF + V2DI]) + +(define_mode_attr VM3_char [(V2DI "d") + (V4SI "w") + (V8HI "h") + (V16QI "b") + (V2DF "d") + (V4SF "w")]) + ;; Map the Vector convert single precision to double precision for integer ;; versus floating point (define_mode_attr VS_sxwsp [(V4SI "sxw") (V4SF "sp")]) @@ -908,6 +925,60 @@ "xxsplti32dx %x0,%1,%2" [(set_attr "type" "vecsimple")]) +(define_insn "xxblend_<mode>" + [(set (match_operand:VM3 0 "register_operand" "=wa") + (unspec:VM3 [(match_operand:VM3 1 "register_operand" "wa") + (match_operand:VM3 2 "register_operand" "wa") + (match_operand:VM3 3 "register_operand" "wa")] + UNSPEC_XXBLEND))] + "TARGET_FUTURE" + "xxblendv<VM3_char> %x0,%x1,%x2,%x3" + [(set_attr "type" "vecsimple")]) + +(define_expand "xxpermx" + [(set (match_operand:V2DI 0 "register_operand" "+wa") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "wa") + (match_operand:V2DI 2 "register_operand" "wa") + (match_operand:V16QI 3 "register_operand" "wa") + (match_operand:QI 4 "u8bit_cint_operand" "n")] + UNSPEC_XXPERMX))] + "TARGET_FUTURE" +{ + if (BYTES_BIG_ENDIAN) + emit_insn (gen_xxpermx_inst (operands[0], operands[1], + operands[2], operands[3], + operands[4])); + else + { + /* Reverse value of byte element indexes by XORing with 0xFF. + Reverse the 32-byte section identifier match by subtracting bits [0:2] + of element from 7.
*/ + int value = INTVAL (operands[4]); + rtx vreg = gen_reg_rtx (V16QImode); + + emit_insn (gen_xxspltib_v16qi (vreg, GEN_INT (-1))); + emit_insn (gen_xorv16qi3 (operands[3], operands[3], vreg)); + value = 7 - value; + emit_insn (gen_xxpermx_inst (operands[0], operands[2], + operands[1], operands[3], + GEN_INT (value))); + } + + DONE; +} + [(set_attr "type" "vecsimple")]) + +(define_insn "xxpermx_inst" + [(set (match_operand:V2DI 0 "register_operand" "+v") + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "v") + (match_operand:V2DI 2 "register_operand" "v") + (match_operand:V16QI 3 "register_operand" "v") + (match_operand:QI 4 "u3bit_cint_operand" "n")] + UNSPEC_XXPERMX))] + "TARGET_FUTURE" + "xxpermx %x0,%x1,%x2,%x3,%4" + [(set_attr "type" "vecsimple")]) + (define_expand "vstrir_" [(set (match_operand:VIshort 0 "altivec_register_operand") (unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")] diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index c85326de7f2..d1d04f013bb 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -2675,6 +2675,15 @@ BU_FUTURE_V_1 (VXXSPLTID, "vxxspltidp", CONST, xxspltidp_v2df) BU_FUTURE_V_3 (VXXSPLTI32DX_V4SI, "vxxsplti32dx_v4si", CONST, xxsplti32dx_v4si) BU_FUTURE_V_3 (VXXSPLTI32DX_V4SF, "vxxsplti32dx_v4sf", CONST, xxsplti32dx_v4sf) +BU_FUTURE_V_3 (VXXBLEND_V16QI, "xxblend_v16qi", CONST, xxblend_v16qi) +BU_FUTURE_V_3 (VXXBLEND_V8HI, "xxblend_v8hi", CONST, xxblend_v8hi) +BU_FUTURE_V_3 (VXXBLEND_V4SI, "xxblend_v4si", CONST, xxblend_v4si) +BU_FUTURE_V_3 (VXXBLEND_V2DI, "xxblend_v2di", CONST, xxblend_v2di) +BU_FUTURE_V_3 (VXXBLEND_V4SF, "xxblend_v4sf", CONST, xxblend_v4sf) +BU_FUTURE_V_3 (VXXBLEND_V2DF, "xxblend_v2df", CONST, xxblend_v2df) + +BU_FUTURE_V_4 (VXXPERMX, "xxpermx", CONST, xxpermx) + BU_FUTURE_V_1 (VSTRIBR, "vstribr", CONST, vstrir_v16qi) BU_FUTURE_V_1 (VSTRIHR, "vstrihr", CONST, vstrir_v8hi) BU_FUTURE_V_1 (VSTRIBL, 
"vstribl", CONST, vstril_v16qi) @@ -2710,6 +2719,10 @@ BU_FUTURE_OVERLOAD_1 (VSTRIL_P, "stril_p") BU_FUTURE_OVERLOAD_1 (XXSPLTIW, "xxspltiw") BU_FUTURE_OVERLOAD_1 (XXSPLTID, "xxspltid") BU_FUTURE_OVERLOAD_3 (XXSPLTI32DX, "xxsplti32dx") + +BU_FUTURE_OVERLOAD_3 (XXBLEND, "xxblend") +BU_FUTURE_OVERLOAD_4 (XXPERMX, "xxpermx") + /* 1 argument crypto functions. */ BU_CRYPTO_1 (VSBOX, "vsbox", CONST, crypto_vsbox_v2di) diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index 07ca33a89b4..3acbb3ca8a4 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c @@ -1796,22 +1796,36 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl, unsupported_builtin = true; } } - else if (fcode == FUTURE_BUILTIN_VEC_XXEVAL) + else if ((fcode == FUTURE_BUILTIN_VEC_XXEVAL) + || (fcode == FUTURE_BUILTIN_VXXPERMX)) { - /* Need to special case __builtin_vec_xxeval because this takes - 4 arguments, and the existing infrastructure handles no - more than three. */ + signed char op3_type; + + /* Need to special case the builins_xxeval because it takes + 4 arguments, and the existing infrastructure handles three. */ if (nargs != 4) { - error ("builtin %qs requires 4 arguments", - "__builtin_vec_xxeval"); + if (fcode == FUTURE_BUILTIN_VEC_XXEVAL) + error ("builtin %qs requires 4 arguments", + "__builtin_vec_xxeval"); + else + error ("builtin %qs requires 4 arguments", + "__builtin_vec_xxpermx"); + return error_mark_node; } + + /* Set value for vec_xxpermx here as it is a constant. 
*/ + op3_type = RS6000_BTI_V16QI; + for ( ; desc->code == fcode; desc++) { + if (fcode == FUTURE_BUILTIN_VEC_XXEVAL) + op3_type = desc->op3; + if (rs6000_builtin_type_compatible (types[0], desc->op1) && rs6000_builtin_type_compatible (types[1], desc->op2) - && rs6000_builtin_type_compatible (types[2], desc->op3) + && rs6000_builtin_type_compatible (types[2], op3_type) && rs6000_builtin_type_compatible (types[3], RS6000_BTI_UINTSI)) { diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index e36aafaf71c..6770a7f05a2 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -5554,6 +5554,39 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI }, + /* The overloaded XXPERMX definitions are handled specially because the + fourth unsigned char operand is not encoded in this table. */ + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, + RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, + RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, + RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, 
RS6000_BTI_V2DI, + RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, + RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXPERMX, FUTURE_BUILTIN_VXXPERMX, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, + RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_EXTRACTL, FUTURE_BUILTIN_VEXTRACTBL, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI }, @@ -5695,6 +5728,37 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { { FUTURE_BUILTIN_VEC_XXSPLTI32DX, FUTURE_BUILTIN_VXXSPLTI32DX_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_UINTQI, RS6000_BTI_float }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V16QI, + RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, + RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, + RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V8HI, + RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, + RS6000_BTI_unsigned_V8HI }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, + RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V4SI, + RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, + RS6000_BTI_unsigned_V4SI }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, + RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V2DI, + RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, + RS6000_BTI_unsigned_V2DI }, + 
{ FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, + RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V4SF, + RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, + RS6000_BTI_unsigned_V4SI }, + { FUTURE_BUILTIN_VEC_XXBLEND, FUTURE_BUILTIN_VXXBLEND_V2DF, + RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, + RS6000_BTI_unsigned_V2DI }, + { FUTURE_BUILTIN_VEC_SRDB, FUTURE_BUILTIN_VSRDB_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_UINTQI }, @@ -9911,6 +9975,18 @@ rs6000_expand_quaternop_builtin (enum insn_code icode, tree exp, rtx target) } } + else if (icode == CODE_FOR_xxpermx) + { + /* Only allow 3-bit unsigned literals. */ + STRIP_NOPS (arg3); + if (TREE_CODE (arg3) != INTEGER_CST + || TREE_INT_CST_LOW (arg3) & ~0x7) + { + error ("argument 4 must be a 3-bit unsigned literal"); + return CONST0_RTX (tmode); + } + } + if (target == 0 || GET_MODE (target) != tmode || !
(*insn_data[icode].operand[0].predicate) (target, tmode)) @@ -13293,12 +13369,17 @@ builtin_quaternary_function_type (machine_mode mode_ret, tree function_type = NULL; static tree v2udi_type = builtin_mode_to_type[V2DImode][1]; + static tree v16uqi_type = builtin_mode_to_type[V16QImode][1]; static tree uchar_type = builtin_mode_to_type[QImode][1]; static tree xxeval_type = build_function_type_list (v2udi_type, v2udi_type, v2udi_type, v2udi_type, uchar_type, NULL_TREE); + static tree xxpermx_type = + build_function_type_list (v2udi_type, v2udi_type, v2udi_type, + v16uqi_type, uchar_type, NULL_TREE); + switch (builtin) { case FUTURE_BUILTIN_XXEVAL: @@ -13310,6 +13391,15 @@ builtin_quaternary_function_type (machine_mode mode_ret, function_type = xxeval_type; break; + case FUTURE_BUILTIN_VXXPERMX: + gcc_assert ((mode_ret == V2DImode) + && (mode_arg0 == V2DImode) + && (mode_arg1 == V2DImode) + && (mode_arg2 == V16QImode) + && (mode_arg3 == QImode)); + function_type = xxpermx_type; + break; + default: /* A case for each quaternary built-in must be provided above. */ gcc_unreachable (); @@ -13489,6 +13579,10 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, case FUTURE_BUILTIN_VREPLACE_ELT_UV2DI: case FUTURE_BUILTIN_VREPLACE_UN_UV4SI: case FUTURE_BUILTIN_VREPLACE_UN_UV2DI: + case FUTURE_BUILTIN_VXXBLEND_V16QI: + case FUTURE_BUILTIN_VXXBLEND_V8HI: + case FUTURE_BUILTIN_VXXBLEND_V4SI: + case FUTURE_BUILTIN_VXXBLEND_V2DI: h.uns_p[0] = 1; h.uns_p[1] = 1; h.uns_p[2] = 1; diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 37d11e1ef41..63d3a9babe8 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -21185,6 +21185,69 @@ result. The other words of argument 1 are unchanged. 
@findex vec_splati_ins +Vector Blend Variable + +@smallexample +@exdent vector signed char vec_blendv (vector signed char, vector signed char, +vector unsigned char); +@exdent vector unsigned char vec_blendv (vector unsigned char, +vector unsigned char, vector unsigned char); +@exdent vector signed short vec_blendv (vector signed short, +vector signed short, vector unsigned short); +@exdent vector unsigned short vec_blendv (vector unsigned short, +vector unsigned short, vector unsigned short); +@exdent vector signed int vec_blendv (vector signed int, vector signed int, +vector unsigned int); +@exdent vector unsigned int vec_blendv (vector unsigned int, +vector unsigned int, vector unsigned int); +@exdent vector signed long long vec_blendv (vector signed long long, +vector signed long long, vector unsigned long long); +@exdent vector unsigned long long vec_blendv (vector unsigned long long, +vector unsigned long long, vector unsigned long long); +@exdent vector float vec_blendv (vector float, vector float, +vector unsigned int); +@exdent vector double vec_blendv (vector double, vector double, +vector unsigned long long); +@end smallexample + +Blend the first and second argument vectors according to the sign bits of the +corresponding elements of the third argument vector. This is similar to the +vsel and xxsel instructions but for bigger elements. 
+ +@findex vec_blendv + +Vector Permute Extended + +@smallexample +@exdent vector signed char vec_permx (vector signed char, vector signed char, +vector unsigned char, const int); +@exdent vector unsigned char vec_permx (vector unsigned char, +vector unsigned char, vector unsigned char, const int); +@exdent vector signed short vec_permx (vector signed short, +vector signed short, vector unsigned char, const int); +@exdent vector unsigned short vec_permx (vector unsigned short, +vector unsigned short, vector unsigned char, const int); +@exdent vector signed int vec_permx (vector signed int, vector signed int, +vector unsigned char, const int); +@exdent vector unsigned int vec_permx (vector unsigned int, +vector unsigned int, vector unsigned char, const int); +@exdent vector signed long long vec_permx (vector signed long long, +vector signed long long, vector unsigned char, const int); +@exdent vector unsigned long long vec_permx (vector unsigned long long, +vector unsigned long long, vector unsigned char, const int); +@exdent vector float vec_permx (vector float, vector float, +vector unsigned char, const int); +@exdent vector double vec_permx (vector double, vector double, +vector unsigned char, const int); +@end smallexample + +Perform a partial permute of the first two arguments, which form a 32-byte +section of an emulated vector up to 256 bytes wide, using the partial permute +control vector in the third argument. The fourth argument (constrained to +values of 0-7) identifies which 32-byte section of the emulated vector is +contained in the first two arguments.
+@findex vec_permx + @smallexample @exdent vector unsigned long long int @exdent vec_pdep (vector unsigned long long int, vector unsigned long long int) diff --git a/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c new file mode 100644 index 00000000000..70b25be3bcb --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c @@ -0,0 +1,276 @@ +/* { dg-do run } */ +/* { dg-require-effective-target powerpc_future_hw } */ +/* { dg-options "-mdejagnu-cpu=future" } */ +#include <altivec.h> + +#define DEBUG 0 + +#ifdef DEBUG +#include <stdio.h> +#endif + +extern void abort (void); + +int +main (int argc, char *argv []) +{ + int i; + vector signed char vsrc_a_char, vsrc_b_char; + vector signed char vresult_char; + vector signed char expected_vresult_char; + + vector unsigned char vsrc_a_uchar, vsrc_b_uchar, vsrc_c_uchar; + vector unsigned char vresult_uchar; + vector unsigned char expected_vresult_uchar; + + vector signed short vsrc_a_short, vsrc_b_short, vsrc_c_short; + vector signed short vresult_short; + vector signed short expected_vresult_short; + + vector unsigned short vsrc_a_ushort, vsrc_b_ushort, vsrc_c_ushort; + vector unsigned short vresult_ushort; + vector unsigned short expected_vresult_ushort; + + vector int vsrc_a_int, vsrc_b_int, vsrc_c_int; + vector int vresult_int; + vector int expected_vresult_int; + + vector unsigned int vsrc_a_uint, vsrc_b_uint, vsrc_c_uint; + vector unsigned int vresult_uint; + vector unsigned int expected_vresult_uint; + + vector long long int vsrc_a_ll, vsrc_b_ll, vsrc_c_ll; + vector long long int vresult_ll; + vector long long int expected_vresult_ll; + + vector unsigned long long int vsrc_a_ull, vsrc_b_ull, vsrc_c_ull; + vector unsigned long long int vresult_ull; + vector unsigned long long int expected_vresult_ull; + + vector float vresult_f; + vector float expected_vresult_f; + vector float vsrc_a_f, vsrc_b_f; + + vector double vsrc_a_d, vsrc_b_d; + vector double vresult_d; +
vector double expected_vresult_d; + + /* Vector blend */ + vsrc_c_uchar = (vector unsigned char) { 0, 0x80, 0, 0x80, 0, 0x80, 0, 0x80, + 0, 0x80, 0, 0x80, 0, 0x80, 0, 0x80 }; + + vsrc_a_char = (vector signed char) { -1, 3, 5, 7, 9, 11, 13, 15, + 17, 19, 21, 23, 25, 27, 29 }; + vsrc_b_char = (vector signed char) { 2, -4, 6, 8, 10, 12, 14, 16, + 18, 20, 22, 24, 26, 28, 30, 32 }; + vsrc_c_uchar = (vector unsigned char) { 0, 0x80, 0, 0x80, 0, 0x80, 0, 0x80, + 0, 0x80, 0, 0x80, 0, 0x80, 0, 0x80 }; + vresult_char = (vector signed char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_char = (vector signed char) { -1, -4, 5, 8, + 9, 12, 13, 16, + 17, 20, 21, 24, + 25, 28, 29, 32 }; + + vresult_char = vec_blendv (vsrc_a_char, vsrc_b_char, vsrc_c_uchar); + + if (!vec_all_eq (vresult_char, expected_vresult_char)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_char, vsrc_b_char, vsrc_c_uchar)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_char[%d] = %d, expected_vresult_char[%d] = %d\n", + i, vresult_char[i], i, expected_vresult_char[i]); +#else + abort(); +#endif + } + + vsrc_a_uchar = (vector unsigned char) { 1, 3, 5, 7, 9, 11, 13, 15, + 17, 19, 21, 23, 25, 27, 29 }; + vsrc_b_uchar = (vector unsigned char) { 2, 4, 6, 8, 10, 12, 14, 16, + 18, 20, 22, 24, 26, 28, 30, 32 }; + vsrc_c_uchar = (vector unsigned char) { 0, 0x80, 0, 0x80, 0, 0x80, 0, 0x80, + 0, 0x80, 0, 0x80, 0, 0x80, 0, 0x80 }; + vresult_uchar = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_uchar = (vector unsigned char) { 1, 4, 5, 8, + 9, 12, 13, 16, + 17, 20, 21, 24, + 25, 28, 29, 32 }; + + vresult_uchar = vec_blendv (vsrc_a_uchar, vsrc_b_uchar, vsrc_c_uchar); + + if (!vec_all_eq (vresult_uchar, expected_vresult_uchar)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_uchar, vsrc_b_uchar, vsrc_c_uchar)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_uchar[%d] = %d, expected_vresult_uchar[%d] = %d\n", + i, vresult_uchar[i], i, 
expected_vresult_uchar[i]); +#else + abort(); +#endif + } + + vsrc_a_short = (vector signed short) { -1, 3, 5, 7, 9, 11, 13, 15 }; + vsrc_b_short = (vector signed short) { 2, -4, 6, 8, 10, 12, 14, 16 }; + vsrc_c_ushort = (vector unsigned short) { 0, 0x8000, 0, 0x8000, + 0, 0x8000, 0, 0x8000 }; + vresult_short = (vector signed short) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_short = (vector signed short) { -1, -4, 5, 8, + 9, 12, 13, 16 }; + + vresult_short = vec_blendv (vsrc_a_short, vsrc_b_short, vsrc_c_ushort); + + if (!vec_all_eq (vresult_short, expected_vresult_short)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_short, vsrc_b_short, vsrc_c_ushort)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_short[%d] = %d, expected_vresult_short[%d] = %d\n", + i, vresult_short[i], i, expected_vresult_short[i]); +#else + abort(); +#endif + } + + vsrc_a_ushort = (vector unsigned short) { 1, 3, 5, 7, 9, 11, 13, 15 }; + vsrc_b_ushort = (vector unsigned short) { 2, 4, 6, 8, 10, 12, 14, 16 }; + vsrc_c_ushort = (vector unsigned short) { 0, 0x8000, 0, 0x8000, + 0, 0x8000, 0, 0x8000 }; + vresult_ushort = (vector unsigned short) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_ushort = (vector unsigned short) { 1, 4, 5, 8, + 9, 12, 13, 16 }; + + vresult_ushort = vec_blendv (vsrc_a_ushort, vsrc_b_ushort, vsrc_c_ushort); + + if (!vec_all_eq (vresult_ushort, expected_vresult_ushort)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_ushort, vsrc_b_ushort, vsrc_c_ushort)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_ushort[%d] = %d, expected_vresult_ushort[%d] = %d\n", + i, vresult_ushort[i], i, expected_vresult_ushort[i]); +#else + abort(); +#endif + } + + vsrc_a_int = (vector signed int) { -1, -3, -5, -7 }; + vsrc_b_int = (vector signed int) { 2, 4, 6, 8 }; + vsrc_c_uint = (vector unsigned int) { 0, 0x80000000, 0, 0x80000000}; + vresult_int = (vector signed int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector signed int) { -1, 4, -5, 8 }; + + vresult_int = vec_blendv 
(vsrc_a_int, vsrc_b_int, vsrc_c_uint); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_int, vsrc_b_int, vsrc_c_uint)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, expected_vresult_int[i]); +#else + abort(); +#endif + } + + vsrc_a_uint = (vector unsigned int) { 1, 3, 5, 7 }; + vsrc_b_uint = (vector unsigned int) { 2, 4, 6, 8 }; + vsrc_c_uint = (vector unsigned int) { 0, 0x80000000, 0, 0x80000000 }; + vresult_uint = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_uint = (vector unsigned int) { 1, 4, 5, 8 }; + + vresult_uint = vec_blendv (vsrc_a_uint, vsrc_b_uint, vsrc_c_uint); + + if (!vec_all_eq (vresult_uint, expected_vresult_uint)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_uint, vsrc_b_uint, vsrc_c_uint)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_uint[%d] = %u, expected_vresult_uint[%d] = %u\n", + i, vresult_uint[i], i, expected_vresult_uint[i]); +#else + abort(); +#endif + } + + vsrc_a_ll = (vector signed long long int) { -1, -3 }; + vsrc_b_ll = (vector signed long long int) { 2, 4 }; + vsrc_c_ull = (vector unsigned long long int) { 0, 0x8000000000000000ULL }; + vresult_ll = (vector signed long long int) { 0, 0 }; + expected_vresult_ll = (vector signed long long int) { -1, 4 }; + + vresult_ll = vec_blendv (vsrc_a_ll, vsrc_b_ll, vsrc_c_ull); + + if (!vec_all_eq (vresult_ll, expected_vresult_ll)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_ll, vsrc_b_ll, vsrc_c_ull)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ll[%d] = %lld, expected_vresult_ll[%d] = %lld\n", + i, vresult_ll[i], i, expected_vresult_ll[i]); +#else + abort(); +#endif + } + + vsrc_a_ull = (vector unsigned long long) { 1, 3 }; + vsrc_b_ull = (vector unsigned long long) { 2, 4 }; + vsrc_c_ull = (vector unsigned long long int) { 0, 0x8000000000000000ULL }; + vresult_ull = (vector unsigned long long) { 0, 0 }; + expected_vresult_ull = (vector 
unsigned long long) { 1, 4 }; + + vresult_ull = vec_blendv (vsrc_a_ull, vsrc_b_ull, vsrc_c_ull); + + if (!vec_all_eq (vresult_ull, expected_vresult_ull)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_ull, vsrc_b_ull, vsrc_c_ull)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ull[%d] = %llu, expected_vresult_ull[%d] = %llu\n", + i, vresult_ull[i], i, expected_vresult_ull[i]); +#else + abort(); +#endif + } + + vsrc_a_f = (vector float) { -1.0, -3.0, -5.0, -7.0 }; + vsrc_b_f = (vector float) { 2.0, 4.0, 6.0, 8.0 }; + vsrc_c_uint = (vector unsigned int) { 0, 0x80000000, 0, 0x80000000}; + vresult_f = (vector float) { 0, 0, 0, 0 }; + expected_vresult_f = (vector float) { -1, 4, -5, 8 }; + + vresult_f = vec_blendv (vsrc_a_f, vsrc_b_f, vsrc_c_uint); + + if (!vec_all_eq (vresult_f, expected_vresult_f)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_f, vsrc_b_f, vsrc_c_uint)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_f[%d] = %f, expected_vresult_f[%d] = %f\n", + i, vresult_f[i], i, expected_vresult_f[i]); +#else + abort(); +#endif + } + + vsrc_a_d = (vector double) { -1.0, -3.0 }; + vsrc_b_d = (vector double) { 2.0, 4.0 }; + vsrc_c_ull = (vector unsigned long long int) { 0, 0x8000000000000000ULL }; + vresult_d = (vector double) { 0, 0 }; + expected_vresult_d = (vector double) { -1, 4 }; + + vresult_d = vec_blendv (vsrc_a_d, vsrc_b_d, vsrc_c_ull); + + if (!vec_all_eq (vresult_d, expected_vresult_d)) { +#if DEBUG + printf("ERROR, vec_blendv (vsrc_a_d, vsrc_b_d, vsrc_c_ull)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_d[%d] = %f, expected_vresult_d[%d] = %f\n", + i, vresult_d[i], i, expected_vresult_d[i]); +#else + abort(); +#endif + } + + return 0; +} + +/* { dg-final { scan-assembler-times {\msplati\M} 6 } } */ +/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */ + + diff --git a/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c b/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c new file mode 100644 index 00000000000..f5d223d0530 --- 
/dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-permute-ext-runnable.c @@ -0,0 +1,294 @@ +/* { dg-do run } */ +/* { dg-require-effective-target powerpc_future_hw } */ +/* { dg-options "-mdejagnu-cpu=future" } */ +#include <altivec.h> + +#define DEBUG 1 + +#ifdef DEBUG +#include <stdio.h> +#endif + +extern void abort (void); + +int +main (int argc, char *argv []) +{ + int i; + vector signed char vsrc_a_char, vsrc_b_char; + vector signed char vresult_char; + vector signed char expected_vresult_char; + + vector unsigned char vsrc_a_uchar, vsrc_b_uchar, vsrc_c_uchar; + vector unsigned char vresult_uchar; + vector unsigned char expected_vresult_uchar; + + vector signed short vsrc_a_short, vsrc_b_short, vsrc_c_short; + vector signed short vresult_short; + vector signed short expected_vresult_short; + + vector unsigned short vsrc_a_ushort, vsrc_b_ushort, vsrc_c_ushort; + vector unsigned short vresult_ushort; + vector unsigned short expected_vresult_ushort; + + vector int vsrc_a_int, vsrc_b_int, vsrc_c_int; + vector int vresult_int; + vector int expected_vresult_int; + + vector unsigned int vsrc_a_uint, vsrc_b_uint, vsrc_c_uint; + vector unsigned int vresult_uint; + vector unsigned int expected_vresult_uint; + + vector long long int vsrc_a_ll, vsrc_b_ll, vsrc_c_ll; + vector long long int vresult_ll; + vector long long int expected_vresult_ll; + + vector unsigned long long int vsrc_a_ull, vsrc_b_ull, vsrc_c_ull; + vector unsigned long long int vresult_ull; + vector unsigned long long int expected_vresult_ull; + + vector float vresult_f; + vector float expected_vresult_f; + vector float vsrc_a_f, vsrc_b_f; + + vector double vsrc_a_d, vsrc_b_d; + vector double vresult_d; + vector double expected_vresult_d; + + /* Vector permx */ + vsrc_a_char = (vector signed char) { -1, 3, 5, 7, 9, 11, 13, 15, + 17, 19, 21, 23, 25, 27, 29 }; + vsrc_b_char = (vector signed char) { 2, -4, 6, 8, 10, 12, 14, 16, + 18, 20, 22, 24, 26, 28, 30, 32 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x7, 0, 0x5, 0, 
0x3, 0, 0x1, + 0, 0x2, 0, 0x4, 0, 0x6, 0, 0x0 }; + vresult_char = (vector signed char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_char = (vector signed char) { -1, 15, -1, 11, + -1, 7, -1, 3, + -1, 5, -1, 9, + -1, 13, -1, -1 }; + + vresult_char = vec_permx (vsrc_a_char, vsrc_b_char, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_char, expected_vresult_char)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_char, vsrc_b_char, vsrc_c_uchar)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_char[%d] = %d, expected_vresult_char[%d] = %d\n", + i, vresult_char[i], i, expected_vresult_char[i]); +#else + abort(); +#endif + } + + vsrc_a_uchar = (vector unsigned char) { 1, 3, 5, 7, 9, 11, 13, 15, + 17, 19, 21, 23, 25, 27, 29 }; + vsrc_b_uchar = (vector unsigned char) { 2, 4, 6, 8, 10, 12, 14, 16, + 18, 20, 22, 24, 26, 28, 30, 32 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x7, 0, 0x5, 0, 0x3, 0, 0x1, + 0, 0x2, 0, 0x4, 0, 0x6, 0, 0x0 }; + vresult_uchar = (vector unsigned char) { 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_uchar = (vector unsigned char) { 1, 15, 1, 11, + 1, 7, 1, 3, + 1, 5, 1, 9, + 1, 13, 1, 1 }; + + vresult_uchar = vec_permx (vsrc_a_uchar, vsrc_b_uchar, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_uchar, expected_vresult_uchar)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_uchar, vsrc_b_uchar, vsrc_c_uchar)\n"); + for(i = 0; i < 16; i++) + printf(" vresult_uchar[%d] = %d, expected_vresult_uchar[%d] = %d\n", + i, vresult_uchar[i], i, expected_vresult_uchar[i]); +#else + abort(); +#endif + } + + vsrc_a_short = (vector signed short int) { 1, -3, 5, 7, 9, 11, 13, 15 }; + vsrc_b_short = (vector signed short int) { 2, 4, -6, 8, 10, 12, 14, 16 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x1, 0x2, 0x3, + 0x4, 0x5, 0x2, 0x3, + 0x8, 0x9, 0x2, 0x3, + 0x1E, 0x1F, 0x2, 0x3 }; + vresult_short = (vector signed short int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_short = (vector signed short int) { 
1, -3, 5, -3, + 9, -3, 16, -3 }; + + vresult_short = vec_permx (vsrc_a_short, vsrc_b_short, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_short, expected_vresult_short)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_short, vsrc_b_short, vsrc_c_uchar)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_short[%d] = %d, expected_vresult_short[%d] = %d\n", + i, vresult_short[i], i, expected_vresult_short[i]); +#else + abort(); +#endif + } + + vsrc_a_ushort = (vector unsigned short int) { 1, 3, 5, 7, 9, 11, 13, 15 }; + vsrc_b_ushort = (vector unsigned short int) { 2, 4, 6, 8, 10, 12, 14, 16 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x1, 0x2, 0x3, + 0x4, 0x5, 0x2, 0x3, + 0x8, 0x9, 0x2, 0x3, + 0x1E, 0x1F, 0x2, 0x3 }; + vresult_ushort = (vector unsigned short int) { 0, 0, 0, 0, 0, 0, 0, 0 }; + expected_vresult_ushort = (vector unsigned short int) { 1, 3, 5, 3, + 9, 3, 16, 3 }; + + vresult_ushort = vec_permx (vsrc_a_ushort, vsrc_b_ushort, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_ushort, expected_vresult_ushort)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_ushort, vsrc_b_ushort, vsrc_c_uchar)\n"); + for(i = 0; i < 8; i++) + printf(" vresult_ushort[%d] = %d, expected_vresult_ushort[%d] = %d\n", + i, vresult_ushort[i], i, expected_vresult_ushort[i]); +#else + abort(); +#endif + } + + vsrc_a_int = (vector signed int) { 1, -3, 5, 7 }; + vsrc_b_int = (vector signed int) { 2, 4, -6, 8 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x1, 0x2, 0x3, + 0x4, 0x5, 0x6, 0x7, + 0x18, 0x19, 0x1A, 0x1B, + 0x1C, 0x1D, 0x1E, 0x1F }; + vresult_int = (vector signed int) { 0, 0, 0, 0 }; + expected_vresult_int = (vector signed int) { 1, -3, -6, 8 }; + + vresult_int = vec_permx (vsrc_a_int, vsrc_b_int, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_int, expected_vresult_int)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_int, vsrc_b_int, vsrc_c_uchar)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_int[%d] = %d, expected_vresult_int[%d] = %d\n", + i, vresult_int[i], i, 
expected_vresult_int[i]); +#else + abort(); +#endif + } + + vsrc_a_uint = (vector unsigned int) { 1, 3, 5, 7 }; + vsrc_b_uint = (vector unsigned int) { 10, 12, 14, 16 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x1, 0x2, 0x3, + 0x4, 0x5, 0x6, 0x7, + 0x18, 0x19, 0x1A, 0x1B, + 0x1C, 0x1D, 0x1E, 0x1F }; + vresult_uint = (vector unsigned int) { 0, 0, 0, 0 }; + expected_vresult_uint = (vector unsigned int) { 1, 3, 14, 16 }; + + vresult_uint = vec_permx (vsrc_a_uint, vsrc_b_uint, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_uint, expected_vresult_uint)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_uint, vsrc_b_uint, vsrc_c_uchar)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_uint[%d] = %d, expected_vresult_uint[%d] = %d\n", + i, vresult_uint[i], i, expected_vresult_uint[i]); +#else + abort(); +#endif + } + + vsrc_a_ll = (vector signed long long int) { 1, -3 }; + vsrc_b_ll = (vector signed long long int) { 2, -4 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x1, 0x2, 0x3, + 0x4, 0x5, 0x6, 0x7, + 0x18, 0x19, 0x1A, 0x1B, + 0x1C, 0x1D, 0x1E, 0x1F }; + vresult_ll = (vector signed long long int) { 0, 0}; + expected_vresult_ll = (vector signed long long int) { 1, -4 }; + + vresult_ll = vec_permx (vsrc_a_ll, vsrc_b_ll, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_ll, expected_vresult_ll)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_ll, vsrc_b_ll, vsrc_c_uchar)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ll[%d] = %lld, expected_vresult_ll[%d] = %lld\n", + i, vresult_ll[i], i, expected_vresult_ll[i]); +#else + abort(); +#endif + } + + vsrc_a_ull = (vector unsigned long long int) { 1, 3 }; + vsrc_b_ull = (vector unsigned long long int) { 10, 12 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x1, 0x2, 0x3, + 0x4, 0x5, 0x6, 0x7, + 0x18, 0x19, 0x1A, 0x1B, + 0x1C, 0x1D, 0x1E, 0x1F }; + vresult_ull = (vector unsigned long long int) { 0, 0 }; + expected_vresult_ull = (vector unsigned long long int) { 1, 12 }; + + vresult_ull = vec_permx (vsrc_a_ull, 
vsrc_b_ull, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_ull, expected_vresult_ull)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_ull, vsrc_b_ull, vsrc_c_uchar)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_ull[%d] = %llu, expected_vresult_ull[%d] = %llu\n", + i, vresult_ull[i], i, expected_vresult_ull[i]); +#else + abort(); +#endif + } + + vsrc_a_f = (vector float) { -3.0, 5.0, 7.0, 9.0 }; + vsrc_b_f = (vector float) { 2.0, 4.0, 6.0, 8.0 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x1, 0x2, 0x3, + 0x4, 0x5, 0x6, 0x7, + 0x18, 0x19, 0x1A, 0x1B, + 0x1C, 0x1D, 0x1E, 0x1F }; + vresult_f = (vector float) { 0.0, 0.0, 0.0, 0.0 }; + expected_vresult_f = (vector float) { -3.0, 5.0, 6.0, 8.0 }; + + vresult_f = vec_permx (vsrc_a_f, vsrc_b_f, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_f, expected_vresult_f)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_f, vsrc_b_f, vsrc_c_uchar)\n"); + for(i = 0; i < 4; i++) + printf(" vresult_f[%d] = %f, expected_vresult_f[%d] = %f\n", + i, vresult_f[i], i, expected_vresult_f[i]); +#else + abort(); +#endif + } + + vsrc_a_d = (vector double) { 1.0, -3.0 }; + vsrc_b_d = (vector double) { 2.0, -4.0 }; + vsrc_c_uchar = (vector unsigned char) { 0x0, 0x1, 0x2, 0x3, + 0x4, 0x5, 0x6, 0x7, + 0x18, 0x19, 0x1A, 0x1B, + 0x1C, 0x1D, 0x1E, 0x1F }; + vresult_d = (vector double) { 0.0, 0.0 }; + expected_vresult_d = (vector double) { 1.0, -4.0 }; + + vresult_d = vec_permx (vsrc_a_d, vsrc_b_d, vsrc_c_uchar, 0); + + if (!vec_all_eq (vresult_d, expected_vresult_d)) { +#if DEBUG + printf("ERROR, vec_permx (vsrc_a_d, vsrc_b_d, vsrc_c_uchar)\n"); + for(i = 0; i < 2; i++) + printf(" vresult_d[%d] = %f, expected_vresult_d[%d] = %f\n", + i, vresult_d[i], i, expected_vresult_d[i]); +#else + abort(); +#endif + } + + return 0; +} + +/* { dg-final { scan-assembler-times {\mxxpermx\M} 6 } } */ + +