From patchwork Mon Sep 11 16:18:12 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Carl Love X-Patchwork-Id: 812439 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-461829-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="YnH0HUyn"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xrY585mvDz9s7f for ; Tue, 12 Sep 2017 02:18:32 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:from:to:cc:date:content-type:mime-version :content-transfer-encoding:message-id; q=dns; s=default; b=t86tj LhuFhresfSQWWNaQPtjh7BK0F+Ytu+iyeDeH9Wo8nBFqOpBEFDhhUWdP3/pBB5sT ER2IJv7khg88OEMRaJ87xCyV+JyiXwcRabBfnQ7XIv4v1vBeH3WfkZfW24T1pnzA U8yw5/kmFj7NQ0S2H6N2/D2QyKMATsKTawgpd4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:from:to:cc:date:content-type:mime-version :content-transfer-encoding:message-id; s=default; bh=PWxSfmJnTUG J0lXiYxnmeCNXwIA=; b=YnH0HUynOUJVcfLCqMs+yKuWxrfOzQuv1X1jkFGwDSd 8vIf9t2uvNFHHhfUme7LPwT86OdoMkBDWfkH/q4y2BkaeYKRCf1qW5n18hR3s980 x2rIY4EFhEnnfVqMSlDr1sKgDKepZCKkG2p4dmFOoXtnFcQPOOlkl56iXknx0n5M = Received: (qmail 113744 invoked by alias); 11 Sep 2017 16:18:23 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 113734 invoked by uid 89); 11 Sep 2017 16:18:22 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.1 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=vor, love X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0a-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.156.1) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 11 Sep 2017 16:18:21 +0000 Received: from pps.filterd (m0098394.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id v8BGFSMo004987 for ; Mon, 11 Sep 2017 12:18:19 -0400 Received: from e35.co.us.ibm.com (e35.co.us.ibm.com [32.97.110.153]) by mx0a-001b2d01.pphosted.com with ESMTP id 2cwv64qa69-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Mon, 11 Sep 2017 12:18:19 -0400 Received: from localhost by e35.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 11 Sep 2017 10:18:16 -0600 Received: from b03cxnp08027.gho.boulder.ibm.com (9.17.130.19) by e35.co.us.ibm.com (192.168.1.135) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 11 Sep 2017 10:18:13 -0600 Received: from b03ledav006.gho.boulder.ibm.com (b03ledav006.gho.boulder.ibm.com [9.17.130.237]) by b03cxnp08027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v8BGIDHj65601724; Mon, 11 Sep 2017 09:18:13 -0700 Received: from b03ledav006.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1F161C6042; Mon, 11 Sep 2017 10:18:13 -0600 (MDT) Received: from oc3304648336.ibm.com (unknown [9.70.82.190]) by b03ledav006.gho.boulder.ibm.com (Postfix) with ESMTP id A686FC6037; Mon, 11 Sep 2017 10:18:12 -0600 (MDT) Subject: [PATCH, rs6000] add vectorization to vec_mule and vec_mulo builtins From: Carl Love To: gcc-patches@gcc.gnu.org, David Edelsohn , Segher Boessenkool Cc: Bill Schmidt , cel@us.ibm.com Date: Mon, 11 Sep 2017 09:18:12 -0700 Mime-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 17091116-0012-0000-0000-000014FD705A X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007708; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000227; SDB=6.00915529; UDB=6.00459655; IPR=6.00695760; BA=6.00005584; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017107; XFM=3.00000015; UTC=2017-09-11 16:18:15 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17091116-0013-0000-0000-00004F7196B3 Message-Id: <1505146692.18797.52.camel@us.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-09-11_06:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1709110245 X-IsSubscribed: yes GCC Maintainers: The following patch re-adds the vectorization support for the vec_mule() and vec_mul0() builtins to the tree. The vectorization support was part of the original patch, commit 249424, to add the builtin funtionality. But the vectorization part was pulled as part of commit 250295 due to an underlying bug in the GCC vectorization support. Bill Schmidt fixed the underlying vectorization support in commit 251161 on 8/16/17 allowing the vectorization support to be re-added for these builtins. I have tested the patch on powerpc64le-unknown-linux-gnu (Power 8 LE), powerpc64-unknown-linux-gnu (Power 8 BE) systems with no regressions. Please let me know if the following patch is acceptable. Thanks. Carl Love -------------------------------------------------------------------------------- gcc/ChangeLog: 2017-09-11 Carl Love * config/rs6000/altivec.md (vec_widen_umult_even_v4si, vec_widen_smult_even_v4si): Add define expands for vmuleuw, vmulesw, vmulouw, vmulosw. * config/rs6000/rs6000-builtin.def (VMLEUW, VMULESW, VMULOUW, VMULOSW): Add definitions. * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add ALTIVEC_BUILTIN_VMULESW, ALTIVEC_BUILTIN_VMULEUW, ALTIVEC_BUILTIN_VMULOSW, ALTIVEC_BUILTIN_VMULOUW entries. * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin, builtin_function_type): Add ALTIVEC_BUILTIN_* case statements. --- gcc/config/rs6000/altivec.md | 54 ++++++++++++++++++++++++++++++++++++ gcc/config/rs6000/rs6000-builtin.def | 8 +++--- gcc/config/rs6000/rs6000-c.c | 9 ++++++ gcc/config/rs6000/rs6000.c | 4 +++ 4 files changed, 71 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 0aa1e30..168cdb3 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -1428,6 +1428,32 @@ DONE; }) +(define_expand "vec_widen_umult_even_v4si" + [(use (match_operand:V2DI 0 "register_operand")) + (use (match_operand:V4SI 1 "register_operand")) + (use (match_operand:V4SI 2 "register_operand"))] + "TARGET_P8_VECTOR" +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_vmuleuw (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulouw (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "vec_widen_smult_even_v4si" + [(use (match_operand:V2DI 0 "register_operand")) + (use (match_operand:V4SI 1 "register_operand")) + (use (match_operand:V4SI 2 "register_operand"))] + "TARGET_P8_VECTOR" +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_vmulesw (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmulosw (operands[0], operands[1], operands[2])); + DONE; +}) + (define_expand "vec_widen_umult_odd_v16qi" [(use (match_operand:V8HI 0 "register_operand" "")) (use (match_operand:V16QI 1 "register_operand" "")) @@ -1480,6 +1506,34 @@ DONE; }) +(define_expand "vec_widen_umult_odd_v4si" + [(use (match_operand:V2DI 0 "register_operand")) + (use (match_operand:V4SI 1 "register_operand")) + (use (match_operand:V4SI 2 "register_operand"))] + "TARGET_P8_VECTOR" +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_vmulouw (operands[0], operands[1], operands[2])); + else + emit_insn (gen_altivec_vmuleuw (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "vec_widen_smult_odd_v4si" + [(use (match_operand:V2DI 0 "register_operand" "")) + (use (match_operand:V4SI 1 "register_operand" "")) + (use (match_operand:V4SI 2 "register_operand" ""))] + "TARGET_P8_VECTOR" +{ + if (VECTOR_ELT_ORDER_BIG) + emit_insn (gen_altivec_vmulosw (operands[0], operands[1], + operands[2])); + else + emit_insn (gen_altivec_vmulesw (operands[0], operands[1], + operands[2])); + DONE; +}) + (define_insn "altivec_vmuleub" [(set (match_operand:V8HI 0 "register_operand" "=v") (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v") diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index 850164a..18db576 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1031,14 +1031,14 @@ BU_ALTIVEC_2 (VMULEUB, "vmuleub", CONST, vec_widen_umult_even_v16qi) BU_ALTIVEC_2 (VMULESB, "vmulesb", CONST, vec_widen_smult_even_v16qi) BU_ALTIVEC_2 (VMULEUH, "vmuleuh", CONST, vec_widen_umult_even_v8hi) BU_ALTIVEC_2 (VMULESH, "vmulesh", CONST, vec_widen_smult_even_v8hi) -BU_ALTIVEC_2 (VMULEUW, "vmuleuw", CONST, altivec_vmuleuw) -BU_ALTIVEC_2 (VMULESW, "vmulesw", CONST, altivec_vmulesw) +BU_ALTIVEC_2 (VMULEUW, "vmuleuw", CONST, vec_widen_umult_even_v4si) +BU_ALTIVEC_2 (VMULESW, "vmulesw", CONST, vec_widen_smult_even_v4si) BU_ALTIVEC_2 (VMULOUB, "vmuloub", CONST, vec_widen_umult_odd_v16qi) BU_ALTIVEC_2 (VMULOSB, "vmulosb", CONST, vec_widen_smult_odd_v16qi) BU_ALTIVEC_2 (VMULOUH, "vmulouh", CONST, vec_widen_umult_odd_v8hi) BU_ALTIVEC_2 (VMULOSH, "vmulosh", CONST, vec_widen_smult_odd_v8hi) -BU_ALTIVEC_2 (VMULOUW, "vmulouw", CONST, altivec_vmulouw) -BU_ALTIVEC_2 (VMULOSW, "vmulosw", CONST, altivec_vmulosw) +BU_ALTIVEC_2 (VMULOUW, "vmulouw", CONST, vec_widen_umult_odd_v4si) +BU_ALTIVEC_2 (VMULOSW, "vmulosw", CONST, vec_widen_smult_odd_v4si) BU_ALTIVEC_2 (VNOR, "vnor", CONST, norv4si3) BU_ALTIVEC_2 (VOR, "vor", CONST, iorv4si3) BU_ALTIVEC_2 (VPKUHUM, "vpkuhum", CONST, altivec_vpkuhum) diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index b2df850..fbab0a2 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c @@ -2212,6 +2212,10 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULESH, ALTIVEC_BUILTIN_VMULESH, RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, + { ALTIVEC_BUILTIN_VEC_VMULEUW, ALTIVEC_BUILTIN_VMULEUW, + RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_VMULESW, ALTIVEC_BUILTIN_VMULESW, + RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOUB, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSB, @@ -2233,6 +2237,11 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = { RS6000_BTI_V8HI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULOUB, ALTIVEC_BUILTIN_VMULOUB, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, + { ALTIVEC_BUILTIN_VEC_VMULOUW, ALTIVEC_BUILTIN_VMULOUW, + RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_VMULOSW, ALTIVEC_BUILTIN_VMULOSW, + RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, + { ALTIVEC_BUILTIN_VEC_NABS, ALTIVEC_BUILTIN_NABS_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 }, { ALTIVEC_BUILTIN_VEC_NABS, ALTIVEC_BUILTIN_NABS_V8HI, diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ecdf776..a5d5302 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -16214,9 +16214,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) /* Even element flavors of vec_mul (signed). */ case ALTIVEC_BUILTIN_VMULESB: case ALTIVEC_BUILTIN_VMULESH: + case ALTIVEC_BUILTIN_VMULESW: /* Even element flavors of vec_mul (unsigned). */ case ALTIVEC_BUILTIN_VMULEUB: case ALTIVEC_BUILTIN_VMULEUH: + case ALTIVEC_BUILTIN_VMULEUW: { arg0 = gimple_call_arg (stmt, 0); arg1 = gimple_call_arg (stmt, 1); @@ -16229,9 +16231,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) /* Odd element flavors of vec_mul (signed). */ case ALTIVEC_BUILTIN_VMULOSB: case ALTIVEC_BUILTIN_VMULOSH: + case ALTIVEC_BUILTIN_VMULOSW: /* Odd element flavors of vec_mul (unsigned). */ case ALTIVEC_BUILTIN_VMULOUB: case ALTIVEC_BUILTIN_VMULOUH: + case ALTIVEC_BUILTIN_VMULOUW: { arg0 = gimple_call_arg (stmt, 0); arg1 = gimple_call_arg (stmt, 1);