From patchwork Thu Nov 12 19:34:54 2020
X-Patchwork-Submitter: Joel Hutton
X-Patchwork-Id: 1399307
From: Joel Hutton
To: GCC Patches
Cc: Richard Biener
Subject: [2/3][vect] Add widening add, subtract vect patterns
Date: Thu, 12 Nov 2020 19:34:54 +0000
List-Id: Gcc-patches mailing list

Hi all,

This patch adds widening add and widening subtract patterns to
tree-vect-patterns.
All 3 patches together bootstrapped and regression tested on aarch64.
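For readers less familiar with the operation, here is a minimal scalar sketch of what a lo/hi pair of widening adds computes. It is an illustration only, not code from the patch: the helper names (widen_add_half, widen_add_lo, widen_add_hi, widen_add_selfcheck) and the LANES constant are hypothetical, and which half counts as "lo" is assumed to follow little-endian lane numbering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* assumed: eight 16-bit lanes, i.e. one 128-bit vector */

/* Hypothetical scalar model: widen each of the LANES / 2 input lanes
   starting at FIRST to 32 bits, then add them.  */
static void
widen_add_half (uint32_t *out, const uint16_t *a, const uint16_t *b,
                size_t first)
{
  for (size_t i = 0; i < LANES / 2; i++)
    out[i] = (uint32_t) a[first + i] + (uint32_t) b[first + i];
}

/* Models the "lo" half: input lanes 0 .. LANES/2 - 1.  */
static void
widen_add_lo (uint32_t *out, const uint16_t *a, const uint16_t *b)
{
  widen_add_half (out, a, b, 0);
}

/* Models the "hi" half: input lanes LANES/2 .. LANES - 1.  */
static void
widen_add_hi (uint32_t *out, const uint16_t *a, const uint16_t *b)
{
  widen_add_half (out, a, b, LANES / 2);
}

/* Small self-check: the sums below would wrap in 16 bits but are
   exact once the operands are widened to 32 bits.  */
static void
widen_add_selfcheck (void)
{
  const uint16_t a[LANES] = { 65535, 1, 2, 3, 4, 5, 6, 65535 };
  const uint16_t b[LANES] = { 1, 1, 1, 1, 1, 1, 1, 65535 };
  uint32_t lo[LANES / 2], hi[LANES / 2];

  widen_add_lo (lo, a, b);
  widen_add_hi (hi, a, b);
  assert (lo[0] == 65536u);
  assert (hi[3] == 131070u);
}
```

The widening subtract is the same model with `-` in place of `+`; for the signed variants the operands would be sign-extended (int16_t to int32_t) rather than zero-extended.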
gcc/ChangeLog:

2020-11-12  Joel Hutton

        * expr.c (expand_expr_real_2): Add widen_add, widen_subtract cases.
        * optabs-tree.c (optab_for_tree_code): Add optabs for widening adds,
        subtracts.
        * optabs.def (OPTAB_D): Define vectorized widening adds, subtracts.
        * tree-cfg.c (verify_gimple_assign_binary): Add case for widening adds,
        subtracts.
        * tree-inline.c (estimate_operator_cost): Add case for widening adds,
        subtracts.
        * tree-vect-generic.c (expand_vector_operations_1): Add case for
        widening adds, subtracts.
        * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog
        pattern.
        (vect_recog_widen_sub_pattern): New recog pattern.
        (vect_recog_sad_pattern): Update widened add, subtract codes.
        (vect_recog_average_pattern): Update widened add code.
        * tree-vect-stmts.c (vectorizable_conversion): Add case for widened
        add, subtract.
        (supportable_widening_operation): Add case for widened add, subtract.
        * tree.def (WIDEN_ADD_EXPR): New tree code.
        (WIDEN_SUB_EXPR): New tree code.
        (VEC_WIDEN_ADD_HI_EXPR): New tree code.
        (VEC_WIDEN_ADD_LO_EXPR): New tree code.
        (VEC_WIDEN_SUB_HI_EXPR): New tree code.
        (VEC_WIDEN_SUB_LO_EXPR): New tree code.

gcc/testsuite/ChangeLog:

2020-11-12  Joel Hutton

        * gcc.target/aarch64/vect-widen-add.c: New test.
        * gcc.target/aarch64/vect-widen-sub.c: New test.

Ok for trunk?

From e0c10ca554729b9e6d58dbd3f18ba72b2c3ee8bc Mon Sep 17 00:00:00 2001
From: Joel Hutton
Date: Mon, 9 Nov 2020 15:44:18 +0000
Subject: [PATCH 2/3] [vect] Add widening add, subtract patterns

Add widening add, subtract patterns to tree-vect-patterns.
Add aarch64 tests for patterns.
Update vect_recog_sad_pattern to accept the new widening codes.
---
 gcc/expr.c                                    |  6 ++
 gcc/optabs-tree.c                             | 17 ++++
 gcc/optabs.def                                |  8 ++
 .../gcc.target/aarch64/vect-widen-add.c       | 90 +++++++++++++++++++
 .../gcc.target/aarch64/vect-widen-sub.c       | 90 +++++++++++++++++++
 gcc/tree-cfg.c                                |  8 ++
 gcc/tree-inline.c                             |  6 ++
 gcc/tree-vect-generic.c                       |  4 +
 gcc/tree-vect-patterns.c                      | 32 +++++--
 gcc/tree-vect-stmts.c                         | 15 +++-
 gcc/tree.def                                  |  6 ++
 11 files changed, 276 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c

diff --git a/gcc/expr.c b/gcc/expr.c
index ae16f07775870792729e3805436d7f2debafb6ca..ffc8aed5296174066849d9e0d73b1c352c20fd9e 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9034,6 +9034,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
                           target, unsignedp);
       return target;
 
+    case WIDEN_ADD_EXPR:
+    case WIDEN_SUB_EXPR:
     case WIDEN_MULT_EXPR:
       /* If first operand is constant, swap them.
          Thus the following special case checks need only
@@ -9754,6 +9756,10 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
         return temp;
       }
 
+    case VEC_WIDEN_ADD_HI_EXPR:
+    case VEC_WIDEN_ADD_LO_EXPR:
+    case VEC_WIDEN_SUB_HI_EXPR:
+    case VEC_WIDEN_SUB_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 4dfda756932de1693667c39c6fabed043b20b63b..009dccfa3bd298bca7b3b45401a4cc2acc90ff21 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -170,6 +170,23 @@ optab_for_tree_code (enum tree_code code, const_tree type,
       return (TYPE_UNSIGNED (type)
               ? vec_widen_ushiftl_lo_optab : vec_widen_sshiftl_lo_optab);
 
+    case VEC_WIDEN_ADD_LO_EXPR:
+      return (TYPE_UNSIGNED (type)
+              ? vec_widen_uaddl_lo_optab : vec_widen_saddl_lo_optab);
+
+    case VEC_WIDEN_ADD_HI_EXPR:
+      return (TYPE_UNSIGNED (type)
+              ? vec_widen_uaddl_hi_optab : vec_widen_saddl_hi_optab);
+
+    case VEC_WIDEN_SUB_LO_EXPR:
+      return (TYPE_UNSIGNED (type)
+              ? vec_widen_usubl_lo_optab : vec_widen_ssubl_lo_optab);
+
+    case VEC_WIDEN_SUB_HI_EXPR:
+      return (TYPE_UNSIGNED (type)
+              ? vec_widen_usubl_hi_optab : vec_widen_ssubl_hi_optab);
+
+
     case VEC_UNPACK_HI_EXPR:
       return (TYPE_UNSIGNED (type)
               ? vec_unpacku_hi_optab : vec_unpacks_hi_optab);
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 78409aa14537d259bf90277751aac00d452a0d3f..a97cdb360781ca9c743e2991422c600626c75aa5 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -383,6 +383,14 @@ OPTAB_D (vec_widen_smult_even_optab, "vec_widen_smult_even_$a")
 OPTAB_D (vec_widen_smult_hi_optab, "vec_widen_smult_hi_$a")
 OPTAB_D (vec_widen_smult_lo_optab, "vec_widen_smult_lo_$a")
 OPTAB_D (vec_widen_smult_odd_optab, "vec_widen_smult_odd_$a")
+OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
+OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
+OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
+OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
+OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
+OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
+OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
+OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
 OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
 OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
 OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
new file mode 100644
index 0000000000000000000000000000000000000000..fc6966fd9f257170501247411d50428aaaabbdae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -0,0 +1,90 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include <stdint.h>
+#include <string.h>
+
+#define ARR_SIZE 1024
+
+/* Should produce an uaddl */
+void uadd_opt (uint32_t *foo, uint16_t *a, uint16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i] = a[i] + b[i];
+        foo[i+1] = a[i+1] + b[i+1];
+        foo[i+2] = a[i+2] + b[i+2];
+        foo[i+3] = a[i+3] + b[i+3];
+    }
+}
+
+__attribute__((optimize (0)))
+void uadd_nonopt (uint32_t *foo, uint16_t *a, uint16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i] = a[i] + b[i];
+        foo[i+1] = a[i+1] + b[i+1];
+        foo[i+2] = a[i+2] + b[i+2];
+        foo[i+3] = a[i+3] + b[i+3];
+    }
+}
+
+/* Should produce an saddl */
+void sadd_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i] = a[i] + b[i];
+        foo[i+1] = a[i+1] + b[i+1];
+        foo[i+2] = a[i+2] + b[i+2];
+        foo[i+3] = a[i+3] + b[i+3];
+    }
+}
+
+__attribute__((optimize (0)))
+void sadd_nonopt (int32_t *foo, int16_t *a, int16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i] = a[i] + b[i];
+        foo[i+1] = a[i+1] + b[i+1];
+        foo[i+2] = a[i+2] + b[i+2];
+        foo[i+3] = a[i+3] + b[i+3];
+    }
+}
+
+
+void __attribute__((optimize (0)))
+init(uint16_t *a, uint16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE;i++)
+    {
+        a[i] = i;
+        b[i] = 2*i;
+    }
+}
+
+int __attribute__((optimize (0)))
+main()
+{
+    uint32_t foo_arr[ARR_SIZE];
+    uint32_t bar_arr[ARR_SIZE];
+    uint16_t a[ARR_SIZE];
+    uint16_t b[ARR_SIZE];
+
+    init(a, b);
+    uadd_opt(foo_arr, a, b);
+    uadd_nonopt(bar_arr, a, b);
+    if (memcmp(foo_arr, bar_arr, sizeof (foo_arr)) != 0)
+      return 1;
+    sadd_opt((int32_t*) foo_arr, (int16_t*) a, (int16_t*) b);
+    sadd_nonopt((int32_t*) bar_arr, (int16_t*) a, (int16_t*) b);
+    if (memcmp(foo_arr, bar_arr, sizeof (foo_arr)) != 0)
+      return 1;
+    return 0;
+}
+
+/* { dg-final { scan-assembler-times "uaddl\t" 1} } */
+/* { dg-final { scan-assembler-times "uaddl2\t" 1} } */
+/* { dg-final { scan-assembler-times "saddl\t" 1} } */
+/* { dg-final { scan-assembler-times "saddl2\t" 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
new file mode 100644
index 0000000000000000000000000000000000000000..eab252786cd3a24974011c0b4451029ac1194935
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -0,0 +1,90 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps" } */
+#include <stdint.h>
+#include <string.h>
+
+#define ARR_SIZE 1024
+
+/* Should produce an usubl */
+void usub_opt (uint32_t *foo, uint16_t *a, uint16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i] = a[i] - b[i];
+        foo[i+1] = a[i+1] - b[i+1];
+        foo[i+2] = a[i+2] - b[i+2];
+        foo[i+3] = a[i+3] - b[i+3];
+    }
+}
+
+__attribute__((optimize (0)))
+void usub_nonopt (uint32_t *foo, uint16_t *a, uint16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i] = a[i] - b[i];
+        foo[i+1] = a[i+1] - b[i+1];
+        foo[i+2] = a[i+2] - b[i+2];
+        foo[i+3] = a[i+3] - b[i+3];
+    }
+}
+
+/* Should produce an ssubl */
+void ssub_opt (int32_t *foo, int16_t *a, int16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i] = a[i] - b[i];
+        foo[i+1] = a[i+1] - b[i+1];
+        foo[i+2] = a[i+2] - b[i+2];
+        foo[i+3] = a[i+3] - b[i+3];
+    }
+}
+
+__attribute__((optimize (0)))
+void ssub_nonopt (int32_t *foo, int16_t *a, int16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE - 3;i=i+4)
+    {
+        foo[i] = a[i] - b[i];
+        foo[i+1] = a[i+1] - b[i+1];
+        foo[i+2] = a[i+2] - b[i+2];
+        foo[i+3] = a[i+3] - b[i+3];
+    }
+}
+
+
+void __attribute__((optimize (0)))
+init(uint16_t *a, uint16_t *b)
+{
+    for( int i = 0; i < ARR_SIZE;i++)
+    {
+        a[i] = i;
+        b[i] = 2*i;
+    }
+}
+
+int __attribute__((optimize (0)))
+main()
+{
+    uint32_t foo_arr[ARR_SIZE];
+    uint32_t bar_arr[ARR_SIZE];
+    uint16_t a[ARR_SIZE];
+    uint16_t b[ARR_SIZE];
+
+    init(a, b);
+    usub_opt(foo_arr, a, b);
+    usub_nonopt(bar_arr, a, b);
+    if (memcmp(foo_arr, bar_arr, sizeof (foo_arr)) != 0)
+      return 1;
+    ssub_opt((int32_t*) foo_arr, (int16_t*) a, (int16_t*) b);
+    ssub_nonopt((int32_t*) bar_arr, (int16_t*) a, (int16_t*) b);
+    if (memcmp(foo_arr, bar_arr, sizeof (foo_arr)) != 0)
+      return 1;
+    return 0;
+}
+
+/* { dg-final { scan-assembler-times "usubl\t" 1} } */
+/* { dg-final { scan-assembler-times "usubl2\t" 1} } */
+/* { dg-final { scan-assembler-times "ssubl\t" 1} } */
+/* { dg-final { scan-assembler-times "ssubl2\t" 1} } */
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5139f111fecc7ec6e0902145b808308a5e47450b..532692a5da03d6d9653e2d47a1218982b27c4539 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3864,6 +3864,12 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case VEC_WIDEN_SUB_HI_EXPR:
+    case VEC_WIDEN_SUB_LO_EXPR:
+    case VEC_WIDEN_ADD_HI_EXPR:
+    case VEC_WIDEN_ADD_LO_EXPR:
+      return false;
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
@@ -3885,6 +3891,8 @@ verify_gimple_assign_binary (gassign *stmt)
         return false;
       }
 
+    case WIDEN_ADD_EXPR:
+    case WIDEN_SUB_EXPR:
     case PLUS_EXPR:
     case MINUS_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 32424b169c7310c03168baf43cc56a8b6ef2f15b..bff650be79e0c820f619ab333871770658ce93ee 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -4224,6 +4224,8 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
 
     case REALIGN_LOAD_EXPR:
 
+    case WIDEN_ADD_EXPR:
+    case WIDEN_SUB_EXPR:
     case WIDEN_SUM_EXPR:
     case WIDEN_MULT_EXPR:
     case DOT_PROD_EXPR:
@@ -4232,6 +4234,10 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case WIDEN_MULT_MINUS_EXPR:
     case WIDEN_LSHIFT_EXPR:
 
+    case VEC_WIDEN_ADD_HI_EXPR:
+    case VEC_WIDEN_ADD_LO_EXPR:
+    case VEC_WIDEN_SUB_HI_EXPR:
+    case VEC_WIDEN_SUB_LO_EXPR:
     case VEC_WIDEN_MULT_HI_EXPR:
     case VEC_WIDEN_MULT_LO_EXPR:
     case VEC_WIDEN_MULT_EVEN_EXPR:
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index d7bafa77134079faf743c1b482251311abb681c5..940658c6be9c41cdc35b4b72d78fc6c2ed3f2072 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -2118,6 +2118,10 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi,
      arguments, not the widened result.  VEC_UNPACK_FLOAT_*_EXPR is
      calculated in the same way above.  */
   if (code == WIDEN_SUM_EXPR
+      || code == VEC_WIDEN_ADD_HI_EXPR
+      || code == VEC_WIDEN_ADD_LO_EXPR
+      || code == VEC_WIDEN_SUB_HI_EXPR
+      || code == VEC_WIDEN_SUB_LO_EXPR
       || code == VEC_WIDEN_MULT_HI_EXPR
       || code == VEC_WIDEN_MULT_LO_EXPR
       || code == VEC_WIDEN_MULT_EVEN_EXPR
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index f68a87e05ed54145a25ccff598eeef9e57f9a759..331027444a6d95343eb110f7f9c7db19b40ee5ee 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -1086,8 +1086,10 @@ vect_recog_sad_pattern (vec_info *vinfo,
      of the above pattern.  */
 
   tree plus_oprnd0, plus_oprnd1;
-  if (!vect_reassociating_reduction_p (vinfo, stmt_vinfo, PLUS_EXPR,
-                                       &plus_oprnd0, &plus_oprnd1))
+  if (!(vect_reassociating_reduction_p (vinfo, stmt_vinfo, PLUS_EXPR,
+                                        &plus_oprnd0, &plus_oprnd1)
+        || vect_reassociating_reduction_p (vinfo, stmt_vinfo, WIDEN_ADD_EXPR,
+                                           &plus_oprnd0, &plus_oprnd1)))
     return NULL;
 
   tree sum_type = gimple_expr_type (last_stmt);
@@ -1148,7 +1150,7 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, MINUS_EXPR,
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_SUB_EXPR,
                              false, 2, unprom, &half_type))
     return NULL;
 
@@ -1262,6 +1264,24 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
                                       "vect_recog_widen_mult_pattern");
 }
 
+static gimple *
+vect_recog_widen_add_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
+                              tree *type_out)
+{
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+                                      PLUS_EXPR, WIDEN_ADD_EXPR, false,
+                                      "vect_recog_widen_add_pattern");
+}
+
+static gimple *
+vect_recog_widen_sub_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
+                              tree *type_out)
+{
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+                                      MINUS_EXPR, WIDEN_SUB_EXPR, false,
+                                      "vect_recog_widen_sub_pattern");
+}
+
 /* Function vect_recog_pow_pattern
 
    Try to find the following pattern:
@@ -1978,7 +1998,7 @@ vect_recog_average_pattern (vec_info *vinfo,
   vect_unpromoted_value unprom[3];
   tree new_type;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-                                            PLUS_EXPR, false, 3,
+                                            WIDEN_ADD_EXPR, false, 3,
                                             unprom, &new_type);
   if (nops == 0)
     return NULL;
@@ -5249,7 +5269,9 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
      of mask conversion that are needed for gather and scatter
      internal functions.  */
   { vect_recog_gather_scatter_pattern, "gather_scatter" },
-  { vect_recog_mask_conversion_pattern, "mask_conversion" }
+  { vect_recog_mask_conversion_pattern, "mask_conversion" },
+  { vect_recog_widen_add_pattern, "widen_add" },
+  { vect_recog_widen_sub_pattern, "widen_sub" },
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 2c7a8a70913bfc4b9903e9328f4489257ca59e02..f12fd158b13656ee24022ec7e445c53444be6554 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4570,6 +4570,8 @@ vectorizable_conversion (vec_info *vinfo,
   if (!CONVERT_EXPR_CODE_P (code)
       && code != FIX_TRUNC_EXPR
       && code != FLOAT_EXPR
+      && code != WIDEN_ADD_EXPR
+      && code != WIDEN_SUB_EXPR
       && code != WIDEN_MULT_EXPR
       && code != WIDEN_LSHIFT_EXPR)
     return false;
@@ -4615,7 +4617,8 @@ vectorizable_conversion (vec_info *vinfo,
 
   if (op_type == binary_op)
     {
-      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR);
+      gcc_assert (code == WIDEN_MULT_EXPR || code == WIDEN_LSHIFT_EXPR
+                  || code == WIDEN_ADD_EXPR || code == WIDEN_SUB_EXPR);
 
       op1 = gimple_assign_rhs2 (stmt);
       tree vectype1_in;
@@ -11534,6 +11537,16 @@ supportable_widening_operation (vec_info *vinfo,
       c2 = VEC_WIDEN_LSHIFT_HI_EXPR;
       break;
 
+    case WIDEN_ADD_EXPR:
+      c1 = VEC_WIDEN_ADD_LO_EXPR;
+      c2 = VEC_WIDEN_ADD_HI_EXPR;
+      break;
+
+    case WIDEN_SUB_EXPR:
+      c1 = VEC_WIDEN_SUB_LO_EXPR;
+      c2 = VEC_WIDEN_SUB_HI_EXPR;
+      break;
+
     CASE_CONVERT:
       c1 = VEC_UNPACK_LO_EXPR;
       c2 = VEC_UNPACK_HI_EXPR;
diff --git a/gcc/tree.def b/gcc/tree.def
index 6c53fe1bf67cd8eee7084de0b20b8d217d70710a..daaa2f22b384739c2d9fddcb1fd9185099e11788 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1359,6 +1359,8 @@ DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)
    the first argument from type t1 to type t2, and then shifting it
    by the second argument.  */
 DEFTREECODE (WIDEN_LSHIFT_EXPR, "widen_lshift_expr", tcc_binary, 2)
+DEFTREECODE (WIDEN_ADD_EXPR, "widen_add_expr", tcc_binary, 2)
+DEFTREECODE (WIDEN_SUB_EXPR, "widen_sub_expr", tcc_binary, 2)
 
 /* Widening vector multiplication.
    The two operands are vectors with N elements of size S. Multiplying the
@@ -1423,6 +1425,10 @@ DEFTREECODE (VEC_PACK_FLOAT_EXPR, "vec_pack_float_expr", tcc_binary, 2)
  */
 DEFTREECODE (VEC_WIDEN_LSHIFT_HI_EXPR, "widen_lshift_hi_expr", tcc_binary, 2)
 DEFTREECODE (VEC_WIDEN_LSHIFT_LO_EXPR, "widen_lshift_lo_expr", tcc_binary, 2)
+DEFTREECODE (VEC_WIDEN_ADD_HI_EXPR, "widen_add_hi_expr", tcc_binary, 2)
+DEFTREECODE (VEC_WIDEN_ADD_LO_EXPR, "widen_add_lo_expr", tcc_binary, 2)
+DEFTREECODE (VEC_WIDEN_SUB_HI_EXPR, "widen_sub_hi_expr", tcc_binary, 2)
+DEFTREECODE (VEC_WIDEN_SUB_LO_EXPR, "widen_sub_lo_expr", tcc_binary, 2)
 
 /* PREDICT_EXPR.  Specify hint for branch prediction.  The
    PREDICT_EXPR_PREDICTOR specify predictor and PREDICT_EXPR_OUTCOME the
-- 
2.17.1