From patchwork Fri Sep 25 14:27:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371336
Date: Fri, 25 Sep 2020 15:27:23 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz
Subject: [PATCH v2 1/16]middle-end: Refactor refcnt to use SLP_TREE_REF_COUNT for consistency
Message-ID: <20200925142721.GA11407@arm.com>

Hi All,

This is a small refactoring which introduces SLP_TREE_REF_COUNT and replaces
the direct uses of refcnt with it. This is for consistency with the other
properties.

A similar patch was pre-approved last year, but since there are more uses now
I am sending it for review anyway.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vectorizer.h (SLP_TREE_REF_COUNT): New.
	* tree-vect-slp.c (_slp_tree::_slp_tree, _slp_tree::~_slp_tree,
	vect_free_slp_tree, vect_build_slp_tree, vect_print_slp_tree,
	slp_copy_subtree, vect_attempt_slp_rearrange_stmts): Use it.

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index c44fd396bf0b69a4153e46026c545bebb3797551..bf8ea4326597f4211d2772e9db60aa69285b5998 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -66,7 +66,7 @@ _slp_tree::_slp_tree ()
   SLP_TREE_CODE (this) = ERROR_MARK;
   SLP_TREE_VECTYPE (this) = NULL_TREE;
   SLP_TREE_REPRESENTATIVE (this) = NULL;
-  this->refcnt = 1;
+  SLP_TREE_REF_COUNT (this) = 1;
   this->max_nunits = 1;
   this->lanes = 0;
 }
@@ -92,7 +92,7 @@ vect_free_slp_tree (slp_tree node)
   int i;
   slp_tree child;
 
-  if (--node->refcnt != 0)
+  if (--SLP_TREE_REF_COUNT (node) != 0)
     return;
 
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
@@ -1180,7 +1180,7 @@ vect_build_slp_tree (vec_info *vinfo,
                          *leader ? "" : "failed ", *leader);
       if (*leader)
         {
-          (*leader)->refcnt++;
+          SLP_TREE_REF_COUNT (*leader)++;
           vect_update_max_nunits (max_nunits, (*leader)->max_nunits);
         }
       return *leader;
@@ -1194,7 +1194,7 @@ vect_build_slp_tree (vec_info *vinfo,
       res->max_nunits = this_max_nunits;
       vect_update_max_nunits (max_nunits, this_max_nunits);
       /* Keep a reference for the bst_map use.  */
-      res->refcnt++;
+      SLP_TREE_REF_COUNT (res)++;
     }
   bst_map->put (stmts.copy (), res);
   return res;
@@ -1590,7 +1590,7 @@ fail:
       SLP_TREE_CHILDREN (two).safe_splice (children);
       slp_tree child;
       FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (two), i, child)
-        child->refcnt++;
+        SLP_TREE_REF_COUNT (child)++;
 
       /* Here we record the original defs since this node represents the
          final lane configuration.  */
@@ -1650,7 +1650,8 @@ vect_print_slp_tree (dump_flags_t dump_kind, dump_location_t loc,
                    : (SLP_TREE_DEF_TYPE (node) == vect_constant_def
                       ? " (constant)"
                       : ""), node,
-                   estimated_poly_value (node->max_nunits), node->refcnt);
+                   estimated_poly_value (node->max_nunits),
+                   SLP_TREE_REF_COUNT (node));
   if (SLP_TREE_SCALAR_STMTS (node).exists ())
     FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
       dump_printf_loc (metadata, user_loc, "\tstmt %u %G", i, stmt_info->stmt);
@@ -1802,7 +1803,7 @@ slp_copy_subtree (slp_tree node, hash_map &map)
   SLP_TREE_REPRESENTATIVE (copy) = SLP_TREE_REPRESENTATIVE (node);
   SLP_TREE_LANES (copy) = SLP_TREE_LANES (node);
   copy->max_nunits = node->max_nunits;
-  copy->refcnt = 0;
+  SLP_TREE_REF_COUNT (copy) = 0;
   if (SLP_TREE_SCALAR_STMTS (node).exists ())
     SLP_TREE_SCALAR_STMTS (copy) = SLP_TREE_SCALAR_STMTS (node).copy ();
   if (SLP_TREE_SCALAR_OPS (node).exists ())
@@ -1819,7 +1820,7 @@ slp_copy_subtree (slp_tree node, hash_map &map)
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (copy), i, child)
     {
       SLP_TREE_CHILDREN (copy)[i] = slp_copy_subtree (child, map);
-      SLP_TREE_CHILDREN (copy)[i]->refcnt++;
+      SLP_TREE_REF_COUNT (SLP_TREE_CHILDREN (copy)[i])++;
     }
   return copy;
 }
@@ -1935,7 +1936,7 @@
vect_attempt_slp_rearrange_stmts (slp_instance slp_instn)
   hash_map map;
   slp_tree unshared = slp_copy_subtree (SLP_INSTANCE_TREE (slp_instn), map);
   vect_free_slp_tree (SLP_INSTANCE_TREE (slp_instn));
-  unshared->refcnt++;
+  SLP_TREE_REF_COUNT (unshared)++;
   SLP_INSTANCE_TREE (slp_instn) = unshared;
   FOR_EACH_VEC_ELT (SLP_INSTANCE_LOADS (slp_instn), i, node)
     SLP_INSTANCE_LOADS (slp_instn)[i] = *map.get (node);
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 9dffc5570e51b21c2f5c02b80a9f49d25a183284..2ebcf9f9926ec7175f28391f172800499bbc59db 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -204,6 +204,7 @@ public:
 #define SLP_TREE_CHILDREN(S)                     (S)->children
 #define SLP_TREE_SCALAR_STMTS(S)                 (S)->stmts
 #define SLP_TREE_SCALAR_OPS(S)                   (S)->ops
+#define SLP_TREE_REF_COUNT(S)                    (S)->refcnt
 #define SLP_TREE_VEC_STMTS(S)                    (S)->vec_stmts
 #define SLP_TREE_VEC_DEFS(S)                     (S)->vec_defs
 #define SLP_TREE_NUMBER_OF_VEC_STMTS(S)          (S)->vec_stmts_size

From patchwork Fri Sep 25 14:27:42 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371338
Date: Fri, 25 Sep 2020 15:27:42 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz
Subject: [PATCH v2 2/16]middle-end: Refactor and expose some vectorizer helper functions.
Message-ID: <20200925142740.GA12448@arm.com>

Hi All,

This is a small refactoring which exposes some helper functions in the
vectorizer so they can be used in other places.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vect-patterns.c (vect_mark_pattern_stmts): Remove static.
	* tree-vect-slp.c (vect_free_slp_tree, vect_build_slp_tree): Remove
	static.
	(struct bst_traits, bst_traits::hash, bst_traits::equal): Move...
	* tree-vectorizer.h (struct bst_traits, bst_traits::hash,
	bst_traits::equal): ... to here.
	(vect_mark_pattern_stmts, vect_free_slp_tree, vect_build_slp_tree):
	Declare.

diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index db45740da3cba14a3552f9446651e8f289187fbb..3bacd5c827e1a6436c5916022c04e0d6594c316a 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -5169,7 +5169,7 @@ const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
 
 /* Mark statements that are involved in a pattern.  */
 
-static inline void
+void
 vect_mark_pattern_stmts (vec_info *vinfo,
                          stmt_vec_info orig_stmt_info, gimple *pattern_stmt,
                          tree pattern_vectype)
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index bf8ea4326597f4211d2772e9db60aa69285b5998..01189d44d892fc42b132bbb7de1c471df45518ae 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -86,7 +86,7 @@ _slp_tree::~_slp_tree ()
 
 /* Recursively free the memory allocated for the SLP tree rooted at NODE.
    */
 
-static void
+void
 vect_free_slp_tree (slp_tree node)
 {
   int i;
@@ -1120,45 +1120,6 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
   return true;
 }
 
-/* Traits for the hash_set to record failed SLP builds for a stmt set.
-   Note we never remove apart from at destruction time so we do not
-   need a special value for deleted that differs from empty.  */
-struct bst_traits
-{
-  typedef vec value_type;
-  typedef vec compare_type;
-  static inline hashval_t hash (value_type);
-  static inline bool equal (value_type existing, value_type candidate);
-  static inline bool is_empty (value_type x) { return !x.exists (); }
-  static inline bool is_deleted (value_type x) { return !x.exists (); }
-  static const bool empty_zero_p = true;
-  static inline void mark_empty (value_type &x) { x.release (); }
-  static inline void mark_deleted (value_type &x) { x.release (); }
-  static inline void remove (value_type &x) { x.release (); }
-};
-inline hashval_t
-bst_traits::hash (value_type x)
-{
-  inchash::hash h;
-  for (unsigned i = 0; i < x.length (); ++i)
-    h.add_int (gimple_uid (x[i]->stmt));
-  return h.end ();
-}
-inline bool
-bst_traits::equal (value_type existing, value_type candidate)
-{
-  if (existing.length () != candidate.length ())
-    return false;
-  for (unsigned i = 0; i < existing.length (); ++i)
-    if (existing[i] != candidate[i])
-      return false;
-  return true;
-}
-
-typedef hash_map , slp_tree,
-                  simple_hashmap_traits >
-  scalar_stmts_to_slp_tree_map_t;
-
 static slp_tree
 vect_build_slp_tree_2 (vec_info *vinfo,
                        vec stmts, unsigned int group_size,
@@ -1166,7 +1127,7 @@ vect_build_slp_tree_2 (vec_info *vinfo,
                        bool *matches, unsigned *npermutes, unsigned *tree_size,
                        scalar_stmts_to_slp_tree_map_t *bst_map);
 
-static slp_tree
+slp_tree
 vect_build_slp_tree (vec_info *vinfo,
                      vec stmts, unsigned int group_size,
                      poly_uint64 *max_nunits,
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 2ebcf9f9926ec7175f28391f172800499bbc59db..79926f1a43534635ddca85556a928e364022c40a 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2047,6 +2047,9 @@ extern int vect_get_place_in_interleaving_chain (stmt_vec_info, stmt_vec_info);
 extern bool vect_update_shared_vectype (stmt_vec_info, tree);
 
 /* In tree-vect-patterns.c.  */
+extern void
+vect_mark_pattern_stmts (vec_info *, stmt_vec_info, gimple *, tree);
+
 /* Pattern recognition functions.
    Additional pattern recognition functions can (and will) be added
    in the future.  */
@@ -2058,4 +2061,51 @@ void vect_free_loop_info_assumptions (class loop *);
 gimple *vect_loop_vectorized_call (class loop *, gcond **cond = NULL);
 bool vect_stmt_dominates_stmt_p (gimple *, gimple *);
 
+/* Traits for the hash_set to record failed SLP builds for a stmt set.
+   Note we never remove apart from at destruction time so we do not
+   need a special value for deleted that differs from empty.  */
+struct bst_traits
+{
+  typedef vec value_type;
+  typedef vec compare_type;
+  static inline hashval_t hash (value_type);
+  static inline bool equal (value_type existing, value_type candidate);
+  static inline bool is_empty (value_type x) { return !x.exists (); }
+  static inline bool is_deleted (value_type x) { return !x.exists (); }
+  static const bool empty_zero_p = true;
+  static inline void mark_empty (value_type &x) { x.release (); }
+  static inline void mark_deleted (value_type &x) { x.release (); }
+  static inline void remove (value_type &x) { x.release (); }
+};
+inline hashval_t
+bst_traits::hash (value_type x)
+{
+  inchash::hash h;
+  for (unsigned i = 0; i < x.length (); ++i)
+    h.add_int (gimple_uid (x[i]->stmt));
+  return h.end ();
+}
+inline bool
+bst_traits::equal (value_type existing, value_type candidate)
+{
+  if (existing.length () != candidate.length ())
+    return false;
+  for (unsigned i = 0; i < existing.length (); ++i)
+    if (existing[i] != candidate[i])
+      return false;
+  return true;
+}
+
+typedef hash_map , slp_tree,
+
simple_hashmap_traits > + scalar_stmts_to_slp_tree_map_t; + +extern void +vect_free_slp_tree (slp_tree node); + +slp_tree +vect_build_slp_tree (vec_info *, vec, unsigned int, + poly_uint64 *, bool *, unsigned *, unsigned *, + scalar_stmts_to_slp_tree_map_t *); + #endif /* GCC_TREE_VECTORIZER_H */ From patchwork Fri Sep 25 14:27:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1371337 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=xsV+Adg+; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=xsV+Adg+; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4ByZ5X1g00z9sRf for ; Sat, 26 Sep 2020 00:28:16 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 861AD398B43E; Fri, 25 Sep 2020 14:28:13 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR02-AM5-obe.outbound.protection.outlook.com 
(mail-eopbgr00056.outbound.protection.outlook.com [40.107.0.56]) by sourceware.org (Postfix) with ESMTPS id 2FFCF398B400 for ; Fri, 25 Sep 2020 14:28:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 2FFCF398B400 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=Tamar.Christina@arm.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=KaeF7BRIA8pqHbeHsbsPOqZkl6JeLnIG5w7R+xyV7W4=; b=xsV+Adg+BQa305oAEYMkcxDEUVfCBdkjc5v38nGj3/HEHOsyjt7keFBSoLaL6pkULags7wiQ0z50ep3BhjRJ+g9pOehGRoQqNS5gCUTDONpGKsgDbPHb3eUXjE4mh3SYQ2tjnMZchE2v7XZYgk6Ujt7cJvKH6qLFMyYvc4GmIi4= Received: from AM6P192CA0096.EURP192.PROD.OUTLOOK.COM (2603:10a6:209:8d::37) by AM8PR08MB5585.eurprd08.prod.outlook.com (2603:10a6:20b:1c5::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3391.13; Fri, 25 Sep 2020 14:28:06 +0000 Received: from VE1EUR03FT022.eop-EUR03.prod.protection.outlook.com (2603:10a6:209:8d:cafe::15) by AM6P192CA0096.outlook.office365.com (2603:10a6:209:8d::37) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.22 via Frontend Transport; Fri, 25 Sep 2020 14:28:06 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by 
VE1EUR03FT022.mail.protection.outlook.com (10.152.18.64) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:28:05 +0000 Received: ("Tessian outbound 7fc8f57bdedc:v64"); Fri, 25 Sep 2020 14:28:05 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 4f76d2a5fc9f5bce X-CR-MTA-TID: 64aa7808 Received: from 4c64cc75e31f.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 04EBDA76-FB99-4D86-A2BC-08C659E59B58.1; Fri, 25 Sep 2020 14:27:59 +0000 Received: from EUR03-DB5-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 4c64cc75e31f.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 25 Sep 2020 14:27:59 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Ouli7vm3dNX6YR0TiheLiwIaOv1Kou3LswrrcTnRaQZarlnpaaYLbrjJrfzcHspd3VLuGxESNXA95O90cTDYdDZhvoVZxVkgDT1ZukCC76c8OoM369EuXFNpe7SVLppnIQcLWao37iH339jVCOQCOe1i7ecjTO7JXOPjXQf5IPTIND2wj6JAsK/mOSbFwRgzSR9frE6YCYHf+QGL/+TUzK0ZPtHJv/DmPwnOgAEqR8RUV2nSshluuZt/DC5qAOEqgFpsf2eRPW6ZoF6RLjTYhWsjeYIoHLrWJEDKjFSyMfTaINt8kmqX2Hk0tGDklZLltIhzp+KTZlkIS/v32mbbBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=KaeF7BRIA8pqHbeHsbsPOqZkl6JeLnIG5w7R+xyV7W4=; b=HpVBbGFjW3LCCdbnSvIkx/8NdEp8ygxNmeRT9BKFE26TU+/a/W46cCoo3Sxp1yfRfR5evwxxs5zN3J3aP5lK9x04fcr9OBNDoRqBDFhfaxF6uco43V/Jz80GLSzgZXPinfFLKSqG6gFcRyUvUhjG+AgyrjgzH6VUWJP1mDmO8NThZq8KifTW68wQNHxxBi24lQSW4aymUnsXaUYDoUwG+KLHFrwMRYhYQiKFYCiMGf3DTHWdADf+dlWgApy9XTWbSy7toPlFvXgl+miJtGeq1N21biKn1XPw48UqY90gKiO1SAxA7ExQ0DLyWkgJeCYqwJVRgSFYsTvHzbz96WdnsQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=KaeF7BRIA8pqHbeHsbsPOqZkl6JeLnIG5w7R+xyV7W4=; b=xsV+Adg+BQa305oAEYMkcxDEUVfCBdkjc5v38nGj3/HEHOsyjt7keFBSoLaL6pkULags7wiQ0z50ep3BhjRJ+g9pOehGRoQqNS5gCUTDONpGKsgDbPHb3eUXjE4mh3SYQ2tjnMZchE2v7XZYgk6Ujt7cJvKH6qLFMyYvc4GmIi4= Authentication-Results-Original: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by VE1PR08MB5678.eurprd08.prod.outlook.com (2603:10a6:800:1a0::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3391.11; Fri, 25 Sep 2020 14:27:58 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2%7]) with mapi id 15.20.3412.024; Fri, 25 Sep 2020 14:27:58 +0000 Date: Fri, 25 Sep 2020 15:27:55 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Subject: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding. 
Message-ID: <20200925142753.GA13692@arm.com> Content-Disposition: inline User-Agent: Mutt/1.9.4 (2018-02-28) X-ClientProxiedBy: LO2P265CA0066.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:60::30) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from arm.com (217.140.106.53) by LO2P265CA0066.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:60::30) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:27:57 +0000 X-Originating-IP: [217.140.106.53] X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: 8914fb13-1e7c-493d-a726-08d8615f37e6 X-MS-TrafficTypeDiagnostic: VE1PR08MB5678:|AM8PR08MB5585: X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true NoDisclaimer: true X-MS-Oob-TLC-OOBClassifiers: OLM:9508;OLM:9508; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: c9ZXeu/fM2yJojM4egdyVE7GSZR6KfrhnloU0wAPPtFBsNIFS1Z3o/fOA0z8R3zJvBQEAYyjJO2/EWLMfQkMzIVqNMS2kDXSgHSiAUvgsCoVg092XJT5OsaR6wTdu0d2zA2y0o2kGGKt37Z2YDaxNHsTcF++2s7HidLfpK3QSFEc16pvgYCcyglbvLmkXtGVEkRV5Li/Rt2dUJH1m2S0khCUUs53WgHTmjmUenPZkB/3lZ0rIMrKfkXcfOg7A9ytIFVe+MjQNBQ+hczIusqWdZTYagms/QGbni2Morz6PWxoQmxLxppfwuhVT+E+nghsYTSRZ89Y3G7NKKm920Fu9j4S8lOXkfEpqx6D6jdRWG6acCM1H9Y7pMwaWVW6q/lxT5SuHJ3+W7qKtovHVX4FkPadhnG1L0ABwvCZQ2u4X3A= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(4636009)(396003)(366004)(136003)(39860400002)(346002)(376002)(8936002)(1076003)(55016002)(33656002)(8886007)(66946007)(66556008)(86362001)(66616009)(83380400001)(36756003)(66476007)(478600001)(6916009)(235185007)(44144004)(5660300002)(16526019)(186003)(4743002)(4326008)(33964004)(52116002)(44832011)(7696005)(316002)(2616005)(26005)(2906002)(8676002)(956004)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData: SqhKOtXzpmStlDLpRQ8UTtZ2VeuaAub+vuUjjP2XdkVfQWzsLLWOfYoB3IsHgUiPQnArW6centCWEsctrWKfdY5wBzTS84oXCcS4xRy/PoSbk8lagvCdgEvsibvizMdqmQGpfTA7BHRGNEalHeR7XUoKVVT4j9Gz9dQ97rbmUqufDmgc9j+bvswi0KY2jKo6g1nvCPIaQER7uimGHIEsI9lYOeEKjhzGZ3N/qXG4xfwgcofMaxFe7VacXx4Buvw1N5HtVCBGoBTxkfnwRWsVN5JfaoZXWU5oVz13StdW04x2a0SIM9+Y0NIKA/fH4NcW3quoiss1wHvPryAsYl4i2LWLwCJqAiW8itHsYhq8YaKUagWjFSkpCsLRe4etn/ohGm0dPY6ZyCuPANqmQVpt4rjny5lzw5lscUA+Euzafzihp2fIiTTzxKlUhqT3VNfqYRhN4j7+0LlnKZIehxnH7WucHsYrbvnPsXcSEnr9SPLT+UScgb9kiu+9WwxNftax2jMmPHch+WyrM0HXHGVe5FqApnuoFX6fU9b0hkzTw8bduQrOIL2jgz4asOKgQsNFVqAF+LL3XMdXwSM+P2GVXMQKLUcLRe9euPChZejmqK6+GRZMkNLL1h0cYQ16wuxwpg+QTKtXTWQHPDEZKoJ4jA== X-MS-Exchange-Transport-Forked: True X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5678 Original-Authentication-Results: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: VE1EUR03FT022.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: da61759d-4163-49db-9ecb-08d8615f32f9 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
dF0QZTVrGa8bdVOo3/lyCFORSHqZev/Kv2xB99WT83ZS2cbfeom68lM/1EJjZGsykDGY4m0z6Zj/hcprXSY7/vyK+UWlQMueLRsiOaUPD+q+oBvhTmwusyZYx0y5Ixm5+q560JCqjJoOhCpKPQTdQiGG1fuAFNsMdJ/KOHBbXkBdund+hbdPeWbC8pjxsyfaF8Qn6rryae24wMybvOKWMTRBIXRsweA9D5WOjT+falsCnZCbhadw5yJEU2RtUGXXDN/DvG3pnCQyjyq2HmiqnCO8qyxQKYyITlrX2lPNDomz74ky7+vsCm/kcyFwJfWVZl8RKYNd04GeCzMfod7sfKoiTPhJrJVqBbVFKMVPsQtXaxhsxEhA9kE+j6Py01p0ZmBM42/QN51pL6sfFvSCpIEVmpV4SKwixu74xW1jBFg= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(396003)(346002)(136003)(376002)(39860400002)(46966005)(70586007)(83380400001)(86362001)(70206006)(235185007)(36756003)(1076003)(16526019)(8676002)(82310400003)(2906002)(33656002)(44144004)(4326008)(33964004)(7696005)(478600001)(82740400003)(356005)(6916009)(55016002)(66616009)(186003)(5660300002)(2616005)(956004)(81166007)(47076004)(36906005)(316002)(336012)(44832011)(8936002)(4743002)(8886007)(26005)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Sep 2020 14:28:05.8638 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 8914fb13-1e7c-493d-a726-08d8615f37e6 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: VE1EUR03FT022.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM8PR08MB5585 X-Spam-Status: No, score=-9.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, INDUSTRIAL_BODY, INDUSTRIAL_SUBJECT, KAM_ASCII_DIVIDERS, KAM_LOTSOFHASH, KAM_SHORT, MSGID_FROM_MTA_HEADER, 
RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Hi All, This patch adds the basic infrastructure for doing pattern matching on SLP trees. This is done immediately after the SLP tree creation because it can change the shape of the tree in radical ways and so we would like to do it before any analysis is performed on the tree. A new file tree-vect-slp-patterns.c is added which contains all the code for pattern matching on SLP trees. This cover letter is short because the changes are heavily commented. All pattern matchers need to implement the abstract type VectPatternMatch. The VectSimplePatternMatch abstract class provides some default functionality for pattern matchers that need to rebuild nodes. The pattern matcher requires if replacing a statement in a node, that ALL statements be replaced. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * Makefile.in (tree-vect-slp-patterns.o): New. * doc/passes.texi: Update documentation. * tree-vect-slp.c (vect_match_slp_patterns_2, vect_match_slp_patterns): New. (vect_analyze_slp_instance): Call pattern matcher. * tree-vectorizer.h (class VectPatternMatch, class VectPattern): New. * tree-vect-slp-patterns.c: New file. 
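The all-or-nothing replacement rule from the cover letter can be sketched in
isolation.  The following is a minimal standalone C++ illustration, not code
from this patch: Group, Matcher, is_sub_add_pair and replace_all_or_nothing
are hypothetical names, statements are modelled as plain strings, and
"complex_add" is only a placeholder for a replacement.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the scalar statements of one SLP node.
using Group = std::vector<std::string>;

// Hypothetical predicate: does the window of `arity` statements that
// starts at index `first` match the pattern?
using Matcher = std::function<bool (const Group &, std::size_t)>;

// Example pattern of arity 2: a subtraction followed by an addition.
inline bool
is_sub_add_pair (const Group &group, std::size_t first)
{
  return group[first] == "sub" && group[first + 1] == "add";
}

// Slide a window of `arity` statements over `group`.  Rewrite every
// statement only if every window matches; otherwise leave `group` untouched.
inline bool
replace_all_or_nothing (Group &group, std::size_t arity,
                        const Matcher &matches,
                        const std::string &replacement)
{
  // The group must consist of a whole number of windows.
  if (arity == 0 || group.empty () || group.size () % arity != 0)
    return false;

  // First pass: only check, never modify.
  for (std::size_t first = 0; first < group.size (); first += arity)
    if (!matches (group, first))
      return false;

  // Second pass: every window matched, so replace all statements.
  for (std::string &stmt : group)
    stmt = replacement;
  return true;
}
```

Checking every window before touching the group loosely mirrors the split in
the patch between the matching step and the later build () step; the real
matcher works on stmt_vec_info entries and SLP nodes rather than strings.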
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 9c6c1c93b976aaf350cc1f9b3bdc538308fdf08b..936202b73696c8529b32c05b2356c7316fabc542 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1638,6 +1638,7 @@ OBJS = \
         tree-vect-loop.o \
         tree-vect-loop-manip.o \
         tree-vect-slp.o \
+        tree-vect-slp-patterns.o \
         tree-vectorizer.o \
         tree-vector-builder.o \
         tree-vrp.o \
diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index a5ae4143a8c1293e674b499120372ee5fe5c412b..c86df5cd843084a5b7933ef99a23386891a7b0c1 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -709,7 +709,8 @@ loop.
 The pass is implemented in @file{tree-vectorizer.c} (the main driver),
 @file{tree-vect-loop.c} and @file{tree-vect-loop-manip.c} (loop specific parts
 and general loop utilities), @file{tree-vect-slp} (loop-aware SLP
-functionality), @file{tree-vect-stmts.c} and @file{tree-vect-data-refs.c}.
+functionality), @file{tree-vect-stmts.c}, @file{tree-vect-data-refs.c} and
+@file{tree-vect-slp-patterns.c} containing the SLP pattern matcher.
 Analysis of data references is in @file{tree-data-ref.c}.
 
 SLP Vectorization.  This pass performs vectorization of straight-line code. The
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
new file mode 100644
index 0000000000000000000000000000000000000000..f605f68d2a14c4bf4941f97b7c1d57f6acb5ffb1
--- /dev/null
+++ b/gcc/tree-vect-slp-patterns.c
@@ -0,0 +1,310 @@
+/* SLP - Pattern matcher on SLP trees
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+ +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "target.h" +#include "rtl.h" +#include "tree.h" +#include "gimple.h" +#include "tree-pass.h" +#include "ssa.h" +#include "optabs-tree.h" +#include "insn-config.h" +#include "recog.h" /* FIXME: for insn_data */ +#include "fold-const.h" +#include "stor-layout.h" +#include "gimple-iterator.h" +#include "cfgloop.h" +#include "tree-vectorizer.h" +#include "langhooks.h" +#include "gimple-walk.h" +#include "dbgcnt.h" +#include "tree-vector-builder.h" +#include "vec-perm-indices.h" +#include "gimple-fold.h" +#include "internal-fn.h" + +/* SLP Pattern matching mechanism. + + This extension to the SLP vectorizer allows one to transform the generated SLP + tree based on any pattern. The difference between this and the normal vect + pattern matcher is that unlike the former, this matcher allows you to match + with instructions that do not belong to the same SSA dominator graph. + + The only requirement that this pattern matcher has is that you are only + only allowed to either match an entire group or none. + + As an example, the following simple loop: + + double a[restrict N]; double b[restrict N]; double c[restrict N]; + + for (int i=0; i < N; i+=2) + { + c[i] = a[i] - b[i+1]; + c[i+1] = a[i+1] + b[i]; + } + + which represents a complex addition on with a rotation of 90* around the + argand plane. i.e. if `a` and `b` were complex numbers then this would be the + same as `a + (b * I)`. + + Here the expressions for `c[i]` and `c[i+1]` are independent but have to be + both recognized in order for the pattern to work. 
As an SLP tree this is + represented as + + +--------------------------------+ + | stmt 0 *_9 = _10; | + | stmt 1 *_15 = _16; | + +--------------------------------+ + | + | + v + +--------------------------------+ + | stmt 0 _10 = _4 - _8; | + | stmt 1 _16 = _12 + _14; | + | lane permutation { 0[0] 1[1] } | + +--------------------------------+ + | | + | | + | | + +-----+ | | +-----+ + | | | | | | + +-----| { } |<-----+ +----->| { } --------+ + | | | +------------------| | | + | +-----+ | +-----+ | + | | | | + | | | | + | +------|------------------+ | + | | | | + v v v v + +--------------------------+ +--------------------------------+ + | stmt 0 _8 = *_7; | | stmt 0 _4 = *_3; | + | stmt 1 _14 = *_13; | | stmt 1 _12 = *_11; | + | load permutation { 1 0 } | | load permutation { 0 1 } | + +--------------------------+ +--------------------------------+ + + The pattern matcher allows you to replace both statements 0 and 1 or none at + all. You are also allowed to replace and match on any number of nodes. + + The pattern matcher uses a sliding window to handle unrolled cases. Every + pattern has to declare the number of statements that they consume. The + pattern matcher uses this to incrementally ask if the pattern can be applied. + This is done using the method `matches ()`. + + If the pattern can be applied a VecPatternMatch is returned which contains all + state information on where the match was found. This is stored in a list of + operations to perform. If the match cannot be applied then the current + pattern is aborted and no changes made to the tree. + + The pattern matcher has two modes: + + 1) pre-order traversal is used to perform a check to see if the pattern can be + applied or not. If the pattern can be applied then a second step is + performed that allows the pattern to rewrite it's children. This step is + required because the application of a pattern can change the layout of the + tree which affects the nodes that are still to be matched. 
This is + performed using `validate_p ()`. + + 2) post-order traversal is used to actually perform the rewriting of the + matches found earlier. This is done by calling `build ()` on all matches + that were found earlier. + + The pattern matcher currently only allows you to perform replacements to + internal functions. + + To add a new pattern, implement the VectPattern class and add the type to + slp_patterns. */ + +/* VectSimplePatternMatch holds contextual information about a single match + found in the SLP tree. The use of the class is to allow you to defer + performing any modifications to the SLP tree until they are to be done. By + calling build () the modifications are done in-place as to allow also re- + writing of the root node. */ + +class VectSimplePatternMatch : public VectPatternMatch +{ + protected: + uint8_t m_arity; + vec m_ifn_args; + internal_fn m_ifn; + vec_info *m_vinfo; + int m_idx, m_num_args; + tree m_type, m_vectype; + slp_tree m_node; + int m_pos; + + public: + VectSimplePatternMatch (uint8_t arity, vec ifn_args, + internal_fn ifn, vec_info *vinfo, int idx, + slp_tree node, tree type, tree vectype, + int num_args) + { + /* Number of statements the pattern matches against. */ + this->m_arity = arity; + + /* Arguments to be used when building the new stmts using the IFN. */ + this->m_ifn_args = ifn_args.copy (); + + /* The IFN to create the new statements with. */ + this->m_ifn = ifn; + + /* The vectorization information for the current loop. */ + this->m_vinfo = vinfo; + + /* The index in the sliding window where the statements were matched. */ + this->m_idx = idx; + + /* The number of arguments required to create the new IFN. */ + this->m_num_args = num_args; + + /* The original scalar type of the statement being replaced. */ + this->m_type = type; + + /* The vector type to create the IFN for. */ + this->m_vectype = vectype; + + /* The node that contains the statement that is being replaced. 
*/ + this->m_node = node; + + /* The current position inside the arity of the statement being replaced. + generally the match can be cached and re-used for multiple stmts. */ + this->m_pos = 0; + + gcc_assert ((unsigned)(num_args * arity) == ifn_args.length ()); + } + + uint8_t get_arity () + { + return this->m_arity; + } + + internal_fn get_IFN () + { + return this->m_ifn; + } + + const vec get_IFN_args () + { + return this->m_ifn_args; + } + + /* Create a replacement pattern statement for STMT_INFO and inserts the new + statement into NODE. The statement is created as call to internal + function IFN with arguments ARGS. The arity of IFN needs to match the + amount of elements in ARGS. The scalar type of the statement as TYPE and + the corresponding vector type VECTYPE. These two types are used to + construct the new vector only replacement pattern statement. + + Futhermore the new pattern is also added to the vectorization information + structure VINFO and the old statement STMT_INFO is marked as unused while + the new statement is marked as used and the number of SLP uses of the new + statement is incremented. + + The newly created SLP nodes are marked as SLP only and will be dissolved + if SLP is aborted. + + The newly created gimple call is returned and the BB remains unchanged. + */ + + gcall *build () + { + stmt_vec_info stmt_info; + + /* Check if this call was made too often. */ + if (this->m_pos >= this->m_arity) + return NULL; + + auto_vec args; + args.create (this->m_num_args); + + /* Create the argument set for use by gimple_build_call_internal_vec. */ + stmt_vec_info arg; + for (int i = 0; i < this->m_num_args; i++) + { + arg = this->m_ifn_args[i + (this->m_pos * this->m_num_args)]; + args.quick_push (gimple_get_lhs (STMT_VINFO_STMT (arg))); + } + + /* Check to see if we haven't created all the nodes already. */ + if (args.is_empty ()) + return NULL; + + /* Calculate the location of the statement in NODE to replace. 
*/ + int entry = this->m_idx - (this->m_arity - 1) + this->m_pos; + stmt_info = SLP_TREE_SCALAR_STMTS (this->m_node)[entry]; + + /* Create the new pattern statements. */ + gcall *call_stmt = gimple_build_call_internal_vec (this->m_ifn, args); + tree var = make_temp_ssa_name (this->m_type, call_stmt, "slp_patt"); + gimple* old_stmt = STMT_VINFO_STMT (stmt_info); + gimple_call_set_lhs (call_stmt, var); + gimple_set_location (call_stmt, gimple_location (old_stmt)); + gimple_call_set_nothrow (call_stmt, true); + + /* Adjust the book-keeping for the new and old statements for use during SLP. + This is required to get the right VF and statement during SLP analysis. + These changes are created after relevancy has been set for the nodes as + such we need to manually update them. Any changes will be undone if SLP + is cancelled. */ + stmt_vec_info call_stmt_info = this->m_vinfo->add_stmt (call_stmt); + vect_mark_pattern_stmts (this->m_vinfo, stmt_info, call_stmt, + this->m_vectype); + + /* We have to explicitly mark the old statement as unused because during + statement analysis the original and new pattern statement may require + different level of unrolling. As an example add/sub when vectorized + without a pattern requires 4 copies, whereas with a COMPLEX_ADD pattern + this only requires 2 copies and the two statement will be treated as + hand unrolled. That means that the analysis won't happen as it'll find + a mismatch. So we don't analyze the old statement and if we end up + needing it, e.g. SLP fails then we have to quickly re-analyze it. */ + STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope; + STMT_VINFO_SLP_VECT_ONLY (call_stmt_info) = true; + STMT_VINFO_RELATED_STMT (call_stmt_info) = stmt_info; + + /* Since we are replacing all the statements in the group with the same + thing it doesn't really matter. So just set it every time a new stmt + is created. 
*/ + SLP_TREE_SCALAR_STMTS (this->m_node)[entry] = call_stmt_info; + SLP_TREE_REPRESENTATIVE (this->m_node) = call_stmt_info; + SLP_TREE_CODE (this->m_node) = gimple_expr_code (call_stmt);; + + this->m_pos++; + return call_stmt; + } + + ~VectSimplePatternMatch () + { + this->m_ifn_args.release (); + } +}; + +#define SLP_PATTERN(x) &x::create +VectPatternDecl slp_patterns[] +{ + /* For least amount of back-tracking and more efficient matching + order patterns from the largest to the smallest. Especially if they + overlap in what they can detect. */ +}; +#undef SLP_PATTERN + +size_t num__slp_patterns = sizeof(slp_patterns)/sizeof(VectPatternDecl); diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c index 01189d44d892fc42b132bbb7de1c471df45518ae..947b031a6d492e6a02621dbcf41ba60d96c606f0 100644 --- a/gcc/tree-vect-slp.c +++ b/gcc/tree-vect-slp.c @@ -2055,6 +2055,192 @@ calculate_unrolling_factor (poly_uint64 nunits, unsigned int group_size) return exact_div (common_multiple (nunits, group_size), group_size); } +/* Helper function of vect_match_slp_patterns. + + Attempts to match the given pattern PATT_INFO against the slp tree rooted in + NODE using VINFO and GROUP_SIZE. + + If matching is successful the value in NODE is updated and returned, if not + then it is returned unchanged. */ + +static bool +vect_match_slp_patterns_2 (slp_tree node, vec_info *vinfo, + unsigned int group_size, VectPatternDecl patt_fn, + poly_uint64 *max_nunits, bool *matches, + unsigned *npermutes, unsigned *tree_size, + scalar_stmts_to_slp_tree_map_t * bst_map) +{ + unsigned i; + stmt_vec_info stmt_info; + if (!node) + return false; + + vec scalar_stmts = SLP_TREE_SCALAR_STMTS (node); + bool found_p = false, found_rec_p = false; + VectPattern *pattern = patt_fn (node, vinfo); + uint8_t n = pattern->get_arity (); + + if (group_size % n != 0) + { + delete pattern; + return false; + } + + /* The data dependency orderings will force the nodes to be created in the + order of their data flow. 
Which means since we're matching specific + patterns in particular order we only have to do a linear scan here to match + the same instruction multiple times. The group size doesn't have to be + constrained. */ + + for (unsigned i = n - 1; i < scalar_stmts.length (); i += n) + { + stmt_info = scalar_stmts[i]; + + if (gimple_assign_load_p (STMT_VINFO_STMT (stmt_info)) + || gimple_store_p (STMT_VINFO_STMT (stmt_info)) + || gimple_assign_cast_p (STMT_VINFO_STMT (stmt_info))) + break; + + stmt_vec_info *stmt_infos = scalar_stmts.begin () + (i - (n - 1)); + + gcc_assert (stmt_infos); + + if (!pattern->matches (stmt_infos, i)) + { + /* We can only do replacements for entire groups, we must replace all + statements in a node as the argument list/children may not have + equal height then. Operations that don't rewrite the arguments + may be safe to do, so perhaps paramatrise it. */ + + found_p = false; + break; + } + + tree type = gimple_expr_type (STMT_VINFO_STMT (stmt_info)); + tree vectype = get_vectype_for_scalar_type (vinfo, type, node); + + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "Found %s pattern in SLP tree\n", + pattern->get_name ()); + + if (pattern->is_optab_supported_p (vectype, OPTIMIZE_FOR_SPEED)) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "Target supports %s vectorization with mode %T\n", + internal_fn_name (pattern->get_last_ifn ()), + vectype); + + found_p = true; + } + else + { + if (dump_enabled_p ()) + { + if (!vectype) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "Target does not support vector type for " + "%T\n", type); + else + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "Target does not support %s for " + "vector type %T\n", + internal_fn_name (pattern->get_last_ifn ()), + vectype); + } + found_p = false; + } + } + + if (found_p) + { + /* Find which nodes should be the children of the new node. 
*/ + + if (!pattern->validate_p (max_nunits, matches, + npermutes, tree_size, bst_map)) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, + "transformation for %s not valid due to post " + "condition\n", internal_fn_name (pattern->get_last_ifn ())); + found_p = false; + } + } + + /* Perform recursive matching, it's important to do this after matching things + in the current node as the matches here may re-order the nodes below it. + As such the pattern that needs to be subsequently match may change. */ + + if (SLP_TREE_CHILDREN (node).exists ()) { + slp_tree child; + FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child) + found_rec_p |= vect_match_slp_patterns_2 (child, vinfo, group_size, + patt_fn, max_nunits, matches, + npermutes, tree_size, bst_map); + } + + if (found_p) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, "Creating vec patterns\n"); + + while (gcall* call_stmt = pattern->build ()) + { + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, "\t %p stmt: %G", + node, call_stmt); + } + + vect_mark_slp_stmts_relevant (node); + } + + delete pattern; + return found_p | found_rec_p; +} + +/* Applies pattern matching to the given SLP tree rooted in NODE using vec_info + VINFO and group size GROUP_SIZE. + + The modified tree is returned. Patterns are tried in order and multiple + patterns may match. If the permutes need to be cancelled then + CANCEL_PERMUTE is set. 
  */
+
+static bool
+vect_match_slp_patterns (slp_tree node, vec_info *vinfo,
+                         unsigned int group_size, poly_uint64 *max_nunits,
+                         bool *matches, unsigned *npermutes,
+                         unsigned *tree_size,
+                         scalar_stmts_to_slp_tree_map_t * bst_map)
+{
+  DUMP_VECT_SCOPE ("vect_match_slp_patterns");
+  bool found_p = false;
+
+  if (dump_enabled_p ())
+    {
+      dump_printf_loc (MSG_NOTE, vect_location, "-- before patt match --\n");
+      vect_print_slp_graph (MSG_NOTE, vect_location, node);
+      dump_printf_loc (MSG_NOTE, vect_location, "-- end patt --\n");
+    }
+
+  for (unsigned x = 0; x < num__slp_patterns; x++)
+    found_p |= vect_match_slp_patterns_2 (node, vinfo, group_size,
+                                          slp_patterns[x], max_nunits, matches,
+                                          npermutes, tree_size, bst_map);
+
+  /* TODO: Remove in final version, only here for generating debug dot graphs
+     from SLP tree.  */
+
+  if (dump_enabled_p ())
+    {
+      dump_printf_loc (MSG_NOTE, vect_location, "-- start dot --\n");
+      vect_print_slp_graph (MSG_NOTE, vect_location, node);
+      dump_printf_loc (MSG_NOTE, vect_location, "-- end dot --\n");
+    }
+
+  return found_p;
+}
+
 /* Analyze an SLP instance starting from a group of grouped stores.
    Call vect_build_slp_tree to build a tree of packed stmts if possible.
    Return FALSE if it's impossible to SLP any stmt in the loop.  */
@@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
                               &tree_size, bst_map);
   if (node != NULL)
     {
+      /* Temporarily allow add_stmt calls again.  */
+      vinfo->stmt_vec_info_ro = false;
+
+      /* See if any patterns can be found in the constructed SLP tree
+         before we do any analysis on it.  */
+      vect_match_slp_patterns (node, vinfo, group_size, &max_nunits,
+                               matches, &npermutes, &tree_size, bst_map);
+
+      /* After this no more add_stmt calls are allowed.  */
+      vinfo->stmt_vec_info_ro = true;
+
       /* Calculate the unrolling factor based on the smallest type.
  */
       poly_uint64 unrolling_factor
         = calculate_unrolling_factor (max_nunits, group_size);
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 79926f1a43534635ddca85556a928e364022c40a..95bbf13b1c733c07b7deb8515c1b17c6979cff21 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -26,6 +26,7 @@ typedef class _stmt_vec_info *stmt_vec_info;
 #include "tree-data-ref.h"
 #include "tree-hash-traits.h"
 #include "target.h"
+#include "internal-fn.h"
 
 /* Used for naming of new temporaries.  */
@@ -2100,6 +2101,99 @@ typedef hash_map , slp_tree,
                   simple_hashmap_traits >
   scalar_stmts_to_slp_tree_map_t;
 
+/* SLP Pattern matcher types, tree-vect-slp-patterns.c.  */
+
+class VectPatternMatch
+{
+  public:
+    virtual gcall *build () = 0;
+    virtual internal_fn get_IFN () = 0;
+    virtual const vec get_IFN_args () = 0;
+    virtual uint8_t get_arity () = 0;
+    virtual ~VectPatternMatch () {};
+};
+
+class VectPattern
+{
+  protected:
+    uint8_t m_arity;
+    uint8_t m_num_args;
+    internal_fn m_last_ifn;
+    int m_last_idx;
+    slp_tree m_node;
+    vec_info *m_vinfo;
+    vec<VectPatternMatch *> m_matches;
+    VectPattern (slp_tree node, vec_info *vinfo)
+    {
+      this->m_last_ifn = IFN_LAST;
+      this->m_node = node;
+      this->m_vinfo = vinfo;
+      this->m_matches.create (0);
+      this->m_curr_match = 0;
+    }
+
+  private:
+    unsigned m_curr_match;
+
+  public:
+    static VectPattern* create (slp_tree node, vec_info *vinfo);
+    virtual bool matches (stmt_vec_info *stmts, int idx) = 0;
+
+    virtual const char* get_name () = 0;
+    virtual ~VectPattern ()
+    {
+      int i;
+      VectPatternMatch *match;
+      FOR_EACH_VEC_ELT (this->m_matches, i, match)
+        delete match;
+      this->m_matches.release ();
+    }
+
+    virtual gcall *build ()
+    {
+      if (this->m_curr_match >= this->m_matches.length ())
+        return NULL;
+
+      gcall *entry =
+        this->m_matches[this->m_curr_match]->build ();
+
+      if (entry)
+        return entry;
+
+      this->m_curr_match++;
+      return build ();
+    }
+
+    virtual bool validate_p (poly_uint64 *, bool *, unsigned *, unsigned *,
scalar_stmts_to_slp_tree_map_t *)
+    {
+      return true;
+    }
+
+    virtual uint8_t get_arity ()
+    {
+      return this->m_arity;
+    }
+
+    virtual bool is_optab_supported_p (tree vectype, optimization_type opt_type)
+    {
+      if (!vectype)
+        return false;
+
+      return direct_internal_fn_supported_p (this->m_last_ifn, vectype,
+                                             opt_type);
+    }
+
+    internal_fn get_last_ifn ()
+    {
+      return this->m_last_ifn;
+    }
+};
+
+typedef VectPattern* (*VectPatternDecl) (slp_tree, vec_info *);
+extern VectPatternDecl slp_patterns[];
+extern size_t num__slp_patterns;
+
 extern void vect_free_slp_tree (slp_tree node);

From patchwork Fri Sep 25 14:28:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371339
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Date: Fri, 25 Sep 2020 15:28:20 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2 4/16]middle-end: Add dissolve code for when SLP fails and non-SLP loop vectorization is to be tried.
Message-ID: <20200925142816.GA14929@arm.com>
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Spam-Status: No, score=-14.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, MSGID_FROM_MTA_HEADER, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham
autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz
Errors-To: gcc-patches-bounces@gcc.gnu.org
Sender: "Gcc-patches"

Hi All,

This adds the dissolve code to undo the patterns created by the pattern matcher
in case SLP is to be aborted.

As mentioned in the cover letter this has one issue in that the number of copies
needed can change depending on whether TWO_OPERATORS is needed or not.

Because of this I don't analyze the original statement when it's replaced by a
pattern and attempt to correct it here by analyzing it after dissolve.

This however seems too late and I would need to change the unroll factor, which
seems a bit odd.  Any advice would be appreciated.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vect-loop.c (vect_dissolve_slp_only_patterns): New.
	(vect_dissolve_slp_only_groups): Call vect_dissolve_slp_only_patterns.

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index b1a6e1508c7f00f5f369ec873f927f30d673059e..8231ad6452af6ff111911a7bfb6aab2257df9fc0 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -1956,6 +1956,92 @@ vect_get_datarefs_in_loop (loop_p loop, basic_block *bbs,
   return opt_result::success ();
 }
 
+/* For every SLP only pattern created by the pattern matcher rooted in ROOT
+   restore the relevancy of the original statements over those of the pattern
+   and destroy the pattern relationship.  This restores the SLP tree to a state
+   where it can be used when SLP build is cancelled or re-tried.
  */
+
+static opt_result
+vect_dissolve_slp_only_patterns (loop_vec_info loop_vinfo,
+                                 hash_set<slp_tree> *visited, slp_tree root)
+{
+  if (!root || visited->contains (root))
+    return opt_result::success ();
+
+  unsigned int i;
+  slp_tree node;
+  opt_result res = opt_result::success ();
+  stmt_vec_info stmt_info;
+  stmt_vec_info related_stmt_info;
+  bool need_to_vectorize = false;
+  auto_vec<stmt_info_for_cost> cost_vec;
+
+  visited->add (root);
+
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (root), i, stmt_info)
+    if (STMT_VINFO_SLP_VECT_ONLY (stmt_info)
+        && (related_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info)) != NULL)
+      {
+        if (dump_enabled_p ())
+          dump_printf_loc (MSG_NOTE, vect_location,
+                           "dissolving relevancy of %G over %G",
+                           STMT_VINFO_STMT (stmt_info),
+                           STMT_VINFO_STMT (related_stmt_info));
+        STMT_VINFO_RELEVANT (stmt_info) = vect_unused_in_scope;
+        STMT_VINFO_RELEVANT (related_stmt_info) = vect_used_in_scope;
+        STMT_VINFO_IN_PATTERN_P (related_stmt_info) = false;
+        STMT_SLP_TYPE (related_stmt_info) = hybrid;
+        /* Now we have to re-analyze the statement since we skipped it in
+           the initial analysis due to the differences in copies.  */
+        res = vect_analyze_stmt (loop_vinfo, related_stmt_info,
+                                 &need_to_vectorize, NULL, NULL, &cost_vec);
+
+        if (!res)
+          return res;
+      }
+
+  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (root), i, node)
+    {
+      res = vect_dissolve_slp_only_patterns (loop_vinfo, visited, node);
+      if (!res)
+        return res;
+    }
+
+  return res;
+}
+
+/* Look up any SLP Only Pattern statements created by the SLP pattern matcher
+   in all slp_instances in LOOP_VINFO and undo the relevancy of statements such
+   that the original SLP tree before the pattern matching is used.
  */
+
+static opt_result
+vect_dissolve_slp_only_patterns (loop_vec_info loop_vinfo)
+{
+
+  unsigned int i;
+  opt_result res = opt_result::success ();
+  hash_set<slp_tree> *visited = new hash_set<slp_tree> ();
+
+  DUMP_VECT_SCOPE ("vect_dissolve_slp_only_patterns");
+
+  /* Unmark any SLP only patterns as relevant and restore the STMT_INFO of the
+     related instruction.  */
+  slp_instance instance;
+  FOR_EACH_VEC_ELT (LOOP_VINFO_SLP_INSTANCES (loop_vinfo), i, instance)
+    {
+      res = vect_dissolve_slp_only_patterns (loop_vinfo, visited,
+                                             SLP_INSTANCE_TREE (instance));
+      if (!res)
+        {
+          delete visited;
+          return res;
+        }
+    }
+
+  delete visited;
+  return res;
+}
+
 /* Look for SLP-only access groups and turn each individual access into its own
    group.  */
 static void
@@ -2427,6 +2513,11 @@ again:
   /* Ensure that "ok" is false (with an opt_problem if dumping is enabled).  */
   gcc_assert (!ok);
 
+  /* Dissolve any SLP patterns created by the SLP pattern matcher.  */
+  opt_result dissolved = vect_dissolve_slp_only_patterns (loop_vinfo);
+  if (!dissolved)
+    return dissolved;
+
   /* Try again with SLP forced off but if we didn't do any SLP there is
      no point in re-trying.
  */
   if (!slp)

From patchwork Fri Sep 25 14:28:39 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371340
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Date: Fri, 25 Sep 2020 15:28:39 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2 5/16]middle-end: Add shared machinery for matching patterns involving complex numbers.
Message-ID: <20200925142837.GA16579@arm.com>
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz

Hi All,

This patch adds shared machinery for detecting patterns having to do with
complex number operations.  The class ComplexPattern provides helpers for
matching and ultimately undoing the permutation in the tree by rebuilding the
graph.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vect-slp-patterns.c (complex_operation_t, class ComplexPattern): New.

diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index f605f68d2a14c4bf4941f97b7c1d57f6acb5ffb1..6453a5b1b6464dba833adc2c2a194db5e712bb79 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -134,6 +134,19 @@ along with GCC; see the file COPYING3.  If not see
    To add a new pattern, implement the VectPattern class and add the type to
    slp_patterns.  */
 
+/* The COMPLEX_OPERATION enum denotes the possible pair of operations that can
+   be matched when looking for expressions that we are interested in matching
+   for complex number addition and mla.  */
+
+typedef enum _complex_operation {
+    PLUS_PLUS,
+    MINUS_PLUS,
+    PLUS_MINUS,
+    MULT_MULT,
+    NEG_NEG,
+    CMPLX_NONE
+} complex_operation_t;
+
 /* VectSimplePatternMatch holds contextual information about a single match
    found in the SLP tree.  The use of the class is to allow you to defer
    performing any modifications to the SLP tree until they are to be done.  By
@@ -298,6 +311,358 @@ class VectSimplePatternMatch : public VectPatternMatch
     }
 };
 
+/* The ComplexPattern class contains common code for pattern matchers that work
+   on complex numbers.  These provide functionality to allow de-construction and
+   validation of sequences depicting/transforming REAL and IMAG pairs.  */
+
+class ComplexPattern : public VectPattern
+{
+  protected:
+    /* Current list of arguments that were found during the current invocation
+       of the pattern matcher.  */
+    vec<stmt_vec_info> m_vects;
+
+    /* Representative statement for the current match being performed.  */
+    stmt_vec_info m_stmt_info;
+
+    /* A list of all arguments found between all invocations of the current
+       pattern matcher.  */
+    vec<vec<stmt_vec_info>> m_defs;
+
+    /* Checks to see if the expression EXPR is a gimple assign with code CODE
+       and if this is the case the two operands of EXPR are returned in OP1 and
+       OP2.
+
+       If the matching and extraction is successful TRUE is returned otherwise
+       FALSE in which case the value of OP1 and OP2 will not have been touched.
+    */
+
+    bool
+    vect_match_expression_p (slp_tree node, tree_code code, int base, int idx,
+                             stmt_vec_info *op1, stmt_vec_info *op2)
+    {
+
+      vec<stmt_vec_info> scalar_stmts = SLP_TREE_SCALAR_STMTS (node);
+
+      /* Calculate the index of the statement in the node to inspect.  */
+      int n = base + idx;
+      if (scalar_stmts.length () <= (unsigned)n) // can use group_size
+        return false;
+
+      gimple* expr = STMT_VINFO_STMT (scalar_stmts[n]);
+      if (!is_gimple_assign (expr)
+          || gimple_expr_code (expr) != code)
+        return false;
+
+      vec<slp_tree> children = SLP_TREE_CHILDREN (node);
+
+      /* If it's a VEC_PERM_EXPR we need to look one deeper.  VEC_PERM_EXPR
+         nodes only have one entry, so pick it.  */
+      if (node->code == VEC_PERM_EXPR)
+        children = SLP_TREE_CHILDREN (children.last ());
+
+      if (children.length () != (op2 ? 2 : 1))
+        return false;
+
+      if (op1)
+        {
+          if (SLP_TREE_DEF_TYPE (children[0]) != vect_internal_def)
+            return false;
+          *op1 = SLP_TREE_SCALAR_STMTS (children[0])[n];
+        }
+
+      if (op2)
+        {
+          if (SLP_TREE_DEF_TYPE (children[1]) != vect_internal_def)
+            return false;
+          *op2 = SLP_TREE_SCALAR_STMTS (children[1])[n];
+        }
+
+      return true;
+    }
+
+    /* This function will match two gimple expressions STMT_0 and STMT_1 in
+       parallel and returns the pair operation that represents the two
+       expressions in the two statements.  The statements are located in NODE1
+       and NODE2 at offset base + offset1 and base + offset2 respectively.
+
+       If match is successful then the corresponding complex_operation is
+       returned and the arguments to the two matched operations are returned in
+       OPS.
+
+       If unsuccessful then CMPLX_NONE is returned and OPS is untouched.
+
+       e.g. the following gimple statements
+
+       stmt 0 _39 = _37 + _12;
+       stmt 1 _6 = _38 - _36;
+
+       will return PLUS_MINUS along with OPS containing {_37, _12, _38, _36}.
+    */
+
+    complex_operation_t
+    vect_detect_pair_op (int base, slp_tree node1, int offset1, slp_tree node2,
+                         int offset2, vec<stmt_vec_info> *ops)
+    {
+      stmt_vec_info op1 = NULL, op2 = NULL, op3 = NULL, op4 = NULL;
+      complex_operation_t result = CMPLX_NONE;
+      #define CHECK_FOR(x, y, z) \
+        (vect_match_expression_p (node1, x, base, offset1, &op1, \
+                                  z ? &op2 : NULL) \
+         && vect_match_expression_p (node2, y, base, offset2, &op3, \
+                                     z ? &op4 : NULL))
+
+      if (CHECK_FOR (MINUS_EXPR, PLUS_EXPR, true))
+        result = MINUS_PLUS;
+      else if (CHECK_FOR (PLUS_EXPR, MINUS_EXPR, true))
+        result = PLUS_MINUS;
+      else if (CHECK_FOR (PLUS_EXPR, PLUS_EXPR, true))
+        result = PLUS_PLUS;
+      else if (CHECK_FOR (MULT_EXPR, MULT_EXPR, true))
+        result = MULT_MULT;
+      else if (CHECK_FOR (NEGATE_EXPR, NEGATE_EXPR, false))
+        result = NEG_NEG;
+      #undef CHECK_FOR
+
+      if (result != CMPLX_NONE && ops != NULL)
+        {
+          ops->create (4);
+          ops->quick_push (op1);
+          ops->quick_push (op2);
+          ops->quick_push (op3);
+          ops->quick_push (op4);
+        }
+      return result;
+    }
+
+    /* Overload of vect_detect_pair_op where the statements are assumed to be
+       one after the other.  This inspects node[base] and node[base+1].  */
+
+    complex_operation_t
+    vect_detect_pair_op (int base, slp_tree node, vec<stmt_vec_info> *ops)
+    {
+      return vect_detect_pair_op (base, node, 0, node, 1, ops);
+    }
+
+    /* Create the intermediate states that are needed and generate a new match
+       object with the information.  */
+
+    bool
+    store_results ()
+    {
+      this->m_defs.safe_push (this->m_vects);
+      save_match ();
+      return true;
+    }
+
+    /* This function marks every statement that is being replaced during
+       the pattern matching as PURE.  Normally when replacing a statement due
+       to a pattern we add the statement to the STMT_VINFO_PATTERN_DEF_SEQ of
+       the pattern that is replacing them.  In this case however this won't
+       work as when doing the replacement we are changing the nodes that are
+       used by the statements.  This means that when vectorized the SSA chain
+       is different than in the BB.
+
+       Declaring the statements as part of the sequence will then cause SSA
+       verification to fail as we may refer to statements that were not in the
+       original USE-DEF chain of the statement we are replacing.
+
+       The downside of this approach is that the statements will still be
+       seen as relevant and so we will still generate code for them and they
+       will be in the output, unconnected until DSE.  We could mark them as
+       irrelevant, but that is only safe if there are no more uses of the node
+       in the SLP graph (so perhaps this should be done in free_slp_tree
+       instead of here).  */
+
+    static void
+    vect_mark_stmts_as_in_pattern (hash_set<slp_tree> *cache,
+                                   vec<stmt_vec_info> orig, slp_tree node)
+    {
+      if (cache->contains (node))
+        return;
+
+      unsigned i;
+      stmt_vec_info stmt_info;
+      slp_tree child;
+
+      cache->add (node);
+
+      FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
+        {
+          if (gimple_assign_load_p (STMT_VINFO_STMT (stmt_info)))
+            return;
+
+          STMT_SLP_TYPE (stmt_info) = pure_slp;
+        }
+
+      FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
+        vect_mark_stmts_as_in_pattern (cache, orig, child);
+    }
+
+  protected:
+    ComplexPattern (slp_tree node, vec_info *vinfo)
+      : VectPattern (node, vinfo)
+    { }
+
+    /* Create and store a new VectPatternMatch object with the current match
+       that was found.  */
+
+    void save_match ()
+    {
+      tree type = gimple_expr_type (STMT_VINFO_STMT (this->m_stmt_info));
+      tree vectype = get_vectype_for_scalar_type (this->m_vinfo, type,
+                                                  this->m_node);
+      VectPatternMatch *match
+        = new VectSimplePatternMatch (this->m_arity, this->m_defs.last (),
+                                      this->m_last_ifn, this->m_vinfo,
+                                      this->m_last_idx, this->m_node, type,
+                                      vectype, this->m_num_args);
+      this->m_matches.safe_push (match);
+    }
+
+  public:
+
+    /* Check to see if all loads rooted in ROOT are linear.  Linearity is
+       defined as having no gaps between values loaded.  */
+    static bool
+    linear_loads_p (slp_tree root)
+    {
+      if (!root)
+        return false;
+
+      unsigned i;
+
+      if (SLP_TREE_LOAD_PERMUTATION (root).exists ())
+        {
+          vec<unsigned> loads = SLP_TREE_LOAD_PERMUTATION (root);
+          unsigned leader = loads[0];
+          unsigned load;
+          FOR_EACH_VEC_ELT_FROM (loads, i, load, 1)
+            if (load != ++leader)
+              return false;
+        }
+
+      slp_tree child;
+      FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (root), i, child)
+        if (!linear_loads_p (child))
+          return false;
+
+      return true;
+    }
+
+    /* The post transform and validation function for the complex number
+       patterns.  This will re-arrange the tree and re-organize the nodes such
+       that they can be used by the complex number instructions that are to be
+       created.  It does this by doing the following steps:
+
+       1) It looks up the definition nodes of each statement in DEFS which are
+          the new arguments to be used in the new patterns that will be created.
+          From this new SLP trees are created by calling vect_build_slp_tree
+          with the statements in the order we expect them to be.  A majority of
+          these will be found in the cache and so this call will be fast.
+
+       2) After the new trees are created we check to see if all of them are
+          linear.  If they are not linear we abort and undo the bookkeeping
+          information that vect_build_slp_tree created for them.
+
+       3) The children of NODE are replaced with the new set of nodes we
+          created.
+
+       4) The new sub-tree rooted in NODE is marked as relevant.
+
+       This sequence of operations does an implicit re-ordering of nodes, after
+       which DSE can remove the unused nodes, e.g. it will undo as much of the
+       permutes as it possibly can.  This is required such that pattern
+       matchers running on the newly created statements match the correct
+       operations.  */
+
+    bool validate_p (poly_uint64 *max_nunits, bool *matches,
+                     unsigned *npermutes, unsigned *tree_size,
+                     scalar_stmts_to_slp_tree_map_t *bst_map)
+    {
+      int group_size = SLP_TREE_SCALAR_STMTS (this->m_node).length ();
+      auto_vec<slp_tree> nodes;
+      auto_vec<stmt_vec_info> stmts;
+      stmts.create (0);
+      stmts.safe_grow_cleared (group_size);
+      nodes.create (0);
+      nodes.safe_grow_cleared (this->m_num_args);
+      slp_tree tmp = NULL;
+      vec<stmt_vec_info> iters = SLP_TREE_SCALAR_STMTS (this->m_node);
+
+      VectPatternMatch *match;
+      unsigned int i, count;
+      int idx = -1;
+      hash_set<slp_tree> *visited = new hash_set<slp_tree> ();
+
+      for (idx = 0; idx < this->m_num_args; idx++)
+        {
+          count = 0;
+          FOR_EACH_VEC_ELT (this->m_matches, i, match)
+            {
+              vec<stmt_vec_info> def = this->m_defs[i];
+              for (int x = 0; x < this->m_arity; x++)
+                {
+                  stmt_vec_info op = def[idx + (x * this->m_num_args)];
+                  stmts[count++] = op;
+                }
+            }
+
+          /* We need to copy the statements in case the node is not in the
+             cache.  But if it is in the cache we leak?  */
+          vec<stmt_vec_info> new_stmts = stmts.copy ();
+          tmp = vect_build_slp_tree (this->m_vinfo, new_stmts, group_size,
+                                     max_nunits, matches, npermutes, tree_size,
+                                     bst_map);
+
+          gimple *info
+            = STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (this->m_node));
+          if (!tmp)
+            {
+              if (dump_enabled_p ())
+                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                                 "Could not build new SLP tree for %G\n", info);
+
+              goto graceful_exit;
+            }
+
+          nodes[idx] = tmp;
+          visited->add (tmp);
+
+          if (!linear_loads_p (tmp))
+            {
+              if (dump_enabled_p ())
+                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                                 "Loads could not be made linear %G\n", info);
+
+              goto graceful_exit;
+            }
+        }
+
+      /* Mark all statements that are unused between the new and old nodes as in
+         a pattern.  */
+      FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (this->m_node), i, tmp)
+        vect_mark_stmts_as_in_pattern (visited, iters, tmp);
+
+      delete visited;
+
+      SLP_TREE_CHILDREN (this->m_node).truncate (0);
+      SLP_TREE_CHILDREN (this->m_node).safe_splice (nodes);
+
+      return true;
+
+graceful_exit:
+
+      delete visited;
+
+      FOR_EACH_VEC_ELT (nodes, i, tmp)
+        if (tmp)
+          vect_free_slp_tree (tmp);
+
+      return false;
+    }
+};
+
 #define SLP_PATTERN(x) &x::create
 VectPatternDecl slp_patterns[]
 {

From patchwork Fri Sep 25 14:28:58 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371341
Date: Fri, 25 Sep 2020 15:28:58 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2 6/16]middle-end Add Complex Addition with rotation detection
Message-ID: <20200925142856.GA17824@arm.com>
Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz

Hi All,

This patch adds pattern detection for the following operation:

  Addition with rotation of the second argument around the Argand plane.
    Supported rotations are 90 and 270.

    c = a + (b * I) and c = a + (b * I * I * I)

  where a, b and c are complex numbers.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* doc/md.texi: Document optabs.
	* internal-fn.def (COMPLEX_ADD_ROT90, COMPLEX_ADD_ROT270): New.
	* optabs.def (cadd90_optab, cadd270_optab): New.
	* tree-vect-slp-patterns.c (class ComplexAddPattern): New.
	(slp_patterns): Add ComplexAddPattern.

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 2b46286943778e16d95b15def4299bcbf8db7eb8..71e226505b2619d10982b59a4ebbed73a70f29be 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6132,6 +6132,17 @@ floating-point mode.
 
 This pattern is not allowed to @code{FAIL}.
 
+@cindex @code{cadd@var{m}@var{n}3} instruction pattern
+@item @samp{cadd@var{m}@var{n}3}
+Perform a vector addition of complex numbers in operand 1 with operand 2
+rotated by @var{m} degrees around the Argand plane and store the result in
+operand 0.  The instruction must perform the operation on data loaded
+contiguously into the vectors.
+The operation is only supported for vector modes @var{n} and with
+rotations @var{m} of 90 or 270.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{ffs@var{m}2} instruction pattern
 @item @samp{ffs@var{m}2}
 Store into operand 0 one plus the index of the least significant 1-bit
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 13e60828fcf5db6c5f15aae2bacd4cf04029e430..956a65a338c157b51de7e78a3fb005b5af78ef31 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -275,6 +275,8 @@ DEF_INTERNAL_FLT_FN (SCALB, ECF_CONST, scalb, binary)
 DEF_INTERNAL_FLT_FLOATN_FN (FMIN, ECF_CONST, fmin, binary)
 DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary)
 DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 
 /* FP scales.  */
 DEF_INTERNAL_FLT_FN (LDEXP, ECF_CONST, ldexp, binary)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 78409aa14537d259bf90277751aac00d452a0d3f..2bb0bf857977035bf562a77f5f6848e80edf936d 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -290,6 +290,8 @@ OPTAB_D (atan_optab, "atan$a2")
 OPTAB_D (atanh_optab, "atanh$a2")
 OPTAB_D (copysign_optab, "copysign$F$a3")
 OPTAB_D (xorsign_optab, "xorsign$F$a3")
+OPTAB_D (cadd90_optab, "cadd90$a3")
+OPTAB_D (cadd270_optab, "cadd270$a3")
 OPTAB_D (cos_optab, "cos$a2")
 OPTAB_D (cosh_optab, "cosh$a2")
 OPTAB_D (exp10_optab, "exp10$a2")
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index 6453a5b1b6464dba833adc2c2a194db5e712bb79..b2b0ac62e9a69145470f41d2bac736dd970be735 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -663,12 +663,94 @@ graceful_exit:
     }
 };
 
+class ComplexAddPattern : public ComplexPattern
+{
+  protected:
+    ComplexAddPattern (slp_tree node, vec_info *vinfo)
+      : ComplexPattern (node, vinfo)
+    {
+      this->m_arity = 2;
+      this->m_num_args = 2;
+      this->m_vects.create (0);
+      this->m_defs.create (0);
+    }
+
+  public:
+    ~ComplexAddPattern ()
+    {
+      this->m_vects.release ();
+      this->m_defs.release ();
+    }
+
+    static VectPattern* create (slp_tree node, vec_info *vinfo)
+    {
+       return new ComplexAddPattern (node, vinfo);
+    }
+
+    const char* get_name ()
+    {
+      return "Complex Addition";
+    }
+
+    /* Pattern matcher for trying to match complex addition pattern in SLP tree
+       using the N statements found in node starting at position IDX.
+       If the operation matches then IFN is set to the operation it matched and
+       the arguments to the two replacement statements are put in VECTS.
+
+       If no match is found then IFN is set to IFN_LAST.
+
+       This function matches the patterns shaped as:
+
+         c[i] = a[i] - b[i+1];
+         c[i+1] = a[i+1] + b[i];
+
+       If a match occurred then TRUE is returned, else FALSE.  */
+
+    bool matches (stmt_vec_info *stmts, int idx)
+    {
+      this->m_last_ifn = IFN_LAST;
+      int base = idx - (this->m_arity - 1);
+      this->m_last_idx = idx;
+      this->m_stmt_info = stmts[0];
+
+      complex_operation_t op
+        = vect_detect_pair_op (base, this->m_node, &this->m_vects);
+
+      /* Find the two components.  Rotation in the complex plane will modify
+         the operations:
+
+          * Rotation  0: + +
+          * Rotation 90: - +
+          * Rotation 180: - -
+          * Rotation 270: + -
+
+         Rotation 0 and 180 can be handled by normal SIMD code, so we don't need
+         to care about them here.  */
+      if (op == MINUS_PLUS)
+        this->m_last_ifn = IFN_COMPLEX_ADD_ROT90;
+      else if (op == PLUS_MINUS)
+        this->m_last_ifn = IFN_COMPLEX_ADD_ROT270;
+
+      if (this->m_last_ifn == IFN_LAST)
+        return false;
+
+      /* Correct the arguments after matching.  */
+      std::swap (this->m_vects[1], this->m_vects[3]);
+
+      /* If the two operands are the same, we don't have a permute.  In such a case
+         there is no advantage in doing the replacement.  */
+      return store_results ();
+    }
+};
+
 #define SLP_PATTERN(x) &x::create
 VectPatternDecl slp_patterns[]
 {
   /* For least amount of back-tracking and more efficient matching
      order patterns from the largest to the smallest.  Especially if they
      overlap in what they can detect.  */
+
+  SLP_PATTERN (ComplexAddPattern),
 };
 #undef SLP_PATTERN
 

From patchwork Fri Sep 25 14:29:16 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371342
Date: Fri, 25 Sep 2020 15:29:16 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2 7/16]middle-end: Add Complex Multiplication and Multiplication with Conjucate detection
Message-ID: <20200925142914.GA19264@arm.com>
Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz

Hi All,

This patch adds pattern detections for the following operations:

  Complex multiplication and conjugate complex multiplication of the second
  parameter.

    c = a * b and c = a * conj (b)

  For the conjugate cases, under fast-math it also supports flipping which
  operand is conjugated, by flipping the arguments to the optab.  This allows
  it to support c = conj (a) * b and c += conj (a) * b,

  where a, b and c are complex numbers,

and provides a shared class for anything needing to recognize complex MLA
patterns.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* doc/md.texi: Document optabs.
	* internal-fn.def (COMPLEX_MUL, COMPLEX_MUL_CONJ): New.
	* optabs.def (cmul_optab, cmul_conj_optab): New.
	* tree-vect-slp-patterns.c (class ComplexMLAPattern,
	class ComplexMulPattern): New.
	(slp_patterns): Add ComplexMulPattern.

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 71e226505b2619d10982b59a4ebbed73a70f29be..ddaf1abaccbd44dae11ea902ec38b474aacfb8e1 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6143,6 +6143,28 @@ rotations @var{m} of 90 or 270.

 This pattern is not allowed to @code{FAIL}.

+@cindex @code{cmul@var{m}4} instruction pattern
+@item @samp{cmul@var{m}4}
+Perform a vector floating point multiplication of complex numbers in operand 0
+and operand 1.
+
+The instruction must perform the operation on data loaded contiguously into the
+vectors.
+The operation is only supported for vector modes @var{m}.
+
+This pattern is not allowed to @code{FAIL}.
+
+@cindex @code{cmul_conj@var{m}4} instruction pattern
+@item @samp{cmul_conj@var{m}4}
+Perform a vector floating point multiplication of complex numbers in operand 0
+and the conjugate of operand 1.
+
+The instruction must perform the operation on data loaded contiguously into the
+vectors.
+The operation is only supported for vector modes @var{m}.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{ffs@var{m}2} instruction pattern
 @item @samp{ffs@var{m}2}
 Store into operand 0 one plus the index of the least significant 1-bit
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 956a65a338c157b51de7e78a3fb005b5af78ef31..51bebf8701af262b22d66d19a29a8dafb74db1f0 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -277,6 +277,9 @@ DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary)
 DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
+

 /* FP scales.  */
 DEF_INTERNAL_FLT_FN (LDEXP, ECF_CONST, ldexp, binary)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 2bb0bf857977035bf562a77f5f6848e80edf936d..9c267d422478d0011f288b1f5f62daabe3989ba7 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -292,6 +292,8 @@ OPTAB_D (copysign_optab, "copysign$F$a3")
 OPTAB_D (xorsign_optab, "xorsign$F$a3")
 OPTAB_D (cadd90_optab, "cadd90$a3")
 OPTAB_D (cadd270_optab, "cadd270$a3")
+OPTAB_D (cmul_optab, "cmul$a3")
+OPTAB_D (cmul_conj_optab, "cmul_conj$a3")
 OPTAB_D (cos_optab, "cos$a2")
 OPTAB_D (cosh_optab, "cosh$a2")
 OPTAB_D (exp10_optab, "exp10$a2")
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index b2b0ac62e9a69145470f41d2bac736dd970be735..bef7cc73b21c020e4c0128df5d186a034809b103 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -743,6 +743,179 @@ class ComplexAddPattern : public ComplexPattern
   }
 };

+class ComplexMLAPattern : public ComplexPattern
+{
+  protected:
+    ComplexMLAPattern (slp_tree node, vec_info *vinfo)
+      : ComplexPattern (node, vinfo)
+    { }
+
+  protected:
+    /* Helper function of vect_match_call_complex_mla that looks up the
+       definition of LHS_0 and LHS_1 by finding the statements starting in
+       position BASE + IDX in child ROOT of NODE and tries to match the
+       definition against pair ops.
+
+       If the match is successful then ARGS will contain the operands matched
+       and the complex_operation_t type is returned.  If match is not successful
+       then CMPLX_NONE is returned and ARGS is left unmodified.  */
+
+    complex_operation_t
+    vect_match_call_complex_mla_1 (slp_tree node, slp_tree *res, int root,
+                                   int base, int idx, vec *args)
+    {
+      gcc_assert (base >= 0 && idx >= 0 && node != NULL);
+
+      if ((unsigned)root >= SLP_TREE_CHILDREN (node).length ())
+        return CMPLX_NONE;
+
+      slp_tree data = SLP_TREE_CHILDREN (node)[root];
+
+      /* If it's a VEC_PERM_EXPR we need to look one deeper.  */
+      if (node->code == VEC_PERM_EXPR)
+        data = SLP_TREE_CHILDREN (data)[root];
+
+      int lhs_0 = base + idx;
+      int lhs_1 = base + idx + 1;
+
+      vec stmts = SLP_TREE_SCALAR_STMTS (data);
+      if (stmts.length () <= (unsigned)lhs_1)
+        return CMPLX_NONE;
+
+      gimple *stmt_0 = STMT_VINFO_STMT (stmts[lhs_0]);
+      gimple *stmt_1 = STMT_VINFO_STMT (stmts[lhs_1]);
+
+      if (gimple_expr_type (stmt_0) != gimple_expr_type (stmt_1))
+        return CMPLX_NONE;
+
+      if (res)
+        *res = data;
+
+      return vect_detect_pair_op (base, data, args);
+    }
+};
+
+class ComplexMulPattern : public ComplexMLAPattern
+{
+  protected:
+    ComplexMulPattern (slp_tree node, vec_info *vinfo)
+      : ComplexMLAPattern (node, vinfo)
+    {
+      this->m_arity = 2;
+      this->m_num_args = 2;
+      this->m_vects.create (0);
+      this->m_defs.create (0);
+    }
+
+  public:
+    ~ComplexMulPattern ()
+    {
+      this->m_vects.release ();
+      this->m_defs.release ();
+    }
+
+    static VectPattern* create (slp_tree node, vec_info *vinfo)
+    {
+      return new ComplexMulPattern (node, vinfo);
+    }
+
+    const char* get_name ()
+    {
+      return "Complex Multiplication";
+    }
+
+
+    /* Pattern matcher for trying to match complex multiply pattern in SLP tree
+       using N statements STMT_0 and STMT_1 as the root statements by finding
+       the statements starting in position IDX in NODE.  If the operation
+       matches then IFN is set to the operation it matched and the arguments to
+       the two replacement statements are put in VECTS.
+
+       If no match is found then IFN is set to IFN_LAST and VECTS is unchanged.
+
+       This function matches the patterns shaped as:
+
+         double ax = (b[i+1] * a[i]);
+         double bx = (a[i+1] * b[i]);
+
+         c[i] = c[i] - ax;
+         c[i+1] = c[i+1] + bx;
+
+       If a match occurred then TRUE is returned, else FALSE.  */
+
+    bool
+    matches (stmt_vec_info *stmts, int idx)
+    {
+      this->m_last_ifn = IFN_LAST;
+      this->m_vects.truncate (0);
+      this->m_vects.create (6);
+      int base = idx - (this->m_arity - 1);
+      this->m_last_idx = idx;
+      this->m_stmt_info = stmts[0];
+
+      complex_operation_t op1 = vect_detect_pair_op (base, this->m_node, NULL);
+
+      if (op1 != MINUS_PLUS)
+        return false;
+
+      slp_tree sub1a, sub1b, sub2;
+      /* Now operand1+3 must lead to another expression.  */
+      auto_vec args0;
+      complex_operation_t op2
+        = vect_match_call_complex_mla_1 (this->m_node, &sub1a, 0, base, 0,
+                                         &args0);
+
+      if (op2 != MULT_MULT)
+        return false;
+
+      /* Now operand2+4 must lead to another expression.  */
+      auto_vec args1;
+      complex_operation_t op3
+        = vect_match_call_complex_mla_1 (this->m_node, &sub1b, 1, base, 0,
+                                         &args1);
+
+      if (op3 != MULT_MULT)
+        return false;
+
+      /* Now operand2+4 may lead to another expression.  */
+      auto_vec args2;
+      complex_operation_t op4
+        = vect_match_call_complex_mla_1 (sub1b, &sub2, 1, base, 0, &args2);
+
+      if (op4 != CMPLX_NONE && op4 != NEG_NEG)
+        return false;
+
+      if (op4 == CMPLX_NONE)
+        {
+          this->m_last_ifn = IFN_COMPLEX_MUL;
+          /* Correct the arguments after matching.  */
+          std::swap (args0[2], args1[0]);
+        }
+      else if (op4 == NEG_NEG)
+        {
+          this->m_last_ifn = IFN_COMPLEX_MUL_CONJ;
+          /* Check if the conjugate is on the first or second parameter.  */
+          if (args1[1] == args1[3] && args0[1] == args0[3])
+            {
+              this->m_vects.quick_push (args0[3]);
+              this->m_vects.quick_push (args0[0]);
+              this->m_vects.quick_push (args2[0]);
+              this->m_vects.quick_push (args0[2]);
+            }
+          else
+            {
+              /* Correct the arguments after matching.  */
+              std::swap (args0[2], args2[0]);
+            }
+        }
+
+      if (this->m_vects.length () == 0)
+        this->m_vects.splice (args0);
+
+      return this->m_last_ifn != IFN_LAST && store_results ();
+    }
+};
+
 #define SLP_PATTERN(x) &x::create
 VectPatternDecl slp_patterns[]
 {
@@ -750,6 +923,7 @@ VectPatternDecl slp_patterns[]
      order patterns from the largest to the smallest.  Especially if they
      overlap in what they can detect.  */

+  SLP_PATTERN (ComplexMulPattern),
   SLP_PATTERN (ComplexAddPattern),
 };
 #undef SLP_PATTERN

From patchwork Fri Sep 25 14:29:36 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371344
Date: Fri, 25 Sep 2020 15:29:36 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2 8/16]middle-end: add Complex Multiply and Accumulate/Subtract and Multiply and Accumulate/Subtract with Conjucate detection
Message-ID: <20200925142931.GA21805@arm.com>
DB5EUR03FT054.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR08MB5033 X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_LOTSOFHASH, MSGID_FROM_MTA_HEADER, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: nd@arm.com, rguenther@suse.de, ook@ucw.cz Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Hi All, This patch adds pattern detections for the following operation: Complex FMLA, Conjucate FMLA of the second parameter and FMLS. c += a * b, c += a * conj (b), c -= a * b and c -= a * conj (b) For the conjucate cases it supports under fast-math that the operands that is being conjucated be flipped by flipping the arguments to the optab. This allows it to support c = conj (a) * b and c += conj (a) * b. where a, b and c are complex numbers. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * doc/md.texi: Document optabs. * internal-fn.def (COMPLEX_FMA, COMPLEX_FMA_CONJ, COMPLEX_FMS, COMPLEX_FMS_CONJ): New. * optabs.def (cmla_optab, cmla_conj_optab, cmls_optab, cmls_conj_optab): New. * tree-vect-slp-patterns.c (class ComplexFMAPattern): New. (slp_patterns): Add ComplexFMAPattern. diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index ddaf1abaccbd44dae11ea902ec38b474aacfb8e1..d8142f745050d963e8d15c7793fae06d9ad02020 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -6143,6 +6143,50 @@ rotations @var{m} of 90 or 270. 
This pattern is not allowed to @code{FAIL}. +@cindex @code{cmla@var{m}4} instruction pattern +@item @samp{cmla@var{m}4} +Perform a vector floating point multiply and accumulate of complex numbers +in operand 0, operand 1 and operand 2. + +The instruction must perform the operation on data loaded contiguously into the +vectors. +The operation is only supported for vector modes @var{m}. + +This pattern is not allowed to @code{FAIL}. + +@cindex @code{cmla_conj@var{m}4} instruction pattern +@item @samp{cmla_conj@var{m}4} +Perform a vector floating point multiply and accumulate of complex numbers +in operand 0, operand 1 and the conjugate of operand 2. + +The instruction must perform the operation on data loaded contiguously into the +vectors. +The operation is only supported for vector modes @var{m}. + +This pattern is not allowed to @code{FAIL}. + +@cindex @code{cmls@var{m}4} instruction pattern +@item @samp{cmls@var{m}4} +Perform a vector floating point multiply and subtract of complex numbers +in operand 0, operand 1 and operand 2. + +The instruction must perform the operation on data loaded contiguously into the +vectors. +The operation is only supported for vector modes @var{m}. + +This pattern is not allowed to @code{FAIL}. + +@cindex @code{cmls_conj@var{m}4} instruction pattern +@item @samp{cmls_conj@var{m}4} +Perform a vector floating point multiply and subtract of complex numbers +in operand 0, operand 1 and the conjugate of operand 2. + +The instruction must perform the operation on data loaded contiguously into the +vectors. +The operation is only supported for vector modes @var{m}. + +This pattern is not allowed to @code{FAIL}. 
+ @cindex @code{cmul@var{m}4} instruction pattern @item @samp{cmul@var{m}4} Perform a vector floating point multiplication of complex numbers in operand 0 diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 51bebf8701af262b22d66d19a29a8dafb74db1f0..cc0135cb2c1c14b593181edeaa5f896fa6c4c659 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -286,6 +286,10 @@ DEF_INTERNAL_FLT_FN (LDEXP, ECF_CONST, ldexp, binary) /* Ternary math functions. */ DEF_INTERNAL_FLT_FLOATN_FN (FMA, ECF_CONST, fma, ternary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_FMA, ECF_CONST, cmla, ternary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_FMA_CONJ, ECF_CONST, cmla_conj, ternary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_FMS, ECF_CONST, cmls, ternary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_FMS_CONJ, ECF_CONST, cmls_conj, ternary) /* Unary integer ops. */ DEF_INTERNAL_INT_FN (CLRSB, ECF_CONST | ECF_NOTHROW, clrsb, unary) diff --git a/gcc/optabs.def b/gcc/optabs.def index 9c267d422478d0011f288b1f5f62daabe3989ba7..19db9c00896cd08adfd20a01669990bbbebd79f1 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -294,6 +294,10 @@ OPTAB_D (cadd90_optab, "cadd90$a3") OPTAB_D (cadd270_optab, "cadd270$a3") OPTAB_D (cmul_optab, "cmul$a3") OPTAB_D (cmul_conj_optab, "cmul_conj$a3") +OPTAB_D (cmla_optab, "cmla$a4") +OPTAB_D (cmla_conj_optab, "cmla_conj$a4") +OPTAB_D (cmls_optab, "cmls$a4") +OPTAB_D (cmls_conj_optab, "cmls_conj$a4") OPTAB_D (cos_optab, "cos$a2") OPTAB_D (cosh_optab, "cosh$a2") OPTAB_D (exp10_optab, "exp10$a2") diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c index bef7cc73b21c020e4c0128df5d186a034809b103..d9554aaaf2cce14bb5b9c68e6141ea7f555a35de 100644 --- a/gcc/tree-vect-slp-patterns.c +++ b/gcc/tree-vect-slp-patterns.c @@ -916,6 +916,199 @@ class ComplexMulPattern : public ComplexMLAPattern } }; +class ComplexFMAPattern : public ComplexMLAPattern +{ + protected: + ComplexFMAPattern (slp_tree node, vec_info *vinfo) + : ComplexMLAPattern (node, vinfo) + { + this->m_arity = 2; + 
this->m_num_args = 3; + this->m_vects.create (0); + this->m_defs.create (0); + } + + public: + ~ComplexFMAPattern () + { + this->m_vects.release (); + this->m_defs.release (); + } + + static VectPattern* create (slp_tree node, vec_info *vinfo) + { + return new ComplexFMAPattern (node, vinfo); + } + + const char* get_name () + { + return "Complex FM(A|S)"; + } + + /* Pattern matcher for trying to match complex multiply and accumulate + pattern in SLP tree using N statements STMT_0 and STMT_0 as the root + statements by finding the statements starting in position IDX in NODE. + If the operation matches then IFN is set to the operation it matched and + the arguments to the two replacement statements are put in VECTS. + + If no match is found then IFN is set to IFN_LAST and VECTS is unchanged. + + This function matches the patterns shaped as: + + double ax = (b[i+1] * a[i]) + (b[i] * a[i]); + double bx = (a[i+1] * b[i]) - (a[i+1] * b[i+1]); + + c[i] = c[i] - ax; + c[i+1] = c[i+1] + bx; + + If a match occurred then TRUE is returned, else FALSE. */ + bool + matches (stmt_vec_info *stmts, int idx) + { + this->m_last_ifn = IFN_LAST; + this->m_vects.truncate (0); + this->m_vects.create (6); + int base = idx - (this->m_arity - 1); + this->m_last_idx = idx; + slp_tree node = this->m_node; + this->m_stmt_info = stmts[0]; + + + /* Find the two components. Rotation in the complex plane will modify + the operations: + + * Rotation 0: + + + * Rotation 90: - + + * Rotation 180: - - + * Rotation 270: + -. */ + auto_vec args0; + complex_operation_t op1 = vect_detect_pair_op (base, node, &args0); + + if (op1 == CMPLX_NONE) + return false; + + slp_tree sub1, sub2a, sub2b, sub3; + + /* Now operand2+4 must lead to another expression. */ + auto_vec args1; + complex_operation_t op2 + = vect_match_call_complex_mla_1 (node, &sub1, 1, base, 0, &args1); + + if (op2 != MINUS_PLUS && op2 != PLUS_MINUS) + return false; + + /* Now operand1+3 must lead to another expression. 
*/ + auto_vec args2; + complex_operation_t op3 + = vect_match_call_complex_mla_1 (sub1, &sub2a, 0, base, 0, &args2); + + if (op3 != MULT_MULT) + return false; + + /* Now operand2+4 must lead to another expression. */ + auto_vec args3; + complex_operation_t op4 + = vect_match_call_complex_mla_1 (sub1, &sub2b, 1, base, 0, &args3); + + if (op4 != MULT_MULT) + return false; + + /* Now operand2+4 may lead to another expression. */ + auto_vec args4; + complex_operation_t op5 + = vect_match_call_complex_mla_1 (sub2b, &sub3, 1, base, 0, &args4); + + /* Or operand1+3 may lead to another expression. */ + auto_vec args5; + complex_operation_t op6 + = vect_match_call_complex_mla_1 (sub2b, &sub3, 0, base, 0, &args5); + + if (op1 == PLUS_MINUS && op2 == MINUS_PLUS) + { + + /* The FMS conjugate has a different layout so check that. */ + if (op5 == CMPLX_NONE && op6 == CMPLX_NONE) + { + op6 = vect_match_call_complex_mla_1 (sub2a, &sub3, 0, base, 0, + &args5); + if (op6 == CMPLX_NONE) + op6 = vect_match_call_complex_mla_1 (sub2a, &sub3, 1, base, 0, + &args5); + } + if (op5 == CMPLX_NONE && op6 != NEG_NEG) + this->m_last_ifn = IFN_COMPLEX_FMS; + else if (op5 == NEG_NEG || op6 == NEG_NEG) + this->m_last_ifn = IFN_COMPLEX_FMS_CONJ; + } + else if (op1 == PLUS_PLUS && op2 == MINUS_PLUS) + { + if (op5 == CMPLX_NONE && op6 != NEG_NEG) + this->m_last_ifn = IFN_COMPLEX_FMA; + else if (op5 == NEG_NEG || op6 == NEG_NEG) + this->m_last_ifn = IFN_COMPLEX_FMA_CONJ; + } + + if (this->m_last_ifn == IFN_LAST) + return false; + + if (this->m_last_ifn == IFN_COMPLEX_FMA_CONJ) + { + /* Check if the conjugate is on the first or second parameter. 
*/ + if (op5 == NEG_NEG) + { + this->m_vects.quick_push (args0[0]); + this->m_vects.quick_push (args2[2]); + this->m_vects.quick_push (args3[2]); + this->m_vects.quick_push (args0[2]); + this->m_vects.quick_push (args4[0]); + this->m_vects.quick_push (args2[3]); + } + else + { + this->m_vects.quick_push (args0[0]); + this->m_vects.quick_push (args2[3]); + this->m_vects.quick_push (args2[0]); + this->m_vects.quick_push (args0[2]); + this->m_vects.quick_push (args5[0]); + this->m_vects.quick_push (args2[2]); + } + } + else if (this->m_last_ifn == IFN_COMPLEX_FMS_CONJ) + { + /* Check if the conjugate is on the first or second parameter. */ + if (op6 == NEG_NEG) + { + this->m_vects.quick_push (args0[0]); + this->m_vects.quick_push (args3[1]); + this->m_vects.quick_push (args2[3]); + this->m_vects.quick_push (args0[2]); + this->m_vects.quick_push (args5[0]); + this->m_vects.quick_push (args2[1]); + } + else + { + this->m_vects.quick_push (args0[0]); + this->m_vects.quick_push (args2[2]); + this->m_vects.quick_push (args3[2]); + this->m_vects.quick_push (args0[2]); + this->m_vects.quick_push (args2[0]); + this->m_vects.quick_push (args5[0]); + } + } + else + { + this->m_vects.quick_push (args0[0]); + this->m_vects.quick_push (args2[3]); + this->m_vects.quick_push (args3[2]); + this->m_vects.quick_push (args0[2]); + this->m_vects.quick_push (args3[3]); + this->m_vects.quick_push (args2[2]); + } + + return store_results (); + } +}; + #define SLP_PATTERN(x) &x::create VectPatternDecl slp_patterns[] { @@ -923,6 +1116,7 @@ VectPatternDecl slp_patterns[] order patterns from the largest to the smallest. Especially if they overlap in what they can detect. 
*/ + SLP_PATTERN (ComplexFMAPattern), SLP_PATTERN (ComplexMulPattern), SLP_PATTERN (ComplexAddPattern), }; From patchwork Fri Sep 25 14:29:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1371345 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=QfbkqSnJ; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=QfbkqSnJ; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4ByZ7z4Zwwz9sR4 for ; Sat, 26 Sep 2020 00:30:23 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1B26F39960D7; Fri, 25 Sep 2020 14:30:21 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR01-DB5-obe.outbound.protection.outlook.com (mail-eopbgr150049.outbound.protection.outlook.com [40.107.15.49]) by sourceware.org (Postfix) with ESMTPS id 0335F39960D5 for ; Fri, 25 Sep 2020 14:30:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 
0335F39960D5 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=Tamar.Christina@arm.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dPrYrpWAdmL+fOyMtQXelRRmeOXOQC2lTRRTDmpMTmE=; b=QfbkqSnJ9rLafF+4JDGfJTh0sygJW8vIXkx+h6icSU3ZFB0uPN036gyFPvoCoNxQpcY9uPvr/lFNlbknBsOJGhlLqxbSDTuV9vAQqX8BpKvGG4dh/sjb39c38wvCCrhiXzArv+rnYlp+PnR1rO8lzTGcyMfVkA6KEKLEgdLq9O0= Received: from AM6PR10CA0008.EURPRD10.PROD.OUTLOOK.COM (2603:10a6:209:89::21) by PR3PR08MB5563.eurprd08.prod.outlook.com (2603:10a6:102:89::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.20; Fri, 25 Sep 2020 14:30:00 +0000 Received: from VE1EUR03FT008.eop-EUR03.prod.protection.outlook.com (2603:10a6:209:89:cafe::96) by AM6PR10CA0008.outlook.office365.com (2603:10a6:209:89::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:30:00 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by VE1EUR03FT008.mail.protection.outlook.com (10.152.18.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:29:59 +0000 Received: ("Tessian 
outbound bac899b43a54:v64"); Fri, 25 Sep 2020 14:29:59 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 2427287593450ba6 X-CR-MTA-TID: 64aa7808 Received: from f5738a187d52.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id DA4059C4-1E59-47D0-AABC-3AF3174B4725.1; Fri, 25 Sep 2020 14:29:54 +0000 Received: from EUR01-HE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id f5738a187d52.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 25 Sep 2020 14:29:54 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=TSl4ZLoN2rDNROnaLXzbO98VnLBVpeoY0d1ftqv4UWFO/f+OU6Wp/Wa9Z3G3LWOvC+1+h9Ak9c03I75ytU1W/qzCHjfF+bzvKRSDT95G0Etf490giSZTjgWHfjlWd5EVjxZXCnxIlmjZjkAzg6u22cBsS/8+LexSSP9W0ef7n82FzVMwbuSPXFZI6/jCv/sLc0k4YMlFFAiDLDzP0Xi7eN1gxaHKamF3p0VAFQtzSu4YSEY7Io6bMWNrqFqckiFHNJCJ27Ds8FeiPfxyqjs8Y3vrnArZ7XgVlbfSQQ+xVpVrLwM9bBeK2lNAYZH0cWzw2oxNSJbHn7As45xp756H4g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dPrYrpWAdmL+fOyMtQXelRRmeOXOQC2lTRRTDmpMTmE=; b=Cyqs/hAn2LS1YJAUqakk6fr4H/csp0XaM3q+tNNdDzE4kfpLlzRG1ydLghkzWDwM+Tx6a2qlIJhRWCZPVGo77liUOKTZUgHMWif+Uaz2ltV7spu+lCZi3LXFANp4XiI/DO4pGWiy7+mNT12wPTh6U4eS++F4+iLbC/rpuFt63ZsNWfIxUpe03yOsywErj92xWbcZ9H8RfXZmkriZLuLqdvzFUm8Lqh/uV/75SLZFdmRVcBfJsn1Z9Spexj0We+yw/KIXTI2VEG0Cm4YRaPCWyBW6qS3Qa2c49uhXaH5rAg0/9lxNlQlRFTg1a7NJ2f+owCgsmgllDhuLsm1RpOu86A== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dPrYrpWAdmL+fOyMtQXelRRmeOXOQC2lTRRTDmpMTmE=; 
b=QfbkqSnJ9rLafF+4JDGfJTh0sygJW8vIXkx+h6icSU3ZFB0uPN036gyFPvoCoNxQpcY9uPvr/lFNlbknBsOJGhlLqxbSDTuV9vAQqX8BpKvGG4dh/sjb39c38wvCCrhiXzArv+rnYlp+PnR1rO8lzTGcyMfVkA6KEKLEgdLq9O0= Authentication-Results-Original: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by VE1PR08MB5760.eurprd08.prod.outlook.com (2603:10a6:800:1af::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.20; Fri, 25 Sep 2020 14:29:52 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2%7]) with mapi id 15.20.3412.024; Fri, 25 Sep 2020 14:29:52 +0000 Date: Fri, 25 Sep 2020 15:29:49 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Subject: [PATCH v2 9/16][docs] Add some missing test directive documentation. Message-ID: <20200925142948.GA23047@arm.com> Content-Disposition: inline User-Agent: Mutt/1.9.4 (2018-02-28) X-ClientProxiedBy: LO2P265CA0472.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a2::28) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from arm.com (217.140.106.53) by LO2P265CA0472.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a2::28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.20 via Frontend Transport; Fri, 25 Sep 2020 14:29:51 +0000 X-Originating-IP: [217.140.106.53] X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: 61d6eee8-b4bd-43a3-f5ef-08d8615f7bdf X-MS-TrafficTypeDiagnostic: VE1PR08MB5760:|PR3PR08MB5563: X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true NoDisclaimer: true X-MS-Oob-TLC-OOBClassifiers: OLM:8273;OLM:8273; X-MS-Exchange-SenderADCheck: 1 
X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: uRNFLBaFfLRAuVDu/0xlKHuEzI8NX7xJGcG86Yt/TLBKYumsif4FPy0MRcby86qSzvlqo7HXiLMT+wKdT2t1LODZJE5EduH0GhcaW+CA7T8Gv3e+WFhX97oTeQaCSJkocu5CCxcDHUd5Egmzl8FawvjyYZBbP8+GsjLMaYFsB+XleWJwt1jzAVbK6bYmHKoA5I/JnvkiJbImAcXFt+xQe2qB8TmqGDaMzQws56mpwToTkCwa+dWZdtanRPGvDG0pc+M3JGDXQTibtCJy5WaKUhYNbOemjytH507QtmXEx2NB7xTxqTBHDMMjqm48WCzU7nSdcCgQLRy2Y/FdT8IlVfB6+zX/i8Tj+YDRgMXo/dkc+A1QzacVtV5Qdj0C07ip4zZFdW/SEQqTmuuiCXgWAtcWL6D1oTREmy4Ut0n4dYE= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(136003)(376002)(346002)(366004)(39860400002)(396003)(564344004)(1076003)(26005)(2616005)(44832011)(235185007)(4326008)(8886007)(86362001)(478600001)(36756003)(44144004)(66946007)(52116002)(7696005)(956004)(66476007)(66616009)(66556008)(33964004)(55016002)(8936002)(316002)(2906002)(33656002)(16526019)(8676002)(186003)(5660300002)(4743002)(6916009)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData: 1Ag6BBs1AAFtoTLP0cnjW3Pnip0RsPk+6s2NNMwxwtEwGI5xRyOhQo0YoGifPDllbXBiWlo5ccbKTpL+UPHpjRAITTHEBT9RZ/4oTduP9LqQwr2UaUtGxyCNPGthgntlMRBCtxCdu5aYVr5oCxhk5+7W3Qjp05drjkcBemGRqO7rHwKDqIRtX/PqRkcYKB2SfficNtkI0Kux7twTcA1JBOaUkwa7f4xYXKqGiqwq504a/ES511SOr2Aa+8i3cQ6tfDxbul9D3AyNzQRdJv6yEtj8MtZV//waYU55m3FYjGB6h2+rf8/6Tyvtmlns0S42nEW1ynZekriCia792LXNrZkLi2xSOBbsTc8Ue6l5y4OZyIfriqAtRjDIAWedDer15cDBhjVlW2OQLCHwzqs2K/zIKMm9jTyzbslkktNcudSceCt5vq6ic+noI81D+VxQHHhcgBc28X58+qF1T3FlwLKF6KSLcZD0vR7vKo5yL+kWVtDTk+Y176YfhX0xPZZdVJQgmnYmGJe79/+V7JZM/3S87HFCBfr8JaSUFcmAdA9qbxv3xHxrJ5RspiVq+E6YEYs7qBGX63QhPbZ6J7KCMtIoC/upg3JmQE21hJb1U0KG0tRfp2O3azhQWUFuN+Cn6gRAKYPmT42PIybVB9fDlw== X-MS-Exchange-Transport-Forked: True X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5760 Original-Authentication-Results: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none 
action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: VE1EUR03FT008.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: ea6b0685-f7d8-4ad0-a8b3-08d8615f76f0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: rRtT3AP2scHvx3BUVQY0LdUoR0nNiGMPBMYTLFIhcRX7RWO1o0GvIgeXvnVF4ozqpLIhJgsTV6CLeqtUuOGdaVjStg/VuRt6jxIS/9TCam854JIvQyIREEUgOjPcXmGcxTWhxgw8v7wI6Yu0/vxiJcXjzMG66OVoquf8vF0hYEJfDKqYRNDsteO8KG6k0fFGBNfHDlZNsclhSTuNM5Yq+Z/irEBwNAw/b+c+jH5DvRSP0/5O53pUqRl01LAZ6bZncg8xUKC1qqgHOg6Ed20RIbkZj8YY89nWayWVERErCRCBuU6IGuw1+pXXrnNoHpnh66C7M+xf8zkhBhkZORxVp2Im5VoJVJ4SQbh82CW50W7lj1QT0dUNQVrvuTQqvDN5v8cr5381bXz1xALOJp5QRg4yNP76OwNNUF4cGAxd6wQ6KPA30IRRLpz1oKAguKmotz22VCSrB6B3GFDBaH5e4g== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(376002)(346002)(136003)(39860400002)(396003)(46966005)(47076004)(66616009)(107886003)(4743002)(81166007)(4326008)(82740400003)(36756003)(70586007)(186003)(2906002)(1076003)(336012)(478600001)(36906005)(6916009)(33656002)(5660300002)(26005)(86362001)(956004)(235185007)(70206006)(316002)(2616005)(16526019)(8936002)(8676002)(33964004)(82310400003)(356005)(7696005)(8886007)(564344004)(55016002)(44144004)(44832011)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Sep 2020 14:29:59.9859 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 61d6eee8-b4bd-43a3-f5ef-08d8615f7bdf X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: VE1EUR03FT008.eop-EUR03.prod.protection.outlook.com 
X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PR3PR08MB5563 X-Spam-Status: No, score=-14.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, MSGID_FROM_MTA_HEADER, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: sandra@codesourcery.com, nd@arm.com, joseph@codesourcery.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches"

Hi All,

This adds documentation for some test directives that were missing.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* doc/sourcebuild.texi (vect_complex_rot_, arm_v8_3a_complex_neon_ok, arm_v8_3a_complex_neon_hw): New.

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 65b2e552b74becdbc5474ba5ac387a4a0296e341..3abd8f631cb0234076641e399f6f00768b38ebee 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1671,6 +1671,10 @@ Target supports a vector dot-product of @code{signed short}. @item vect_udot_hi Target supports a vector dot-product of @code{unsigned short}. +@item vect_complex_rot_@var{n} +Target supports a vector complex addition and complex fma of mode @var{n}. +Possible values of @var{n} are @code{hf}, @code{sf}, @code{df}. + @item vect_pack_trunc Target supports a vector demotion (packing) of @code{short} to @code{char} and from @code{int} to @code{short} using modulo arithmetic. @@ -1941,6 +1945,16 @@ ARM target supports executing instructions from ARMv8.2-A with the Dot Product extension. 
Some multilibs may be incompatible with these options. Implies arm_v8_2a_dotprod_neon_ok. +@item arm_v8_3a_complex_neon_ok +@anchor{arm_v8_3a_complex_neon_ok} +ARM target supports options to generate complex number arithmetic instructions +from ARMv8.3-A. Some multilibs may be incompatible with these options. + +@item arm_v8_3a_complex_neon_hw +ARM target supports executing complex arithmetic instructions from ARMv8.3-A. +Some multilibs may be incompatible with these options. +Implies arm_v8_3a_complex_neon_ok. + @item arm_fp16fml_neon_ok @anchor{arm_fp16fml_neon_ok} ARM target supports extensions to generate the @code{VFMAL} and @code{VFMLS} From patchwork Fri Sep 25 14:30:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1371347 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=80P8J/69; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=80P8J/69; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4ByZ8X6CbQz9sRf for ; Sat, 26 Sep 2020 
00:30:52 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 939E639960D5; Fri, 25 Sep 2020 14:30:50 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM5-obe.outbound.protection.outlook.com (mail-eopbgr30048.outbound.protection.outlook.com [40.107.3.48]) by sourceware.org (Postfix) with ESMTPS id 3B629398B879 for ; Fri, 25 Sep 2020 14:30:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 3B629398B879 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=Tamar.Christina@arm.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dh9WrnhdSRo5WsQL7a43ZshtuCrUw39WEMclQLA3RxM=; b=80P8J/69XlAN1EeYddyVo1OuMuXz29mV0zq7QftK0uq1hI33ECFyS4ZuZtiq9fPp4f8RIs3q7etaw5JuY+eGfV08ob3EIBEtS3xcEdLbVp2TfSG7v1ATrd0bMB03nKjBqRM5XLZ2cpd0XBirG3joSMOrDXytJnBEARPtozZNg+Q= Received: from DB6PR07CA0002.eurprd07.prod.outlook.com (2603:10a6:6:2d::12) by VE1PR08MB4942.eurprd08.prod.outlook.com (2603:10a6:803:10f::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21; Fri, 25 Sep 2020 14:30:44 +0000 Received: from DB5EUR03FT060.eop-EUR03.prod.protection.outlook.com (2603:10a6:6:2d:cafe::49) by DB6PR07CA0002.outlook.office365.com (2603:10a6:6:2d::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3433.14 via Frontend Transport; Fri, 25 Sep 2020 14:30:44 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; Received-SPF: 
Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5EUR03FT060.mail.protection.outlook.com (10.152.21.231) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:30:43 +0000 Received: ("Tessian outbound 7161e0c2a082:v64"); Fri, 25 Sep 2020 14:30:43 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 9d5742090f30eb25 X-CR-MTA-TID: 64aa7808 Received: from 5f005b0f7ffe.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 30A383F3-32A8-43CE-A726-AB6D2D0E49F8.1; Fri, 25 Sep 2020 14:30:17 +0000 Received: from EUR01-HE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 5f005b0f7ffe.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 25 Sep 2020 14:30:17 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=APpS0TWKPkWvmensttdbcCfEui/UjiHiUhtcPzBm7IVfcl4eFUAEWgLIWuhnRXfYeB9x0PH8d6fncozrbuO3AjKQcDuOUZ0eXpBVeyI6HBKLU6VnpsHhG2ZNwos9yy7BL84K/f3tERiYfGaRXkTNnmhBFNbYolj+OQFzCQhPPEpZPmctD0Xqf7nLrot2rJPmIxKzonIlbLS4BaBS4ykJw/or5LEoxgXWbd4pFu7e6+yLUlp92JaKKmFzQTCPEVW7p6M8KadbG+wup2s3+jOcTvMOMDRZli0G9jqRym728KHTQ71ijaj15jvFU7oOzBouyp49RooU4ecA2pOg4UDTBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dh9WrnhdSRo5WsQL7a43ZshtuCrUw39WEMclQLA3RxM=; 
b=BcJg4FGDAlP/YlGkp0YIr4IpKiivHX/f8Pqq4mfkg39/yUDJ10rgQ43jC4v6L+2S1P/5jNkiMc4p5kuFNjcYTNMR52EvlH3VV1HJ66tZve0naiNAUXOAAiaJ1FQ/eRHygi6LeXy7gv//Z1Ue4w9sp31iSmRlS9gaNRixQMJaKav2kZzM+i7m0VHcTGMlX75ZzVEia5wBL2W8QQiR0rp0azTFSmXdnr8gwqEhC/8lAJCjslsDGPCIlz6hRRIonkHRP4WhyH2RjMPJsGrTzLf3T9a4DcaN2YCUNzVwf6wBU9bcTGaElfUtiitpOyqzLt316bgewQWnhYGHuOCn13Pv8w== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=dh9WrnhdSRo5WsQL7a43ZshtuCrUw39WEMclQLA3RxM=; b=80P8J/69XlAN1EeYddyVo1OuMuXz29mV0zq7QftK0uq1hI33ECFyS4ZuZtiq9fPp4f8RIs3q7etaw5JuY+eGfV08ob3EIBEtS3xcEdLbVp2TfSG7v1ATrd0bMB03nKjBqRM5XLZ2cpd0XBirG3joSMOrDXytJnBEARPtozZNg+Q= Authentication-Results-Original: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by VE1PR08MB5760.eurprd08.prod.outlook.com (2603:10a6:800:1af::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.20; Fri, 25 Sep 2020 14:30:15 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2%7]) with mapi id 15.20.3412.024; Fri, 25 Sep 2020 14:30:15 +0000 Date: Fri, 25 Sep 2020 15:30:07 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Subject: [PATCH v2 10/16]AArch64: Add NEON RTL patterns for Complex Addition, Multiply and FMA. 
Message-ID: <20200925143005.GA24088@arm.com>
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
MIME-Version: 1.0
Cc: Richard.Earnshaw@arm.com, nd@arm.com, Marcus.Shawcroft@arm.com

Hi All,

This adds implementations of the optabs for complex operations.  With this,
the following C code:

  void f90 (float complex a[restrict N], float complex b[restrict N],
            float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  f90:
          mov     x3, 0
          .p2align 3,,7
  .L2:
          ldr     q0, [x0, x3]
          ldr     q1, [x1, x3]
          fcadd   v0.4s, v0.4s, v1.4s, #90
          str     q0, [x2, x3]
          add     x3, x3, 16
          cmp     x3, 1600
          bne     .L2
          ret

instead of

  f90:
          add     x3, x1, 1600
          .p2align 3,,7
  .L2:
          ld2     {v4.4s - v5.4s}, [x0], 32
          ld2     {v2.4s - v3.4s}, [x1], 32
          fsub    v0.4s, v4.4s, v3.4s
          fadd    v1.4s, v5.4s, v2.4s
          st2     {v0.4s - v1.4s}, [x2], 32
          cmp     x3, x1
          bne     .L2
          ret

It defines a new iterator, VALL_ARITH, which contains the types on which we
can do general arithmetic (it excludes bfloat16).

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        * config/aarch64/aarch64-simd.md (cadd3, cml4, cmul3): New.
        * config/aarch64/iterators.md (VALL_ARITH, UNSPEC_FCMUL,
        UNSPEC_FCMUL180, UNSPEC_FCMLS, UNSPEC_FCMLS180, UNSPEC_CMLS,
        UNSPEC_CMLS180, UNSPEC_CMUL, UNSPEC_CMUL180, FCMLA_OP, FCMUL_OP,
        rot_op, rotsplit1, rotsplit2, fcmac1): New.
        (rot): Add UNSPEC_FCMLS, UNSPEC_FCMUL, UNSPEC_FCMUL180.

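For readers less familiar with FCADD, the following is a rough scalar sketch of what the #90 rotation computes when complex values are stored as interleaved (real, imaginary) pairs, which is the layout the f90 loop above uses. It is illustrative only and not part of the patch; the names model_fcadd90 and check_fcadd90 are made up for this example.

```c
#include <complex.h>
#include <stddef.h>

/* Scalar model of FCADD with rotation #90 on interleaved (re, im) pairs.
   Each result pair is a + b * I.  n_pairs counts complex elements, so the
   arrays hold 2 * n_pairs floats.  Hypothetical helper, not GCC code.  */
void
model_fcadd90 (const float *a, const float *b, float *c, size_t n_pairs)
{
  for (size_t i = 0; i < n_pairs; i++)
    {
      c[2 * i]     = a[2 * i]     - b[2 * i + 1];  /* re = a.re - b.im  */
      c[2 * i + 1] = a[2 * i + 1] + b[2 * i];      /* im = a.im + b.re  */
    }
}

/* Returns 1 if the model agrees with C99 complex arithmetic on a small
   fixed input, 0 otherwise.  The inputs are chosen so that every
   intermediate value is exactly representable as a float.  */
int
check_fcadd90 (void)
{
  const float a[4] = { 1.0f, 2.0f, -3.0f, 0.5f };
  const float b[4] = { 4.0f, -1.0f, 2.0f, 8.0f };
  float c[4];

  model_fcadd90 (a, b, c, 2);
  for (size_t i = 0; i < 2; i++)
    {
      float complex ai = a[2 * i] + a[2 * i + 1] * I;
      float complex bi = b[2 * i] + b[2 * i + 1] * I;
      float complex ref = ai + (bi * I);  /* same expression as the loop  */
      if (crealf (ref) != c[2 * i] || cimagf (ref) != c[2 * i + 1])
        return 0;
    }
  return 1;
}
```

Calling check_fcadd90 () from a test harness should return 1. The point of the sketch is that a + b * I needs no multiplication at all, only one subtract and one add per element on swapped lanes, which is why a single fcadd can replace the ld2/fsub/fadd/st2 sequence.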
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 381a702eba003520d2e83e91065d2a808b9c6493..c2ddef19e4e433f7ca055e42d1222d9dad6bd6c2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -449,6 +449,14 @@ (define_insn "aarch64_fcadd"
   [(set_attr "type" "neon_fcadd")]
 )
 
+(define_expand "cadd3"
+  [(set (match_operand:VHSDF 0 "register_operand")
+        (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand")
+                       (match_operand:VHSDF 2 "register_operand")]
+                       FCADD))]
+  "TARGET_COMPLEX"
+)
+
 (define_insn "aarch64_fcmla"
   [(set (match_operand:VHSDF 0 "register_operand" "=w")
         (plus:VHSDF (match_operand:VHSDF 1 "register_operand" "0")
@@ -508,6 +516,45 @@ (define_insn "aarch64_fcmlaq_lane"
   [(set_attr "type" "neon_fcmla")]
 )
 
+;; The complex mla/mls operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder.  Because of this, expand early.
+(define_expand "cml4"
+  [(set (match_operand:VHSDF 0 "register_operand")
+        (plus:VHSDF (match_operand:VHSDF 1 "register_operand")
+                    (unspec:VHSDF [(match_operand:VHSDF 2 "register_operand")
+                                   (match_operand:VHSDF 3 "register_operand")]
+                                   FCMLA_OP)))]
+  "TARGET_COMPLEX"
+{
+  emit_insn (gen_aarch64_fcmla (operands[0], operands[1],
+                                operands[2], operands[3]));
+  emit_insn (gen_aarch64_fcmla (operands[0], operands[0],
+                                operands[2], operands[3]));
+  DONE;
+})
+
+;; The complex mul operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder.  Because of this, expand early.
+(define_expand "cmul3"
+  [(set (match_operand:VHSDF 0 "register_operand")
+        (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand")
+                       (match_operand:VHSDF 2 "register_operand")]
+                       FCMUL_OP))]
+  "TARGET_COMPLEX"
+{
+  rtx tmp = gen_reg_rtx (mode);
+  emit_move_insn (tmp, CONST0_RTX (mode));
+  emit_insn (gen_aarch64_fcmla (operands[0], tmp,
+                                operands[1], operands[2]));
+  emit_insn (gen_aarch64_fcmla (operands[0], operands[0],
+                                operands[1], operands[2]));
+  DONE;
+})
+
+
+
 ;; These instructions map to the __builtins for the Dot Product operations.
 (define_insn "aarch64_dot"
   [(set (match_operand:VS 0 "register_operand" "=w")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 054fd8515c6ebf136da699e2993f6ebb348c3b1a..98217c9fd3ee2b6063f7564193e400e9ef71c6ac 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -182,6 +182,11 @@ (define_mode_iterator V2F [V2SF V2DF])
 ;; All Advanced SIMD modes on which we support any arithmetic operations.
 (define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF])
 
+;; All Advanced SIMD modes suitable for performing arithmetics.
+(define_mode_iterator VALL_ARITH [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
+                                  (V4HF "TARGET_SIMD_F16INST") (V8HF "TARGET_SIMD_F16INST")
+                                  V2SF V4SF V2DF])
+
 ;; All Advanced SIMD modes suitable for moving, loading, and storing.
 (define_mode_iterator VALL_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI
                                 V4HF V8HF V4BF V8BF V2SF V4SF V2DF])
@@ -705,6 +710,10 @@ (define_c_enum "unspec"
     UNSPEC_FCMLA90      ; Used in aarch64-simd.md.
     UNSPEC_FCMLA180     ; Used in aarch64-simd.md.
     UNSPEC_FCMLA270     ; Used in aarch64-simd.md.
+    UNSPEC_FCMUL        ; Used in aarch64-simd.md.
+    UNSPEC_FCMUL180     ; Used in aarch64-simd.md.
+    UNSPEC_FCMLS        ; Used in aarch64-simd.md.
+    UNSPEC_FCMLS180     ; Used in aarch64-simd.md.
     UNSPEC_ASRD         ; Used in aarch64-sve.md.
     UNSPEC_ADCLB        ; Used in aarch64-sve2.md.
     UNSPEC_ADCLT        ; Used in aarch64-sve2.md.
@@ -723,6 +732,10 @@ (define_c_enum "unspec"
     UNSPEC_CMLA180      ; Used in aarch64-sve2.md.
     UNSPEC_CMLA270      ; Used in aarch64-sve2.md.
     UNSPEC_CMLA90       ; Used in aarch64-sve2.md.
+    UNSPEC_CMLS         ; Used in aarch64-sve2.md.
+    UNSPEC_CMLS180      ; Used in aarch64-sve2.md.
+    UNSPEC_CMUL         ; Used in aarch64-sve2.md.
+    UNSPEC_CMUL180      ; Used in aarch64-sve2.md.
     UNSPEC_COND_FCVTLT  ; Used in aarch64-sve2.md.
     UNSPEC_COND_FCVTNT  ; Used in aarch64-sve2.md.
     UNSPEC_COND_FCVTX   ; Used in aarch64-sve2.md.
@@ -2680,6 +2693,14 @@ (define_int_iterator FMMLA [UNSPEC_FMMLA])
 (define_int_iterator BF_MLA [UNSPEC_BFMLALB
                              UNSPEC_BFMLALT])
 
+(define_int_iterator FCMLA_OP [UNSPEC_FCMLA
+                               UNSPEC_FCMLA180
+                               UNSPEC_FCMLS
+                               UNSPEC_FCMLS180])
+
+(define_int_iterator FCMUL_OP [UNSPEC_FCMUL
+                               UNSPEC_FCMUL180])
+
 ;; Iterators for atomic operations.
 
 (define_int_iterator ATOMIC_LDOP
@@ -3375,6 +3396,7 @@ (define_int_attr rot [(UNSPEC_CADD90 "90")
                        (UNSPEC_CMLA270 "270")
                        (UNSPEC_FCADD90 "90")
                        (UNSPEC_FCADD270 "270")
+                       (UNSPEC_FCMLS "0")
                        (UNSPEC_FCMLA "0")
                        (UNSPEC_FCMLA90 "90")
                        (UNSPEC_FCMLA180 "180")
@@ -3390,7 +3412,41 @@ (define_int_attr rot [(UNSPEC_CADD90 "90")
                        (UNSPEC_COND_FCMLA "0")
                        (UNSPEC_COND_FCMLA90 "90")
                        (UNSPEC_COND_FCMLA180 "180")
-                       (UNSPEC_COND_FCMLA270 "270")])
+                       (UNSPEC_COND_FCMLA270 "270")
+                       (UNSPEC_FCMUL "0")
+                       (UNSPEC_FCMUL180 "180")])
+
+;; A conjugate is a rotation of 180* around the argand plane, or * I.
+(define_int_attr rot_op [(UNSPEC_FCMLS "")
+                         (UNSPEC_FCMLS180 "_conj")
+                         (UNSPEC_FCMLA "")
+                         (UNSPEC_FCMLA180 "_conj")
+                         (UNSPEC_FCMUL "")
+                         (UNSPEC_FCMUL180 "_conj")
+                         (UNSPEC_CMLS "")
+                         (UNSPEC_CMLA "")
+                         (UNSPEC_CMLA180 "_conj")
+                         (UNSPEC_CMUL "")
+                         (UNSPEC_CMUL180 "_conj")])
+
+(define_int_attr rotsplit1 [(UNSPEC_FCMLA "0")
+                            (UNSPEC_FCMLA180 "0")
+                            (UNSPEC_FCMUL "0")
+                            (UNSPEC_FCMUL180 "0")
+                            (UNSPEC_FCMLS "270")
+                            (UNSPEC_FCMLS180 "90")])
+
+(define_int_attr rotsplit2 [(UNSPEC_FCMLA "90")
+                            (UNSPEC_FCMLA180 "270")
+                            (UNSPEC_FCMUL "90")
+                            (UNSPEC_FCMUL180 "270")
+                            (UNSPEC_FCMLS "180")
+                            (UNSPEC_FCMLS180 "180")])
+
+(define_int_attr fcmac1 [(UNSPEC_FCMLA "a") (UNSPEC_FCMLA180 "a")
+                         (UNSPEC_FCMLS "s") (UNSPEC_FCMLS180 "s")
+                         (UNSPEC_CMLA "a") (UNSPEC_CMLA180 "a")
+                         (UNSPEC_CMLS "s") (UNSPEC_CMLS180 "s")])
 
 (define_int_attr sve_fmla_op [(UNSPEC_COND_FMLA "fmla")
                               (UNSPEC_COND_FMLS "fmls")

From patchwork Fri Sep 25 14:30:26 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371348
Date: Fri, 25 Sep 2020 15:30:26 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2 11/16]AArch64: Add SVE RTL patterns for Complex Addition, Multiply and FMA.
Message-ID: <20200925143024.GA25584@arm.com>
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
Cc: Richard.Earnshaw@arm.com, nd@arm.com, Marcus.Shawcroft@arm.com

Hi All,

This adds implementations of the optabs for complex operations.  With this,
the following C code:

  void f90 (float complex a[restrict N], float complex b[restrict N],
            float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  f90:
          mov     x3, 0
          mov     x4, 400
          ptrue   p1.b, all
          whilelo p0.s, xzr, x4
          .p2align 3,,7
  .L2:
          ld1w    z0.s, p0/z, [x0, x3, lsl 2]
          ld1w    z1.s, p0/z, [x1, x3, lsl 2]
          fcadd   z0.s, p1/m, z0.s, z1.s, #90
          st1w    z0.s, p0, [x2, x3, lsl 2]
          incw    x3
          whilelo p0.s, x3, x4
          b.any   .L2
          ret

instead of

  f90:
          mov     x3, 0
          mov     x4, 0
          mov     w5, 200
          whilelo p0.s, wzr, w5
          .p2align 3,,7
  .L2:
          ld2w    {z4.s - z5.s}, p0/z, [x0, x3, lsl 2]
          ld2w    {z2.s - z3.s}, p0/z, [x1, x3, lsl 2]
          fsub    z0.s, z4.s, z3.s
          fadd    z1.s, z2.s, z5.s
          st2w    {z0.s - z1.s}, p0, [x2, x3, lsl 2]
          incw    x4
          inch    x3
          whilelo p0.s, w4, w5
          b.any   .L2
          ret

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        * config/aarch64/aarch64-sve.md (cadd3, cml4, cmul3): New.
        * config/aarch64/iterators.md (sve_rot1, sve_rot2): New.

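The cmul expanders in this series zero an accumulator and then emit two FCMLA instructions, with the rotation pair for UNSPEC_FCMUL being 0 and 90 in the rotsplit1/rotsplit2 and sve_rot1/sve_rot2 attributes. The sketch below is a scalar model of why two such steps add up to one complex multiply. It is illustrative only and not part of the patch; struct cpair, model_fcmla and model_cmul are names invented for the example, and the per-rotation formulas are written from the architectural description of FCMLA as I understand it.

```c
/* One complex element stored as an interleaved (re, im) pair.
   Hypothetical type for this example only.  */
struct cpair { float re, im; };

/* Scalar model of a single FCMLA step: accumulate one rotated partial
   product of a and b into acc.  rot is expected to be 0, 90, 180 or 270;
   any other value leaves acc unchanged.  */
struct cpair
model_fcmla (struct cpair acc, struct cpair a, struct cpair b, int rot)
{
  switch (rot)
    {
    case 0:   acc.re += a.re * b.re; acc.im += a.re * b.im; break;
    case 90:  acc.re -= a.im * b.im; acc.im += a.im * b.re; break;
    case 180: acc.re -= a.re * b.re; acc.im -= a.re * b.im; break;
    case 270: acc.re += a.im * b.im; acc.im -= a.im * b.re; break;
    default:  break;
    }
  return acc;
}

/* Complex multiply as two FCMLA steps on a zeroed accumulator, following
   the shape of the cmul expander: rotation #0 first, then rotation #90.  */
struct cpair
model_cmul (struct cpair a, struct cpair b)
{
  struct cpair acc = { 0.0f, 0.0f };
  acc = model_fcmla (acc, a, b, 0);   /* re = a.re*b.re, im = a.re*b.im  */
  acc = model_fcmla (acc, a, b, 90);  /* re -= a.im*b.im, im += a.im*b.re  */
  return acc;
}
```

For example, model_cmul applied to (1, 2) and (3, 4) yields (-5, 10), matching (1 + 2i)(3 + 4i) = -5 + 10i. Each step on its own contributes only half of the terms, which is consistent with the comment in the patch that the first operation does half the computation and the second does the remainder.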
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index cd79aba90ec9cdb5da9e9758495015ef36b2d869..12bc8077994f5a130ff4af6e9bfa7ca1237d0868 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -5109,6 +5109,20 @@ (define_expand "@cond_"
   "TARGET_SVE"
 )
 
+;; Predicated FCADD using ptrue for unpredicated optab for auto-vectorizer
+(define_expand "@cadd3"
+  [(set (match_operand:SVE_FULL_F 0 "register_operand")
+        (unspec:SVE_FULL_F
+          [(match_dup 3)
+           (const_int SVE_RELAXED_GP)
+           (match_operand:SVE_FULL_F 1 "register_operand")
+           (match_operand:SVE_FULL_F 2 "register_operand")]
+          SVE_COND_FCADD))]
+  "TARGET_SVE"
+{
+  operands[3] = aarch64_ptrue_reg (mode);
+})
+
 ;; Predicated FCADD, merging with the first input.
 (define_insn_and_rewrite "*cond__2"
   [(set (match_operand:SVE_FULL_F 0 "register_operand" "=w, ?&w")
@@ -6554,6 +6568,62 @@ (define_insn "@aarch64_pred_"
   [(set_attr "movprfx" "*,yes")]
 )
 
+;; unpredicated optab pattern for auto-vectorizer
+;; The complex mla/mls operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder.  Because of this, expand early.
+(define_expand "cml4"
+  [(set (match_operand:SVE_FULL_F 0 "register_operand")
+        (unspec:SVE_FULL_F
+          [(match_dup 4)
+           (match_dup 5)
+           (match_operand:SVE_FULL_F 1 "register_operand")
+           (match_operand:SVE_FULL_F 2 "register_operand")
+           (match_operand:SVE_FULL_F 3 "register_operand")]
+          FCMLA_OP))]
+  "TARGET_SVE"
+{
+  operands[4] = aarch64_ptrue_reg (mode);
+  operands[5] = gen_int_mode (SVE_RELAXED_GP, SImode);
+  emit_insn (
+    gen_aarch64_pred_fcmla (operands[0], operands[4],
+                            operands[1], operands[2],
+                            operands[3], operands[5]));
+  emit_insn (
+    gen_aarch64_pred_fcmla (operands[0], operands[4],
+                            operands[0], operands[2],
+                            operands[3], operands[5]));
+  DONE;
+})
+
+;; unpredicated optab pattern for auto-vectorizer
+;; The complex mul operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder.  Because of this, expand early.
+(define_expand "cmul3"
+  [(set (match_operand:SVE_FULL_F 0 "register_operand")
+        (unspec:SVE_FULL_F
+          [(match_dup 3)
+           (match_dup 4)
+           (match_operand:SVE_FULL_F 1 "register_operand")
+           (match_operand:SVE_FULL_F 2 "register_operand")
+           (match_dup 5)]
+          FCMUL_OP))]
+  "TARGET_SVE"
+{
+  operands[3] = aarch64_ptrue_reg (mode);
+  operands[4] = gen_int_mode (SVE_RELAXED_GP, SImode);
+  operands[5] = force_reg (mode, CONST0_RTX (mode));
+  emit_insn (
+    gen_aarch64_pred_fcmla (operands[0], operands[3], operands[1],
+                            operands[2], operands[5], operands[4]));
+  emit_insn (
+    gen_aarch64_pred_fcmla (operands[0], operands[3], operands[1],
+                            operands[2], operands[0],
+                            operands[4]));
+  DONE;
+})
+
 ;; Predicated FCMLA with merging.
 (define_expand "@cond_"
   [(set (match_operand:SVE_FULL_F 0 "register_operand")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 98217c9fd3ee2b6063f7564193e400e9ef71c6ac..7662b929e2c4f6c103cc06e051eb574247320809 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -3443,6 +3443,35 @@ (define_int_attr rotsplit2 [(UNSPEC_FCMLA "90")
                             (UNSPEC_FCMLS "180")
                             (UNSPEC_FCMLS180 "180")])
 
+;; SVE has slightly different namings from NEON so we have to split these
+;; iterators.
+(define_int_attr sve_rot1 [(UNSPEC_FCMLA "")
+                           (UNSPEC_FCMLA180 "")
+                           (UNSPEC_FCMUL "")
+                           (UNSPEC_FCMUL180 "")
+                           (UNSPEC_FCMLS "270")
+                           (UNSPEC_FCMLS180 "90")
+                           (UNSPEC_CMLA "")
+                           (UNSPEC_CMLA180 "")
+                           (UNSPEC_CMUL "")
+                           (UNSPEC_CMUL180 "")
+                           (UNSPEC_CMLS "270")
+                           (UNSPEC_CMLS180 "90")])
+
+(define_int_attr sve_rot2 [(UNSPEC_FCMLA "90")
+                           (UNSPEC_FCMLA180 "270")
+                           (UNSPEC_FCMUL "90")
+                           (UNSPEC_FCMUL180 "270")
+                           (UNSPEC_FCMLS "180")
+                           (UNSPEC_FCMLS180 "180")
+                           (UNSPEC_CMLA "90")
+                           (UNSPEC_CMLA180 "270")
+                           (UNSPEC_CMUL "90")
+                           (UNSPEC_CMUL180 "270")
+                           (UNSPEC_CMLS "180")
+                           (UNSPEC_CMLS180 "180")])
+
+
 (define_int_attr fcmac1 [(UNSPEC_FCMLA "a") (UNSPEC_FCMLA180 "a")
                          (UNSPEC_FCMLS "s") (UNSPEC_FCMLS180 "s")
                          (UNSPEC_CMLA "a") (UNSPEC_CMLA180 "a")

From patchwork Fri Sep 25 14:30:39 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1371350
Date: Fri, 25 Sep 2020 15:30:39 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2 12/16]AArch64: Add SVE2 Integer RTL patterns for Complex Addition, Multiply and FMA.
Message-ID: <20200925143037.GA26815@arm.com>
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(376002)(346002)(39860400002)(136003)(366004)(396003)(186003)(16526019)(52116002)(4743002)(4326008)(33964004)(235185007)(5660300002)(44144004)(8676002)(956004)(26005)(2906002)(2616005)(44832011)(7696005)(316002)(33656002)(66556008)(8886007)(66946007)(1076003)(55016002)(8936002)(6916009)(86362001)(36756003)(66476007)(478600001)(6666004)(66616009)(2700100001)(357404004); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData: skwFN/ziTWMG742tqcKpSq8ybD1K2mCchLj5r/zJ5o1r4ws9pWLm/1TipAsTQLsKpwDP/0qNm9DzmtMDkZiOgDrNHrvfNEmp0qfF7C2PeTJGnx+fZYYJW1gBobEw3nh1i45Qv23C5qUQG8hYk+FFKbwyjVfD+fHaPUQ6raNEXoXNs2Gjc1Bf45NuU74b4Ak6/e/hI/ZB+s625VBpYTje7MYPrsUAoQte+v3/bl+LQh/IeOzKdt5ASIouqIFCNAAZeJYpZpoOAp9nCXw2CTzGR4OXaEKWqVp5MzlIg4pzDTUMsluzeGjF32ryGZ1XdeQg8UwFAWOo/6Jsfst40mtb7Fx0C/U+qkrr6+6q+Fbnt86I8nUAAjCvhclfbo9JIYFHVsmMKaEPFvBYjozeIa9Gn69dAVglb/y9b+nN2WgvxyHfjGZJKAZjMslniWv1XvMRLr/pd7l4ZkegUYG92BPS1aFH634PnmXpEuTpxHz9SGjZz90DhRKu3TqULdDl6UL2aeNE2V7VJj1aGzFfqZFihaV/Q4k7lE22dBPw7PRfMzkLareejaYDLMA14EMo/nBy14/9+Zs9VpdaSO58RJ52shWbNpHdAw+PaB/j8MQRB3hzsQ2hfmZNT7PlL/hf15aeiKF9ZHpz/fM2x+6SGr6BZw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5678 Original-Authentication-Results: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5EUR03FT032.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: 6020feac-39bb-4bdb-2e85-08d8615f986c X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
L8U7TF9LX6DML+UhI4BqE4NdQzLYxfaUvW5A5l3Ocg/gVNKvfuX43yboNOrS8uMy994kEn77xp/tPVd5GHWVLtnY3dP86TonSuguQHVCgW6Il4fmeAI9u/pN6MVYV0X/fx01WsL8OV2j9a2mt1WMIoxWpBGk1HpScHsAMR9FH6AE/pGLs4LxzsVnks36rDuNXWE+bv+GioCkhbOefKYjWrBusVp3zHPkclMcFXCDynX7W7AcPRFN7U/cMQ6AabXxESZzhClNJmSVtpG9gXz7HTXnGniBXdLNKC4OD1aa2l9hQVkkC0LUnhPz1ydTBlZMMgY9iMzES825nOHi3tHvz5FSDN9IYnoZsGgmNUpu0rpbXVI74ZuJcGkT7YpeFdVd52BH+rlp2SI40KaneCiRbXCmoV+u8r/irugpI1OGk35XcWm/8gMwyRtw8C1f1z2LLjsToAWkni67qwTMxA2dju+HWG+QViO/RCAHvBZ9fvcHyxPCPF6Yy/i12SRBgAyH X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(39860400002)(346002)(376002)(136003)(396003)(46966005)(186003)(70586007)(6916009)(8936002)(82740400003)(36756003)(356005)(81166007)(47076004)(55016002)(8886007)(16526019)(86362001)(44832011)(2616005)(33656002)(2906002)(7696005)(1076003)(235185007)(66616009)(82310400003)(316002)(70206006)(956004)(336012)(44144004)(6666004)(4743002)(4326008)(33964004)(26005)(5660300002)(8676002)(478600001)(2700100001)(357404004); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Sep 2020 14:31:16.4409 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 3626d6ad-ff91-4215-c4e3-08d8615fa961 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5EUR03FT032.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM0PR08MB3730 X-Spam-Status: No, score=-14.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_ASCII_DIVIDERS, 
MSGID_FROM_MTA_HEADER, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@arm.com, nd@arm.com, Marcus.Shawcroft@arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches"

Hi All,

This adds implementations of the optabs for complex operations.  With this,
the following C code:

  void f90 (int _Complex a[restrict N], int _Complex b[restrict N],
            int _Complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  f90:
          mov     x3, 0
          mov     x4, 200
          whilelo p0.s, xzr, x4
          .p2align 3,,7
  .L2:
          ld1w    z0.s, p0/z, [x0, x3, lsl 2]
          ld1w    z1.s, p0/z, [x1, x3, lsl 2]
          cadd    z0.s, z0.s, z1.s, #90
          st1w    z0.s, p0, [x2, x3, lsl 2]
          incw    x3
          whilelo p0.s, x3, x4
          b.any   .L2
          ret

instead of

  f90:
          mov     x3, 0
          mov     x4, 0
          mov     w5, 100
          whilelo p0.s, wzr, w5
          .p2align 3,,7
  .L2:
          ld2w    {z4.s - z5.s}, p0/z, [x0, x3, lsl 2]
          ld2w    {z2.s - z3.s}, p0/z, [x1, x3, lsl 2]
          sub     z0.s, z4.s, z3.s
          add     z1.s, z5.s, z2.s
          st2w    {z0.s - z1.s}, p0, [x2, x3, lsl 2]
          incw    x4
          inch    x3
          whilelo p0.s, w4, w5
          b.any   .L2
          ret

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-sve2.md (cadd3, cml4, cmul3): New.
	* config/aarch64/iterators.md (SVE2_INT_CMLA_OP, SVE2_INT_CMUL_OP,
	SVE2_INT_CADD_OP): New.
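For readers unfamiliar with the rotation forms, the arithmetic behind the f90
example above can be modelled on a single complex element as in the sketch
below.  This is only an illustrative scalar model, not code from the patch:
the type `cint` and the helpers `cadd90`, `cadd270`, `cmla0`, `cmla90` and
`cmul_two_step` are hypothetical names, and the choice of rotations #0 and #90
for the two-step multiply is an assumption based on the usual CMLA rotation
semantics rather than something the patch text states.

```c
#include <assert.h>

/* Hypothetical scalar model of one complex element (real, imaginary).
   The names below are illustrative only and do not appear in the patch.  */
typedef struct { int re; int im; } cint;

/* Model of a complex add with rotation #90: computes a + (b * I).  */
static cint cadd90 (cint a, cint b)
{
  cint r = { a.re - b.im, a.im + b.re };
  return r;
}

/* Model of a complex add with rotation #270: computes a - (b * I).  */
static cint cadd270 (cint a, cint b)
{
  cint r = { a.re + b.im, a.im - b.re };
  return r;
}

/* Model of a complex multiply-accumulate with rotation #0:
   accumulates the products that use the real part of n.  */
static cint cmla0 (cint acc, cint n, cint m)
{
  cint r = { acc.re + n.re * m.re, acc.im + n.re * m.im };
  return r;
}

/* Model of a complex multiply-accumulate with rotation #90:
   accumulates the products that use the imaginary part of n.  */
static cint cmla90 (cint acc, cint n, cint m)
{
  cint r = { acc.re - n.im * m.im, acc.im + n.im * m.re };
  return r;
}

/* A full complex multiply as two accumulate steps starting from zero
   (assumed rotations: #0 then #90).  Each step does half the work.  */
static cint cmul_two_step (cint a, cint b)
{
  cint zero = { 0, 0 };
  cint half = cmla0 (zero, a, b);
  return cmla90 (half, a, b);
}
```

With a = 1 + 2i and b = 3 + 4i, `cadd90 (a, b)` gives -3 + 5i, which matches
a + (b * I), and `cmul_two_step (a, b)` gives -5 + 10i, which matches a * b.
The second result only appears after both accumulate steps have run, which is
why a multiply needs two instructions while the rotated add needs one.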
diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index e18b9fef16e72496588fb5850e362da4ae42898a..e601c6a4586e3ed1e11aedf047f56d556a99a302 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -1774,6 +1774,16 @@ (define_insn "@aarch64_sve_"
   [(set_attr "movprfx" "*,yes")]
 )
 
+;; unpredicated optab pattern for auto-vectorizer
+(define_expand "cadd3"
+  [(set (match_operand:SVE_FULL_I 0 "register_operand")
+        (unspec:SVE_FULL_I
+          [(match_operand:SVE_FULL_I 1 "register_operand")
+           (match_operand:SVE_FULL_I 2 "register_operand")]
+          SVE2_INT_CADD_OP))]
+  "TARGET_SVE2"
+)
+
 ;; -------------------------------------------------------------------------
 ;; ---- [INT] Complex ternary operations
 ;; -------------------------------------------------------------------------
@@ -1813,6 +1823,47 @@ (define_insn "@aarch64__lane_"
   [(set_attr "movprfx" "*,yes")]
 )
 
+;; unpredicated optab pattern for auto-vectorizer
+;; The complex mla/mls operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder. Because of this, expand early.
+(define_expand "cml4"
+  [(set (match_operand:SVE_FULL_I 0 "register_operand")
+        (plus:SVE_FULL_I (match_operand:SVE_FULL_I 1 "register_operand")
+          (unspec:SVE_FULL_I
+            [(match_operand:SVE_FULL_I 2 "register_operand")
+             (match_operand:SVE_FULL_I 3 "register_operand")]
+            SVE2_INT_CMLA_OP)))]
+  "TARGET_SVE2"
+{
+  emit_insn (gen_aarch64_sve_cmla (operands[0], operands[1],
+                                   operands[2], operands[3]));
+  emit_insn (gen_aarch64_sve_cmla (operands[0], operands[0],
+                                   operands[2], operands[3]));
+  DONE;
+})
+
+;; unpredicated optab pattern for auto-vectorizer
+;; The complex mul operations always need to expand to two instructions.
+;; The first operation does half the computation and the second does the
+;; remainder. Because of this, expand early.
+(define_expand "cmul3"
+  [(set (match_operand:SVE_FULL_I 0 "register_operand")
+        (unspec:SVE_FULL_I
+          [(match_operand:SVE_FULL_I 1 "register_operand")
+           (match_operand:SVE_FULL_I 2 "register_operand")
+           (match_dup 3)]
+          SVE2_INT_CMUL_OP))]
+  "TARGET_SVE2"
+{
+  operands[3] = force_reg (mode, CONST0_RTX (mode));
+  emit_insn (gen_aarch64_sve_cmla (operands[0], operands[3],
+                                   operands[1], operands[2]));
+  emit_insn (gen_aarch64_sve_cmla (operands[0], operands[0],
+                                   operands[1], operands[2]));
+  DONE;
+})
+
 ;; -------------------------------------------------------------------------
 ;; ---- [INT] Complex dot product
 ;; -------------------------------------------------------------------------
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 7662b929e2c4f6c103cc06e051eb574247320809..c11e976237d30771a7bd7c7fb56922f9c5c785de 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -2583,6 +2583,23 @@ (define_int_iterator SVE2_INT_CMLA [UNSPEC_CMLA
                                     UNSPEC_SQRDCMLAH180
                                     UNSPEC_SQRDCMLAH270])
 
+;; Unlike the normal CMLA instructions these represent the actual operation
+;; to be performed. They will always need to be expanded into multiple
+;; sequences consisting of CMLA.
+(define_int_iterator SVE2_INT_CMLA_OP [UNSPEC_CMLA
+                                       UNSPEC_CMLA180
+                                       UNSPEC_CMLS])
+
+;; Unlike the normal CMLA instructions these represent the actual operation
+;; to be performed. They will always need to be expanded into multiple
+;; sequences consisting of CMLA.
+(define_int_iterator SVE2_INT_CMUL_OP [UNSPEC_CMUL
+                                       UNSPEC_CMUL180])
+
+;; Same as SVE2_INT_CADD but exclude the saturating instructions
+(define_int_iterator SVE2_INT_CADD_OP [UNSPEC_CADD90
+                                       UNSPEC_CADD270])
+
 (define_int_iterator SVE2_INT_CDOT [UNSPEC_CDOT
                                     UNSPEC_CDOT90
                                     UNSPEC_CDOT180

From patchwork Fri Sep 25 14:30:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1371351 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=RHT+h7B0; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=RHT+h7B0; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4ByZ9G6nf3z9sRf for ; Sat, 26 Sep 2020 00:31:30 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 72578398E80F; Fri, 25 Sep 2020 14:31:24 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM5-obe.outbound.protection.outlook.com 
(mail-eopbgr30083.outbound.protection.outlook.com [40.107.3.83]) by sourceware.org (Postfix) with ESMTPS id E536B398B86D for ; Fri, 25 Sep 2020 14:31:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org E536B398B86D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=Tamar.Christina@arm.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ztQvcG3INbcuX7j2A1zcbxZfhUpVpS5QJKzqgInsBII=; b=RHT+h7B0xMVRFt75o5UJyEur+azMmwdAK1OGzVdFWtsqVfWreuXW7/pqZC2i0x2V5G3gOoQI8q+FR7lZH9WcmvFhQiXRXxVIm5ErvhIx/EUGGDi+Zy5bYarRqqs8XW7Os+6VRSXx9shuG4nrsMeZfkNxsOMHahM8sddJOu0G47o= Received: from DB6PR07CA0002.eurprd07.prod.outlook.com (2603:10a6:6:2d::12) by VE1PR08MB4942.eurprd08.prod.outlook.com (2603:10a6:803:10f::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21; Fri, 25 Sep 2020 14:31:19 +0000 Received: from DB5EUR03FT060.eop-EUR03.prod.protection.outlook.com (2603:10a6:6:2d:cafe::66) by DB6PR07CA0002.outlook.office365.com (2603:10a6:6:2d::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3433.14 via Frontend Transport; Fri, 25 Sep 2020 14:31:19 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by 
DB5EUR03FT060.mail.protection.outlook.com (10.152.21.231) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:31:19 +0000 Received: ("Tessian outbound 7161e0c2a082:v64"); Fri, 25 Sep 2020 14:31:19 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 8f36bbe3f3c8966d X-CR-MTA-TID: 64aa7808 Received: from 80872aceda3d.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 3425BB0F-5327-41FB-B1E5-FEEF584F003D.1; Fri, 25 Sep 2020 14:31:08 +0000 Received: from EUR01-HE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 80872aceda3d.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 25 Sep 2020 14:31:08 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=gWF9Udue3RFq2DoZ7QOgDoRvXgy/ugrtOSrPHILeuvqvNkgv2GvqNfQr/Kxjxerj5doZ+II4Akio7j6K7LQgEPfKXy9J7llZrULrvrLr+B9Oqn84vt615p3foEFYzlyECK4OkdJW4gQQWGNAXboudArj08FyIZlrIyYR0E0El5An53xKMXA8kKRcgkLfjQwtHF4eDE8yG+SV4AmbxVs/dRM7Uci3cwiWLWb7tr/iYapjH3EisWaSciPacvaBie1pDN5iVnpRz3YDycGMjcC+TAFj3w5gh09lBaxQnFtRQFfEBiMx8vWrWtsa3t2ngDLpwLgA47p0XhfDkmkK3o8Grg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ztQvcG3INbcuX7j2A1zcbxZfhUpVpS5QJKzqgInsBII=; b=MCOiXTTtoDi0V+nFNycm1vuuUm4z8KA6EggAsbwoy43q8a1zlr0r26WXrU6QD1sFARr5jsRoMQCR36TW90w+oOGfNvVHWvMFEa6ktB1TguLlui3eQc/f3pHa70hjq9vzL01XecswRxgttcNusFFx4Q4T3MnCHO2sPNYokBFQI2v5HXVCEMucXRxbhkYLzvhKbRb+nzzrGhCElYdTlWVPPRuZ67ajst0LeE78GFFbcKvdMuekBYGAdi8OIx7sR8KMXLqcTSRvHTSV5r7rxkzeOCdun4naFTMpNk0uTCXvSpNrX+e52iMZuU4ukATq75y+K1yEyyUMhmJ3B1lhCju0WA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ztQvcG3INbcuX7j2A1zcbxZfhUpVpS5QJKzqgInsBII=; b=RHT+h7B0xMVRFt75o5UJyEur+azMmwdAK1OGzVdFWtsqVfWreuXW7/pqZC2i0x2V5G3gOoQI8q+FR7lZH9WcmvFhQiXRXxVIm5ErvhIx/EUGGDi+Zy5bYarRqqs8XW7Os+6VRSXx9shuG4nrsMeZfkNxsOMHahM8sddJOu0G47o= Authentication-Results-Original: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by VE1PR08MB5760.eurprd08.prod.outlook.com (2603:10a6:800:1af::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.20; Fri, 25 Sep 2020 14:31:06 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2%7]) with mapi id 15.20.3412.024; Fri, 25 Sep 2020 14:31:06 +0000 Date: Fri, 25 Sep 2020 15:30:57 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Subject: [PATCH v2 13/16]Arm: Add support for auto-vectorization using HF mode. 
Message-ID: <20200925143055.GA28257@arm.com> Content-Disposition: inline User-Agent: Mutt/1.9.4 (2018-02-28) X-ClientProxiedBy: SN6PR08CA0014.namprd08.prod.outlook.com (2603:10b6:805:66::27) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from arm.com (217.140.106.53) by SN6PR08CA0014.namprd08.prod.outlook.com (2603:10b6:805:66::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3326.19 via Frontend Transport; Fri, 25 Sep 2020 14:31:04 +0000 X-Originating-IP: [217.140.106.53] X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: 0ed2b2b7-75b6-4797-d86a-08d8615faaeb X-MS-TrafficTypeDiagnostic: VE1PR08MB5760:|VE1PR08MB4942: X-LD-Processed: f34e5979-57d9-4aaa-ad4d-b122a662184d,ExtAddr X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true NoDisclaimer: true X-MS-Oob-TLC-OOBClassifiers: OLM:8273;OLM:8273; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: XPSmTgsb28f04x4xYPjUrvFpon0vX+R7CXRZq4jJ0KI8qebPGX0pPsGE0QIt0WsS7nCP6d7zvH12cWwqdF5ZP7/ISXzkpHX6hjDG6iYVF9Y0cQDaiLS2OL4H77g+jB4YBxrQ7DYebf0Lok79FtuqDXwIwKKIqNd7uZdZjpvv1eYkQWh/WiBQrX7mFpTT4Plpp9VmKfzIZ3v/xNiNDjAByP/6tAtmBnX0Udx/saXYcyNVwlQIWtGv6JM67mwkt+kWvgICbGjtZS4jSOuP6uPqkJDmuKXqQR6R4lUYBxR+z+x8yvq+qd1KsJuiM96LtH8TGmC7GEgDL0YGE0JVoZ3yMgeXZzu54V74J1LpdLPjmOffrmG9D+mXVxMgVmvsKUx/pUk+NT5KGEb2FMQnnFCtqG6kVYuL00rji8pgP+rXTSI= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(4636009)(136003)(376002)(346002)(366004)(39860400002)(396003)(1076003)(26005)(2616005)(44832011)(235185007)(4326008)(8886007)(86362001)(478600001)(36756003)(44144004)(66946007)(52116002)(7696005)(956004)(66476007)(66616009)(66556008)(33964004)(55016002)(8936002)(316002)(2906002)(33656002)(16526019)(6666004)(8676002)(186003)(5660300002)(4743002)(6916009)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData: NzQ8FE8onH0CLYoYWrTRXDqLKltDyXDSQ2qyhTfTXZDb2T3kCu7vKbITSFeiJ0b1xO9yJYw1LDRPIrJs0MxPuXcRqEnfHq0ZNDyqSXFt4MJPBIvAwwZCBsGwHn40Z00VdBiPdGrmwnkeZXXUq29P5YFKDzQm40SbEhJ/MlHR8KcXsifZ3Un7RHfIP9N2IFbtMazHLqd/3He8RW33P0pJsz2CLXeJ/uGtrlH6vNSKYFunE+YHg6XvZSDEB1YMYMAr4SQM2hPWfYsbTARbgZi1nnj9sxZSHf/5q1arROVsini0H/WvMv3FVVsSv9/htr/LCKWY2r29RL5OSDmyJ2Vc05ZPZMeMlwUjIpv2BPasAYOYFRnP1kNQb/TicotxQDWORO1/OI2MYMI/3iRo+i19mK6UEcE6GTO983tiMiUaOUxdMJl+zwq9r3DwJMrb53fbDNNBbYT99H6SM5wZyCGTT5gq4BjOFdlrdpwqkLz1AdiMaxaSKF3LPNOYXwfFbpQoXCNr+c5QoPt4t3BuKns7qcC7v8wrvR6o69/Bz7DDxh7B+Ny0/lO6+4G5Aw/Xrd6xAkNG4zs4RIRPzMStl2Mf2Qqc+VcOi+LCZts5MNLHvXnKSOlp6byY+uzI4Mb5K1OD9g9rMupuHErJjNLzp9bm7A== X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5760 Original-Authentication-Results: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5EUR03FT060.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: 2bd6398f-a9d4-45bd-fa5f-08d8615fa34b X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
fYwdZbTdcgPOdDCuizZ/7xTRNqeuUlxQokxJ3788ptJbNEYocO5OqvJAfgXBWRqD4k06ZIRfCK/dygdH/4GMFgXbe8DBCLHBby+M/RbA4TKzZdpctw8WS5lUl48eaoSBb6wbgRYmdtuxLTrgULBII+sSQ6c7FQW80Of/+l+F6h7RYhWEhR+p9aSiw4Uqd1Fueczi0bVX55NclqPrREoNnJsLFwEy0skCW//4Ums7REtYu3eBnbdhejpZL4z1S9rWI38fGjk5GsT1hhqNvNCCyVS+xMkXd81quOG42rVvNICqhqRioqlTIlZLzYcBe5qye6B3zjzDWgk0odk9/57v7ki7Ey9Gjuqqu577sJLNFo0/lQ3yeE4eJOjxra0GVfNadGhb5vPJoxMK4mT6wyBohQUespuDNzobQLTvHPHs2ZZ9zpXXLT+BL1WwcPsRUaH/errgX3+nb2PBXkpgwky9AfAdfEjFaPFgadbItn01IGY= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(376002)(346002)(39860400002)(136003)(396003)(46966005)(235185007)(33656002)(956004)(8936002)(2616005)(82740400003)(36756003)(16526019)(316002)(186003)(8886007)(478600001)(5660300002)(4326008)(6916009)(44832011)(70206006)(4743002)(66616009)(33964004)(47076004)(336012)(82310400003)(44144004)(70586007)(55016002)(81166007)(7696005)(356005)(86362001)(1076003)(2906002)(26005)(6666004)(8676002)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Sep 2020 14:31:19.0215 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 0ed2b2b7-75b6-4797-d86a-08d8615faaeb X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5EUR03FT060.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB4942 X-Spam-Status: No, score=-14.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, MSGID_FROM_MTA_HEADER, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, 
SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Richard.Earnshaw@arm.com, nd@arm.com, Ramana.Radhakrishnan@arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches"

Hi All,

This adds HFmode vectorization support to the auto-vectorizer for AArch32.
It is supported when +fp16 is used.  I wonder whether I should disable
returning the type when the option isn't enabled.  At the moment the type is
returned regardless, and the vectorizer then tries and fails to use it.  This
wastes a few compile cycles but doesn't result in bad code.

Bootstrapped and regtested on arm-none-linux-gnueabihf with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/arm/arm.c (arm_preferred_simd_mode): Add E_HFmode.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/vect-half-floats.c: New test.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 022ef6c3f1d723bdf421268c81cd0c759c414d9a..8ca6b913fddb74cd6f4867efc0a7264184c59db0 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28917,6 +28917,8 @@ arm_preferred_simd_mode (scalar_mode mode)
   if (TARGET_NEON)
     switch (mode)
       {
+      case E_HFmode:
+        return TARGET_NEON_VECTORIZE_DOUBLE ? V4HFmode : V8HFmode;
       case E_SFmode:
         return TARGET_NEON_VECTORIZE_DOUBLE ? V2SFmode : V4SFmode;
       case E_SImode:
diff --git a/gcc/testsuite/gcc.target/arm/vect-half-floats.c b/gcc/testsuite/gcc.target/arm/vect-half-floats.c
new file mode 100644
index 0000000000000000000000000000000000000000..ebfe7f964442a09053b0cbe04bed425e36b0af96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/vect-half-floats.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target target_float16 } */
+/* { dg-require-effective-target arm_fp16_ok } */
+/* { dg-add-options for_float16 } */
+/* { dg-additional-options "-Ofast -ftree-vectorize -fdump-tree-vect-all -std=c11" } */
+
+void foo (_Float16 n1[], _Float16 n2[], _Float16 r[], int n)
+{
+  for (int i = 0; i < n; i++)
+    r[i] = n1[i] + n2[i];
+}
+
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
+

From patchwork Fri Sep 25 14:31:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1371352 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=vxQ/XKeo; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=vxQ/XKeo; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS 
(4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4ByZ9Q33qmz9sR4 for ; Sat, 26 Sep 2020 00:31:38 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4BCBF398E46E; Fri, 25 Sep 2020 14:31:36 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR05-VI1-obe.outbound.protection.outlook.com (mail-vi1eur05on2087.outbound.protection.outlook.com [40.107.21.87]) by sourceware.org (Postfix) with ESMTPS id D123E398B817 for ; Fri, 25 Sep 2020 14:31:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org D123E398B817 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=Tamar.Christina@arm.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=imkaOEFHbdD0eWbBJrc7VUD0Ocewnwx6/25X5/eQ8FQ=; b=vxQ/XKeod0tcPjLMBXOypmQotTT8DbNQK8AP+gH6uJH1libGZSsJEzkLWkpbs4lkl9gABrMKCpctsZHfJ0T53nBNPty3DQv1wbFmXUTvvy/XD5GlDyxNq/Vrr4ZMWTP3PcmPPr5x+ZFY85D2Q8V+U0IM7tFUBAVjsHG6W5NJ4XQ= Received: from DB6PR0202CA0023.eurprd02.prod.outlook.com (2603:10a6:4:29::33) by AM6PR08MB4184.eurprd08.prod.outlook.com (2603:10a6:20b:a0::28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.20; Fri, 25 Sep 2020 14:31:32 +0000 Received: from DB5EUR03FT044.eop-EUR03.prod.protection.outlook.com (2603:10a6:4:29:cafe::17) by DB6PR0202CA0023.outlook.office365.com (2603:10a6:4:29::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:31:32 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; 
gcc.gnu.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5EUR03FT044.mail.protection.outlook.com (10.152.21.167) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:31:31 +0000 Received: ("Tessian outbound e8cdb8c6f386:v64"); Fri, 25 Sep 2020 14:31:31 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: a427bb688fd4d190 X-CR-MTA-TID: 64aa7808 Received: from 40273b03c8f9.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 1B3AA695-CFE5-451C-8696-D221D01851AA.1; Fri, 25 Sep 2020 14:31:23 +0000 Received: from EUR05-DB8-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 40273b03c8f9.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 25 Sep 2020 14:31:23 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=eoTPge3Evv4ukt+ySQyeHagbjkV0LoPJflNJCfUvFzOzyF7eW0fjSCyk5iDJG1n05SzwG1nJWSxXup/ie831JYiFDMbSzIYncIT5YUKS7VQOkoh3s7meJ4zjl86yXGI3HK7WWLQ5TzTYc2k8aJy4VKnLwWSLZbRRpfiqSAlwPOyTDoe05BcsUIqQ0KX47/yjyK5wR5SoD81qkaxjZguoWNhojs2pfgx4nJo2v5QPcRCEUF0+c3U5bFaSHcnkJ2MCoV5hKoGUGXU7AEyAmHkQAm7CQoF2H5y7RpLx2uPsLwFWAte5woA+5o5j4tYQTSq4l5AEaw4joY/Xk89ykM4vHw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=imkaOEFHbdD0eWbBJrc7VUD0Ocewnwx6/25X5/eQ8FQ=; 
b=U2UBI/GU61R1Iq286zLNaSpYQyKgkagAB8Hf86o4jXIbZuQHofbcvCOq/pN3BrvWcHJS68D4Euzq/1wn08PdDXv+e5cf/QYua3RTgpb8nByJDWOMfdoLco3tNHKJRrsUTWdW38LobSXv0PQzRkWHieyf4lN4pohsm+o8j6Y4K1gZLIbkKSgra2csbQRw5Ymx/tch9Z5tpahQVdZbY4MLDfDy3yq/jkjgcdBPNQ+ZFJDd3o4nc+MJMpfHi0cthJaw3q+ybBPpjAx+aaPujOn6SPDBnXwiGZ/OxM26hTnwlnmU+WxUhxbzJGW0GE5aC04pY+OXoKnQ+xRrzVr3yda4MA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=imkaOEFHbdD0eWbBJrc7VUD0Ocewnwx6/25X5/eQ8FQ=; b=vxQ/XKeod0tcPjLMBXOypmQotTT8DbNQK8AP+gH6uJH1libGZSsJEzkLWkpbs4lkl9gABrMKCpctsZHfJ0T53nBNPty3DQv1wbFmXUTvvy/XD5GlDyxNq/Vrr4ZMWTP3PcmPPr5x+ZFY85D2Q8V+U0IM7tFUBAVjsHG6W5NJ4XQ= Authentication-Results-Original: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by VE1PR08MB5760.eurprd08.prod.outlook.com (2603:10a6:800:1af::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.20; Fri, 25 Sep 2020 14:31:22 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2%7]) with mapi id 15.20.3412.024; Fri, 25 Sep 2020 14:31:22 +0000 Date: Fri, 25 Sep 2020 15:31:20 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Subject: [PATCH v2 14/16]Arm: Add NEON RTL patterns for Complex Addition, Multiply and FMA. 
Message-ID: <20200925143115.GA29298@arm.com> Content-Disposition: inline User-Agent: Mutt/1.9.4 (2018-02-28) X-ClientProxiedBy: LO2P265CA0374.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a3::26) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from arm.com (217.140.106.53) by LO2P265CA0374.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a3::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:31:21 +0000 X-Originating-IP: [217.140.106.53] X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: e7272100-a5f9-4476-3691-08d8615fb2a5 X-MS-TrafficTypeDiagnostic: VE1PR08MB5760:|AM6PR08MB4184: X-LD-Processed: f34e5979-57d9-4aaa-ad4d-b122a662184d,ExtAddr X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true NoDisclaimer: true X-MS-Oob-TLC-OOBClassifiers: OLM:486;OLM:486; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: Pp2HU8r6Apzh5WJdxiSt+CS142Fk305yJJCGzbU7Drm1FKg0Oojp9gli3qCF+J82lL373BSD39YLKCu/TQR3GSH4HN3XHe+ulUWosgoJYTxzTsYP+AJOX2Qc4o2EydKD6a+V881Y+T085J1Y0V0yePtgizjmz4tCED6nrU2KMaMjKUE5XWb6+eJAFhr4X8wHrawthvcwQcSoXt1RANsIgVvjr5euh+nuFO8CxyvMLfM2h95nrntr5nm/xZu/YDzxfrehDpkY02dtCxTtomb5QZ9N6kXq3QHUl2t/IUuiIWf7YXb2JXARsg5sMrZL96KI88yjYvgABxYQXZJf3pkJxs1BynqRvDp1uFlmO29NTZlgC+nh5mVGu2MqzS7bb6brzKq8q00ReAp0PSAGEhi0EnXwaMjeu+ItpSwMbjK0fxE= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(4636009)(136003)(376002)(346002)(366004)(39860400002)(396003)(1076003)(26005)(2616005)(44832011)(235185007)(4326008)(8886007)(86362001)(478600001)(36756003)(44144004)(66946007)(52116002)(7696005)(956004)(66476007)(66616009)(66556008)(33964004)(55016002)(8936002)(316002)(2906002)(33656002)(16526019)(8676002)(186003)(5660300002)(4743002)(6916009)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData: gfDeuUlnWA1TUE+02ZKrY2KON+vQxzQwKlrUYaA6rkB+WiJ6eB5X5o8UDUEDeI3YY/1k9I2H5KVxML3VTSYzPxfOqin2lekMJLL5H5P5dMYC+yIWGWjqdjLp6Usbk6aqxGF9TLdyExEYfiP+3nASCzOqJrW4Y8hM6rZD97DfzOPYUGZ+MuoSmMGAUtbbGuO/c4BBlr6uWd3kbbCfNBH0tv84xYiDK9QFsL7XGZi00PRZPESeBIWN77+kDTw/OXYdNWTp/PGZOp4UKMkiAhlkPGZi83OfqwlM2UG2lnKlUd9MaMaEc8NfNTV9RPw6S/HpKUM6DrjKTyu4h2sq/NFant6HD9+TZe9Ea6G7QsxdOr/Igc9AhHZZVX+ZmrOjwMAhhls3OsSCzKHeuxyn5KJEkSwNLYhuEIR6qHaMgEhgmqWV2tVZ4JX1ckhd9wCdcfHgryuPvoo/T7D61WlmNnJZl+/oOufh8RLxhwpVYtZZlGUChPDPVkB8TU7aI8C4dJlrIB1yGyRtxqc7N70iQmRMPXfg0RB8/V43oMrcQh2hvBX5yKEmgrbJmfesMR1wkQWFtk1r9MVrDuNl9iKG49juk5+aDy7Du6OApnF7TzGkLuK2EAB4g84cnpmEYBdo/8Dz0IJX13p4VClchXJjodeoTA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5760 Original-Authentication-Results: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5EUR03FT044.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: f9d781e5-721a-4cb1-9552-08d8615facc8 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
OIPYiAhKvmrOu/iuxaDuf9mB/t9jwiOT9qjAn53f25sUjlDZ3+PbpBXxFWj/S30udk8l31Y5IaaMiFCtrx2k/Tv+eRR7vPkkYuWkHnzu1zGL8A2ngTT+08NI3YJIukHiaARSraPXTvW50I9GMV75d+qx22DEo9Gclewbd0NjUbtujYX45PQyo1tUfU/N2YdXv/rU6y8hC4EASqk/VpMjTzf5UIbNeOYYA897o34CGU1sAv9FNZqAI+6b57JszOWciEQ+COOnIS+LFC9N9fyr10ylFg29uVDAiUqmnTxlp4sQaAW/+Sp9cgspjyNc2Vkeb1LnqJdY1eJHoYyZT+Ka7gFauZ25lUCamIH+n15SPDECHD20O44LwCZfc0wj0ECX/wfoCw1zSUSQKRE76DWSJ1jbhM7a8xMJeqgNdSDTcbo= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(396003)(39860400002)(346002)(376002)(136003)(46966005)(33964004)(86362001)(26005)(47076004)(4743002)(36756003)(235185007)(6916009)(316002)(356005)(81166007)(1076003)(478600001)(82740400003)(8886007)(33656002)(55016002)(16526019)(336012)(70586007)(70206006)(186003)(4326008)(7696005)(2906002)(2616005)(82310400003)(956004)(44144004)(66616009)(8936002)(5660300002)(44832011)(8676002)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Sep 2020 14:31:31.9814 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: e7272100-a5f9-4476-3691-08d8615fb2a5 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5EUR03FT044.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM6PR08MB4184 X-Spam-Status: No, score=-14.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_LOTSOFHASH, MSGID_FROM_MTA_HEADER, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham 
autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Richard.Earnshaw@arm.com, nd@arm.com, Ramana.Radhakrishnan@arm.com
Errors-To: gcc-patches-bounces@gcc.gnu.org
Sender: "Gcc-patches"

Hi All,

This adds an implementation of the optabs for complex additions.  With this the
following C code:

  void f90 (float complex a[restrict N], float complex b[restrict N],
            float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  f90:
          add     r3, r2, #1600
  .L2:
          vld1.32 {q8}, [r0]!
          vld1.32 {q9}, [r1]!
          vcadd.f32       q8, q8, q9, #90
          vst1.32 {q8}, [r2]!
          cmp     r3, r2
          bne     .L2
          bx      lr

instead of

  f90:
          add     r3, r2, #1600
  .L2:
          vld2.32 {d24-d27}, [r0]!
          vld2.32 {d20-d23}, [r1]!
          vsub.f32        q8, q12, q11
          vadd.f32        q9, q13, q10
          vst2.32 {d16-d19}, [r2]!
          cmp     r3, r2
          bne     .L2
          bx      lr

Bootstrapped and regtested on arm-none-linux-gnueabihf with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/arm/iterators.md (rot): Add UNSPEC_VCMLS, UNSPEC_VCMUL and
	UNSPEC_VCMUL180.
	(rot_op, rotsplit1, rotsplit2, fcmac1, VCMLA_OP, VCMUL_OP): New.
	* config/arm/neon.md (cadd3, cml4, cmul3): New.
	* config/arm/unspecs.md (UNSPEC_VCMUL, UNSPEC_VCMUL180, UNSPEC_VCMLS,
	UNSPEC_VCMLS180): New.

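For readers following along, the effect of the #90 rotation in the f90 example
above can be modelled in scalar C.  This is only an illustrative sketch and not
part of the patch: the helper names (model_vcadd_rot90, model_vcadd_rot270,
models_match_reference) are made up here, and the model assumes the usual
interleaved (real, imaginary) lane layout, so that rotation #90 computes
a + b*I and rotation #270 computes a - b*I on each complex element.

```c
#include <assert.h>
#include <complex.h>
#include <stdbool.h>

/* Hypothetical scalar model of a complex add with rotation #90 on
   interleaved (re, im) lanes: every complex element becomes a + b*I.  */
static void
model_vcadd_rot90 (const float *a, const float *b, float *c, int lanes)
{
  for (int i = 0; i < lanes; i += 2)
    {
      c[i] = a[i] - b[i + 1];      /* re = a.re - b.im  */
      c[i + 1] = a[i + 1] + b[i];  /* im = a.im + b.re  */
    }
}

/* Same model for rotation #270: every complex element becomes a - b*I.  */
static void
model_vcadd_rot270 (const float *a, const float *b, float *c, int lanes)
{
  for (int i = 0; i < lanes; i += 2)
    {
      c[i] = a[i] + b[i + 1];      /* re = a.re + b.im  */
      c[i + 1] = a[i + 1] - b[i];  /* im = a.im - b.re  */
    }
}

/* Check both models against C99 complex arithmetic, using small values
   that are exactly representable so the comparisons can be exact.  */
static bool
models_match_reference (void)
{
  const float a[4] = { 1.0f, 2.0f, -3.0f, 4.0f };
  const float b[4] = { 5.0f, -6.0f, 7.0f, 8.0f };
  float r90[4], r270[4];

  model_vcadd_rot90 (a, b, r90, 4);
  model_vcadd_rot270 (a, b, r270, 4);

  for (int i = 0; i < 4; i += 2)
    {
      float complex x = a[i] + a[i + 1] * I;
      float complex y = b[i] + b[i + 1] * I;
      float complex got90 = r90[i] + r90[i + 1] * I;
      float complex got270 = r270[i] + r270[i + 1] * I;

      if (got90 != x + y * I || got270 != x - y * I)
        return false;
    }
  return true;
}
```

The subtract on the real lanes and the add on the imaginary lanes correspond to
the vsub.f32/vadd.f32 pair in the old code, which is why a single rotated add
can replace the de-interleaving vld2/vst2 sequence.
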
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index 0bc9eba0722689aff4c1a143e952f6eb91c0cd86..f5693c0524274da1eb1c767713574c01ec6d544c 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -1146,10 +1146,38 @@ (define_int_attr crypto_mode [(UNSPEC_SHA1H "V4SI") (UNSPEC_AESMC "V16QI") (define_int_attr rot [(UNSPEC_VCADD90 "90") (UNSPEC_VCADD270 "270") + (UNSPEC_VCMLS "0") (UNSPEC_VCMLA "0") (UNSPEC_VCMLA90 "90") (UNSPEC_VCMLA180 "180") - (UNSPEC_VCMLA270 "270")]) + (UNSPEC_VCMLA270 "270") + (UNSPEC_VCMUL "0") + (UNSPEC_VCMUL180 "180")]) + +;; A conjucate is a rotation of 180* around the argand plane, or * I. +(define_int_attr rot_op [(UNSPEC_VCMLS "") + (UNSPEC_VCMLS180 "_conj") + (UNSPEC_VCMLA "") + (UNSPEC_VCMLA180 "_conj") + (UNSPEC_VCMUL "") + (UNSPEC_VCMUL180 "_conj")]) + +(define_int_attr rotsplit1 [(UNSPEC_VCMLA "0") + (UNSPEC_VCMLA180 "0") + (UNSPEC_VCMUL "0") + (UNSPEC_VCMUL180 "0") + (UNSPEC_VCMLS "270") + (UNSPEC_VCMLS180 "90")]) + +(define_int_attr rotsplit2 [(UNSPEC_VCMLA "90") + (UNSPEC_VCMLA180 "270") + (UNSPEC_VCMUL "90") + (UNSPEC_VCMUL180 "270") + (UNSPEC_VCMLS "180") + (UNSPEC_VCMLS180 "180")]) + +(define_int_attr fcmac1 [(UNSPEC_VCMLA "a") (UNSPEC_VCMLA180 "a") + (UNSPEC_VCMLS "s") (UNSPEC_VCMLS180 "s")]) (define_int_attr simd32_op [(UNSPEC_QADD8 "qadd8") (UNSPEC_QSUB8 "qsub8") (UNSPEC_SHADD8 "shadd8") (UNSPEC_SHSUB8 "shsub8") @@ -1256,3 +1284,12 @@ (define_int_attr bt [(UNSPEC_BFMAB "b") (UNSPEC_BFMAT "t")]) ;; An iterator for CDE MVE accumulator/non-accumulator versions. 
(define_int_attr a [(UNSPEC_VCDE "") (UNSPEC_VCDEA "a")]) + +;; Define iterators for VCMLA operations +(define_int_iterator VCMLA_OP [UNSPEC_VCMLA + UNSPEC_VCMLA180 + UNSPEC_VCMLS]) + +;; Define iterators for VCMLA operations as MUL +(define_int_iterator VCMUL_OP [UNSPEC_VCMUL + UNSPEC_VCMUL180]) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 3e7b51d8ab60007901392df0ca1cb09fead4d0e9..1611bcea1ba8cb416d27368e4dc39ce15b3a4cd8 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -3217,6 +3217,14 @@ (define_insn "neon_vcadd" [(set_attr "type" "neon_fcadd")] ) +(define_expand "cadd3" + [(set (match_operand:VF 0 "register_operand") + (unspec:VF [(match_operand:VF 1 "register_operand") + (match_operand:VF 2 "register_operand")] + VCADD))] + "TARGET_COMPLEX" +) + (define_insn "neon_vcmla" [(set (match_operand:VF 0 "register_operand" "=w") (plus:VF (match_operand:VF 1 "register_operand" "0") @@ -3274,6 +3282,43 @@ (define_insn "neon_vcmlaq_lane" ) +;; The complex mla/mls operations always need to expand to two instructions. +;; The first operation does half the computation and the second does the +;; remainder. Because of this, expand early. +(define_expand "cml4" + [(set (match_operand:VF 0 "register_operand") + (plus:VF (match_operand:VF 1 "register_operand") + (unspec:VF [(match_operand:VF 2 "register_operand") + (match_operand:VF 3 "register_operand")] + VCMLA_OP)))] + "TARGET_COMPLEX" +{ + emit_insn (gen_neon_vcmla (operands[0], operands[1], + operands[2], operands[3])); + emit_insn (gen_neon_vcmla (operands[0], operands[0], + operands[2], operands[3])); + DONE; +}) + +;; The complex mul operations always need to expand to two instructions. +;; The first operation does half the computation and the second does the +;; remainder. Because of this, expand early. 
+(define_expand "cmul3" + [(set (match_operand:VF 0 "register_operand") + (unspec:VF [(match_operand:VF 1 "register_operand") + (match_operand:VF 2 "register_operand")] + VCMUL_OP))] + "TARGET_COMPLEX" +{ + rtx tmp = gen_reg_rtx (mode); + emit_move_insn (tmp, CONST0_RTX (mode)); + emit_insn (gen_neon_vcmla (operands[0], tmp, + operands[1], operands[2])); + emit_insn (gen_neon_vcmla (operands[0], operands[0], + operands[1], operands[2])); + DONE; +}) + ;; These instructions map to the __builtins for the Dot Product operations. (define_insn "neon_dot" [(set (match_operand:VCVTI 0 "register_operand" "=w") diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md index 0a2399d4fb7bdef6c9ff2b31a743cf357fd271d5..d1b2824a0fe76f62d69c18dcec2f47dfb75b586e 100644 --- a/gcc/config/arm/unspecs.md +++ b/gcc/config/arm/unspecs.md @@ -510,6 +510,10 @@ (define_c_enum "unspec" [ UNSPEC_VCMLA90 UNSPEC_VCMLA180 UNSPEC_VCMLA270 + UNSPEC_VCMUL + UNSPEC_VCMUL180 + UNSPEC_VCMLS + UNSPEC_VCMLS180 UNSPEC_MATMUL_S UNSPEC_MATMUL_U UNSPEC_MATMUL_US From patchwork Fri Sep 25 14:31:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1371353 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=W9FPd1CW; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 
header.s=selector2-armh-onmicrosoft-com header.b=W9FPd1CW; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4ByZB70ncnz9sR4 for ; Sat, 26 Sep 2020 00:32:15 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E1DA039960CD; Fri, 25 Sep 2020 14:31:52 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-VE1-obe.outbound.protection.outlook.com (mail-eopbgr50049.outbound.protection.outlook.com [40.107.5.49]) by sourceware.org (Postfix) with ESMTPS id 3BFED398B83C for ; Fri, 25 Sep 2020 14:31:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 3BFED398B83C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=Tamar.Christina@arm.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=enRDSP387f5PWsyzvHJaekSpl9AskHG5RPyrCgxRn8Y=; b=W9FPd1CW79iXg0Ik8dW3zP4erUylPgs3X9YmXdPm7xEkhD1mTvHSIXecCcutedb7gFFvaERQnQ6ZblJ92qFgoRi7i2H8vNwyswNDZ4vnB8nVv4tAjWIWB5QWThFukjOebDdzPkX6Wof5oEWu3bl294rWQDQfKoSDvoi2rQ09X60= Received: from AM7PR04CA0008.eurprd04.prod.outlook.com (2603:10a6:20b:110::18) by VI1PR08MB3165.eurprd08.prod.outlook.com (2603:10a6:803:45::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21; Fri, 25 Sep 2020 14:31:45 +0000 Received: from AM5EUR03FT041.eop-EUR03.prod.protection.outlook.com (2603:10a6:20b:110:cafe::65) by AM7PR04CA0008.outlook.office365.com 
(2603:10a6:20b:110::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.22 via Frontend Transport; Fri, 25 Sep 2020 14:31:45 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM5EUR03FT041.mail.protection.outlook.com (10.152.17.186) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.21 via Frontend Transport; Fri, 25 Sep 2020 14:31:45 +0000 Received: ("Tessian outbound 7a6fb63c1e64:v64"); Fri, 25 Sep 2020 14:31:45 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 41cab13e829c749e X-CR-MTA-TID: 64aa7808 Received: from a352c16a1f11.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 933172C0-9F3F-4D0B-BFE5-E7A37D5321C7.1; Fri, 25 Sep 2020 14:31:39 +0000 Received: from EUR03-DB5-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id a352c16a1f11.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 25 Sep 2020 14:31:39 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Tn0Zf5IARDMgM8E2ZyeOVu8mtl+YJMzMnw1cgtlCm0MzM9P7dI4R+NlhMEhrBnUdY7RCJgUZVRf+pAyYuqaCL7v9R8fLkIJ2o/YMIFFlkm32zkmilOzUcXbLpzgCHtrphH7g7R6kx6ROXA7uHInh3upza4gaYdlbxF188RZ3m/4lwaQX1df4OCtmWX1W2ujJlHUazecPh06Ic60xythYhT65XahRpT6EXEb6ShdSP8EaTvGV5elhN7Is1T2Akw9N7wNAasPvWB+ahFc6h6s4zpm1JFj3L0oPrq8xCETOV00vMzGhV7PRuWF2gH43wCt51SGiHrhyA86LO594ChgIfg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=enRDSP387f5PWsyzvHJaekSpl9AskHG5RPyrCgxRn8Y=; b=LquyTgHUbLGHuhKGO1jQgwgkPPaYCCjX5UrRbWNvKF7Lg1vB6R8MV6ZQWVrm5ez2rJn25ZL2oTQwkQr3XnJVn0UEhx4bdVW+w2GWP8ejYZ39CDRF//IGRcouLahICTI1we7ydr4pa6GCMASrSixhY2BTjvnxEydLwIaxwJiLLZGCWJUjkAdSbQKepr0dPiFiKjG7Sg64BVuyxuHpx2DHk26nJ4HXUKvk48LRfbmwljhhI2eytU4BOu6C/4JxoXqGtwxhTHOX919mvlHrkNYGrFtXWqf0LSR+ZuxlbBOT+H152r8OX8S8VCGByllaRNLTP++Oztb2bFc4GsSioY8HEQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=enRDSP387f5PWsyzvHJaekSpl9AskHG5RPyrCgxRn8Y=; b=W9FPd1CW79iXg0Ik8dW3zP4erUylPgs3X9YmXdPm7xEkhD1mTvHSIXecCcutedb7gFFvaERQnQ6ZblJ92qFgoRi7i2H8vNwyswNDZ4vnB8nVv4tAjWIWB5QWThFukjOebDdzPkX6Wof5oEWu3bl294rWQDQfKoSDvoi2rQ09X60= Authentication-Results-Original: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by VE1PR08MB5678.eurprd08.prod.outlook.com (2603:10a6:800:1a0::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3391.11; Fri, 25 Sep 2020 14:31:38 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::d0e7:49cd:4dae:a2a2%7]) with mapi id 15.20.3412.024; Fri, 25 Sep 2020 14:31:38 +0000 Date: Fri, 25 Sep 2020 15:31:35 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Subject: [PATCH v2 15/16]Arm: Add MVE RTL patterns for Complex Addition, Multiply and FMA. 
Message-ID: <20200925143131.GA30552@arm.com> Content-Disposition: inline User-Agent: Mutt/1.9.4 (2018-02-28) X-ClientProxiedBy: LO2P265CA0191.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a::35) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from arm.com (217.140.106.53) by LO2P265CA0191.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a::35) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3412.22 via Frontend Transport; Fri, 25 Sep 2020 14:31:37 +0000 X-Originating-IP: [217.140.106.53] X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: 5754cf88-8e4f-4483-1979-08d8615fba7a X-MS-TrafficTypeDiagnostic: VE1PR08MB5678:|VI1PR08MB3165: X-LD-Processed: f34e5979-57d9-4aaa-ad4d-b122a662184d,ExtAddr X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true NoDisclaimer: true X-MS-Oob-TLC-OOBClassifiers: OLM:2733;OLM:2733; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: WRoiQynynmzIomKfKnwwr3qJxKTHFU2PQLei5gVM1FFrU7kXkyLOYKDoCwlppKuhPVnnoMrfidQmfzF3LVqndK4HTY9+DAPWpSaICGGjpuerY+CbofwW+AWANFu5loEJWVTMUNL2Cq7T7a3Jm6I2mqTJNJCwOEGy9DNF1gu7k4G+xdQoWjLJvD1s5coklVO7SaMq0LiR1D9BLmkZ2z4C2jvDCCEC2Y3hggcGqoky3DThbIQ37s7MrJpDgL6r9MrauUq369FtuAYWN/dXg4glXT9TVwFvuXaFfinJ02ZdwilhiOhMsLL0AaSLCCrwCcjaXFmnmcbBZT9MTndxjzWc+rkHqr+1UDKT03F6Me5kxu9PqsbOL5yfnTsKwlQwlR91UEJA3KhHZOOznG3zQ/F5GlolgNRLMYFHDofhTszy8D4= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(4636009)(376002)(346002)(39860400002)(136003)(366004)(396003)(186003)(16526019)(52116002)(4743002)(4326008)(33964004)(235185007)(5660300002)(44144004)(8676002)(956004)(26005)(2906002)(2616005)(44832011)(7696005)(316002)(33656002)(66556008)(8886007)(66946007)(1076003)(55016002)(8936002)(6916009)(86362001)(36756003)(66476007)(478600001)(66616009)(83380400001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData: DR9wet18sPuDJu+tqp3+f10DmQQ0la0igb4krhyDHOBgX1THO1u1yeQHaniANo3lhEWqMaT+f/mNIzmihnSBuUaF0YHDWUaU9z6yw72pdvskwWDj2K+un40u9rrOcbUowlIGc6RGS1vbX5Qz5LzEDOZDQMWAZCozRLi+AdRXKTWIIRN65UERga6AddmkZ6VJRfSb+qJ5J/WHOMlGsM0uKhyAInWC1koZTSLyR8rpek48PEpAgt+LRJyPdztFB/dxUKYTaQP802e7QIgo+DjEm2reGVPNNxAZtkbzjgj9iJsOr/qQpnP5xRi9R0hLnbvn/2ai+jNdeHfyVhWc7WizpmrejmyaIpphBV4lsl5uWWCLKP/CjxyuqRFRonZdbPN3cIp6ONEMe4etnVvIJSM5vfNKWh5Fzxw7wVPlwOgvIoJCT8zdlVAG6zalNqA9N7H+53Nbjzzz/xf2JFocPtsAnBNpQ58h8vvUVeQip5BaUvi0GvBBfMS6RPOIDTInejZBe/MTcUUPI2yTtSnKNvk+qp1v7y2aICZYueFUEBkxL14OqodxAgkTP68PCTFZHRVS+hXOp2ar06AMXUCS+r8tMiIbVnH0KEg4uFlp1XE6tWUyrrYh1uk5SwI5KFBYlDYZ++Tpzk+M1O8RX6firxewzg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: VE1PR08MB5678 Original-Authentication-Results: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM5EUR03FT041.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: b68d4d05-3ec1-4dc6-488c-08d8615fb619 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
EUa7yCJx8xhb+Cr96k+2RJMO4Q9pdAp0qxZG0uC73+yGN2cDUOujzcRHedh69SY3wB4FszQ8F4VAatBPOX776NX7eX9VTCPccgp+PX/TSApPVtk+hEA52lJwNf7fnNcbzbsNFRIKwuybs83S3gAw2r2fbt+QKIJpyn/s6ZQbmO6lVBMClg0J6m/MWhjYYDhqwB8AJP4VFWYwMnC6NDkSusQv1Z0MS4owRtl31HhPT08UEdFS9zSpB/icujGICwTa4Lb/5WwBfiJVOuSO/7iY3iyPApeAI0D4l+KIkbgv/t8PAZo76vtFutaNGQ0unbaDszx/MrRInJusLs6y6wpHdxKwT7ULChTh7PfjhYwYjBLK2LqhJRZrQO0Sn1bGmujAy52ygc8FrmwgHd/2g5KNdMlOrsvP6YCsb0wSP9B69L5/GW6FVmpQMuCWPItZDxWOStevnuj/jXTZ2eUGa8WmkwoS8g7RwHZcgy7YXJYvTyk= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(376002)(396003)(136003)(39860400002)(346002)(46966005)(82310400003)(8676002)(336012)(86362001)(66616009)(16526019)(70586007)(186003)(2906002)(33656002)(70206006)(55016002)(2616005)(316002)(956004)(44832011)(1076003)(5660300002)(6916009)(8886007)(36756003)(8936002)(33964004)(36906005)(4743002)(356005)(478600001)(82740400003)(83380400001)(26005)(235185007)(7696005)(47076004)(44144004)(4326008)(81166007)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Sep 2020 14:31:45.0487 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 5754cf88-8e4f-4483-1979-08d8615fba7a X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AM5EUR03FT041.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR08MB3165 X-Spam-Status: No, score=-13.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_LOTSOFHASH, MSGID_FROM_MTA_HEADER, 
RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Richard.Earnshaw@arm.com, nd@arm.com, Ramana.Radhakrishnan@arm.com
Errors-To: gcc-patches-bounces@gcc.gnu.org
Sender: "Gcc-patches"

Hi All,

This adds an implementation of the optabs for complex operations.  With this
the following C code:

  void f90 (int _Complex a[restrict N], int _Complex b[restrict N],
            int _Complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  .L3:
          mov     r3, r0
          vldrw.32        q2, [r3]
          mov     r3, r1
          vldrw.32        q1, [r3]
          mov     r3, r2
          vcadd.i32       q3, q2, q1, #90
          adds    r0, r0, #16
          vstrw.32        q3, [r3]
          adds    r1, r1, #16
          adds    r2, r2, #16
          le      lr, .L3
          pop     {r4, r5, r6, r7, r8, pc}

which is not ideal due to register allocation and addressing mode issues with
MVE in general.  However -frename-registers cleans up the register allocation:

  .L3:
          mov     r5, r0
          mov     r6, r1
          vldrw.32        q2, [r5]
          vldrw.32        q1, [r6]
          mov     r7, r2
          vcadd.i32       q3, q2, q1, #90
          adds    r0, r0, #16
          vstrw.32        q3, [r7]
          adds    r1, r1, #16
          adds    r2, r2, #16
          le      lr, .L3
          pop     {r4, r5, r6, r7, r8, pc}

but leaves the addressing mode problems.

Before this patch it generated a scalar loop

  .L2:
          ldr     r7, [r0, r3, lsl #2]
          ldr     r5, [r6, r3, lsl #2]
          ldr     r4, [r1, r3, lsl #2]
          subs    r5, r7, r5
          ldr     r7, [lr, r3, lsl #2]
          add     r4, r4, r7
          str     r5, [r2, r3, lsl #2]
          str     r4, [ip, r3, lsl #2]
          adds    r3, r3, #2
          cmp     r3, #200
          bne     .L2
          pop     {r4, r5, r6, r7, pc}

Bootstrapped and regtested on arm-none-linux-gnueabihf with no issues.
Cross compiled for arm-none-eabi and ran with -march=armv8.1-m.main+mve.fp
-mfloat-abi=hard -mfpu=auto; regression testing is on-going.

Unfortunately MVE does not currently implement auto-vectorization of floating
point values, so I cannot test this directly.  But since the MVE patterns share
90% of the code with NEON, they should just work whenever that support is
added, so I would still like to commit these.

To support this I had to refactor the MVE bits somewhat.  This now uses the
same unspecs for both NEON and MVE and removes the unneeded separate signed and
unsigned unspecs, since they both point to the signed instruction.

I have tried multiple approaches to cleaning this up but I think this is the
nicest it can get given the slight ISA differences.

Ok for master if no issues?

Thanks,
Tamar

gcc/ChangeLog:

	* config/arm/arm_mve.h (__arm_vcaddq_rot90_u8, __arm_vcaddq_rot270_u8,
	__arm_vcaddq_rot90_s8, __arm_vcaddq_rot270_s8, __arm_vcaddq_rot90_u16,
	__arm_vcaddq_rot270_u16, __arm_vcaddq_rot90_s16, __arm_vcaddq_rot270_s16,
	__arm_vcaddq_rot90_u32, __arm_vcaddq_rot270_u32, __arm_vcaddq_rot90_s32,
	__arm_vcaddq_rot270_s32, __arm_vcmulq_rot90_f16, __arm_vcmulq_rot270_f16,
	__arm_vcmulq_rot180_f16, __arm_vcmulq_f16, __arm_vcaddq_rot90_f16,
	__arm_vcaddq_rot270_f16, __arm_vcmulq_rot90_f32, __arm_vcmulq_rot270_f32,
	__arm_vcmulq_rot180_f32, __arm_vcmulq_f32, __arm_vcaddq_rot90_f32,
	__arm_vcaddq_rot270_f32, __arm_vcmlaq_f16, __arm_vcmlaq_rot180_f16,
	__arm_vcmlaq_rot270_f16, __arm_vcmlaq_rot90_f16, __arm_vcmlaq_f32,
	__arm_vcmlaq_rot180_f32, __arm_vcmlaq_rot270_f32,
	__arm_vcmlaq_rot90_f32): Update builtin calls.
	* config/arm/arm_mve_builtins.def (vcaddq_rot90_u, vcaddq_rot270_u,
	vcaddq_rot90_s, vcaddq_rot270_s, vcaddq_rot90_f, vcaddq_rot270_f,
	vcmulq_f, vcmulq_rot90_f, vcmulq_rot180_f, vcmulq_rot270_f, vcmlaq_f,
	vcmlaq_rot90_f, vcmlaq_rot180_f, vcmlaq_rot270_f): Removed.
	(vcaddq_rot90, vcaddq_rot270, vcmulq, vcmulq_rot90, vcmulq_rot180,
	vcmulq_rot270, vcmlaq, vcmlaq_rot90, vcmlaq_rot180, vcmlaq_rot270): New.
	* config/arm/constraints.md (Dz): Include MVE.
	* config/arm/iterators.md (mve_rotsplit1, mve_rotsplit2): New.
* config/arm/mve.md (VCADDQ_ROT270_S, VCADDQ_ROT90_S, VCADDQ_ROT270_U, VCADDQ_ROT90_U, VCADDQ_ROT270_F, VCADDQ_ROT90_F, VCMULQ_F, VCMULQ_ROT180_F, VCMULQ_ROT270_F, VCMULQ_ROT90_F, VCMLAQ_F, VCMLAQ_ROT180_F, VCMLAQ_ROT90_F, VCMLAQ_ROT270_F, VCADDQ_ROT270_S, VCADDQ_ROT270, VCADDQ_ROT90): Removed. (mve_rot, VCMUL): New. (mve_vcaddq_rot270_, mve_vcaddq_rot270_f, mve_vcaddq_rot90_f, mve_vcmulq_f, mve_vcmulq_rot270_f, mve_vcmulq_rot90_f, mve_vcmlaq_f, mve_vcmlaq_rot180_f, mve_vcmlaq_rot270_f, mve_vcmlaq_rot90_f): Removed. (mve_vcmlaq, mve_vcmulq, mve_vcaddq, cadd3, mve_vcaddq): New. * config/arm/neon.md (cadd3, cml4): Moved. (cmul3): Exclude MVE types. * config/arm/unspecs.md (UNSPEC_VCMUL90, UNSPEC_VCMUL270): New. * config/arm/vec-common.md (cadd3, cmul3, arm_vcmla, cml4): New. diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index a801705ced582105df60ccdc79a7500b320e12d4..bd379a5d915ad1d682f2d92554f4bd03c2762733 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -3983,14 +3983,14 @@ __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot90_u8 (uint8x16_t __a, uint8x16_t __b) { - return __builtin_mve_vcaddq_rot90_uv16qi (__a, __b); + return __builtin_mve_vcaddq_rot90v16qi (__a, __b); } __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot270_u8 (uint8x16_t __a, uint8x16_t __b) { - return __builtin_mve_vcaddq_rot270_uv16qi (__a, __b); + return __builtin_mve_vcaddq_rot270v16qi (__a, __b); } __extension__ extern __inline uint8x16_t @@ -4522,14 +4522,14 @@ __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot90_s8 (int8x16_t __a, int8x16_t __b) { - return __builtin_mve_vcaddq_rot90_sv16qi (__a, __b); + return __builtin_mve_vcaddq_rot90v16qi (__a, __b); } __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, 
__gnu_inline__, __artificial__)) __arm_vcaddq_rot270_s8 (int8x16_t __a, int8x16_t __b) { - return __builtin_mve_vcaddq_rot270_sv16qi (__a, __b); + return __builtin_mve_vcaddq_rot270v16qi (__a, __b); } __extension__ extern __inline int8x16_t @@ -4823,14 +4823,14 @@ __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot90_u16 (uint16x8_t __a, uint16x8_t __b) { - return __builtin_mve_vcaddq_rot90_uv8hi (__a, __b); + return __builtin_mve_vcaddq_rot90v8hi (__a, __b); } __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot270_u16 (uint16x8_t __a, uint16x8_t __b) { - return __builtin_mve_vcaddq_rot270_uv8hi (__a, __b); + return __builtin_mve_vcaddq_rot270v8hi (__a, __b); } __extension__ extern __inline uint16x8_t @@ -5362,14 +5362,14 @@ __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot90_s16 (int16x8_t __a, int16x8_t __b) { - return __builtin_mve_vcaddq_rot90_sv8hi (__a, __b); + return __builtin_mve_vcaddq_rot90v8hi (__a, __b); } __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot270_s16 (int16x8_t __a, int16x8_t __b) { - return __builtin_mve_vcaddq_rot270_sv8hi (__a, __b); + return __builtin_mve_vcaddq_rot270v8hi (__a, __b); } __extension__ extern __inline int16x8_t @@ -5663,14 +5663,14 @@ __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot90_u32 (uint32x4_t __a, uint32x4_t __b) { - return __builtin_mve_vcaddq_rot90_uv4si (__a, __b); + return __builtin_mve_vcaddq_rot90v4si (__a, __b); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot270_u32 (uint32x4_t __a, uint32x4_t __b) { - return __builtin_mve_vcaddq_rot270_uv4si (__a, __b); + 
return __builtin_mve_vcaddq_rot270v4si (__a, __b); } __extension__ extern __inline uint32x4_t @@ -6202,14 +6202,14 @@ __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot90_s32 (int32x4_t __a, int32x4_t __b) { - return __builtin_mve_vcaddq_rot90_sv4si (__a, __b); + return __builtin_mve_vcaddq_rot90v4si (__a, __b); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot270_s32 (int32x4_t __a, int32x4_t __b) { - return __builtin_mve_vcaddq_rot270_sv4si (__a, __b); + return __builtin_mve_vcaddq_rot270v4si (__a, __b); } __extension__ extern __inline int32x4_t @@ -17380,42 +17380,42 @@ __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_rot90_f16 (float16x8_t __a, float16x8_t __b) { - return __builtin_mve_vcmulq_rot90_fv8hf (__a, __b); + return __builtin_mve_vcmulq_rot90v8hf (__a, __b); } __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_rot270_f16 (float16x8_t __a, float16x8_t __b) { - return __builtin_mve_vcmulq_rot270_fv8hf (__a, __b); + return __builtin_mve_vcmulq_rot270v8hf (__a, __b); } __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_rot180_f16 (float16x8_t __a, float16x8_t __b) { - return __builtin_mve_vcmulq_rot180_fv8hf (__a, __b); + return __builtin_mve_vcmulq_rot180v8hf (__a, __b); } __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_f16 (float16x8_t __a, float16x8_t __b) { - return __builtin_mve_vcmulq_fv8hf (__a, __b); + return __builtin_mve_vcmulqv8hf (__a, __b); } __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot90_f16 (float16x8_t __a, float16x8_t __b) 
{ - return __builtin_mve_vcaddq_rot90_fv8hf (__a, __b); + return __builtin_mve_vcaddq_rot90v8hf (__a, __b); } __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot270_f16 (float16x8_t __a, float16x8_t __b) { - return __builtin_mve_vcaddq_rot270_fv8hf (__a, __b); + return __builtin_mve_vcaddq_rot270v8hf (__a, __b); } __extension__ extern __inline float16x8_t @@ -17632,42 +17632,42 @@ __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_rot90_f32 (float32x4_t __a, float32x4_t __b) { - return __builtin_mve_vcmulq_rot90_fv4sf (__a, __b); + return __builtin_mve_vcmulq_rot90v4sf (__a, __b); } __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_rot270_f32 (float32x4_t __a, float32x4_t __b) { - return __builtin_mve_vcmulq_rot270_fv4sf (__a, __b); + return __builtin_mve_vcmulq_rot270v4sf (__a, __b); } __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_rot180_f32 (float32x4_t __a, float32x4_t __b) { - return __builtin_mve_vcmulq_rot180_fv4sf (__a, __b); + return __builtin_mve_vcmulq_rot180v4sf (__a, __b); } __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_f32 (float32x4_t __a, float32x4_t __b) { - return __builtin_mve_vcmulq_fv4sf (__a, __b); + return __builtin_mve_vcmulqv4sf (__a, __b); } __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot90_f32 (float32x4_t __a, float32x4_t __b) { - return __builtin_mve_vcaddq_rot90_fv4sf (__a, __b); + return __builtin_mve_vcaddq_rot90v4sf (__a, __b); } __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcaddq_rot270_f32 (float32x4_t __a, 
float32x4_t __b) { - return __builtin_mve_vcaddq_rot270_fv4sf (__a, __b); + return __builtin_mve_vcaddq_rot270v4sf (__a, __b); } __extension__ extern __inline float32x4_t @@ -17822,28 +17822,28 @@ __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c) { - return __builtin_mve_vcmlaq_fv8hf (__a, __b, __c); + return __builtin_mve_vcmlaqv8hf (__a, __b, __c); } __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_rot180_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c) { - return __builtin_mve_vcmlaq_rot180_fv8hf (__a, __b, __c); + return __builtin_mve_vcmlaq_rot180v8hf (__a, __b, __c); } __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_rot270_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c) { - return __builtin_mve_vcmlaq_rot270_fv8hf (__a, __b, __c); + return __builtin_mve_vcmlaq_rot270v8hf (__a, __b, __c); } __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_rot90_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c) { - return __builtin_mve_vcmlaq_rot90_fv8hf (__a, __b, __c); + return __builtin_mve_vcmlaq_rot90v8hf (__a, __b, __c); } __extension__ extern __inline float16x8_t @@ -18130,28 +18130,28 @@ __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) { - return __builtin_mve_vcmlaq_fv4sf (__a, __b, __c); + return __builtin_mve_vcmlaqv4sf (__a, __b, __c); } __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_rot180_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) { - return __builtin_mve_vcmlaq_rot180_fv4sf (__a, 
__b, __c); + return __builtin_mve_vcmlaq_rot180v4sf (__a, __b, __c); } __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_rot270_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) { - return __builtin_mve_vcmlaq_rot270_fv4sf (__a, __b, __c); + return __builtin_mve_vcmlaq_rot270v4sf (__a, __b, __c); } __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_rot90_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) { - return __builtin_mve_vcmlaq_rot90_fv4sf (__a, __b, __c); + return __builtin_mve_vcmlaq_rot90v4sf (__a, __b, __c); } __extension__ extern __inline float32x4_t diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index 753e40a951d071c1ab77476a1cc4779e91689178..f31248595b7bef4eed4963dbbbc72371b15b8af8 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -125,8 +125,6 @@ VAR3 (BINOP_UNONE_UNONE_UNONE, vcmpeqq_u, v16qi, v8hi, v4si) VAR3 (BINOP_UNONE_UNONE_UNONE, vcmpeqq_n_u, v16qi, v8hi, v4si) VAR3 (BINOP_UNONE_UNONE_UNONE, vcmpcsq_u, v16qi, v8hi, v4si) VAR3 (BINOP_UNONE_UNONE_UNONE, vcmpcsq_n_u, v16qi, v8hi, v4si) -VAR3 (BINOP_UNONE_UNONE_UNONE, vcaddq_rot90_u, v16qi, v8hi, v4si) -VAR3 (BINOP_UNONE_UNONE_UNONE, vcaddq_rot270_u, v16qi, v8hi, v4si) VAR3 (BINOP_UNONE_UNONE_UNONE, vbicq_u, v16qi, v8hi, v4si) VAR3 (BINOP_UNONE_UNONE_UNONE, vandq_u, v16qi, v8hi, v4si) VAR3 (BINOP_UNONE_UNONE_UNONE, vaddvq_p_u, v16qi, v8hi, v4si) @@ -202,8 +200,6 @@ VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot270_s, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, vhaddq_s, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, vhaddq_n_s, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, veorq_s, v16qi, v8hi, v4si) -VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot90_s, v16qi, v8hi, v4si) -VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot270_s, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, 
vbrsrq_n_s, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, vbicq_s, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, vandq_s, v16qi, v8hi, v4si) @@ -264,12 +260,6 @@ VAR2 (BINOP_NONE_NONE_NONE, vmaxnmq_f, v8hf, v4sf) VAR2 (BINOP_NONE_NONE_NONE, vmaxnmavq_f, v8hf, v4sf) VAR2 (BINOP_NONE_NONE_NONE, vmaxnmaq_f, v8hf, v4sf) VAR2 (BINOP_NONE_NONE_NONE, veorq_f, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90_f, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270_f, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180_f, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcmulq_f, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot90_f, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot270_f, v8hf, v4sf) VAR2 (BINOP_NONE_NONE_NONE, vbicq_f, v8hf, v4sf) VAR2 (BINOP_NONE_NONE_NONE, vandq_f, v8hf, v4sf) VAR2 (BINOP_NONE_NONE_NONE, vaddq_n_f, v8hf, v4sf) @@ -472,10 +462,6 @@ VAR2 (TERNOP_NONE_NONE_NONE_NONE, vfmsq_f, v8hf, v4sf) VAR2 (TERNOP_NONE_NONE_NONE_NONE, vfmasq_n_f, v8hf, v4sf) VAR2 (TERNOP_NONE_NONE_NONE_NONE, vfmaq_n_f, v8hf, v4sf) VAR2 (TERNOP_NONE_NONE_NONE_NONE, vfmaq_f, v8hf, v4sf) -VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot90_f, v8hf, v4sf) -VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot270_f, v8hf, v4sf) -VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot180_f, v8hf, v4sf) -VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_f, v8hf, v4sf) VAR2 (TERNOP_NONE_NONE_NONE_IMM, vshrntq_n_s, v8hi, v4si) VAR2 (TERNOP_NONE_NONE_NONE_IMM, vshrnbq_n_s, v8hi, v4si) VAR2 (TERNOP_NONE_NONE_NONE_IMM, vrshrntq_n_s, v8hi, v4si) @@ -904,3 +890,15 @@ VAR3 (QUADOP_NONE_NONE_UNONE_IMM_UNONE, vshlcq_m_vec_s, v16qi, v8hi, v4si) VAR3 (QUADOP_NONE_NONE_UNONE_IMM_UNONE, vshlcq_m_carry_s, v16qi, v8hi, v4si) VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE, vshlcq_m_vec_u, v16qi, v8hi, v4si) VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE, vshlcq_m_carry_u, v16qi, v8hi, v4si) + +/* optabs without any suffixes. 
*/ +VAR5 (BINOP_NONE_NONE_NONE, vcaddq_rot90, v16qi, v8hi, v4si, v8hf, v4sf) +VAR5 (BINOP_NONE_NONE_NONE, vcaddq_rot270, v16qi, v8hi, v4si, v8hf, v4sf) +VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90, v8hf, v4sf) +VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270, v8hf, v4sf) +VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180, v8hf, v4sf) +VAR2 (BINOP_NONE_NONE_NONE, vcmulq, v8hf, v4sf) +VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot90, v8hf, v4sf) +VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot270, v8hf, v4sf) +VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot180, v8hf, v4sf) +VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq, v8hf, v4sf) diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md index ff229aa98455e05470801d7110b1aaf5ab3e0d25..e166c11a4c78e8dbcdd25d5cfda14cc36daad5b2 100644 --- a/gcc/config/arm/constraints.md +++ b/gcc/config/arm/constraints.md @@ -310,7 +310,7 @@ (define_constraint "Dz" "@internal In ARM/Thumb-2 state a vector of constant zeros." (and (match_code "const_vector") - (match_test "TARGET_NEON && op == CONST0_RTX (mode)"))) + (match_test "(TARGET_NEON || TARGET_HAVE_MVE) && op == CONST0_RTX (mode)"))) (define_constraint "Da" "@internal diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index a4da379670ce3428253664d44d2e18415a8f49ab..63d4ebe786c5f1262851e462942127adc4a5e92c 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -1168,6 +1168,21 @@ (define_int_attr rotsplit2 [(UNSPEC_VCMLA "90") (UNSPEC_VCMLS "180") (UNSPEC_VCMLS180 "180")]) +(define_int_attr mve_rotsplit1 [(UNSPEC_VCMLA "") + (UNSPEC_VCMLA180 "") + (UNSPEC_VCMUL "") + (UNSPEC_VCMUL180 "") + (UNSPEC_VCMLS "_rot270") + (UNSPEC_VCMLS180 "_rot90")]) + +(define_int_attr mve_rotsplit2 [(UNSPEC_VCMLA "_rot90") + (UNSPEC_VCMLA180 "_rot270") + (UNSPEC_VCMUL "_rot90") + (UNSPEC_VCMUL180 "_rot270") + (UNSPEC_VCMLS "_rot180") + (UNSPEC_VCMLS180 "_rot180")]) + + (define_int_attr fcmac1 [(UNSPEC_VCMLA "a") (UNSPEC_VCMLA180 "a") (UNSPEC_VCMLS "s") (UNSPEC_VCMLS180 
"s")]) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 465b39a51b3a258295ed764f0e742932e5d59225..b8cd74176a41572008d86e6a074a626ccf35a69c 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -42,7 +42,7 @@ (define_c_enum "unspec" [VST4Q VRNDXQ_F VRNDQ_F VRNDPQ_F VRNDNQ_F VRNDMQ_F VCVTQ_N_FROM_F_S VCVTQ_N_FROM_F_U VADDLVQ_P_S VADDLVQ_P_U VCMPNEQ_U VCMPNEQ_S VSHLQ_S VSHLQ_U VABDQ_S VADDQ_N_S VADDVAQ_S VADDVQ_P_S VANDQ_S VBICQ_S - VBRSRQ_N_S VCADDQ_ROT270_S VCADDQ_ROT90_S VCMPEQQ_S + VBRSRQ_N_S VCMPEQQ_S VCMPEQQ_N_S VCMPNEQ_N_S VEORQ_S VHADDQ_S VHADDQ_N_S VHSUBQ_S VHSUBQ_N_S VMAXQ_S VMAXVQ_S VMINQ_S VMINVQ_S VMLADAVQ_S VMULHQ_S VMULLBQ_INT_S VMULLTQ_INT_S VMULQ_S @@ -51,7 +51,7 @@ (define_c_enum "unspec" [VST4Q VRNDXQ_F VRNDQ_F VRNDPQ_F VRNDNQ_F VRNDMQ_F VQSUBQ_N_S VRHADDQ_S VRMULHQ_S VRSHLQ_S VRSHLQ_N_S VRSHRQ_N_S VSHLQ_N_S VSHLQ_R_S VSUBQ_S VSUBQ_N_S VABDQ_U VADDQ_N_U VADDVAQ_U VADDVQ_P_U VANDQ_U VBICQ_U - VBRSRQ_N_U VCADDQ_ROT270_U VCADDQ_ROT90_U VCMPEQQ_U + VBRSRQ_N_U VCMPEQQ_U VCMPEQQ_N_U VCMPNEQ_N_U VEORQ_U VHADDQ_U VHADDQ_N_U VHSUBQ_U VHSUBQ_N_U VMAXQ_U VMAXVQ_U VMINQ_U VMINVQ_U VMLADAVQ_U VMULHQ_U VMULLBQ_INT_U VMULLTQ_INT_U VMULQ_U @@ -66,10 +66,9 @@ (define_c_enum "unspec" [VST4Q VRNDXQ_F VRNDQ_F VRNDPQ_F VRNDNQ_F VRNDMQ_F VQDMULHQ_S VQRDMULHQ_N_S VQRDMULHQ_S VQSHLUQ_N_S VCMPCSQ_N_U VCMPCSQ_U VCMPHIQ_N_U VCMPHIQ_U VABDQ_M_S VABDQ_M_U VABDQ_F VADDQ_N_F VANDQ_F VBICQ_F - VCADDQ_ROT270_F VCADDQ_ROT90_F VCMPEQQ_F VCMPEQQ_N_F + VCMPEQQ_F VCMPEQQ_N_F VCMPGEQ_F VCMPGEQ_N_F VCMPGTQ_F VCMPGTQ_N_F VCMPLEQ_F VCMPLEQ_N_F VCMPLTQ_F VCMPLTQ_N_F VCMPNEQ_F VCMPNEQ_N_F - VCMULQ_F VCMULQ_ROT180_F VCMULQ_ROT270_F VCMULQ_ROT90_F VEORQ_F VMAXNMAQ_F VMAXNMAVQ_F VMAXNMQ_F VMAXNMVQ_F VMINNMAQ_F VMINNMAVQ_F VMINNMQ_F VMINNMVQ_F VMULQ_F VMULQ_N_F VORNQ_F VORRQ_F VSUBQ_F VADDLVAQ_U @@ -112,18 +111,18 @@ (define_c_enum "unspec" [VST4Q VRNDXQ_F VRNDQ_F VRNDPQ_F VRNDNQ_F VRNDMQ_F VMLSDAVAXQ_S VMLSDAVAQ_S VMLADAVAXQ_S VCMPGEQ_M_F VCMPGTQ_M_N_F VMLSLDAVQ_P_S 
VRMLALDAVHAXQ_S VMLSLDAVXQ_P_S VFMAQ_F VMLSLDAVAQ_S VQSHRUNBQ_N_S - VQRSHRUNTQ_N_S VCMLAQ_F VMINNMAQ_M_F VFMASQ_N_F + VQRSHRUNTQ_N_S VMINNMAQ_M_F VFMASQ_N_F VDUPQ_M_N_F VCMPGTQ_M_F VCMPLTQ_M_F VRMLSLDAVHQ_P_S VQSHRUNTQ_N_S VABSQ_M_F VMAXNMAVQ_P_F VFMAQ_N_F VRMLSLDAVHXQ_P_S VREV32Q_M_F VRMLSLDAVHAQ_S VRMLSLDAVHAXQ_S VCMPLTQ_M_N_F VCMPNEQ_M_F VRNDAQ_M_F VRNDPQ_M_F VADDLVAQ_P_S VQMOVUNBQ_M_S VCMPLEQ_M_F - VCMLAQ_ROT180_F VMLSLDAVAXQ_S VRNDXQ_M_F VFMSQ_F - VMINNMVQ_P_F VMAXNMVQ_P_F VPSELQ_F VCMLAQ_ROT90_F + VMLSLDAVAXQ_S VRNDXQ_M_F VFMSQ_F + VMINNMVQ_P_F VMAXNMVQ_P_F VPSELQ_F VQMOVUNTQ_M_S VREV64Q_M_F VNEGQ_M_F VRNDMQ_M_F VCMPLEQ_M_N_F VCMPGEQ_M_N_F VRNDNQ_M_F VMINNMAVQ_P_F VCMPNEQ_M_N_F VRMLALDAVHQ_P_S VRMLALDAVHXQ_P_S - VCMPEQQ_M_N_F VCMLAQ_ROT270_F VMAXNMAQ_M_F VRNDQ_M_F + VCMPEQQ_M_N_F VMAXNMAQ_M_F VRNDQ_M_F VMLALDAVQ_P_U VMLALDAVQ_P_S VQMOVNBQ_M_S VQMOVNBQ_M_U VMOVLTQ_M_U VMOVLTQ_M_S VMOVNBQ_M_U VMOVNBQ_M_S VRSHRNTQ_N_U VRSHRNTQ_N_S VORRQ_M_N_S VORRQ_M_N_U @@ -240,9 +239,8 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") (VCVTQ_TO_F_U "u") (VREV16Q_S "s") (VABDQ_U "u") (VADDQ_N_S "s") (VADDQ_N_U "u") (VADDVQ_P_S "s") (VADDVQ_P_U "u") (VANDQ_S "s") (VANDQ_U "u") (VBICQ_S "s") (VBICQ_U "u") - (VBRSRQ_N_S "s") (VBRSRQ_N_U "u") (VCADDQ_ROT270_S "s") - (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s") - (VCMPEQQ_S "s") (VCMPEQQ_U "u") (VCADDQ_ROT90_U "u") + (VBRSRQ_N_S "s") (VBRSRQ_N_U "u") + (VCMPEQQ_S "s") (VCMPEQQ_U "u") (VCMPEQQ_N_S "s") (VCMPEQQ_N_U "u") (VCMPNEQ_N_S "s") (VCMPNEQ_N_U "u") (VEORQ_S "s") (VEORQ_U "u") (VHADDQ_N_S "s") (VHADDQ_N_U "u") (VHADDQ_S "s") @@ -421,6 +419,19 @@ (define_mode_attr V_extr_elem [(V16QI "u8") (V8HI "u16") (V4SI "32") (define_mode_attr earlyclobber_32 [(V16QI "=w") (V8HI "=w") (V4SI "=&w") (V8HF "=w") (V4SF "=&w")]) +(define_int_attr mve_rot [(UNSPEC_VCADD90 "_rot90") + (UNSPEC_VCADD270 "_rot270") + (UNSPEC_VCMLA "") + (UNSPEC_VCMLA90 "_rot90") + (UNSPEC_VCMLA180 "_rot180") + (UNSPEC_VCMLA270 "_rot270") + (UNSPEC_VCMUL "") + 
(UNSPEC_VCMUL90 "_rot90") + (UNSPEC_VCMUL180 "_rot180") + (UNSPEC_VCMUL270 "_rot270")]) + +(define_int_iterator VCMUL [UNSPEC_VCMUL UNSPEC_VCMUL90 UNSPEC_VCMUL180 UNSPEC_VCMUL270]) + (define_int_iterator VCVTQ_TO_F [VCVTQ_TO_F_S VCVTQ_TO_F_U]) (define_int_iterator VMVNQ_N [VMVNQ_N_U VMVNQ_N_S]) (define_int_iterator VREV64Q [VREV64Q_S VREV64Q_U]) @@ -454,8 +465,6 @@ (define_int_iterator VADDVQ_P [VADDVQ_P_U VADDVQ_P_S]) (define_int_iterator VANDQ [VANDQ_U VANDQ_S]) (define_int_iterator VBICQ [VBICQ_S VBICQ_U]) (define_int_iterator VBRSRQ_N [VBRSRQ_N_U VBRSRQ_N_S]) -(define_int_iterator VCADDQ_ROT270 [VCADDQ_ROT270_S VCADDQ_ROT270_U]) -(define_int_iterator VCADDQ_ROT90 [VCADDQ_ROT90_U VCADDQ_ROT90_S]) (define_int_iterator VCMPEQQ [VCMPEQQ_U VCMPEQQ_S]) (define_int_iterator VCMPEQQ_N [VCMPEQQ_N_S VCMPEQQ_N_U]) (define_int_iterator VCMPNEQ_N [VCMPNEQ_N_U VCMPNEQ_N_S]) @@ -1585,34 +1594,28 @@ (define_insn "mve_vbrsrq_n_" ]) ;; -;; [vcaddq_rot270_s, vcaddq_rot270_u]) +;; [vcaddq, vcaddq_rot90, vcadd_rot180, vcadd_rot270]) ;; -(define_insn "mve_vcaddq_rot270_" +(define_insn "mve_vcaddq" [ (set (match_operand:MVE_2 0 "s_register_operand" "") (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w") (match_operand:MVE_2 2 "s_register_operand" "w")] - VCADDQ_ROT270)) + VCADD)) ] "TARGET_HAVE_MVE" - "vcadd.i%# %q0, %q1, %q2, #270" + "vcadd.i%# %q0, %q1, %q2, #" [(set_attr "type" "mve_move") ]) -;; -;; [vcaddq_rot90_u, vcaddq_rot90_s]) -;; -(define_insn "mve_vcaddq_rot90_" - [ - (set (match_operand:MVE_2 0 "s_register_operand" "") - (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w") - (match_operand:MVE_2 2 "s_register_operand" "w")] - VCADDQ_ROT90)) - ] +;; Auto vectorizer pattern for int vcadd +(define_expand "cadd3" + [(set (match_operand:MVE_2 0 "register_operand") + (unspec:MVE_2 [(match_operand:MVE_2 1 "register_operand") + (match_operand:MVE_2 2 "register_operand")] + VCADD))] "TARGET_HAVE_MVE" - "vcadd.i%# %q0, %q1, %q2, #90" - [(set_attr "type" 
"mve_move") -]) +) ;; ;; [vcmpcsq_n_u]) @@ -2665,32 +2668,17 @@ (define_insn "mve_vbicq_n_" ]) ;; -;; [vcaddq_rot270_f]) +;; [vcaddq, vcaddq_rot90, vcadd_rot180, vcadd_rot270]) ;; -(define_insn "mve_vcaddq_rot270_f" +(define_insn "mve_vcaddq" [ (set (match_operand:MVE_0 0 "s_register_operand" "") (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w") (match_operand:MVE_0 2 "s_register_operand" "w")] - VCADDQ_ROT270_F)) + VCADD)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcadd.f%# %q0, %q1, %q2, #270" - [(set_attr "type" "mve_move") -]) - -;; -;; [vcaddq_rot90_f]) -;; -(define_insn "mve_vcaddq_rot90_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w") - (match_operand:MVE_0 2 "s_register_operand" "w")] - VCADDQ_ROT90_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcadd.f%# %q0, %q1, %q2, #90" + "vcadd.f%# %q0, %q1, %q2, #" [(set_attr "type" "mve_move") ]) @@ -2875,62 +2863,17 @@ (define_insn "mve_vcmpneq_n_f" ]) ;; -;; [vcmulq_f]) -;; -(define_insn "mve_vcmulq_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w") - (match_operand:MVE_0 2 "s_register_operand" "w")] - VCMULQ_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcmul.f%# %q0, %q1, %q2, #0" - [(set_attr "type" "mve_move") -]) - -;; -;; [vcmulq_rot180_f]) -;; -(define_insn "mve_vcmulq_rot180_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w") - (match_operand:MVE_0 2 "s_register_operand" "w")] - VCMULQ_ROT180_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcmul.f%# %q0, %q1, %q2, #180" - [(set_attr "type" "mve_move") -]) - -;; -;; [vcmulq_rot270_f]) -;; -(define_insn "mve_vcmulq_rot270_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w") - (match_operand:MVE_0 2 
"s_register_operand" "w")] - VCMULQ_ROT270_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcmul.f%# %q0, %q1, %q2, #270" - [(set_attr "type" "mve_move") -]) - -;; -;; [vcmulq_rot90_f]) +;; [vcmulq, vcmulq_rot90, vcmulq_rot180, vcmulq_rot270]) ;; -(define_insn "mve_vcmulq_rot90_f" +(define_insn "mve_vcmulq" [ (set (match_operand:MVE_0 0 "s_register_operand" "") (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w") (match_operand:MVE_0 2 "s_register_operand" "w")] - VCMULQ_ROT90_F)) + VCMUL)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcmul.f%# %q0, %q1, %q2, #90" + "vcmul.f%# %q0, %q1, %q2, #" [(set_attr "type" "mve_move") ]) @@ -4692,66 +4635,20 @@ (define_insn "mve_vaddlvaq_p_v4si" [(set_attr "type" "mve_move") (set_attr "length""8")]) ;; -;; [vcmlaq_f]) -;; -(define_insn "mve_vcmlaq_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "=w") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w")] - VCMLAQ_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcmla.f%# %q0, %q2, %q3, #0" - [(set_attr "type" "mve_move") -]) - -;; -;; [vcmlaq_rot180_f]) -;; -(define_insn "mve_vcmlaq_rot180_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "=w") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w")] - VCMLAQ_ROT180_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcmla.f%# %q0, %q2, %q3, #180" - [(set_attr "type" "mve_move") -]) - -;; -;; [vcmlaq_rot270_f]) -;; -(define_insn "mve_vcmlaq_rot270_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "=w") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w")] - VCMLAQ_ROT270_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - 
"vcmla.f%# %q0, %q2, %q3, #270" - [(set_attr "type" "mve_move") -]) - -;; -;; [vcmlaq_rot90_f]) +;; [vcmlaq, vcmlaq_rot90, vcmlaq_rot180, vcmlaq_rot270]) ;; -(define_insn "mve_vcmlaq_rot90_f" +(define_insn "mve_vcmlaq" [ - (set (match_operand:MVE_0 0 "s_register_operand" "=w") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w")] - VCMLAQ_ROT90_F)) + (set (match_operand:MVE_0 0 "s_register_operand" "=w,w") + (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0,Dz") + (match_operand:MVE_0 2 "s_register_operand" "w,w") + (match_operand:MVE_0 3 "s_register_operand" "w,w")] + VCMLA)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcmla.f%# %q0, %q2, %q3, #90" + "@ + vcmla.f%# %q0, %q2, %q3, # + vcmul.f%# %q0, %q2, %q3, #" [(set_attr "type" "mve_move") ]) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 2ccbf99883ec6a4808f16453e3d07454f3e3077e..d87ae8adbf1292bebbb9046e67038fcaa5f23040 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -3174,14 +3174,6 @@ (define_insn "neon_vcadd" [(set_attr "type" "neon_fcadd")] ) -(define_expand "cadd3" - [(set (match_operand:VF 0 "register_operand") - (unspec:VF [(match_operand:VF 1 "register_operand") - (match_operand:VF 2 "register_operand")] - VCADD))] - "TARGET_COMPLEX" -) - (define_insn "neon_vcmla" [(set (match_operand:VF 0 "register_operand" "=w") (plus:VF (match_operand:VF 1 "register_operand" "0") @@ -3238,32 +3230,13 @@ (define_insn "neon_vcmlaq_lane" [(set_attr "type" "neon_fcmla")] ) - -;; The complex mla/mls operations always need to expand to two instructions. -;; The first operation does half the computation and the second does the -;; remainder. Because of this, expand early. 
-(define_expand "cml4" - [(set (match_operand:VF 0 "register_operand") - (plus:VF (match_operand:VF 1 "register_operand") - (unspec:VF [(match_operand:VF 2 "register_operand") - (match_operand:VF 3 "register_operand")] - VCMLA_OP)))] - "TARGET_COMPLEX" -{ - emit_insn (gen_neon_vcmla (operands[0], operands[1], - operands[2], operands[3])); - emit_insn (gen_neon_vcmla (operands[0], operands[0], - operands[2], operands[3])); - DONE; -}) - ;; The complex mul operations always need to expand to two instructions. ;; The first operation does half the computation and the second does the ;; remainder. Because of this, expand early. (define_expand "cmul3" - [(set (match_operand:VF 0 "register_operand") - (unspec:VF [(match_operand:VF 1 "register_operand") - (match_operand:VF 2 "register_operand")] + [(set (match_operand:VDF 0 "register_operand") + (unspec:VDF [(match_operand:VDF 1 "register_operand") + (match_operand:VDF 2 "register_operand")] VCMUL_OP))] "TARGET_COMPLEX" { @@ -3276,6 +3249,7 @@ (define_expand "cmul3" DONE; }) + ;; These instructions map to the __builtins for the Dot Product operations. 
(define_insn "neon_dot" [(set (match_operand:VCVTI 0 "register_operand" "=w") diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md index d1b2824a0fe76f62d69c18dcec2f47dfb75b586e..1251aace01b42d393a40467906c610c45de0412a 100644 --- a/gcc/config/arm/unspecs.md +++ b/gcc/config/arm/unspecs.md @@ -511,7 +511,9 @@ (define_c_enum "unspec" [ UNSPEC_VCMLA180 UNSPEC_VCMLA270 UNSPEC_VCMUL + UNSPEC_VCMUL90 UNSPEC_VCMUL180 + UNSPEC_VCMUL270 UNSPEC_VCMLS UNSPEC_VCMLS180 UNSPEC_MATMUL_S diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md index c3c86c46355e6ace6c90e189b4160dbe4cd9caf3..8affffb68e6928bc4210656134677ad5f2915426 100644 --- a/gcc/config/arm/vec-common.md +++ b/gcc/config/arm/vec-common.md @@ -191,3 +191,70 @@ (define_expand "vec_set" GEN_INT (elem), operands[0])); DONE; }) + +(define_expand "cadd3" + [(set (match_operand:VF 0 "register_operand") + (unspec:VF [(match_operand:VF 1 "register_operand") + (match_operand:VF 2 "register_operand")] + VCADD))] + "TARGET_COMPLEX || (TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT + && ARM_HAVE_NEON__ARITH)" +) + +;; The complex mul operations always need to expand to two instructions. +;; The first operation does half the computation and the second does the +;; remainder. Because of this, expand early. 
+(define_expand "cmul3" + [(set (match_operand:VQ_HSF 0 "register_operand") + (unspec:VQ_HSF [(match_operand:VQ_HSF 1 "register_operand") + (match_operand:VQ_HSF 2 "register_operand")] + VCMUL_OP))] + "TARGET_COMPLEX || (TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT)" +{ + if (TARGET_COMPLEX) + { + rtx tmp = gen_reg_rtx (mode); + emit_move_insn (tmp, CONST0_RTX (mode)); + emit_insn (gen_neon_vcmla (operands[0], tmp, + operands[1], operands[2])); + emit_insn (gen_neon_vcmla (operands[0], operands[0], + operands[1], operands[2])); + } + else + { + emit_insn (gen_mve_vcmulq (operands[0], operands[1], + operands[2])); + emit_insn (gen_mve_vcmulq (operands[0], operands[1], + operands[2])); + } + DONE; +}) + +(define_expand "arm_vcmla" + [(set (match_operand:VF 0 "register_operand") + (plus:VF (match_operand:VF 1 "register_operand") + (unspec:VF [(match_operand:VF 2 "register_operand") + (match_operand:VF 3 "register_operand")] + VCMLA)))] + "TARGET_COMPLEX || (TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT + && ARM_HAVE_NEON__ARITH)" +) + +;; The complex mla/mls operations always need to expand to two instructions. +;; The first operation does half the computation and the second does the +;; remainder. Because of this, expand early. 
+(define_expand "cml4" + [(set (match_operand:VF 0 "register_operand") + (plus:VF (match_operand:VF 1 "register_operand") + (unspec:VF [(match_operand:VF 2 "register_operand") + (match_operand:VF 3 "register_operand")] + VCMLA_OP)))] + "TARGET_COMPLEX || (TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT + && ARM_HAVE_NEON__ARITH)" +{ + emit_insn (gen_arm_vcmla (operands[0], operands[1], + operands[2], operands[3])); + emit_insn (gen_arm_vcmla (operands[0], operands[0], + operands[2], operands[3])); + DONE; +}) From patchwork Fri Sep 25 14:31:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 1371354
Date: Fri, 25 Sep 2020 15:31:47 +0100 From: Tamar Christina To: gcc-patches@gcc.gnu.org Subject: [PATCH v2 16/16] Testsuite: Add initial tests for NEON (incomplete) Message-ID: <20200925143145.GA31591@arm.com> Content-Disposition: inline User-Agent: Mutt/1.9.4 (2018-02-28)
SFS:(4636009)(396003)(136003)(39860400002)(376002)(346002)(46966005)(6916009)(8676002)(36756003)(316002)(4743002)(956004)(2906002)(81166007)(356005)(186003)(336012)(26005)(82310400003)(16526019)(82740400003)(8936002)(86362001)(2616005)(44144004)(8886007)(7696005)(55016002)(33656002)(1076003)(6666004)(44832011)(235185007)(70206006)(66616009)(4326008)(5660300002)(70586007)(478600001)(33964004)(47076004)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Sep 2020 14:32:02.0380 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 5161e7e1-c180-4129-a2d8-08d8615fc48f X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5EUR03FT055.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR08MB5274 X-Spam-Status: No, score=-14.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_LOTSOFHASH, MSGID_FROM_MTA_HEADER, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: nd@arm.com Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Hi All, These are just initial testcases to show what the patch is testing for, however it is incomplete and I am working on better test setup to test all targets and add middle-end tests. These were just included for completeness. 
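For reference, the arithmetic behind the expected values in the vcadd tests can be sketched in plain C. This is only a sketch, assuming that a `ROT` of `* I` corresponds to the `#90` form and `* I * I * I` to the `#270` form, as the scan patterns in the tests suggest. The helper names `cadd_rot90` and `cadd_rot270` are hypothetical and are not part of the patch.

```c
#include <complex.h>

/* Hypothetical helpers for illustration only; not part of the patch.  */

/* a + b * I: rotate b by 90 degrees, then add.  */
double complex
cadd_rot90 (double complex a, double complex b)
{
  return (creal (a) - cimag (b)) + (cimag (a) + creal (b)) * I;
}

/* a + b * I * I * I: rotate b by 270 degrees, then add.  */
double complex
cadd_rot270 (double complex a, double complex b)
{
  return (creal (a) + cimag (b)) + (cimag (a) - creal (b)) * I;
}
```

With a = 1.0 + 2.0 * I and b = 4.0 + 2.0 * I, the first helper gives -1.0 + 6.0 * I and the second gives 3.0 - 2.0 * I, which are the values the tests compare against.
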
Thanks,
Tamar

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays-autovec-270.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays-autovec-90.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_1.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_2.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_3.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_4.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_5.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_6.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex-autovec.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_1.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_2.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_3.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_4.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_5.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_6.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex-autovec.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_1.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_1.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_2.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_3.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_2.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_1.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_2.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_3.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_3.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_1.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_2.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_3.c: New test.

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays-autovec-270.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays-autovec-270.c
new file mode 100644
index 0000000000000000000000000000000000000000..8f660f392153c3a6a83b31486e275be316c6ad2b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays-autovec-270.c
@@ -0,0 +1,13 @@
+/* { dg-skip-if "" { *-*-* } } */
+
+#define N 200
+
+__attribute__ ((noinline))
+void calc (TYPE a[N], TYPE b[N], TYPE *c)
+{
+  for (int i=0; i < N; i+=2)
+    {
+      c[i] = a[i] + b[i+1];
+      c[i+1] = a[i+1] - b[i];
+    }
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays-autovec-90.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays-autovec-90.c
new file mode 100644
index 0000000000000000000000000000000000000000..14014b9d4f2c41e75be3e253d2e47e639e4224c0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays-autovec-90.c
@@ -0,0 +1,12 @@
+/* { dg-skip-if "" { *-*-* } } */
+#define N 200
+
+__attribute__ ((noinline))
+void calc (TYPE a[N], TYPE b[N], TYPE *c)
+{
+  for (int i=0; i < N; i+=2)
+    {
+      c[i] = a[i] - b[i+1];
+      c[i+1] = a[i+1] + b[i];
+    }
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..997d9065504a9a16d3ea1316f7ea4208b3516c55
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_1.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_df } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE double
+#include "vcadd-arrays-autovec-90.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE a[N] = {1.0, 2.0, 3.0, 4.0};
+  TYPE b[N] = {4.0, 2.0, 1.5, 4.5};
+  TYPE c[N] = {0};
+  calc (a, b, c);
+
+  if (c[0] != -1.0 || c[1] != 6.0)
+    abort ();
+
+  if (c[2] != -1.5 || c[3] != 5.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.2d, v[0-9]+\.2d, v[0-9]+\.2d, #90} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {vcadd\.} { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..8ab2aa75e261e0d885fb8042c09b6e42284dea85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_2.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_sf } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE float
+#include "vcadd-arrays-autovec-90.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE a[N] = {1.0, 2.0, 3.0, 4.0};
+  TYPE b[N] = {4.0, 2.0, 1.5, 4.5};
+  TYPE c[N] = {0};
+  calc (a, b, c);
+
+  if (c[0] != -1.0 || c[1] != 6.0)
+    abort ();
+
+  if (c[2] != -1.5 || c[3] != 5.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.4s, #90} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcadd\.f32\tq[0-9]+, q[0-9]+, q[0-9]+, #(?:0|90)} 2 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_3.c
new file mode 100644
index 0000000000000000000000000000000000000000..8002d4efa003bb8af6a6592334e7749da336875e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_hf } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -march=armv8.3-a+fp16 -save-temps" } */
+
+#define TYPE _Float16
+#include "vcadd-arrays-autovec-90.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE a[N] = {1.0, 2.0, 3.0, 4.0};
+  TYPE b[N] = {4.0, 2.0, 1.5, 4.5};
+  TYPE c[N] = {0};
+  calc (a, b, c);
+
+  if (c[0] != -1.0 || c[1] != 6.0)
+    abort ();
+
+  if (c[2] != -1.5 || c[3] != 5.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.8h, v[0-9]+\.8h, v[0-9]+\.8h, #90} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcadd\.f16\tq[0-9]+, q[0-9]+, q[0-9]+, #90} 1 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_4.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_4.c
new file mode 100644
index 0000000000000000000000000000000000000000..601d6886a4c999d010ca2e8a5babad066d5fa0a5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_4.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_df } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE double
+#include "vcadd-arrays-autovec-270.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE a[N] = {1.0, 2.0, 3.0, 4.0};
+  TYPE b[N] = {4.0, 2.0, 1.5, 4.5};
+  TYPE c[N] = {0};
+  calc (a, b, c);
+
+  if (c[0] != 3.0 || c[1] != -2.0)
+    abort ();
+
+  if (c[2] != 7.5 || c[3] != 2.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.2d, v[0-9]+\.2d, v[0-9]+\.2d, #270} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {vcadd\.} { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_5.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_5.c
new file mode 100644
index 0000000000000000000000000000000000000000..f7851bc7304bf671f3e14bb08e7dc434e867a29c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_5.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_sf } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE float
+#include "vcadd-arrays-autovec-270.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE a[N] = {1.0, 2.0, 3.0, 4.0};
+  TYPE b[N] = {4.0, 2.0, 1.5, 4.5};
+  TYPE c[N] = {0};
+  calc (a, b, c);
+
+  if (c[0] != 3.0 || c[1] != -2.0)
+    abort ();
+
+  if (c[2] != 7.5 || c[3] != 2.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.4s, #270} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcadd\.f32\tq[0-9]+, q[0-9]+, q[0-9]+, #270} 1 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_6.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_6.c
new file mode 100644
index 0000000000000000000000000000000000000000..02172be3647852cd3a959a6b1aef82e3a4c5f28d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-arrays_6.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_hf } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -march=armv8.3-a+fp16 -save-temps" } */
+
+#define TYPE _Float16
+#include "vcadd-arrays-autovec-270.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE a[N] = {1.0, 2.0, 3.0, 4.0};
+  TYPE b[N] = {4.0, 2.0, 1.5, 4.5};
+  TYPE c[N] = {0};
+  calc (a, b, c);
+
+  if (c[0] != 3.0 || c[1] != -2.0)
+    abort ();
+
+  if (c[2] != 7.5 || c[3] != 2.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.8h, v[0-9]+\.8h, v[0-9]+\.8h, #270} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcadd\.f16\tq[0-9]+, q[0-9]+, q[0-9]+, #270} 1 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex-autovec.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex-autovec.c
new file mode 100644
index 0000000000000000000000000000000000000000..2a301e6ec0a9ba23a16c39d9c36ee281422f1803
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex-autovec.c
@@ -0,0 +1,12 @@
+/* { dg-skip-if "" { *-*-* } } */
+
+#include <complex.h>
+
+#define N 200
+
+__attribute__ ((noinline))
+void calc (TYPE complex a[N], TYPE complex b[N], TYPE complex c[N])
+{
+  for (int i=0; i < N; i++)
+    c[i] = a[i] + b[i] ROT;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..aebe0b8bdeee25d7ae6e387b006de9413ecbc13e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_1.c
@@ -0,0 +1,33 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_df } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE double
+#define ROT * I
+#include "vcadd-complex-autovec.c"
+
+extern void abort(void);
+
+#include <complex.h>
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {0};
+  calc (a, b, c);
+
+  if (creal (c[0]) != -1.0 || cimag (c[0]) != 6.0)
+    abort ();
+
+  if (creal (c[1]) != -1.5 || cimag (c[1]) != 5.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.2d, v[0-9]+\.2d, v[0-9]+\.2d, #90} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {vcadd\.} { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..891e9874d2d66b9849809c3f7ca3c31044256f99
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_2.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_sf } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE float
+#define ROT * I
+#include "vcadd-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {0};
+  calc (a, b, c);
+
+  if (creal (c[0]) != -1.0 || cimag (c[0]) != 6.0)
+    abort ();
+
+  if (creal (c[1]) != -1.5 || cimag (c[1]) != 5.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.4s, #90} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcadd\.f32\tq[0-9]+, q[0-9]+, q[0-9]+, #90} 1 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_3.c
new file mode 100644
index 0000000000000000000000000000000000000000..871d64a9bab0b08433f55eedd890146058526a1c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_3.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_hf } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -march=armv8.3-a+fp16 -save-temps" } */
+
+#define TYPE _Float16
+#define ROT * I
+#include "vcadd-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {0};
+  calc (a, b, c);
+
+  if (creal (c[0]) != -1.0 || cimag (c[0]) != 6.0)
+    abort ();
+
+  if (creal (c[1]) != -1.5 || cimag (c[1]) != 5.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.8h, v[0-9]+\.8h, v[0-9]+\.8h, #90} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcadd\.f16\tq[0-9]+, q[0-9]+, q[0-9]+, #90} 1 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_4.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_4.c
new file mode 100644
index 0000000000000000000000000000000000000000..7c9278945fc28e1350ef8ac9a4ddfdac56da14c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_4.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_df } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE double
+#define ROT * I * I * I
+#include "vcadd-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {0};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 3.0 || cimag (c[0]) != -2.0)
+    abort ();
+
+  if (creal (c[1]) != 7.5 || cimag (c[1]) != 2.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.2d, v[0-9]+\.2d, v[0-9]+\.2d, #270} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {vcadd\.} { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_5.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_5.c
new file mode 100644
index 0000000000000000000000000000000000000000..a431fc82155c5eccf02cf4b66313caf989777084
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_5.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_sf } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE float
+#define ROT * I * I * I
+#include "vcadd-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {0};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 3.0 || cimag (c[0]) != -2.0)
+    abort ();
+
+  if (creal (c[1]) != 7.5 || cimag (c[1]) != 2.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.4s, #270} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcadd\.f32\tq[0-9]+, q[0-9]+, q[0-9]+, #270} 1 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_6.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_6.c
new file mode 100644
index 0000000000000000000000000000000000000000..6e1b04d4088b9dd503f33aa6e85196a61db0ee5c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcadd-complex_6.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_hf } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -march=armv8.3-a+fp16 -save-temps" } */
+
+#define TYPE _Float16
+#define ROT * I * I * I
+#include "vcadd-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {0};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 3.0 || cimag (c[0]) != -2.0)
+    abort ();
+
+  if (creal (c[1]) != 7.5 || cimag (c[1]) != 2.5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcadd\tv[0-9]+\.8h, v[0-9]+\.8h, v[0-9]+\.8h, #270} 1 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcadd\.f16\tq[0-9]+, q[0-9]+, q[0-9]+, #270} 1 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex-autovec.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex-autovec.c
new file mode 100644
index 0000000000000000000000000000000000000000..1ad7cc319eeef2ea15f530997a9ffc09571ea02e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex-autovec.c
@@ -0,0 +1,11 @@
+/* { dg-skip-if "" { *-*-* } } */
+#include <complex.h>
+
+#define N 200
+
+__attribute__ ((noinline, noipa))
+void calc (TYPE complex a[N], TYPE complex b[N], TYPE complex c[N])
+{
+  for (int i=0; i < N; i++)
+    c[i] += a[i] * b[i] ROT;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..6b5baf013ce285cfa0a28cb9128d839d6ad3d4eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_1.c
@@ -0,0 +1,33 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_df } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+/* { dg-keep-saved-temps ".s" ".o" ".exe" } */
+#define TYPE double
+#define ROT
+#include "vcmla-complex-autovec.c"
+
+extern void abort(void);
+
+#include <complex.h>
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 2.5 || cimag (c[0]) != 11.5)
+    abort ();
+
+  if (creal (c[1]) != -11.5 || cimag (c[1]) != 21.0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcmla\tv[0-9]+\.2d, v[0-9]+\.2d, v[0-9]+\.2d, #(?:0|90)} 2 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {vcmla\.} { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..2d6fc3354ad5b32c4d636efbeeefdc756d7d2b7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_1.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_df } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE double
+#define ROT * I * I
+#include "vcmla-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 2.5 || cimag (c[0]) != -8.5)
+    abort ();
+
+  if (creal (c[1]) != 15.5 || cimag (c[1]) != -18.0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcmla\tv[0-9]+\.2d, v[0-9]+\.2d, v[0-9]+\.2d, #(?:180|270)} 2 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {vcmla\.} { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..f4ce831705b09288ef2ca52c26a26fef1d8cca20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_2.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_sf } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE float
+#define ROT * I * I
+#include "vcmla-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 2.5 || cimag (c[0]) != -8.5)
+    abort ();
+
+  if (creal (c[1]) != 15.5 || cimag (c[1]) != -18.0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcmla\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.4s, #(?:180|270)} 2 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcmla\.f32\tq[0-9]+, q[0-9]+, q[0-9]+, #(?:180|270)} 2 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_3.c
new file mode 100644
index 0000000000000000000000000000000000000000..7a6aed992322753dc928f1db9689f58f02702745
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_180_3.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_hf } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -march=armv8.3-a+fp16 -save-temps" } */
+
+#define TYPE _Float16
+#define ROT * I * I
+#include "vcmla-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 2.5 || cimag (c[0]) != -8.5)
+    abort ();
+
+  if (creal (c[1]) != 15.5 || cimag (c[1]) != -18.0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcmla\tv[0-9]+\.8h, v[0-9]+\.8h, v[0-9]+\.8h, #(?:180|270)} 2 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcmla\.f16\tq[0-9]+, q[0-9]+, q[0-9]+, #(?:180|270)} 2 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..70198d0eb52cf1be2c3df4c99ae5868d7abafd38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_2.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_sf } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE float
+#define ROT
+#include "vcmla-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 2.5 || cimag (c[0]) != 11.5)
+    abort ();
+
+  if (creal (c[1]) != -11.5 || cimag (c[1]) != 21.0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {fcmla\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.4s, #(?:0|90)} 2 { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times {vcmla\.f32\tq[0-9]+, q[0-9]+, q[0-9]+, #(?:0|90)} 2 { target { arm*-*-* } } } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ccc4a8723b28f81de0ee93abeff6d8a09e841260
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_1.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_df } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE double
+#define ROT * I * I * I
+#include "vcmla-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 12.5 || cimag (c[0]) != 1.5)
+    abort ();
+
+  if (creal (c[1]) != 21.5 || cimag (c[1]) != 15.0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not {fcmla} { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {vcmla\.} { target { arm*-*-* } } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..b9748e3674f3594369d81f8587f5d5a424c13562
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_2.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_sf } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -save-temps" } */
+
+#define TYPE float
+#define ROT * I * I * I
+#include "vcmla-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I};
+  TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I};
+  calc (a, b, c);
+
+  if (creal (c[0]) != 12.5 || cimag (c[0]) != 1.5)
+    abort ();
+
+  if (creal (c[1]) != 21.5 || cimag (c[1]) != 15.0)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not {fcmla} { target { aarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-not {vcmla\.} { target { arm*-*-* } } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_3.c
new file mode 100644
index 0000000000000000000000000000000000000000..09e489ffcd302b4bdba2148c3a11529344df2a11
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_270_3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */
+/* { dg-require-effective-target vect_complex_rot_hf } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_3a_complex_neon } */
+/* { dg-additional-options "-Ofast -march=armv8.3-a+fp16 -save-temps" } */
+
+#define TYPE _Float16
+#define ROT * I * I * I
+#include "vcmla-complex-autovec.c"
+
+extern void abort(void);
+
+int main()
+{
+  TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I};
+  TYPE complex b[N] =
{4.0 + 2.0 * I, 1.5 + 4.5 * I}; + TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I}; + calc (a, b, c); + + if (creal (c[0]) != 12.5 || cimag (c[0]) != 1.5) + abort (); + + if (creal (c[1]) != 21.5 || cimag (c[1]) != 15.0) + abort (); + + return 0; +} + +/* { dg-final { scan-assembler-not {fcmla} { target { aarch64*-*-* } } } } */ +/* { dg-final { scan-assembler-not {vcmla\.} { target { arm*-*-* } } } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_3.c new file mode 100644 index 0000000000000000000000000000000000000000..2259587237b510149a8369761f6b3b92d1d79cb2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_3.c @@ -0,0 +1,32 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */ +/* { dg-require-effective-target vect_complex_rot_hf } */ +/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */ +/* { dg-add-options arm_v8_3a_complex_neon } */ +/* { dg-additional-options "-Ofast -march=armv8.3-a+fp16 -save-temps" } */ + +#define TYPE _Float16 +#define ROT +#include "vcmla-complex-autovec.c" + +extern void abort(void); + +int main() +{ + TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I}; + TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I}; + TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I}; + calc (a, b, c); + + if (creal (c[0]) != 2.5 || cimag (c[0]) != 11.5) + abort (); + + if (creal (c[1]) != -11.5 || cimag (c[1]) != 21.0) + abort (); + + return 0; +} + +/* { dg-final { scan-assembler-times {fcmla\tv[0-9]+\.8h, v[0-9]+\.8h, v[0-9]+\.8h, #(?:0|90)} 2 { target { aarch64*-*-* } } } } */ +/* { dg-final { scan-assembler-times {vcmla\.f16\tq[0-9]+, q[0-9]+, q[0-9]+, #(?:0|90)} 2 { target { arm*-*-* } } } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_1.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_1.c new file mode 
100644 index 0000000000000000000000000000000000000000..acc3fad76791d7038c3f96d5333b68ce9af99468 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_1.c @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */ +/* { dg-require-effective-target vect_complex_rot_df } */ +/* { dg-add-options arm_v8_3a_complex_neon } */ +/* { dg-additional-options "-Ofast -save-temps" } */ + +#define TYPE double +#define ROT * I +#include "vcmla-complex-autovec.c" + +extern void abort(void); + +#include + +int main() +{ + TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I}; + TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I}; + TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I}; + calc (a, b, c); + + if (creal (c[0]) != -7.5 || cimag (c[0]) != 1.5) + abort (); + + if (creal (c[1]) != -17.5 || cimag (c[1]) != -12.0) + abort (); + + return 0; +} + +/* { dg-final { scan-assembler-not {fcmla} { target { aarch64*-*-* } } } } */ +/* { dg-final { scan-assembler-not {vcmla\.} { target { arm*-*-* } } } } */ + diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_2.c new file mode 100644 index 0000000000000000000000000000000000000000..d913a192bce0b9d059297c4c2024814a59dabd0e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_2.c @@ -0,0 +1,30 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */ +/* { dg-require-effective-target vect_complex_rot_sf } */ +/* { dg-add-options arm_v8_3a_complex_neon } */ +/* { dg-additional-options "-Ofast -save-temps" } */ + +#define TYPE float +#define ROT * I +#include "vcmla-complex-autovec.c" + +extern void abort(void); + +int main() +{ + TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I}; + TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I}; + TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I}; + calc (a, 
b, c); + + if (creal (c[0]) != -7.5 || cimag (c[0]) != 1.5) + abort (); + + if (creal (c[1]) != -17.5 || cimag (c[1]) != -12.0) + abort (); + + return 0; +} + +/* { dg-final { scan-assembler-not {fcmla} { target { aarch64*-*-* } } } } */ +/* { dg-final { scan-assembler-not {vcmla\.} { target { arm*-*-* } } } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_3.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_3.c new file mode 100644 index 0000000000000000000000000000000000000000..08a77a8f8215db944d8d0438b310ce32f68a57ba --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcmla-complex_90_3.c @@ -0,0 +1,31 @@ +/* { dg-do run } */ +/* { dg-require-effective-target arm_v8_3a_complex_neon_ok } */ +/* { dg-require-effective-target vect_complex_rot_hf } */ +/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */ +/* { dg-add-options arm_v8_3a_complex_neon } */ +/* { dg-additional-options "-Ofast -march=armv8.3-a+fp16 -save-temps" } */ + +#define TYPE _Float16 +#define ROT * I +#include "vcmla-complex-autovec.c" + +extern void abort(void); + +int main() +{ + TYPE complex a[N] = {1.0 + 2.0 * I, 3.0 + 4.0 * I}; + TYPE complex b[N] = {4.0 + 2.0 * I, 1.5 + 4.5 * I}; + TYPE complex c[N] = {2.5 + 1.5 * I, 2.0 + 1.5 * I}; + calc (a, b, c); + + if (creal (c[0]) != -7.5 || cimag (c[0]) != 1.5) + abort (); + + if (creal (c[1]) != -17.5 || cimag (c[1]) != -12.0) + abort (); + + return 0; +} + +/* { dg-final { scan-assembler-not {fcmla} { target { aarch64*-*-* } } } } */ +/* { dg-final { scan-assembler-not {vcmla\.} { target { arm*-*-* } } } } */