From patchwork Fri Feb 19 14:41:29 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 1442258
Date: Fri, 19 Feb 2021 14:41:29 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] slp: fix sharing of SLP only patterns. (PR99149)
User-Agent: Mutt/1.9.4 (2018-02-28)
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Tamar Christina via Gcc-patches
From: Tamar Christina
Reply-To: Tamar Christina
Cc: nd@arm.com, rguenther@suse.de

Hi Richi,

The attached testcase ICEs due to a couple of issues.
In the testcase you have two SLP instances that share the majority of their
definition with each other.  One tree defines a COMPLEX_MUL sequence and the
other tree a COMPLEX_FMA.

The ICE happens because:

1. the refcounts are wrong, in particular the FMA case doesn't correctly count
   the references for the COMPLEX_MUL that it consumes.

2. when the FMA is created it incorrectly assumes it can just tear apart the MUL
   node that it's consuming.  This is wrong and should only be done when there
   are no more uses of the node, in which case the vector only pattern is no
   longer relevant.

To fix the last part the SLP only pattern reset code was moved into
vect_free_slp_tree, which results in cleaner code.  I also think it does belong
there since that function knows when there are no more uses of the node and so
the pattern should be unmarked, so when the vectorizer is inspecting the BB it
doesn't find the now invalid vector only patterns.

This has the obvious problem that, eventually after analysis is done, the
entire SLP tree is dissolved before codegen, which is where we get into trouble
as we have now dissolved the patterns too...

My initial thought was to add a parameter to vect_free_slp_tree, but I know you
wouldn't like that.  So I am sending this patch up as an RFC.

PS. This testcase actually shows that the codegen we get in these cases is not
optimal.  Currently this won't vectorize as the compiler thinks the vector
version is too expensive.

My guess here is that the patterns now unshare the tree and it's likely costing
the setup for the vector code twice?

Even with the shared code (without patterns, so same as GCC 10, or turning off
the patterns) it won't vectorize it.
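As an aside, to illustrate points 1 and 2 above: here is a minimal standalone sketch of the refcounting rule the fix relies on.  It is not GCC code; Node, make_node, free_node and rewrite_children are hypothetical stand-ins for slp_tree, node creation, vect_free_slp_tree and the ::build rewrites.  The point is only the ordering: take references on the operands you keep before releasing the old children, and only tear a node down when its last user is gone.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for an SLP node; this is not GCC's slp_tree.
struct Node {
  int refcnt = 1;                // one reference held by the creator/parent
  std::vector<Node *> children;  // each entry owns one reference on the child
};

static int live_nodes = 0;       // bookkeeping for the illustration only

// Create a node that takes over the caller's references on `kids`.
Node *make_node(std::vector<Node *> kids = {}) {
  Node *n = new Node;
  n->children = std::move(kids);
  ++live_nodes;
  return n;
}

// Drop one reference.  The node (and its hold on its children) is only
// torn down once no user is left.
void free_node(Node *n) {
  if (--n->refcnt != 0)
    return;
  for (Node *c : n->children)
    free_node(c);
  delete n;
  --live_nodes;
}

// Replace parent's children with `kept`.  A reference is taken on every
// kept node BEFORE the old children are released, so an operand reachable
// only through an old child cannot be freed underneath us.
void rewrite_children(Node *parent, const std::vector<Node *> &kept) {
  for (Node *k : kept)
    ++k->refcnt;
  std::vector<Node *> old = std::move(parent->children);
  parent->children = kept;
  for (Node *o : old)
    free_node(o);
}
```

With this ordering, an "FMA" parent that absorbs the operands of a "MUL" child only drops its own reference on the MUL; a second tree that still uses the MUL keeps it, and its operands, intact.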
The scalar code is

        mov     w0, 0
        ldr     x4, [x1, #:lo12:.LANCHOR0]
        ldrsw   x2, [x3, 20]
        ldr     x1, [x3, 8]
        lsl     x2, x2, 3
        ldrsw   x3, [x3, 16]
        ldp     s3, s2, [x4]
        add     x5, x1, x2
        ldr     s0, [x1, x2]
        lsl     x3, x3, 3
        add     x4, x1, x3
        fmul    s1, s2, s2
        fnmsub  s1, s3, s3, s1
        fmul    s0, s2, s0
        fmadd   s0, s3, s2, s0
        ldr     s3, [x1, x3]
        ldr     s2, [x4, 4]
        fadd    s3, s3, s1
        fadd    s2, s0, s2
        str     s3, [x1, x3]
        str     s2, [x4, 4]
        str     s1, [x1, x2]
        str     s0, [x5, 4]
        ret

and turning off the cost model we get

        movi    v1.2s, 0
        mov     w0, 0
        ldr     x4, [x1, #:lo12:.LANCHOR0]
        ldrsw   x3, [x2, 16]
        ldr     x1, [x2, 8]
        ldrsw   x2, [x2, 20]
        ldr     d0, [x4]
        fcmla   v1.2s, v0.2s, v0.2s, #0
        ldr     d2, [x1, x3, lsl 3]
        fcmla   v2.2s, v0.2s, v0.2s, #0
        fcmla   v2.2s, v0.2s, v0.2s, #90
        str     d2, [x1, x3, lsl 3]
        fcmla   v1.2s, v0.2s, v0.2s, #90
        str     d1, [x1, x2, lsl 3]

However, if the pattern matcher doesn't create the FMA node because it would
unshare the tree (which I think is a general heuristic that would work out to
better code, wouldn't it?) then we would get (even with the cost model enabled)

        movi    v0.2s, 0
        mov     w0, 0
        ldr     x4, [x1, #:lo12:.LANCHOR0]
        ldrsw   x3, [x2, 16]
        ldr     x1, [x2, 8]
        ldr     d1, [x4]
        fcmla   v0.2s, v1.2s, v1.2s, #0
        fcmla   v0.2s, v1.2s, v1.2s, #90
        ldrsw   x2, [x2, 20]
        ldr     d1, [x1, x3, lsl 3]
        fadd    v1.2s, v1.2s, v0.2s
        str     d1, [x1, x3, lsl 3]
        str     d0, [x1, x2, lsl 3]
        ret

which is the optimal form.  So I think this should perhaps be handled in GCC 12
if there's a way to detect when you're going to unshare a sub-tree.

Thanks,
Tamar

gcc/ChangeLog:

	PR tree-optimization/99149
	* tree-vect-slp-patterns.c (vect_detect_pair_op): Don't recreate the
	buffer.
	(vect_slp_reset_pattern): Remove.
	(complex_fma_pattern::matches): Remove call to vect_slp_reset_pattern.
	(complex_mul_pattern::build, complex_fma_pattern::build,
	complex_fms_pattern::build): Fix ref counts.
	* tree-vect-slp.c (vect_free_slp_tree): Undo SLP only pattern relevancy
	when node is being deleted.
	* tree-vectorizer.c (vec_info::new_stmt_vec_info): Initialize value.

gcc/testsuite/ChangeLog:

	PR tree-optimization/99149
	* gcc.dg/vect/pr99149.C: New test.

--- inline copy of patch --
diff --git a/gcc/testsuite/gcc.dg/vect/pr99149.C b/gcc/testsuite/gcc.dg/vect/pr99149.C
new file mode 100755
index 0000000000000000000000000000000000000000..b12fe17e4ded148ce2bf67486e425dd65461a148
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr99149.C
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-w -O3 -march=armv8.3-a" { target { aarch64*-*-* } } } */
+
+class a {
+  float b;
+  float c;
+
+public:
+  a(float d, float e) : b(d), c(e) {}
+  a operator+(a d) { return a(b + d.b, c + d.c); }
+  a operator*(a d) { return a(b * b - c * c, b * c + c * d.b); }
+};
+int f, g;
+class {
+  a *h;
+  a *i;
+
+public:
+  void j() {
+    a k = h[0], l = i[g], m = k * i[f];
+    i[g] = l + m;
+    i[f] = m;
+  }
+} n;
+main() { n.j(); }
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index f0817da9f622d22e3df2e30410d1cf610b4ffa1d..1e2769662a54229ab8e24390f97dfe206f17ab57 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -407,9 +407,8 @@ vect_detect_pair_op (slp_tree node1, slp_tree node2, lane_permutation_t &lanes,
 
   if (result != CMPLX_NONE && ops != NULL)
     {
-      ops->create (2);
-      ops->quick_push (node1);
-      ops->quick_push (node2);
+      ops->safe_push (node1);
+      ops->safe_push (node2);
     }
   return result;
 }
@@ -1090,15 +1089,17 @@ complex_mul_pattern::build (vec_info *vinfo)
 {
   slp_tree node;
   unsigned i;
+  slp_tree newnode
+    = vect_build_combine_node (this->m_ops[0], this->m_ops[1], *this->m_node);
+  SLP_TREE_REF_COUNT (this->m_ops[2])++;
+
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (*this->m_node), i, node)
     vect_free_slp_tree (node);
 
   /* First re-arrange the children.  */
   SLP_TREE_CHILDREN (*this->m_node).reserve_exact (2);
   SLP_TREE_CHILDREN (*this->m_node)[0] = this->m_ops[2];
-  SLP_TREE_CHILDREN (*this->m_node)[1] =
-    vect_build_combine_node (this->m_ops[0], this->m_ops[1], *this->m_node);
-  SLP_TREE_REF_COUNT (this->m_ops[2])++;
+  SLP_TREE_CHILDREN (*this->m_node)[1] = newnode;
 
   /* And then rewrite the node itself.  */
   complex_pattern::build (vinfo);
@@ -1133,18 +1134,6 @@ class complex_fma_pattern : public complex_pattern
     }
 };
 
-/* Helper function to "reset" a previously matched node and undo the changes
-   made enough so that the node is treated as an irrelevant node.  */
-
-static inline void
-vect_slp_reset_pattern (slp_tree node)
-{
-  stmt_vec_info stmt_info = vect_orig_stmt (SLP_TREE_REPRESENTATIVE (node));
-  STMT_VINFO_IN_PATTERN_P (stmt_info) = false;
-  STMT_SLP_TYPE (stmt_info) = pure_slp;
-  SLP_TREE_REPRESENTATIVE (node) = stmt_info;
-}
-
 /* Pattern matcher for trying to match complex multiply and accumulate
    and multiply and subtract patterns in SLP tree.
    If the operation matches then IFN is set to the operation it matched and
@@ -1208,15 +1197,6 @@ complex_fma_pattern::matches (complex_operation_t op,
   if (!vect_pattern_validate_optab (ifn, vnode))
     return IFN_LAST;
 
-  /* FMA matched ADD + CMUL.  During the matching of CMUL the
-     stmt that starts the pattern is marked as being in a pattern,
-     namely the CMUL.  When replacing this with a CFMA we have to
-     unmark this statement as being in a pattern.  This is because
-     vect_mark_pattern_stmts will only mark the current stmt as being
-     in a pattern.  Later on when the scalar stmts are examined the
-     old statement which is supposed to be irrelevant will point to
-     CMUL unless we undo the pattern relationship here.  */
-  vect_slp_reset_pattern (node);
   ops->truncate (0);
   ops->create (3);
 
@@ -1259,10 +1239,17 @@ complex_fma_pattern::recognize (slp_tree_to_load_perm_map_t *perm_cache,
 void
 complex_fma_pattern::build (vec_info *vinfo)
 {
+  slp_tree node = SLP_TREE_CHILDREN (*this->m_node)[1];
+
   SLP_TREE_CHILDREN (*this->m_node).release ();
   SLP_TREE_CHILDREN (*this->m_node).create (3);
   SLP_TREE_CHILDREN (*this->m_node).safe_splice (this->m_ops);
 
+  SLP_TREE_REF_COUNT (this->m_ops[1])++;
+  SLP_TREE_REF_COUNT (this->m_ops[2])++;
+
+  vect_free_slp_tree (node);
+
   complex_pattern::build (vinfo);
 }
 
@@ -1427,6 +1414,11 @@ complex_fms_pattern::build (vec_info *vinfo)
 {
   slp_tree node;
   unsigned i;
+  slp_tree newnode =
+    vect_build_combine_node (this->m_ops[2], this->m_ops[3], *this->m_node);
+  SLP_TREE_REF_COUNT (this->m_ops[0])++;
+  SLP_TREE_REF_COUNT (this->m_ops[1])++;
+
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (*this->m_node), i, node)
     vect_free_slp_tree (node);
 
@@ -1436,10 +1428,7 @@ complex_fms_pattern::build (vec_info *vinfo)
   /* First re-arrange the children.  */
   SLP_TREE_CHILDREN (*this->m_node).quick_push (this->m_ops[0]);
   SLP_TREE_CHILDREN (*this->m_node).quick_push (this->m_ops[1]);
-  SLP_TREE_CHILDREN (*this->m_node).quick_push (
-    vect_build_combine_node (this->m_ops[2], this->m_ops[3], *this->m_node));
-  SLP_TREE_REF_COUNT (this->m_ops[0])++;
-  SLP_TREE_REF_COUNT (this->m_ops[1])++;
+  SLP_TREE_CHILDREN (*this->m_node).quick_push (newnode);
 
   /* And then rewrite the node itself.  */
   complex_pattern::build (vinfo);
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ea8a97b01c6371791ac66de3e1dabfedee69cb67..65c2ff867ab41ea70367087dc26fb6eea1375ffb 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -146,6 +146,16 @@ vect_free_slp_tree (slp_tree node)
     if (child)
       vect_free_slp_tree (child);
 
+  /* If the node defines any SLP only patterns then those patterns are no
+     longer valid and should be removed.  */
+  stmt_vec_info rep_stmt_info = SLP_TREE_REPRESENTATIVE (node);
+  if (rep_stmt_info && STMT_VINFO_SLP_VECT_ONLY_PATTERN (rep_stmt_info))
+    {
+      stmt_vec_info stmt_info = vect_orig_stmt (rep_stmt_info);
+      //STMT_VINFO_IN_PATTERN_P (stmt_info) = false;
+      //STMT_SLP_TYPE (stmt_info) = STMT_SLP_TYPE (rep_stmt_info);
+    }
+
   delete node;
 }
 
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 5b45df3a4e00266b7530eb4da6985f0d940cb05b..63ba594f2276850a00fc372072d98326891f19e6 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -695,6 +695,7 @@ vec_info::new_stmt_vec_info (gimple *stmt)
   STMT_VINFO_REDUC_FN (res) = IFN_LAST;
   STMT_VINFO_REDUC_IDX (res) = -1;
   STMT_VINFO_SLP_VECT_ONLY (res) = false;
+  STMT_VINFO_SLP_VECT_ONLY_PATTERN (res) = false;
   STMT_VINFO_VEC_STMTS (res) = vNULL;
 
   if (is_a (this)