From patchwork Thu Nov 16 18:06:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1864862
Date: Thu, 16 Nov 2023 18:06:26 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH 01/11] rtl-ssa: Support for inserting new insns
List-Id: Gcc-patches mailing list

N.B. this is just a rebased (but otherwise unchanged) version of the
same patch already posted here:

https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633348.html

This is the only unreviewed dependency from the previous series, so it
seemed easier just to re-post it (not least to appease the pre-commit
CI).

-- >8 --

The upcoming aarch64 load pair pass needs to form store pairs, and can
re-order stores over loads when alias analysis determines this is safe.
In the case that both mem defs have uses in the RTL-SSA IR, and both
stores require re-ordering over their uses, we represent that as
(tentative) deletion of the original store insns and creation of a new
insn, to prevent requiring repeated re-parenting of uses during the
pass.  We then update all mem uses that require re-parenting in one go
at the end of the pass.

To support this, RTL-SSA needs to handle inserting new insns (rather
than just changing existing ones), so this patch adds support for that.

New insns (and new accesses) are temporaries, allocated above a temporary
obstack_watermark, such that the user can easily back out of a change
without awkward bookkeeping.

Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

gcc/ChangeLog:

	* rtl-ssa/accesses.cc (function_info::create_set): New.
	* rtl-ssa/accesses.h (access_info::is_temporary): New.
	* rtl-ssa/changes.cc (move_insn): Handle new (temporary) insns.
	(function_info::finalize_new_accesses): Handle new/temporary
	user-created accesses.
	(function_info::apply_changes_to_insn): Ensure m_is_temp flag
	on new insns gets cleared.
	(function_info::change_insns): Handle new/temporary insns.
	(function_info::create_insn): New.
	* rtl-ssa/changes.h (class insn_change): Make function_info a
	friend class.
	* rtl-ssa/functions.h (function_info): Declare new entry points:
	create_set, create_insn.  Declare new change_alloc helper.
	* rtl-ssa/insns.cc (insn_info::print_full): Identify temporary
	insns in dump.
	* rtl-ssa/insns.h (insn_info): Add new m_is_temp flag and
	accompanying is_temporary accessor.
	* rtl-ssa/internals.inl (insn_info::insn_info): Initialize
	m_is_temp to false.
	* rtl-ssa/member-fns.inl (function_info::change_alloc): New.
	* rtl-ssa/movement.h (restrict_movement_for_defs_ignoring): Add
	handling for temporary defs.
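
For readers unfamiliar with the watermark idiom, the "temporaries above a
watermark" idea described in the commit message can be modelled in
isolation.  The following is only a sketch under stated assumptions, not
the RTL-SSA implementation: toy_obstack, toy_watermark, toy_insn and the
free functions change_alloc and create_insn are hypothetical stand-ins
invented for illustration, and a plain bump allocator takes the place of
GCC's obstack.  It shows why abandoning an attempt needs no bookkeeping:
everything allocated after the watermark is released when the watermark
goes out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for an obstack: a bump allocator over a buffer.
class toy_obstack
{
public:
  explicit toy_obstack (std::size_t size) : m_buf (size), m_top (0) {}

  // Carve SIZE bytes with alignment ALIGN off the top of the buffer.
  void *alloc (std::size_t size, std::size_t align)
  {
    std::size_t start = (m_top + align - 1) / align * align;
    assert (start + size <= m_buf.size ());
    m_top = start + size;
    return m_buf.data () + start;
  }

  std::size_t top () const { return m_top; }
  void reset_to (std::size_t mark) { m_top = mark; }

private:
  std::vector<unsigned char> m_buf;
  std::size_t m_top;
};

// Hypothetical watermark: remembers the top of the obstack on creation
// and releases everything allocated above it on destruction.
class toy_watermark
{
public:
  explicit toy_watermark (toy_obstack &ob) : m_ob (ob), m_mark (ob.top ()) {}
  ~toy_watermark () { m_ob.reset_to (m_mark); }
  toy_obstack &obstack () { return m_ob; }

private:
  toy_obstack &m_ob;
  std::size_t m_mark;
};

// Hypothetical miniature of an insn record with a "temporary" flag.
struct toy_insn
{
  explicit toy_insn (int uid) : uid (uid), is_temp (false) {}
  int uid;
  bool is_temp;
};

// Allocate an object of type T above the watermark WM.  No destructor
// will ever run, hence the trivially-destructible requirement.
template<typename T, typename... Ts>
T *change_alloc (toy_watermark &wm, Ts... args)
{
  static_assert (std::is_trivially_destructible<T>::value,
                 "destructor won't be called");
  static_assert (alignof (T) <= alignof (std::max_align_t),
                 "too much alignment required");
  void *addr = wm.obstack ().alloc (sizeof (T), alignof (T));
  return new (addr) T (std::forward<Ts> (args)...);
}

// Create a temporary insn and mark it as such.
toy_insn *create_insn (toy_watermark &wm, int uid)
{
  toy_insn *insn = change_alloc<toy_insn> (wm, uid);
  insn->is_temp = true;
  return insn;
}

// Start an attempt, create a temporary insn, then abandon the attempt.
// Returns the number of bytes still in use afterwards.
std::size_t bytes_after_abandoned_attempt ()
{
  toy_obstack ob (256);
  {
    toy_watermark attempt (ob);
    toy_insn *insn = create_insn (attempt, 42);
    assert (insn->is_temp && ob.top () >= sizeof (toy_insn));
  } // ATTEMPT is destroyed here, releasing the temporary insn.
  return ob.top ();
}
```

In this model, committing a change would mean copying the temporary into
longer-lived storage and clearing is_temp before the watermark is
released.  That is roughly the role the commit message and diff give to
finalize_new_accesses and change_insns in the real code.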
---
 gcc/rtl-ssa/accesses.cc    | 10 ++++++
 gcc/rtl-ssa/accesses.h     |  4 +++
 gcc/rtl-ssa/changes.cc     | 74 +++++++++++++++++++++++++++++++-------
 gcc/rtl-ssa/changes.h      |  2 ++
 gcc/rtl-ssa/functions.h    | 14 ++++++++
 gcc/rtl-ssa/insns.cc       |  5 +++
 gcc/rtl-ssa/insns.h        |  7 +++-
 gcc/rtl-ssa/internals.inl  |  1 +
 gcc/rtl-ssa/member-fns.inl | 12 +++++++
 gcc/rtl-ssa/movement.h     |  8 ++++-
 10 files changed, 123 insertions(+), 14 deletions(-)

diff --git a/gcc/rtl-ssa/accesses.cc b/gcc/rtl-ssa/accesses.cc
index 510545a8bad..76d70fd8bd3 100644
--- a/gcc/rtl-ssa/accesses.cc
+++ b/gcc/rtl-ssa/accesses.cc
@@ -1456,6 +1456,16 @@ function_info::make_uses_available (obstack_watermark &watermark,
   return use_array (new_uses, num_uses);
 }
 
+set_info *
+function_info::create_set (obstack_watermark &watermark,
+                           insn_info *insn,
+                           resource_info resource)
+{
+  auto set = change_alloc<set_info> (watermark, insn, resource);
+  set->m_is_temp = true;
+  return set;
+}
+
 // Return true if ACCESS1 can represent ACCESS2 and if ACCESS2 can
 // represent ACCESS1.
 static bool
diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
index fce31d46717..7e7a90ece97 100644
--- a/gcc/rtl-ssa/accesses.h
+++ b/gcc/rtl-ssa/accesses.h
@@ -204,6 +204,10 @@ public:
   // in the main instruction pattern.
   bool only_occurs_in_notes () const { return m_only_occurs_in_notes; }
 
+  // Return true if this is a temporary access, e.g. one created for
+  // an insn that is about to be inserted.
+  bool is_temporary () const { return m_is_temp; }
+
 protected:
   access_info (resource_info, access_kind);
 
diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
index aab532b9f26..da2a61d701a 100644
--- a/gcc/rtl-ssa/changes.cc
+++ b/gcc/rtl-ssa/changes.cc
@@ -394,14 +394,20 @@ move_insn (insn_change &change, insn_info *after)
   // At the moment we don't support moving instructions between EBBs,
   // but this would be worth adding if it's useful.
   insn_info *insn = change.insn ();
-  gcc_assert (after->ebb () == insn->ebb ());
+
   bb_info *bb = after->bb ();
   basic_block cfg_bb = bb->cfg_bb ();
 
-  if (insn->bb () != bb)
-    // Force DF to mark the old block as dirty.
-    df_insn_delete (rtl);
-  ::remove_insn (rtl);
+  if (!insn->is_temporary ())
+    {
+      gcc_assert (after->ebb () == insn->ebb ());
+
+      if (insn->bb () != bb)
+        // Force DF to mark the old block as dirty.
+        df_insn_delete (rtl);
+      ::remove_insn (rtl);
+    }
+
   ::add_insn_after (rtl, after_rtl, cfg_bb);
 }
 
@@ -439,10 +445,15 @@ function_info::finalize_new_accesses (insn_change &change, insn_info *pos)
           gcc_assert (def);
           if (def->m_is_temp)
             {
-              // At present, the only temporary instruction definitions we
-              // create are clobbers, such as those added during recog.
-              gcc_assert (is_a<clobber_info *> (def));
-              def = allocate<clobber_info> (change.insn (), ref.regno);
+              if (is_a<clobber_info *> (def))
+                def = allocate<clobber_info> (change.insn (), ref.regno);
+              else if (is_a<set_info *> (def))
+                {
+                  def->m_is_temp = false;
+                  def = allocate<set_info> (change.insn (), def->resource ());
+                }
+              else
+                gcc_unreachable ();
             }
           else if (!def->m_has_been_superceded)
             {
@@ -511,7 +522,9 @@ function_info::finalize_new_accesses (insn_change &change, insn_info *pos)
   unsigned int i = 0;
   for (use_info *use : change.new_uses)
     {
-      if (!use->m_has_been_superceded)
+      if (use->m_is_temp)
+        use->m_has_been_superceded = true;
+      else if (!use->m_has_been_superceded)
         {
           use = allocate_temp<use_info> (insn, use->resource (), use->def ());
           use->m_has_been_superceded = true;
@@ -645,6 +658,8 @@ function_info::apply_changes_to_insn (insn_change &change)
 
       insn->set_accesses (builder.finish ().begin (), num_defs, num_uses);
     }
+
+  insn->m_is_temp = false;
 }
 
 // Add a temporary placeholder instruction after AFTER.
@@ -677,7 +692,8 @@ function_info::change_insns (array_slice<insn_change *> changes)
       if (!change->is_deletion ())
         {
           // Remove any notes that are no longer relevant.
-          update_notes (change->rtl ());
+          if (!change->insn ()->m_is_temp)
+            update_notes (change->rtl ());
 
           // Make sure that the placement of this instruction would still
           // leave room for previous instructions.
@@ -686,6 +702,17 @@ function_info::change_insns (array_slice<insn_change *> changes)
             // verify_insn_changes is supposed to make sure that this holds.
             gcc_unreachable ();
           min_insn = later_insn (min_insn, change->move_range.first);
+
+          if (change->insn ()->m_is_temp)
+            {
+              change->m_insn = allocate<insn_info> (change->insn ()->bb (),
+                                                    change->rtl (),
+                                                    change->insn_uid ());
+
+              // Set the flag again so subsequent logic is aware.
+              // It will be cleared later on.
+              change->m_insn->m_is_temp = true;
+            }
         }
     }
 
@@ -784,7 +811,8 @@ function_info::change_insns (array_slice<insn_change *> changes)
               // Remove the placeholder first so that we have a wider range of
               // program points when inserting INSN.
               insn_info *after = placeholder->prev_any_insn ();
-              remove_insn (insn);
+              if (!insn->is_temporary ())
+                remove_insn (insn);
               remove_insn (placeholder);
               insn->set_bb (after->bb ());
               add_insn_after (insn, after);
@@ -1105,6 +1133,28 @@ function_info::perform_pending_updates ()
   return changed_cfg;
 }
 
+insn_info *
+function_info::create_insn (obstack_watermark &watermark,
+                            rtx_code insn_code,
+                            rtx pat)
+{
+  rtx_insn *rti = nullptr;
+
+  // TODO: extend, move in to emit-rtl.cc.
+  switch (insn_code)
+    {
+    case INSN:
+      rti = make_insn_raw (pat);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  auto insn = change_alloc<insn_info> (watermark, nullptr, rti, INSN_UID (rti));
+  insn->m_is_temp = true;
+  return insn;
+}
+
 // Print a description of CHANGE to PP.
 void
 rtl_ssa::pp_insn_change (pretty_printer *pp, const insn_change &change)
diff --git a/gcc/rtl-ssa/changes.h b/gcc/rtl-ssa/changes.h
index d56e3a646e2..d91cf432afe 100644
--- a/gcc/rtl-ssa/changes.h
+++ b/gcc/rtl-ssa/changes.h
@@ -32,6 +32,8 @@ namespace rtl_ssa {
 // something that we might do.
 class insn_change
 {
+  friend class function_info;
+
 public:
   enum delete_action { DELETE };
 
diff --git a/gcc/rtl-ssa/functions.h b/gcc/rtl-ssa/functions.h
index ecb40fdaf57..4ffd3fa44e2 100644
--- a/gcc/rtl-ssa/functions.h
+++ b/gcc/rtl-ssa/functions.h
@@ -68,6 +68,16 @@ public:
   // Return the SSA information for CFG_BB.
   bb_info *bb (basic_block cfg_bb) const { return m_bbs[cfg_bb->index]; }
 
+  // Create a temporary def.
+  set_info *create_set (obstack_watermark &watermark,
+                        insn_info *insn,
+                        resource_info resource);
+
+  // Create a temporary insn with code INSN_CODE and pattern PAT.
+  insn_info *create_insn (obstack_watermark &watermark,
+                          rtx_code insn_code,
+                          rtx pat);
+
   // Return a list of all the instructions in the function, in reverse
   // postorder.  The list includes both real and artificial instructions.
   //
@@ -195,6 +205,10 @@ public:
   // Print the contents of the function to PP.
   void print (pretty_printer *pp) const;
 
+  // Allocate an object of type T above the obstack watermark WM.
+  template<typename T, typename... Ts>
+  T *change_alloc (obstack_watermark &wm, Ts... args);
+
 private:
   class bb_phi_info;
   class build_info;
diff --git a/gcc/rtl-ssa/insns.cc b/gcc/rtl-ssa/insns.cc
index 5fde3f2bb4b..2fa48e0dacd 100644
--- a/gcc/rtl-ssa/insns.cc
+++ b/gcc/rtl-ssa/insns.cc
@@ -192,6 +192,11 @@ insn_info::print_full (pretty_printer *pp) const
           pp_newline_and_indent (pp, 0);
           pp_string (pp, "has volatile refs");
         }
+      if (m_is_temp)
+        {
+          pp_newline_and_indent (pp, 0);
+          pp_string (pp, "temporary");
+        }
     }
   pp_indentation (pp) -= 2;
 }
diff --git a/gcc/rtl-ssa/insns.h b/gcc/rtl-ssa/insns.h
index a604fe295cd..6d0506706ad 100644
--- a/gcc/rtl-ssa/insns.h
+++ b/gcc/rtl-ssa/insns.h
@@ -306,6 +306,8 @@ public:
   // Print a full description of the instruction.
   void print_full (pretty_printer *) const;
 
+  bool is_temporary () const { return m_is_temp; }
+
 private:
   // The first-order way of representing the order between instructions
   // is to assign "program points", with higher point numbers coming
@@ -414,8 +416,11 @@ private:
   unsigned int m_has_pre_post_modify : 1;
   unsigned int m_has_volatile_refs : 1;
 
+  // Indicates the insn is a temporary / new user-allocated insn.
+  unsigned int m_is_temp : 1;
+
   // For future expansion.
-  unsigned int m_spare : 27;
+  unsigned int m_spare : 26;
 
   // The program point at which the instruction occurs.
   //
diff --git a/gcc/rtl-ssa/internals.inl b/gcc/rtl-ssa/internals.inl
index e49297c12b3..907c4504352 100644
--- a/gcc/rtl-ssa/internals.inl
+++ b/gcc/rtl-ssa/internals.inl
@@ -415,6 +415,7 @@ inline insn_info::insn_info (bb_info *bb, rtx_insn *rtl, int cost_or_uid)
     m_is_asm (false),
     m_has_pre_post_modify (false),
     m_has_volatile_refs (false),
+    m_is_temp (false),
     m_spare (0),
     m_point (0),
     m_cost_or_uid (cost_or_uid),
diff --git a/gcc/rtl-ssa/member-fns.inl b/gcc/rtl-ssa/member-fns.inl
index ce2db045b78..b8940ca5566 100644
--- a/gcc/rtl-ssa/member-fns.inl
+++ b/gcc/rtl-ssa/member-fns.inl
@@ -962,4 +962,16 @@ function_info::add_regno_clobber (obstack_watermark &watermark,
   return true;
 }
 
+template<typename T, typename... Ts>
+inline T *
+function_info::change_alloc (obstack_watermark &wm, Ts... args)
+{
+  static_assert (std::is_trivially_destructible<T>::value,
+                 "destructor won't be called");
+  static_assert (alignof (T) <= obstack_alignment,
+                 "too much alignment required");
+  void *addr = XOBNEW (wm, T);
+  return new (addr) T (std::forward<Ts> (args)...);
+}
+
 }
diff --git a/gcc/rtl-ssa/movement.h b/gcc/rtl-ssa/movement.h
index ec076db406f..41226dd3666 100644
--- a/gcc/rtl-ssa/movement.h
+++ b/gcc/rtl-ssa/movement.h
@@ -182,6 +182,11 @@ restrict_movement_for_defs_ignoring (insn_range_info &move_range,
 {
   for (def_info *def : defs)
     {
+      // Skip fresh defs that are being inserted, as these shouldn't
+      // constrain movement.
+ if (def->is_temporary ()) + continue; + // If the definition is a clobber, we can move it with respect // to other clobbers. // @@ -247,7 +252,8 @@ restrict_movement_for_defs_ignoring (insn_range_info &move_range, // Make sure that we don't move stores between basic blocks, since we // don't have enough information to tell whether it's safe. - if (def_info *def = memory_access (defs)) + def_info *def = memory_access (defs); + if (def && !def->is_temporary ()) { move_range = move_later_than (move_range, def->bb ()->head_insn ()); move_range = move_earlier_than (move_range, def->bb ()->end_insn ()); From patchwork Thu Nov 16 18:07:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Coplan X-Patchwork-Id: 1864864 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=zpK0K/BS; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=zpK0K/BS; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SWSgC55MJz1yRV for ; Fri, 17 Nov 2023 05:07:35 +1100 (AEDT) Received: from 
server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E5401387103F for ; Thu, 16 Nov 2023 18:07:32 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR04-DB3-obe.outbound.protection.outlook.com (mail-db3eur04on2043.outbound.protection.outlook.com [40.107.6.43]) by sourceware.org (Postfix) with ESMTPS id 9942A3875DD2 for ; Thu, 16 Nov 2023 18:07:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9942A3875DD2 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 9942A3875DD2 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=40.107.6.43 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1700158041; cv=pass; b=KiQWnWKWtXXQ8infUWY0e2Qt59xKt/ccDhoBcef7Dc0gH6VxYqC90h6D6b+xfHUIHPXjgiqmHiTfg1OqqMlDSp/A3MV10TQTZnTF7BO0ps/r1Qa9+2IHVHnkldtbfBNp3TCrVyzidaTupEFjSRWmJcAmA09S2GZWtz/sJ+ofElM= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1700158041; c=relaxed/simple; bh=o7w+OzLe5Ys55K5crE6SrG/DXYiI9OurAoFawR+xGCM=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=xcb+UK0hJplyKsiiolVwuF9CRoLtzA2bDQ32DtmKvb7+WAwEVlCsNP/C8REJPf8SUxZdcR7DWJ/hsFsl9XnLFcblQgDqlvxpVM6Br4JtoaGsDmDIeFsgtt4siSDdT5EEVrCnv04ja0ty1SScPGCWo2zvaUN9e71jFEL3lY90Xkc= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=kB0EmaYOXgh5kPHZ96sdQE7ci0ukF8mVEelyiPbSAp05m2OLI6JP8Tr2aeG1hnBA1AtY4xMMY1FEfi4kXjJkVvGDO60dxBHZ3eaxZQfGzepwLIEktLIdjgLJfndJTfVgTxYhk25xD80uN7aw1kY3v0rBdt2fYwkv8RXVwejvlAh07yLyoKeQZg0z4Pr9eDgqcYz/lTs3tAFhKYJVqZ/FRWCJBU7/k2yySmwNHfSAZvBLbuBizNEwFcNScV9os2VQGmUvw4LDJh6xld0YqfhrmpslZ4Bu+BCVd2jGq7PbT950hGxskhXQLpfBgoh0mOn8ktvf5PyAJGGahvfgnUQQag== ARC-Message-Signature: i=2; a=rsa-sha256; 
c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=PRUTHX9d+6OLzzfU5ld4LGePund9c6BKeSXUPeivWV0=; b=GJwS59fXtKHaKaxr+h6zJTqhf1+fwyCZUlb6m3g3vozpeBFFQig5j4gGaAt4iVTRkyrn5yLIyKIi6YN53ITpTaTG1dNZnyuUpt2djo6NjhGC/doggXHt8w9VyR4pp0rHvzHaqdbsOnG9KSxnbWdYbPrSGqoET7lPJmHF38QM2qVl/g7slcTPSEFg/399a4QuqMsIVD2KLAJaj0NEwtFpKXDBhe/fkBwlruHZO9TUnR9Rry1agNRZuJCMft7qX8FMpF24/kvY85H0ietX7Cctiul+ARs/6VCudOmQGIvFDm2IFK8KVqxVlQH8VPRlclKFNSQAORZV14dpaiZUyJHOlg== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=PRUTHX9d+6OLzzfU5ld4LGePund9c6BKeSXUPeivWV0=; b=zpK0K/BSRaLTrugYOrJqkrDidCYsOs2tX9ZrG2VOXJhJRXmo8aagYLEwgF0rUcDmMKJ+QPb0yGn9z0p4rBGm+KuNH7DUneuk5rVWTUXIkzj7FVJ01MT/jTH2SzH7oAtOzQI5OmDP1tdCOHzvfA5P0hV5Bp59WFR3cQiZtDSgIvY= Received: from AM6PR05CA0036.eurprd05.prod.outlook.com (2603:10a6:20b:2e::49) by VI0PR08MB10780.eurprd08.prod.outlook.com (2603:10a6:800:204::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.21; Thu, 16 Nov 2023 18:07:18 +0000 Received: from AMS0EPF000001B4.eurprd05.prod.outlook.com (2603:10a6:20b:2e:cafe::2b) by AM6PR05CA0036.outlook.office365.com (2603:10a6:20b:2e::49) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.21 via Frontend 
Transport; Thu, 16 Nov 2023 18:07:18 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AMS0EPF000001B4.mail.protection.outlook.com (10.167.16.168) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.19 via Frontend Transport; Thu, 16 Nov 2023 18:07:18 +0000 Received: ("Tessian outbound 26ee1d40577c:v228"); Thu, 16 Nov 2023 18:07:17 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 9a0625d862b149a2 X-CR-MTA-TID: 64aa7808 Received: from 12a0f1343511.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 08BDA194-F602-4904-8DCD-EAB765542858.1; Thu, 16 Nov 2023 18:07:06 +0000 Received: from EUR05-AM6-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 12a0f1343511.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Thu, 16 Nov 2023 18:07:06 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=HIxCBpDLKC0PfyxRruNvatfAj5jvUdQKxiYMXup/hhsJlJES3o92UBaSARhM73KBVaUknpFeC2hxaPNbhc5QVCs6Oxjjl1ps5pPTaVm77jUdc+v+kZYoocWgRfgAN+IyrupaZubpG4Maz9LeJjp+obyYB9Jpo+UTzCNTYUhmpWlY8s5FerOVA1elPagDtRsW3ZrQrL1DvlG+dayLhouV/gR1eJ/APhen4VPCDz689+xZb5La696ZblfMqGjilm60xN9GG6MoIVKRe7hLkGk85YLQxj8mmfy8/u7jLIjaRD6+hhT1wn2YQPZW0hwunSCfoBopWABZLwkDbLOsFd2QzQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
Date: Thu, 16 Nov 2023 18:07:02 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH 02/11] rtl-ssa: Add some helpers for removing accesses

This adds some helpers to access-utils.h for removing accesses from an
access_array.  This is needed by the upcoming aarch64 load/store pair
fusion pass.

Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

gcc/ChangeLog:

        * rtl-ssa/access-utils.h (filter_accesses): New.
        (remove_regno_access): New.
        (check_remove_regno_access): New.
---
 gcc/rtl-ssa/access-utils.h | 42 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/gcc/rtl-ssa/access-utils.h b/gcc/rtl-ssa/access-utils.h
index f078625babf..31259d742d9 100644
--- a/gcc/rtl-ssa/access-utils.h
+++ b/gcc/rtl-ssa/access-utils.h
@@ -78,6 +78,48 @@ drop_memory_access (T accesses)
   return T (arr.begin (), accesses.size () - 1);
 }
 
+// Filter ACCESSES to return an access_array of only those accesses that
+// satisfy PREDICATE.  Allocate the new array above WATERMARK.
+template<typename T, typename FilterPredicate>
+inline T
+filter_accesses (obstack_watermark &watermark,
+                 T accesses,
+                 FilterPredicate predicate)
+{
+  access_array_builder builder (watermark);
+  builder.reserve (accesses.size ());
+  auto it = accesses.begin ();
+  auto end = accesses.end ();
+  for (; it != end; it++)
+    if (predicate (*it))
+      builder.quick_push (*it);
+  return T (builder.finish ());
+}
+
+// Given an array of ACCESSES, remove any access with regno REGNO.
+// Allocate the new access array above WATERMARK.
+template<typename T>
+inline T
+remove_regno_access (obstack_watermark &watermark,
+                     T accesses, unsigned int regno)
+{
+  using Access = decltype (accesses[0]);
+  auto pred = [regno](Access a) { return a->regno () != regno; };
+  return filter_accesses (watermark, accesses, pred);
+}
+
+// As above, but additionally check that we actually did remove an access.
+template<typename T>
+inline T
+check_remove_regno_access (obstack_watermark &watermark,
+                           T accesses, unsigned regno)
+{
+  auto orig_size = accesses.size ();
+  auto result = remove_regno_access (watermark, accesses, regno);
+  gcc_assert (result.size () < orig_size);
+  return result;
+}
+
 // If sorted array ACCESSES includes a reference to REGNO, return the
 // access, otherwise return null.
 template

From patchwork Thu Nov 16 18:07:24 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1864867
Date: Thu, 16 Nov 2023 18:07:24 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH 03/11] aarch64, testsuite: Fix up auto-init-padding tests

The tests currently depend on memcpy lowering forming stps at -O0, but we
no longer want to form stps during memcpy lowering; they should instead be
formed in the load/store pair fusion pass.

This patch therefore tweaks the affected tests to enable optimizations
(-O, i.e. -O1), and adjusts the tests where necessary so that parts of the
structures are not optimized away.

OK for trunk?

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/auto-init-padding-1.c: Add -O to options,
        adjust test to work with optimizations enabled.
        * gcc.target/aarch64/auto-init-padding-2.c: Add -O to options.
        * gcc.target/aarch64/auto-init-padding-3.c: Add -O to options,
        adjust test to work with optimizations enabled.
        * gcc.target/aarch64/auto-init-padding-4.c: Likewise.
        * gcc.target/aarch64/auto-init-padding-9.c: Likewise.
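
For readers unfamiliar with the adjustment described above, here is a minimal
sketch (not part of the patch) of why letting the variable's address escape
keeps the whole initialization alive once optimization is enabled.  The
struct mirrors the one in auto-init-padding-1.c.  The names
count_zero_bytes, foo_returns_field and foo_escapes are hypothetical, and the
explicit memset is an assumption standing in for the initialization that
-ftrivial-auto-var-init=zero would insert implicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as the struct in auto-init-padding-1.c: 16 bytes of fields
   and internal padding, rounded up to 64 bytes by the alignment.  */
struct test_aligned {
  int internal1;
  long long internal2;
} __attribute__ ((aligned (64)));

/* Hypothetical stand-in for the external `bar' that the tests only declare.
   It is kept out of line so the caller cannot see which bytes it reads.  */
__attribute__ ((noinline)) size_t
count_zero_bytes (const struct test_aligned *p)
{
  assert (p != NULL);
  const unsigned char *bytes = (const unsigned char *) p;
  size_t zeros = 0;
  for (size_t i = 0; i < sizeof *p; i++)
    zeros += (bytes[i] == 0);
  return zeros;
}

/* Old test shape: only one field is observable, so with optimization the
   compiler may drop the stores that initialize everything else.  */
int
foo_returns_field (void)
{
  struct test_aligned var;
  memset (&var, 0, sizeof var);  /* explicit stand-in for the auto-init */
  return var.internal1;
}

/* New test shape: the address escapes to an out-of-line callee, so every
   byte, padding included, has to be initialized before the call.  */
size_t
foo_escapes (void)
{
  struct test_aligned var;
  memset (&var, 0, sizeof var);  /* explicit stand-in for the auto-init */
  return count_zero_bytes (&var);
}
```

In the real tests bar is only declared, never defined, because they are
compile-only and just scan the generated assembly for stp instructions.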
---
 gcc/testsuite/gcc.target/aarch64/auto-init-padding-1.c | 8 +++++---
 gcc/testsuite/gcc.target/aarch64/auto-init-padding-2.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/auto-init-padding-3.c | 7 ++++---
 gcc/testsuite/gcc.target/aarch64/auto-init-padding-4.c | 4 ++--
 gcc/testsuite/gcc.target/aarch64/auto-init-padding-9.c | 7 ++++---
 5 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-1.c b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-1.c
index c747ebdcdf7..7027454dc74 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-1.c
@@ -1,17 +1,19 @@
 /* Verify zero initialization for structure type automatic variables with
    padding.  */
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=zero" } */
+/* { dg-options "-O -ftrivial-auto-var-init=zero" } */
 
 struct test_aligned {
         int internal1;
         long long internal2;
 } __attribute__ ((aligned(64)));
 
-int foo ()
+void bar (struct test_aligned *);
+
+void foo ()
 {
   struct test_aligned var;
-  return var.internal1;
+  bar(&var);
 }
 
 /* { dg-final { scan-assembler-times {stp\tq[0-9]+, q[0-9]+,} 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-2.c b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-2.c
index 6e280904da1..d3b6591c9b0 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-2.c
@@ -1,7 +1,7 @@
 /* Verify pattern initialization for structure type automatic variables with
    padding.  */
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=pattern" } */
+/* { dg-options "-O -ftrivial-auto-var-init=pattern" } */
 
 struct test_aligned {
         int internal1;
diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-3.c b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-3.c
index 9ddea58b468..aad4bb8944f 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-3.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-3.c
@@ -1,7 +1,7 @@
 /* Verify zero initialization for nested structure type automatic variables with
    padding.  */
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=zero" } */
+/* { dg-options "-O -ftrivial-auto-var-init=zero" } */
 
 struct test_aligned {
         unsigned internal1;
@@ -16,11 +16,12 @@ struct test_big_hole {
         struct test_aligned four;
 } __attribute__ ((aligned(64)));
 
+void bar (struct test_big_hole *);
 
-int foo ()
+void foo ()
 {
   struct test_big_hole var;
-  return var.four.internal1;
+  bar (&var);
 }
 
 /* { dg-final { scan-assembler-times {stp\tq[0-9]+, q[0-9]+,} 4 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-4.c b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-4.c
index 75bba82ed34..efd310f054d 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-4.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-4.c
@@ -1,7 +1,7 @@
 /* Verify pattern initialization for nested structure type automatic variables with
    padding.  */
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=pattern" } */
+/* { dg-options "-O -ftrivial-auto-var-init=pattern" } */
 
 struct test_aligned {
         unsigned internal1;
@@ -23,4 +23,4 @@ int foo ()
   return var.four.internal1;
 }
 
-/* { dg-final { scan-assembler-times {stp\tq[0-9]+, q[0-9]+,} 5 } } */
+/* { dg-final { scan-assembler-times {stp\tq[0-9]+, q[0-9]+,} 4 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-9.c b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-9.c
index 0f1930f813e..64ed8f11fe6 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-9.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-9.c
@@ -1,7 +1,7 @@
 /* Verify zero initialization for array type with structure element with
    padding.  */
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=zero" } */
+/* { dg-options "-O -ftrivial-auto-var-init=zero" } */
 
 struct test_trailing_hole {
         int one;
@@ -11,11 +11,12 @@ struct test_trailing_hole {
         /* "sizeof(unsigned long) - 1" byte padding hole here. */
 };
 
+void bar (void *);
 
-int foo ()
+void foo ()
 {
   struct test_trailing_hole var[10];
-  return var[2].four;
+  bar (var);
 }
 
 /* { dg-final { scan-assembler-times {stp\tq[0-9]+, q[0-9]+,} 5 } } */

From patchwork Thu Nov 16 18:07:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1864869
b=EIsZOb2Hs3qoZiFD6q6eLslsBxAwmn/iN9Y93SNdpe8mqnZQ7wiiWGfn6uHbqBvpdycFB0jtrb9LUBFf+BhbyTSowOXNxB4fr8fgjuZRK36qXakXv/hucS3qggig+1y28QOLa58vkBeuUSoPi7qtL5wlpsmZDMfMBtscg+/CPLVjB/9qKrDXs5x40L46IAnyl8fgoaNIVCww42DnERtiXBjkDEttGRn2y7wM+5exXYA3Vrb+P0by7TmwuGvGSkL0JczTCoysBvlFH0QI3tpQnG26qE/CWCTwd00BF3hStM4tfuVxv0lmY+YJqSsMv4eQC5PMOaydSyAWav/w/GmlJg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=9c6Ixd+IrQ8ENOLiNwOLPzKkYLv45CAL438oJOL6MgA=; b=G92PhthqIMux3JBGnuHqAVN9qRZKEzz8mPl7kS3LyGwamm2Ir+tIq62uG9IOHJYx3PPyIOJbF/NXBFZaX7X8G+E+zud68D5dz17M1JoqPHLa1inmV2ywvVwn6UPYil6Ge39oFKwfSfVVt4OXyN5SjCF+QEPfN9ZflzA417OVqTk= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) by GV1PR08MB7378.eurprd08.prod.outlook.com (2603:10a6:150:22::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.21; Thu, 16 Nov 2023 18:07:51 +0000 Received: from PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::8512:cc10:24d4:1919]) by PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::8512:cc10:24d4:1919%5]) with mapi id 15.20.6977.029; Thu, 16 Nov 2023 18:07:51 +0000 Date: Thu, 16 Nov 2023 18:07:47 +0000 From: Alex Coplan To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford , Kyrylo Tkachov Subject: [PATCH 04/11] aarch64, testsuite: Allow ldp/stp on SVE regs with -msve-vector-bits=128 Message-ID: Content-Disposition: inline X-ClientProxiedBy: LO6P265CA0011.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:339::17) To PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) MIME-Version: 1.0 
X-MS-TrafficTypeDiagnostic: PAWPR08MB8958:EE_|GV1PR08MB7378:EE_|DB5PEPF00014B97:EE_|DU0PR08MB9631:EE_ X-MS-Office365-Filtering-Correlation-Id: b7130b10-de7b-4d20-dfce-08dbe6cef833 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: hkb8BVzAAcMs0aoUyOsBFBV/Sj7H7XTa3LAfkX6w4YQ0mgqAqSvHCGYr5bjMPSOUp7ktYwSXRw/2QypxPmT60ypxE+86mze0OympX9jfos/X0xpOG4ib0lq7DAV15cxHllEIm6+UnlM/V0AFrdVMVWxN8OlsPmYJZuAc0UBn+7So8N/ki2VgOljvgauncxkYnJr/KXaN0b/+3371kbvLn1LJMJoG0tvUlHpLnxZr19Xc1TCrT3F/N3XUlTuZdKmvMTsixDVQNDWbFu2MGhRtTWatFuY2m6y2vpGTtsKQtaKdRY7K+6lSIFwu1sj7bH4HzZTQvpkN72RjvUY+Y6qpWn4vtipb5TWgxuKVC73xYhV2ryE7PFjJmhdWOUchttaUebP542LgDOQxdhnq7g5WAbHkNhZFGBez+UqpkCZRSz/HmABLq5qdJDFRE9T0T3aW+yXc6hyPRJFfSQyHt7prswhRjceVRqWgpOi1yJk/awdxEAoiwctUsGBqgRNzieKBY/Mnrl/SMk+6kHnUJHiV4AM/AhvAXEZP3+mrF9yDZCby7pn26Tukg+248nQMz3Rk+XhiJBSTQBYgMC/Hwc2XMGi1uifEst1X1U7xecZk2dNL2kzrh5jACOQv4jDpN69CIdjdgn9KJDrkoCFGuZqGuQ== X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:PAWPR08MB8958.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230031)(366004)(136003)(39860400002)(376002)(346002)(396003)(230922051799003)(1800799009)(64100799003)(186009)(451199024)(2906002)(84970400001)(41300700001)(316002)(66476007)(66556008)(54906003)(6916009)(36756003)(8936002)(8676002)(66946007)(4326008)(38100700002)(478600001)(235185007)(5660300002)(44832011)(6486002)(86362001)(83380400001)(26005)(6512007)(2616005)(6666004)(6506007)(44144004)(33964004)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV1PR08MB7378 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5PEPF00014B97.eurprd02.prod.outlook.com X-MS-PublicTrafficType: Email 
X-MS-Office365-Filtering-Correlation-Id-Prvs: 59f1e2b5-2195-4028-9459-08dbe6cef2a1 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: KEpBo4MBp1UoUJk2UJ2AbgrVo2DCekP8oQsVWGqkPGP+obcsYpPMWZVr6ZM9ISjJbDXRD4C/YV0fxxSwAPnPyCXDbuwXw9BYwzg1z+6fcdR9M0SL5ZDfVyIYX4GZb5+GzAfteBId0+92Ecx8WCcBUhqbyQHVxSQk918OysZOAyMdxRaXaDzndTexnfE5uMhCp3bggSBvnqQh9dLv5usnPDtEcGUpdzFttRVzR2aCI9COIOwNTpxZnIlbgSb4mIix13h6Pwk6jNaya3WOWpQp4BrKkN3JOK4igcNz690P15F2135P331ZUmkxIuZNNI7PCBK0imMz7hkN3lAbxDWfttEaBGFMI5+RDCc1oHqwMZBpxl9GE3cTEjrSi44wjbG3UV+/v9rzYlfxD6zS7h7L3sUqXW15a1hxHYQ7/n+kZsYzIb8y+cmTsfbX4LnxsTvjkwYHLAab/JW/TX82b0kSGcVASwM7r2jjHgqm12mT0RoK0fyJeoUPJLn1+4GQAmttku6dGeDCtlgbbHzF8lUHLl16EaSZiuX9QibErKxW30XmzDbvqmjN+425UoXHa2F1sKNCHAbVuyu4xCkJLH3mhWsaXwsy/I1awaHRon4NOQ3HaW8W6mgqbO29pKp8P9btrz1fmToC+ZMIpBJoWUZRbuRIJnWQ7pPajfRXiCw0rbXfmf3sad8gDNeVuomygn9AMjgKHSFEZHv7FF+zJUhfuVsHHOxeV3vbjUCEXHZFtWu5S1sMV+xL6hz6qQTekVZFyH4mZnM7OkENoEmBt1NHyRc7gOQ1XWh9Lb6S7T3nF+w= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230031)(4636009)(396003)(376002)(136003)(346002)(39860400002)(230922051799003)(186009)(64100799003)(1800799009)(82310400011)(451199024)(46966006)(40470700004)(36840700001)(84970400001)(5660300002)(47076005)(86362001)(235185007)(44832011)(6486002)(6512007)(36860700001)(316002)(6916009)(356005)(36756003)(8936002)(4326008)(8676002)(70206006)(70586007)(54906003)(81166007)(40480700001)(41300700001)(2906002)(40460700003)(336012)(83380400001)(26005)(2616005)(82740400003)(6666004)(6506007)(33964004)(44144004)(478600001)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 16 Nov 2023 18:08:00.4857 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: b7130b10-de7b-4d20-dfce-08dbe6cef833 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d 
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B97.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DU0PR08MB9631 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_NUMSUBJECT, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Later patches in the series allow ldp and stp to use SVE modes if -msve-vector-bits=128 is provided. This patch therefore adjusts tests that pass -msve-vector-bits=128 to allow ldp/stp to save/restore SVE registers. OK for trunk? gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/pcs/stack_clash_1_128.c: Allow ldp/stp saves of SVE registers. * gcc.target/aarch64/sve/pcs/struct_3_128.c: Likewise. 
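For reference, a minimal sketch (not part of the patch) of why the two alternatives in each test pattern address the same stack slots.  With -msve-vector-bits=128 the vector length is 16 bytes, so an SVE address of the form "#N, mul vl" should correspond to a byte offset of 16 * N, which is the immediate expected in the stp/ldp alternatives.  The helper names below are hypothetical and exist only for illustration; the slot numbers are taken from test_1 of stack_clash_1_128.c.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of GCC: byte offset selected by an SVE
   "#slot, mul vl" address for a given vector length in bits.  */
long
sve_slot_to_bytes (long slot, long vector_bits)
{
  return slot * (vector_bits / 8);
}

/* Print the byte offset for each q-register pair in the z8..z23 save
   area of test_1, where z8 is stored at "#2, mul vl".  */
void
print_pair_offsets (void)
{
  for (int reg = 8; reg < 24; reg += 2)
    {
      long slot = reg - 8 + 2;	/* z8 lives at slot 2, z9 at slot 3, ...  */
      printf ("q%d, q%d -> [sp, %ld]\n", reg, reg + 1,
	      sve_slot_to_bytes (slot, 128));
    }
}

/* Spot-check against offsets that appear in the expected assembly.  */
void
check_known_offsets (void)
{
  assert (sve_slot_to_bytes (2, 128) == 32);	/* z8  <-> stp q8, q9, [sp, 32]  */
  assert (sve_slot_to_bytes (16, 128) == 256);	/* z22 <-> stp q22, q23, [sp, 256]  */
}
```

Calling print_pair_offsets () from a small driver lists one offset per pair, which can be compared by eye against the stp/ldp lines added to the test.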
---
 .../aarch64/sve/pcs/stack_clash_1_128.c       | 32 +++++++++++++++++++
 .../gcc.target/aarch64/sve/pcs/struct_3_128.c | 29 +++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_128.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_128.c
index 404301dc0c1..795429b01cb 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_128.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1_128.c
@@ -19,6 +19,7 @@
 ** str p13, \[sp, #9, mul vl\]
 ** str p14, \[sp, #10, mul vl\]
 ** str p15, \[sp, #11, mul vl\]
+** (
 ** str z8, \[sp, #2, mul vl\]
 ** str z9, \[sp, #3, mul vl\]
 ** str z10, \[sp, #4, mul vl\]
@@ -35,7 +36,18 @@
 ** str z21, \[sp, #15, mul vl\]
 ** str z22, \[sp, #16, mul vl\]
 ** str z23, \[sp, #17, mul vl\]
+** |
+** stp q8, q9, \[sp, 32\]
+** stp q10, q11, \[sp, 64\]
+** stp q12, q13, \[sp, 96\]
+** stp q14, q15, \[sp, 128\]
+** stp q16, q17, \[sp, 160\]
+** stp q18, q19, \[sp, 192\]
+** stp q20, q21, \[sp, 224\]
+** stp q22, q23, \[sp, 256\]
+** )
 ** ptrue p0\.b, vl16
+** (
 ** ldr z8, \[sp, #2, mul vl\]
 ** ldr z9, \[sp, #3, mul vl\]
 ** ldr z10, \[sp, #4, mul vl\]
@@ -52,6 +64,16 @@
 ** ldr z21, \[sp, #15, mul vl\]
 ** ldr z22, \[sp, #16, mul vl\]
 ** ldr z23, \[sp, #17, mul vl\]
+** |
+** ldp q8, q9, \[sp, 32\]
+** ldp q10, q11, \[sp, 64\]
+** ldp q12, q13, \[sp, 96\]
+** ldp q14, q15, \[sp, 128\]
+** ldp q16, q17, \[sp, 160\]
+** ldp q18, q19, \[sp, 192\]
+** ldp q20, q21, \[sp, 224\]
+** ldp q22, q23, \[sp, 256\]
+** )
 ** ldr p4, \[sp\]
 ** ldr p5, \[sp, #1, mul vl\]
 ** ldr p6, \[sp, #2, mul vl\]
@@ -101,16 +123,26 @@ test_2 (void)
 ** str p5, \[sp\]
 ** str p6, \[sp, #1, mul vl\]
 ** str p11, \[sp, #2, mul vl\]
+** (
 ** str z8, \[sp, #1, mul vl\]
 ** str z13, \[sp, #2, mul vl\]
 ** str z19, \[sp, #3, mul vl\]
 ** str z20, \[sp, #4, mul vl\]
+** |
+** stp q8, q13, \[sp, 16\]
+** stp q19, q20, \[sp, 48\]
+** )
 ** str z22, \[sp, #5, mul vl\]
 ** ptrue p0\.b, vl16
+** (
 ** ldr z8, \[sp, #1, mul vl\]
 ** ldr z13, \[sp, #2, mul vl\]
 ** ldr z19, \[sp, #3, mul vl\]
 ** ldr z20, \[sp, #4, mul vl\]
+** |
+** ldp q8, q13, \[sp, 16\]
+** ldp q19, q20, \[sp, 48\]
+** )
 ** ldr z22, \[sp, #5, mul vl\]
 ** ldr p5, \[sp\]
 ** ldr p6, \[sp, #1, mul vl\]
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/struct_3_128.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/struct_3_128.c
index f6d78469aa5..0d330c015b9 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/struct_3_128.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/struct_3_128.c
@@ -220,6 +220,7 @@ SEL2 (struct, pst_arr5)
 /*
 ** test_pst_arr5:
 ** sub sp, sp, #128
+** (
 ** str z0, \[sp\]
 ** str z1, \[sp, #1, mul vl\]
 ** str z2, \[sp, #2, mul vl\]
@@ -228,6 +229,12 @@ SEL2 (struct, pst_arr5)
 ** str z5, \[sp, #5, mul vl\]
 ** str z6, \[sp, #6, mul vl\]
 ** str z7, \[sp, #7, mul vl\]
+** |
+** stp q0, q1, \[sp\]
+** stp q2, q3, \[sp, 32\]
+** stp q4, q5, \[sp, 64\]
+** stp q6, q7, \[sp, 96\]
+** )
 ** mov (x7, sp|w7, wsp)
 ** add sp, sp, #?128
 ** ret
@@ -374,8 +381,12 @@ SEL2 (struct, pst_uniform1)
 /*
 ** test_pst_uniform1:
 ** sub sp, sp, #32
+** (
 ** str z0, \[sp\]
 ** str z1, \[sp, #1, mul vl\]
+** |
+** stp q0, q1, \[sp\]
+** )
 ** mov (x7, sp|w7, wsp)
 ** add sp, sp, #?32
 ** ret
@@ -398,8 +409,12 @@ SEL2 (struct, pst_uniform2)
 /*
 ** test_pst_uniform2:
 ** sub sp, sp, #48
+** (
 ** str z0, \[sp\]
 ** str z1, \[sp, #1, mul vl\]
+** |
+** stp q0, q1, \[sp\]
+** )
 ** str z2, \[sp, #2, mul vl\]
 ** mov (x7, sp|w7, wsp)
 ** add sp, sp, #?48
@@ -424,10 +439,15 @@ SEL2 (struct, pst_uniform3)
 /*
 ** test_pst_uniform3:
 ** sub sp, sp, #64
+** (
 ** str z0, \[sp\]
 ** str z1, \[sp, #1, mul vl\]
 ** str z2, \[sp, #2, mul vl\]
 ** str z3, \[sp, #3, mul vl\]
+** |
+** stp q0, q1, \[sp\]
+** stp q2, q3, \[sp, 32\]
+** )
 ** mov (x7, sp|w7, wsp)
 ** add sp, sp, #?64
 ** ret
@@ -456,8 +476,12 @@ SEL2 (struct, pst_uniform4)
 ** ptrue (p[0-7])\.b, vl16
 ** st1w z0\.s, \2, \[x7\]
 ** add (x[0-9]+), x7, #?32
+** (
 ** str z1, \[\3\]
 ** str z2, \[\3, #1, mul vl\]
+** |
+** stp q1, q2, \[\3\]
+** )
 ** str z3, \[\3, #2, mul vl\]
 ** st1w z4\.s, \2, \[x7, #6, mul vl\]
 ** add sp, sp, #?144
@@ -542,10 +566,15 @@ SEL2 (struct, pst_mixed2)
 ** str p2, \[sp, #18, mul vl\]
 ** add (x[0-9]+), sp, #?38
 ** st1b z2\.b, \1, \[\4\]
+** (
 ** str z3, \[sp, #4, mul vl\]
 ** str z4, \[sp, #5, mul vl\]
 ** str z5, \[sp, #6, mul vl\]
 ** str z6, \[sp, #7, mul vl\]
+** |
+** stp q3, q4, \[sp, 64\]
+** stp q5, q6, \[sp, 96\]
+** )
 ** mov (x7, sp|w7, wsp)
 ** add sp, sp, #?128
 ** ret

From patchwork Thu Nov 16 18:08:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1864870
Date: Thu, 16 Nov 2023 18:08:14 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH 05/11] aarch64, testsuite: Fix up pr103147-10 tests
Content-Disposition: inline

For the ret function, allow the loads to be emitted in either order in
the codegen.  The order gets inverted with the new load/store pair pass.

OK for trunk?

gcc/testsuite/ChangeLog:

	* g++.target/aarch64/pr103147-10.C (ret): Allow loads in either order.
	* gcc.target/aarch64/pr103147-10.c (ret): Likewise.
---
 gcc/testsuite/g++.target/aarch64/pr103147-10.C | 5 +++++
 gcc/testsuite/gcc.target/aarch64/pr103147-10.c | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/gcc/testsuite/g++.target/aarch64/pr103147-10.C b/gcc/testsuite/g++.target/aarch64/pr103147-10.C
index e12771533f7..5a98c30ed3f 100644
--- a/gcc/testsuite/g++.target/aarch64/pr103147-10.C
+++ b/gcc/testsuite/g++.target/aarch64/pr103147-10.C
@@ -62,8 +62,13 @@ ld4 (int32x4x4_t *a, int32_t *b)
 /*
 ** ret:
 ** ...
+** (
 ** ldp q0, q1, \[x0\]
 ** ldr q2, \[x0, #?32\]
+** |
+** ldr q2, \[x0, #?32\]
+** ldp q0, q1, \[x0\]
+** )
 ** ...
 */
 int32x4x3_t
diff --git a/gcc/testsuite/gcc.target/aarch64/pr103147-10.c b/gcc/testsuite/gcc.target/aarch64/pr103147-10.c
index 57942bfd10a..2609266bc46 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr103147-10.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr103147-10.c
@@ -60,8 +60,13 @@ ld4 (int32x4x4_t *a, int32_t *b)
 /*
 ** ret:
 ** ...
+** (
 ** ldp q0, q1, \[x0\]
 ** ldr q2, \[x0, #?32\]
+** |
+** ldr q2, \[x0, #?32\]
+** ldp q0, q1, \[x0\]
+** )
 ** ...
 */
 int32x4x3_t

From patchwork Thu Nov 16 18:08:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1864871
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DU2PEPF0001E9C0.mail.protection.outlook.com (10.167.8.69) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.20 via Frontend Transport; Thu, 16 Nov 2023 18:08:58 +0000 Received: ("Tessian outbound 20615a7e7970:v228"); Thu, 16 Nov 2023 18:08:58 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: ae905c331e3b647d X-CR-MTA-TID: 64aa7808 Received: from e6e597fa1571.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id B65B0054-7381-4CC5-AF6F-DD76BCE748CC.1; Thu, 16 Nov 2023 18:08:51 +0000 Received: from EUR05-AM6-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id e6e597fa1571.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Thu, 16 Nov 2023 18:08:51 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=aXpYKwW2Fj1FkbeuCp7jFwx1r4plBT5EiJpvDalggBsYaDvGFqP1yAAvxjCTiEj9jVOMFZSjhb3fBnkk+JZb7Hcq/gruJBXhOlzUxizx59HEr95Fh/TUJxuDcLcQYGS+FqwynzuL5G4DQ/iJmv91Q1J1bzcsFc3dRup/jP8ae2/hedx1zTh2/xsDblZa9Vm2hFlMtqMCeOnBJRWekbjzPQtoyFbufydsnwHoFcNBBQxCo8hT7VNOjEiMf3fW9391IRRt5e6H/wkSRJ/NGH8xBbkXRj/J3QkPp2w4GhrL9BRGP4GTArwRuzujYRCrHl3c1djQdGdTCdTX1TWOiiHRIA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; 
bh=aIQazdE2M+KuQToUIyfebnwMFmIWF5NFDj02nc/A7n8=; b=RIMfo9sd/rMkz2147u96YqdHKvSaMPJ0he72t8sDuVF8Tj0mwZCJ5PbV5qjrGxQ4yOU2/QRmNtIEYDLuBKConMFz4F3TPeRYSrkvR8KGqbporArrKGf3/6nWh2wFr3j50mso6qaQr5Ss+k9SJfjoqNS7BhQHYTNmjYQWdWFRyF2rhJMrPmIV5ig+plTBp6U8b0NDzdEJ55fwFkKhM9Oli3ufdEduc+vdhAqEeOQvA6HB5y776W8e8i8xUB/U4xFNNP8ip+lYdiIhuSD6binuHGu8aOWwKfunUMwmgP0wAewGhWghSmWsam89ddM6z4n2vJPDwHEHfAtqlMpNCg3Nwg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=aIQazdE2M+KuQToUIyfebnwMFmIWF5NFDj02nc/A7n8=; b=nsYg49LckVaNxJEIBdYG+NFD5ffHsfnqRDo8uXM5b20XaPvB1S1BGXd9Wa5wQP+TLpbID6XcbjLyWSz4kj8juBHDNRx5QG1rXyXVoS+WZUH8kvnZ8loHAhWiqTO8LGe+K5HIawa3biG+htchdxjl7bHFwrxtgafBFWOvgPPE7iE= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) by GV1PR08MB7378.eurprd08.prod.outlook.com (2603:10a6:150:22::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.21; Thu, 16 Nov 2023 18:08:49 +0000 Received: from PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::8512:cc10:24d4:1919]) by PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::8512:cc10:24d4:1919%5]) with mapi id 15.20.6977.029; Thu, 16 Nov 2023 18:08:49 +0000 Date: Thu, 16 Nov 2023 18:08:47 +0000 From: Alex Coplan To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford , Kyrylo Tkachov Subject: [PATCH 06/11] aarch64: Fix up aarch64_print_operand xzr/wzr case Message-ID: Content-Disposition: inline X-ClientProxiedBy: LO2P265CA0095.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:8::35) To PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) 
MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: PAWPR08MB8958:EE_|GV1PR08MB7378:EE_|DU2PEPF0001E9C0:EE_|AS2PR08MB10351:EE_ X-MS-Office365-Filtering-Correlation-Id: f5c5c9d3-57da-41b4-1869-08dbe6cf1a9e x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: ZA6HnaxIVVXJBteeC9RcmgoLK4CGicROTbji0VUFybFAXvXhHcJ3NPtajViwehdILuRzJ3uc1P5hArQJRiN4aSirzyuv9UDWhjncQ80rmzV1OkkplZFn7WwZbrIIWgXx90pnQXtwtMumY4gNfNVQyVovk3tdLVD7DRXZbeSX9gpk/zA669P9QLgakjMtpnI00m4tXQ8gUlO6EoQa7T8l0TtU23hIOI1W8eC6DBGK2LfQL8z2aXMRTK0iaEKR5IKmsBwAC4FNrc4jUvtH/h4EpbAIvnaYTtPssp6+lr0nvDrIpXwaFq2q08jf14bMbWqBfBpFiQuJiHxL5bVGYjn3wFwZ2x9DKasyt2j69TQkq1e04fZjKVDQncQXXWr0aeeuwovK+mZ7eZshlqEyuALoOBgAgyjn6oUSU5i4irRXFHzLYEtwvKarkPf3O6BDygMF397HhtNxJyZg+iQF4FFCZQpmcr16i/SX1v+3p365C1k1QeGyU//LmLrl+odPJ1V0YQUC70TU67Peab67sHr/6snndX0BiDtqILea2Dz+0X//afIw0A6xo7OuOxPJ+9D0Mh5Mj9y863UBPt/osoFwTk27atOhfW59TY3M7Ung+z+6lIMjpsdJjJG+Qnfqfrfk X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:PAWPR08MB8958.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230031)(366004)(136003)(39860400002)(376002)(346002)(396003)(230922051799003)(1800799009)(64100799003)(186009)(451199024)(2906002)(41300700001)(316002)(66476007)(66556008)(54906003)(6916009)(36756003)(8936002)(8676002)(66946007)(4326008)(38100700002)(478600001)(235185007)(5660300002)(44832011)(6486002)(86362001)(83380400001)(26005)(6512007)(2616005)(6506007)(44144004)(33964004)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV1PR08MB7378 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DU2PEPF0001E9C0.eurprd03.prod.outlook.com X-MS-PublicTrafficType: Email 
X-MS-Office365-Filtering-Correlation-Id-Prvs: 6dbe0512-97e4-4080-1452-08dbe6cf1542 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: P5zAkptYu1r2Vo05LD/CRovObk8T86amA8f75gGNEKUe41wNaywdcgdbutJdAg1FBggmfYr4Z3yXAqUWAKbWY8cgbQMgObjloMwkosCt8z2NfZXnlE02JpReJLoSxP8TIOfxJfkErHGy/x/YuyZADrnZgeW8L56DqStLJNT/x1cg+2e89EootxEyZsYicQ5MzT/uyWylM1nUHb1CvUwY85MVS3PFmELJniu5r94QDXbzsn/3NLUnKHGn3+2lSr646OZfz5icatLNFI+DLMjFTtRW8JNwikz5c26WyN1Fx+Pd5iVBd2lrUcQJQOMnszbBS+fvdGDCH05FG90i5sB0NkAAkYdD4HmMHwxHlQ1LB+cOk3dI0kTiqCIGb4RW19PLRbl83SMvDAAmCv3lDVvJYYeGDslGdRxn5zxaZkZrjuXsGHmGefRvX/qKI9MvRmibVZhPnPSGM+mvMGRkq6T81KwMDAbdSE5NeiHQdDfOm+Cw7nNpkH71JpQ5hEcb2JomdY5xCiqmSemiGlG54ENMkQ9guE31UVTw6DrqUPKStqVgS0kiTg2Gb84bHCquwXpFQKhejf2Jg+d8EBXT7VvbH5YBVHl9rkJE/DOxVsE5f4t4gPCDyWFgqd+ozK355bB5SqKWPXB1JTYHR8U3CqKgbTdTKaE1NdNFHDvOvQzOYRsoPl7GWvXTqtcvlj+Qm2Ar/FScxna0sVzh1RVn+DoA0ZdoJTNSortq1rIfGfwMU9pLXWx1hYTJOcXMZ7kp/lo5it1UZE+FUse8sOIorKS73A== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230031)(4636009)(396003)(136003)(346002)(376002)(39860400002)(230922051799003)(451199024)(1800799009)(82310400011)(186009)(64100799003)(40470700004)(36840700001)(46966006)(36756003)(81166007)(356005)(2906002)(36860700001)(40460700003)(47076005)(235185007)(41300700001)(5660300002)(54906003)(478600001)(86362001)(8936002)(40480700001)(6506007)(6486002)(44144004)(33964004)(4326008)(8676002)(83380400001)(316002)(70586007)(70206006)(6916009)(336012)(2616005)(44832011)(26005)(6512007)(82740400003)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 16 Nov 2023 18:08:58.2088 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: f5c5c9d3-57da-41b4-1869-08dbe6cf1a9e X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d 
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DU2PEPF0001E9C0.eurprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB10351 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org This adjusts aarch64_print_operand to recognize zero rtxes in modes other than VOIDmode. This allows us to use xzr/wzr for zero vectors, for example. Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_print_operand): Handle non-VOIDmode CONST0_RTXes in {x,w}zr cases. 
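Illustration only, not part of the patch: the sketch below is a self-contained toy model of why the comparison changes. It assumes, as in GCC, that the zero constant for a given mode is a single shared object, so pointer equality is a valid test, and that the VOIDmode zero is the object const0_rtx names. Mode, Rtx, zero_for_mode, old_check and new_check are hypothetical stand-ins for machine_mode, rtx, CONST0_RTX and the first operand of the condition before and after the change; none of this is GCC code.

```cpp
#include <cassert>

// Toy stand-ins, not GCC's real types.
enum class Mode { Void, DI, V16QI };

struct Rtx {
  Mode mode;
};

// One shared zero object per mode (hypothetical table).
constexpr Rtx zeros[] = { {Mode::Void}, {Mode::DI}, {Mode::V16QI} };

// Stand-in for CONST0_RTX (mode).
constexpr const Rtx *zero_for_mode(Mode m) {
  return &zeros[static_cast<int>(m)];
}

// Stand-in for const0_rtx: the VOIDmode zero.
constexpr const Rtx *void_zero = zero_for_mode(Mode::Void);

// Old test: only the VOIDmode zero is recognized.
constexpr bool old_check(const Rtx *x) { return x == void_zero; }

// New test: the zero of x's own mode is recognized.
constexpr bool new_check(const Rtx *x) {
  return x == zero_for_mode(x->mode);
}

// A vector-mode zero is missed by the old test and caught by the new one.
static_assert(!old_check(zero_for_mode(Mode::V16QI)), "old test misses it");
static_assert(new_check(zero_for_mode(Mode::V16QI)), "new test catches it");
// The VOIDmode zero is accepted by both, so existing behaviour is kept.
static_assert(old_check(void_zero) && new_check(void_zero), "unchanged");
```

In this model the new test accepts everything the old one did, which matches the intent stated above of extending the {x,w}zr case rather than replacing it.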
--- gcc/config/aarch64/aarch64.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 800a8b0e110..abd029887e5 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -12387,7 +12387,7 @@ aarch64_print_operand (FILE *f, rtx x, int code) case 'w': case 'x': - if (x == const0_rtx + if (x == CONST0_RTX (GET_MODE (x)) || (CONST_DOUBLE_P (x) && aarch64_float_const_zero_rtx_p (x))) { asm_fprintf (f, "%czr", code); From patchwork Thu Nov 16 18:09:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Coplan X-Patchwork-Id: 1864872 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=UcFcg3gS; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=UcFcg3gS; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SWSjl46lMz1yRV for ; Fri, 17 Nov 2023 05:09:47 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 
0E3883876059 for ; Thu, 16 Nov 2023 18:09:45 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR05-VI1-obe.outbound.protection.outlook.com (mail-vi1eur05on2076.outbound.protection.outlook.com [40.107.21.76]) by sourceware.org (Postfix) with ESMTPS id D53DB3881D31 for ; Thu, 16 Nov 2023 18:09:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D53DB3881D31 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D53DB3881D31 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=40.107.21.76 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1700158174; cv=pass; b=kw8jM3D5EEGKkbCu0Cy/gFUUMa7UJIsOMQ8+qkAGzFC03asbcCMzCXvzUDS9Njxsu0kPm4Vqqxu05fhTRONHOJ4FNFE8RKZ8ID5Z2J8rNYIpZK35TQycDksSc2oI7sCEZ8yCVbm3iYE9alS14kHX44khJ6DcKBSXEZWi8UofC48= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1700158174; c=relaxed/simple; bh=CGHr1achu6GSmhGlGnjafL8gF6YlOpA8oMs1FdtsGrY=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=BImoK7OHZ+MwS0iUjEqIolVJb3rx1oUTE9apUmPbTXQFdMPi49l0x+N3Wyh0nGi6UMnUmIYqwDFHVbRTEonGC5jNiyNkgP23p99QAA7C1jPMSJq+YwUvehWnoaQLzOVx/h9O3HNJQFK++Jb28Hm09wxliI2yKLzJCix6nNCcYBI= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=VBlISuHqmj76jkldpOG3qWGtypiCAS2qOM4UptnElmBJVQsy88CFe+Hxs1Okf6MtDi8dkd7W1mh7sZIz/YLPBb/nB1BUOHBLq2/WTvDzPywasLRPMSGq8dy/hawgKjyXq+kv6P9avKFI793FS98v0aNeTh992tDTbSVZCv3Vb+z4yDhdlc/eWnBvZ4iwB0jGzDhEFjb/giQM95sd3TFafU+Z3nClQ9Oii3ATLq+0az0Qsr7yC5H/jgwVge906Fpm21RxtSuxrs6tdCM1SDD+qNxEE3BHQk93DPsVMbrDWBv1dPFOrm3JM0pCsysDJw3p1kkMXEosvdEB9RT395Jgdw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=EpyeFw0QEe6I26K1r0Z2bTBhoC4pyDvlEhWu0jF7zDE=; b=J6LHAI5+g4wr/r9Fl/AeTyry9f+eLhaAppyxYlBGAxLUPDqw2Cu4wq0TuCj1yP/qFQedGUv2Qm65fd2NYRTjct1CdaooO7vvGVkHPu/W6bMgSOYzBXX9l/8C397TiVTvJe/7z58Mjl5qPpk+TdtCxaILn5t9hm0b+YxGXC/UH5bprEYO45HWN4Xm/u9KJAPPJNFdX3QbhQDtK4j1FUEdEO5nSty8ivF6EFdT7ivW31CXrDq5T3PDfSTz1pUFRk7QxDw//DuBMX4IgGgHT242pbqkI/LrHdpZK1yZESzFqq2FtXR8itotrxo0qVo3TjaEuUpscbLm4QBbSh+V5RSpKg== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=EpyeFw0QEe6I26K1r0Z2bTBhoC4pyDvlEhWu0jF7zDE=; b=UcFcg3gSWrudoQQKz1+nYxbg/gHszid/iWQ+/en9gmxFVo7hdV8+39qLzjhEA7cCz8KgBZFLaT/rVeRl4sK1WC82UPh6AS06zil4gjHX6RYuSPZEhpctpLxvLLxpVWSa5BR7wft2ufCFsGfjRIUxCuLfKfjuro24xNBxaKYi/oA= Received: from DUZPR01CA0084.eurprd01.prod.exchangelabs.com (2603:10a6:10:46a::11) by AS8PR08MB8825.eurprd08.prod.outlook.com (2603:10a6:20b:5bc::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.23; Thu, 16 Nov 2023 18:09:27 +0000 Received: from DB5PEPF00014B92.eurprd02.prod.outlook.com (2603:10a6:10:46a:cafe::7f) by DUZPR01CA0084.outlook.office365.com (2603:10a6:10:46a::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.20 via Frontend Transport; Thu, 16 Nov 2023 18:09:27 +0000 
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5PEPF00014B92.mail.protection.outlook.com (10.167.8.230) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7002.20 via Frontend Transport; Thu, 16 Nov 2023 18:09:27 +0000 Received: ("Tessian outbound 7671e7ddc218:v228"); Thu, 16 Nov 2023 18:09:27 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 5bc1b3daad6d193e X-CR-MTA-TID: 64aa7808 Received: from aed225d12b8b.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 6F1894EA-9131-422D-8D4F-69D3E57A707C.1; Thu, 16 Nov 2023 18:09:20 +0000 Received: from EUR04-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id aed225d12b8b.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Thu, 16 Nov 2023 18:09:20 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=O2GNNPI6TNiPcsTPLhPzaBcOslzY6WTxtB2Wq0PiwdGdslvMu45uNpLyNJjFwYJXU6mD1za5jUF0CjV1K5kIVMmRhXaQ/yleCcDzFmCd7qFb5VOaEGxmAiY3kJtH4vXirFty2hkGAL+QXR3Pe9AXH8lX5vY1x0Sa6+Vu7296JUzx/SSmG1yZBK6CFdmAgqPV3+f5jiVuns4Jn1IsXXozYYmsJjiCqkJMSHvp94oBHKEkf54Qqegx2lBxmQLUNOY0CEF6iNLqpw1vaSvD28v1VN+daX5yn45qdrwStdLiWXH0YJrW3UiBTKmMZR7FNYfsORWIHnrdD8yj/+eqrwewww== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; 
bh=EpyeFw0QEe6I26K1r0Z2bTBhoC4pyDvlEhWu0jF7zDE=; b=J7gWj+IYBy4V6UO98YeLHDGAk+58zYfk8rexyXCZm1s7Qc3l1nSiapaLHSIpHTJAvCZpblfklN5ixvaqjNnrTX7axPzh74tjbHxWaW5XsL2mQzb2lqcbghmQvthBZy0TUjKNgsTItv++/7DfnaN4gCBb4Yd3KYIrCCD7mro43MFALDXClTmGukMtzDcWJhhOKypX7MobIxStbAV+CjfTFjthHB8R1VCEVQkl+o4FS5E6faF3q4PAqgHLT1D+s2LKMgTXq3F/MkzWcGew85w44CnMjdYJXLj93ctxP3EMISFoZMALq0uuGlsB19lLKDJGCV5fEPMHMtiMRkXKZfkH2g== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=EpyeFw0QEe6I26K1r0Z2bTBhoC4pyDvlEhWu0jF7zDE=; b=UcFcg3gSWrudoQQKz1+nYxbg/gHszid/iWQ+/en9gmxFVo7hdV8+39qLzjhEA7cCz8KgBZFLaT/rVeRl4sK1WC82UPh6AS06zil4gjHX6RYuSPZEhpctpLxvLLxpVWSa5BR7wft2ufCFsGfjRIUxCuLfKfjuro24xNBxaKYi/oA= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) by GV1PR08MB10424.eurprd08.prod.outlook.com (2603:10a6:150:15e::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6977.31; Thu, 16 Nov 2023 18:09:18 +0000 Received: from PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::8512:cc10:24d4:1919]) by PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::8512:cc10:24d4:1919%5]) with mapi id 15.20.6977.029; Thu, 16 Nov 2023 18:09:18 +0000 Date: Thu, 16 Nov 2023 18:09:16 +0000 From: Alex Coplan To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford , Kyrylo Tkachov Subject: [PATCH 07/11] aarch64: Fix up printing of ldp/stp with -msve-vector-bits=128 Message-ID: Content-Disposition: inline X-ClientProxiedBy: LO4P123CA0391.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:18f::18) To PAWPR08MB8958.eurprd08.prod.outlook.com 
(2603:10a6:102:33e::15) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: PAWPR08MB8958:EE_|GV1PR08MB10424:EE_|DB5PEPF00014B92:EE_|AS8PR08MB8825:EE_ X-MS-Office365-Filtering-Correlation-Id: 492b6f96-d902-4ad9-2453-08dbe6cf2c49 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: pA58wE0xGwEOo31i1U4mTeZrWRXZEvg9WbWd/5dWv2Q0LKGbge03Fc6Sq+NjsywaQ9dtOi77wsC0HIW1M/ByuTZQ50gEvNo6IZEO/LyiCT0kzKgDfrU5bHRTC0p3TswcBmAQnKyVMADYT7nwJsEx9XCXdLk5fwon4gztOE4HG+jgEnvWYoVUzUjXlBhR5YqxUXztTNHazMMnPUGnRlJ6UaPSVGvf2pwcZel4DebHZqOPAILaDGJ4J0ynkOicEStPGYu9+NQ7kf0wqb1LJqgTG18LlEbYU+k1ihcIQyFhuxgrGIOPH7sS/kXKwoElf8E3vumTH6EHwuA1Nfj7U/DcP2Ed/jewcA3UnxRJoBemLDA35rmlMpf9I76F9s7PNYjnKoRQAF1JbxYsuUdRWuN9Kuk7eVqfXHMlDpYhOSHCB/eBr2bmf56/2bnFuqcENti3+g6V08OeB9Y7bn/bUtiwNpISrYU+yTKuJnFbudWm7rbTzi/kubXzVyHHE00dNX3muZBjBfxl+T9Nn8sHBoaVKQClObQBfAtncUuxJZYJHhRXmcdJJ89erihSqFneqX+5PpZ65FuJvBZU/uZkO7Sm2ppJec676bQI8m20U1K0SA9azEThsFUdSRJvTGXpBpcb X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:PAWPR08MB8958.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230031)(376002)(366004)(396003)(39860400002)(346002)(136003)(230922051799003)(1800799009)(64100799003)(186009)(451199024)(2906002)(66556008)(478600001)(38100700002)(41300700001)(66476007)(66946007)(36756003)(86362001)(54906003)(6916009)(6486002)(316002)(2616005)(235185007)(5660300002)(6512007)(44144004)(26005)(6506007)(33964004)(44832011)(4326008)(8936002)(83380400001)(8676002)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV1PR08MB10424 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5PEPF00014B92.eurprd02.prod.outlook.com X-MS-PublicTrafficType: Email 
X-MS-Office365-Filtering-Correlation-Id-Prvs: 6dd79f2f-d932-4a5a-6939-08dbe6cf2686 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 4XfD4QsCg0A/Jz2yK0mNkOn44NbT1cPB7klxuBZiuqj6QZIgrXtMmmqw6s8egG4Pii9cinE61ADSJLtQAY/zB8rfnZT9KS/LdwXt+kwzBKg3p9+dsTe6VMMCmZTiQdYnzYNV0EPigrmEJaSQGFRhkKecg04Y6iZLv5nC7BPKydRsBN2SqjvqEk4ImXLHFGcJUmZuq+u34I2MjmmYXYsh16mmouupDxvZQdIXCoYF5dJ/Y/aBCi2Z1sZOdw8Ad4lwTLyWigGvdSF8WrDmnGuuUQA/383cOZt5JRtOv8vDPOum+TvJXnuNUhi9QBXojSw+UW8h4hgn2YX9bXntPtR7H4SVjzFcYzkFAwHXV4YYzaezpF9X4pTGrE3SCrHA850JDpKSJxe3ez3P/pTY0ulxproB0v7kP0BCMblzr/NadpNlBWDwepfcGsXw+09GcDlXwmQKHj7hfAexvdR4qur7TwA6tG4ICjhaEEq0Lcg5EzdNQE0/TGk0r6nTLj23FpfBXMJ9+z9xk6un5npkeJemwD5N9vuVC9p+AhFtrKiDYpyq3v8FKDMswdxmaYQnIPcWdonmqHQHWUdrmDtc7QRPny/TKsraDd3o8Y1sOh2Bzpse8UkGYegId+qn1BH79A1HvcOWB+tdsAxQvbsJYhW8ZgKDbTCpg6BidrC9K8Qb3PG7mpPybMIKtATrQHVsvxoul3HTdVUyS02x5LAb/CCqUmG1CTNBQnD+fvhOwzM6SMXV7lLOMZn+tnapfTjBdIdQt/09B4Hwg7Lof2LgivPCog== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230031)(4636009)(39860400002)(346002)(376002)(136003)(396003)(230922051799003)(64100799003)(1800799009)(82310400011)(186009)(451199024)(40470700004)(36840700001)(46966006)(5660300002)(235185007)(2906002)(8676002)(8936002)(4326008)(316002)(6916009)(54906003)(70206006)(70586007)(41300700001)(478600001)(26005)(6486002)(6512007)(6506007)(44144004)(33964004)(44832011)(40480700001)(2616005)(83380400001)(336012)(40460700003)(36860700001)(47076005)(356005)(36756003)(82740400003)(86362001)(81166007)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 16 Nov 2023 18:09:27.8666 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 492b6f96-d902-4ad9-2453-08dbe6cf2c49 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d 
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5PEPF00014B92.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS8PR08MB8825 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_NUMSUBJECT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Later patches allow using SVE modes in ldp/stp with -msve-vector-bits=128, so we need to make sure that we don't use SVE addressing modes when printing the address for the ldp/stp. This patch does that. Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_print_address_internal): Handle SVE modes when printing ldp/stp addresses. 
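Illustration only, not part of the patch: the three-way decision the hunk below introduces can be modelled as a small pure function. The enum AddrPath and the function classify are hypothetical names; the three boolean parameters stand in for (vec_flags & VEC_ANY_SVE), load_store_pair_p and addr.const_offset.is_constant () respectively. The sketch describes only which path is taken, not what each path prints.

```cpp
#include <cassert>

// Which path the printing code takes (hypothetical names).
enum class AddrPath {
  Lossage,  // output_operand_lossage ("poly offset in ldp/stp address")
  SveVnum,  // existing SVE path that starts by computing vnum
  Plain     // falls through to the code after these checks
};

// is_sve:         stands in for (vec_flags & VEC_ANY_SVE) != 0
// pair_p:         stands in for load_store_pair_p
// const_offset_p: stands in for addr.const_offset.is_constant ()
constexpr AddrPath classify(bool is_sve, bool pair_p, bool const_offset_p) {
  return (is_sve && pair_p && !const_offset_p) ? AddrPath::Lossage
       : (is_sve && !pair_p)                   ? AddrPath::SveVnum
                                               : AddrPath::Plain;
}

// An SVE-mode ldp/stp with a constant offset no longer takes the SVE path.
static_assert(classify(true, true, true) == AddrPath::Plain, "ldp/stp");
// An SVE-mode ldp/stp with a non-constant offset is rejected.
static_assert(classify(true, true, false) == AddrPath::Lossage, "poly");
// SVE-mode accesses that are not ldp/stp keep the existing SVE path.
static_assert(classify(true, false, true) == AddrPath::SveVnum, "sve");
```

Non-SVE modes are unaffected in this model: classify (false, ...) always yields Plain, whichever query type is used.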
--- gcc/config/aarch64/aarch64.cc | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index abd029887e5..4820fac67a1 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -12661,6 +12661,9 @@ aarch64_print_address_internal (FILE *f, machine_mode mode, rtx x, return false; } + const bool load_store_pair_p = (type == ADDR_QUERY_LDP_STP + || type == ADDR_QUERY_LDP_STP_N); + if (aarch64_classify_address (&addr, x, mode, true, type)) switch (addr.type) { @@ -12672,7 +12675,15 @@ aarch64_print_address_internal (FILE *f, machine_mode mode, rtx x, } vec_flags = aarch64_classify_vector_mode (mode); - if (vec_flags & VEC_ANY_SVE) + if ((vec_flags & VEC_ANY_SVE) + && load_store_pair_p + && !addr.const_offset.is_constant ()) + { + output_operand_lossage ("poly offset in ldp/stp address"); + return false; + } + + if ((vec_flags & VEC_ANY_SVE) && !load_store_pair_p) { HOST_WIDE_INT vnum = exact_div (addr.const_offset, From patchwork Thu Nov 16 18:09:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Coplan X-Patchwork-Id: 1864873 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=t2QCtB9M; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=t2QCtB9M; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; 
receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SWSkV1vvgz1yRR for ; Fri, 17 Nov 2023 05:10:26 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9BC4B3857C70 for ; Thu, 16 Nov 2023 18:10:23 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR05-VI1-obe.outbound.protection.outlook.com (mail-vi1eur05on2087.outbound.protection.outlook.com [40.107.21.87]) by sourceware.org (Postfix) with ESMTPS id 6182A384F008 for ; Thu, 16 Nov 2023 18:10:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6182A384F008 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6182A384F008 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=40.107.21.87 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1700158212; cv=pass; b=JU5FpUvWWAwlX4zZiwZBuGHoy7VzkF0OcDmALplHzhAo6uzn9EYBLwE0eHKURTkiTP/Ff1zwzD6Xkn9N0QWPaAai9E2B1PzezkdVk6Lcn4f7c/8br0d49taE5hCnGQEtyhiJlKI/mpFoPWADI7bmh2p8Fjzu+JryuYh1S0rgry4= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1700158212; c=relaxed/simple; bh=5QV2MTOROv3D9zioHU6oHVXTjnBpfFCifcdFQsI9NdU=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=HTlOvDJsNlgTRzUzoNsAWpgLKw19o7UbfLLFNva4WlDap0cNDthJup4+8rD/hcLejmr+d/Taxt99H5NkqslTHje+vkB8C9JMu6mkpCglvOWgk9KnnlHhsaWSAVZ/SPh6qVbUTvaZQf3IF/JIPx0TfYjBJXmPAq8hDBzdIl8r0Jw= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; 
Date: Thu, 16 Nov 2023 18:09:52 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH 08/11] aarch64: Generalize writeback ldp/stp patterns

Thus far the writeback forms of ldp/stp have been exclusively used in prologue and epilogue code for saving/restoring of registers
to/from the stack. As such, forms of ldp/stp that weren't needed for prologue/epilogue code weren't supported by the aarch64 backend.

This patch generalizes the load/store pair writeback patterns to allow:

- Base registers other than the stack pointer.
- Modes that weren't previously supported.
- Combinations of distinct modes provided they have the same size.
- Pre/post variants that weren't previously needed in prologue/epilogue code.

We make quite some effort to avoid a combinatorial explosion in the number of patterns generated (and those in the source) by making extensive use of special predicates.

An updated version of the upcoming ldp/stp pass can generate the writeback forms, so this patch is motivated by that.

This patch doesn't add zero-extending or sign-extending forms of the writeback patterns; that is left for future work.

Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

gcc/ChangeLog:

	* config/aarch64/aarch64-protos.h (aarch64_ldpstp_operand_mode_p): Declare.
	* config/aarch64/aarch64.cc (aarch64_gen_storewb_pair): Build RTL directly instead of invoking named pattern.
	(aarch64_gen_loadwb_pair): Likewise.
	(aarch64_ldpstp_operand_mode_p): New.
	* config/aarch64/aarch64.md (loadwb_pair_): Replace with ...
	(*loadwb_post_pair_): ... this. Generalize as described in cover letter.
	(loadwb_pair_): Delete (superseded by the above).
	(*loadwb_post_pair_16): New.
	(*loadwb_pre_pair_): New.
	(loadwb_pair_): Delete.
	(*loadwb_pre_pair_16): New.
	(storewb_pair_): Replace with ...
	(*storewb_pre_pair_): ... this. Generalize as described in cover letter.
	(*storewb_pre_pair_16): New.
	(storewb_pair_): Delete.
	(*storewb_post_pair_): New.
	(storewb_pair_): Delete.
	(*storewb_post_pair_16): New.
	* config/aarch64/predicates.md (aarch64_mem_pair_operator): New.
	(pmode_plus_operator): New.
	(aarch64_ldp_reg_operand): New.
	(aarch64_stp_reg_operand): New.
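
For readers who want a feel for the offset constraints the generalized patterns rely on, here is a rough standalone sketch. It is not code from the patch: the function names are hypothetical, plain integers stand in for machine modes and poly_int offsets, and the [-64, 63] scaled range is assumed from the 7-bit signed immediate of ldp/stp. It models the size test in aarch64_ldpstp_operand_mode_p (4, 8 or 16 bytes), the aarch64_mem_pair_offset check on the writeback adjustment, and the pre-index requirement that the second offset equals the first plus the access size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the size test in aarch64_ldpstp_operand_mode_p:
// only 4-, 8- and 16-byte operands are considered for pairing.
constexpr bool pair_size_ok(std::int64_t size) {
  return size == 4 || size == 8 || size == 16;
}

// Hypothetical model of a 7-bit signed scaled offset: the offset must be a
// multiple of the access size, and the scaled value must lie in [-64, 63].
constexpr bool offset_7bit_signed_scaled(std::int64_t size,
                                         std::int64_t offset) {
  if (!pair_size_ok(size) || offset % size != 0)
    return false;
  const std::int64_t scaled = offset / size;
  return scaled >= -64 && scaled <= 63;
}

// Pre-index writeback: the first access is at base + adjust and the second
// at base + second_offset; the patterns require the two to be adjacent.
constexpr bool pre_index_offsets_ok(std::int64_t size, std::int64_t adjust,
                                    std::int64_t second_offset) {
  return offset_7bit_signed_scaled(size, adjust)
         && second_offset == adjust + size;
}
```

Under these assumptions, a typical prologue push such as stp x29, x30, [sp, -16]! corresponds to pre_index_offsets_ok (8, -16, -8).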
--- gcc/config/aarch64/aarch64-protos.h | 1 + gcc/config/aarch64/aarch64.cc | 60 +++--- gcc/config/aarch64/aarch64.md | 284 ++++++++++++++++++++-------- gcc/config/aarch64/predicates.md | 38 ++++ 4 files changed, 271 insertions(+), 112 deletions(-) diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 36d6c688bc8..e463fd5c817 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -1023,6 +1023,7 @@ bool aarch64_operands_ok_for_ldpstp (rtx *, bool, machine_mode); bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, machine_mode); bool aarch64_mem_ok_with_ldpstp_policy_model (rtx, bool, machine_mode); void aarch64_swap_ldrstr_operands (rtx *, bool); +bool aarch64_ldpstp_operand_mode_p (machine_mode); extern void aarch64_asm_output_pool_epilogue (FILE *, const char *, tree, HOST_WIDE_INT); diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 4820fac67a1..ccf081d2a16 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -8977,23 +8977,15 @@ static rtx aarch64_gen_storewb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2, HOST_WIDE_INT adjustment) { - switch (mode) - { - case E_DImode: - return gen_storewb_pairdi_di (base, base, reg, reg2, - GEN_INT (-adjustment), - GEN_INT (UNITS_PER_WORD - adjustment)); - case E_DFmode: - return gen_storewb_pairdf_di (base, base, reg, reg2, - GEN_INT (-adjustment), - GEN_INT (UNITS_PER_WORD - adjustment)); - case E_TFmode: - return gen_storewb_pairtf_di (base, base, reg, reg2, - GEN_INT (-adjustment), - GEN_INT (UNITS_PER_VREG - adjustment)); - default: - gcc_unreachable (); - } + rtx new_base = plus_constant (Pmode, base, -adjustment); + rtx mem = gen_frame_mem (mode, new_base); + rtx mem2 = adjust_address_nv (mem, mode, GET_MODE_SIZE (mode)); + + return gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (3, + gen_rtx_SET (base, new_base), + gen_rtx_SET (mem, reg), + gen_rtx_SET (mem2, reg2))); } /* Push 
registers numbered REGNO1 and REGNO2 to the stack, adjusting the @@ -9025,20 +9017,15 @@ static rtx aarch64_gen_loadwb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2, HOST_WIDE_INT adjustment) { - switch (mode) - { - case E_DImode: - return gen_loadwb_pairdi_di (base, base, reg, reg2, GEN_INT (adjustment), - GEN_INT (UNITS_PER_WORD)); - case E_DFmode: - return gen_loadwb_pairdf_di (base, base, reg, reg2, GEN_INT (adjustment), - GEN_INT (UNITS_PER_WORD)); - case E_TFmode: - return gen_loadwb_pairtf_di (base, base, reg, reg2, GEN_INT (adjustment), - GEN_INT (UNITS_PER_VREG)); - default: - gcc_unreachable (); - } + rtx mem = gen_frame_mem (mode, base); + rtx mem2 = adjust_address_nv (mem, mode, GET_MODE_SIZE (mode)); + rtx new_base = plus_constant (Pmode, base, adjustment); + + return gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (3, + gen_rtx_SET (base, new_base), + gen_rtx_SET (reg, mem), + gen_rtx_SET (reg2, mem2))); } /* Pop the two registers numbered REGNO1, REGNO2 from the stack, adjusting it @@ -26688,6 +26675,17 @@ aarch64_check_consecutive_mems (rtx *mem1, rtx *mem2, bool *reversed) return false; } +bool +aarch64_ldpstp_operand_mode_p (machine_mode mode) +{ + if (!targetm.hard_regno_mode_ok (V0_REGNUM, mode) + || hard_regno_nregs (V0_REGNUM, mode) > 1) + return false; + + const auto size = GET_MODE_SIZE (mode); + return known_eq (size, 4) || known_eq (size, 8) || known_eq (size, 16); +} + /* Return true if MEM1 and MEM2 can be combined into a single access of mode MODE, with the combined access having the same address as MEM1. */ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 7be1de38b1c..c92a51690c5 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1831,102 +1831,224 @@ (define_insn "store_pair_dw_" (set_attr "fp" "yes")] ) +;; Writeback load/store pair patterns. 
+;; +;; Note that modes in the patterns [SI DI TI] are used only as a proxy for their +;; size; aarch64_ldp_reg_operand and aarch64_mem_pair_operator are special +;; predicates which accept a wide range of operand modes, with the requirement +;; that the contextual (pattern) mode is of the same size as the operand mode. + ;; Load pair with post-index writeback. This is primarily used in function ;; epilogues. -(define_insn "loadwb_pair_" +(define_insn "*loadwb_post_pair_" [(parallel - [(set (match_operand:P 0 "register_operand" "=k") - (plus:P (match_operand:P 1 "register_operand" "0") - (match_operand:P 4 "aarch64_mem_pair_offset" "n"))) - (set (match_operand:GPI 2 "register_operand" "=r") - (mem:GPI (match_dup 1))) - (set (match_operand:GPI 3 "register_operand" "=r") - (mem:GPI (plus:P (match_dup 1) - (match_operand:P 5 "const_int_operand" "n"))))])] - "INTVAL (operands[5]) == GET_MODE_SIZE (mode)" - "ldp\\t%2, %3, [%1], %4" - [(set_attr "type" "load_")] -) - -(define_insn "loadwb_pair_" + [(set (match_operand 0 "pmode_register_operand") + (match_operator 7 "pmode_plus_operator" [ + (match_operand 1 "pmode_register_operand") + (match_operand 4 "const_int_operand")])) + (set (match_operand:GPI 2 "aarch64_ldp_reg_operand") + (match_operator 5 "memory_operand" [(match_dup 1)])) + (set (match_operand:GPI 3 "aarch64_ldp_reg_operand") + (match_operator 6 "memory_operand" [ + (match_operator 8 "pmode_plus_operator" [ + (match_dup 1) + (const_int )])]))])] + "aarch64_mem_pair_offset (operands[4], mode) + && !reg_overlap_mentioned_p (operands[0], operands[2]) + && !reg_overlap_mentioned_p (operands[0], operands[3])" + {@ [cons: =0, 1, =2, =3; attrs: type] + [ rk, 0, r, r; load_] ldp\t%2, %3, [%1], %4 + [ rk, 0, w, w; neon_load1_2reg ] ldp\t%2, %3, [%1], %4 + } +) + +;; q-register variant of the above +(define_insn "*loadwb_post_pair_16" + [(parallel + [(set (match_operand 0 "pmode_register_operand" "=rk") + (match_operator 7 "pmode_plus_operator" [ + (match_operand 1 
"pmode_register_operand" "0") + (match_operand 4 "const_int_operand")])) + (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w") + (match_operator 5 "memory_operand" [(match_dup 1)])) + (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w") + (match_operator 6 "memory_operand" + [(match_operator 8 "pmode_plus_operator" [ + (match_dup 1) + (const_int 16)])]))])] + "TARGET_FLOAT + && aarch64_mem_pair_offset (operands[4], TImode) + && !reg_overlap_mentioned_p (operands[0], operands[2]) + && !reg_overlap_mentioned_p (operands[0], operands[3])" + "ldp\t%q2, %q3, [%1], %4" + [(set_attr "type" "neon_ldp_q")] +) + +;; Load pair with pre-index writeback. +(define_insn "*loadwb_pre_pair_" [(parallel - [(set (match_operand:P 0 "register_operand" "=k") - (plus:P (match_operand:P 1 "register_operand" "0") - (match_operand:P 4 "aarch64_mem_pair_offset" "n"))) - (set (match_operand:GPF 2 "register_operand" "=w") - (mem:GPF (match_dup 1))) - (set (match_operand:GPF 3 "register_operand" "=w") - (mem:GPF (plus:P (match_dup 1) - (match_operand:P 5 "const_int_operand" "n"))))])] - "INTVAL (operands[5]) == GET_MODE_SIZE (mode)" - "ldp\\t%2, %3, [%1], %4" - [(set_attr "type" "neon_load1_2reg")] -) - -(define_insn "loadwb_pair_" + [(set (match_operand 0 "pmode_register_operand") + (match_operator 8 "pmode_plus_operator" [ + (match_operand 1 "pmode_register_operand") + (match_operand 4 "const_int_operand")])) + (set (match_operand:GPI 2 "aarch64_ldp_reg_operand") + (match_operator 6 "memory_operand" [ + (match_operator 10 "pmode_plus_operator" [ + (match_dup 1) + (match_dup 4) + ])])) + (set (match_operand:GPI 3 "aarch64_ldp_reg_operand") + (match_operator 7 "memory_operand" [ + (match_operator 9 "pmode_plus_operator" [ + (match_dup 1) + (match_operand 5 "const_int_operand") + ])]))])] + "aarch64_mem_pair_offset (operands[4], mode) + && known_eq (INTVAL (operands[5]), + INTVAL (operands[4]) + GET_MODE_SIZE (mode)) + && !reg_overlap_mentioned_p (operands[0], operands[2]) + && 
!reg_overlap_mentioned_p (operands[0], operands[3])" + {@ [cons: =&0, 1, =2, =3; attrs: type ] + [ rk, 0, r, r; load_] ldp\t%2, %3, [%0, %4]! + [ rk, 0, w, w; neon_load1_2reg ] ldp\t%2, %3, [%0, %4]! + } +) + +;; q-register variant of the above +(define_insn "*loadwb_pre_pair_16" [(parallel - [(set (match_operand:P 0 "register_operand" "=k") - (plus:P (match_operand:P 1 "register_operand" "0") - (match_operand:P 4 "aarch64_mem_pair_offset" "n"))) - (set (match_operand:TX 2 "register_operand" "=w") - (mem:TX (match_dup 1))) - (set (match_operand:TX 3 "register_operand" "=w") - (mem:TX (plus:P (match_dup 1) - (match_operand:P 5 "const_int_operand" "n"))))])] - "TARGET_SIMD && INTVAL (operands[5]) == GET_MODE_SIZE (mode)" - "ldp\\t%q2, %q3, [%1], %4" + [(set (match_operand 0 "pmode_register_operand" "=&rk") + (match_operator 8 "pmode_plus_operator" [ + (match_operand 1 "pmode_register_operand" "0") + (match_operand 4 "const_int_operand")])) + (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w") + (match_operator 6 "memory_operand" [ + (match_operator 10 "pmode_plus_operator" [ + (match_dup 1) + (match_dup 4) + ])])) + (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w") + (match_operator 7 "memory_operand" [ + (match_operator 9 "pmode_plus_operator" [ + (match_dup 1) + (match_operand 5 "const_int_operand") + ])]))])] + "TARGET_FLOAT + && aarch64_mem_pair_offset (operands[4], TImode) + && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16) + && !reg_overlap_mentioned_p (operands[0], operands[2]) + && !reg_overlap_mentioned_p (operands[0], operands[3])" + "ldp\t%q2, %q3, [%0, %4]!" [(set_attr "type" "neon_ldp_q")] ) ;; Store pair with pre-index writeback. This is primarily used in function ;; prologues. 
-(define_insn "storewb_pair_" +(define_insn "*storewb_pre_pair_" [(parallel - [(set (match_operand:P 0 "register_operand" "=&k") - (plus:P (match_operand:P 1 "register_operand" "0") - (match_operand:P 4 "aarch64_mem_pair_offset" "n"))) - (set (mem:GPI (plus:P (match_dup 0) - (match_dup 4))) - (match_operand:GPI 2 "register_operand" "r")) - (set (mem:GPI (plus:P (match_dup 0) - (match_operand:P 5 "const_int_operand" "n"))) - (match_operand:GPI 3 "register_operand" "r"))])] - "INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (mode)" - "stp\\t%2, %3, [%0, %4]!" - [(set_attr "type" "store_")] + [(set (match_operand 0 "pmode_register_operand") + (match_operator 6 "pmode_plus_operator" [ + (match_operand 1 "pmode_register_operand") + (match_operand 4 "const_int_operand") + ])) + (set (match_operator:GPI 7 "aarch64_mem_pair_operator" [ + (match_operator 8 "pmode_plus_operator" [ + (match_dup 0) + (match_dup 4) + ])]) + (match_operand:GPI 2 "aarch64_stp_reg_operand")) + (set (match_operator:GPI 9 "aarch64_mem_pair_operator" [ + (match_operator 10 "pmode_plus_operator" [ + (match_dup 0) + (match_operand 5 "const_int_operand") + ])]) + (match_operand:GPI 3 "aarch64_stp_reg_operand"))])] + "aarch64_mem_pair_offset (operands[4], mode) + && known_eq (INTVAL (operands[5]), + INTVAL (operands[4]) + GET_MODE_SIZE (mode)) + && !reg_overlap_mentioned_p (operands[0], operands[2]) + && !reg_overlap_mentioned_p (operands[0], operands[3])" + {@ [cons: =&0, 1, 2, 3; attrs: type ] + [ rk, 0, rYZ, rYZ; store_] stp\t%2, %3, [%0, %4]! + [ rk, 0, w, w; neon_store1_2reg ] stp\t%2, %3, [%0, %4]! + } +) + +;; q-register variant of the above. 
+(define_insn "*storewb_pre_pair_16" + [(parallel + [(set (match_operand 0 "pmode_register_operand" "=&rk") + (match_operator 6 "pmode_plus_operator" [ + (match_operand 1 "pmode_register_operand" "0") + (match_operand 4 "const_int_operand") + ])) + (set (match_operator:TI 7 "aarch64_mem_pair_operator" [ + (match_operator 8 "pmode_plus_operator" [ + (match_dup 0) + (match_dup 4) + ])]) + (match_operand:TI 2 "aarch64_ldp_reg_operand" "w")) + (set (match_operator:TI 9 "aarch64_mem_pair_operator" [ + (match_operator 10 "pmode_plus_operator" [ + (match_dup 0) + (match_operand 5 "const_int_operand") + ])]) + (match_operand:TI 3 "aarch64_ldp_reg_operand" "w"))])] + "TARGET_FLOAT + && aarch64_mem_pair_offset (operands[4], TImode) + && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16) + && !reg_overlap_mentioned_p (operands[0], operands[2]) + && !reg_overlap_mentioned_p (operands[0], operands[3])" + "stp\\t%q2, %q3, [%0, %4]!" + [(set_attr "type" "neon_stp_q")] ) -(define_insn "storewb_pair_" +;; Store pair with post-index writeback. +(define_insn "*storewb_post_pair_" [(parallel - [(set (match_operand:P 0 "register_operand" "=&k") - (plus:P (match_operand:P 1 "register_operand" "0") - (match_operand:P 4 "aarch64_mem_pair_offset" "n"))) - (set (mem:GPF (plus:P (match_dup 0) - (match_dup 4))) - (match_operand:GPF 2 "register_operand" "w")) - (set (mem:GPF (plus:P (match_dup 0) - (match_operand:P 5 "const_int_operand" "n"))) - (match_operand:GPF 3 "register_operand" "w"))])] - "INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (mode)" - "stp\\t%2, %3, [%0, %4]!" 
- [(set_attr "type" "neon_store1_2reg")] -) - -(define_insn "storewb_pair_" + [(set (match_operand 0 "pmode_register_operand") + (match_operator 5 "pmode_plus_operator" [ + (match_operand 1 "pmode_register_operand") + (match_operand 4 "const_int_operand") + ])) + (set (match_operator:GPI 6 "aarch64_mem_pair_operator" [(match_dup 1)]) + (match_operand 2 "aarch64_stp_reg_operand")) + (set (match_operator:GPI 7 "aarch64_mem_pair_operator" [ + (match_operator 8 "pmode_plus_operator" [ + (match_dup 0) + (const_int ) + ])]) + (match_operand 3 "aarch64_stp_reg_operand"))])] + "aarch64_mem_pair_offset (operands[4], mode) + && !reg_overlap_mentioned_p (operands[0], operands[2]) + && !reg_overlap_mentioned_p (operands[0], operands[3])" + {@ [cons: =0, 1, 2, 3; attrs: type ] + [ rk, 0, rYZ, rYZ; store_] stp\t%2, %3, [%0], %4 + [ rk, 0, w, w; neon_store1_2reg ] stp\t%2, %3, [%0], %4 + } +) + +;; Store pair with post-index writeback. +(define_insn "*storewb_post_pair_16" [(parallel - [(set (match_operand:P 0 "register_operand" "=&k") - (plus:P (match_operand:P 1 "register_operand" "0") - (match_operand:P 4 "aarch64_mem_pair_offset" "n"))) - (set (mem:TX (plus:P (match_dup 0) - (match_dup 4))) - (match_operand:TX 2 "register_operand" "w")) - (set (mem:TX (plus:P (match_dup 0) - (match_operand:P 5 "const_int_operand" "n"))) - (match_operand:TX 3 "register_operand" "w"))])] - "TARGET_SIMD - && INTVAL (operands[5]) - == INTVAL (operands[4]) + GET_MODE_SIZE (mode)" - "stp\\t%q2, %q3, [%0, %4]!" 
+ [(set (match_operand 0 "pmode_register_operand" "=rk") + (match_operator 5 "pmode_plus_operator" [ + (match_operand 1 "pmode_register_operand" "0") + (match_operand 4 "const_int_operand") + ])) + (set (match_operator:TI 6 "aarch64_mem_pair_operator" [(match_dup 1)]) + (match_operand:TI 2 "aarch64_ldp_reg_operand" "w")) + (set (match_operator:TI 7 "aarch64_mem_pair_operator" [ + (match_operator 8 "pmode_plus_operator" [ + (match_dup 0) + (const_int 16) + ])]) + (match_operand:TI 3 "aarch64_ldp_reg_operand" "w"))])] + "TARGET_FLOAT + && aarch64_mem_pair_offset (operands[4], TImode) + && !reg_overlap_mentioned_p (operands[0], operands[2]) + && !reg_overlap_mentioned_p (operands[0], operands[3])" + "stp\t%q2, %q3, [%0], %4" [(set_attr "type" "neon_stp_q")] ) diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index a73724a7fc0..b647e5af7c6 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -257,11 +257,49 @@ (define_predicate "aarch64_mem_pair_offset" (and (match_code "const_int") (match_test "aarch64_offset_7bit_signed_scaled_p (mode, INTVAL (op))"))) +(define_special_predicate "aarch64_mem_pair_operator" + (and + (match_code "mem") + (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))") + (ior + (match_test "mode == VOIDmode") + (match_test "known_eq (GET_MODE_SIZE (mode), + GET_MODE_SIZE (GET_MODE (op)))")))) + (define_predicate "aarch64_mem_pair_operand" (and (match_code "mem") (match_test "aarch64_legitimate_address_p (mode, XEXP (op, 0), false, ADDR_QUERY_LDP_STP)"))) +(define_predicate "pmode_plus_operator" + (and (match_code "plus") + (match_test "GET_MODE (op) == Pmode"))) + +(define_special_predicate "aarch64_ldp_reg_operand" + (and + (match_code "reg,subreg") + (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))") + (ior + (match_test "mode == VOIDmode") + (match_test "known_eq (GET_MODE_SIZE (mode), + GET_MODE_SIZE (GET_MODE (op)))")))) + +(define_special_predicate 
"aarch64_stp_reg_operand" + (ior (match_operand 0 "aarch64_ldp_reg_operand") + (and (ior + (and (match_code "const_int,const,const_vector") + (match_test "op == CONST0_RTX (GET_MODE (op))")) + (and (match_code "const_double") + (match_test "aarch64_float_const_zero_rtx_p (op)"))) + (ior + (match_test "GET_MODE (op) == VOIDmode") + (and + (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))") + (ior + (match_test "mode == VOIDmode") + (match_test "known_eq (GET_MODE_SIZE (mode), + GET_MODE_SIZE (GET_MODE (op)))"))))))) + ;; Used for storing two 64-bit values in an AdvSIMD register using an STP ;; as a 128-bit vec_concat. (define_predicate "aarch64_mem_pair_lanes_operand" From patchwork Thu Nov 16 18:10:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Coplan X-Patchwork-Id: 1864874 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=HR0DoWWR; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=HR0DoWWR; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 
b=R05Uvb9Esqb86lvs8ybLxNwLW3IdeiGE4jjGS0/pPSsJSYaXa0ZOMwI2Ewv7jhlKC+IxyybN3xEljFvknUWryI8XCzkz9y5mBpWeUgvXNZN7a8QeZC/0/i9EUlkuhSZgVjm8tiXw+lYDADqbWLqgKfHmPHlRYmlZ08qWwpQKChbwchCgUwTp1C7641V0EDKuGD6/zDiZQTVD2vikpXJ/kjXx8GPfTEBDuyP50d7wQI4xoYG3ySPkvkR3YAxyby5c01CrDzjBFgoAAw6Ofb7k7IO2kot4UwPP9vz66YC8rpM0nFv81SvDRIwVLc5ZZQJ/hIjlMrpidRQszN4OahpKOQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ChBANwPMtGKRsPcrYcKmd2uZa/Q/i4YNqAbIgIgt4H8=; b=jjKyfouJWlNCePZiq48mmViGANWnmmMxHTxLwbd5Amq+PfOLnBxofCvXbtLa0F0bB/vIWdUDYlHeJcMyqPl3HU0Nsuqv3qX15R2Xw9fjxqdEu1yFlUDi/PVmmZItQVdcTHSkj1NqG67ous9ef7VTDR+QRq4ObS03B46ZOA3ikADLrTb3cMNpuSQo0SxJ6STq4NG7NDVJzOwbyTMt0WQsd0pOaIJEW8xZtLIiL6ywjk7C0HyvdTj8ngGfQ2PjWhd8l/MjoqDHSu8O1M17IUvXrZ7pqCP/+dK50aq6hkZhg1Tj0lFPj19zqOLzyil3EX5MglRbs/qAu0ioNc/QjU64vQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ChBANwPMtGKRsPcrYcKmd2uZa/Q/i4YNqAbIgIgt4H8=; b=HR0DoWWRpg1E+xNlRZDKEUe8MYIMdl+e2NTIJ32bCZHt+ZOIr/479h3ijMo5NAt8I8mjVc59LCX8gZa95NCwy9yf0pfPF7P4GBdotVXxbSpzoR4ltuUPyJcp+g0+NhMIQdtlHw0MOQMHQ740s851kqNDeCy5hPry9b4GmSmfHUA= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) by GV1PR08MB10424.eurprd08.prod.outlook.com (2603:10a6:150:15e::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6977.31; Thu, 16 Nov 2023 
18:10:19 +0000 Received: from PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::8512:cc10:24d4:1919]) by PAWPR08MB8958.eurprd08.prod.outlook.com ([fe80::8512:cc10:24d4:1919%5]) with mapi id 15.20.6977.029; Thu, 16 Nov 2023 18:10:19 +0000 Date: Thu, 16 Nov 2023 18:10:15 +0000 From: Alex Coplan To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford , Kyrylo Tkachov Subject: [PATCH 09/11] aarch64: Rewrite non-writeback ldp/stp patterns Message-ID: Content-Disposition: inline X-ClientProxiedBy: LO2P265CA0387.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:f::15) To PAWPR08MB8958.eurprd08.prod.outlook.com (2603:10a6:102:33e::15) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: PAWPR08MB8958:EE_|GV1PR08MB10424:EE_|AM1PEPF000252DF:EE_|PAVPR08MB9282:EE_ X-MS-Office365-Filtering-Correlation-Id: a7cc6e8c-a3a6-4d74-f6d1-08dbe6cf50eb x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: GjBNS7rm/Wn/tTAyW3DgR4MyyU5xSIZ8neWlRq1fqAvyCYJMKkR/YkNwIjisvazuQ3T0ex4v3um6IhsBnk3m0Zu2ESL4wUFgeINswvR7en6GwZM3NxMe/uIjv+cykFFsD6bNKbPToletJ4H5Yu2/RMfdUbimXSoTx3BWNRS66bGuYcIpLC0RwZiL4pL2TkBWNdGCz5imIOLZQDDAWBpJrvjZIV5ThAfv4Nix5DX1Ynj8Zl2LbPt1E17rsvUGo7g2XcA0pFOFRJeNxMzT6IWvNwrUodBimTnCr1LTVSCm2iel5uB31v2hlBpY0SZIBVs6crtiAQsMmJ+Q9FbRTUOQFkfyxgPbgcG/t+XVFEzdxu+DsfD0gvuUngjiXO78h6E4jhoxxKHsrMVmoGklP+tZBddu/YlpyGPqrdbfqXZ/qxjdlsuebSBmip3F8eaW+pK/Y+dgx1s+ABuebFfj06t4mIGCTfR7hNEkQWMOxuqdB+q+3f8jiWLd0K42cX56r0nOfYQ0eKVM7BdSxHL1JQpk444HhxPUgjFttpztoQuyxnhKSLC+t98BhxcKON3PKWdKYoG4G9LSpLAS76qZwS53RTri+jOs8GkNzUwBuwv0WebaP78lFa80ZuhMJCjxTo81 X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:PAWPR08MB8958.eurprd08.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(13230031)(376002)(366004)(396003)(39860400002)(346002)(136003)(230922051799003)(1800799009)(64100799003)(186009)(451199024)(2906002)(66556008)(478600001)(38100700002)(41300700001)(66476007)(66946007)(36756003)(86362001)(54906003)(6916009)(6486002)(316002)(2616005)(235185007)(5660300002)(6512007)(44144004)(26005)(6506007)(33964004)(6666004)(44832011)(4326008)(8936002)(83380400001)(8676002)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV1PR08MB10424 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM1PEPF000252DF.eurprd07.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 6290a6a8-a843-4ba5-150a-08dbe6cf4af3 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 8pDl2QxrbBMhWrUblsTiYAsp5rwQfydWMBDPkzfZSiWtUmw9XkRs1cZMmP6rxMGUrEGSLsc9TdxwXv+S8Xvuq3VaQzKxAp96BBfZq/QPda6hPesKjdXXv2eacdI0+qQ4VizYMq/fGH9QD/90DL17Q7RX0xIWqTdMG1ACxUpKLu4a1IosMkYNk8czKBmGgt8pDKVhi8KU9GLhuL2rsU5enqSL43QLMSv7+eTQeL48EzZw2DRhNWeFnJQ6Ygvl6mQaB0szNm6P7U7NOwO8YsfXTH5XO6ia04GHcV3aSMr9UduNbBV+5YaDl6NxW9s0w7deOVOcPqZUmPHNuV5HsoMx3rqS0bQNcpHmi9mqsmAtulzbIUZyaZWDrdVTVX1iChJtC3Q9FTSOIz7xt+DOrDn4YS9XkjDaS2b5A2gnL7xcgwsg6O1BKYwxhTQ+SfdhLkzJKvS+Y+h2tAmFbDqRXeotmwEaOPR74T/wMbKCQvs+kEhK3yoCXIuuyih8tVCi0VmVa5zQDmVnJ7QAvaY+FIjz4OFSd/GpjvyvSZ376TenU0+PkHt7oxT0uJk0OTkZIuhT85ALfBXjiGXLKAwkAyHsskfUyySPvDC53KBzhRvBbw63mNiElOqNRnXSsHZofz3m7mElwWi1VjrrLiOxCBtNvP76lWb2bONwbpAdtdlfWXS7fR0SvUttYHj6MP6dEaByCxwOu0RcTgYqAAXk0hdG4TAcArNVZ4Kr6rgE22DQ9uv1ELbzlIw1toi5+/bl0zDGkHYazZiGgdiocihoJ3P5nQ== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; 
SFS:(13230031)(4636009)(346002)(39860400002)(396003)(136003)(376002)(230922051799003)(451199024)(64100799003)(1800799009)(186009)(82310400011)(40470700004)(46966006)(36840700001)(235185007)(2906002)(40460700003)(4326008)(54906003)(8936002)(70206006)(6916009)(70586007)(86362001)(5660300002)(316002)(44832011)(8676002)(36756003)(81166007)(83380400001)(478600001)(40480700001)(6486002)(2616005)(6666004)(36860700001)(356005)(6512007)(33964004)(44144004)(41300700001)(47076005)(26005)(6506007)(336012)(82740400003)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 16 Nov 2023 18:10:29.2692 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: a7cc6e8c-a3a6-4d74-f6d1-08dbe6cf50eb X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AM1PEPF000252DF.eurprd07.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAVPR08MB9282 X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org This patch overhauls the load/store pair patterns with two main goals: 1. 
Fixing a correctness issue (the current patterns are not RA-friendly).
2. Allowing more flexibility in which operand modes are supported, and
   which combinations of modes are allowed in the two arms of the
   load/store pair, while reducing the number of patterns required both
   in the source and in the generated code.

The correctness issue (1) arises because the current patterns have two
independent memory operands tied together only by a predicate on the
insns. Since LRA only looks at the constraints, one of the memory
operands can get reloaded without the other one being changed, leading
to the insn becoming unrecognizable after reload.

We fix this issue by changing the patterns such that they only ever have
one memory operand representing the entire pair. For the store case, we
use an unspec to logically concatenate the register operands before
storing them. For the load case, we use unspecs to extract the "lanes"
from the pair mem, with the second occurrence of the mem matched using a
match_dup (such that there is still really only one memory operand as
far as the RA is concerned).

In terms of the modes used for the pair memory operands, we canonicalize
these to V2x4QImode, V2x8QImode, and V2x16QImode. These modes have not
only the correct size but also the correct alignment requirement for a
memory operand representing an entire load/store pair. Unlike the other
two, V2x4QImode didn't previously exist, so it had to be added with this
patch.

As with the previous patch generalizing the writeback patterns, this
patch aims to be flexible in the combinations of modes supported by the
patterns without requiring a large number of generated patterns by using
distinct mode iterators.

The new scheme means we only need a single (generated) pattern for each
load/store operation of a given operand size. For the 4-byte and 8-byte
operand cases, we use the GPI iterator to synthesize the two patterns.
The 16-byte case is implemented as a separate pattern in the source (due
to only having a single possible alternative).

Since the UNSPEC patterns can't be interpreted by the dwarf2cfi code, we
add REG_CFA_OFFSET notes to the store pair insns emitted by
aarch64_save_callee_saves, so that correct CFI information can still be
generated. Furthermore, we now unconditionally generate these CFA notes
on frame-related insns emitted by aarch64_save_callee_saves. This is
done in case the load/store pair pass forms these insns into pairs, in
which case the CFA notes would be needed.

We also adjust the ldp/stp peepholes to generate the new form. This is
done by switching the generation to use the
aarch64_gen_{load,store}_pair interface, making it easier to change the
form in the future if needed. (The upcoming aarch64 load/store pair
pass also makes use of this interface.)

This patch also adds an "ldpstp" attribute to the non-writeback
load/store pair patterns, which is used by the post-RA load/store pair
pass to identify existing patterns and see if they can be promoted to
writeback variants.

One potential concern with using unspecs for the patterns is that it can
block optimization by the generic RTL passes. This patch series tries
to mitigate this in two ways:
 1. The pre-RA load/store pair pass runs very late in the pre-RA
    pipeline.
 2. A later patch in the series adjusts the aarch64 mem{cpy,set}
    expansion to emit individual loads/stores instead of ldp/stp. These
    should then be formed back into load/store pairs much later in the
    RTL pipeline by the new load/store pair pass.

Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

	* config/aarch64/aarch64-ldpstp.md: Abstract ldp/stp
	representation from peepholes, allowing use of new form.
	* config/aarch64/aarch64-modes.def (V2x4QImode): Define.
	* config/aarch64/aarch64-protos.h
	(aarch64_finish_ldpstp_peephole): Declare.
	(aarch64_swap_ldrstr_operands): Delete declaration.
	(aarch64_gen_load_pair): Declare.
	(aarch64_gen_store_pair): Declare.
	* config/aarch64/aarch64-simd.md (load_pair): Delete.
	(vec_store_pair): Delete.
	(load_pair): Delete.
	(vec_store_pair): Delete.
	* config/aarch64/aarch64.cc (aarch64_pair_mode_for_mode): New.
	(aarch64_gen_store_pair): Adjust to use new unspec form of stp.
	Drop second mem from parameters.
	(aarch64_gen_load_pair): Likewise.
	(aarch64_pair_mem_from_base): New.
	(aarch64_save_callee_saves): Emit REG_CFA_OFFSET notes for
	frame-related saves. Adjust call to aarch64_gen_store_pair.
	(aarch64_restore_callee_saves): Adjust calls to
	aarch64_gen_load_pair to account for change in interface.
	(aarch64_process_components): Likewise.
	(aarch64_classify_address): Handle 32-byte pair mems in
	LDP_STP_N case.
	(aarch64_print_operand): Likewise.
	(aarch64_copy_one_block_and_progress_pointers): Adjust calls to
	account for change in aarch64_gen_{load,store}_pair interface.
	(aarch64_set_one_block_and_progress_pointer): Likewise.
	(aarch64_finish_ldpstp_peephole): New.
	(aarch64_gen_adjusted_ldpstp): Adjust to use generation helper.
	* config/aarch64/aarch64.md (ldpstp): New attribute.
	(load_pair_sw_): Delete.
	(load_pair_dw_): Delete.
	(load_pair_dw_): Delete.
	(*load_pair_): New.
	(*load_pair_16): New.
	(store_pair_sw_): Delete.
	(store_pair_dw_): Delete.
	(store_pair_dw_): Delete.
	(*store_pair_): New.
	(*store_pair_16): New.
	(*load_pair_extendsidi2_aarch64): Adjust to use new form.
	(*zero_extendsidi2_aarch64): Likewise.
	* config/aarch64/iterators.md (VPAIR): New.
	* config/aarch64/predicates.md (aarch64_mem_pair_operand): Change
	to a special predicate derived from aarch64_mem_pair_operator.
--- gcc/config/aarch64/aarch64-ldpstp.md | 66 +++---- gcc/config/aarch64/aarch64-modes.def | 6 +- gcc/config/aarch64/aarch64-protos.h | 5 +- gcc/config/aarch64/aarch64-simd.md | 60 ------- gcc/config/aarch64/aarch64.cc | 257 +++++++++++++++------------ gcc/config/aarch64/aarch64.md | 188 +++++++++----------- gcc/config/aarch64/iterators.md | 3 + gcc/config/aarch64/predicates.md | 10 +- 8 files changed, 270 insertions(+), 325 deletions(-) diff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md index 1ee7c73ff0c..dc39af85254 100644 --- a/gcc/config/aarch64/aarch64-ldpstp.md +++ b/gcc/config/aarch64/aarch64-ldpstp.md @@ -24,10 +24,10 @@ (define_peephole2 (set (match_operand:GPI 2 "register_operand" "") (match_operand:GPI 3 "memory_operand" ""))] "aarch64_operands_ok_for_ldpstp (operands, true, mode)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, true); + aarch64_finish_ldpstp_peephole (operands, true); + DONE; }) (define_peephole2 @@ -36,10 +36,10 @@ (define_peephole2 (set (match_operand:GPI 2 "memory_operand" "") (match_operand:GPI 3 "aarch64_reg_or_zero" ""))] "aarch64_operands_ok_for_ldpstp (operands, false, mode)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, false); + aarch64_finish_ldpstp_peephole (operands, false); + DONE; }) (define_peephole2 @@ -48,10 +48,10 @@ (define_peephole2 (set (match_operand:GPF 2 "register_operand" "") (match_operand:GPF 3 "memory_operand" ""))] "aarch64_operands_ok_for_ldpstp (operands, true, mode)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, true); + aarch64_finish_ldpstp_peephole (operands, true); + DONE; }) (define_peephole2 @@ -60,10 +60,10 @@ (define_peephole2 (set (match_operand:GPF 2 
"memory_operand" "") (match_operand:GPF 3 "aarch64_reg_or_fp_zero" ""))] "aarch64_operands_ok_for_ldpstp (operands, false, mode)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, false); + aarch64_finish_ldpstp_peephole (operands, false); + DONE; }) (define_peephole2 @@ -72,10 +72,10 @@ (define_peephole2 (set (match_operand:DREG2 2 "register_operand" "") (match_operand:DREG2 3 "memory_operand" ""))] "aarch64_operands_ok_for_ldpstp (operands, true, mode)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, true); + aarch64_finish_ldpstp_peephole (operands, true); + DONE; }) (define_peephole2 @@ -84,10 +84,10 @@ (define_peephole2 (set (match_operand:DREG2 2 "memory_operand" "") (match_operand:DREG2 3 "register_operand" ""))] "aarch64_operands_ok_for_ldpstp (operands, false, mode)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, false); + aarch64_finish_ldpstp_peephole (operands, false); + DONE; }) (define_peephole2 @@ -99,10 +99,10 @@ (define_peephole2 && aarch64_operands_ok_for_ldpstp (operands, true, mode) && (aarch64_tune_params.extra_tuning_flags & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS) == 0" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, true); + aarch64_finish_ldpstp_peephole (operands, true); + DONE; }) (define_peephole2 @@ -114,10 +114,10 @@ (define_peephole2 && aarch64_operands_ok_for_ldpstp (operands, false, mode) && (aarch64_tune_params.extra_tuning_flags & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS) == 0" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, false); + 
aarch64_finish_ldpstp_peephole (operands, false); + DONE; }) @@ -129,10 +129,10 @@ (define_peephole2 (set (match_operand:DI 2 "register_operand" "") (sign_extend:DI (match_operand:SI 3 "memory_operand" "")))] "aarch64_operands_ok_for_ldpstp (operands, true, SImode)" - [(parallel [(set (match_dup 0) (sign_extend:DI (match_dup 1))) - (set (match_dup 2) (sign_extend:DI (match_dup 3)))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, true); + aarch64_finish_ldpstp_peephole (operands, true, SIGN_EXTEND); + DONE; }) (define_peephole2 @@ -141,10 +141,10 @@ (define_peephole2 (set (match_operand:DI 2 "register_operand" "") (zero_extend:DI (match_operand:SI 3 "memory_operand" "")))] "aarch64_operands_ok_for_ldpstp (operands, true, SImode)" - [(parallel [(set (match_dup 0) (zero_extend:DI (match_dup 1))) - (set (match_dup 2) (zero_extend:DI (match_dup 3)))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, true); + aarch64_finish_ldpstp_peephole (operands, true, ZERO_EXTEND); + DONE; }) ;; Handle storing of a floating point zero with integer data. @@ -163,10 +163,10 @@ (define_peephole2 (set (match_operand: 2 "memory_operand" "") (match_operand: 3 "aarch64_reg_zero_or_fp_zero" ""))] "aarch64_operands_ok_for_ldpstp (operands, false, mode)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))])] + [(const_int 0)] { - aarch64_swap_ldrstr_operands (operands, false); + aarch64_finish_ldpstp_peephole (operands, false); + DONE; }) ;; Handle consecutive load/store whose offset is out of the range diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def index 6b4f4e17dd5..1e0d770f72f 100644 --- a/gcc/config/aarch64/aarch64-modes.def +++ b/gcc/config/aarch64/aarch64-modes.def @@ -93,9 +93,13 @@ INT_MODE (XI, 64); /* V8DI mode. */ VECTOR_MODE_WITH_PREFIX (V, INT, DI, 8, 5); - ADJUST_ALIGNMENT (V8DI, 8); +/* V2x4QImode. Used in load/store pair patterns. 
*/ +VECTOR_MODE_WITH_PREFIX (V2x, INT, QI, 4, 5); +ADJUST_NUNITS (V2x4QI, 8); +ADJUST_ALIGNMENT (V2x4QI, 4); + /* Define Advanced SIMD modes for structures of 2, 3 and 4 d-registers. */ #define ADV_SIMD_D_REG_STRUCT_MODES(NVECS, VB, VH, VS, VD) \ VECTOR_MODES_WITH_PREFIX (V##NVECS##x, INT, 8, 3); \ diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index e463fd5c817..2ab54f244a7 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -967,6 +967,8 @@ void aarch64_split_compare_and_swap (rtx op[]); void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx); bool aarch64_gen_adjusted_ldpstp (rtx *, bool, machine_mode, RTX_CODE); +void aarch64_finish_ldpstp_peephole (rtx *, bool, + enum rtx_code = (enum rtx_code)0); void aarch64_expand_sve_vec_cmp_int (rtx, rtx_code, rtx, rtx); bool aarch64_expand_sve_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool); @@ -1022,8 +1024,9 @@ bool aarch64_mergeable_load_pair_p (machine_mode, rtx, rtx); bool aarch64_operands_ok_for_ldpstp (rtx *, bool, machine_mode); bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, machine_mode); bool aarch64_mem_ok_with_ldpstp_policy_model (rtx, bool, machine_mode); -void aarch64_swap_ldrstr_operands (rtx *, bool); bool aarch64_ldpstp_operand_mode_p (machine_mode); +rtx aarch64_gen_load_pair (rtx, rtx, rtx, enum rtx_code = (enum rtx_code)0); +rtx aarch64_gen_store_pair (rtx, rtx, rtx); extern void aarch64_asm_output_pool_epilogue (FILE *, const char *, tree, HOST_WIDE_INT); diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index c6f2d582837..6f5080ab030 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -231,38 +231,6 @@ (define_insn "aarch64_store_lane0" [(set_attr "type" "neon_store1_1reg")] ) -(define_insn "load_pair" - [(set (match_operand:DREG 0 "register_operand") - (match_operand:DREG 1 "aarch64_mem_pair_operand")) - (set 
(match_operand:DREG2 2 "register_operand") - (match_operand:DREG2 3 "memory_operand"))] - "TARGET_FLOAT - && rtx_equal_p (XEXP (operands[3], 0), - plus_constant (Pmode, - XEXP (operands[1], 0), - GET_MODE_SIZE (mode)))" - {@ [ cons: =0 , 1 , =2 , 3 ; attrs: type ] - [ w , Ump , w , m ; neon_ldp ] ldp\t%d0, %d2, %z1 - [ r , Ump , r , m ; load_16 ] ldp\t%x0, %x2, %z1 - } -) - -(define_insn "vec_store_pair" - [(set (match_operand:DREG 0 "aarch64_mem_pair_operand") - (match_operand:DREG 1 "register_operand")) - (set (match_operand:DREG2 2 "memory_operand") - (match_operand:DREG2 3 "register_operand"))] - "TARGET_FLOAT - && rtx_equal_p (XEXP (operands[2], 0), - plus_constant (Pmode, - XEXP (operands[0], 0), - GET_MODE_SIZE (mode)))" - {@ [ cons: =0 , 1 , =2 , 3 ; attrs: type ] - [ Ump , w , m , w ; neon_stp ] stp\t%d1, %d3, %z0 - [ Ump , r , m , r ; store_16 ] stp\t%x1, %x3, %z0 - } -) - (define_insn "aarch64_simd_stp" [(set (match_operand:VP_2E 0 "aarch64_mem_pair_lanes_operand") (vec_duplicate:VP_2E (match_operand: 1 "register_operand")))] @@ -273,34 +241,6 @@ (define_insn "aarch64_simd_stp" } ) -(define_insn "load_pair" - [(set (match_operand:VQ 0 "register_operand" "=w") - (match_operand:VQ 1 "aarch64_mem_pair_operand" "Ump")) - (set (match_operand:VQ2 2 "register_operand" "=w") - (match_operand:VQ2 3 "memory_operand" "m"))] - "TARGET_FLOAT - && rtx_equal_p (XEXP (operands[3], 0), - plus_constant (Pmode, - XEXP (operands[1], 0), - GET_MODE_SIZE (mode)))" - "ldp\\t%q0, %q2, %z1" - [(set_attr "type" "neon_ldp_q")] -) - -(define_insn "vec_store_pair" - [(set (match_operand:VQ 0 "aarch64_mem_pair_operand" "=Ump") - (match_operand:VQ 1 "register_operand" "w")) - (set (match_operand:VQ2 2 "memory_operand" "=m") - (match_operand:VQ2 3 "register_operand" "w"))] - "TARGET_FLOAT - && rtx_equal_p (XEXP (operands[2], 0), - plus_constant (Pmode, - XEXP (operands[0], 0), - GET_MODE_SIZE (mode)))" - "stp\\t%q1, %q3, %z0" - [(set_attr "type" "neon_stp_q")] -) - (define_expand 
"@aarch64_split_simd_mov" [(set (match_operand:VQMOV 0) (match_operand:VQMOV 1))] diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index ccf081d2a16..1f6094bf1bc 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -9056,59 +9056,81 @@ aarch64_pop_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment, } } -/* Generate and return a store pair instruction of mode MODE to store - register REG1 to MEM1 and register REG2 to MEM2. */ +static machine_mode +aarch64_pair_mode_for_mode (machine_mode mode) +{ + if (known_eq (GET_MODE_SIZE (mode), 4)) + return E_V2x4QImode; + else if (known_eq (GET_MODE_SIZE (mode), 8)) + return E_V2x8QImode; + else if (known_eq (GET_MODE_SIZE (mode), 16)) + return E_V2x16QImode; + else + gcc_unreachable (); +} static rtx -aarch64_gen_store_pair (machine_mode mode, rtx mem1, rtx reg1, rtx mem2, - rtx reg2) +aarch64_pair_mem_from_base (rtx mem) { - switch (mode) - { - case E_DImode: - return gen_store_pair_dw_didi (mem1, reg1, mem2, reg2); - - case E_DFmode: - return gen_store_pair_dw_dfdf (mem1, reg1, mem2, reg2); - - case E_TFmode: - return gen_store_pair_dw_tftf (mem1, reg1, mem2, reg2); + auto pair_mode = aarch64_pair_mode_for_mode (GET_MODE (mem)); + mem = adjust_bitfield_address_nv (mem, pair_mode, 0); + gcc_assert (aarch64_mem_pair_lanes_operand (mem, pair_mode)); + return mem; +} - case E_V4SImode: - return gen_vec_store_pairv4siv4si (mem1, reg1, mem2, reg2); +/* Generate and return a store pair instruction to store REG1 and REG2 + into memory starting at BASE_MEM. All three rtxes should have modes of the + same size. 
*/ - case E_V16QImode: - return gen_vec_store_pairv16qiv16qi (mem1, reg1, mem2, reg2); +rtx +aarch64_gen_store_pair (rtx base_mem, rtx reg1, rtx reg2) +{ + rtx pair_mem = aarch64_pair_mem_from_base (base_mem); - default: - gcc_unreachable (); - } + return gen_rtx_SET (pair_mem, + gen_rtx_UNSPEC (GET_MODE (pair_mem), + gen_rtvec (2, reg1, reg2), + UNSPEC_STP)); } -/* Generate and regurn a load pair isntruction of mode MODE to load register - REG1 from MEM1 and register REG2 from MEM2. */ +/* Generate and return a load pair instruction to load a pair of + registers starting at BASE_MEM into REG1 and REG2. If CODE is + UNKNOWN, all three rtxes should have modes of the same size. + Otherwise, CODE is {SIGN,ZERO}_EXTEND, base_mem should be in SImode, + and REG{1,2} should be in DImode. */ -static rtx -aarch64_gen_load_pair (machine_mode mode, rtx reg1, rtx mem1, rtx reg2, - rtx mem2) +rtx +aarch64_gen_load_pair (rtx reg1, rtx reg2, rtx base_mem, enum rtx_code code) { - switch (mode) - { - case E_DImode: - return gen_load_pair_dw_didi (reg1, mem1, reg2, mem2); + rtx pair_mem = aarch64_pair_mem_from_base (base_mem); - case E_DFmode: - return gen_load_pair_dw_dfdf (reg1, mem1, reg2, mem2); - - case E_TFmode: - return gen_load_pair_dw_tftf (reg1, mem1, reg2, mem2); + const bool any_extend_p = (code == ZERO_EXTEND || code == SIGN_EXTEND); + if (any_extend_p) + { + gcc_checking_assert (GET_MODE (base_mem) == SImode); + gcc_checking_assert (GET_MODE (reg1) == DImode); + gcc_checking_assert (GET_MODE (reg2) == DImode); + } + else + gcc_assert (code == UNKNOWN); + + rtx unspecs[2] = { + gen_rtx_UNSPEC (any_extend_p ? SImode : GET_MODE (reg1), + gen_rtvec (1, pair_mem), + UNSPEC_LDP_FST), + gen_rtx_UNSPEC (any_extend_p ? 
SImode : GET_MODE (reg2), + gen_rtvec (1, copy_rtx (pair_mem)), + UNSPEC_LDP_SND) + }; - case E_V4SImode: - return gen_load_pairv4siv4si (reg1, mem1, reg2, mem2); + if (any_extend_p) + for (int i = 0; i < 2; i++) + unspecs[i] = gen_rtx_fmt_e (code, DImode, unspecs[i]); - default: - gcc_unreachable (); - } + return gen_rtx_PARALLEL (VOIDmode, + gen_rtvec (2, + gen_rtx_SET (reg1, unspecs[0]), + gen_rtx_SET (reg2, unspecs[1]))); } /* Return TRUE if return address signing should be enabled for the current @@ -9321,8 +9343,19 @@ aarch64_save_callee_saves (poly_int64 bytes_below_sp, offset -= fp_offset; } rtx mem = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset)); - bool need_cfa_note_p = (base_rtx != stack_pointer_rtx); + rtx cfa_base = stack_pointer_rtx; + poly_int64 cfa_offset = sp_offset; + + if (hard_fp_valid_p && frame_pointer_needed) + { + cfa_base = hard_frame_pointer_rtx; + cfa_offset += (bytes_below_sp - frame.bytes_below_hard_fp); + } + + rtx cfa_mem = gen_frame_mem (mode, + plus_constant (Pmode, + cfa_base, cfa_offset)); unsigned int regno2; if (!aarch64_sve_mode_p (mode) && i + 1 < regs.size () @@ -9331,45 +9364,37 @@ aarch64_save_callee_saves (poly_int64 bytes_below_sp, frame.reg_offset[regno2] - frame.reg_offset[regno])) { rtx reg2 = gen_rtx_REG (mode, regno2); - rtx mem2; offset += GET_MODE_SIZE (mode); - mem2 = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset)); - insn = emit_insn (aarch64_gen_store_pair (mode, mem, reg, mem2, - reg2)); - - /* The first part of a frame-related parallel insn is - always assumed to be relevant to the frame - calculations; subsequent parts, are only - frame-related if explicitly marked. 
*/ + insn = emit_insn (aarch64_gen_store_pair (mem, reg, reg2)); + if (aarch64_emit_cfi_for_reg_p (regno2)) { - if (need_cfa_note_p) - aarch64_add_cfa_expression (insn, reg2, stack_pointer_rtx, - sp_offset + GET_MODE_SIZE (mode)); - else - RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0, 1)) = 1; + rtx cfa_mem2 = adjust_address_nv (cfa_mem, + Pmode, + GET_MODE_SIZE (mode)); + add_reg_note (insn, REG_CFA_OFFSET, + gen_rtx_SET (cfa_mem2, reg2)); } regno = regno2; ++i; } else if (mode == VNx2DImode && BYTES_BIG_ENDIAN) - { - insn = emit_insn (gen_aarch64_pred_mov (mode, mem, ptrue, reg)); - need_cfa_note_p = true; - } + insn = emit_insn (gen_aarch64_pred_mov (mode, mem, ptrue, reg)); else if (aarch64_sve_mode_p (mode)) insn = emit_insn (gen_rtx_SET (mem, reg)); else insn = emit_move_insn (mem, reg); RTX_FRAME_RELATED_P (insn) = frame_related_p; - if (frame_related_p && need_cfa_note_p) - aarch64_add_cfa_expression (insn, reg, stack_pointer_rtx, sp_offset); + + if (frame_related_p) + add_reg_note (insn, REG_CFA_OFFSET, gen_rtx_SET (cfa_mem, reg)); } } + /* Emit code to restore the callee registers in REGS, ignoring pop candidates and any other registers that are handled separately. Write the appropriate REG_CFA_RESTORE notes into CFI_OPS. 
@@ -9425,12 +9450,7 @@ aarch64_restore_callee_saves (poly_int64 bytes_below_sp, frame.reg_offset[regno2] - frame.reg_offset[regno])) { rtx reg2 = gen_rtx_REG (mode, regno2); - rtx mem2; - - offset += GET_MODE_SIZE (mode); - mem2 = gen_frame_mem (mode, plus_constant (Pmode, base_rtx, offset)); - emit_insn (aarch64_gen_load_pair (mode, reg, mem, reg2, mem2)); - + emit_insn (aarch64_gen_load_pair (reg, reg2, mem)); *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg2, *cfi_ops); regno = regno2; ++i; @@ -9762,9 +9782,9 @@ aarch64_process_components (sbitmap components, bool prologue_p) : gen_rtx_SET (reg2, mem2); if (prologue_p) - insn = emit_insn (aarch64_gen_store_pair (mode, mem, reg, mem2, reg2)); + insn = emit_insn (aarch64_gen_store_pair (mem, reg, reg2)); else - insn = emit_insn (aarch64_gen_load_pair (mode, reg, mem, reg2, mem2)); + insn = emit_insn (aarch64_gen_load_pair (reg, reg2, mem)); if (frame_related_p || frame_related2_p) { @@ -10983,12 +11003,18 @@ aarch64_classify_address (struct aarch64_address_info *info, mode of the corresponding addressing mode is half of that. */ if (type == ADDR_QUERY_LDP_STP_N) { - if (known_eq (GET_MODE_SIZE (mode), 16)) + if (known_eq (GET_MODE_SIZE (mode), 32)) + mode = V16QImode; + else if (known_eq (GET_MODE_SIZE (mode), 16)) mode = DFmode; else if (known_eq (GET_MODE_SIZE (mode), 8)) mode = SFmode; else return false; + + /* This isn't really an Advanced SIMD struct mode, but a mode + used to represent the complete mem in a load/store pair. 
*/ + advsimd_struct_p = false; } bool allow_reg_index_p = (!load_store_pair_p @@ -12609,7 +12635,8 @@ aarch64_print_operand (FILE *f, rtx x, int code) if (!MEM_P (x) || (code == 'y' && maybe_ne (GET_MODE_SIZE (mode), 8) - && maybe_ne (GET_MODE_SIZE (mode), 16))) + && maybe_ne (GET_MODE_SIZE (mode), 16) + && maybe_ne (GET_MODE_SIZE (mode), 32))) { output_operand_lossage ("invalid operand for '%%%c'", code); return; @@ -25431,10 +25458,8 @@ aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst, *src = adjust_address (*src, mode, 0); *dst = adjust_address (*dst, mode, 0); /* Emit the memcpy. */ - emit_insn (aarch64_gen_load_pair (mode, reg1, *src, reg2, - aarch64_progress_pointer (*src))); - emit_insn (aarch64_gen_store_pair (mode, *dst, reg1, - aarch64_progress_pointer (*dst), reg2)); + emit_insn (aarch64_gen_load_pair (reg1, reg2, *src)); + emit_insn (aarch64_gen_store_pair (*dst, reg1, reg2)); /* Move the pointers forward. */ *src = aarch64_move_pointer (*src, 32); *dst = aarch64_move_pointer (*dst, 32); @@ -25613,8 +25638,7 @@ aarch64_set_one_block_and_progress_pointer (rtx src, rtx *dst, /* "Cast" the *dst to the correct mode. */ *dst = adjust_address (*dst, mode, 0); /* Emit the memset. */ - emit_insn (aarch64_gen_store_pair (mode, *dst, src, - aarch64_progress_pointer (*dst), src)); + emit_insn (aarch64_gen_store_pair (*dst, src, src)); /* Move the pointers forward. 
*/ *dst = aarch64_move_pointer (*dst, 32); @@ -26812,6 +26836,22 @@ aarch64_swap_ldrstr_operands (rtx* operands, bool load) } } +void +aarch64_finish_ldpstp_peephole (rtx *operands, bool load_p, enum rtx_code code) +{ + aarch64_swap_ldrstr_operands (operands, load_p); + + if (load_p) + emit_insn (aarch64_gen_load_pair (operands[0], operands[2], + operands[1], code)); + else + { + gcc_assert (code == UNKNOWN); + emit_insn (aarch64_gen_store_pair (operands[0], operands[1], + operands[3])); + } +} + /* Taking X and Y to be HOST_WIDE_INT pointers, return the result of a comparison between the two. */ int @@ -26993,8 +27033,8 @@ bool aarch64_gen_adjusted_ldpstp (rtx *operands, bool load, machine_mode mode, RTX_CODE code) { - rtx base, offset_1, offset_3, t1, t2; - rtx mem_1, mem_2, mem_3, mem_4; + rtx base, offset_1, offset_3; + rtx mem_1, mem_2; rtx temp_operands[8]; HOST_WIDE_INT off_val_1, off_val_3, base_off, new_off_1, new_off_3, stp_off_upper_limit, stp_off_lower_limit, msize; @@ -27019,21 +27059,17 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool load, if (load) { mem_1 = copy_rtx (temp_operands[1]); - mem_2 = copy_rtx (temp_operands[3]); - mem_3 = copy_rtx (temp_operands[5]); - mem_4 = copy_rtx (temp_operands[7]); + mem_2 = copy_rtx (temp_operands[5]); } else { mem_1 = copy_rtx (temp_operands[0]); - mem_2 = copy_rtx (temp_operands[2]); - mem_3 = copy_rtx (temp_operands[4]); - mem_4 = copy_rtx (temp_operands[6]); + mem_2 = copy_rtx (temp_operands[4]); gcc_assert (code == UNKNOWN); } extract_base_offset_in_addr (mem_1, &base, &offset_1); - extract_base_offset_in_addr (mem_3, &base, &offset_3); + extract_base_offset_in_addr (mem_2, &base, &offset_3); gcc_assert (base != NULL_RTX && offset_1 != NULL_RTX && offset_3 != NULL_RTX); @@ -27097,63 +27133,48 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool load, replace_equiv_address_nv (mem_1, plus_constant (Pmode, operands[8], new_off_1), true); replace_equiv_address_nv (mem_2, plus_constant (Pmode, operands[8], - 
new_off_1 + msize), true); - replace_equiv_address_nv (mem_3, plus_constant (Pmode, operands[8], new_off_3), true); - replace_equiv_address_nv (mem_4, plus_constant (Pmode, operands[8], - new_off_3 + msize), true); if (!aarch64_mem_pair_operand (mem_1, mode) - || !aarch64_mem_pair_operand (mem_3, mode)) + || !aarch64_mem_pair_operand (mem_2, mode)) return false; - if (code == ZERO_EXTEND) - { - mem_1 = gen_rtx_ZERO_EXTEND (DImode, mem_1); - mem_2 = gen_rtx_ZERO_EXTEND (DImode, mem_2); - mem_3 = gen_rtx_ZERO_EXTEND (DImode, mem_3); - mem_4 = gen_rtx_ZERO_EXTEND (DImode, mem_4); - } - else if (code == SIGN_EXTEND) - { - mem_1 = gen_rtx_SIGN_EXTEND (DImode, mem_1); - mem_2 = gen_rtx_SIGN_EXTEND (DImode, mem_2); - mem_3 = gen_rtx_SIGN_EXTEND (DImode, mem_3); - mem_4 = gen_rtx_SIGN_EXTEND (DImode, mem_4); - } - if (load) { operands[0] = temp_operands[0]; operands[1] = mem_1; operands[2] = temp_operands[2]; - operands[3] = mem_2; operands[4] = temp_operands[4]; - operands[5] = mem_3; + operands[5] = mem_2; operands[6] = temp_operands[6]; - operands[7] = mem_4; } else { operands[0] = mem_1; operands[1] = temp_operands[1]; - operands[2] = mem_2; operands[3] = temp_operands[3]; - operands[4] = mem_3; + operands[4] = mem_2; operands[5] = temp_operands[5]; - operands[6] = mem_4; operands[7] = temp_operands[7]; } /* Emit adjusting instruction. */ emit_insn (gen_rtx_SET (operands[8], plus_constant (DImode, base, base_off))); /* Emit ldp/stp instructions. 
*/ - t1 = gen_rtx_SET (operands[0], operands[1]); - t2 = gen_rtx_SET (operands[2], operands[3]); - emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, t1, t2))); - t1 = gen_rtx_SET (operands[4], operands[5]); - t2 = gen_rtx_SET (operands[6], operands[7]); - emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, t1, t2))); + if (load) + { + emit_insn (aarch64_gen_load_pair (operands[0], operands[2], + operands[1], code)); + emit_insn (aarch64_gen_load_pair (operands[4], operands[6], + operands[5], code)); + } + else + { + emit_insn (aarch64_gen_store_pair (operands[0], operands[1], + operands[3])); + emit_insn (aarch64_gen_store_pair (operands[4], operands[5], + operands[7])); + } return true; } diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index c92a51690c5..ffb6b0ba749 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -175,6 +175,9 @@ (define_c_enum "unspec" [ UNSPEC_GOTSMALLTLS UNSPEC_GOTTINYPIC UNSPEC_GOTTINYTLS + UNSPEC_STP + UNSPEC_LDP_FST + UNSPEC_LDP_SND UNSPEC_LD1 UNSPEC_LD2 UNSPEC_LD2_DREG @@ -453,6 +456,11 @@ (define_attr "predicated" "yes,no" (const_string "no")) ;; may chose to hold the tracking state encoded in SP. (define_attr "speculation_barrier" "true,false" (const_string "false")) +;; Attribute use to identify load pair and store pair instructions. +;; Currently the attribute is only applied to the non-writeback ldp/stp +;; patterns. +(define_attr "ldpstp" "ldp,stp,none" (const_string "none")) + ;; ------------------------------------------------------------------- ;; Pipeline descriptions and scheduling ;; ------------------------------------------------------------------- @@ -1735,100 +1743,62 @@ (define_expand "setmemdi" FAIL; }) -;; Operands 1 and 3 are tied together by the final condition; so we allow -;; fairly lax checking on the second memory operation. 
-(define_insn "load_pair_sw_" - [(set (match_operand:SX 0 "register_operand") - (match_operand:SX 1 "aarch64_mem_pair_operand")) - (set (match_operand:SX2 2 "register_operand") - (match_operand:SX2 3 "memory_operand"))] - "rtx_equal_p (XEXP (operands[3], 0), - plus_constant (Pmode, - XEXP (operands[1], 0), - GET_MODE_SIZE (mode)))" - {@ [ cons: =0 , 1 , =2 , 3 ; attrs: type , arch ] - [ r , Ump , r , m ; load_8 , * ] ldp\t%w0, %w2, %z1 - [ w , Ump , w , m ; neon_load1_2reg , fp ] ldp\t%s0, %s2, %z1 - } -) - -;; Storing different modes that can still be merged -(define_insn "load_pair_dw_" - [(set (match_operand:DX 0 "register_operand") - (match_operand:DX 1 "aarch64_mem_pair_operand")) - (set (match_operand:DX2 2 "register_operand") - (match_operand:DX2 3 "memory_operand"))] - "rtx_equal_p (XEXP (operands[3], 0), - plus_constant (Pmode, - XEXP (operands[1], 0), - GET_MODE_SIZE (mode)))" - {@ [ cons: =0 , 1 , =2 , 3 ; attrs: type , arch ] - [ r , Ump , r , m ; load_16 , * ] ldp\t%x0, %x2, %z1 - [ w , Ump , w , m ; neon_load1_2reg , fp ] ldp\t%d0, %d2, %z1 - } -) - -(define_insn "load_pair_dw_" - [(set (match_operand:TX 0 "register_operand" "=w") - (match_operand:TX 1 "aarch64_mem_pair_operand" "Ump")) - (set (match_operand:TX2 2 "register_operand" "=w") - (match_operand:TX2 3 "memory_operand" "m"))] - "TARGET_SIMD - && rtx_equal_p (XEXP (operands[3], 0), - plus_constant (Pmode, - XEXP (operands[1], 0), - GET_MODE_SIZE (mode)))" - "ldp\\t%q0, %q2, %z1" +(define_insn "*load_pair_" + [(set (match_operand:GPI 0 "aarch64_ldp_reg_operand") + (unspec [ + (match_operand: 1 "aarch64_mem_pair_lanes_operand") + ] UNSPEC_LDP_FST)) + (set (match_operand:GPI 2 "aarch64_ldp_reg_operand") + (unspec [ + (match_dup 1) + ] UNSPEC_LDP_SND))] + "" + {@ [cons: =0, 1, =2; attrs: type, arch] + [ r, Umn, r; load_, * ] ldp\t%0, %2, %y1 + [ w, Umn, w; neon_load1_2reg, fp ] ldp\t%0, %2, %y1 + } + [(set_attr "ldpstp" "ldp")] +) + +(define_insn "*load_pair_16" + [(set (match_operand:TI 0 
"aarch64_ldp_reg_operand" "=w") + (unspec [ + (match_operand:V2x16QI 1 "aarch64_mem_pair_lanes_operand" "Umn") + ] UNSPEC_LDP_FST)) + (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w") + (unspec [ + (match_dup 1) + ] UNSPEC_LDP_SND))] + "TARGET_FLOAT" + "ldp\\t%q0, %q2, %y1" [(set_attr "type" "neon_ldp_q") - (set_attr "fp" "yes")] -) - -;; Operands 0 and 2 are tied together by the final condition; so we allow -;; fairly lax checking on the second memory operation. -(define_insn "store_pair_sw_" - [(set (match_operand:SX 0 "aarch64_mem_pair_operand") - (match_operand:SX 1 "aarch64_reg_zero_or_fp_zero")) - (set (match_operand:SX2 2 "memory_operand") - (match_operand:SX2 3 "aarch64_reg_zero_or_fp_zero"))] - "rtx_equal_p (XEXP (operands[2], 0), - plus_constant (Pmode, - XEXP (operands[0], 0), - GET_MODE_SIZE (mode)))" - {@ [ cons: =0 , 1 , =2 , 3 ; attrs: type , arch ] - [ Ump , rYZ , m , rYZ ; store_8 , * ] stp\t%w1, %w3, %z0 - [ Ump , w , m , w ; neon_store1_2reg , fp ] stp\t%s1, %s3, %z0 - } -) - -;; Storing different modes that can still be merged -(define_insn "store_pair_dw_" - [(set (match_operand:DX 0 "aarch64_mem_pair_operand") - (match_operand:DX 1 "aarch64_reg_zero_or_fp_zero")) - (set (match_operand:DX2 2 "memory_operand") - (match_operand:DX2 3 "aarch64_reg_zero_or_fp_zero"))] - "rtx_equal_p (XEXP (operands[2], 0), - plus_constant (Pmode, - XEXP (operands[0], 0), - GET_MODE_SIZE (mode)))" - {@ [ cons: =0 , 1 , =2 , 3 ; attrs: type , arch ] - [ Ump , rYZ , m , rYZ ; store_16 , * ] stp\t%x1, %x3, %z0 - [ Ump , w , m , w ; neon_store1_2reg , fp ] stp\t%d1, %d3, %z0 - } -) - -(define_insn "store_pair_dw_" - [(set (match_operand:TX 0 "aarch64_mem_pair_operand" "=Ump") - (match_operand:TX 1 "register_operand" "w")) - (set (match_operand:TX2 2 "memory_operand" "=m") - (match_operand:TX2 3 "register_operand" "w"))] - "TARGET_SIMD && - rtx_equal_p (XEXP (operands[2], 0), - plus_constant (Pmode, - XEXP (operands[0], 0), - GET_MODE_SIZE (TFmode)))" - 
"stp\\t%q1, %q3, %z0" + (set_attr "fp" "yes") + (set_attr "ldpstp" "ldp")] +) + +(define_insn "*store_pair_" + [(set (match_operand: 0 "aarch64_mem_pair_lanes_operand") + (unspec: + [(match_operand:GPI 1 "aarch64_stp_reg_operand") + (match_operand:GPI 2 "aarch64_stp_reg_operand")] UNSPEC_STP))] + "" + {@ [cons: =0, 1, 2; attrs: type , arch] + [ Umn, rYZ, rYZ; store_, * ] stp\t%1, %2, %y0 + [ Umn, w, w; neon_store1_2reg , fp ] stp\t%1, %2, %y0 + } + [(set_attr "ldpstp" "stp")] +) + +(define_insn "*store_pair_16" + [(set (match_operand:V2x16QI 0 "aarch64_mem_pair_lanes_operand" "=Umn") + (unspec:V2x16QI + [(match_operand:TI 1 "aarch64_ldp_reg_operand" "w") + (match_operand:TI 2 "aarch64_ldp_reg_operand" "w")] UNSPEC_STP))] + "TARGET_FLOAT" + "stp\t%q1, %q2, %y0" [(set_attr "type" "neon_stp_q") - (set_attr "fp" "yes")] + (set_attr "fp" "yes") + (set_attr "ldpstp" "stp")] ) ;; Writeback load/store pair patterns. @@ -2074,14 +2044,15 @@ (define_insn "*extendsidi2_aarch64" (define_insn "*load_pair_extendsidi2_aarch64" [(set (match_operand:DI 0 "register_operand" "=r") - (sign_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump"))) + (sign_extend:DI (unspec:SI [ + (match_operand:V2x4QI 1 "aarch64_mem_pair_lanes_operand" "Umn") + ] UNSPEC_LDP_FST))) (set (match_operand:DI 2 "register_operand" "=r") - (sign_extend:DI (match_operand:SI 3 "memory_operand" "m")))] - "rtx_equal_p (XEXP (operands[3], 0), - plus_constant (Pmode, - XEXP (operands[1], 0), - GET_MODE_SIZE (SImode)))" - "ldpsw\\t%0, %2, %z1" + (sign_extend:DI (unspec:SI [ + (match_dup 1) + ] UNSPEC_LDP_SND)))] + "" + "ldpsw\\t%0, %2, %y1" [(set_attr "type" "load_8")] ) @@ -2101,16 +2072,17 @@ (define_insn "*zero_extendsidi2_aarch64" (define_insn "*load_pair_zero_extendsidi2_aarch64" [(set (match_operand:DI 0 "register_operand") - (zero_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand"))) + (zero_extend:DI (unspec:SI [ + (match_operand:V2x4QI 1 "aarch64_mem_pair_lanes_operand") + ] UNSPEC_LDP_FST))) 
(set (match_operand:DI 2 "register_operand") - (zero_extend:DI (match_operand:SI 3 "memory_operand")))] - "rtx_equal_p (XEXP (operands[3], 0), - plus_constant (Pmode, - XEXP (operands[1], 0), - GET_MODE_SIZE (SImode)))" - {@ [ cons: =0 , 1 , =2 , 3 ; attrs: type , arch ] - [ r , Ump , r , m ; load_8 , * ] ldp\t%w0, %w2, %z1 - [ w , Ump , w , m ; neon_load1_2reg , fp ] ldp\t%s0, %s2, %z1 + (zero_extend:DI (unspec:SI [ + (match_dup 1) + ] UNSPEC_LDP_SND)))] + "" + {@ [ cons: =0 , 1 , =2; attrs: type , arch] + [ r , Umn , r ; load_8 , * ] ldp\t%w0, %w2, %y1 + [ w , Umn , w ; neon_load1_2reg, fp ] ldp\t%s0, %s2, %y1 } ) diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index a920de99ffc..fd8dd6db349 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -1435,6 +1435,9 @@ (define_mode_attr VDBL [(V8QI "V16QI") (V4HI "V8HI") (SI "V2SI") (SF "V2SF") (DI "V2DI") (DF "V2DF")]) +;; Load/store pair mode. +(define_mode_attr VPAIR [(SI "V2x4QI") (DI "V2x8QI")]) + ;; Register suffix for double-length mode. (define_mode_attr Vdtype [(V4HF "8h") (V2SF "4s")]) diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index b647e5af7c6..80f2e03d8de 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -266,10 +266,12 @@ (define_special_predicate "aarch64_mem_pair_operator" (match_test "known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (GET_MODE (op)))")))) -(define_predicate "aarch64_mem_pair_operand" - (and (match_code "mem") - (match_test "aarch64_legitimate_address_p (mode, XEXP (op, 0), false, - ADDR_QUERY_LDP_STP)"))) +;; Like aarch64_mem_pair_operator, but additionally check the +;; address is suitable. 
+(define_special_predicate "aarch64_mem_pair_operand" + (and (match_operand 0 "aarch64_mem_pair_operator") + (match_test "aarch64_legitimate_address_p (GET_MODE (op), XEXP (op, 0), + false, ADDR_QUERY_LDP_STP)"))) (define_predicate "pmode_plus_operator" (and (match_code "plus") From patchwork Thu Nov 16 18:11:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Coplan X-Patchwork-Id: 1864875 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=YeW/btTr; dkim=pass (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=YeW/btTr; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SWSmF5dG6z1yRR for ; Fri, 17 Nov 2023 05:11:57 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F337D3875DCE for ; Thu, 16 Nov 2023 18:11:54 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR04-DB3-obe.outbound.protection.outlook.com (mail-db3eur04on2070.outbound.protection.outlook.com [40.107.6.70]) by sourceware.org 
(Postfix) with ESMTPS id 34A7938582A2 for ; Thu, 16 Nov 2023 18:11:25 +0000 (GMT)
Date: Thu, 16 Nov 2023 18:11:03 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford , Kyrylo Tkachov
Subject: [PATCH 10/11] aarch64: Add new load/store pair fusion pass.
X-MS-Exchange-CrossTenant-Network-Message-Id:
56170157-bfb2-4786-5f47-08dbe6cf70a3 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DU6PEPF0000B61C.eurprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB10012 X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

This is a v3 of the aarch64 load/store pair fusion pass.
v2 was posted here:
 - https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633601.html

The main changes since v2 are as follows:

We now handle writeback opportunities as well.  E.g. for this testcase:

void foo (long *p, long *q, long x, long y)
{
  do {
    *(p++) = x;
    *(p++) = y;
  } while (p < q);
}

with the patch, we generate:

foo:
.LFB0:
        .align  3
.L2:
        stp     x2, x3, [x0], 16
        cmp     x0, x1
        bcc     .L2
        ret

instead of:

foo:
.LFB0:
        .align  3
.L2:
        str     x2, [x0], 16
        str     x3, [x0, -8]
        cmp     x0, x1
        bcc     .L2
        ret

i.e. the pass is now capable of finding load/store pair opportunities
even in the case that one or more of the initial candidate accesses
uses writeback addressing.  We do this by adding a notion of
canonicalizing RTL bases.
When we see a writeback access, we record that the new base def is equivalent
to the original def plus some offset. When tracking accesses, we then
canonicalize to track each access relative to the earliest equivalent base in
the basic block. This allows us to spot that accesses are adjacent even though
they don't share the same RTL-SSA base def.

Furthermore, we also add some extra logic to opportunistically fold in trailing
destructive updates of the base register used for a load/store pair. E.g. for

void post_add (long *p, long *q, long x, long y)
{
  do {
    p[0] = x;
    p[1] = y;
    p += 2;
  } while (p < q);
}

the auto-inc-dec pass doesn't currently form any writeback accesses, and we
generate:

post_add:
.LFB0:
        .align  3
.L2:
        add     x0, x0, 16
        stp     x2, x3, [x0, -16]
        cmp     x0, x1
        bcc     .L2
        ret

but with the updated pass, we now get:

post_add:
.LFB0:
        .align  3
.L2:
        stp     x2, x3, [x0], 16
        cmp     x0, x1
        bcc     .L2
        ret

Other notable changes to the pass since the last version include:
 - We switch to using the aarch64_gen_{load,store}_pair interface for forming
   the (non-writeback) pairs, allowing use of the new load/store pair
   representation added by the earlier patch.
 - The various updates to the load/store pair patterns mean that we no longer
   need to do mode canonicalization / mode unification in the pass, as the
   patterns allow arbitrary combinations of suitable modes of the same size.
   So we remove the logic to do this (including the param to control the
   strategy).
 - Fix up classification of zero operands to make sure that these are always
   treated as GPR operands for pair discovery purposes. This avoids us pairing
   zero operands with FPRs in the pre-RA pass, which used to lead to
   undesirable codegen involving cross-file moves.
 - We also remove the try_adjust_address logic from the previous iteration of
   the pass. Since we validate all ldp/stp offsets in the pass, this only meant
   that we lost opportunities in the case that a given mem fails to adjust in
   its original mode.
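
As a rough illustration of the base canonicalization described above, here is a
standalone sketch. It is not code from the patch: def_id, alt_base_sketch,
canon_tracker and foo_stores_adjacent_p are hypothetical names, integer ids
stand in for RTL-SSA def_info pointers, and offsets are plain longs rather than
poly_int64. It only models the bookkeeping, i.e. recording "new def = old def +
offset" on a writeback access and then expressing later accesses relative to
the earliest equivalent base.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-in for an RTL-SSA base register def.
using def_id = int;

// Models the idea behind alt_base: a def expressed as BASE + OFFSET.
struct alt_base_sketch
{
  def_id base;
  long offset;
};

struct canon_tracker
{
  // Maps a base def to the canonical (earliest) base plus an offset.
  std::map<def_id, alt_base_sketch> canon_base_map;

  // Express an access at DEF + OFF relative to the canonical base.
  std::pair<def_id, long> canonicalize (def_id def, long off) const
  {
    auto it = canon_base_map.find (def);
    if (it == canon_base_map.end ())
      return { def, off };
    return { it->second.base, it->second.offset + off };
  }

  // Record that a writeback access defines NEW_DEF = OLD_DEF + ADJUST.
  void record_writeback (def_id old_def, def_id new_def, long adjust)
  {
    auto canon = canonicalize (old_def, adjust);
    canon_base_map[new_def] = { canon.first, canon.second };
  }
};

// Models the two stores from the foo testcase:
//   str x2, [x0], 16    (post-index: access at def 0 + 0, then x0 += 16)
//   str x3, [x0, -8]    (access at def 1 - 8)
bool foo_stores_adjacent_p ()
{
  canon_tracker t;
  auto first = t.canonicalize (0, 0);
  t.record_writeback (0, 1, 16);
  auto second = t.canonicalize (1, -8);
  // Same canonical base, and the second 8-byte access directly follows
  // the first.
  return first.first == second.first && second.second - first.second == 8;
}
```

In this model the second store is seen at offset 8 from the original def of x0,
so the two stores show up as adjacent even though they use different base
register defs.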

Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

	* config.gcc: Add aarch64-ldp-fusion.o to extra_objs for aarch64; add
	aarch64-ldp-fusion.cc to target_gtfiles.
	* config/aarch64/aarch64-passes.def: Add copies of pass_ldp_fusion
	before and after RA.
	* config/aarch64/aarch64-protos.h (make_pass_ldp_fusion): Declare.
	* config/aarch64/aarch64.opt (-mearly-ldp-fusion): New.
	(-mlate-ldp-fusion): New.
	(--param=aarch64-ldp-alias-check-limit): New.
	(--param=aarch64-ldp-writeback): New.
	* config/aarch64/t-aarch64: Add rule for aarch64-ldp-fusion.o.
	* config/aarch64/aarch64-ldp-fusion.cc: New file.
---
 gcc/config.gcc                           |    4 +-
 gcc/config/aarch64/aarch64-ldp-fusion.cc | 2727 ++++++++++++++++++++++
 gcc/config/aarch64/aarch64-passes.def    |    2 +
 gcc/config/aarch64/aarch64-protos.h      |    1 +
 gcc/config/aarch64/aarch64.opt           |   23 +
 gcc/config/aarch64/t-aarch64             |    7 +
 6 files changed, 2762 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-ldp-fusion.cc

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c1460ca354e..8b7f6b20309 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -349,8 +349,8 @@ aarch64*-*-*)
 	c_target_objs="aarch64-c.o"
 	cxx_target_objs="aarch64-c.o"
 	d_target_objs="aarch64-d.o"
-	extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o"
-	target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.cc \$(srcdir)/config/aarch64/aarch64-sve-builtins.h \$(srcdir)/config/aarch64/aarch64-sve-builtins.cc"
+	extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o aarch64-ldp-fusion.o"
+	target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.cc \$(srcdir)/config/aarch64/aarch64-sve-builtins.h \$(srcdir)/config/aarch64/aarch64-sve-builtins.cc \$(srcdir)/config/aarch64/aarch64-ldp-fusion.cc"
 	target_has_targetm_common=yes
 	;;
 alpha*-*-*)
diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc b/gcc/config/aarch64/aarch64-ldp-fusion.cc
new file mode 100644
index 00000000000..6ab18b9216e
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -0,0 +1,2727 @@
+// LoadPair fusion optimization pass for AArch64.
+// Copyright (C) 2023 Free Software Foundation, Inc.
+//
+// This file is part of GCC.
+//
+// GCC is free software; you can redistribute it and/or modify it
+// under the terms of the GNU General Public License as published by
+// the Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// GCC is distributed in the hope that it will be useful, but
+// WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+// General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License
+// along with GCC; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_FUNCTIONAL
+#define INCLUDE_LIST
+#define INCLUDE_TYPE_TRAITS
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "df.h"
+#include "rtl-ssa.h"
+#include "cfgcleanup.h"
+#include "tree-pass.h"
+#include "ordered-hash-map.h"
+#include "tree-dfa.h"
+#include "fold-const.h"
+#include "tree-hash-traits.h"
+#include "print-tree.h"
+#include "insn-attr.h"
+
+using namespace rtl_ssa;
+
+enum
+{
+  LDP_IMM_BITS = 7,
+  LDP_IMM_MASK = (1 << LDP_IMM_BITS) - 1,
+  LDP_IMM_SIGN_BIT = (1 << (LDP_IMM_BITS - 1)),
+  LDP_MAX_IMM = LDP_IMM_SIGN_BIT - 1,
+  LDP_MIN_IMM = -LDP_MAX_IMM - 1,
+};
+
+// We pack these fields (load_p, fpsimd_p, and size) into an integer
+// (LFS) which we use as part of the key into the main hash tables.
+//
+// The idea is that we group candidates together only if they agree on
+// the fields below. Candidates that disagree on any of these
+// properties shouldn't be merged together.
+struct lfs_fields
+{
+  bool load_p;
+  bool fpsimd_p;
+  unsigned size;
+};
+
+using insn_list_t = std::list <insn_info *>;
+using insn_iter_t = insn_list_t::iterator;
+
+// Information about the accesses at a given offset from a particular
+// base. Stored in an access_group, see below.
+struct access_record
+{
+  poly_int64 offset;
+  std::list<insn_info *> cand_insns;
+  std::list<access_record>::iterator place;
+
+  access_record (poly_int64 off) : offset (off) {}
+};
+
+// A group of accesses where adjacent accesses could be ldp/stp
+// candidates. The splay tree supports efficient insertion,
+// while the list supports efficient iteration.
+struct access_group
+{
+  splay_tree <access_record *> tree;
+  std::list<access_record> list;
+
+  template<typename Alloc>
+  inline void track (Alloc node_alloc, poly_int64 offset, insn_info *insn);
+};
+
+// Information about a potential base candidate, used in try_fuse_pair.
+// There may be zero, one, or two viable RTL bases for a given pair.
+struct base_cand +{ + def_info *m_def; + + // FROM_INSN is -1 if the base candidate is already shared by both + // candidate insns. Otherwise it holds the index of the insn from + // which the base originated. + int from_insn; + + // Initially: dataflow hazards that arise if we choose this base as + // the common base register for the pair. + // + // Later these get narrowed, taking alias hazards into account. + insn_info *hazards[2]; + + base_cand (def_info *def, int insn) + : m_def (def), from_insn (insn), hazards {nullptr, nullptr} {} + + base_cand (def_info *def) : base_cand (def, -1) {} + + bool viable () const + { + return !hazards[0] || !hazards[1] || (*hazards[0] > *hazards[1]); + } +}; + +// Information about an alternate base. For a def_info D, it may +// instead be expressed as D = BASE + OFFSET. +struct alt_base +{ + def_info *base; + poly_int64 offset; +}; + +// State used by the pass for a given basic block. +struct ldp_bb_info +{ + using def_hash = nofree_ptr_hash ; + using expr_key_t = pair_hash >; + using def_key_t = pair_hash >; + + // Map of -> access_group. + ordered_hash_map expr_map; + + // Map of -> access_group. + ordered_hash_map def_map; + + // Given the def_info for an RTL base register, express it as an offset from + // some canonical base instead. + // + // Canonicalizing bases in this way allows us to identify adjacent accesses + // even if they see different base register defs. + hash_map canon_base_map; + + static const size_t obstack_alignment = sizeof (void *); + bb_info *m_bb; + + ldp_bb_info (bb_info *bb) : m_bb (bb), m_emitted_tombstone (false) + { + obstack_specify_allocation (&m_obstack, OBSTACK_CHUNK_SIZE, + obstack_alignment, obstack_chunk_alloc, + obstack_chunk_free); + } + ~ldp_bb_info () + { + obstack_free (&m_obstack, nullptr); + } + + inline void track_access (insn_info *, bool load, rtx mem); + inline void transform (); + inline void cleanup_tombstones (); + +private: + // Did we emit a tombstone insn for this bb? 
+  bool m_emitted_tombstone;
+  obstack m_obstack;
+
+  inline splay_tree_node<access_record *> *node_alloc (access_record *);
+
+  template<typename Map>
+  inline void traverse_base_map (Map &map);
+  inline void transform_for_base (int load_size, access_group &group);
+
+  inline bool try_form_pairs (insn_list_t *, insn_list_t *,
+                              bool load_p, unsigned access_size);
+
+  inline bool track_via_mem_expr (insn_info *, rtx mem, lfs_fields lfs);
+};
+
+splay_tree_node<access_record *> *
+ldp_bb_info::node_alloc (access_record *access)
+{
+  using T = splay_tree_node<access_record *>;
+  void *addr = obstack_alloc (&m_obstack, sizeof (T));
+  return new (addr) T (access);
+}
+
+// Given a mem MEM, if the address has side effects, return a MEM that accesses
+// the same address but without the side effects. Otherwise, return
+// MEM unchanged.
+static rtx
+drop_writeback (rtx mem)
+{
+  rtx addr = XEXP (mem, 0);
+
+  if (!side_effects_p (addr))
+    return mem;
+
+  switch (GET_CODE (addr))
+    {
+    case PRE_MODIFY:
+      addr = XEXP (addr, 1);
+      break;
+    case POST_MODIFY:
+    case POST_INC:
+    case POST_DEC:
+      addr = XEXP (addr, 0);
+      break;
+    case PRE_INC:
+    case PRE_DEC:
+      {
+        poly_int64 adjustment = GET_MODE_SIZE (GET_MODE (mem));
+        if (GET_CODE (addr) == PRE_DEC)
+          adjustment *= -1;
+        addr = plus_constant (GET_MODE (addr), XEXP (addr, 0), adjustment);
+        break;
+      }
+    default:
+      gcc_unreachable ();
+    }
+
+  return change_address (mem, GET_MODE (mem), addr);
+}
+
+// Convenience wrapper around strip_offset that can also look
+// through {PRE,POST}_MODIFY.
+static rtx ldp_strip_offset (rtx mem, rtx *modify, poly_int64 *offset)
+{
+  gcc_checking_assert (MEM_P (mem));
+
+  rtx base = strip_offset (XEXP (mem, 0), offset);
+
+  if (side_effects_p (base))
+    *modify = base;
+
+  switch (GET_CODE (base))
+    {
+    case PRE_MODIFY:
+    case POST_MODIFY:
+      base = strip_offset (XEXP (base, 1), offset);
+      gcc_checking_assert (REG_P (base));
+      gcc_checking_assert (rtx_equal_p (XEXP (*modify, 0), base));
+      break;
+    case PRE_INC:
+    case POST_INC:
+      base = XEXP (base, 0);
+      *offset = GET_MODE_SIZE (GET_MODE (mem));
+      gcc_checking_assert (REG_P (base));
+      break;
+    case PRE_DEC:
+    case POST_DEC:
+      base = XEXP (base, 0);
+      *offset = -GET_MODE_SIZE (GET_MODE (mem));
+      gcc_checking_assert (REG_P (base));
+      break;
+
+    default:
+      gcc_checking_assert (!side_effects_p (base));
+    }
+
+  return base;
+}
+
+static bool
+any_pre_modify_p (rtx x)
+{
+  const auto code = GET_CODE (x);
+  return code == PRE_INC || code == PRE_DEC || code == PRE_MODIFY;
+}
+
+static bool
+any_post_modify_p (rtx x)
+{
+  const auto code = GET_CODE (x);
+  return code == POST_INC || code == POST_DEC || code == POST_MODIFY;
+}
+
+static bool
+ldp_operand_mode_ok_p (machine_mode mode)
+{
+  const bool allow_qregs
+    = !(aarch64_tune_params.extra_tuning_flags
+        & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS);
+
+  if (!aarch64_ldpstp_operand_mode_p (mode))
+    return false;
+
+  const auto size = GET_MODE_SIZE (mode).to_constant ();
+  if (size == 16 && !allow_qregs)
+    return false;
+
+  return reload_completed || mode != E_TImode;
+}
+
+static int
+encode_lfs (lfs_fields fields)
+{
+  int size_log2 = exact_log2 (fields.size);
+  gcc_checking_assert (size_log2 >= 2 && size_log2 <= 4);
+  return ((int)fields.load_p << 3)
+    | ((int)fields.fpsimd_p << 2)
+    | (size_log2 - 2);
+}
+
+static lfs_fields
+decode_lfs (int lfs)
+{
+  bool load_p = (lfs & (1 << 3));
+  bool fpsimd_p = (lfs & (1 << 2));
+  unsigned size = 1U << ((lfs & 3) + 2);
+  return { load_p, fpsimd_p, size };
+}
+
+template<typename Alloc>
+void
+access_group::track (Alloc alloc_node, poly_int64 offset, insn_info *insn)
+{
+  auto insert_before = [&](std::list<access_record>::iterator after)
+    {
+      auto it = list.emplace (after, offset);
+      it->cand_insns.push_back (insn);
+      it->place = it;
+      return &*it;
+    };
+
+  if (!list.size ())
+    {
+      auto access = insert_before (list.end ());
+      tree.insert_max_node (alloc_node (access));
+      return;
+    }
+
+  auto compare = [&](splay_tree_node<access_record *> *node)
+    {
+      return compare_sizes_for_sort (offset, node->value ()->offset);
+    };
+  auto result = tree.lookup (compare);
+  splay_tree_node<access_record *> *node = tree.root ();
+  if (result == 0)
+    node->value ()->cand_insns.push_back (insn);
+  else
+    {
+      auto it = node->value ()->place;
+      auto after = (result > 0) ? std::next (it) : it;
+      auto access = insert_before (after);
+      tree.insert_child (node, result > 0, alloc_node (access));
+    }
+}
+
+bool
+ldp_bb_info::track_via_mem_expr (insn_info *insn, rtx mem, lfs_fields lfs)
+{
+  if (!MEM_EXPR (mem) || !MEM_OFFSET_KNOWN_P (mem))
+    return false;
+
+  poly_int64 offset;
+  tree base_expr = get_addr_base_and_unit_offset (MEM_EXPR (mem),
+                                                  &offset);
+  if (!base_expr || !DECL_P (base_expr))
+    return false;
+
+  offset += MEM_OFFSET (mem);
+
+  const machine_mode mem_mode = GET_MODE (mem);
+  const HOST_WIDE_INT mem_size = GET_MODE_SIZE (mem_mode).to_constant ();
+
+  // Punt on misaligned offsets.
+ if (offset.coeffs[0] & (mem_size - 1)) + return false; + + const auto key = std::make_pair (base_expr, encode_lfs (lfs)); + access_group &group = expr_map.get_or_insert (key, NULL); + auto alloc = [&](access_record *access) { return node_alloc (access); }; + group.track (alloc, offset, insn); + + if (dump_file) + { + fprintf (dump_file, "[bb %u] tracking insn %d via ", + m_bb->index (), insn->uid ()); + print_node_brief (dump_file, "mem expr", base_expr, 0); + fprintf (dump_file, " [L=%d FP=%d, %smode, off=", + lfs.load_p, lfs.fpsimd_p, mode_name[mem_mode]); + print_dec (offset, dump_file); + fprintf (dump_file, "]\n"); + } + + return true; +} + +// Return true if X is a constant zero operand. N.B. this matches the +// {w,x}zr check in aarch64_print_operand, the logic in the predicate +// aarch64_stp_reg_operand, and the constraints on the pair patterns. +static bool const_zero_op_p (rtx x) +{ + return x == CONST0_RTX (GET_MODE (x)) + || (CONST_DOUBLE_P (x) && aarch64_float_const_zero_rtx_p (x)); +} + +void +ldp_bb_info::track_access (insn_info *insn, bool load_p, rtx mem) +{ + // We can't combine volatile MEMs, so punt on these. + if (MEM_VOLATILE_P (mem)) + return; + + // Ignore writeback accesses if the param says to do so. + if (!aarch64_ldp_writeback && side_effects_p (XEXP (mem, 0))) + return; + + const machine_mode mem_mode = GET_MODE (mem); + if (!ldp_operand_mode_ok_p (mem_mode)) + return; + + // Note ldp_operand_mode_ok_p already rejected VL modes. + const HOST_WIDE_INT mem_size = GET_MODE_SIZE (mem_mode).to_constant (); + + rtx reg_op = XEXP (PATTERN (insn->rtl ()), !load_p); + + // Is this an FP/SIMD access? Note that constant zero operands + // use an integer zero register ({w,x}zr). + const bool fpsimd_op_p + = GET_MODE_CLASS (mem_mode) != MODE_INT + && (load_p || !const_zero_op_p (reg_op)); + + // N.B. we only want to segregate FP/SIMD accesses from integer accesses + // before RA. 
+ const bool fpsimd_bit_p = !reload_completed && fpsimd_op_p; + const lfs_fields lfs = { load_p, fpsimd_bit_p, mem_size }; + + if (track_via_mem_expr (insn, mem, lfs)) + return; + + poly_int64 mem_off; + rtx modify = NULL_RTX; + rtx base = ldp_strip_offset (mem, &modify, &mem_off); + if (!REG_P (base)) + return; + + // Need to calculate two (possibly different) offsets: + // - Offset at which the access occurs. + // - Offset of the new base def. + poly_int64 access_off; + if (modify && any_post_modify_p (modify)) + access_off = 0; + else + access_off = mem_off; + + poly_int64 new_def_off = mem_off; + + // Punt on accesses relative to the eliminable regs: since we don't + // know the elimination offset pre-RA, we should postpone forming + // pairs on such accesses until after RA. + if (!reload_completed + && (REGNO (base) == FRAME_POINTER_REGNUM + || REGNO (base) == ARG_POINTER_REGNUM)) + return; + + // Now need to find def of base register. + def_info *base_def; + use_info *base_use = find_access (insn->uses (), REGNO (base)); + gcc_assert (base_use); + base_def = base_use->def (); + if (!base_def) + { + if (dump_file) + fprintf (dump_file, + "base register (regno %d) of insn %d is undefined", + REGNO (base), insn->uid ()); + return; + } + + alt_base *canon_base = canon_base_map.get (base_def); + if (canon_base) + { + // Express this as the combined offset from the canonical base. + base_def = canon_base->base; + new_def_off += canon_base->offset; + access_off += canon_base->offset; + } + + if (modify) + { + auto def = find_access (insn->defs (), REGNO (base)); + gcc_assert (def); + + // Record that DEF = BASE_DEF + MEM_OFF. 
+ if (dump_file) + { + pretty_printer pp; + pp_access (&pp, def, 0); + pp_string (&pp, " = "); + pp_access (&pp, base_def, 0); + fprintf (dump_file, "[bb %u] recording %s + ", + m_bb->index (), pp_formatted_text (&pp)); + print_dec (new_def_off, dump_file); + fprintf (dump_file, "\n"); + } + + alt_base base_rec { base_def, new_def_off }; + if (canon_base_map.put (def, base_rec)) + gcc_unreachable (); // Base defs should be unique. + } + + // Punt on misaligned offsets. + if (mem_off.coeffs[0] & (mem_size - 1)) + return; + + const auto key = std::make_pair (base_def, encode_lfs (lfs)); + access_group &group = def_map.get_or_insert (key, NULL); + auto alloc = [&](access_record *access) { return node_alloc (access); }; + group.track (alloc, access_off, insn); + + if (dump_file) + { + pretty_printer pp; + pp_access (&pp, base_def, 0); + + fprintf (dump_file, "[bb %u] tracking insn %d via %s", + m_bb->index (), insn->uid (), pp_formatted_text (&pp)); + fprintf (dump_file, + " [L=%d, WB=%d, FP=%d, %smode, off=", + lfs.load_p, !!modify, lfs.fpsimd_p, mode_name[mem_mode]); + print_dec (access_off, dump_file); + fprintf (dump_file, "]\n"); + } +} + +// Dummy predicate that never ignores any insns. +static bool no_ignore (insn_info *) { return false; } + +// Return the latest dataflow hazard before INSN. +// +// If IGNORE is non-NULL, this points to a sub-rtx which we should +// ignore for dataflow purposes. This is needed when considering +// changing the RTL base of an access discovered through a MEM_EXPR +// base. +// +// N.B. we ignore any defs/uses of memory here as we deal with that +// separately, making use of alias disambiguation. +static insn_info * +latest_hazard_before (insn_info *insn, rtx *ignore, + insn_info *ignore_insn = nullptr) +{ + insn_info *result = nullptr; + + // Return true if we registered the hazard. 
+  auto hazard = [&](insn_info *h) -> bool
+    {
+      gcc_checking_assert (*h < *insn);
+      if (h == ignore_insn)
+        return false;
+
+      if (!result || *h > *result)
+        result = h;
+
+      return true;
+    };
+
+  rtx pat = PATTERN (insn->rtl ());
+  auto ignore_use = [&](use_info *u)
+    {
+      if (u->is_mem ())
+        return true;
+
+      return !refers_to_regno_p (u->regno (), u->regno () + 1, pat, ignore);
+    };
+
+  // Find defs of uses in INSN (RaW).
+  for (auto use : insn->uses ())
+    if (!ignore_use (use) && use->def ())
+      hazard (use->def ()->insn ());
+
+  // Find previous defs (WaW) or previous uses (WaR) of defs in INSN.
+  for (auto def : insn->defs ())
+    {
+      if (def->is_mem ())
+        continue;
+
+      if (def->prev_def ())
+        {
+          hazard (def->prev_def ()->insn ()); // WaW
+
+          auto set = dyn_cast<set_info *> (def->prev_def ());
+          if (set && set->has_nondebug_insn_uses ())
+            for (auto use : set->reverse_nondebug_insn_uses ())
+              if (use->insn () != insn && hazard (use->insn ())) // WaR
+                break;
+        }
+
+      if (!HARD_REGISTER_NUM_P (def->regno ()))
+        continue;
+
+      // Also need to check backwards for call clobbers (WaW).
+      for (auto call_group : def->ebb ()->call_clobbers ())
+        {
+          if (!call_group->clobbers (def->resource ()))
+            continue;
+
+          auto clobber_insn = prev_call_clobbers_ignoring (*call_group,
+                                                           def->insn (),
+                                                           no_ignore);
+          if (clobber_insn)
+            hazard (clobber_insn);
+        }
+
+    }
+
+  return result;
+}
+
+static insn_info *
+first_hazard_after (insn_info *insn, rtx *ignore)
+{
+  insn_info *result = nullptr;
+  auto hazard = [insn, &result](insn_info *h)
+    {
+      gcc_checking_assert (*h > *insn);
+      if (!result || *h < *result)
+        result = h;
+    };
+
+  rtx pat = PATTERN (insn->rtl ());
+  auto ignore_use = [&](use_info *u)
+    {
+      if (u->is_mem ())
+        return true;
+
+      return !refers_to_regno_p (u->regno (), u->regno () + 1, pat, ignore);
+    };
+
+  for (auto def : insn->defs ())
+    {
+      if (def->is_mem ())
+        continue;
+
+      if (def->next_def ())
+        hazard (def->next_def ()->insn ()); // WaW
+
+      auto set = dyn_cast<set_info *> (def);
+      if (set && set->has_nondebug_insn_uses ())
+        hazard (set->first_nondebug_insn_use ()->insn ()); // RaW
+
+      if (!HARD_REGISTER_NUM_P (def->regno ()))
+        continue;
+
+      // Also check for call clobbers of this def (WaW).
+      for (auto call_group : def->ebb ()->call_clobbers ())
+        {
+          if (!call_group->clobbers (def->resource ()))
+            continue;
+
+          auto clobber_insn = next_call_clobbers_ignoring (*call_group,
+                                                           def->insn (),
+                                                           no_ignore);
+          if (clobber_insn)
+            hazard (clobber_insn);
+        }
+    }
+
+  // Find any subsequent defs of uses in INSN (WaR).
+  for (auto use : insn->uses ())
+    {
+      if (ignore_use (use))
+        continue;
+
+      if (use->def ())
+        {
+          auto def = use->def ()->next_def ();
+          if (def && def->insn () == insn)
+            def = def->next_def ();
+
+          if (def)
+            hazard (def->insn ());
+        }
+
+      if (!HARD_REGISTER_NUM_P (use->regno ()))
+        continue;
+
+      // Also need to handle call clobbers of our uses (again WaR).
+      //
+      // See restrict_movement_for_uses_ignoring for why we don't
+      // need to check backwards for call clobbers.
+      for (auto call_group : use->ebb ()->call_clobbers ())
+        {
+          if (!call_group->clobbers (use->resource ()))
+            continue;
+
+          auto clobber_insn = next_call_clobbers_ignoring (*call_group,
+                                                           use->insn (),
+                                                           no_ignore);
+          if (clobber_insn)
+            hazard (clobber_insn);
+        }
+    }
+
+  return result;
+}
+
+
+enum change_strategy {
+  CHANGE,
+  DELETE,
+  TOMBSTONE,
+};
+
+// Given a change_strategy S, convert it to a string (for output in the
+// dump file).
+static const char *cs_to_string (change_strategy s)
+{
+#define C(x) case x: return #x
+  switch (s)
+    {
+      C (CHANGE);
+      C (DELETE);
+      C (TOMBSTONE);
+    }
+#undef C
+  gcc_unreachable ();
+}
+
+// TODO: should this live in RTL-SSA?
+static bool
+ranges_overlap_p (const insn_range_info &r1, const insn_range_info &r2)
+{
+  // If either range is empty, then their intersection is empty.
+  if (!r1 || !r2)
+    return false;
+
+  // When do they not overlap? When one range finishes before the other
+  // starts, i.e. (*r1.last < *r2.first || *r2.last < *r1.first).
+  // Inverting this, we get the below.
+  return *r1.last >= *r2.first && *r2.last >= *r1.first;
+}
+
+// Get the range of insns that def feeds.
+static insn_range_info get_def_range (def_info *def)
+{
+  insn_info *last = def->next_def ()->insn ()->prev_nondebug_insn ();
+  return { def->insn (), last };
+}
+
+// Given a def (of memory), return the downwards range within which we
+// can safely move this def.
+static insn_range_info
+def_downwards_move_range (def_info *def)
+{
+  auto range = get_def_range (def);
+
+  auto set = dyn_cast<set_info *> (def);
+  if (!set || !set->has_any_uses ())
+    return range;
+
+  auto use = set->first_nondebug_insn_use ();
+  if (use)
+    range = move_earlier_than (range, use->insn ());
+
+  return range;
+}
+
+// Given a def (of memory), return the upwards range within which we can
+// safely move this def.
+static insn_range_info
+def_upwards_move_range (def_info *def)
+{
+  def_info *prev = def->prev_def ();
+  insn_range_info range { prev->insn (), def->insn () };
+
+  auto set = dyn_cast<set_info *> (prev);
+  if (!set || !set->has_any_uses ())
+    return range;
+
+  auto use = set->last_nondebug_insn_use ();
+  if (use)
+    range = move_later_than (range, use->insn ());
+
+  return range;
+}
+
+static def_info *
+decide_stp_strategy (change_strategy strategy[2],
+                     insn_info *first,
+                     insn_info *second,
+                     const insn_range_info &move_range)
+{
+  strategy[0] = CHANGE;
+  strategy[1] = DELETE;
+
+  unsigned viable = 0;
+  viable |= move_range.includes (first);
+  viable |= ((unsigned) move_range.includes (second)) << 1;
+
+  def_info * const defs[2] = {
+    memory_access (first->defs ()),
+    memory_access (second->defs ())
+  };
+  if (defs[0] == defs[1])
+    viable = 3; // No intervening store, either is viable.
+
+  if (!(viable & 1)
+      && ranges_overlap_p (move_range, def_downwards_move_range (defs[0])))
+    viable |= 1;
+  if (!(viable & 2)
+      && ranges_overlap_p (move_range, def_upwards_move_range (defs[1])))
+    viable |= 2;
+
+  if (viable == 2)
+    std::swap (strategy[0], strategy[1]);
+  else if (!viable)
+    // Tricky case: need to delete both accesses.
+    strategy[0] = DELETE;
+
+  for (int i = 0; i < 2; i++)
+    {
+      if (strategy[i] != DELETE)
+        continue;
+
+      // See if we can get away without a tombstone.
+      auto set = dyn_cast<set_info *> (defs[i]);
+      if (!set || !set->has_any_uses ())
+        continue; // We can indeed.
+
+      // If both sides are viable for re-purposing, and the other store's
+      // def doesn't have any uses, then we can delete the other store
+      // and re-purpose this store instead.
+      if (viable == 3)
+        {
+          gcc_assert (strategy[!i] == CHANGE);
+          auto other_set = dyn_cast<set_info *> (defs[!i]);
+          if (!other_set || !other_set->has_any_uses ())
+            {
+              strategy[i] = CHANGE;
+              strategy[!i] = DELETE;
+              break;
+            }
+        }
+
+      // Alas, we need a tombstone after all.
+ strategy[i] = TOMBSTONE; + } + + for (int i = 0; i < 2; i++) + if (strategy[i] == CHANGE) + return defs[i]; + + return nullptr; +} + +static GTY(()) rtx tombstone = NULL_RTX; + +// Generate the RTL pattern for a "tombstone"; used temporarily +// during this pass to replace stores that are marked for deletion +// where we can't immediately delete the store (e.g. if there are uses +// hanging off its def of memory). +// +// These are deleted at the end of the pass and uses re-parented +// appropriately at this point. +static rtx +gen_tombstone (void) +{ + if (!tombstone) + { + tombstone = gen_rtx_CLOBBER (VOIDmode, + gen_rtx_MEM (BLKmode, + gen_rtx_SCRATCH (Pmode))); + return tombstone; + } + + return copy_rtx (tombstone); +} + +static bool +tombstone_insn_p (insn_info *insn) +{ + rtx x = tombstone ? tombstone : gen_tombstone (); + return rtx_equal_p (PATTERN (insn->rtl ()), x); +} + +static machine_mode +aarch64_operand_mode_for_pair_mode (machine_mode mode) +{ + switch (mode) + { + case E_V2x4QImode: + return E_SImode; + case E_V2x8QImode: + return E_DImode; + case E_V2x16QImode: + return E_V16QImode; + default: + gcc_unreachable (); + } +} + +static rtx +filter_notes (rtx note, rtx result, bool *eh_region) +{ + for (; note; note = XEXP (note, 1)) + { + switch (REG_NOTE_KIND (note)) + { + case REG_EQUAL: + case REG_EQUIV: + case REG_DEAD: + case REG_UNUSED: + case REG_NOALIAS: + // These can all be dropped. For REG_EQU{AL,IV} they + // cannot apply to non-single_set insns, and + // REG_{DEAD,UNUSED} are re-computed by RTl-SSA, see + // rtl-ssa/changes.cc:update_notes. + // + // Similarly, REG_NOALIAS cannot apply to a parallel. + case REG_INC: + // When we form the pair insn, the reg update is implemented + // as just another SET in the parallel, so isn't really an + // auto-increment in the RTL sense, hence we drop the note. 
+ break; + case REG_EH_REGION: + gcc_assert (!*eh_region); + *eh_region = true; + result = alloc_reg_note (REG_EH_REGION, XEXP (note, 0), result); + break; + case REG_CFA_DEF_CFA: + case REG_CFA_OFFSET: + case REG_CFA_RESTORE: + result = alloc_reg_note (REG_NOTE_KIND (note), + copy_rtx (XEXP (note, 0)), + result); + break; + default: + // Unexpected REG_NOTE kind. + gcc_unreachable (); + } + } + + return result; +} + +// Ensure we have a sensible scheme for combining REG_NOTEs +// given two candidate insns I1 and I2. +static rtx +combine_reg_notes (insn_info *i1, insn_info *i2, rtx writeback, bool &ok) +{ + if ((writeback && find_reg_note (i1->rtl (), REG_CFA_DEF_CFA, NULL_RTX)) + || find_reg_note (i2->rtl (), REG_CFA_DEF_CFA, NULL_RTX)) + { + // CFA_DEF_CFA notes apply to the first set of the PARALLEL, + // so we can only preserve them in the non-writeback case, in + // the case that the note is attached to the lower access. + if (dump_file) + fprintf (dump_file, + "(%d,%d,WB=%d): can't preserve CFA_DEF_CFA note, punting\n", + i1->uid (), i2->uid (), !!writeback); + ok = false; + return NULL_RTX; + } + + bool found_eh_region = false; + rtx result = NULL_RTX; + result = filter_notes (REG_NOTES (i1->rtl ()), result, &found_eh_region); + return filter_notes (REG_NOTES (i2->rtl ()), result, &found_eh_region); +} + +// Given two memory accesses, at least one of which is of a writeback form, +// extract two non-writeback memory accesses addressed relative to the initial +// value of the base register, and output these in PATS. Return an rtx that +// represents the overall change to the base register. 
+static rtx +extract_writebacks (bool load_p, rtx pats[2], int changed) +{ + rtx base_reg = NULL_RTX; + poly_int64 current_offset = 0; + + poly_int64 offsets[2]; + + for (int i = 0; i < 2; i++) + { + rtx mem = XEXP (pats[i], load_p); + rtx reg = XEXP (pats[i], !load_p); + + rtx modify = NULL_RTX; + poly_int64 offset; + rtx this_base = ldp_strip_offset (mem, &modify, &offset); + gcc_assert (REG_P (this_base)); + if (base_reg) + gcc_assert (rtx_equal_p (base_reg, this_base)); + else + base_reg = this_base; + + // If we changed base for the current insn, then we already + // derived the correct mem for this insn from the effective + // address of the other access. + if (i == changed) + { + gcc_checking_assert (!modify); + offsets[i] = offset; + continue; + } + + if (modify && any_pre_modify_p (modify)) + current_offset += offset; + + poly_int64 this_off = current_offset; + if (!modify) + this_off += offset; + + offsets[i] = this_off; + rtx new_mem = change_address (mem, GET_MODE (mem), + plus_constant (GET_MODE (base_reg), + base_reg, this_off)); + pats[i] = load_p + ? 
gen_rtx_SET (reg, new_mem) + : gen_rtx_SET (new_mem, reg); + + if (modify && any_post_modify_p (modify)) + current_offset += offset; + } + + if (known_eq (current_offset, 0)) + return NULL_RTX; + + return gen_rtx_SET (base_reg, plus_constant (GET_MODE (base_reg), + base_reg, current_offset)); +} + +static insn_info * +find_trailing_add (insn_info *insns[2], + const insn_range_info &pair_range, + rtx *writeback_effect, + def_info **add_def, + def_info *base_def, + poly_int64 initial_offset, + unsigned access_size) +{ + insn_info *pair_insn = insns[1]; + + def_info *def = base_def->next_def (); + + while (def + && def->bb () == pair_insn->bb () + && *(def->insn ()) <= *pair_insn) + def = def->next_def (); + + if (!def || def->bb () != pair_insn->bb ()) + return nullptr; + + insn_info *cand = def->insn (); + const auto base_regno = base_def->regno (); + + // If CAND doesn't also use our base register, + // it can't destructively update it. + if (!find_access (cand->uses (), base_regno)) + return nullptr; + + auto rti = cand->rtl (); + + if (!INSN_P (rti)) + return nullptr; + + auto pat = PATTERN (rti); + if (GET_CODE (pat) != SET) + return nullptr; + + auto dest = XEXP (pat, 0); + if (!REG_P (dest) || REGNO (dest) != base_regno) + return nullptr; + + poly_int64 offset; + rtx rhs_base = strip_offset (XEXP (pat, 1), &offset); + if (!REG_P (rhs_base) + || REGNO (rhs_base) != base_regno + || !offset.is_constant ()) + return nullptr; + + // If the initial base offset is zero, we can handle any add offset + // (post-inc). Otherwise, we require the offsets to match (pre-inc). 
+ if (!known_eq (initial_offset, 0) && !known_eq (offset, initial_offset)) + return nullptr; + + auto off_hwi = offset.to_constant (); + + if (off_hwi % access_size != 0) + return nullptr; + + off_hwi /= access_size; + + if (off_hwi < LDP_MIN_IMM || off_hwi > LDP_MAX_IMM) + return nullptr; + + insn_info *pair_dst = pair_range.singleton (); + gcc_assert (pair_dst); + + auto dump_prefix = [&]() + { + if (!insns[0]) + fprintf (dump_file, "existing pair i%d: ", insns[1]->uid ()); + else + fprintf (dump_file, " (%d,%d)", + insns[0]->uid (), insns[1]->uid ()); + }; + + insn_info *hazard = latest_hazard_before (cand, nullptr, pair_insn); + if (!hazard || *hazard <= *pair_dst) + { + if (dump_file) + { + dump_prefix (); + fprintf (dump_file, + "folding in trailing add (%d) to use writeback form\n", + cand->uid ()); + } + + *add_def = def; + *writeback_effect = copy_rtx (pat); + return cand; + } + + if (dump_file) + { + dump_prefix (); + fprintf (dump_file, + "can't fold in trailing add (%d), hazard = %d\n", + cand->uid (), hazard->uid ()); + } + + return nullptr; +} + +// Try and actually fuse the pair given by insns I1 and I2. +static bool +fuse_pair (bool load_p, + unsigned access_size, + int writeback, + insn_info *i1, + insn_info *i2, + base_cand &base, + const insn_range_info &move_range, + bool &emitted_tombstone_p) +{ + auto attempt = crtl->ssa->new_change_attempt (); + + auto make_change = [&attempt](insn_info *insn) + { + return crtl->ssa->change_alloc (attempt, insn); + }; + auto make_delete = [&attempt](insn_info *insn) + { + return crtl->ssa->change_alloc (attempt, + insn, + insn_change::DELETE); + }; + + // Are we using a tombstone insn for this pair? + bool have_tombstone_p = false; + + insn_info *first = (*i1 < *i2) ? i1 : i2; + insn_info *second = (first == i1) ? 
i2 : i1; + + insn_info *insns[2] = { first, second }; + + auto_vec changes; + changes.reserve (4); + + rtx pats[2] = { + PATTERN (first->rtl ()), + PATTERN (second->rtl ()) + }; + + use_array input_uses[2] = { first->uses (), second->uses () }; + def_array input_defs[2] = { first->defs (), second->defs () }; + + int changed_insn = -1; + if (base.from_insn != -1) + { + // If we're not already using a shared base, we need + // to re-write one of the accesses to use the base from + // the other insn. + gcc_checking_assert (base.from_insn == 0 || base.from_insn == 1); + changed_insn = !base.from_insn; + + rtx base_pat = pats[base.from_insn]; + rtx change_pat = pats[changed_insn]; + rtx base_mem = XEXP (base_pat, load_p); + rtx change_mem = XEXP (change_pat, load_p); + + const bool lower_base_p = (insns[base.from_insn] == i1); + HOST_WIDE_INT adjust_amt = access_size; + if (!lower_base_p) + adjust_amt *= -1; + + rtx change_reg = XEXP (change_pat, !load_p); + machine_mode mode_for_mem = GET_MODE (change_mem); + rtx effective_base = drop_writeback (base_mem); + rtx new_mem = adjust_address_nv (effective_base, + mode_for_mem, + adjust_amt); + rtx new_set = load_p + ? gen_rtx_SET (change_reg, new_mem) + : gen_rtx_SET (new_mem, change_reg); + + pats[changed_insn] = new_set; + + auto keep_use = [&](use_info *u) + { + return refers_to_regno_p (u->regno (), u->regno () + 1, + change_pat, &XEXP (change_pat, load_p)); + }; + + // Drop any uses that only occur in the old address. + input_uses[changed_insn] = filter_accesses (attempt, + input_uses[changed_insn], + keep_use); + } + + rtx writeback_effect = NULL_RTX; + if (writeback) + writeback_effect = extract_writebacks (load_p, pats, changed_insn); + + const auto base_regno = base.m_def->regno (); + + if (base.from_insn == -1 && (writeback & 1)) + { + // If the first of the candidate insns had a writeback form, we'll need to + // drop the use of the updated base register from the second insn's uses. + // + // N.B. 
we needn't worry about the base register occurring as a store + // operand, as we checked that there was no non-address true dependence + // between the insns in try_fuse_pair. + gcc_checking_assert (find_access (input_uses[1], base_regno)); + input_uses[1] = check_remove_regno_access (attempt, + input_uses[1], + base_regno); + } + + // Go through and drop uses that only occur in register notes, + // as we won't be preserving those. + for (int i = 0; i < 2; i++) + { + auto rti = insns[i]->rtl (); + if (!REG_NOTES (rti)) + continue; + + input_uses[i] = remove_note_accesses (attempt, input_uses[i]); + } + + // Edge case: if the first insn is a writeback load and the + // second insn is a non-writeback load which transfers into the base + // register, then we should drop the writeback altogether as the + // update of the base register from the second load should prevail. + // + // For example: + // ldr x2, [x1], #8 + // ldr x1, [x1] + // --> + // ldp x2, x1, [x1] + if (writeback == 1 + && load_p + && find_access (input_defs[1], base_regno)) + { + if (dump_file) + fprintf (dump_file, + " ldp: i%d has wb but subsequent i%d has non-wb " + "update of base (r%d), dropping wb\n", + insns[0]->uid (), insns[1]->uid (), base_regno); + gcc_assert (writeback_effect); + writeback_effect = NULL_RTX; + } + + // If both of the original insns had a writeback form, then we should drop the + // first def. The second def could well have uses, but the first def should + // only be used by the second insn (and we dropped that use above). + if (writeback == 3) + input_defs[0] = check_remove_regno_access (attempt, + input_defs[0], + base_regno); + + // So far the patterns have been in instruction order, + // now we want them in offset order. 
+ if (i1 != first) + std::swap (pats[0], pats[1]); + + poly_int64 offsets[2]; + for (int i = 0; i < 2; i++) + { + rtx mem = XEXP (pats[i], load_p); + gcc_checking_assert (MEM_P (mem)); + rtx base = strip_offset (XEXP (mem, 0), offsets + i); + gcc_checking_assert (REG_P (base)); + gcc_checking_assert (base_regno == REGNO (base)); + } + + insn_info *trailing_add = nullptr; + if (aarch64_ldp_writeback > 1 && !writeback_effect) + { + def_info *add_def; + trailing_add = find_trailing_add (insns, move_range, &writeback_effect, + &add_def, base.m_def, offsets[0], + access_size); + if (trailing_add && !writeback) + { + // If there was no writeback to start with, we need to preserve the + // def of the base register from the add insn. + input_defs[0] = insert_access (attempt, add_def, input_defs[0]); + gcc_assert (input_defs[0].is_valid ()); + } + } + + // If either of the original insns had writeback, but the resulting + // pair insn does not (can happen e.g. in the ldp edge case above, or + // if the writeback effects cancel out), then drop the def(s) of the + // base register as appropriate. + if (!writeback_effect) + for (int i = 0; i < 2; i++) + if (writeback & (1 << i)) + input_defs[i] = check_remove_regno_access (attempt, + input_defs[i], + base_regno); + + // Now that we know what base mem we're going to use, check if it's OK + // with the ldp/stp policy. 
+ rtx first_mem = XEXP (pats[0], load_p); + if (!aarch64_mem_ok_with_ldpstp_policy_model (first_mem, + load_p, + GET_MODE (first_mem))) + { + if (dump_file) + fprintf (dump_file, "punting on pair (%d,%d), ldp/stp policy says no\n", + i1->uid (), i2->uid ()); + return false; + } + + bool reg_notes_ok = true; + rtx reg_notes = combine_reg_notes (i1, i2, writeback_effect, reg_notes_ok); + if (!reg_notes_ok) + return false; + + rtx pair_pat; + if (writeback_effect) + { + auto patvec = gen_rtvec (3, writeback_effect, pats[0], pats[1]); + pair_pat = gen_rtx_PARALLEL (VOIDmode, patvec); + } + else if (load_p) + pair_pat = aarch64_gen_load_pair (XEXP (pats[0], 0), + XEXP (pats[1], 0), + XEXP (pats[0], 1)); + else + pair_pat = aarch64_gen_store_pair (XEXP (pats[0], 0), + XEXP (pats[0], 1), + XEXP (pats[1], 1)); + + insn_change *pair_change = nullptr; + auto set_pair_pat = [pair_pat,reg_notes](insn_change *change) { + rtx_insn *rti = change->insn ()->rtl (); + gcc_assert (validate_unshare_change (rti, &PATTERN (rti), pair_pat, + true)); + gcc_assert (validate_change (rti, &REG_NOTES (rti), + reg_notes, true)); + }; + + if (load_p) + { + changes.quick_push (make_delete (first)); + pair_change = make_change (second); + changes.quick_push (pair_change); + + pair_change->move_range = move_range; + pair_change->new_defs = merge_access_arrays (attempt, + input_defs[0], + input_defs[1]); + gcc_assert (pair_change->new_defs.is_valid ()); + + pair_change->new_uses + = merge_access_arrays (attempt, + drop_memory_access (input_uses[0]), + drop_memory_access (input_uses[1])); + gcc_assert (pair_change->new_uses.is_valid ()); + set_pair_pat (pair_change); + } + else + { + change_strategy strategy[2]; + def_info *stp_def = decide_stp_strategy (strategy, first, second, + move_range); + if (dump_file) + { + auto cs1 = cs_to_string (strategy[0]); + auto cs2 = cs_to_string (strategy[1]); + fprintf (dump_file, + " stp strategy for candidate insns (%d,%d): (%s,%s)\n", + insns[0]->uid (), 
insns[1]->uid (), cs1, cs2); + if (stp_def) + fprintf (dump_file, + " re-using mem def from insn %d\n", + stp_def->insn ()->uid ()); + } + + insn_change *change; + for (int i = 0; i < 2; i++) + { + switch (strategy[i]) + { + case DELETE: + changes.quick_push (make_delete (insns[i])); + break; + case TOMBSTONE: + case CHANGE: + change = make_change (insns[i]); + if (strategy[i] == CHANGE) + { + set_pair_pat (change); + change->new_uses = merge_access_arrays (attempt, + input_uses[0], + input_uses[1]); + auto d1 = drop_memory_access (input_defs[0]); + auto d2 = drop_memory_access (input_defs[1]); + change->new_defs = merge_access_arrays (attempt, d1, d2); + gcc_assert (change->new_defs.is_valid ()); + gcc_assert (stp_def); + change->new_defs = insert_access (attempt, + stp_def, + change->new_defs); + gcc_assert (change->new_defs.is_valid ()); + change->move_range = move_range; + pair_change = change; + } + else + { + rtx_insn *rti = insns[i]->rtl (); + gcc_assert (validate_change (rti, &PATTERN (rti), + gen_tombstone (), true)); + gcc_assert (validate_change (rti, &REG_NOTES (rti), + NULL_RTX, true)); + change->new_uses = use_array (nullptr, 0); + have_tombstone_p = true; + } + gcc_assert (change->new_uses.is_valid ()); + changes.quick_push (change); + break; + } + } + + if (!stp_def) + { + // Tricky case. Cannot re-purpose existing insns for stp. + // Need to insert new insn. 
+ if (dump_file) + fprintf (dump_file, + " stp fusion: cannot re-purpose candidate stores\n"); + + auto new_insn = crtl->ssa->create_insn (attempt, INSN, pair_pat); + change = make_change (new_insn); + change->move_range = move_range; + change->new_uses = merge_access_arrays (attempt, + input_uses[0], + input_uses[1]); + gcc_assert (change->new_uses.is_valid ()); + + auto d1 = drop_memory_access (input_defs[0]); + auto d2 = drop_memory_access (input_defs[1]); + change->new_defs = merge_access_arrays (attempt, d1, d2); + gcc_assert (change->new_defs.is_valid ()); + + auto new_set = crtl->ssa->create_set (attempt, new_insn, memory); + change->new_defs = insert_access (attempt, new_set, + change->new_defs); + gcc_assert (change->new_defs.is_valid ()); + changes.safe_insert (1, change); + pair_change = change; + } + } + + if (trailing_add) + changes.quick_push (make_delete (trailing_add)); + + auto n_changes = changes.length (); + gcc_checking_assert (n_changes >= 2 && n_changes <= 4); + + + auto is_changing = insn_is_changing (changes); + for (unsigned i = 0; i < n_changes; i++) + gcc_assert (rtl_ssa::restrict_movement_ignoring (*changes[i], is_changing)); + + // Check the pair pattern is recog'd. + if (!rtl_ssa::recog_ignoring (attempt, *pair_change, is_changing)) + { + if (dump_file) + fprintf (dump_file, " failed to form pair, recog failed\n"); + + // Free any reg notes we allocated. + while (reg_notes) + { + rtx next = XEXP (reg_notes, 1); + free_EXPR_LIST_node (reg_notes); + reg_notes = next; + } + cancel_changes (0); + return false; + } + + gcc_assert (crtl->ssa->verify_insn_changes (changes)); + + confirm_change_group (); + crtl->ssa->change_insns (changes); + emitted_tombstone_p |= have_tombstone_p; + return true; +} + +// Return true if STORE_INSN may modify mem rtx MEM. Make sure we keep +// within our BUDGET for alias analysis. 
+static bool +store_modifies_mem_p (rtx mem, insn_info *store_insn, int &budget) +{ + if (tombstone_insn_p (store_insn)) + return false; + + if (!budget) + { + if (dump_file) + { + fprintf (dump_file, + "exceeded budget, assuming store %d aliases with mem ", + store_insn->uid ()); + print_simple_rtl (dump_file, mem); + fprintf (dump_file, "\n"); + } + + return true; + } + + budget--; + return memory_modified_in_insn_p (mem, store_insn->rtl ()); +} + +// Return true if LOAD may be modified by STORE. Make sure we keep +// within our BUDGET for alias analysis. +static bool +load_modified_by_store_p (insn_info *load, + insn_info *store, + int &budget) +{ + gcc_checking_assert (budget >= 0); + + if (!budget) + { + if (dump_file) + { + fprintf (dump_file, + "exceeded budget, assuming load %d aliases with store %d\n", + load->uid (), store->uid ()); + } + return true; + } + + // It isn't safe to re-order stores over calls. + if (CALL_P (load->rtl ())) + return true; + + budget--; + return modified_in_p (PATTERN (load->rtl ()), store->rtl ()); +} + +struct alias_walker +{ + virtual insn_info *insn () const = 0; + virtual bool valid () const = 0; + virtual bool conflict_p (int &budget) const = 0; + virtual void advance () = 0; +}; + +template +class store_walker : public alias_walker +{ + using def_iter_t = typename std::conditional ::type; + + def_iter_t def_iter; + rtx cand_mem; + insn_info *limit; + +public: + store_walker (def_info *mem_def, rtx mem, insn_info *limit_insn) : + def_iter (mem_def), cand_mem (mem), limit (limit_insn) {} + + bool valid () const override + { + if (!*def_iter) + return false; + + if (reverse) + return *((*def_iter)->insn ()) > *limit; + else + return *((*def_iter)->insn ()) < *limit; + } + insn_info *insn () const override { return (*def_iter)->insn (); } + void advance () override { def_iter++; } + bool conflict_p (int &budget) const override + { + return store_modifies_mem_p (cand_mem, insn (), budget); + } +}; + +template +class 
load_walker : public alias_walker +{ + using def_iter_t = typename std::conditional ::type; + using use_iter_t = typename std::conditional ::type; + + def_iter_t def_iter; + use_iter_t use_iter; + insn_info *cand_store; + insn_info *limit; + + static use_info *start_use_chain (def_iter_t &def_iter) + { + set_info *set = nullptr; + for (; *def_iter; def_iter++) + { + set = dyn_cast (*def_iter); + if (!set) + continue; + + use_info *use = reverse + ? set->last_nondebug_insn_use () + : set->first_nondebug_insn_use (); + + if (use) + return use; + } + + return nullptr; + } + +public: + void advance () override + { + use_iter++; + if (*use_iter) + return; + def_iter++; + use_iter = start_use_chain (def_iter); + } + + insn_info *insn () const override + { + gcc_checking_assert (*use_iter); + return (*use_iter)->insn (); + } + + bool valid () const override + { + if (!*use_iter) + return false; + + if (reverse) + return *((*use_iter)->insn ()) > *limit; + else + return *((*use_iter)->insn ()) < *limit; + } + + bool conflict_p (int &budget) const override + { + return load_modified_by_store_p (insn (), cand_store, budget); + } + + load_walker (def_info *def, insn_info *store, insn_info *limit_insn) + : def_iter (def), use_iter (start_use_chain (def_iter)), + cand_store (store), limit (limit_insn) {} +}; + +// Process our alias_walkers in a round-robin fashion, proceeding until +// nothing more can be learned from alias analysis. +// +// We try to maintain the invariant that if a walker becomes invalid, we +// set its pointer to null. 
+static void +do_alias_analysis (insn_info *alias_hazards[4], + alias_walker *walkers[4], + bool load_p) +{ + const int n_walkers = 2 + (2 * !load_p); + int budget = aarch64_ldp_alias_check_limit; + + auto next_walker = [walkers,n_walkers](int current) -> int { + for (int j = 1; j <= n_walkers; j++) + { + int idx = (current + j) % n_walkers; + if (walkers[idx]) + return idx; + } + return -1; + }; + + int i = -1; + for (int j = 0; j < n_walkers; j++) + { + alias_hazards[j] = nullptr; + if (!walkers[j]) + continue; + + if (!walkers[j]->valid ()) + walkers[j] = nullptr; + else if (i == -1) + i = j; + } + + while (i >= 0) + { + int insn_i = i % 2; + int paired_i = (i & 2) + !insn_i; + int pair_fst = (i & 2); + int pair_snd = (i & 2) + 1; + + if (walkers[i]->conflict_p (budget)) + { + alias_hazards[i] = walkers[i]->insn (); + + // We got an aliasing conflict for this {load,store} walker, + // so we don't need to walk any further. + walkers[i] = nullptr; + + // If we have a pair of alias conflicts that prevent + // forming the pair, stop. There's no need to do further + // analysis. + if (alias_hazards[paired_i] + && (*alias_hazards[pair_fst] <= *alias_hazards[pair_snd])) + return; + + if (!load_p) + { + int other_pair_fst = (pair_fst ? 0 : 2); + int other_paired_i = other_pair_fst + !insn_i; + + int x_pair_fst = (i == pair_fst) ? i : other_paired_i; + int x_pair_snd = (i == pair_fst) ? other_paired_i : i; + + // Similarly, handle the case where we have a {load,store} + // or {store,load} alias hazard pair that prevents forming + // the pair. + if (alias_hazards[other_paired_i] + && *alias_hazards[x_pair_fst] <= *alias_hazards[x_pair_snd]) + return; + } + } + + if (walkers[i]) + { + walkers[i]->advance (); + + if (!walkers[i]->valid ()) + walkers[i] = nullptr; + } + + i = next_walker (i); + } +} + +// Return an integer where bit (1 << i) is set if INSNS[i] uses writeback +// addressing. 
+static int +get_viable_bases (insn_info *insns[2], + vec &base_cands, + rtx cand_mems[2], + unsigned access_size, + bool reversed) +{ + // We discovered this pair through a common base. Need to ensure that + // we have a common base register that is live at both locations. + def_info *base_defs[2] = {}; + int writeback = 0; + for (int i = 0; i < 2; i++) + { + const bool is_lower = (i == reversed); + poly_int64 poly_off; + rtx modify = NULL_RTX; + rtx base = ldp_strip_offset (cand_mems[i], &modify, &poly_off); + if (modify) + writeback |= (1 << i); + + if (!REG_P (base) || !poly_off.is_constant ()) + continue; + + // Punt on accesses relative to eliminable regs. Since we don't know the + // elimination offset pre-RA, we should postpone forming pairs on such + // accesses until after RA. + if (!reload_completed + && (REGNO (base) == FRAME_POINTER_REGNUM + || REGNO (base) == ARG_POINTER_REGNUM)) + continue; + + HOST_WIDE_INT base_off = poly_off.to_constant (); + + // It should be unlikely that we ever punt here, since MEM_EXPR offset + // alignment should be a good proxy for register offset alignment. 
+ if (base_off % access_size != 0) + { + if (dump_file) + fprintf (dump_file, + "base not viable, offset misaligned (insn %d)\n", + insns[i]->uid ()); + continue; + } + + base_off /= access_size; + + if (!is_lower) + base_off--; + + if (base_off < LDP_MIN_IMM || base_off > LDP_MAX_IMM) + continue; + + for (auto use : insns[i]->uses ()) + if (use->is_reg () && use->regno () == REGNO (base)) + { + base_defs[i] = use->def (); + break; + } + } + + if (!base_defs[0] && !base_defs[1]) + { + if (dump_file) + fprintf (dump_file, "no viable base register for pair (%d,%d)\n", + insns[0]->uid (), insns[1]->uid ()); + return writeback; + } + + for (int i = 0; i < 2; i++) + if ((writeback & (1 << i)) && !base_defs[i]) + { + if (dump_file) + fprintf (dump_file, "insn %d has writeback but base isn't viable\n", + insns[i]->uid ()); + return writeback; + } + + if (writeback == 3 + && base_defs[0]->regno () != base_defs[1]->regno ()) + { + if (dump_file) + fprintf (dump_file, + "pair (%d,%d): double writeback with distinct regs (%d,%d): " + "punting\n", + insns[0]->uid (), insns[1]->uid (), + base_defs[0]->regno (), base_defs[1]->regno ()); + return writeback; + } + + if (base_defs[0] && base_defs[1] + && base_defs[0]->regno () == base_defs[1]->regno ()) + { + // Easy case: insns already share the same base reg. + base_cands.quick_push (base_defs[0]); + return writeback; + } + + // Otherwise, we know that one of the bases must change. + // + // Note that if there is writeback we must use the writeback base + // (we know now there is exactly one). + for (int i = 0; i < 2; i++) + if (base_defs[i] && (!writeback || (writeback & (1 << i)))) + base_cands.quick_push (base_cand { base_defs[i], i }); + + return writeback; +} + +// Given two adjacent memory accesses of the same size, I1 and I2, try +// and see if we can merge them into a ldp or stp. 
+static bool +try_fuse_pair (bool load_p, + unsigned access_size, + insn_info *i1, + insn_info *i2, + bool &emitted_tombstone_p) +{ + if (dump_file) + fprintf (dump_file, "analyzing pair (load=%d): (%d,%d)\n", + load_p, i1->uid (), i2->uid ()); + + insn_info *insns[2]; + bool reversed = false; + if (*i1 < *i2) + { + insns[0] = i1; + insns[1] = i2; + } + else + { + insns[0] = i2; + insns[1] = i1; + reversed = true; + } + + rtx cand_mems[2]; + rtx reg_ops[2]; + rtx pats[2]; + for (int i = 0; i < 2; i++) + { + pats[i] = PATTERN (insns[i]->rtl ()); + cand_mems[i] = XEXP (pats[i], load_p); + reg_ops[i] = XEXP (pats[i], !load_p); + } + + if (load_p && reg_overlap_mentioned_p (reg_ops[0], reg_ops[1])) + { + if (dump_file) + fprintf (dump_file, + "punting on ldp due to reg conflicts (%d,%d)\n", + insns[0]->uid (), insns[1]->uid ()); + return false; + } + + if (cfun->can_throw_non_call_exceptions + && (find_reg_note (insns[0]->rtl (), REG_EH_REGION, NULL_RTX) + || find_reg_note (insns[1]->rtl (), REG_EH_REGION, NULL_RTX)) + && insn_could_throw_p (insns[0]->rtl ()) + && insn_could_throw_p (insns[1]->rtl ())) + { + if (dump_file) + fprintf (dump_file, + "can't combine insns with EH side effects (%d,%d)\n", + insns[0]->uid (), insns[1]->uid ()); + return false; + } + + auto_vec base_cands; + base_cands.reserve (2); + + int writeback = get_viable_bases (insns, base_cands, cand_mems, + access_size, reversed); + if (base_cands.is_empty ()) + { + if (dump_file) + fprintf (dump_file, "no viable base for pair (%d,%d)\n", + insns[0]->uid (), insns[1]->uid ()); + return false; + } + + rtx *ignore = &XEXP (pats[1], load_p); + for (auto use : insns[1]->uses ()) + if (!use->is_mem () + && refers_to_regno_p (use->regno (), use->regno () + 1, pats[1], ignore) + && use->def () && use->def ()->insn () == insns[0]) + { + // N.B. we allow a true dependence on the base address, as this + // happens in the case of auto-inc accesses. 
Consider a post-increment + // load followed by a regular indexed load, for example. + if (dump_file) + fprintf (dump_file, + "%d has non-address true dependence on %d, rejecting pair\n", + insns[1]->uid (), insns[0]->uid ()); + return false; + } + + unsigned i = 0; + while (i < base_cands.length ()) + { + base_cand &cand = base_cands[i]; + + rtx *ignore[2] = {}; + for (int j = 0; j < 2; j++) + if (cand.from_insn == !j) + ignore[j] = &XEXP (cand_mems[j], 0); + + insn_info *h = first_hazard_after (insns[0], ignore[0]); + if (h && *h <= *insns[1]) + cand.hazards[0] = h; + + h = latest_hazard_before (insns[1], ignore[1]); + if (h && *h >= *insns[0]) + cand.hazards[1] = h; + + if (!cand.viable ()) + { + if (dump_file) + fprintf (dump_file, + "pair (%d,%d): rejecting base %d due to dataflow " + "hazards (%d,%d)\n", + insns[0]->uid (), + insns[1]->uid (), + cand.m_def->regno (), + cand.hazards[0]->uid (), + cand.hazards[1]->uid ()); + + base_cands.ordered_remove (i); + } + else + i++; + } + + if (base_cands.is_empty ()) + { + if (dump_file) + fprintf (dump_file, + "can't form pair (%d,%d) due to dataflow hazards\n", + insns[0]->uid (), insns[1]->uid ()); + return false; + } + + insn_info *alias_hazards[4] = {}; + + // First def of memory after the first insn, and last def of memory + // before the second insn, respectively. 
+ def_info *mem_defs[2] = {}; + if (load_p) + { + if (!MEM_READONLY_P (cand_mems[0])) + { + mem_defs[0] = memory_access (insns[0]->uses ())->def (); + gcc_checking_assert (mem_defs[0]); + mem_defs[0] = mem_defs[0]->next_def (); + } + if (!MEM_READONLY_P (cand_mems[1])) + { + mem_defs[1] = memory_access (insns[1]->uses ())->def (); + gcc_checking_assert (mem_defs[1]); + } + } + else + { + mem_defs[0] = memory_access (insns[0]->defs ())->next_def (); + mem_defs[1] = memory_access (insns[1]->defs ())->prev_def (); + gcc_checking_assert (mem_defs[0]); + gcc_checking_assert (mem_defs[1]); + } + + store_walker forward_store_walker (mem_defs[0], + cand_mems[0], + insns[1]); + store_walker backward_store_walker (mem_defs[1], + cand_mems[1], + insns[0]); + alias_walker *walkers[4] = {}; + if (mem_defs[0]) + walkers[0] = &forward_store_walker; + if (mem_defs[1]) + walkers[1] = &backward_store_walker; + + if (load_p && (mem_defs[0] || mem_defs[1])) + do_alias_analysis (alias_hazards, walkers, load_p); + else + { + // We want to find any loads hanging off the first store. + mem_defs[0] = memory_access (insns[0]->defs ()); + load_walker forward_load_walker (mem_defs[0], insns[0], insns[1]); + load_walker backward_load_walker (mem_defs[1], insns[1], insns[0]); + walkers[2] = &forward_load_walker; + walkers[3] = &backward_load_walker; + do_alias_analysis (alias_hazards, walkers, load_p); + // Now consolidate hazards back down. 
+ if (alias_hazards[2] + && (!alias_hazards[0] || (*alias_hazards[2] < *alias_hazards[0]))) + alias_hazards[0] = alias_hazards[2]; + + if (alias_hazards[3] + && (!alias_hazards[1] || (*alias_hazards[3] > *alias_hazards[1]))) + alias_hazards[1] = alias_hazards[3]; + } + + if (alias_hazards[0] && alias_hazards[1] + && *alias_hazards[0] <= *alias_hazards[1]) + { + if (dump_file) + fprintf (dump_file, + "cannot form pair (%d,%d) due to alias conflicts (%d,%d)\n", + i1->uid (), i2->uid (), + alias_hazards[0]->uid (), alias_hazards[1]->uid ()); + return false; + } + + // Now narrow the hazards on each base candidate using + // the alias hazards. + i = 0; + while (i < base_cands.length ()) + { + base_cand &cand = base_cands[i]; + if (alias_hazards[0] && (!cand.hazards[0] + || *alias_hazards[0] < *cand.hazards[0])) + cand.hazards[0] = alias_hazards[0]; + if (alias_hazards[1] && (!cand.hazards[1] + || *alias_hazards[1] > *cand.hazards[1])) + cand.hazards[1] = alias_hazards[1]; + + if (cand.viable ()) + i++; + else + { + if (dump_file) + fprintf (dump_file, "pair (%d,%d): rejecting base %d due to " + "alias/dataflow hazards (%d,%d)", + insns[0]->uid (), insns[1]->uid (), + cand.m_def->regno (), + cand.hazards[0]->uid (), + cand.hazards[1]->uid ()); + + base_cands.ordered_remove (i); + } + } + + if (base_cands.is_empty ()) + { + if (dump_file) + fprintf (dump_file, + "cannot form pair (%d,%d) due to alias/dataflow hazards", + insns[0]->uid (), insns[1]->uid ()); + + return false; + } + + base_cand *base = &base_cands[0]; + if (base_cands.length () > 1) + { + // If there are still multiple viable bases, it makes sense + // to choose one that allows us to reduce register pressure, + // for loads this means moving further down, for stores this + // means moving further up. 
+ gcc_checking_assert (base_cands.length () == 2); + const int hazard_i = !load_p; + if (base->hazards[hazard_i]) + { + if (!base_cands[1].hazards[hazard_i]) + base = &base_cands[1]; + else if (load_p + && *base_cands[1].hazards[hazard_i] + > *(base->hazards[hazard_i])) + base = &base_cands[1]; + else if (!load_p + && *base_cands[1].hazards[hazard_i] + < *(base->hazards[hazard_i])) + base = &base_cands[1]; + } + } + + // Otherwise, hazards[0] > hazards[1]. + // Pair can be formed anywhere in (hazards[1], hazards[0]). + insn_range_info range (insns[0], insns[1]); + if (base->hazards[1]) + range.first = base->hazards[1]; + if (base->hazards[0]) + range.last = base->hazards[0]->prev_nondebug_insn (); + + // Placement strategy: push loads down and pull stores up, this should + // help register pressure by reducing live ranges. + if (load_p) + range.first = range.last; + else + range.last = range.first; + + if (dump_file) + { + auto print_hazard = [](insn_info *i) + { + if (i) + fprintf (dump_file, "%d", i->uid ()); + else + fprintf (dump_file, "-"); + }; + auto print_pair = [print_hazard](insn_info **i) + { + print_hazard (i[0]); + fprintf (dump_file, ","); + print_hazard (i[1]); + }; + + fprintf (dump_file, "fusing pair [L=%d] (%d,%d), base=%d, hazards: (", + load_p, insns[0]->uid (), insns[1]->uid (), + base->m_def->regno ()); + print_pair (base->hazards); + fprintf (dump_file, "), move_range: (%d,%d)\n", + range.first->uid (), range.last->uid ()); + } + + return fuse_pair (load_p, access_size, writeback, + i1, i2, *base, range, emitted_tombstone_p); +} + +// Erase [l.begin (), i] inclusive, respecting iterator order. +static insn_iter_t +erase_prefix (insn_list_t &l, insn_iter_t i) +{ + l.erase (l.begin (), std::next (i)); + return l.begin (); +} + +static insn_iter_t +erase_one (insn_list_t &l, insn_iter_t i, insn_iter_t begin) +{ + auto prev_or_next = (i == begin) ? 
std::next (i) : std::prev (i); + l.erase (i); + return prev_or_next; +} + +static void +dump_insn_list (FILE *f, const insn_list_t &l) +{ + fprintf (f, "("); + + auto i = l.begin (); + auto end = l.end (); + + if (i != end) + fprintf (f, "%d", (*i)->uid ()); + i++; + + for (; i != end; i++) + { + fprintf (f, ", %d", (*i)->uid ()); + } + + fprintf (f, ")"); +} + +DEBUG_FUNCTION void +debug (const insn_list_t &l) +{ + dump_insn_list (stderr, l); + fprintf (stderr, "\n"); +} + +void +merge_pairs (insn_iter_t l_begin, + insn_iter_t l_end, + insn_iter_t r_begin, + insn_iter_t r_end, + insn_list_t &left_list, + insn_list_t &right_list, + hash_set &to_delete, + bool load_p, + unsigned access_size, + bool &emitted_tombstone_p) +{ + auto iter_l = l_begin; + auto iter_r = r_begin; + + bool result; + while (l_begin != l_end && r_begin != r_end) + { + auto next_l = std::next (iter_l); + auto next_r = std::next (iter_r); + if (**iter_l < **iter_r + && next_l != l_end + && **next_l < **iter_r) + { + iter_l = next_l; + continue; + } + else if (**iter_r < **iter_l + && next_r != r_end + && **next_r < **iter_l) + { + iter_r = next_r; + continue; + } + + bool update_l = false; + bool update_r = false; + + result = try_fuse_pair (load_p, access_size, + *iter_l, *iter_r, + emitted_tombstone_p); + if (result) + { + update_l = update_r = true; + if (to_delete.add (*iter_r)) + gcc_unreachable (); // Shouldn't get added twice. + + iter_l = erase_one (left_list, iter_l, l_begin); + iter_r = erase_one (right_list, iter_r, r_begin); + } + else + { + // Here we know that the entire prefix we skipped + // over cannot merge with anything further on + // in iteration order (there are aliasing hazards + // on both sides), so delete the entire prefix. + if (**iter_l < **iter_r) + { + // Delete everything from l_begin to iter_l, inclusive. + update_l = true; + iter_l = erase_prefix (left_list, iter_l); + } + else + { + // Delete everything from r_begin to iter_r, inclusive. 
+ update_r = true; + iter_r = erase_prefix (right_list, iter_r); + } + } + + if (update_l) + { + l_begin = left_list.begin (); + l_end = left_list.end (); + } + if (update_r) + { + r_begin = right_list.begin (); + r_end = right_list.end (); + } + } +} + +// Given a list of insns LEFT_ORIG with all accesses adjacent to +// those in RIGHT_ORIG, try and form them into pairs. +// +// Return true iff we formed all the RIGHT_ORIG candidates into +// pairs. +bool +ldp_bb_info::try_form_pairs (insn_list_t *left_orig, + insn_list_t *right_orig, + bool load_p, unsigned access_size) +{ + // Make a copy of the right list which we can modify to + // exclude candidates locally for this invocation. + insn_list_t right_copy (*right_orig); + + if (dump_file) + { + fprintf (dump_file, "try_form_pairs [L=%d], cand vecs ", load_p); + dump_insn_list (dump_file, *left_orig); + fprintf (dump_file, " x "); + dump_insn_list (dump_file, right_copy); + fprintf (dump_file, "\n"); + } + + // List of candidate insns to delete from the original right_list + // (because they were formed into a pair). + hash_set to_delete; + + // Now we have a 2D matrix of candidates, traverse it to try and + // find a pair of insns that are already adjacent (within the + // merged list of accesses). + merge_pairs (left_orig->begin (), left_orig->end (), + right_copy.begin (), right_copy.end (), + *left_orig, right_copy, + to_delete, load_p, access_size, + m_emitted_tombstone); + + // If we formed all right candidates into pairs, + // then we can skip the next iteration. + if (to_delete.elements () == right_orig->size ()) + return true; + + // Delete items from to_delete. 
+  auto right_iter = right_orig->begin ();
+  auto right_end = right_orig->end ();
+  while (right_iter != right_end)
+    {
+      auto right_next = std::next (right_iter);
+
+      if (to_delete.contains (*right_iter))
+        {
+          right_orig->erase (right_iter);
+          right_end = right_orig->end ();
+        }
+
+      right_iter = right_next;
+    }
+
+  return false;
+}
+
+void
+ldp_bb_info::transform_for_base (int encoded_lfs,
+                                 access_group &group)
+{
+  const auto lfs = decode_lfs (encoded_lfs);
+  const unsigned access_size = lfs.size;
+
+  bool skip_next = true;
+  access_record *prev_access = nullptr;
+
+  for (auto &access : group.list)
+    {
+      if (skip_next)
+        skip_next = false;
+      else if (known_eq (access.offset, prev_access->offset + access_size))
+        skip_next = try_form_pairs (&prev_access->cand_insns,
+                                    &access.cand_insns,
+                                    lfs.load_p, access_size);
+
+      prev_access = &access;
+    }
+}
+
+void
+ldp_bb_info::cleanup_tombstones ()
+{
+  // No need to do anything if we didn't emit a tombstone insn for this bb.
+  if (!m_emitted_tombstone)
+    return;
+
+  insn_info *insn = m_bb->head_insn ();
+  while (insn)
+    {
+      insn_info *next = insn->next_nondebug_insn ();
+      if (!insn->is_real () || !tombstone_insn_p (insn))
+        {
+          insn = next;
+          continue;
+        }
+
+      auto def = memory_access (insn->defs ());
+      auto set = dyn_cast<set_info *> (def);
+      if (set && set->has_any_uses ())
+        {
+          def_info *prev_def = def->prev_def ();
+          auto prev_set = dyn_cast<set_info *> (prev_def);
+          if (!prev_set)
+            gcc_unreachable (); // TODO: handle this if needed.
+
+          while (set->first_use ())
+            crtl->ssa->reparent_use (set->first_use (), prev_set);
+        }
+
+      // Now set has no uses, we can delete it.
+      insn_change change (insn, insn_change::DELETE);
+      crtl->ssa->change_insn (change);
+      insn = next;
+    }
+}
+
+template<typename Map>
+void
+ldp_bb_info::traverse_base_map (Map &map)
+{
+  for (auto kv : map)
+    {
+      const auto &key = kv.first;
+      auto &value = kv.second;
+      transform_for_base (key.second, value);
+    }
+}
+
+void
+ldp_bb_info::transform ()
+{
+  traverse_base_map (expr_map);
+  traverse_base_map (def_map);
+}
+
+static void
+ldp_fusion_init ()
+{
+  calculate_dominance_info (CDI_DOMINATORS);
+  df_analyze ();
+  crtl->ssa = new rtl_ssa::function_info (cfun);
+}
+
+static void
+ldp_fusion_destroy ()
+{
+  if (crtl->ssa->perform_pending_updates ())
+    cleanup_cfg (0);
+
+  free_dominance_info (CDI_DOMINATORS);
+
+  delete crtl->ssa;
+  crtl->ssa = nullptr;
+}
+
+static rtx
+aarch64_destructure_load_pair (rtx regs[2], rtx pattern)
+{
+  rtx mem = NULL_RTX;
+
+  for (int i = 0; i < 2; i++)
+    {
+      rtx pat = XVECEXP (pattern, 0, i);
+      regs[i] = XEXP (pat, 0);
+      rtx unspec = XEXP (pat, 1);
+      gcc_checking_assert (GET_CODE (unspec) == UNSPEC);
+      rtx this_mem = XVECEXP (unspec, 0, 0);
+      if (mem)
+        gcc_checking_assert (rtx_equal_p (mem, this_mem));
+      else
+        {
+          gcc_checking_assert (MEM_P (this_mem));
+          mem = this_mem;
+        }
+    }
+
+  return mem;
+}
+
+static rtx
+aarch64_destructure_store_pair (rtx regs[2], rtx pattern)
+{
+  rtx mem = XEXP (pattern, 0);
+  rtx unspec = XEXP (pattern, 1);
+  gcc_checking_assert (GET_CODE (unspec) == UNSPEC);
+  for (int i = 0; i < 2; i++)
+    regs[i] = XVECEXP (unspec, 0, i);
+  return mem;
+}
+
+static rtx
+aarch64_gen_writeback_pair (rtx wb_effect, rtx pair_mem, rtx regs[2],
+                            bool load_p)
+{
+  auto op_mode = aarch64_operand_mode_for_pair_mode (GET_MODE (pair_mem));
+
+  machine_mode modes[2];
+  for (int i = 0; i < 2; i++)
+    {
+      machine_mode mode = GET_MODE (regs[i]);
+      if (load_p)
+        gcc_checking_assert (mode != VOIDmode);
+      else if (mode == VOIDmode)
+        mode = op_mode;
+
+      modes[i] = mode;
+    }
+
+  const auto op_size = GET_MODE_SIZE (modes[0]);
+  gcc_checking_assert
(known_eq (op_size, GET_MODE_SIZE (modes[1])));
+
+  rtx pats[2];
+  for (int i = 0; i < 2; i++)
+    {
+      rtx mem = adjust_address_nv (pair_mem, modes[i], op_size * i);
+      pats[i] = load_p
+        ? gen_rtx_SET (regs[i], mem)
+        : gen_rtx_SET (mem, regs[i]);
+    }
+
+  return gen_rtx_PARALLEL (VOIDmode,
+                           gen_rtvec (3, wb_effect, pats[0], pats[1]));
+}
+
+// Given an existing pair insn INSN, look for a trailing update of
+// the base register which we can fold in to make this pair use
+// a writeback addressing mode.
+static void
+try_promote_writeback (insn_info *insn)
+{
+  auto rti = insn->rtl ();
+  const auto attr = get_attr_ldpstp (rti);
+  if (attr == LDPSTP_NONE)
+    return;
+
+  bool load_p = (attr == LDPSTP_LDP);
+  gcc_checking_assert (load_p || attr == LDPSTP_STP);
+
+  rtx regs[2];
+  rtx mem = NULL_RTX;
+  if (load_p)
+    mem = aarch64_destructure_load_pair (regs, PATTERN (rti));
+  else
+    mem = aarch64_destructure_store_pair (regs, PATTERN (rti));
+  gcc_checking_assert (MEM_P (mem));
+
+  poly_int64 offset;
+  rtx base = strip_offset (XEXP (mem, 0), &offset);
+  gcc_assert (REG_P (base));
+
+  const auto access_size = GET_MODE_SIZE (GET_MODE (mem)).to_constant () / 2;
+
+  if (find_access (insn->defs (), REGNO (base)))
+    {
+      gcc_assert (load_p);
+      if (dump_file)
+        fprintf (dump_file,
+                 "ldp %d clobbers base r%d, can't promote to writeback\n",
+                 insn->uid (), REGNO (base));
+      return;
+    }
+
+  auto base_use = find_access (insn->uses (), REGNO (base));
+  gcc_assert (base_use);
+
+  if (!base_use->def ())
+    {
+      if (dump_file)
+        fprintf (dump_file,
+                 "found pair (i%d, L=%d): but base r%d is upwards exposed\n",
+                 insn->uid (), load_p, REGNO (base));
+      return;
+    }
+
+  auto base_def = base_use->def ();
+
+  rtx wb_effect = NULL_RTX;
+  def_info *add_def;
+  const insn_range_info pair_range (insn->prev_nondebug_insn ());
+  insn_info *insns[2] = { nullptr, insn };
+  insn_info *trailing_add = find_trailing_add (insns, pair_range, &wb_effect,
+                                               &add_def, base_def, offset,
+                                               access_size);
+  if
(!trailing_add)
+    return;
+
+  auto attempt = crtl->ssa->new_change_attempt ();
+
+  insn_change pair_change (insn);
+  insn_change del_change (trailing_add, insn_change::DELETE);
+  insn_change *changes[] = { &pair_change, &del_change };
+
+  rtx pair_pat = aarch64_gen_writeback_pair (wb_effect, mem, regs, load_p);
+  gcc_assert (validate_unshare_change (rti, &PATTERN (rti), pair_pat, true));
+
+  // The pair must gain the def of the base register from the add.
+  pair_change.new_defs = insert_access (attempt,
+                                        add_def,
+                                        pair_change.new_defs);
+  gcc_assert (pair_change.new_defs.is_valid ());
+
+  pair_change.move_range = insn_range_info (insn->prev_nondebug_insn ());
+
+  auto is_changing = insn_is_changing (changes);
+  for (unsigned i = 0; i < ARRAY_SIZE (changes); i++)
+    gcc_assert (rtl_ssa::restrict_movement_ignoring (*changes[i], is_changing));
+
+  gcc_assert (rtl_ssa::recog_ignoring (attempt, pair_change, is_changing));
+  gcc_assert (crtl->ssa->verify_insn_changes (changes));
+  confirm_change_group ();
+  crtl->ssa->change_insns (changes);
+}
+
+void ldp_fusion_bb (bb_info *bb)
+{
+  const bool track_loads
+    = aarch64_tune_params.ldp_policy_model != AARCH64_LDP_STP_POLICY_NEVER;
+  const bool track_stores
+    = aarch64_tune_params.stp_policy_model != AARCH64_LDP_STP_POLICY_NEVER;
+
+  ldp_bb_info bb_state (bb);
+
+  for (auto insn : bb->nondebug_insns ())
+    {
+      rtx_insn *rti = insn->rtl ();
+
+      if (!rti || !INSN_P (rti))
+        continue;
+
+      rtx pat = PATTERN (rti);
+      if (reload_completed
+          && aarch64_ldp_writeback > 1
+          && GET_CODE (pat) == PARALLEL
+          && XVECLEN (pat, 0) == 2)
+        try_promote_writeback (insn);
+
+      if (GET_CODE (pat) != SET)
+        continue;
+
+      if (track_stores && MEM_P (XEXP (pat, 0)))
+        bb_state.track_access (insn, false, XEXP (pat, 0));
+      else if (track_loads && MEM_P (XEXP (pat, 1)))
+        bb_state.track_access (insn, true, XEXP (pat, 1));
+    }
+
+  bb_state.transform ();
+  bb_state.cleanup_tombstones ();
+}
+
+void ldp_fusion ()
+{
+  ldp_fusion_init ();
+
+  for (auto
bb : crtl->ssa->bbs ())
+    ldp_fusion_bb (bb);
+
+  ldp_fusion_destroy ();
+}
+
+namespace {
+
+const pass_data pass_data_ldp_fusion =
+{
+  RTL_PASS, /* type */
+  "ldp_fusion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_df_finish, /* todo_flags_finish */
+};
+
+class pass_ldp_fusion : public rtl_opt_pass
+{
+public:
+  pass_ldp_fusion (gcc::context *ctx)
+    : rtl_opt_pass (pass_data_ldp_fusion, ctx)
+    {}
+
+  opt_pass *clone () override { return new pass_ldp_fusion (m_ctxt); }
+
+  bool gate (function *) final override
+    {
+      if (!optimize || optimize_debug)
+        return false;
+
+      // If the tuning policy says never to form ldps or stps, don't run
+      // the pass.
+      if ((aarch64_tune_params.ldp_policy_model
+           == AARCH64_LDP_STP_POLICY_NEVER)
+          && (aarch64_tune_params.stp_policy_model
+              == AARCH64_LDP_STP_POLICY_NEVER))
+        return false;
+
+      if (reload_completed)
+        return flag_aarch64_late_ldp_fusion;
+      else
+        return flag_aarch64_early_ldp_fusion;
+    }
+
+  unsigned execute (function *) final override
+    {
+      ldp_fusion ();
+      return 0;
+    }
+};
+
+} // anon namespace
+
+rtl_opt_pass *
+make_pass_ldp_fusion (gcc::context *ctx)
+{
+  return new pass_ldp_fusion (ctx);
+}
+
+#include "gt-aarch64-ldp-fusion.h"
diff --git a/gcc/config/aarch64/aarch64-passes.def b/gcc/config/aarch64/aarch64-passes.def
index 6ace797b738..f38c642414e 100644
--- a/gcc/config/aarch64/aarch64-passes.def
+++ b/gcc/config/aarch64/aarch64-passes.def
@@ -23,3 +23,5 @@ INSERT_PASS_BEFORE (pass_reorder_blocks, 1, pass_track_speculation);
 INSERT_PASS_AFTER (pass_machine_reorg, 1, pass_tag_collision_avoidance);
 INSERT_PASS_BEFORE (pass_shorten_branches, 1, pass_insert_bti);
 INSERT_PASS_AFTER (pass_if_after_combine, 1, pass_cc_fusion);
+INSERT_PASS_BEFORE (pass_early_remat, 1, pass_ldp_fusion);
+INSERT_PASS_BEFORE (pass_peephole2, 1, pass_ldp_fusion);
diff --git
a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 2ab54f244a7..fd75aa115d1 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1055,6 +1055,7 @@ rtl_opt_pass *make_pass_track_speculation (gcc::context *);
 rtl_opt_pass *make_pass_tag_collision_avoidance (gcc::context *);
 rtl_opt_pass *make_pass_insert_bti (gcc::context *ctxt);
 rtl_opt_pass *make_pass_cc_fusion (gcc::context *ctxt);
+rtl_opt_pass *make_pass_ldp_fusion (gcc::context *);

 poly_uint64 aarch64_regmode_natural_size (machine_mode);

diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index f5a518202a1..a69c37ce33b 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -271,6 +271,16 @@ mtrack-speculation
 Target Var(aarch64_track_speculation)
 Generate code to track when the CPU might be speculating incorrectly.

+mearly-ldp-fusion
+Target Var(flag_aarch64_early_ldp_fusion) Optimization Init(1)
+Enable the pre-RA AArch64-specific pass to fuse loads and stores into
+ldp and stp instructions.
+
+mlate-ldp-fusion
+Target Var(flag_aarch64_late_ldp_fusion) Optimization Init(1)
+Enable the post-RA AArch64-specific pass to fuse loads and stores into
+ldp and stp instructions.
+
 mstack-protector-guard=
 Target RejectNegative Joined Enum(stack_protector_guard) Var(aarch64_stack_protector_guard) Init(SSP_GLOBAL)
 Use given stack-protector guard.
@@ -360,3 +370,16 @@ Enum(aarch64_ldp_stp_policy) String(never) Value(AARCH64_LDP_STP_POLICY_NEVER)

 EnumValue
 Enum(aarch64_ldp_stp_policy) String(aligned) Value(AARCH64_LDP_STP_POLICY_ALIGNED)
+
+-param=aarch64-ldp-alias-check-limit=
+Target Joined UInteger Var(aarch64_ldp_alias_check_limit) Init(8) IntegerRange(0, 65536) Param
+Limit on number of alias checks performed when attempting to form an ldp/stp.
+
+-param=aarch64-ldp-writeback=
+Target Joined UInteger Var(aarch64_ldp_writeback) Init(2) IntegerRange(0,2) Param
+Param to control which writeback opportunities we try to handle in the
+load/store pair fusion pass. A value of zero disables writeback
+handling. One means we try to form pairs involving one or more existing
+individual writeback accesses where possible. A value of two means we
+also try to opportunistically form writeback opportunities by folding in
+trailing destructive updates of the base register used by a pair.
diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64
index a9a244ab6d6..37917344a54 100644
--- a/gcc/config/aarch64/t-aarch64
+++ b/gcc/config/aarch64/t-aarch64
@@ -176,6 +176,13 @@ aarch64-cc-fusion.o: $(srcdir)/config/aarch64/aarch64-cc-fusion.cc \
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
 		$(srcdir)/config/aarch64/aarch64-cc-fusion.cc

+aarch64-ldp-fusion.o: $(srcdir)/config/aarch64/aarch64-ldp-fusion.cc \
+    $(CONFIG_H) $(SYSTEM_H) $(CORETYPES_H) $(BACKEND_H) $(RTL_H) $(DF_H) \
+    $(RTL_SSA_H) cfgcleanup.h tree-pass.h ordered-hash-map.h tree-dfa.h \
+    fold-const.h tree-hash-traits.h print-tree.h
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+		$(srcdir)/config/aarch64/aarch64-ldp-fusion.cc
+
 comma=,
 MULTILIB_OPTIONS = $(subst $(comma),/, $(patsubst %, mabi=%, $(subst $(comma),$(comma)mabi=,$(TM_MULTILIB_CONFIG))))
 MULTILIB_DIRNAMES = $(subst $(comma), ,$(TM_MULTILIB_CONFIG))

From patchwork Thu Nov 16 18:11:46 2023
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1864876
Date: Thu, 16 Nov 2023 18:11:46 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH 11/11] aarch64: Use individual loads/stores for mem{cpy,set} expansion

This patch adjusts the
mem{cpy,set} expansion in the aarch64 backend to use individual
loads/stores instead of ldp/stp at expand time. The idea is to rely on
the ldp fusion pass to fuse the accesses together later in the RTL
pipeline.

The earlier parts of the RTL pipeline should be able to do a better job
with the individual (non-paired) accesses, especially given that an
earlier patch in this series moves the pair representation to use
unspecs.

Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

	* config/aarch64/aarch64.cc
	(aarch64_copy_one_block_and_progress_pointers): Emit individual
	accesses instead of load/store pairs.
	(aarch64_set_one_block_and_progress_pointer): Likewise.
---
 gcc/config/aarch64/aarch64.cc | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 1f6094bf1bc..315ba7119c0 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25457,9 +25457,12 @@ aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst,
   /* "Cast" the pointers to the correct mode. */
   *src = adjust_address (*src, mode, 0);
   *dst = adjust_address (*dst, mode, 0);
-  /* Emit the memcpy. */
-  emit_insn (aarch64_gen_load_pair (reg1, reg2, *src));
-  emit_insn (aarch64_gen_store_pair (*dst, reg1, reg2));
+  /* Emit the memcpy. The load/store pair pass should form
+     a load/store pair from these moves. */
+  emit_move_insn (reg1, *src);
+  emit_move_insn (reg2, aarch64_progress_pointer (*src));
+  emit_move_insn (*dst, reg1);
+  emit_move_insn (aarch64_progress_pointer (*dst), reg2);
   /* Move the pointers forward. */
   *src = aarch64_move_pointer (*src, 32);
   *dst = aarch64_move_pointer (*dst, 32);
@@ -25638,7 +25641,8 @@ aarch64_set_one_block_and_progress_pointer (rtx src, rtx *dst,
   /* "Cast" the *dst to the correct mode. */
   *dst = adjust_address (*dst, mode, 0);
   /* Emit the memset.
*/
-  emit_insn (aarch64_gen_store_pair (*dst, src, src));
+  emit_move_insn (*dst, src);
+  emit_move_insn (aarch64_progress_pointer (*dst), src);
   /* Move the pointers forward. */
   *dst = aarch64_move_pointer (*dst, 32);
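
Illustration only, not part of the patch: the idea in the cover text
(emit individual accesses at expand time and let a later pass pair them
up) can be modelled with a small standalone C++ sketch. Everything in
it is hypothetical: Access, Pair, Fused, expand_copy_block and
fuse_adjacent are made-up names, not GCC APIs, and the toy pairing loop
deliberately ignores the alias and dependence checks that the real ldp
fusion pass has to perform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of one memory access.  This is NOT a GCC data structure.
struct Access
{
  bool is_load;         // true for a load, false for a store
  int base;             // hypothetical id of the base register
  int reg;              // hypothetical id of the data register
  std::int64_t offset;  // byte offset from the base
  unsigned size;        // access size in bytes
};

// Two accesses that the toy "fusion" step decided to pair.
struct Pair
{
  Access first, second;
};

struct Fused
{
  std::vector<Pair> pairs;      // accesses that were paired
  std::vector<Access> singles;  // accesses left on their own
};

// Model of what the expander emits for one 32-byte block after this
// patch: two 16-byte loads followed by two 16-byte stores, instead of
// one load pair and one store pair.
std::vector<Access>
expand_copy_block (int src_base, int dst_base, int reg1, int reg2)
{
  return {
    { true, src_base, reg1, 0, 16 },
    { true, src_base, reg2, 16, 16 },
    { false, dst_base, reg1, 0, 16 },
    { false, dst_base, reg2, 16, 16 },
  };
}

// Greedy toy fusion: pair each access with the first later access of
// the same kind, base and size whose offset is exactly one access size
// higher.  Assumption: no hazards exist between the two accesses.
Fused
fuse_adjacent (const std::vector<Access> &insns)
{
  Fused out;
  std::vector<bool> used (insns.size (), false);
  for (std::size_t i = 0; i < insns.size (); i++)
    {
      if (used[i])
        continue;
      used[i] = true;
      bool paired = false;
      for (std::size_t j = i + 1; j < insns.size () && !paired; j++)
        {
          const Access &a = insns[i];
          const Access &b = insns[j];
          if (!used[j]
              && a.is_load == b.is_load
              && a.base == b.base
              && a.size == b.size
              && b.offset == a.offset + static_cast<std::int64_t> (a.size))
            {
              out.pairs.push_back (Pair { a, b });
              used[j] = true;
              paired = true;
            }
        }
      if (!paired)
        out.singles.push_back (insns[i]);
    }
  return out;
}
```

Under those assumptions, passing the result of
expand_copy_block (0, 1, 2, 3) to fuse_adjacent yields one load pair and
one store pair with nothing left over, which is the shape the memcpy
expansion produced directly before this patch.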