From patchwork Thu Dec 7 14:43:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Coplan
X-Patchwork-Id: 1873235
Date: Thu, 7 Dec 2023 14:43:28 +0000
From: Alex Coplan
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Kyrylo Tkachov
Subject: [PATCH v3 08/11] aarch64: Generalize writeback ldp/stp patterns
List-Id: Gcc-patches mailing list

Hi,

This is a v3 patch which is rebased on top of the SME changes. Otherwise
it is the same as v2, posted here:

https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639367.html

Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

-- >8 --

Thus far the writeback forms of ldp/stp have been exclusively used in
prologue and epilogue code for saving/restoring of registers to/from the
stack.

As such, forms of ldp/stp that weren't needed for prologue/epilogue code
weren't supported by the aarch64 backend. This patch generalizes the
load/store pair writeback patterns to allow:

 - Base registers other than the stack pointer.
 - Modes that weren't previously supported.
 - Combinations of distinct modes provided they have the same size.
 - Pre/post variants that weren't previously needed in prologue/epilogue
   code.
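
As a rough illustration of the pre/post distinction in the list above, the
following C++ sketch models only the address arithmetic of the writeback
forms. The names PairAccess, post_index and pre_index are hypothetical and
are not part of GCC; the model assumes both transfer registers have the
same size.

```cpp
#include <cstdint>

// Hypothetical model (not GCC code): the two addresses accessed by a
// writeback ldp/stp and the value left in the base register afterwards.
struct PairAccess {
  std::int64_t first;     // address of the first transfer
  std::int64_t second;    // address of the second transfer
  std::int64_t new_base;  // base register value after writeback
};

// Post-index form, e.g. "ldp x2, x3, [x1], #imm": access at the old base,
// then add the immediate to the base.
constexpr PairAccess post_index (std::int64_t base, std::int64_t imm,
                                 std::int64_t size)
{
  return { base, base + size, base + imm };
}

// Pre-index form, e.g. "stp x2, x3, [x1, #imm]!": add the immediate to the
// base first, then access at the updated base.
constexpr PairAccess pre_index (std::int64_t base, std::int64_t imm,
                                std::int64_t size)
{
  return { base + imm, base + imm + size, base + imm };
}

// An epilogue-style pop of two 8-byte registers from a 16-byte slot.
static_assert (post_index (0x1000, 16, 8).second == 0x1008, "second load");
static_assert (post_index (0x1000, 16, 8).new_base == 0x1010, "writeback");
// A prologue-style push of two 8-byte registers.
static_assert (pre_index (0x1000, -16, 8).first == 0xff0, "first store");
static_assert (pre_index (0x1000, -16, 8).second == 0xff8, "second store");
```

The static_asserts mirror how the existing prologue/epilogue code uses the
two forms; the generalized patterns apply the same arithmetic to any base
register.
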
We make quite some effort to avoid a combinatorial explosion in the
number of patterns generated (and those in the source) by making
extensive use of special predicates.

An updated version of the upcoming ldp/stp pass can generate the
writeback forms, so this patch is motivated by that.

This patch doesn't add zero-extending or sign-extending forms of the
writeback patterns; that is left for future work.

gcc/ChangeLog:

        * config/aarch64/aarch64-protos.h (aarch64_ldpstp_operand_mode_p):
        Declare.
        * config/aarch64/aarch64.cc (aarch64_gen_storewb_pair): Build RTL
        directly instead of invoking named pattern.
        (aarch64_gen_loadwb_pair): Likewise.
        (aarch64_ldpstp_operand_mode_p): New.
        * config/aarch64/aarch64.md (loadwb_pair_): Replace with ...
        (*loadwb_post_pair_): ... this. Generalize as described in
        cover letter.
        (loadwb_pair_): Delete (superseded by the above).
        (*loadwb_post_pair_16): New.
        (*loadwb_pre_pair_): New.
        (loadwb_pair_): Delete.
        (*loadwb_pre_pair_16): New.
        (storewb_pair_): Replace with ...
        (*storewb_pre_pair_): ... this. Generalize as described in
        cover letter.
        (*storewb_pre_pair_16): New.
        (storewb_pair_): Delete.
        (*storewb_post_pair_): New.
        (storewb_pair_): Delete.
        (*storewb_post_pair_16): New.
        * config/aarch64/predicates.md (aarch64_mem_pair_operator): New.
        (pmode_plus_operator): New.
        (aarch64_ldp_reg_operand): New.
        (aarch64_stp_reg_operand): New.

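To illustrate how a size-based special predicate keeps the pattern count
down, here is a small standalone C++ model of the size test that
aarch64_ldp_reg_operand performs in this patch. The Mode enum, mode_size
and the function names are hypothetical stand-ins for GCC's machine_mode
machinery. The real aarch64_ldpstp_operand_mode_p also checks that the mode
fits in a single FP/SIMD register, which this sketch omits.

```cpp
// Hypothetical stand-in for a handful of GCC machine modes.
enum class Mode { Void, SI, SF, DI, DF, V2SI, TI, TF, V4SI, OI };

// Size in bytes of each modelled mode (0 for Void).
constexpr int mode_size (Mode m)
{
  return (m == Mode::SI || m == Mode::SF) ? 4
       : (m == Mode::DI || m == Mode::DF || m == Mode::V2SI) ? 8
       : (m == Mode::TI || m == Mode::TF || m == Mode::V4SI) ? 16
       : (m == Mode::OI) ? 32
       : 0;
}

// Models only the size test of aarch64_ldpstp_operand_mode_p: a single
// transfer register of an ldp/stp is 4, 8 or 16 bytes wide.
constexpr bool ldpstp_operand_mode_p (Mode m)
{
  return mode_size (m) == 4 || mode_size (m) == 8 || mode_size (m) == 16;
}

// Models the size test of the special predicates: the pattern mode acts
// only as a proxy for a size, so any suitable operand mode of that size
// matches.  A Void pattern mode places no constraint on the size.
constexpr bool size_proxy_match_p (Mode pattern, Mode operand)
{
  return ldpstp_operand_mode_p (operand)
         && (pattern == Mode::Void
             || mode_size (pattern) == mode_size (operand));
}

static_assert (size_proxy_match_p (Mode::DI, Mode::DF), "same size");
static_assert (size_proxy_match_p (Mode::TI, Mode::V4SI), "same size");
static_assert (!size_proxy_match_p (Mode::DI, Mode::SI), "size mismatch");
static_assert (!size_proxy_match_p (Mode::Void, Mode::OI), "too wide");
```

Under this model one pattern per size stands in for every operand mode of
that size, which is how the patch avoids one pattern per mode combination.
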
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 42f7bfad5cb..ee0f0a18541 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1041,6 +1041,7 @@ bool aarch64_operands_ok_for_ldpstp (rtx *, bool, machine_mode);
 bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, machine_mode);
 bool aarch64_mem_ok_with_ldpstp_policy_model (rtx, bool, machine_mode);
 void aarch64_swap_ldrstr_operands (rtx *, bool);
+bool aarch64_ldpstp_operand_mode_p (machine_mode);
 
 extern void aarch64_asm_output_pool_epilogue (FILE *, const char *,
                                               tree, HOST_WIDE_INT);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index d870973dcd6..baa2b6ca3f7 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -8097,23 +8097,15 @@ static rtx
 aarch64_gen_storewb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2,
                           HOST_WIDE_INT adjustment)
 {
-  switch (mode)
-    {
-    case E_DImode:
-      return gen_storewb_pairdi_di (base, base, reg, reg2,
-                                    GEN_INT (-adjustment),
-                                    GEN_INT (UNITS_PER_WORD - adjustment));
-    case E_DFmode:
-      return gen_storewb_pairdf_di (base, base, reg, reg2,
-                                    GEN_INT (-adjustment),
-                                    GEN_INT (UNITS_PER_WORD - adjustment));
-    case E_TFmode:
-      return gen_storewb_pairtf_di (base, base, reg, reg2,
-                                    GEN_INT (-adjustment),
-                                    GEN_INT (UNITS_PER_VREG - adjustment));
-    default:
-      gcc_unreachable ();
-    }
+  rtx new_base = plus_constant (Pmode, base, -adjustment);
+  rtx mem = gen_frame_mem (mode, new_base);
+  rtx mem2 = adjust_address_nv (mem, mode, GET_MODE_SIZE (mode));
+
+  return gen_rtx_PARALLEL (VOIDmode,
+                           gen_rtvec (3,
+                                      gen_rtx_SET (base, new_base),
+                                      gen_rtx_SET (mem, reg),
+                                      gen_rtx_SET (mem2, reg2)));
 }
 
 /* Push registers numbered REGNO1 and REGNO2 to the stack, adjusting the
@@ -8145,20 +8137,15 @@ static rtx
 aarch64_gen_loadwb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2,
                          HOST_WIDE_INT adjustment)
 {
-  switch (mode)
-    {
-    case E_DImode:
-      return gen_loadwb_pairdi_di (base, base, reg, reg2, GEN_INT (adjustment),
-                                   GEN_INT (UNITS_PER_WORD));
-    case E_DFmode:
-      return gen_loadwb_pairdf_di (base, base, reg, reg2, GEN_INT (adjustment),
-                                   GEN_INT (UNITS_PER_WORD));
-    case E_TFmode:
-      return gen_loadwb_pairtf_di (base, base, reg, reg2, GEN_INT (adjustment),
-                                   GEN_INT (UNITS_PER_VREG));
-    default:
-      gcc_unreachable ();
-    }
+  rtx mem = gen_frame_mem (mode, base);
+  rtx mem2 = adjust_address_nv (mem, mode, GET_MODE_SIZE (mode));
+  rtx new_base = plus_constant (Pmode, base, adjustment);
+
+  return gen_rtx_PARALLEL (VOIDmode,
+                           gen_rtvec (3,
+                                      gen_rtx_SET (base, new_base),
+                                      gen_rtx_SET (reg, mem),
+                                      gen_rtx_SET (reg2, mem2)));
 }
 
 /* Pop the two registers numbered REGNO1, REGNO2 from the stack, adjusting it
@@ -26685,6 +26672,20 @@ aarch64_check_consecutive_mems (rtx *mem1, rtx *mem2, bool *reversed)
   return false;
 }
 
+/* Test if MODE is suitable for a single transfer register in an ldp or stp
+   instruction.  */
+
+bool
+aarch64_ldpstp_operand_mode_p (machine_mode mode)
+{
+  if (!targetm.hard_regno_mode_ok (V0_REGNUM, mode)
+      || hard_regno_nregs (V0_REGNUM, mode) > 1)
+    return false;
+
+  const auto size = GET_MODE_SIZE (mode);
+  return known_eq (size, 4) || known_eq (size, 8) || known_eq (size, 16);
+}
+
 /* Return true if MEM1 and MEM2 can be combined into a single access
    of mode MODE, with the combined access having the same address as MEM1.  */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index a6d5e8c2a1a..f87cddf8f4b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1919,102 +1919,208 @@ (define_insn "store_pair_dw_"
    (set_attr "fp" "yes")]
 )
 
+;; Writeback load/store pair patterns.
+;;
+;; Note that modes in the patterns [SI DI TI] are used only as a proxy for their
+;; size; aarch64_ldp_reg_operand and aarch64_mem_pair_operator are special
+;; predicates which accept a wide range of operand modes, with the requirement
+;; that the contextual (pattern) mode is of the same size as the operand mode.
+
 ;; Load pair with post-index writeback.  This is primarily used in function
 ;; epilogues.
-(define_insn "loadwb_pair_"
-  [(parallel
-    [(set (match_operand:P 0 "register_operand" "=k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (match_operand:GPI 2 "register_operand" "=r")
-          (mem:GPI (match_dup 1)))
-     (set (match_operand:GPI 3 "register_operand" "=r")
-          (mem:GPI (plus:P (match_dup 1)
-                   (match_operand:P 5 "const_int_operand" "n"))))])]
-  "INTVAL (operands[5]) == GET_MODE_SIZE (mode)"
-  "ldp\\t%2, %3, [%1], %4"
-  [(set_attr "type" "load_")]
-)
-
-(define_insn "loadwb_pair_"
-  [(parallel
-    [(set (match_operand:P 0 "register_operand" "=k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (match_operand:GPF 2 "register_operand" "=w")
-          (mem:GPF (match_dup 1)))
-     (set (match_operand:GPF 3 "register_operand" "=w")
-          (mem:GPF (plus:P (match_dup 1)
-                   (match_operand:P 5 "const_int_operand" "n"))))])]
-  "INTVAL (operands[5]) == GET_MODE_SIZE (mode)"
-  "ldp\\t%2, %3, [%1], %4"
-  [(set_attr "type" "neon_load1_2reg")]
-)
-
-(define_insn "loadwb_pair_"
-  [(parallel
-    [(set (match_operand:P 0 "register_operand" "=k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (match_operand:TX 2 "register_operand" "=w")
-          (mem:TX (match_dup 1)))
-     (set (match_operand:TX 3 "register_operand" "=w")
-          (mem:TX (plus:P (match_dup 1)
-                  (match_operand:P 5 "const_int_operand" "n"))))])]
-  "TARGET_BASE_SIMD && INTVAL (operands[5]) == GET_MODE_SIZE (mode)"
-  "ldp\\t%q2, %q3, [%1], %4"
+(define_insn "*loadwb_post_pair_"
+  [(set (match_operand 0 "pmode_register_operand")
+        (match_operator 7 "pmode_plus_operator" [
+          (match_operand 1 "pmode_register_operand")
+          (match_operand 4 "const_int_operand")]))
+   (set (match_operand:GPI 2 "aarch64_ldp_reg_operand")
+        (match_operator 5 "memory_operand" [(match_dup 1)]))
+   (set (match_operand:GPI 3 "aarch64_ldp_reg_operand")
+        (match_operator 6 "memory_operand" [
+          (match_operator 8 "pmode_plus_operator" [
+            (match_dup 1)
+            (const_int )])]))]
+  "aarch64_mem_pair_offset (operands[4], mode)"
+  {@ [cons: =0, 1, =2, =3; attrs: type]
+     [       rk, 0,  r,  r; load_] ldp\t%2, %3, [%1], %4
+     [       rk, 0,  w,  w; neon_load1_2reg ] ldp\t%2, %3, [%1], %4
+  }
+)
+
+;; q-register variant of the above
+(define_insn "*loadwb_post_pair_16"
+  [(set (match_operand 0 "pmode_register_operand" "=rk")
+        (match_operator 7 "pmode_plus_operator" [
+          (match_operand 1 "pmode_register_operand" "0")
+          (match_operand 4 "const_int_operand")]))
+   (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w")
+        (match_operator 5 "memory_operand" [(match_dup 1)]))
+   (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w")
+        (match_operator 6 "memory_operand"
+          [(match_operator 8 "pmode_plus_operator" [
+            (match_dup 1)
+            (const_int 16)])]))]
+  "TARGET_FLOAT
+   && aarch64_mem_pair_offset (operands[4], TImode)"
+  "ldp\t%q2, %q3, [%1], %4"
+  [(set_attr "type" "neon_ldp_q")]
+)
+
+;; Load pair with pre-index writeback.
+(define_insn "*loadwb_pre_pair_"
+  [(set (match_operand 0 "pmode_register_operand")
+        (match_operator 8 "pmode_plus_operator" [
+          (match_operand 1 "pmode_register_operand")
+          (match_operand 4 "const_int_operand")]))
+   (set (match_operand:GPI 2 "aarch64_ldp_reg_operand")
+        (match_operator 6 "memory_operand" [
+          (match_operator 9 "pmode_plus_operator" [
+            (match_dup 1)
+            (match_dup 4)
+          ])]))
+   (set (match_operand:GPI 3 "aarch64_ldp_reg_operand")
+        (match_operator 7 "memory_operand" [
+          (match_operator 10 "pmode_plus_operator" [
+            (match_dup 1)
+            (match_operand 5 "const_int_operand")
+          ])]))]
+  "aarch64_mem_pair_offset (operands[4], mode)
+   && known_eq (INTVAL (operands[5]),
+                INTVAL (operands[4]) + GET_MODE_SIZE (mode))"
+  {@ [cons: =&0, 1, =2, =3; attrs: type ]
+     [        rk, 0,  r,  r; load_] ldp\t%2, %3, [%0, %4]!
+     [        rk, 0,  w,  w; neon_load1_2reg ] ldp\t%2, %3, [%0, %4]!
+  }
+)
+
+;; q-register variant of the above
+(define_insn "*loadwb_pre_pair_16"
+  [(set (match_operand 0 "pmode_register_operand" "=&rk")
+        (match_operator 8 "pmode_plus_operator" [
+          (match_operand 1 "pmode_register_operand" "0")
+          (match_operand 4 "const_int_operand")]))
+   (set (match_operand:TI 2 "aarch64_ldp_reg_operand" "=w")
+        (match_operator 6 "memory_operand" [
+          (match_operator 10 "pmode_plus_operator" [
+            (match_dup 1)
+            (match_dup 4)
+          ])]))
+   (set (match_operand:TI 3 "aarch64_ldp_reg_operand" "=w")
+        (match_operator 7 "memory_operand" [
+          (match_operator 9 "pmode_plus_operator" [
+            (match_dup 1)
+            (match_operand 5 "const_int_operand")
+          ])]))]
+  "TARGET_FLOAT
+   && aarch64_mem_pair_offset (operands[4], TImode)
+   && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16)"
+  "ldp\t%q2, %q3, [%0, %4]!"
   [(set_attr "type" "neon_ldp_q")]
 )
 
 ;; Store pair with pre-index writeback.  This is primarily used in function
 ;; prologues.
-(define_insn "storewb_pair_"
-  [(parallel
-    [(set (match_operand:P 0 "register_operand" "=&k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (mem:GPI (plus:P (match_dup 0)
-                   (match_dup 4)))
-          (match_operand:GPI 2 "register_operand" "r"))
-     (set (mem:GPI (plus:P (match_dup 0)
-                   (match_operand:P 5 "const_int_operand" "n")))
-          (match_operand:GPI 3 "register_operand" "r"))])]
-  "INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (mode)"
-  "stp\\t%2, %3, [%0, %4]!"
-  [(set_attr "type" "store_")]
+(define_insn "*storewb_pre_pair_"
+  [(set (match_operand 0 "pmode_register_operand")
+        (match_operator 6 "pmode_plus_operator" [
+          (match_operand 1 "pmode_register_operand")
+          (match_operand 4 "const_int_operand")
+        ]))
+   (set (match_operator:GPI 7 "aarch64_mem_pair_operator" [
+          (match_operator 8 "pmode_plus_operator" [
+            (match_dup 0)
+            (match_dup 4)
+          ])])
+        (match_operand:GPI 2 "aarch64_stp_reg_operand"))
+   (set (match_operator:GPI 9 "aarch64_mem_pair_operator" [
+          (match_operator 10 "pmode_plus_operator" [
+            (match_dup 0)
+            (match_operand 5 "const_int_operand")
+          ])])
+        (match_operand:GPI 3 "aarch64_stp_reg_operand"))]
+  "aarch64_mem_pair_offset (operands[4], mode)
+   && known_eq (INTVAL (operands[5]),
+                INTVAL (operands[4]) + GET_MODE_SIZE (mode))
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  {@ [cons: =&0, 1,   2,   3; attrs: type ]
+     [        rk, 0, rYZ, rYZ; store_] stp\t%2, %3, [%0, %4]!
+     [        rk, 0,   w,   w; neon_store1_2reg ] stp\t%2, %3, [%0, %4]!
+  }
+)
+
+;; q-register variant of the above.
+(define_insn "*storewb_pre_pair_16"
+  [(set (match_operand 0 "pmode_register_operand" "=&rk")
+        (match_operator 6 "pmode_plus_operator" [
+          (match_operand 1 "pmode_register_operand" "0")
+          (match_operand 4 "const_int_operand")
+        ]))
+   (set (match_operator:TI 7 "aarch64_mem_pair_operator" [
+          (match_operator 8 "pmode_plus_operator" [
+            (match_dup 0)
+            (match_dup 4)
+          ])])
+        (match_operand:TI 2 "aarch64_ldp_reg_operand" "w"))
+   (set (match_operator:TI 9 "aarch64_mem_pair_operator" [
+          (match_operator 10 "pmode_plus_operator" [
+            (match_dup 0)
+            (match_operand 5 "const_int_operand")
+          ])])
+        (match_operand:TI 3 "aarch64_ldp_reg_operand" "w"))]
+  "TARGET_FLOAT
+   && aarch64_mem_pair_offset (operands[4], TImode)
+   && known_eq (INTVAL (operands[5]), INTVAL (operands[4]) + 16)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  "stp\\t%q2, %q3, [%0, %4]!"
+  [(set_attr "type" "neon_stp_q")]
 )
 
-(define_insn "storewb_pair_"
-  [(parallel
-    [(set (match_operand:P 0 "register_operand" "=&k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (mem:GPF (plus:P (match_dup 0)
-                   (match_dup 4)))
-          (match_operand:GPF 2 "register_operand" "w"))
-     (set (mem:GPF (plus:P (match_dup 0)
-                   (match_operand:P 5 "const_int_operand" "n")))
-          (match_operand:GPF 3 "register_operand" "w"))])]
-  "INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (mode)"
-  "stp\\t%2, %3, [%0, %4]!"
-  [(set_attr "type" "neon_store1_2reg")]
-)
-
-(define_insn "storewb_pair_"
-  [(parallel
-    [(set (match_operand:P 0 "register_operand" "=&k")
-          (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
-     (set (mem:TX (plus:P (match_dup 0)
-                  (match_dup 4)))
-          (match_operand:TX 2 "register_operand" "w"))
-     (set (mem:TX (plus:P (match_dup 0)
-                  (match_operand:P 5 "const_int_operand" "n")))
-          (match_operand:TX 3 "register_operand" "w"))])]
-  "TARGET_BASE_SIMD
-   && INTVAL (operands[5])
-      == INTVAL (operands[4]) + GET_MODE_SIZE (mode)"
-  "stp\\t%q2, %q3, [%0, %4]!"
+;; Store pair with post-index writeback.
+(define_insn "*storewb_post_pair_"
+  [(set (match_operand 0 "pmode_register_operand")
+        (match_operator 5 "pmode_plus_operator" [
+          (match_operand 1 "pmode_register_operand")
+          (match_operand 4 "const_int_operand")
+        ]))
+   (set (match_operator:GPI 6 "aarch64_mem_pair_operator" [(match_dup 1)])
+        (match_operand 2 "aarch64_stp_reg_operand"))
+   (set (match_operator:GPI 7 "aarch64_mem_pair_operator" [
+          (match_operator 8 "pmode_plus_operator" [
+            (match_dup 0)
+            (const_int )
+          ])])
+        (match_operand 3 "aarch64_stp_reg_operand"))]
+  "aarch64_mem_pair_offset (operands[4], mode)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  {@ [cons: =0, 1,   2,   3; attrs: type ]
+     [       rk, 0, rYZ, rYZ; store_] stp\t%2, %3, [%0], %4
+     [       rk, 0,   w,   w; neon_store1_2reg ] stp\t%2, %3, [%0], %4
+  }
+)
+
+;; Store pair with post-index writeback.
+(define_insn "*storewb_post_pair_16"
+  [(set (match_operand 0 "pmode_register_operand" "=rk")
+        (match_operator 5 "pmode_plus_operator" [
+          (match_operand 1 "pmode_register_operand" "0")
+          (match_operand 4 "const_int_operand")
+        ]))
+   (set (match_operator:TI 6 "aarch64_mem_pair_operator" [(match_dup 1)])
+        (match_operand:TI 2 "aarch64_ldp_reg_operand" "w"))
+   (set (match_operator:TI 7 "aarch64_mem_pair_operator" [
+          (match_operator 8 "pmode_plus_operator" [
+            (match_dup 0)
+            (const_int 16)
+          ])])
+        (match_operand:TI 3 "aarch64_ldp_reg_operand" "w"))]
+  "TARGET_FLOAT
+   && aarch64_mem_pair_offset (operands[4], TImode)
+   && !reg_overlap_mentioned_p (operands[0], operands[2])
+   && !reg_overlap_mentioned_p (operands[0], operands[3])"
+  "stp\t%q2, %q3, [%0], %4"
   [(set_attr "type" "neon_stp_q")]
 )
 
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 9af28103a74..698a68a6311 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -291,11 +291,46 @@ (define_predicate "aarch64_mem_pair_offset"
   (and (match_code "const_int")
        (match_test "aarch64_offset_7bit_signed_scaled_p (mode, INTVAL (op))")))
 
+(define_special_predicate "aarch64_mem_pair_operator"
+  (and
+    (match_code "mem")
+    (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))")
+    (ior
+      (match_test "mode == VOIDmode")
+      (match_test "known_eq (GET_MODE_SIZE (mode),
+                             GET_MODE_SIZE (GET_MODE (op)))"))))
+
 (define_predicate "aarch64_mem_pair_operand"
   (and (match_code "mem")
        (match_test "aarch64_legitimate_address_p (mode, XEXP (op, 0), false,
                                                   ADDR_QUERY_LDP_STP)")))
 
+(define_predicate "pmode_plus_operator"
+  (and (match_code "plus")
+       (match_test "GET_MODE (op) == Pmode")))
+
+(define_special_predicate "aarch64_ldp_reg_operand"
+  (and
+    (match_code "reg,subreg")
+    (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))")
+    (ior
+      (match_test "mode == VOIDmode")
+      (match_test "known_eq (GET_MODE_SIZE (mode),
+                             GET_MODE_SIZE (GET_MODE (op)))"))))
+
+(define_special_predicate "aarch64_stp_reg_operand"
+  (ior (match_operand 0 "aarch64_ldp_reg_operand")
+       (and (match_code "const_int,const,const_vector,const_double")
+            (match_test "aarch64_const_zero_rtx_p (op)"))
+       (ior
+        (match_test "GET_MODE (op) == VOIDmode")
+        (and
+         (match_test "aarch64_ldpstp_operand_mode_p (GET_MODE (op))")
+         (ior
+          (match_test "mode == VOIDmode")
+          (match_test "known_eq (GET_MODE_SIZE (mode),
+                                 GET_MODE_SIZE (GET_MODE (op)))"))))))
+
 ;; Used for storing two 64-bit values in an AdvSIMD register using an STP
 ;; as a 128-bit vec_concat.
 (define_predicate "aarch64_mem_pair_lanes_operand"
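
The offset conditions in the insn patterns above can be modelled in
isolation. The sketch below is standalone C++, not GCC code. It assumes
that a "7-bit signed scaled" offset means a multiple of the access size
whose quotient lies in [-64, 63], and that the size is positive. The
function names are hypothetical, and the real predicates operate on
machine modes and rtx operands rather than plain integers.

```cpp
#include <cstdint>

// Assumed semantics of the check behind aarch64_mem_pair_offset: OFFSET is
// a multiple of SIZE and OFFSET / SIZE fits in a signed 7-bit field.
constexpr bool offset_7bit_signed_scaled_p (std::int64_t size,
                                            std::int64_t offset)
{
  return offset % size == 0
         && offset / size >= -64
         && offset / size <= 63;
}

// Models the condition of the pre-index patterns: operand 4 is the
// writeback offset and operand 5 must address the slot immediately after
// the first transfer.
constexpr bool pre_index_offsets_ok_p (std::int64_t size, std::int64_t op4,
                                       std::int64_t op5)
{
  return offset_7bit_signed_scaled_p (size, op4) && op5 == op4 + size;
}

static_assert (offset_7bit_signed_scaled_p (8, 504), "63 * 8 is in range");
static_assert (!offset_7bit_signed_scaled_p (8, 512), "64 * 8 is too big");
static_assert (offset_7bit_signed_scaled_p (16, -1024), "-64 * 16 is ok");
static_assert (!offset_7bit_signed_scaled_p (8, 12), "not a multiple of 8");
static_assert (pre_index_offsets_ok_p (8, -16, -8), "adjacent slots");
static_assert (!pre_index_offsets_ok_p (8, -16, 0), "gap between slots");
```
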