From patchwork Tue Feb 11 10:14:53 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrea Corallo
X-Patchwork-Id: 1236231
From: Andrea Corallo
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com
Subject: [PATCH] [arm] Implement Armv8.1-M low overhead loops
Date: Tue, 11 Feb 2020 11:14:53 +0100

Hi all,

This patch enables the low overhead loops (LOL) feature of the Armv8.1-M
Mainline LOB (low overhead branch) extension by using the loop-doloop pass,
which it shares with the Swing Modulo Scheduler.

Bootstrapped on arm-none-linux-gnueabihf; it does not introduce testsuite
regressions.

Andrea

gcc/ChangeLog:

2020-??-??  Andrea Corallo
2020-??-??  Mihail-Calin Ionescu
2020-??-??  Iain Apreotesei

        * config/arm/arm.c (TARGET_INVALID_WITHIN_DOLOOP):
        (arm_invalid_within_doloop): Implement invalid_within_doloop hook.
        * config/arm/arm.h (TARGET_HAVE_LOB): Add new macro.
        * config/arm/thumb2.md (*doloop_end, doloop_begin, dls_insn):
        Add new patterns.
        * config/arm/unspecs.md: Add new unspec.

gcc/testsuite/ChangeLog:

2020-??-??  Andrea Corallo
2020-??-??  Mihail-Calin Ionescu
2020-??-??  Iain Apreotesei

        * gcc.target/arm/lob.h: New header.
        * gcc.target/arm/lob1.c: New testcase.
        * gcc.target/arm/lob2.c: Likewise.
        * gcc.target/arm/lob3.c: Likewise.
        * gcc.target/arm/lob4.c: Likewise.
        * gcc.target/arm/lob5.c: Likewise.
        * gcc.target/arm/lob6.c: Likewise.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index e07cf03538c..1269f40bd77 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -586,6 +586,9 @@ extern int arm_arch_bf16;
 
 /* Target machine storage Layout.  */
 
+/* Nonzero if this chip provides Armv8.1-M Mainline
+   LOB (low overhead branch features) extension instructions.  */
+#define TARGET_HAVE_LOB (arm_arch8_1m_main)
 
 /* Define this macro if it is advisable to hold scalars in registers
    in a wider mode than that declared by the program.  In such cases,
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9cc7bc0e562..d0b50d544e3 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -833,6 +833,9 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT arm_constant_alignment
 
+#undef TARGET_INVALID_WITHIN_DOLOOP
+#define TARGET_INVALID_WITHIN_DOLOOP arm_invalid_within_doloop
+
 #undef TARGET_MD_ASM_ADJUST
 #define TARGET_MD_ASM_ADJUST arm_md_asm_adjust
 
@@ -32937,6 +32940,39 @@ arm_ge_bits_access (void)
   return true;
 }
 
+/* Return NULL if INSN is valid within a low-overhead loop.
+   Otherwise return why doloop cannot be applied.  */
+
+static const char *
+arm_invalid_within_doloop (const rtx_insn *insn)
+{
+  if (!TARGET_HAVE_LOB)
+    return default_invalid_within_doloop (insn);
+
+  if (CALL_P (insn))
+    return "Function call in the loop.";
+
+  if (tablejump_p (insn, NULL, NULL) || computed_jump_p (insn))
+    return "Computed branch in the loop.";
+
+  if (INSN_P (insn)
+      && GET_CODE (PATTERN (insn)) == PARALLEL)
+    {
+      rtx parallel = PATTERN (insn);
+      rtx clobber;
+      int j;
+      for (j = XVECLEN (parallel, 0) - 1; j >= 0; j--)
+        {
+          clobber = XVECEXP (parallel, 0, j);
+          if (GET_CODE (clobber) == CLOBBER
+              && GET_CODE (XEXP (clobber, 0)) == REG
+              && REGNO (XEXP (clobber, 0)) == LR_REGNUM)
+            return "LR is used inside loop.";
+        }
+    }
+  return NULL;
+}
+
 #if CHECKING_P
 namespace selftest
 {
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index b0d3bd1cf1c..44b1a264dba 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1555,8 +1555,11 @@
       using a certain 'count' register and (2) the loop count can be
       adjusted by modifying this register prior to the loop.
       ??? The possible introduction of a new block to initialize the
-      new IV can potentially affect branch optimizations.  */
-   if (optimize > 0 && flag_modulo_sched)
+      new IV can potentially affect branch optimizations.
+
+      Also used to implement the low overhead loops feature, which is part of
+      the Armv8.1-M Mainline Low Overhead Branch (LOB) extension.  */
+   if (optimize > 0 && (flag_modulo_sched || TARGET_HAVE_LOB))
    {
      rtx s0;
      rtx bcomp;
@@ -1569,6 +1572,11 @@
        FAIL;
 
      s0 = operands [0];
+
+     /* Low overhead loop instructions require the first operand to be LR.  */
+     if (TARGET_HAVE_LOB)
+       s0 = gen_rtx_REG (SImode, LR_REGNUM);
+
      if (TARGET_THUMB2)
        insn = emit_insn (gen_thumb2_addsi3_compare0 (s0, s0, GEN_INT (-1)));
      else
@@ -1650,3 +1658,29 @@
   "TARGET_HAVE_MVE"
   "lsrl%?\\t%Q0, %R0, %1"
   [(set_attr "predicable" "yes")])
+
+(define_insn "*doloop_end"
+  [(parallel [(set (pc)
+                   (if_then_else
+                       (ne (reg:SI LR_REGNUM) (const_int 1))
+                     (label_ref (match_operand 0 "" ""))
+                     (pc)))
+              (set (reg:SI LR_REGNUM)
+                   (plus:SI (reg:SI LR_REGNUM) (const_int -1)))])]
+  "TARGET_32BIT && TARGET_HAVE_LOB && !flag_modulo_sched"
+  "le\tlr, %l0")
+
+(define_expand "doloop_begin"
+  [(match_operand 0 "" "")
+   (match_operand 1 "" "")]
+  "TARGET_32BIT && TARGET_HAVE_LOB && !flag_modulo_sched"
+  {
+    emit_insn (gen_dls_insn (operands[0], operands[0]));
+    DONE;
+  })
+
+(define_insn "dls_insn"
+  [(set (match_operand:SI 0 "" "")
+        (unspec:SI [(match_operand:SI 1 "s_register_operand" "r")] UNSPEC_DLS))]
+  "TARGET_32BIT && TARGET_HAVE_LOB && !flag_modulo_sched"
+  "dls\tlr, %1")
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 8f4a705f43e..df5ecb73192 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -154,6 +154,7 @@
   UNSPEC_SMUADX         ; Represent the SMUADX operation.
   UNSPEC_SSAT16         ; Represent the SSAT16 operation.
   UNSPEC_USAT16         ; Represent the USAT16 operation.
+  UNSPEC_DLS            ; Used for DLS (Do Loop Start), Armv8.1-M Mainline instruction
 ])
 
 
diff --git a/gcc/testsuite/gcc.target/arm/lob.h b/gcc/testsuite/gcc.target/arm/lob.h
new file mode 100644
index 00000000000..feaae7cc899
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/lob.h
@@ -0,0 +1,15 @@
+#include <string.h>
+
+/* Common code for lob tests.  */
+
+#define NO_LOB asm volatile ("@ clobber lr" : : : "lr" )
+
+#define N 10000
+
+static void
+reset_data (int *a, int *b, int *c)
+{
+  memset (a, -1, N * sizeof (*a));
+  memset (b, -1, N * sizeof (*b));
+  memset (c, -1, N * sizeof (*c));
+}
diff --git a/gcc/testsuite/gcc.target/arm/lob1.c b/gcc/testsuite/gcc.target/arm/lob1.c
new file mode 100644
index 00000000000..8ffaaa29878
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/lob1.c
@@ -0,0 +1,82 @@
+/* Check that GCC generates Armv8.1-M low overhead loop instructions
+   for some simple loops.  */
+/* { dg-do run } */
+/* { dg-options "-march=armv8.1-m.main -O3 --save-temps" } */
+#include <stdlib.h>
+#include "lob.h"
+
+int a[N];
+int b[N];
+int c[N];
+
+int
+foo (int a, int b)
+{
+  return a + b;
+}
+
+void __attribute__((noinline))
+loop1 (int *a, int *b, int *c)
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = i;
+      b[i] = i * 2;
+      c[i] = a[i] + b[i];
+    }
+}
+
+void __attribute__((noinline))
+loop2 (int *a, int *b, int *c)
+{
+  int i = 0;
+  while (i < N)
+    {
+      a[i] = i - 2;
+      b[i] = i * 5;
+      c[i] = a[i] + b[i];
+      i++;
+    }
+}
+
+void __attribute__((noinline))
+loop3 (int *a, int *b, int *c)
+{
+  int i = 0;
+  do
+    {
+      a[i] = i - 4;
+      b[i] = i * 3;
+      c[i] = a[i] + b[i];
+      i++;
+    } while (i < N);
+}
+
+void
+check (int *a, int *b, int *c)
+{
+  for (int i = 0; i < N; i++)
+    {
+      NO_LOB;
+      if (c[i] != a[i] + b[i])
+        abort ();
+    }
+}
+
+int main (void)
+{
+  reset_data (a, b, c);
+  loop1 (a, b, c);
+  check (a, b, c);
+  reset_data (a, b, c);
+  loop2 (a, b, c);
+  check (a, b, c);
+  reset_data (a, b, c);
+  loop3 (a, b, c);
+  check (a, b, c);
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times {dls\s\S*,\s\S*} 3 } } */
+/* { dg-final { scan-assembler-times {le\slr,\s\S*} 3 } } */
diff --git a/gcc/testsuite/gcc.target/arm/lob2.c b/gcc/testsuite/gcc.target/arm/lob2.c
new file mode 100644
index 00000000000..046d92fcad1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/lob2.c
@@ -0,0 +1,30 @@
+/* Check that GCC does not generate Armv8.1-M low overhead loop instructions
+   if a non-inlineable function call takes place inside the loop.  */
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.1-m.main -O3 --save-temps" } */
+#include <stdlib.h>
+#include "lob.h"
+
+int a[N];
+int b[N];
+int c[N];
+
+int __attribute__ ((noinline))
+foo (int a, int b)
+{
+  return a + b;
+}
+
+int main (void)
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = i;
+      b[i] = i * 2;
+      c[i] = foo (a[i], b[i]);
+    }
+
+  return 0;
+}
+/* { dg-final { scan-assembler-not {dls\s\S*,\s\S*} } } */
+/* { dg-final { scan-assembler-not {le\slr,\s\S*} } } */
diff --git a/gcc/testsuite/gcc.target/arm/lob3.c b/gcc/testsuite/gcc.target/arm/lob3.c
new file mode 100644
index 00000000000..77f89ad9c70
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/lob3.c
@@ -0,0 +1,26 @@
+/* Check that GCC does not generate Armv8.1-M low overhead loop instructions
+   if it causes VFP emulation library calls to happen inside the loop.  */
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.1-m.main -O3 --save-temps -mfloat-abi=soft" } */
+/* { dg-require-effective-target arm_softfloat } */
+#include <stdlib.h>
+#include "lob.h"
+
+double a[N];
+double b[N];
+double c[N];
+
+int
+main (void)
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = i;
+      b[i] = i * 2;
+      c[i] = a[i] + b[i];
+    }
+
+  return 0;
+}
+/* { dg-final { scan-assembler-not {dls\s\S*,\s\S*} } } */
+/* { dg-final { scan-assembler-not {le\slr,\s\S*} } } */
diff --git a/gcc/testsuite/gcc.target/arm/lob4.c b/gcc/testsuite/gcc.target/arm/lob4.c
new file mode 100644
index 00000000000..88be61f3c76
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/lob4.c
@@ -0,0 +1,32 @@
+/* Check that GCC does not generate Armv8.1-M low overhead loop instructions
+   if LR is modified within the loop.  */
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.1-m.main -O3 --save-temps -mfloat-abi=soft" } */
+/* { dg-require-effective-target arm_softfloat } */
+#include <stdlib.h>
+#include "lob.h"
+
+int a[N];
+int b[N];
+int c[N];
+
+static __attribute__ ((always_inline)) inline int
+foo (int a, int b)
+{
+  NO_LOB;
+  return a + b;
+}
+
+int main (void)
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = i;
+      b[i] = i * 2;
+      c[i] = foo (a[i], b[i]);
+    }
+
+  return 0;
+}
+/* { dg-final { scan-assembler-not {dls\s\S*,\s\S*} } } */
+/* { dg-final { scan-assembler-not {le\slr,\s\S*} } } */
diff --git a/gcc/testsuite/gcc.target/arm/lob5.c b/gcc/testsuite/gcc.target/arm/lob5.c
new file mode 100644
index 00000000000..cd91c3252d3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/lob5.c
@@ -0,0 +1,33 @@
+/* Check that GCC does not generate Armv8.1-M low overhead loop
+   instructions.  Innermost loop has no fixed number of iterations
+   therefore is not optimizable.  Outer loops are not optimized.  */
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.1-m.main -O3 --save-temps" } */
+#include <stdlib.h>
+#include "lob.h"
+
+int a[N];
+int b[N];
+int c[N];
+
+int main (void)
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = i;
+      b[i] = i * 2;
+
+      int k = b[i];
+      while (k != 0)
+        {
+          if (k % 2 == 0)
+            c[i - 1] = k % 2;
+          k /= 2;
+        }
+      c[i] = a[i] - b[i];
+    }
+
+  return 0;
+}
+/* { dg-final { scan-assembler-not {dls\s\S*,\s\S*} } } */
+/* { dg-final { scan-assembler-not {le\slr,\s\S*} } } */
diff --git a/gcc/testsuite/gcc.target/arm/lob6.c b/gcc/testsuite/gcc.target/arm/lob6.c
new file mode 100644
index 00000000000..4bcedc8bd60
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/lob6.c
@@ -0,0 +1,94 @@
+/* Check that GCC generates Armv8.1-M low overhead loop instructions
+   with some less trivial loops and the result is correct.  */
+/* { dg-do run } */
+/* { dg-options "-march=armv8.1-m.main -O3 --save-temps" } */
+#include <stdlib.h>
+#include "lob.h"
+
+#define TEST_CODE1                              \
+  {                                             \
+    for (int i = 0; i < N; i++)                 \
+      {                                         \
+        a[i] = i;                               \
+        b[i] = i * 2;                           \
+                                                \
+        for (int k = 0; k < N; k++)             \
+          {                                     \
+            MAYBE_LOB;                          \
+            c[k] = k / 2;                       \
+          }                                     \
+        c[i] = a[i] - b[i];                     \
+      }                                         \
+  }
+
+#define TEST_CODE2                              \
+  {                                             \
+    for (int i = 0; i < N / 2; i++)             \
+      {                                         \
+        MAYBE_LOB;                              \
+        if (c[i] % 2 == 0)                      \
+          break;                                \
+        a[i]++;                                 \
+        b[i]++;                                 \
+      }                                         \
+  }
+
+int a1[N];
+int b1[N];
+int c1[N];
+
+int a2[N];
+int b2[N];
+int c2[N];
+
+#define MAYBE_LOB
+void __attribute__((noinline))
+loop1 (int *a, int *b, int *c)
+  TEST_CODE1;
+
+void __attribute__((noinline))
+loop2 (int *a, int *b, int *c)
+  TEST_CODE2;
+
+#undef MAYBE_LOB
+#define MAYBE_LOB NO_LOB
+
+void
+ref1 (int *a, int *b, int *c)
+  TEST_CODE1;
+
+void
+ref2 (int *a, int *b, int *c)
+  TEST_CODE2;
+
+void
+check (void)
+{
+  for (int i = 0; i < N; i++)
+    {
+      NO_LOB;
+      if (a1[i] != a2[i]
+          || b1[i] != b2[i]
+          || c1[i] != c2[i])
+        abort ();
+    }
+}
+
+int main (void)
+{
+  reset_data (a1, b1, c1);
+  reset_data (a2, b2, c2);
+  loop1 (a1, b1, c1);
+  ref1 (a2, b2, c2);
+  check ();
+
+  reset_data (a1, b1, c1);
+  reset_data (a2, b2, c2);
+  loop2 (a1, b1, c1);
+  ref2 (a2, b2, c2);
+  check ();
+
+  return 0;
+}
+/* { dg-final { scan-assembler-times {dls\s\S*,\s\S*} 2 } } */
+/* { dg-final { scan-assembler-times {le\slr,\s\S*} 2 } } */
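
Reading aid, not part of the patch: the C sketch below models the loop shape described by the RTL of the dls_insn and *doloop_end patterns in thumb2.md. It only mirrors what the RTL says (branch back while LR != 1, with LR decremented in the same parallel); it is not a statement about the exact architectural behaviour of the DLS and LE instructions. The function name model_lob_loop and its parameters are hypothetical and chosen for illustration, and the sketch assumes an iteration count of at least 1.

```c
#include <stdint.h>

/* Hypothetical model of a doloop as encoded by the RTL in this patch.
   COUNT is the iteration count and is assumed to be >= 1.
   *BODY_RUNS receives how many times the loop body executed.
   Returns the final value of the modelled LR.  */
static uint32_t
model_lob_loop (uint32_t count, uint32_t *body_runs)
{
  uint32_t lr = count;          /* dls_insn: LR is set from the count.  */
  *body_runs = 0;

  for (;;)
    {
      (*body_runs)++;           /* Stand-in for the loop body.  */

      /* *doloop_end: both effects belong to one parallel, so the
         branch condition reads LR before the decrement.  */
      int branch_taken = (lr != 1);
      lr = lr - 1;

      if (!branch_taken)
        break;
    }

  return lr;
}
```

Under these assumptions a count of N runs the body N times and leaves the modelled LR at 0, which matches the "compare against 1, then decrement" form of the *doloop_end pattern.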