From patchwork Tue Dec 5 10:12:59 2023
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1872023
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 01/25] aarch64: Generalise require_immediate_lane_index
Date: Tue, 5 Dec 2023 10:12:59 +0000
Message-Id: <20231205101323.1914247-2-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

require_immediate_lane_index previously hard-coded the assumption that the group size is determined by the argument immediately before the index.
However, for SME, there are cases where it should be determined by an earlier argument instead. gcc/ * config/aarch64/aarch64-sve-builtins.h: (function_checker::require_immediate_lane_index): Add an argument for the index of the indexed vector argument. * config/aarch64/aarch64-sve-builtins.cc (function_checker::require_immediate_lane_index): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (ternary_bfloat_lane_base::check): Update accordingly. (ternary_qq_lane_base::check): Likewise. (binary_lane_def::check): Likewise. (binary_long_lane_def::check): Likewise. (ternary_lane_def::check): Likewise. (ternary_lane_rotate_def::check): Likewise. (ternary_long_lane_def::check): Likewise. (ternary_qq_lane_rotate_def::check): Likewise. --- .../aarch64/aarch64-sve-builtins-shapes.cc | 16 ++++++++-------- gcc/config/aarch64/aarch64-sve-builtins.cc | 18 ++++++++++++------ gcc/config/aarch64/aarch64-sve-builtins.h | 3 ++- 3 files changed, 22 insertions(+), 15 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index af816c4c9e7..1646afc7a0d 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -941,7 +941,7 @@ struct ternary_bfloat_lane_base bool check (function_checker &c) const override { - return c.require_immediate_lane_index (3, N); + return c.require_immediate_lane_index (3, 2, N); } }; @@ -956,7 +956,7 @@ struct ternary_qq_lane_base bool check (function_checker &c) const override { - return c.require_immediate_lane_index (3, 4); + return c.require_immediate_lane_index (3, 0); } }; @@ -1123,7 +1123,7 @@ struct binary_lane_def : public overloaded_base<0> bool check (function_checker &c) const override { - return c.require_immediate_lane_index (2); + return c.require_immediate_lane_index (2, 1); } }; SHAPE (binary_lane) @@ -1162,7 +1162,7 @@ struct binary_long_lane_def : public overloaded_base<0> bool check 
(function_checker &c) const override { - return c.require_immediate_lane_index (2); + return c.require_immediate_lane_index (2, 1); } }; SHAPE (binary_long_lane) @@ -2817,7 +2817,7 @@ struct ternary_lane_def : public overloaded_base<0> bool check (function_checker &c) const override { - return c.require_immediate_lane_index (3); + return c.require_immediate_lane_index (3, 2); } }; SHAPE (ternary_lane) @@ -2845,7 +2845,7 @@ struct ternary_lane_rotate_def : public overloaded_base<0> bool check (function_checker &c) const override { - return (c.require_immediate_lane_index (3, 2) + return (c.require_immediate_lane_index (3, 2, 2) && c.require_immediate_one_of (4, 0, 90, 180, 270)); } }; @@ -2868,7 +2868,7 @@ struct ternary_long_lane_def bool check (function_checker &c) const override { - return c.require_immediate_lane_index (3); + return c.require_immediate_lane_index (3, 2); } }; SHAPE (ternary_long_lane) @@ -2965,7 +2965,7 @@ struct ternary_qq_lane_rotate_def : public overloaded_base<0> bool check (function_checker &c) const override { - return (c.require_immediate_lane_index (3, 4) + return (c.require_immediate_lane_index (3, 0) && c.require_immediate_one_of (4, 0, 90, 180, 270)); } }; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index c5aaf1bef17..e6ac81f6b52 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -2440,20 +2440,26 @@ function_checker::require_immediate_enum (unsigned int rel_argno, tree type) return false; } -/* Check that argument REL_ARGNO is suitable for indexing argument - REL_ARGNO - 1, in groups of GROUP_SIZE elements. REL_ARGNO counts - from the end of the predication arguments. */ +/* The intrinsic conceptually divides vector argument REL_VEC_ARGNO into + groups of GROUP_SIZE elements. Return true if argument REL_ARGNO is + a suitable constant index for selecting one of these groups. 
The + selection happens within a 128-bit quadword, rather than the whole vector. + + REL_ARGNO and REL_VEC_ARGNO count from the end of the predication + arguments. */ bool function_checker::require_immediate_lane_index (unsigned int rel_argno, + unsigned int rel_vec_argno, unsigned int group_size) { unsigned int argno = m_base_arg + rel_argno; if (!argument_exists_p (argno)) return true; - /* Get the type of the previous argument. tree_argument_type wants a - 1-based number, whereas ARGNO is 0-based. */ - machine_mode mode = TYPE_MODE (type_argument_type (m_fntype, argno)); + /* Get the type of the vector argument. tree_argument_type wants a + 1-based number, whereas VEC_ARGNO is 0-based. */ + unsigned int vec_argno = m_base_arg + rel_vec_argno; + machine_mode mode = TYPE_MODE (type_argument_type (m_fntype, vec_argno + 1)); gcc_assert (VECTOR_MODE_P (mode)); unsigned int nlanes = 128 / (group_size * GET_MODE_UNIT_BITSIZE (mode)); return require_immediate_range (rel_argno, 0, nlanes - 1); diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 1ac561dab47..2ca5b208efa 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -463,7 +463,8 @@ public: bool require_immediate_either_or (unsigned int, HOST_WIDE_INT, HOST_WIDE_INT); bool require_immediate_enum (unsigned int, tree); - bool require_immediate_lane_index (unsigned int, unsigned int = 1); + bool require_immediate_lane_index (unsigned int, unsigned int, + unsigned int = 1); bool require_immediate_one_of (unsigned int, HOST_WIDE_INT, HOST_WIDE_INT, HOST_WIDE_INT, HOST_WIDE_INT); bool require_immediate_range (unsigned int, HOST_WIDE_INT, HOST_WIDE_INT); From patchwork Tue Dec 5 10:13:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872025 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@legolas.ozlabs.org
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 02/25] aarch64: Use SVE's RDVL instruction
Date: Tue, 5 Dec 2023 10:13:00 +0000
Message-Id: <20231205101323.1914247-3-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

We didn't previously use SVE's RDVL instruction, since the CNT* forms are preferred and provide most of the range. However, there are some cases that RDVL can handle and CNT* can't, and using RDVL-like instructions becomes important for SME.

gcc/
* config/aarch64/aarch64-protos.h (aarch64_sve_rdvl_immediate_p) (aarch64_output_sve_rdvl): Declare.
* config/aarch64/aarch64.cc (aarch64_sve_cnt_factor_p): New function, split out from... (aarch64_sve_cnt_immediate_p): ...here. (aarch64_sve_rdvl_factor_p): New function. (aarch64_sve_rdvl_immediate_p): Likewise. (aarch64_output_sve_rdvl): Likewise. (aarch64_offset_temporaries): Rewrite the SVE handling to use RDVL for some cases. (aarch64_expand_mov_immediate): Handle RDVL immediates. (aarch64_mov_operand_p): Likewise. * config/aarch64/constraints.md (Usr): New constraint. * config/aarch64/aarch64.md (*mov_aarch64): Add an RDVL alternative. (*movsi_aarch64, *movdi_aarch64): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/cntb.c: Tweak expected output. * gcc.target/aarch64/sve/acle/asm/cnth.c: Likewise. * gcc.target/aarch64/sve/acle/asm/cntw.c: Likewise. * gcc.target/aarch64/sve/acle/asm/cntd.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfb.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfh.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfw.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfd.c: Likewise. * gcc.target/aarch64/sve/loop_add_4.c: Expect RDVL to be used to calculate the -17 and 17 factors. * gcc.target/aarch64/sve/pcs/stack_clash_1.c: Likewise the 18 factor. 
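For illustration only, the difference in range between the two instruction forms can be sketched as a standalone program. This is not the code added to aarch64.cc: the names cnt_factor_p and rdvl_factor_p are made up for this sketch, and plain int64_t stands in for HOST_WIDE_INT. FACTOR is assumed to be expressed per 128-bit quadword, so svcntb () * N corresponds to a factor of 16 * N.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone restatement of the CNT[BHWD] range check:
// the factor must be [1, 16] * {2, 4, 8, 16}.
constexpr bool cnt_factor_p(std::int64_t factor) {
  return factor >= 2 && factor <= 16 * 16
         && (factor & 1) == 0
         && factor <= 16 * (factor & -factor);
}

// Hypothetical standalone restatement of the RDVL range check:
// a multiple of 16 whose multiplier fits in the signed range [-32, 31].
constexpr bool rdvl_factor_p(std::int64_t factor) {
  return factor % 16 == 0 && factor >= -32 * 16 && factor <= 31 * 16;
}

// svcntb () * 16 fits a single CNTB (and also a single RDVL).
static_assert(cnt_factor_p(16 * 16) && rdvl_factor_p(16 * 16), "");
// svcntb () * 17 is out of range for CNTB but fits RDVL.
static_assert(!cnt_factor_p(16 * 17) && rdvl_factor_p(16 * 17), "");
// Negative multiples are only reachable with RDVL.
static_assert(!cnt_factor_p(-16) && rdvl_factor_p(-16), "");
// svcntb () * 32 and svcntb () * -33 fit neither form in one instruction.
static_assert(!cnt_factor_p(16 * 32) && !rdvl_factor_p(16 * 32), "");
static_assert(!rdvl_factor_p(-16 * 33), "");
// svcntd () fits CNTD but is not a whole number of vectors, so not RDVL.
static_assert(cnt_factor_p(2) && !rdvl_factor_p(2), "");
```

These cases line up with the updated cntb.c expectations: cntb_17 becomes a single "rdvl x0, #17", cntb_32 still needs "cntb" followed by a shift, and cntb_m33 uses "rdvl x0, #-32" followed by "decb x0".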
--- gcc/config/aarch64/aarch64-protos.h | 2 + gcc/config/aarch64/aarch64.cc | 191 ++++++++++++------ gcc/config/aarch64/aarch64.md | 3 + gcc/config/aarch64/constraints.md | 6 + .../gcc.target/aarch64/sve/acle/asm/cntb.c | 71 +++++-- .../gcc.target/aarch64/sve/acle/asm/cntd.c | 12 +- .../gcc.target/aarch64/sve/acle/asm/cnth.c | 20 +- .../gcc.target/aarch64/sve/acle/asm/cntw.c | 16 +- .../gcc.target/aarch64/sve/acle/asm/prfb.c | 6 +- .../gcc.target/aarch64/sve/acle/asm/prfd.c | 4 +- .../gcc.target/aarch64/sve/acle/asm/prfh.c | 4 +- .../gcc.target/aarch64/sve/acle/asm/prfw.c | 4 +- .../gcc.target/aarch64/sve/loop_add_4.c | 6 +- .../aarch64/sve/pcs/stack_clash_1.c | 3 +- 14 files changed, 225 insertions(+), 123 deletions(-) diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index b0b7d33714d..765c42916f6 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -798,6 +798,7 @@ bool aarch64_sve_mode_p (machine_mode); HOST_WIDE_INT aarch64_fold_sve_cnt_pat (aarch64_svpattern, unsigned int); bool aarch64_sve_cnt_immediate_p (rtx); bool aarch64_sve_scalar_inc_dec_immediate_p (rtx); +bool aarch64_sve_rdvl_immediate_p (rtx); bool aarch64_sve_addvl_addpl_immediate_p (rtx); bool aarch64_sve_vector_inc_dec_immediate_p (rtx); int aarch64_add_offset_temporaries (rtx); @@ -810,6 +811,7 @@ char *aarch64_output_sve_prefetch (const char *, rtx, const char *); char *aarch64_output_sve_cnt_immediate (const char *, const char *, rtx); char *aarch64_output_sve_cnt_pat_immediate (const char *, const char *, rtx *); char *aarch64_output_sve_scalar_inc_dec (rtx); +char *aarch64_output_sve_rdvl (rtx); char *aarch64_output_sve_addvl_addpl (rtx); char *aarch64_output_sve_vector_inc_dec (const char *, rtx); char *aarch64_output_scalar_simd_mov_immediate (rtx, scalar_int_mode); diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index c864f4c0f6f..7a5d0d325e9 100644 --- a/gcc/config/aarch64/aarch64.cc 
+++ b/gcc/config/aarch64/aarch64.cc @@ -2933,6 +2933,18 @@ aarch64_fold_sve_cnt_pat (aarch64_svpattern pattern, unsigned int nelts_per_vq) return -1; } +/* Return true if a single CNT[BHWD] instruction can multiply FACTOR + by the number of 128-bit quadwords in an SVE vector. */ + +static bool +aarch64_sve_cnt_factor_p (HOST_WIDE_INT factor) +{ + /* The coefficient must be [1, 16] * {2, 4, 8, 16}. */ + return (IN_RANGE (factor, 2, 16 * 16) + && (factor & 1) == 0 + && factor <= 16 * (factor & -factor)); +} + /* Return true if we can move VALUE into a register using a single CNT[BHWD] instruction. */ @@ -2940,11 +2952,7 @@ static bool aarch64_sve_cnt_immediate_p (poly_int64 value) { HOST_WIDE_INT factor = value.coeffs[0]; - /* The coefficient must be [1, 16] * {2, 4, 8, 16}. */ - return (value.coeffs[1] == factor - && IN_RANGE (factor, 2, 16 * 16) - && (factor & 1) == 0 - && factor <= 16 * (factor & -factor)); + return value.coeffs[1] == factor && aarch64_sve_cnt_factor_p (factor); } /* Likewise for rtx X. */ @@ -3060,6 +3068,50 @@ aarch64_output_sve_scalar_inc_dec (rtx offset) -offset_value.coeffs[1], 0); } +/* Return true if a single RDVL instruction can multiply FACTOR by the + number of 128-bit quadwords in an SVE vector. */ + +static bool +aarch64_sve_rdvl_factor_p (HOST_WIDE_INT factor) +{ + return (multiple_p (factor, 16) + && IN_RANGE (factor, -32 * 16, 31 * 16)); +} + +/* Return true if we can move VALUE into a register using a single + RDVL instruction. */ + +static bool +aarch64_sve_rdvl_immediate_p (poly_int64 value) +{ + HOST_WIDE_INT factor = value.coeffs[0]; + return value.coeffs[1] == factor && aarch64_sve_rdvl_factor_p (factor); +} + +/* Likewise for rtx X. */ + +bool +aarch64_sve_rdvl_immediate_p (rtx x) +{ + poly_int64 value; + return poly_int_rtx_p (x, &value) && aarch64_sve_rdvl_immediate_p (value); +} + +/* Return the asm string for moving RDVL immediate OFFSET into register + operand 0. 
*/ + +char * +aarch64_output_sve_rdvl (rtx offset) +{ + static char buffer[sizeof ("rdvl\t%x0, #-") + 3 * sizeof (int)]; + poly_int64 offset_value = rtx_to_poly_int64 (offset); + gcc_assert (aarch64_sve_rdvl_immediate_p (offset_value)); + + int factor = offset_value.coeffs[1]; + snprintf (buffer, sizeof (buffer), "rdvl\t%%x0, #%d", factor / 16); + return buffer; +} + /* Return true if we can add VALUE to a register using a single ADDVL or ADDPL instruction. */ @@ -3689,13 +3741,13 @@ aarch64_offset_temporaries (bool add_p, poly_int64 offset) count += 1; else if (factor != 0) { - factor = abs (factor); - if (factor > 16 * (factor & -factor)) - /* Need one register for the CNT result and one for the multiplication - factor. If necessary, the second temporary can be reused for the - constant part of the offset. */ + factor /= (HOST_WIDE_INT) least_bit_hwi (factor); + if (!IN_RANGE (factor, -32, 31)) + /* Need one register for the CNT or RDVL result and one for the + multiplication factor. If necessary, the second temporary + can be reused for the constant part of the offset. */ return 2; - /* Need one register for the CNT result (which might then + /* Need one register for the CNT or RDVL result (which might then be shifted). */ count += 1; } @@ -3784,85 +3836,100 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, /* Otherwise use a CNT-based sequence. */ else if (factor != 0) { - /* Use a subtraction if we have a negative factor. */ - rtx_code code = PLUS; - if (factor < 0) - { - factor = -factor; - code = MINUS; - } + /* Calculate CNTB * FACTOR / 16 as CNTB * REL_FACTOR * 2**SHIFT, + with negative shifts indicating a shift right. */ + HOST_WIDE_INT low_bit = least_bit_hwi (factor); + HOST_WIDE_INT rel_factor = factor / low_bit; + int shift = exact_log2 (low_bit) - 4; + gcc_assert (shift >= -4 && (rel_factor & 1) != 0); + + /* Set CODE, VAL and SHIFT so that [+-] VAL * 2**SHIFT is + equal to CNTB * FACTOR / 16, with CODE being the [+-]. 
- /* Calculate CNTD * FACTOR / 2. First try to fold the division - into the multiplication. */ + We can avoid a multiplication if REL_FACTOR is in the range + of RDVL, although there are then various optimizations that + we can try on top. */ + rtx_code code = PLUS; rtx val; - int shift = 0; - if (factor & 1) - /* Use a right shift by 1. */ - shift = -1; - else - factor /= 2; - HOST_WIDE_INT low_bit = factor & -factor; - if (factor <= 16 * low_bit) + if (IN_RANGE (rel_factor, -32, 31)) { - if (factor > 16 * 8) + /* Try to use an unshifted CNT[BHWD] or RDVL. */ + if (aarch64_sve_cnt_factor_p (factor) + || aarch64_sve_rdvl_factor_p (factor)) + { + val = gen_int_mode (poly_int64 (factor, factor), mode); + shift = 0; + } + /* Try to subtract an unshifted CNT[BHWD]. */ + else if (aarch64_sve_cnt_factor_p (-factor)) { - /* "CNTB Xn, ALL, MUL #FACTOR" is out of range, so calculate - the value with the minimum multiplier and shift it into - position. */ - int extra_shift = exact_log2 (low_bit); - shift += extra_shift; - factor >>= extra_shift; + code = MINUS; + val = gen_int_mode (poly_int64 (-factor, -factor), mode); + shift = 0; } - val = gen_int_mode (poly_int64 (factor * 2, factor * 2), mode); + /* If subtraction is free, prefer to load a positive constant. + In the best case this will fit a shifted CNTB. */ + else if (src != const0_rtx && rel_factor < 0) + { + code = MINUS; + val = gen_int_mode (-rel_factor * BYTES_PER_SVE_VECTOR, mode); + } + /* Otherwise use a shifted RDVL or CNT[BHWD]. */ + else + val = gen_int_mode (rel_factor * BYTES_PER_SVE_VECTOR, mode); } else { - /* Base the factor on LOW_BIT if we can calculate LOW_BIT - directly, since that should increase the chances of being - able to use a shift and add sequence. If LOW_BIT itself - is out of range, just use CNTD. 
*/ - if (low_bit <= 16 * 8) - factor /= low_bit; + /* If we can calculate CNTB << SHIFT directly, prefer to do that, + since it should increase the chances of being able to use + a shift and add sequence for the multiplication. + If CNTB << SHIFT is out of range, stick with the current + shift factor. */ + if (IN_RANGE (low_bit, 2, 16 * 16)) + { + val = gen_int_mode (poly_int64 (low_bit, low_bit), mode); + shift = 0; + } else - low_bit = 1; + val = gen_int_mode (BYTES_PER_SVE_VECTOR, mode); - val = gen_int_mode (poly_int64 (low_bit * 2, low_bit * 2), mode); val = aarch64_force_temporary (mode, temp1, val); + /* Prefer to multiply by a positive factor and subtract rather + than multiply by a negative factor and add, since positive + values are usually easier to move. */ + if (rel_factor < 0 && src != const0_rtx) + { + rel_factor = -rel_factor; + code = MINUS; + } + if (can_create_pseudo_p ()) { - rtx coeff1 = gen_int_mode (factor, mode); + rtx coeff1 = gen_int_mode (rel_factor, mode); val = expand_mult (mode, val, coeff1, NULL_RTX, true, true); } else { - /* Go back to using a negative multiplication factor if we have - no register from which to subtract. */ - if (code == MINUS && src == const0_rtx) - { - factor = -factor; - code = PLUS; - } - rtx coeff1 = gen_int_mode (factor, mode); + rtx coeff1 = gen_int_mode (rel_factor, mode); coeff1 = aarch64_force_temporary (mode, temp2, coeff1); val = gen_rtx_MULT (mode, val, coeff1); } } + /* Multiply by 2 ** SHIFT. */ if (shift > 0) { - /* Multiply by 1 << SHIFT. */ val = aarch64_force_temporary (mode, temp1, val); val = gen_rtx_ASHIFT (mode, val, GEN_INT (shift)); } - else if (shift == -1) + else if (shift < 0) { - /* Divide by 2. */ val = aarch64_force_temporary (mode, temp1, val); - val = gen_rtx_ASHIFTRT (mode, val, const1_rtx); + val = gen_rtx_ASHIFTRT (mode, val, GEN_INT (-shift)); } - /* Calculate SRC +/- CNTD * FACTOR / 2. */ + /* Add the result to SRC or subtract the result from SRC. 
*/ if (src != const0_rtx) { val = aarch64_force_temporary (mode, temp1, val); @@ -4508,7 +4575,9 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) aarch64_report_sve_required (); return; } - if (base == const0_rtx && aarch64_sve_cnt_immediate_p (offset)) + if (base == const0_rtx + && (aarch64_sve_cnt_immediate_p (offset) + || aarch64_sve_rdvl_immediate_p (offset))) emit_insn (gen_rtx_SET (dest, imm)); else { @@ -19641,7 +19710,9 @@ aarch64_mov_operand_p (rtx x, machine_mode mode) if (SYMBOL_REF_P (x) && mode == DImode && CONSTANT_ADDRESS_P (x)) return true; - if (TARGET_SVE && aarch64_sve_cnt_immediate_p (x)) + if (TARGET_SVE + && (aarch64_sve_cnt_immediate_p (x) + || aarch64_sve_rdvl_immediate_p (x))) return true; return aarch64_classify_symbolic_expression (x) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 202190c2cbf..d843f472dc2 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1230,6 +1230,7 @@ (define_insn "*mov_aarch64" [w, D; neon_move , simd ] << aarch64_output_scalar_simd_mov_immediate (operands[1], mode); /* The "mov_imm" type for CNT is just a placeholder. */ [r, Usv ; mov_imm , sve ] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]); + [r, Usr ; mov_imm , sve ] << aarch64_output_sve_rdvl (operands[1]); [r, m ; load_4 , * ] ldr\t%w0, %1 [w, m ; load_4 , * ] ldr\t%0, %1 [m, r Z ; store_4 , * ] str\\t%w1, %0 @@ -1289,6 +1290,7 @@ (define_insn_and_split "*movsi_aarch64" [r , n ; mov_imm , * ,16] # /* The "mov_imm" type for CNT is just a placeholder. */ [r , Usv; mov_imm , sve , 4] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]); + [r , Usr; mov_imm , sve, 4] << aarch64_output_sve_rdvl (operands[1]); [r , m ; load_4 , * , 4] ldr\t%w0, %1 [w , m ; load_4 , fp , 4] ldr\t%s0, %1 [m , r Z; store_4 , * , 4] str\t%w1, %0 @@ -1324,6 +1326,7 @@ (define_insn_and_split "*movdi_aarch64" [r, n ; mov_imm , * ,16] # /* The "mov_imm" type for CNT is just a placeholder. 
*/ [r, Usv; mov_imm , sve , 4] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]); + [r, Usr; mov_imm , sve, 4] << aarch64_output_sve_rdvl (operands[1]); [r, m ; load_8 , * , 4] ldr\t%x0, %1 [w, m ; load_8 , fp , 4] ldr\t%d0, %1 [m, r Z; store_8 , * , 4] str\t%x1, %0 diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index b3922bcb9a8..5c02d15c77a 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -219,6 +219,12 @@ (define_constraint "Ulc" (and (match_code "const_int") (match_test "aarch64_high_bits_all_ones_p (ival)"))) +(define_constraint "Usr" + "@internal + A constraint that matches a value produced by RDVL." + (and (match_code "const_poly_int") + (match_test "aarch64_sve_rdvl_immediate_p (op)"))) + (define_constraint "Usv" "@internal A constraint that matches a VG-based constant that can be loaded by diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntb.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntb.c index 8b8fe8e4f2b..a22d8a28d86 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntb.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntb.c @@ -51,19 +51,24 @@ PROTO (cntb_15, uint64_t, ()) { return svcntb () * 15; } */ PROTO (cntb_16, uint64_t, ()) { return svcntb () * 16; } -/* Other sequences would be OK. 
*/ /* ** cntb_17: -** cntb x0, all, mul #16 -** incb x0 +** rdvl x0, #17 ** ret */ PROTO (cntb_17, uint64_t, ()) { return svcntb () * 17; } +/* +** cntb_31: +** rdvl x0, #31 +** ret +*/ +PROTO (cntb_31, uint64_t, ()) { return svcntb () * 31; } + /* ** cntb_32: -** cntd (x[0-9]+) -** lsl x0, \1, 8 +** cntb (x[0-9]+) +** lsl x0, \1, 5 ** ret */ PROTO (cntb_32, uint64_t, ()) { return svcntb () * 32; } @@ -80,16 +85,16 @@ PROTO (cntb_33, uint64_t, ()) { return svcntb () * 33; } /* ** cntb_64: -** cntd (x[0-9]+) -** lsl x0, \1, 9 +** cntb (x[0-9]+) +** lsl x0, \1, 6 ** ret */ PROTO (cntb_64, uint64_t, ()) { return svcntb () * 64; } /* ** cntb_128: -** cntd (x[0-9]+) -** lsl x0, \1, 10 +** cntb (x[0-9]+) +** lsl x0, \1, 7 ** ret */ PROTO (cntb_128, uint64_t, ()) { return svcntb () * 128; } @@ -106,46 +111,70 @@ PROTO (cntb_129, uint64_t, ()) { return svcntb () * 129; } /* ** cntb_m1: -** cntb (x[0-9]+) -** neg x0, \1 +** rdvl x0, #-1 ** ret */ PROTO (cntb_m1, uint64_t, ()) { return -svcntb (); } /* ** cntb_m13: -** cntb (x[0-9]+), all, mul #13 -** neg x0, \1 +** rdvl x0, #-13 ** ret */ PROTO (cntb_m13, uint64_t, ()) { return -svcntb () * 13; } /* ** cntb_m15: -** cntb (x[0-9]+), all, mul #15 -** neg x0, \1 +** rdvl x0, #-15 ** ret */ PROTO (cntb_m15, uint64_t, ()) { return -svcntb () * 15; } /* ** cntb_m16: -** cntb (x[0-9]+), all, mul #16 -** neg x0, \1 +** rdvl x0, #-16 ** ret */ PROTO (cntb_m16, uint64_t, ()) { return -svcntb () * 16; } -/* Other sequences would be OK. 
*/ /* ** cntb_m17: -** cntb x0, all, mul #16 -** incb x0 -** neg x0, x0 +** rdvl x0, #-17 ** ret */ PROTO (cntb_m17, uint64_t, ()) { return -svcntb () * 17; } +/* +** cntb_m32: +** rdvl x0, #-32 +** ret +*/ +PROTO (cntb_m32, uint64_t, ()) { return -svcntb () * 32; } + +/* +** cntb_m33: +** rdvl x0, #-32 +** decb x0 +** ret +*/ +PROTO (cntb_m33, uint64_t, ()) { return -svcntb () * 33; } + +/* +** cntb_m34: +** rdvl (x[0-9]+), #-17 +** lsl x0, \1, #?1 +** ret +*/ +PROTO (cntb_m34, uint64_t, ()) { return -svcntb () * 34; } + +/* +** cntb_m64: +** rdvl (x[0-9]+), #-1 +** lsl x0, \1, #?6 +** ret +*/ +PROTO (cntb_m64, uint64_t, ()) { return -svcntb () * 64; } + /* ** incb_1: ** incb x0 diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntd.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntd.c index 0d0ed4849f1..090a643b418 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntd.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntd.c @@ -54,8 +54,8 @@ PROTO (cntd_16, uint64_t, ()) { return svcntd () * 16; } /* Other sequences would be OK. */ /* ** cntd_17: -** cntb x0, all, mul #2 -** incd x0 +** rdvl (x[0-9]+), #17 +** asr x0, \1, 3 ** ret */ PROTO (cntd_17, uint64_t, ()) { return svcntd () * 17; } @@ -107,8 +107,7 @@ PROTO (cntd_m15, uint64_t, ()) { return -svcntd () * 15; } /* ** cntd_m16: -** cntb (x[0-9]+), all, mul #2 -** neg x0, \1 +** rdvl x0, #-2 ** ret */ PROTO (cntd_m16, uint64_t, ()) { return -svcntd () * 16; } @@ -116,9 +115,8 @@ PROTO (cntd_m16, uint64_t, ()) { return -svcntd () * 16; } /* Other sequences would be OK. 
*/ /* ** cntd_m17: -** cntb x0, all, mul #2 -** incd x0 -** neg x0, x0 +** rdvl (x[0-9]+), #-17 +** asr x0, \1, 3 ** ret */ PROTO (cntd_m17, uint64_t, ()) { return -svcntd () * 17; } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cnth.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cnth.c index c29930f1591..1a4e7dc0e01 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cnth.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cnth.c @@ -54,8 +54,8 @@ PROTO (cnth_16, uint64_t, ()) { return svcnth () * 16; } /* Other sequences would be OK. */ /* ** cnth_17: -** cntb x0, all, mul #8 -** inch x0 +** rdvl (x[0-9]+), #17 +** asr x0, \1, 1 ** ret */ PROTO (cnth_17, uint64_t, ()) { return svcnth () * 17; } @@ -69,16 +69,16 @@ PROTO (cnth_32, uint64_t, ()) { return svcnth () * 32; } /* ** cnth_64: -** cntd (x[0-9]+) -** lsl x0, \1, 8 +** cntb (x[0-9]+) +** lsl x0, \1, 5 ** ret */ PROTO (cnth_64, uint64_t, ()) { return svcnth () * 64; } /* ** cnth_128: -** cntd (x[0-9]+) -** lsl x0, \1, 9 +** cntb (x[0-9]+) +** lsl x0, \1, 6 ** ret */ PROTO (cnth_128, uint64_t, ()) { return svcnth () * 128; } @@ -109,8 +109,7 @@ PROTO (cnth_m15, uint64_t, ()) { return -svcnth () * 15; } /* ** cnth_m16: -** cntb (x[0-9]+), all, mul #8 -** neg x0, \1 +** rdvl x0, #-8 ** ret */ PROTO (cnth_m16, uint64_t, ()) { return -svcnth () * 16; } @@ -118,9 +117,8 @@ PROTO (cnth_m16, uint64_t, ()) { return -svcnth () * 16; } /* Other sequences would be OK. 
*/ /* ** cnth_m17: -** cntb x0, all, mul #8 -** inch x0 -** neg x0, x0 +** rdvl (x[0-9]+), #-17 +** asr x0, \1, 1 ** ret */ PROTO (cnth_m17, uint64_t, ()) { return -svcnth () * 17; } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntw.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntw.c index e26cc67a467..9d169769094 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntw.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/cntw.c @@ -54,8 +54,8 @@ PROTO (cntw_16, uint64_t, ()) { return svcntw () * 16; } /* Other sequences would be OK. */ /* ** cntw_17: -** cntb x0, all, mul #4 -** incw x0 +** rdvl (x[0-9]+), #17 +** asr x0, \1, 2 ** ret */ PROTO (cntw_17, uint64_t, ()) { return svcntw () * 17; } @@ -76,8 +76,8 @@ PROTO (cntw_64, uint64_t, ()) { return svcntw () * 64; } /* ** cntw_128: -** cntd (x[0-9]+) -** lsl x0, \1, 8 +** cntb (x[0-9]+) +** lsl x0, \1, 5 ** ret */ PROTO (cntw_128, uint64_t, ()) { return svcntw () * 128; } @@ -108,8 +108,7 @@ PROTO (cntw_m15, uint64_t, ()) { return -svcntw () * 15; } /* ** cntw_m16: -** cntb (x[0-9]+), all, mul #4 -** neg x0, \1 +** rdvl (x[0-9]+), #-4 ** ret */ PROTO (cntw_m16, uint64_t, ()) { return -svcntw () * 16; } @@ -117,9 +116,8 @@ PROTO (cntw_m16, uint64_t, ()) { return -svcntw () * 16; } /* Other sequences would be OK. 
*/ /* ** cntw_m17: -** cntb x0, all, mul #4 -** incw x0 -** neg x0, x0 +** rdvl (x[0-9]+), #-17 +** asr x0, \1, 2 ** ret */ PROTO (cntw_m17, uint64_t, ()) { return -svcntw () * 17; } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb.c index c90730a037c..94cd3a0662e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb.c @@ -218,8 +218,8 @@ TEST_PREFETCH (prfb_vnum_31, uint16_t, /* ** prfb_vnum_32: -** cntd (x[0-9]+) -** lsl (x[0-9]+), \1, #?8 +** cntb (x[0-9]+) +** lsl (x[0-9]+), \1, #?5 ** add (x[0-9]+), (\2, x0|x0, \2) ** prfb pldl1keep, p0, \[\3\] ** ret @@ -240,7 +240,7 @@ TEST_PREFETCH (prfb_vnum_m32, uint16_t, /* ** prfb_vnum_m33: ** ... -** prfb pldl1keep, p0, \[x[0-9]+\] +** prfb pldl1keep, p0, \[x[0-9]+(, x[0-9]+)?\] ** ret */ TEST_PREFETCH (prfb_vnum_m33, uint16_t, diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd.c index 869ef3d3eeb..b7a116cf056 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd.c @@ -218,8 +218,8 @@ TEST_PREFETCH (prfd_vnum_31, uint16_t, /* ** prfd_vnum_32: -** cntd (x[0-9]+) -** lsl (x[0-9]+), \1, #?8 +** cntb (x[0-9]+) +** lsl (x[0-9]+), \1, #?5 ** add (x[0-9]+), (\2, x0|x0, \2) ** prfd pldl1keep, p0, \[\3\] ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh.c index 45a735eaea0..9d3df6bd3a8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh.c @@ -218,8 +218,8 @@ TEST_PREFETCH (prfh_vnum_31, uint16_t, /* ** prfh_vnum_32: -** cntd (x[0-9]+) -** lsl (x[0-9]+), \1, #?8 +** cntb (x[0-9]+) +** lsl (x[0-9]+), \1, #?5 ** add (x[0-9]+), (\2, x0|x0, \2) ** prfh pldl1keep, p0, \[\3\] ** ret diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw.c index 444187f45d9..6962abab600 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw.c @@ -218,8 +218,8 @@ TEST_PREFETCH (prfw_vnum_31, uint16_t, /* ** prfw_vnum_32: -** cntd (x[0-9]+) -** lsl (x[0-9]+), \1, #?8 +** cntb (x[0-9]+) +** lsl (x[0-9]+), \1, #?5 ** add (x[0-9]+), (\2, x0|x0, \2) ** prfw pldl1keep, p0, \[\3\] ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/loop_add_4.c b/gcc/testsuite/gcc.target/aarch64/sve/loop_add_4.c index 9ead9c21b35..7f02497e839 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/loop_add_4.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/loop_add_4.c @@ -68,8 +68,7 @@ TEST_ALL (LOOP) /* { dg-final { scan-assembler-times {\tindex\tz[0-9]+\.s, w[0-9]+, w[0-9]+\n} 3 } } */ /* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]} 8 } } */ /* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]} 8 } } */ -/* 2 for the calculations of -17 and 17. */ -/* { dg-final { scan-assembler-times {\tincw\tx[0-9]+\n} 10 } } */ +/* { dg-final { scan-assembler-times {\tincw\tx[0-9]+\n} 8 } } */ /* { dg-final { scan-assembler-times {\tdecw\tz[0-9]+\.s, all, mul #16\n} 1 } } */ /* { dg-final { scan-assembler-times {\tdecw\tz[0-9]+\.s, all, mul #15\n} 1 } } */ @@ -86,8 +85,7 @@ TEST_ALL (LOOP) /* { dg-final { scan-assembler-times {\tindex\tz[0-9]+\.d, x[0-9]+, x[0-9]+\n} 3 } } */ /* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 3\]} 8 } } */ /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 3\]} 8 } } */ -/* 2 for the calculations of -17 and 17. 
*/
-/* { dg-final { scan-assembler-times {\tincd\tx[0-9]+\n} 10 } } */
+/* { dg-final { scan-assembler-times {\tincd\tx[0-9]+\n} 8 } } */
 /* { dg-final { scan-assembler-times {\tdecd\tz[0-9]+\.d, all, mul #16\n} 1 } } */
 /* { dg-final { scan-assembler-times {\tdecd\tz[0-9]+\.d, all, mul #15\n} 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1.c
index 110947a6c4a..5de34fc6163 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/stack_clash_1.c
@@ -6,8 +6,7 @@
 /*
 ** test_1:
-** cntd x12, all, mul #9
-** lsl x12, x12, #?4
+** rdvl x12, #18
 ** mov x11, sp
 ** ...
 ** sub sp, sp, x12

From patchwork Tue Dec 5 10:13:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1872024
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 03/25] aarch64: Make AARCH64_FL_SVE requirements explicit
Date: Tue, 5 Dec 2023 10:13:01 +0000
Message-Id: <20231205101323.1914247-4-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

So far, all intrinsics covered by the aarch64-sve-builtins* framework
have (naturally enough) required at least SVE. However, arm_sme.h
defines a couple of intrinsics that can be called by any code. It's
therefore necessary to make the implicit SVE requirement explicit.

gcc/
* config/aarch64/aarch64-sve-builtins.cc (function_groups): Remove
  implied requirement on SVE.
* config/aarch64/aarch64-sve-builtins-base.def: Explicitly require SVE.
* config/aarch64/aarch64-sve-builtins-sve2.def: Likewise.
---
 .../aarch64/aarch64-sve-builtins-base.def | 10 +++++-----
 .../aarch64/aarch64-sve-builtins-sve2.def | 18 +++++++++++++-----
 gcc/config/aarch64/aarch64-sve-builtins.cc | 2 +-
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def
index 95ae1d71629..0484863d3f7 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def
@@ -17,7 +17,7 @@
 along with GCC; see the file COPYING3. If not see .
*/ -#define REQUIRED_EXTENSIONS 0 +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE DEF_SVE_FUNCTION (svabd, binary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svabs, unary, all_float_and_signed, mxz) DEF_SVE_FUNCTION (svacge, compare_opt_n, all_float, implicit) @@ -318,7 +318,7 @@ DEF_SVE_FUNCTION (svzip2, binary, all_data, none) DEF_SVE_FUNCTION (svzip2, binary_pred, all_pred, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_BF16 +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_BF16 DEF_SVE_FUNCTION (svbfdot, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfdot_lane, ternary_bfloat_lanex2, s_float, none) DEF_SVE_FUNCTION (svbfmlalb, ternary_bfloat_opt_n, s_float, none) @@ -330,7 +330,7 @@ DEF_SVE_FUNCTION (svcvt, unary_convert, cvt_bfloat, mxz) DEF_SVE_FUNCTION (svcvtnt, unary_convert_narrowt, cvt_bfloat, mx) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_I8MM +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_I8MM DEF_SVE_FUNCTION (svmmla, mmla, s_integer, none) DEF_SVE_FUNCTION (svusmmla, ternary_uintq_intq, s_signed, none) DEF_SVE_FUNCTION (svsudot, ternary_intq_uintq_opt_n, s_signed, none) @@ -339,11 +339,11 @@ DEF_SVE_FUNCTION (svusdot, ternary_uintq_intq_opt_n, s_signed, none) DEF_SVE_FUNCTION (svusdot_lane, ternary_uintq_intq_lane, s_signed, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_F32MM +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_F32MM DEF_SVE_FUNCTION (svmmla, mmla, s_float, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_F64MM +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_F64MM DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit) DEF_SVE_FUNCTION (svmmla, mmla, d_float, none) DEF_SVE_FUNCTION (svtrn1q, binary, all_data, none) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index dd6d1357d51..565393f3081 100644 --- 
a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -17,7 +17,7 @@ along with GCC; see the file COPYING3. If not see . */ -#define REQUIRED_EXTENSIONS AARCH64_FL_SVE2 +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_SVE2 DEF_SVE_FUNCTION (svaba, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (svabalb, ternary_long_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svabalt, ternary_long_opt_n, hsd_integer, none) @@ -189,7 +189,9 @@ DEF_SVE_FUNCTION (svwhilewr, compare_ptr, all_data, none) DEF_SVE_FUNCTION (svxar, ternary_shift_right_imm, all_integer, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES) +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_AES) DEF_SVE_FUNCTION (svaesd, binary, b_unsigned, none) DEF_SVE_FUNCTION (svaese, binary, b_unsigned, none) DEF_SVE_FUNCTION (svaesmc, unary, b_unsigned, none) @@ -198,17 +200,23 @@ DEF_SVE_FUNCTION (svpmullb_pair, binary_opt_n, d_unsigned, none) DEF_SVE_FUNCTION (svpmullt_pair, binary_opt_n, d_unsigned, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 | AARCH64_FL_SVE2_BITPERM) +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_BITPERM) DEF_SVE_FUNCTION (svbdep, binary_opt_n, all_unsigned, none) DEF_SVE_FUNCTION (svbext, binary_opt_n, all_unsigned, none) DEF_SVE_FUNCTION (svbgrp, binary_opt_n, all_unsigned, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 | AARCH64_FL_SVE2_SHA3) +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_SHA3) DEF_SVE_FUNCTION (svrax1, binary, d_integer, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 | AARCH64_FL_SVE2_SM4) +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_SM4) DEF_SVE_FUNCTION (svsm4e, binary, s_unsigned, none) 
 DEF_SVE_FUNCTION (svsm4ekey, binary, s_unsigned, none)
 #undef REQUIRED_EXTENSIONS
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index e6ac81f6b52..1bf88fa2ee1 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -525,7 +525,7 @@ static const predication_index preds_z[] = { PRED_z, NUM_PREDS };
 static CONSTEXPR const function_group_info function_groups[] = {
 #define DEF_SVE_FUNCTION(NAME, SHAPE, TYPES, PREDS) \
 { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, preds_##PREDS, \
- REQUIRED_EXTENSIONS | AARCH64_FL_SVE },
+ REQUIRED_EXTENSIONS },
 #include "aarch64-sve-builtins.def"
 };

From patchwork Tue Dec 5 10:13:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1872026
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 04/25] aarch64: Add group suffixes to SVE intrinsics
Date: Tue, 5 Dec 2023 10:13:02 +0000
Message-Id: <20231205101323.1914247-5-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

The SME2 ACLE adds a new "group" suffix component to the naming
convention for SVE intrinsics. This is also used in the new tuple
forms of the svreinterpret intrinsics.

This patch adds support for group suffixes and defines the x2, x3
and x4 suffixes that are needed for the svreinterprets.

gcc/
* config/aarch64/aarch64-sve-builtins-shapes.cc (build_one): Take
  a group suffix index parameter.
  (build_32_64, build_all): Update accordingly. Iterate over all
  group suffixes.
* config/aarch64/aarch64-sve-builtins-sve2.cc (svqrshl_impl::fold)
  (svqshl_impl::fold, svrshl_impl::fold): Update function_instance
  constructors.
* config/aarch64/aarch64-sve-builtins.cc (group_suffixes): New array.
  (groups_none): New constant.
  (function_groups): Initialize the groups field.
  (function_instance::hash): Hash the group index.
  (function_builder::get_name): Add the group suffix.
  (function_builder::add_overloaded_functions): Iterate over all
  group suffixes.
  (function_resolver::lookup_form): Take a group suffix parameter.
  (function_resolver::resolve_to): Likewise.
* config/aarch64/aarch64-sve-builtins.def (DEF_SVE_GROUP_SUFFIX): New
  macro.
  (x2, x3, x4): New group suffixes.
* config/aarch64/aarch64-sve-builtins.h (group_suffix_index): New enum.
  (group_suffix_info): New structure.
  (function_group_info::groups): New member variable.
  (function_instance::group_suffix_id): Likewise.
  (group_suffixes): New array.
  (function_instance::operator==): Compare the group suffixes.
  (function_instance::group_suffix): New function.
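
For readers who do not know the builtins framework, the following standalone sketch may help show how a group suffix slots into an intrinsic name. It is an illustration only and is not code from the patch: every identifier beginning with sketch_ or SKETCH_ is hypothetical, a list macro stands in for the #include of aarch64-sve-builtins.def, and the mode suffix is left out for brevity. It assumes the ordering used by function_builder::get_name for a full (non-overloaded) name: the type suffixes, then the group suffix, then the predication suffix.

```cpp
#include <string>

// Hypothetical stand-in for the DEF_SVE_GROUP_SUFFIX entries in
// aarch64-sve-builtins.def.  A list macro replaces the #include so
// that the sketch fits in one file.
#define SKETCH_GROUP_SUFFIXES(X) \
  X (x2, 0, 2)                   \
  X (x3, 0, 3)                   \
  X (x4, 0, 4)

// One enumerator per suffix, followed by the "no suffix" entry and the
// end marker, in the same order as group_suffix_index in the patch.
enum sketch_group_index
{
#define X(NAME, VG, VECTORS_PER_TUPLE) SKETCH_GROUP_##NAME,
  SKETCH_GROUP_SUFFIXES (X)
#undef X
  SKETCH_GROUP_none,
  SKETCH_NUM_GROUP_SUFFIXES
};

// Same field layout as the { "_" #NAME, VG, VECTORS_PER_TUPLE }
// initializers in the patch; the struct name is an assumption.
struct sketch_group_info
{
  const char *string;
  unsigned int vg;
  unsigned int vectors_per_tuple;
};

// Table indexed by sketch_group_index.  The final entry describes
// SKETCH_GROUP_none, which contributes nothing to the name.
static const sketch_group_info sketch_group_suffixes[] = {
#define X(NAME, VG, VECTORS_PER_TUPLE) { "_" #NAME, VG, VECTORS_PER_TUPLE },
  SKETCH_GROUP_SUFFIXES (X)
#undef X
  { "", 0, 1 }
};

// Compose a full name: base name, the two type suffixes, the group
// suffix and the predication suffix, in that order.
static std::string
sketch_full_name (const std::string &base, const std::string &type0,
                  const std::string &type1, sketch_group_index group,
                  const std::string &pred)
{
  return base + type0 + type1 + sketch_group_suffixes[group].string + pred;
}
```

With these definitions, sketch_full_name ("svreinterpret", "_s8", "_u16", SKETCH_GROUP_x2, "") builds the string "svreinterpret_s8_u16_x2", and passing SKETCH_GROUP_none leaves the name unchanged. Which combinations the real framework registers is decided by the shape and group tables, which the sketch does not model.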
--- .../aarch64/aarch64-sve-builtins-shapes.cc | 53 ++++++------ .../aarch64/aarch64-sve-builtins-sve2.cc | 10 +-- gcc/config/aarch64/aarch64-sve-builtins.cc | 84 +++++++++++++------ gcc/config/aarch64/aarch64-sve-builtins.def | 9 ++ gcc/config/aarch64/aarch64-sve-builtins.h | 81 ++++++++++++++---- 5 files changed, 165 insertions(+), 72 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 1646afc7a0d..dc255fc59f2 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -275,18 +275,20 @@ parse_signature (const function_instance &instance, const char *format, } /* Add one function instance for GROUP, using mode suffix MODE_SUFFIX_ID, - the type suffixes at index TI and the predication suffix at index PI. - The other arguments are as for build_all. */ + the type suffixes at index TI, the group suffixes at index GI, and the + predication suffix at index PI. The other arguments are as for + build_all. */ static void build_one (function_builder &b, const char *signature, const function_group_info &group, mode_suffix_index mode_suffix_id, - unsigned int ti, unsigned int pi, bool force_direct_overloads) + unsigned int ti, unsigned int gi, unsigned int pi, + bool force_direct_overloads) { /* Byte forms of svdupq take 16 arguments. 
*/ auto_vec argument_types; function_instance instance (group.base_name, *group.base, *group.shape, mode_suffix_id, group.types[ti], - group.preds[pi]); + group.groups[gi], group.preds[pi]); tree return_type = parse_signature (instance, signature, argument_types); apply_predication (instance, return_type, argument_types); b.add_unique_function (instance, return_type, argument_types, @@ -312,24 +314,26 @@ build_32_64 (function_builder &b, const char *signature, mode_suffix_index mode64, bool force_direct_overloads = false) { for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) - if (group.types[0][0] == NUM_TYPE_SUFFIXES) - { - gcc_assert (mode32 != MODE_none && mode64 != MODE_none); - build_one (b, signature, group, mode32, 0, pi, - force_direct_overloads); - build_one (b, signature, group, mode64, 0, pi, - force_direct_overloads); - } - else - for (unsigned int ti = 0; group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti) + for (unsigned int gi = 0; group.groups[gi] != NUM_GROUP_SUFFIXES; ++gi) + if (group.types[0][0] == NUM_TYPE_SUFFIXES) { - unsigned int bits = type_suffixes[group.types[ti][0]].element_bits; - gcc_assert (bits == 32 || bits == 64); - mode_suffix_index mode = bits == 32 ? mode32 : mode64; - if (mode != MODE_none) - build_one (b, signature, group, mode, ti, pi, - force_direct_overloads); + gcc_assert (mode32 != MODE_none && mode64 != MODE_none); + build_one (b, signature, group, mode32, 0, gi, pi, + force_direct_overloads); + build_one (b, signature, group, mode64, 0, gi, pi, + force_direct_overloads); } + else + for (unsigned int ti = 0; group.types[ti][0] != NUM_TYPE_SUFFIXES; + ++ti) + { + unsigned int bits = type_suffixes[group.types[ti][0]].element_bits; + gcc_assert (bits == 32 || bits == 64); + mode_suffix_index mode = bits == 32 ? 
mode32 : mode64; + if (mode != MODE_none) + build_one (b, signature, group, mode, ti, gi, pi, + force_direct_overloads); + } } /* For every type and predicate combination in GROUP, add one function @@ -423,10 +427,11 @@ build_all (function_builder &b, const char *signature, bool force_direct_overloads = false) { for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) - for (unsigned int ti = 0; - ti == 0 || group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti) - build_one (b, signature, group, mode_suffix_id, ti, pi, - force_direct_overloads); + for (unsigned int gi = 0; group.groups[gi] != NUM_GROUP_SUFFIXES; ++gi) + for (unsigned int ti = 0; + ti == 0 || group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti) + build_one (b, signature, group, mode_suffix_id, ti, gi, pi, + force_direct_overloads); } /* TYPE is the largest type suffix associated with the arguments of R, diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc index 9e989fca2ab..73f9e5a899c 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc @@ -247,7 +247,7 @@ public: that we can use for sensible shift amounts. */ function_instance instance ("svqshl", functions::svqshl, shapes::binary_int_opt_n, MODE_n, - f.type_suffix_ids, f.pred); + f.type_suffix_ids, GROUP_none, f.pred); return f.redirect_call (instance); } else @@ -256,7 +256,7 @@ public: that we can use for sensible shift amounts. 
*/ function_instance instance ("svrshl", functions::svrshl, shapes::binary_int_opt_n, MODE_n, - f.type_suffix_ids, f.pred); + f.type_suffix_ids, GROUP_none, f.pred); return f.redirect_call (instance); } } @@ -285,7 +285,7 @@ public: -wi::to_wide (amount)); function_instance instance ("svasr", functions::svasr, shapes::binary_uint_opt_n, MODE_n, - f.type_suffix_ids, f.pred); + f.type_suffix_ids, GROUP_none, f.pred); if (f.type_suffix (0).unsigned_p) { instance.base_name = "svlsr"; @@ -317,7 +317,7 @@ public: that we can use for sensible shift amounts. */ function_instance instance ("svlsl", functions::svlsl, shapes::binary_uint_opt_n, MODE_n, - f.type_suffix_ids, f.pred); + f.type_suffix_ids, GROUP_none, f.pred); gcall *call = as_a (f.redirect_call (instance)); gimple_call_set_arg (call, 2, amount); return call; @@ -330,7 +330,7 @@ public: -wi::to_wide (amount)); function_instance instance ("svrshr", functions::svrshr, shapes::shift_right_imm, MODE_n, - f.type_suffix_ids, f.pred); + f.type_suffix_ids, GROUP_none, f.pred); gcall *call = as_a (f.redirect_call (instance)); gimple_call_set_arg (call, 2, amount); return call; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 1bf88fa2ee1..55938413ef0 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -144,6 +144,13 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { 0, VOIDmode } }; +CONSTEXPR const group_suffix_info group_suffixes[] = { +#define DEF_SVE_GROUP_SUFFIX(NAME, VG, VECTORS_PER_TUPLE) \ + { "_" #NAME, VG, VECTORS_PER_TUPLE }, +#include "aarch64-sve-builtins.def" + { "", 0, 1 } +}; + /* Define a TYPES_ macro for each combination of type suffixes that an ACLE function can have, where is the name used in DEF_SVE_FUNCTION entries. 
@@ -483,6 +490,10 @@ DEF_SVE_TYPES_ARRAY (inc_dec_n); DEF_SVE_TYPES_ARRAY (reinterpret); DEF_SVE_TYPES_ARRAY (while); +static const group_suffix_index groups_none[] = { + GROUP_none, NUM_GROUP_SUFFIXES +}; + /* Used by functions that have no governing predicate. */ static const predication_index preds_none[] = { PRED_none, NUM_PREDS }; @@ -524,8 +535,8 @@ static const predication_index preds_z[] = { PRED_z, NUM_PREDS }; /* A list of all SVE ACLE functions. */ static CONSTEXPR const function_group_info function_groups[] = { #define DEF_SVE_FUNCTION(NAME, SHAPE, TYPES, PREDS) \ - { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, preds_##PREDS, \ - REQUIRED_EXTENSIONS }, + { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, groups_none, \ + preds_##PREDS, REQUIRED_EXTENSIONS }, #include "aarch64-sve-builtins.def" }; @@ -788,6 +799,7 @@ function_instance::hash () const h.add_int (mode_suffix_id); h.add_int (type_suffix_ids[0]); h.add_int (type_suffix_ids[1]); + h.add_int (group_suffix_id); h.add_int (pred); return h.end (); } @@ -957,6 +969,8 @@ function_builder::get_name (const function_instance &instance, for (unsigned int i = 0; i < 2; ++i) if (!overloaded_p || instance.shape->explicit_type_suffix_p (i)) append_name (instance.type_suffix (i).string); + if (!overloaded_p || instance.shape->explicit_group_suffix_p ()) + append_name (instance.group_suffix ().string); append_name (pred_suffixes[instance.pred]); return finish_name (); } @@ -1113,19 +1127,26 @@ void function_builder::add_overloaded_functions (const function_group_info &group, mode_suffix_index mode) { - unsigned int explicit_type0 = (*group.shape)->explicit_type_suffix_p (0); - unsigned int explicit_type1 = (*group.shape)->explicit_type_suffix_p (1); - for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) + bool explicit_type0 = (*group.shape)->explicit_type_suffix_p (0); + bool explicit_type1 = (*group.shape)->explicit_type_suffix_p (1); + bool explicit_group = 
(*group.shape)->explicit_group_suffix_p (); + auto add_function = [&](const type_suffix_pair &types, + group_suffix_index group_suffix_id, + unsigned int pi) + { + function_instance instance (group.base_name, *group.base, + *group.shape, mode, types, + group_suffix_id, group.preds[pi]); + add_overloaded_function (instance, group.required_extensions); + }; + + auto add_group_suffix = [&](group_suffix_index group_suffix_id, + unsigned int pi) { if (!explicit_type0 && !explicit_type1) - { - /* Deal with the common case in which there is one overloaded - function for all type combinations. */ - function_instance instance (group.base_name, *group.base, - *group.shape, mode, types_none[0], - group.preds[pi]); - add_overloaded_function (instance, group.required_extensions); - } + /* Deal with the common case in which there is one overloaded + function for all type combinations. */ + add_function (types_none[0], group_suffix_id, pi); else for (unsigned int ti = 0; group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti) @@ -1136,12 +1157,16 @@ function_builder::add_overloaded_functions (const function_group_info &group, explicit_type0 ? group.types[ti][0] : NUM_TYPE_SUFFIXES, explicit_type1 ? group.types[ti][1] : NUM_TYPE_SUFFIXES }; - function_instance instance (group.base_name, *group.base, - *group.shape, mode, types, - group.preds[pi]); - add_overloaded_function (instance, group.required_extensions); + add_function (types, group_suffix_id, pi); } - } + }; + + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) + if (explicit_group) + for (unsigned int gi = 0; group.groups[gi] != NUM_GROUP_SUFFIXES; ++gi) + add_group_suffix (group.groups[gi], pi); + else + add_group_suffix (GROUP_none, pi); } /* Register all the functions in GROUP. */ @@ -1213,29 +1238,34 @@ function_resolver::report_no_such_form (type_suffix_index type) } /* Silently check whether there is an instance of the function with the - mode suffix given by MODE and the type suffixes given by TYPE0 and TYPE1. 
- Return its function decl if so, otherwise return null. */ + mode suffix given by MODE, the type suffixes given by TYPE0 and TYPE1, + and the group suffix given by GROUP. Return its function decl if so, + otherwise return null. */ tree function_resolver::lookup_form (mode_suffix_index mode, type_suffix_index type0, - type_suffix_index type1) + type_suffix_index type1, + group_suffix_index group) { type_suffix_pair types = { type0, type1 }; - function_instance instance (base_name, base, shape, mode, types, pred); + function_instance instance (base_name, base, shape, mode, types, + group, pred); registered_function *rfn = function_table->find_with_hash (instance, instance.hash ()); return rfn ? rfn->decl : NULL_TREE; } -/* Resolve the function to one with the mode suffix given by MODE and the - type suffixes given by TYPE0 and TYPE1. Return its function decl on - success, otherwise report an error and return error_mark_node. */ +/* Resolve the function to one with the mode suffix given by MODE, the + type suffixes given by TYPE0 and TYPE1, and group suffix given by + GROUP. Return its function decl on success, otherwise report an + error and return error_mark_node. 
*/ tree function_resolver::resolve_to (mode_suffix_index mode, type_suffix_index type0, - type_suffix_index type1) + type_suffix_index type1, + group_suffix_index group) { - tree res = lookup_form (mode, type0, type1); + tree res = lookup_form (mode, type0, type1, group); if (!res) { if (type1 == NUM_TYPE_SUFFIXES) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def b/gcc/config/aarch64/aarch64-sve-builtins.def index 534f6e69d72..5fbd486d74e 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.def +++ b/gcc/config/aarch64/aarch64-sve-builtins.def @@ -29,6 +29,10 @@ #define DEF_SVE_TYPE_SUFFIX(A, B, C, D, E) #endif +#ifndef DEF_SVE_GROUP_SUFFIX +#define DEF_SVE_GROUP_SUFFIX(A, B, C) +#endif + #ifndef DEF_SVE_FUNCTION #define DEF_SVE_FUNCTION(A, B, C, D) #endif @@ -95,10 +99,15 @@ DEF_SVE_TYPE_SUFFIX (u16, svuint16_t, unsigned, 16, VNx8HImode) DEF_SVE_TYPE_SUFFIX (u32, svuint32_t, unsigned, 32, VNx4SImode) DEF_SVE_TYPE_SUFFIX (u64, svuint64_t, unsigned, 64, VNx2DImode) +DEF_SVE_GROUP_SUFFIX (x2, 0, 2) +DEF_SVE_GROUP_SUFFIX (x3, 0, 3) +DEF_SVE_GROUP_SUFFIX (x4, 0, 4) + #include "aarch64-sve-builtins-base.def" #include "aarch64-sve-builtins-sve2.def" #undef DEF_SVE_FUNCTION +#undef DEF_SVE_GROUP_SUFFIX #undef DEF_SVE_TYPE_SUFFIX #undef DEF_SVE_TYPE #undef DEF_SVE_MODE diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 2ca5b208efa..dde35e15259 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -180,6 +180,17 @@ enum type_suffix_index NUM_TYPE_SUFFIXES }; +/* Enumerates the possible group suffixes. Each suffix combines two + optional pieces of information: the vector group size in a ZA index, + and the number of vectors in the largest tuple argument. */ +enum group_suffix_index +{ +#define DEF_SVE_GROUP_SUFFIX(NAME, VG, VECTORS_PER_TUPLE) GROUP_##NAME, +#include "aarch64-sve-builtins.def" + GROUP_none, + NUM_GROUP_SUFFIXES +}; + /* Combines two type suffixes. 
*/ typedef enum type_suffix_index type_suffix_pair[2]; @@ -237,6 +248,21 @@ struct type_suffix_info machine_mode vector_mode : 16; }; +/* Static information about a group suffix. */ +struct group_suffix_info +{ + /* The suffix string itself. */ + const char *string; + + /* If the suffix describes a vector group in a ZA index, this is the + size of that group, otherwise it is zero. */ + unsigned int vg; + + /* The number of vectors in the largest (or only) tuple argument, + or 1 if the suffix does not convey this information. */ + unsigned int vectors_per_tuple; +}; + /* Static information about a set of functions. */ struct function_group_info { @@ -251,14 +277,16 @@ struct function_group_info shapes. */ const function_shape *const *shape; - /* A list of the available type suffixes, and of the available predication - types. The function supports every combination of the two. + /* A list of the available type suffixes, group suffixes, and predication + types. The function supports every combination of the three. + + The list of type suffixes is terminated by two NUM_TYPE_SUFFIXES. + It is lexicographically ordered based on the index value. - The list of type suffixes is terminated by two NUM_TYPE_SUFFIXES - while the list of predication types is terminated by NUM_PREDS. - The list of type suffixes is lexicographically ordered based - on the index value. */ + The list of group suffixes is terminated by NUM_GROUP_SUFFIXES + and the list of predication types is terminated by NUM_PREDS. 
*/ const type_suffix_pair *types; + const group_suffix_index *groups; const predication_index *preds; /* The architecture extensions that the functions require, as a set of @@ -273,7 +301,8 @@ class GTY((user)) function_instance public: function_instance (const char *, const function_base *, const function_shape *, mode_suffix_index, - const type_suffix_pair &, predication_index); + const type_suffix_pair &, group_suffix_index, + predication_index); bool operator== (const function_instance &) const; bool operator!= (const function_instance &) const; @@ -294,6 +323,8 @@ public: units_index displacement_units () const; const type_suffix_info &type_suffix (unsigned int) const; + const group_suffix_info &group_suffix () const; + tree scalar_type (unsigned int) const; tree vector_type (unsigned int) const; tree tuple_type (unsigned int) const; @@ -301,14 +332,14 @@ public: machine_mode vector_mode (unsigned int) const; machine_mode gp_mode (unsigned int) const; - /* The properties of the function. (The explicit "enum"s are required - for gengtype.) */ + /* The properties of the function. 
*/ const char *base_name; const function_base *base; const function_shape *shape; - enum mode_suffix_index mode_suffix_id; + mode_suffix_index mode_suffix_id; type_suffix_pair type_suffix_ids; - enum predication_index pred; + group_suffix_index group_suffix_id; + predication_index pred; }; class registered_function; @@ -390,10 +421,12 @@ public: tree report_no_such_form (type_suffix_index); tree lookup_form (mode_suffix_index, type_suffix_index = NUM_TYPE_SUFFIXES, - type_suffix_index = NUM_TYPE_SUFFIXES); + type_suffix_index = NUM_TYPE_SUFFIXES, + group_suffix_index = GROUP_none); tree resolve_to (mode_suffix_index, type_suffix_index = NUM_TYPE_SUFFIXES, - type_suffix_index = NUM_TYPE_SUFFIXES); + type_suffix_index = NUM_TYPE_SUFFIXES, + group_suffix_index = GROUP_none); type_suffix_index infer_integer_scalar_type (unsigned int); type_suffix_index infer_pointer_type (unsigned int, bool = false); @@ -643,6 +676,11 @@ class function_shape public: virtual bool explicit_type_suffix_p (unsigned int) const = 0; + /* True if the group suffix is present in overloaded names. + This isn't meaningful for pre-SME intrinsics, and true is + more common than false, so provide a default definition. */ + virtual bool explicit_group_suffix_p () const { return true; } + /* Define all functions associated with the given group. 
*/ virtual void build (function_builder &, const function_group_info &) const = 0; @@ -671,6 +709,7 @@ private: extern const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1]; extern const mode_suffix_info mode_suffixes[MODE_none + 1]; +extern const group_suffix_info group_suffixes[NUM_GROUP_SUFFIXES]; extern tree scalar_types[NUM_VECTOR_TYPES]; extern tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1]; @@ -733,9 +772,11 @@ function_instance (const char *base_name_in, const function_shape *shape_in, mode_suffix_index mode_suffix_id_in, const type_suffix_pair &type_suffix_ids_in, + group_suffix_index group_suffix_id_in, predication_index pred_in) : base_name (base_name_in), base (base_in), shape (shape_in), - mode_suffix_id (mode_suffix_id_in), pred (pred_in) + mode_suffix_id (mode_suffix_id_in), group_suffix_id (group_suffix_id_in), + pred (pred_in) { memcpy (type_suffix_ids, type_suffix_ids_in, sizeof (type_suffix_ids)); } @@ -746,9 +787,10 @@ function_instance::operator== (const function_instance &other) const return (base == other.base && shape == other.shape && mode_suffix_id == other.mode_suffix_id - && pred == other.pred && type_suffix_ids[0] == other.type_suffix_ids[0] - && type_suffix_ids[1] == other.type_suffix_ids[1]); + && type_suffix_ids[1] == other.type_suffix_ids[1] + && group_suffix_id == other.group_suffix_id + && pred == other.pred); } inline bool @@ -820,6 +862,13 @@ function_instance::type_suffix (unsigned int i) const return type_suffixes[type_suffix_ids[i]]; } +/* Return information about the function's group suffix. */ +inline const group_suffix_info & +function_instance::group_suffix () const +{ + return group_suffixes[group_suffix_id]; +} + /* Return the scalar type associated with type suffix I. 
*/ inline tree function_instance::scalar_type (unsigned int i) const From patchwork Tue Dec 5 10:13:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872029 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxGw6Mgjz23mf for ; Tue, 5 Dec 2023 21:14:48 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A309B384CB9C for ; Tue, 5 Dec 2023 10:14:38 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 475BD385734D for ; Tue, 5 Dec 2023 10:13:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 475BD385734D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 475BD385734D Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771219; cv=none; 
b=bn60vhX77DckXpdLWqQ85f2qFU+MigBrX1rFy+sKzEZLYxO8O46hkMwNKJWSQYMDW15fcJEMI48khpXN1hqxd/NgwlRYV3TOdcmS6AD92f0540Jvk3YV1fBZO6SlvQjI+q24yPhSJcSn64d1Vb0xpgwiEKOlg4RU+ogWnjHGIvM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771219; c=relaxed/simple; bh=yRk1wOpvFKgnS6QLIEMGZzYTsrEkDg3MKcWNrxXDyoo=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=YysMNIdjBsInz9IQD/NhMhZ24GB1lHdZM49Dw15X4gCxo5qfaDH0+1/GmEbpzst/YawiZ7TqDSzZTAizu4VzPazipdJPid6nu1zEVDJRfnkRyl36rOspYy8j7eIARCCGQEKocffiKygLY45z+qAT9WzG7PMYI2EBRssDQ7XkCxw= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A5BA91570; Tue, 5 Dec 2023 02:14:23 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 973313F5A1; Tue, 5 Dec 2023 02:13:36 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 05/25] aarch64: Add sve_type to SVE builtins code Date: Tue, 5 Dec 2023 10:13:03 +0000 Message-Id: <20231205101323.1914247-6-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Until now, 
the SVE ACLE code had mostly been able to represent individual SVE arguments with just an element type suffix (s32, u32, etc.). However, the SME2 ACLE provides many overloaded intrinsics that operate on tuples rather than single vectors. This patch therefore adds a new type (sve_type) that combines an element type suffix with a vector count. This is enough to uniquely represent all SVE ACLE types. gcc/ * config/aarch64/aarch64-sve-builtins.h (sve_type): New struct. (sve_type::operator==): New function. (function_resolver::get_vector_type): Delete. (function_resolver::report_no_such_form): Take an sve_type rather than a type_suffix_index. * config/aarch64/aarch64-sve-builtins.cc (get_vector_type): New function. (function_resolver::get_vector_type): Delete. (function_resolver::report_no_such_form): Take an sve_type rather than a type_suffix_index. (find_sve_type): New function, split out from... (function_resolver::infer_vector_or_tuple_type): ...here. --- gcc/config/aarch64/aarch64-sve-builtins.cc | 93 ++++++++++++---------- gcc/config/aarch64/aarch64-sve-builtins.h | 37 ++++++++- 2 files changed, 88 insertions(+), 42 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 55938413ef0..058b1defa9e 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -659,6 +659,14 @@ find_type_suffix_for_scalar_type (const_tree type) return NUM_TYPE_SUFFIXES; } +/* Return the vector type associated with TYPE. */ +static tree +get_vector_type (sve_type type) +{ + auto vector_type = type_suffixes[type.type].vector_type; + return acle_vector_types[type.num_vectors - 1][vector_type]; +} + /* Report an error against LOCATION that the user has tried to use function FNDECL when extension EXTENSION is disabled. */ static void @@ -1190,13 +1198,6 @@ function_resolver::function_resolver (location_t location, { } -/* Return the vector type associated with type suffix TYPE. 
*/ -tree -function_resolver::get_vector_type (type_suffix_index type) -{ - return acle_vector_types[0][type_suffixes[type].vector_type]; -} - /* Return the name associated with TYPE. Using the name should be more user-friendly than the underlying canonical type, since it makes the signedness and bitwidth explicit. */ @@ -1227,10 +1228,10 @@ function_resolver::scalar_argument_p (unsigned int i) || SCALAR_FLOAT_TYPE_P (type)); } -/* Report that the function has no form that takes type suffix TYPE. +/* Report that the function has no form that takes type TYPE. Return error_mark_node. */ tree -function_resolver::report_no_such_form (type_suffix_index type) +function_resolver::report_no_such_form (sve_type type) { error_at (location, "%qE has no form that takes %qT arguments", fndecl, get_vector_type (type)); @@ -1352,6 +1353,25 @@ function_resolver::infer_pointer_type (unsigned int argno, return type; } +/* If TYPE is an SVE predicate or vector type, or a tuple of such a type, + return the associated sve_type, otherwise return an invalid sve_type. */ +static sve_type +find_sve_type (const_tree type) +{ + /* A linear search should be OK here, since the code isn't hot and + the number of types is only small. */ + for (unsigned int size_i = 0; size_i < MAX_TUPLE_SIZE; ++size_i) + for (unsigned int suffix_i = 0; suffix_i < NUM_TYPE_SUFFIXES; ++suffix_i) + { + vector_type_index type_i = type_suffixes[suffix_i].vector_type; + tree this_type = acle_vector_types[size_i][type_i]; + if (this_type && matches_type_p (this_type, type)) + return { type_suffix_index (suffix_i), size_i + 1 }; + } + + return {}; +} + /* Require argument ARGNO to be a single vector or a tuple of NUM_VECTORS vectors; NUM_VECTORS is 1 for the former. Return the associated type suffix on success, using TYPE_SUFFIX_b for predicates. 
Report an error @@ -1364,37 +1384,30 @@ function_resolver::infer_vector_or_tuple_type (unsigned int argno, if (actual == error_mark_node) return NUM_TYPE_SUFFIXES; - /* A linear search should be OK here, since the code isn't hot and - the number of types is only small. */ - for (unsigned int size_i = 0; size_i < MAX_TUPLE_SIZE; ++size_i) - for (unsigned int suffix_i = 0; suffix_i < NUM_TYPE_SUFFIXES; ++suffix_i) - { - vector_type_index type_i = type_suffixes[suffix_i].vector_type; - tree type = acle_vector_types[size_i][type_i]; - if (type && matches_type_p (type, actual)) - { - if (size_i + 1 == num_vectors) - return type_suffix_index (suffix_i); - - if (num_vectors == 1) - error_at (location, "passing %qT to argument %d of %qE, which" - " expects a single SVE vector rather than a tuple", - actual, argno + 1, fndecl); - else if (size_i == 0 && type_i != VECTOR_TYPE_svbool_t) - /* num_vectors is always != 1, so the singular isn't needed. */ - error_n (location, num_vectors, "%qT%d%qE%d", - "passing single vector %qT to argument %d" - " of %qE, which expects a tuple of %d vectors", - actual, argno + 1, fndecl, num_vectors); - else - /* num_vectors is always != 1, so the singular isn't needed. */ - error_n (location, num_vectors, "%qT%d%qE%d", - "passing %qT to argument %d of %qE, which" - " expects a tuple of %d vectors", actual, argno + 1, - fndecl, num_vectors); - return NUM_TYPE_SUFFIXES; - } - } + if (auto sve_type = find_sve_type (actual)) + { + if (sve_type.num_vectors == num_vectors) + return sve_type.type; + + if (num_vectors == 1) + error_at (location, "passing %qT to argument %d of %qE, which" + " expects a single SVE vector rather than a tuple", + actual, argno + 1, fndecl); + else if (sve_type.num_vectors == 1 + && sve_type.type != TYPE_SUFFIX_b) + /* num_vectors is always != 1, so the singular isn't needed. 
*/ + error_n (location, num_vectors, "%qT%d%qE%d", + "passing single vector %qT to argument %d" + " of %qE, which expects a tuple of %d vectors", + actual, argno + 1, fndecl, num_vectors); + else + /* num_vectors is always != 1, so the singular isn't needed. */ + error_n (location, num_vectors, "%qT%d%qE%d", + "passing %qT to argument %d of %qE, which" + " expects a tuple of %d vectors", actual, argno + 1, + fndecl, num_vectors); + return NUM_TYPE_SUFFIXES; + } if (num_vectors == 1) error_at (location, "passing %qT to argument %d of %qE, which" diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index dde35e15259..0dbd06791b8 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -263,6 +263,40 @@ struct group_suffix_info unsigned int vectors_per_tuple; }; +/* Represents an SVE vector, predicate, tuple of vectors, or tuple of + predicates. There is also a representation of "no type"/"invalid type". */ +struct sve_type +{ + sve_type () = default; + sve_type (type_suffix_index type) : type (type), num_vectors (1) {} + sve_type (type_suffix_index type, unsigned int num_vectors) + : type (type), num_vectors (num_vectors) {} + + /* Return true if the type is valid. */ + explicit operator bool () const { return type != NUM_TYPE_SUFFIXES; } + + bool operator== (const sve_type &) const; + bool operator!= (const sve_type &x) const { return !operator== (x); } + + /* This is one of: + + - TYPE_SUFFIX_b for svbool_t-based types + - TYPE_SUFFIX_c for svcount_t-based types + - the type suffix of a data element for SVE data vectors and tuples + - NUM_TYPE_SUFFIXES for invalid types. */ + type_suffix_index type = NUM_TYPE_SUFFIXES; + + /* If the type is a tuple, this is the number of vectors in the tuple, + otherwise it is 1. 
*/ + unsigned int num_vectors = 1; +}; + +inline bool +sve_type::operator== (const sve_type &other) const +{ + return type == other.type && num_vectors == other.num_vectors; +} + /* Static information about a set of functions. */ struct function_group_info { @@ -413,12 +447,11 @@ public: function_resolver (location_t, const function_instance &, tree, vec &); - tree get_vector_type (type_suffix_index); const char *get_scalar_type_name (type_suffix_index); tree get_argument_type (unsigned int); bool scalar_argument_p (unsigned int); - tree report_no_such_form (type_suffix_index); + tree report_no_such_form (sve_type); tree lookup_form (mode_suffix_index, type_suffix_index = NUM_TYPE_SUFFIXES, type_suffix_index = NUM_TYPE_SUFFIXES, From patchwork Tue Dec 5 10:13:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872034 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxJD4s5kz1ySd for ; Tue, 5 Dec 2023 21:15:56 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A3DBA383F5FB for ; Tue, 5 Dec 2023 10:15:33 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com 
[217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 1A06038582B1 for ; Tue, 5 Dec 2023 10:13:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1A06038582B1 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1A06038582B1 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771224; cv=none; b=YIFdBQzJI6waOuuFao2mF0QbA7JTvHt/Obk5Ud8B48lPM2Pbo9itIuM3HcxWF/EhPaJyHTEN56T01Ms4Mn/dWr35VM8gE0OKpIAqiYRcuOYadoc9J5GT8KsqbyEbfsk82ClAvYWHBa5G8Fz21iSXBbBJ+Vmk+moE3dtQOuyipXo= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771224; c=relaxed/simple; bh=swLs+dVIKCObumNXRzN4AiKxNC6syT2atg5FDwWJOQ0=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=tchFH2NezVciYRy9oQlRZaHTQ+5SQuR41gZY8MIouKx4S1+MQtzYYvpALDzt0KuiIjuc1Ak9aPHtjMx2qZMbGDZPb33Euh9VRrs2oLZ4fqcGK25h6q/yRGCqXL5CnYc/tQGXiY1yVbzOUA1CdkzDmNi4dKruixM4lVO3iL+50CI= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5EFCA1480; Tue, 5 Dec 2023 02:14:24 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4BC463F5A1; Tue, 5 Dec 2023 02:13:37 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 06/25] aarch64: Generalise some SVE ACLE error messages Date: Tue, 5 Dec 2023 10:13:04 +0000 Message-Id: <20231205101323.1914247-7-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 
X-Spam-Status: No, score=-22.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org The current SVE ACLE function-resolution diagnostics assume that a function has a fixed choice between vectors or tuples of vectors. If an argument was not an SVE type at all, the error message said the function "expects an SVE vector type" or "expects an SVE tuple type". This patch generalises the error to cope with cases where an argument can be either a vector or a tuple. It also splits out the diagnostics for mismatched tuple sizes, so that they can be reused by later patches. gcc/ * config/aarch64/aarch64-sve-builtins.h (function_resolver::infer_sve_type): New member function. (function_resolver::report_incorrect_num_vectors): Likewise. * config/aarch64/aarch64-sve-builtins.cc (function_resolver::infer_sve_type): New function. (function_resolver::report_incorrect_num_vectors): New function, split out from... (function_resolver::infer_vector_or_tuple_type): ...here. Use infer_sve_type. gcc/testsuite/ * gcc.target/aarch64/sve/acle/general-c/*: Update expected error messages.
--- gcc/config/aarch64/aarch64-sve-builtins.cc | 87 ++++++++++++------- gcc/config/aarch64/aarch64-sve-builtins.h | 3 + .../aarch64/sve/acle/general-c/adr_index_1.c | 6 +- .../aarch64/sve/acle/general-c/adr_offset_1.c | 6 +- .../aarch64/sve/acle/general-c/binary_1.c | 2 +- .../sve/acle/general-c/binary_int_opt_n.c | 2 +- .../sve/acle/general-c/binary_lane_1.c | 4 +- .../sve/acle/general-c/binary_long_lane_1.c | 4 +- .../sve/acle/general-c/binary_long_opt_n_1.c | 2 +- .../aarch64/sve/acle/general-c/binary_n_1.c | 2 +- .../acle/general-c/binary_narrowb_opt_n_1.c | 2 +- .../acle/general-c/binary_narrowt_opt_n_1.c | 4 +- .../sve/acle/general-c/binary_opt_n_2.c | 2 +- .../sve/acle/general-c/binary_opt_n_3.c | 2 +- .../sve/acle/general-c/binary_rotate_1.c | 4 +- .../sve/acle/general-c/binary_to_uint_1.c | 4 +- .../sve/acle/general-c/binary_uint64_n_1.c | 2 +- .../acle/general-c/binary_uint64_opt_n_2.c | 2 +- .../sve/acle/general-c/binary_uint_1.c | 2 +- .../sve/acle/general-c/binary_uint_n_1.c | 2 +- .../sve/acle/general-c/binary_uint_opt_n_1.c | 2 +- .../sve/acle/general-c/binary_wide_1.c | 8 +- .../sve/acle/general-c/binary_wide_opt_n_1.c | 4 +- .../aarch64/sve/acle/general-c/clast_1.c | 4 +- .../aarch64/sve/acle/general-c/compare_1.c | 4 +- .../sve/acle/general-c/compare_opt_n_1.c | 2 +- .../sve/acle/general-c/compare_wide_opt_n_1.c | 2 +- .../sve/acle/general-c/count_vector_1.c | 2 +- .../aarch64/sve/acle/general-c/create_1.c | 4 +- .../aarch64/sve/acle/general-c/create_3.c | 4 +- .../aarch64/sve/acle/general-c/create_5.c | 4 +- .../aarch64/sve/acle/general-c/fold_left_1.c | 4 +- .../sve/acle/general-c/inc_dec_pred_1.c | 2 +- .../aarch64/sve/acle/general-c/mmla_1.c | 10 +-- .../acle/general-c/prefetch_gather_offset_2.c | 2 +- .../aarch64/sve/acle/general-c/reduction_1.c | 2 +- .../sve/acle/general-c/reduction_wide_1.c | 2 +- .../general-c/shift_right_imm_narrowb_1.c | 2 +- .../shift_right_imm_narrowb_to_uint_1.c | 2 +- .../general-c/shift_right_imm_narrowt_1.c | 4 +- 
.../shift_right_imm_narrowt_to_uint_1.c | 4 +- .../aarch64/sve/acle/general-c/store_1.c | 2 +- .../aarch64/sve/acle/general-c/store_2.c | 2 +- .../acle/general-c/store_scatter_offset_1.c | 4 +- .../sve/acle/general-c/ternary_bfloat16_1.c | 2 +- .../acle/general-c/ternary_bfloat16_lane_1.c | 2 +- .../general-c/ternary_bfloat16_lanex2_1.c | 2 +- .../acle/general-c/ternary_bfloat16_opt_n_1.c | 2 +- .../general-c/ternary_intq_uintq_lane_1.c | 6 +- .../general-c/ternary_intq_uintq_opt_n_1.c | 4 +- .../sve/acle/general-c/ternary_lane_1.c | 6 +- .../acle/general-c/ternary_lane_rotate_1.c | 6 +- .../sve/acle/general-c/ternary_long_lane_1.c | 6 +- .../sve/acle/general-c/ternary_long_opt_n_1.c | 4 +- .../sve/acle/general-c/ternary_opt_n_1.c | 4 +- .../sve/acle/general-c/ternary_qq_lane_1.c | 6 +- .../acle/general-c/ternary_qq_lane_rotate_1.c | 6 +- .../sve/acle/general-c/ternary_qq_opt_n_2.c | 4 +- .../sve/acle/general-c/ternary_qq_rotate_1.c | 6 +- .../sve/acle/general-c/ternary_rotate_1.c | 6 +- .../general-c/ternary_shift_right_imm_1.c | 4 +- .../sve/acle/general-c/ternary_uint_1.c | 6 +- .../sve/acle/general-c/ternary_uintq_intq_1.c | 6 +- .../general-c/ternary_uintq_intq_lane_1.c | 6 +- .../general-c/ternary_uintq_intq_opt_n_1.c | 4 +- .../aarch64/sve/acle/general-c/tmad_1.c | 4 +- .../aarch64/sve/acle/general-c/unary_1.c | 2 +- .../aarch64/sve/acle/general-c/unary_2.c | 2 +- .../sve/acle/general-c/unary_convert_1.c | 2 +- .../sve/acle/general-c/unary_convert_2.c | 2 +- .../acle/general-c/unary_convert_narrowt_1.c | 2 +- .../sve/acle/general-c/unary_narrowb_1.c | 2 +- .../acle/general-c/unary_narrowb_to_uint_1.c | 2 +- .../sve/acle/general-c/unary_narrowt_1.c | 4 +- .../acle/general-c/unary_narrowt_to_uint_1.c | 4 +- .../sve/acle/general-c/unary_to_int_1.c | 2 +- .../sve/acle/general-c/unary_to_uint_1.c | 2 +- .../sve/acle/general-c/unary_to_uint_2.c | 2 +- .../sve/acle/general-c/unary_to_uint_3.c | 2 +- .../aarch64/sve/acle/general-c/unary_uint_1.c | 2 +- 
 .../sve/acle/general-c/unary_widen_1.c | 4 +-
 81 files changed, 195 insertions(+), 169 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 058b1defa9e..1ecd8fd5db9 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -1228,6 +1228,32 @@ function_resolver::scalar_argument_p (unsigned int i)
           || SCALAR_FLOAT_TYPE_P (type));
 }
 
+/* Report that argument ARGNO was expected to have NUM_VECTORS vectors.
+   TYPE is the type that ARGNO actually has.  */
+void
+function_resolver::report_incorrect_num_vectors (unsigned int argno,
+                                                 sve_type type,
+                                                 unsigned int num_vectors)
+{
+  if (num_vectors == 1)
+    error_at (location, "passing %qT to argument %d of %qE, which"
+              " expects a single SVE vector rather than a tuple",
+              get_vector_type (type), argno + 1, fndecl);
+  else if (type.num_vectors == 1
+           && type.type != TYPE_SUFFIX_b)
+    /* num_vectors is always != 1, so the singular isn't needed.  */
+    error_n (location, num_vectors, "%qT%d%qE%d",
+             "passing single vector %qT to argument %d"
+             " of %qE, which expects a tuple of %d vectors",
+             get_vector_type (type), argno + 1, fndecl, num_vectors);
+  else
+    /* num_vectors is always != 1, so the singular isn't needed.  */
+    error_n (location, num_vectors, "%qT%d%qE%d",
+             "passing %qT to argument %d of %qE, which"
+             " expects a tuple of %d vectors", get_vector_type (type),
+             argno + 1, fndecl, num_vectors);
+}
+
 /* Report that the function has no form that takes type TYPE.
    Return error_mark_node.  */
 tree
@@ -1372,6 +1398,30 @@ find_sve_type (const_tree type)
   return {};
 }
 
+/* Require argument ARGNO to be an SVE type (i.e. something that can be
+   represented by sve_type).  Return the (valid) type if it is, otherwise
+   report an error and return an invalid type.  */
+sve_type
+function_resolver::infer_sve_type (unsigned int argno)
+{
+  tree actual = get_argument_type (argno);
+  if (actual == error_mark_node)
+    return {};
+
+  if (sve_type type = find_sve_type (actual))
+    return type;
+
+  if (scalar_argument_p (argno))
+    error_at (location, "passing %qT to argument %d of %qE, which"
+              " expects an SVE type rather than a scalar type",
+              actual, argno + 1, fndecl);
+  else
+    error_at (location, "passing %qT to argument %d of %qE, which"
+              " expects an SVE type",
+              actual, argno + 1, fndecl);
+  return {};
+}
+
 /* Require argument ARGNO to be a single vector or a tuple of NUM_VECTORS
    vectors; NUM_VECTORS is 1 for the former.  Return the associated type
    suffix on success, using TYPE_SUFFIX_b for predicates.  Report an error
@@ -1380,41 +1430,14 @@ type_suffix_index
 function_resolver::infer_vector_or_tuple_type (unsigned int argno,
                                                unsigned int num_vectors)
 {
-  tree actual = get_argument_type (argno);
-  if (actual == error_mark_node)
+  auto type = infer_sve_type (argno);
+  if (!type)
     return NUM_TYPE_SUFFIXES;
 
-  if (auto sve_type = find_sve_type (actual))
-    {
-      if (sve_type.num_vectors == num_vectors)
-        return sve_type.type;
-
-      if (num_vectors == 1)
-        error_at (location, "passing %qT to argument %d of %qE, which"
-                  " expects a single SVE vector rather than a tuple",
-                  actual, argno + 1, fndecl);
-      else if (sve_type.num_vectors == 1
-               && sve_type.type != TYPE_SUFFIX_b)
-        /* num_vectors is always != 1, so the singular isn't needed.  */
-        error_n (location, num_vectors, "%qT%d%qE%d",
-                 "passing single vector %qT to argument %d"
-                 " of %qE, which expects a tuple of %d vectors",
-                 actual, argno + 1, fndecl, num_vectors);
-      else
-        /* num_vectors is always != 1, so the singular isn't needed.  */
-        error_n (location, num_vectors, "%qT%d%qE%d",
-                 "passing %qT to argument %d of %qE, which"
-                 " expects a tuple of %d vectors", actual, argno + 1,
-                 fndecl, num_vectors);
-      return NUM_TYPE_SUFFIXES;
-    }
+  if (type.num_vectors == num_vectors)
+    return type.type;
 
-  if (num_vectors == 1)
-    error_at (location, "passing %qT to argument %d of %qE, which"
-              " expects an SVE vector type", actual, argno + 1, fndecl);
-  else
-    error_at (location, "passing %qT to argument %d of %qE, which"
-              " expects an SVE tuple type", actual, argno + 1, fndecl);
+  report_incorrect_num_vectors (argno, type, num_vectors);
   return NUM_TYPE_SUFFIXES;
 }
 
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
index 0dbd06791b8..bba3a87f7bc 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -451,6 +451,8 @@ public:
   tree get_argument_type (unsigned int);
   bool scalar_argument_p (unsigned int);
 
+  void report_incorrect_num_vectors (unsigned int, sve_type, unsigned int);
+
   tree report_no_such_form (sve_type);
   tree lookup_form (mode_suffix_index,
                     type_suffix_index = NUM_TYPE_SUFFIXES,
@@ -463,6 +465,7 @@ public:
 
   type_suffix_index infer_integer_scalar_type (unsigned int);
   type_suffix_index infer_pointer_type (unsigned int, bool = false);
+  sve_type infer_sve_type (unsigned int);
   type_suffix_index infer_vector_or_tuple_type (unsigned int, unsigned int);
   type_suffix_index infer_vector_type (unsigned int);
   type_suffix_index infer_integer_vector_type (unsigned int);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/adr_index_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/adr_index_1.c
index 714265ed1f1..a17e99f5d6e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/adr_index_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/adr_index_1.c
@@ -10,14 +10,14 @@ f1 (svbool_t pg, uint32_t *u32_ptr, svuint8_t u8, svuint16_t u16,
 {
   svadrh_index (u32); /* { dg-error
{too few arguments to function 'svadrh_index'} } */ svadrh_index (u32, u32, u32); /* { dg-error {too many arguments to function 'svadrh_index'} } */ - svadrh_index (u32_ptr, s32); /* { dg-error {passing '[^']*\*'[^\n]* to argument 1 of 'svadrh_index', which expects an SVE vector type} } */ - svadrh_index (0, s32); /* { dg-error {passing 'int' to argument 1 of 'svadrh_index', which expects an SVE vector type} } */ + svadrh_index (u32_ptr, s32); /* { dg-error {passing '[^']*\*'[^\n]* to argument 1 of 'svadrh_index', which expects an SVE type} } */ + svadrh_index (0, s32); /* { dg-error {passing 'int' to argument 1 of 'svadrh_index', which expects an SVE type rather than a scalar} } */ svadrh_index (u16, u16); /* { dg-error {passing 'svuint16_t' to argument 1 of 'svadrh_index', which expects 'svuint32_t' or 'svuint64_t'} } */ svadrh_index (s32, s32); /* { dg-error {passing 'svint32_t' to argument 1 of 'svadrh_index', which expects 'svuint32_t' or 'svuint64_t'} } */ svadrh_index (f32, s32); /* { dg-error {passing 'svfloat32_t' to argument 1 of 'svadrh_index', which expects 'svuint32_t' or 'svuint64_t'} } */ svadrh_index (pg, s32); /* { dg-error {passing 'svbool_t' to argument 1 of 'svadrh_index', which expects 'svuint32_t' or 'svuint64_t'} } */ - svadrh_index (u32, 0); /* { dg-error {passing 'int' to argument 2 of 'svadrh_index', which expects an SVE vector type} } */ + svadrh_index (u32, 0); /* { dg-error {passing 'int' to argument 2 of 'svadrh_index', which expects an SVE type rather than a scalar} } */ svadrh_index (u32, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svadrh_index', which expects a vector of 32-bit or 64-bit integers} } */ svadrh_index (u32, u16); /* { dg-error {passing 'svuint16_t' to argument 2 of 'svadrh_index', which expects a vector of 32-bit or 64-bit integers} } */ svadrh_index (u32, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svadrh_index', which expects a vector of integers} } */ diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/adr_offset_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/adr_offset_1.c index 528d7ac51ef..627ae8ac5ae 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/adr_offset_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/adr_offset_1.c @@ -10,14 +10,14 @@ f1 (svbool_t pg, uint32_t *u32_ptr, svuint8_t u8, svuint16_t u16, { svadrb_offset (u32); /* { dg-error {too few arguments to function 'svadrb_offset'} } */ svadrb_offset (u32, u32, u32); /* { dg-error {too many arguments to function 'svadrb_offset'} } */ - svadrb_offset (u32_ptr, s32); /* { dg-error {passing '[^']*\*'[^\n]* to argument 1 of 'svadrb_offset', which expects an SVE vector type} } */ - svadrb_offset (0, s32); /* { dg-error {passing 'int' to argument 1 of 'svadrb_offset', which expects an SVE vector type} } */ + svadrb_offset (u32_ptr, s32); /* { dg-error {passing '[^']*\*'[^\n]* to argument 1 of 'svadrb_offset', which expects an SVE type} } */ + svadrb_offset (0, s32); /* { dg-error {passing 'int' to argument 1 of 'svadrb_offset', which expects an SVE type rather than a scalar} } */ svadrb_offset (u16, u16); /* { dg-error {passing 'svuint16_t' to argument 1 of 'svadrb_offset', which expects 'svuint32_t' or 'svuint64_t'} } */ svadrb_offset (s32, s32); /* { dg-error {passing 'svint32_t' to argument 1 of 'svadrb_offset', which expects 'svuint32_t' or 'svuint64_t'} } */ svadrb_offset (f32, s32); /* { dg-error {passing 'svfloat32_t' to argument 1 of 'svadrb_offset', which expects 'svuint32_t' or 'svuint64_t'} } */ svadrb_offset (pg, s32); /* { dg-error {passing 'svbool_t' to argument 1 of 'svadrb_offset', which expects 'svuint32_t' or 'svuint64_t'} } */ - svadrb_offset (u32, 0); /* { dg-error {passing 'int' to argument 2 of 'svadrb_offset', which expects an SVE vector type} } */ + svadrb_offset (u32, 0); /* { dg-error {passing 'int' to argument 2 of 'svadrb_offset', which expects an SVE type rather than a scalar} } */ 
svadrb_offset (u32, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svadrb_offset', which expects a vector of 32-bit or 64-bit integers} } */ svadrb_offset (u32, u16); /* { dg-error {passing 'svuint16_t' to argument 2 of 'svadrb_offset', which expects a vector of 32-bit or 64-bit integers} } */ svadrb_offset (u32, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svadrb_offset', which expects a vector of integers} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_1.c index 8ce89fa1053..4343146de05 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_1.c @@ -10,5 +10,5 @@ f1 (svbool_t pg, svuint8_t u8, svint16_t s16) svzip1 (pg, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svzip1', but previous arguments had type 'svbool_t'} } */ svzip1 (u8, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svzip1', but previous arguments had type 'svuint8_t'} } */ svzip1 (u8, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svzip1', but previous arguments had type 'svuint8_t'} } */ - svzip1 (u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svzip1', which expects an SVE vector type} } */ + svzip1 (u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svzip1', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_int_opt_n.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_int_opt_n.c index 965e9a13cce..9902379f649 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_int_opt_n.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_int_opt_n.c @@ -11,7 +11,7 @@ f1 (svbool_t pg, svfloat16_t f16, svint16_t s16, svuint16_t u16, svscale_x (s32, f16, s32); /* { dg-error {passing 'svint32_t' to argument 1 of 'svscale_x', which expects 'svbool_t'} 
} */ svscale_x (1, f16, s32); /* { dg-error {passing 'int' to argument 1 of 'svscale_x', which expects 'svbool_t'} } */ svscale_x (pg, pg, s16); /* { dg-error {'svscale_x' has no form that takes 'svbool_t' arguments} } */ - svscale_x (pg, 1, s16); /* { dg-error {passing 'int' to argument 2 of 'svscale_x', which expects an SVE vector type} } */ + svscale_x (pg, 1, s16); /* { dg-error {passing 'int' to argument 2 of 'svscale_x', which expects an SVE type rather than a scalar} } */ svscale_x (pg, f16, s16); svscale_x (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svscale_x', which expects a vector of signed integers} } */ svscale_x (pg, f16, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svscale_x', which expects a vector of signed integers} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_lane_1.c index 3913ff63d4f..10b6b7e81e7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_lane_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_lane_1.c @@ -10,8 +10,8 @@ f1 (svbool_t pg, svfloat16_t f16, svfloat32_t f32, svfloat64_t f64, svmul_lane (f32, f32, 0, 0); /* { dg-error {too many arguments to function 'svmul_lane'} } */ svmul_lane (pg, pg, 0); /* { dg-error {'svmul_lane' has no form that takes 'svbool_t' arguments} } */ svmul_lane (s32, s32, 0); /* { dg-error {ACLE function 'svmul_lane_s32' requires ISA extension 'sve2'} "" { xfail aarch64_sve2 } } */ - svmul_lane (1, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmul_lane', which expects an SVE vector type} } */ - svmul_lane (f32, 1, 0); /* { dg-error {passing 'int' to argument 2 of 'svmul_lane', which expects an SVE vector type} } */ + svmul_lane (1, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmul_lane', which expects an SVE type rather than a scalar} } */ + svmul_lane (f32, 1, 0); /* { dg-error {passing 'int' to argument 
2 of 'svmul_lane', which expects an SVE type rather than a scalar} } */ svmul_lane (f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svmul_lane', but previous arguments had type 'svfloat32_t'} } */ svmul_lane (f32, f32, s32); /* { dg-error {argument 3 of 'svmul_lane' must be an integer constant expression} } */ svmul_lane (f32, f32, i); /* { dg-error {argument 3 of 'svmul_lane' must be an integer constant expression} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_lane_1.c index bfe78088b07..805863f76bc 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_lane_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_lane_1.c @@ -19,8 +19,8 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16, svmullb_lane (f16, f16, 0); /* { dg-error {'svmullb_lane' has no form that takes 'svfloat16_t' arguments} } */ svmullb_lane (f32, f32, 0); /* { dg-error {'svmullb_lane' has no form that takes 'svfloat32_t' arguments} } */ svmullb_lane (f64, f64, 0); /* { dg-error {'svmullb_lane' has no form that takes 'svfloat64_t' arguments} } */ - svmullb_lane (1, u32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmullb_lane', which expects an SVE vector type} } */ - svmullb_lane (u32, 1, 0); /* { dg-error {passing 'int' to argument 2 of 'svmullb_lane', which expects an SVE vector type} } */ + svmullb_lane (1, u32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmullb_lane', which expects an SVE type rather than a scalar} } */ + svmullb_lane (u32, 1, 0); /* { dg-error {passing 'int' to argument 2 of 'svmullb_lane', which expects an SVE type rather than a scalar} } */ svmullb_lane (u32, s32, 0); /* { dg-error {passing 'svint32_t' to argument 2 of 'svmullb_lane', but previous arguments had type 'svuint32_t'} } */ svmullb_lane (u32, u32, s32); /* { dg-error {argument 3 of 'svmullb_lane' must 
be an integer constant expression} } */ svmullb_lane (u32, u32, i); /* { dg-error {argument 3 of 'svmullb_lane' must be an integer constant expression} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_opt_n_1.c index 27893c6fbe3..ee704eeaefb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_opt_n_1.c @@ -23,7 +23,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svaddlb (u64, u64); /* { dg-error {'svaddlb' has no form that takes 'svuint64_t' arguments} } */ svaddlb (s64, s64); /* { dg-error {'svaddlb' has no form that takes 'svint64_t' arguments} } */ svaddlb (f16, f16); /* { dg-error {'svaddlb' has no form that takes 'svfloat16_t' arguments} } */ - svaddlb (1, u8); /* { dg-error {passing 'int' to argument 1 of 'svaddlb', which expects an SVE vector type} } */ + svaddlb (1, u8); /* { dg-error {passing 'int' to argument 1 of 'svaddlb', which expects an SVE type rather than a scalar} } */ svaddlb (u8, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svaddlb', but previous arguments had type 'svuint8_t'} } */ svaddlb (u8, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svaddlb', but previous arguments had type 'svuint8_t'} } */ svaddlb (u8, u16); /* { dg-error {passing 'svuint16_t' to argument 2 of 'svaddlb', but previous arguments had type 'svuint8_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_n_1.c index 0c69e66a15a..ff4f0ff756f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_n_1.c @@ -7,7 +7,7 @@ f1 (svbool_t pg, svuint8_t u8, svfloat16_t f16, int i, float f) { svinsr (u8); /* { dg-error {too few arguments to function 'svinsr'} } */ svinsr 
(u8, 0, 0); /* { dg-error {too many arguments to function 'svinsr'} } */ - svinsr (0, 0); /* { dg-error {passing 'int' to argument 1 of 'svinsr', which expects an SVE vector type} } */ + svinsr (0, 0); /* { dg-error {passing 'int' to argument 1 of 'svinsr', which expects an SVE type rather than a scalar} } */ svinsr (u8, 0); svinsr (u8, -1); svinsr (u8, i); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowb_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowb_opt_n_1.c index 920cbd1b0c3..8ca549ba93f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowb_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowb_opt_n_1.c @@ -23,7 +23,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svaddhnb (u64, u64); svaddhnb (s64, s64); svaddhnb (f32, f32); /* { dg-error {'svaddhnb' has no form that takes 'svfloat32_t' arguments} } */ - svaddhnb (1, u16); /* { dg-error {passing 'int' to argument 1 of 'svaddhnb', which expects an SVE vector type} } */ + svaddhnb (1, u16); /* { dg-error {passing 'int' to argument 1 of 'svaddhnb', which expects an SVE type rather than a scalar} } */ svaddhnb (u16, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svaddhnb', but previous arguments had type 'svuint16_t'} } */ svaddhnb (u16, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svaddhnb', but previous arguments had type 'svuint16_t'} } */ svaddhnb (u16, u32); /* { dg-error {passing 'svuint32_t' to argument 2 of 'svaddhnb', but previous arguments had type 'svuint16_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowt_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowt_opt_n_1.c index eb70d058ec7..2b537965bc6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowt_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowt_opt_n_1.c @@ -26,8 +26,8 @@ f1 (svbool_t 
pg, svint8_t s8, svuint8_t u8, svaddhnt (u32, u64, u64); svaddhnt (s32, s64, s64); svaddhnt (f16, f32, f32); /* { dg-error {'svaddhnt' has no form that takes 'svfloat32_t' arguments} } */ - svaddhnt (1, u16, u16); /* { dg-error {passing 'int' to argument 1 of 'svaddhnt', which expects an SVE vector type} } */ - svaddhnt (u8, 1, u16); /* { dg-error {passing 'int' to argument 2 of 'svaddhnt', which expects an SVE vector type} } */ + svaddhnt (1, u16, u16); /* { dg-error {passing 'int' to argument 1 of 'svaddhnt', which expects an SVE type rather than a scalar} } */ + svaddhnt (u8, 1, u16); /* { dg-error {passing 'int' to argument 2 of 'svaddhnt', which expects an SVE type rather than a scalar} } */ svaddhnt (u8, u16, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svaddhnt', but previous arguments had type 'svuint16_t'} } */ svaddhnt (u8, u16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svaddhnt', but previous arguments had type 'svuint16_t'} } */ svaddhnt (u8, u16, u32); /* { dg-error {passing 'svuint32_t' to argument 3 of 'svaddhnt', but previous arguments had type 'svuint16_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_2.c index 9fa83ca99c2..a151f90d170 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_2.c @@ -10,7 +10,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svadd_x (pg, u8, u8, u8); /* { dg-error {too many arguments to function 'svadd_x'} } */ svadd_x (u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svadd_x', which expects 'svbool_t'} } */ svadd_x (pg, pg, pg); /* { dg-error {'svadd_x' has no form that takes 'svbool_t' arguments} } */ - svadd_x (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svadd_x', which expects an SVE vector type} } */ + svadd_x (pg, 1, u8); /* { dg-error {passing 'int' 
to argument 2 of 'svadd_x', which expects an SVE type rather than a scalar} } */ svadd_x (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svadd_x', but previous arguments had type 'svuint8_t'} } */ svadd_x (pg, u8, u8); svadd_x (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svadd_x', but previous arguments had type 'svuint8_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_3.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_3.c index 4d0b253e352..70ec9c58518 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_3.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_3.c @@ -10,7 +10,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svand_z (pg, u8, u8, u8); /* { dg-error {too many arguments to function 'svand_z'} } */ svand_z (u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svand_z', which expects 'svbool_t'} } */ svand_z (pg, pg, pg); - svand_z (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svand_z', which expects an SVE vector type} } */ + svand_z (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svand_z', which expects an SVE type rather than a scalar} } */ svand_z (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svand_z', but previous arguments had type 'svuint8_t'} } */ svand_z (pg, u8, u8); svand_z (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svand_z', but previous arguments had type 'svuint8_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_rotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_rotate_1.c index 8ffe91bce0d..7669e4a0261 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_rotate_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_rotate_1.c @@ -10,8 +10,8 @@ f1 (svbool_t pg, svfloat32_t f32, svfloat64_t f64, svint32_t s32, int i) svcadd_x (f32, 
f32, f32, 90); /* { dg-error {passing 'svfloat32_t' to argument 1 of 'svcadd_x', which expects 'svbool_t'} } */ svcadd_x (pg, pg, pg, 90); /* { dg-error {'svcadd_x' has no form that takes 'svbool_t' arguments} } */ svcadd_x (pg, s32, s32, 90); /* { dg-error {'svcadd_x' has no form that takes 'svint32_t' arguments} } */ - svcadd_x (pg, 1, f32, 90); /* { dg-error {passing 'int' to argument 2 of 'svcadd_x', which expects an SVE vector type} } */ - svcadd_x (pg, f32, 1, 90); /* { dg-error {passing 'int' to argument 3 of 'svcadd_x', which expects an SVE vector type} } */ + svcadd_x (pg, 1, f32, 90); /* { dg-error {passing 'int' to argument 2 of 'svcadd_x', which expects an SVE type rather than a scalar} } */ + svcadd_x (pg, f32, 1, 90); /* { dg-error {passing 'int' to argument 3 of 'svcadd_x', which expects an SVE type rather than a scalar} } */ svcadd_x (pg, f32, f64, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcadd_x', but previous arguments had type 'svfloat32_t'} } */ svcadd_x (pg, f32, f32, s32); /* { dg-error {argument 4 of 'svcadd_x' must be an integer constant expression} } */ svcadd_x (pg, f32, f32, i); /* { dg-error {argument 4 of 'svcadd_x' must be an integer constant expression} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_to_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_to_uint_1.c index 213defc6606..154662487e3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_to_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_to_uint_1.c @@ -11,9 +11,9 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32) svhistcnt_z (pg, s32, s32, 0); /* { dg-error {too many arguments to function 'svhistcnt_z'} } */ svhistcnt_z (0, s32, s32); /* { dg-error {passing 'int' to argument 1 of 'svhistcnt_z', which expects 'svbool_t'} } */ svhistcnt_z (s32, s32, s32); /* { dg-error {passing 'svint32_t' to argument 1 of 'svhistcnt_z', which expects 'svbool_t'} } */ - svhistcnt_z 
(pg, 0, s32); /* { dg-error {passing 'int' to argument 2 of 'svhistcnt_z', which expects an SVE vector type} } */ + svhistcnt_z (pg, 0, s32); /* { dg-error {passing 'int' to argument 2 of 'svhistcnt_z', which expects an SVE type rather than a scalar} } */ svhistcnt_z (pg, pg, s32); /* { dg-error {passing 'svint32_t' to argument 3 of 'svhistcnt_z', but previous arguments had type 'svbool_t'} } */ svhistcnt_z (pg, s32, u32); /* { dg-error {passing 'svuint32_t' to argument 3 of 'svhistcnt_z', but previous arguments had type 'svint32_t'} } */ - svhistcnt_z (pg, s32, 0); /* { dg-error {passing 'int' to argument 3 of 'svhistcnt_z', which expects an SVE vector type} } */ + svhistcnt_z (pg, s32, 0); /* { dg-error {passing 'int' to argument 3 of 'svhistcnt_z', which expects an SVE type rather than a scalar} } */ svhistcnt_z (pg, pg, pg); /* { dg-error {'svhistcnt_z' has no form that takes 'svbool_t' arguments} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint64_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint64_n_1.c index c8ca5f7464b..207552a3ba1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint64_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint64_n_1.c @@ -7,7 +7,7 @@ f1 (svbool_t pg, svuint8_t u8, int i, float f) { svdupq_lane (u8); /* { dg-error {too few arguments to function 'svdupq_lane'} } */ svdupq_lane (u8, 0, 0); /* { dg-error {too many arguments to function 'svdupq_lane'} } */ - svdupq_lane (0, 0); /* { dg-error {passing 'int' to argument 1 of 'svdupq_lane', which expects an SVE vector type} } */ + svdupq_lane (0, 0); /* { dg-error {passing 'int' to argument 1 of 'svdupq_lane', which expects an SVE type rather than a scalar} } */ svdupq_lane (u8, 0); svdupq_lane (u8, -1); svdupq_lane (u8, i); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint64_opt_n_2.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint64_opt_n_2.c index be217394f9f..c661a66f33e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint64_opt_n_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint64_opt_n_2.c @@ -8,7 +8,7 @@ f1 (svbool_t pg, svuint8_t u8, svuint64_t u64) svlsl_wide_x (pg, u8); /* { dg-error {too few arguments to function 'svlsl_wide_x'} } */ svlsl_wide_x (pg, u8, u8, u8); /* { dg-error {too many arguments to function 'svlsl_wide_x'} } */ svlsl_wide_x (u8, u8, u64); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svlsl_wide_x', which expects 'svbool_t'} } */ - svlsl_wide_x (pg, 1, u64); /* { dg-error {passing 'int' to argument 2 of 'svlsl_wide_x', which expects an SVE vector type} } */ + svlsl_wide_x (pg, 1, u64); /* { dg-error {passing 'int' to argument 2 of 'svlsl_wide_x', which expects an SVE type rather than a scalar} } */ svlsl_wide_x (pg, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svlsl_wide_x', which expects 'svuint64_t'} } */ svlsl_wide_x (pg, u64, u64); /* { dg-error {'svlsl_wide_x' has no form that takes 'svuint64_t' arguments} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_1.c index 8f86c50b681..8493d5d68e3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_1.c @@ -11,7 +11,7 @@ f1 (svbool_t pg, svuint8_t u8, svint8_t s8, svuint16_t u16, svint16_t s16, svtbl (pg, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svtbl', which expects a vector of unsigned integers} } */ svtbl (pg, u8); /* { dg-error {'svtbl' has no form that takes 'svbool_t' arguments} } */ - svtbl (u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svtbl', which expects an SVE vector type} } */ + svtbl (u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svtbl', 
which expects an SVE type rather than a scalar} } */ svtbl (u8, u8); svtbl (u8, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svtbl', which expects a vector of unsigned integers} } */ svtbl (u8, u16); /* { dg-error {arguments 1 and 2 of 'svtbl' must have the same element size, but the values passed here have type 'svuint8_t' and 'svuint16_t' respectively} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_n_1.c index 36a902e69c5..d74cb46f74f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_n_1.c @@ -7,7 +7,7 @@ f1 (svbool_t pg, svuint8_t u8, int i, float f) { svdup_lane (u8); /* { dg-error {too few arguments to function 'svdup_lane'} } */ svdup_lane (u8, 0, 0); /* { dg-error {too many arguments to function 'svdup_lane'} } */ - svdup_lane (0, 0); /* { dg-error {passing 'int' to argument 1 of 'svdup_lane', which expects an SVE vector type} } */ + svdup_lane (0, 0); /* { dg-error {passing 'int' to argument 1 of 'svdup_lane', which expects an SVE type rather than a scalar} } */ svdup_lane (u8, 0); svdup_lane (u8, -1); svdup_lane (u8, i); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_opt_n_1.c index b162ab4050e..f44d7a9fa07 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_uint_opt_n_1.c @@ -11,7 +11,7 @@ f1 (svbool_t pg, svfloat16_t f16, svint16_t s16, svuint16_t u16, svlsl_x (s32, s32, u32); /* { dg-error {passing 'svint32_t' to argument 1 of 'svlsl_x', which expects 'svbool_t'} } */ svlsl_x (1, s32, u32); /* { dg-error {passing 'int' to argument 1 of 'svlsl_x', which expects 'svbool_t'} } */ svlsl_x (pg, pg, u16); /* { dg-error {'svlsl_x' has no 
form that takes 'svbool_t' arguments} } */ - svlsl_x (pg, 1, s16); /* { dg-error {passing 'int' to argument 2 of 'svlsl_x', which expects an SVE vector type} } */ + svlsl_x (pg, 1, s16); /* { dg-error {passing 'int' to argument 2 of 'svlsl_x', which expects an SVE type rather than a scalar} } */ svlsl_x (pg, s16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svlsl_x', which expects a vector of unsigned integers} } */ svlsl_x (pg, s16, u16); svlsl_x (pg, s16, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svlsl_x', which expects a vector of unsigned integers} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_wide_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_wide_1.c index f58ab75d792..ba38361ab0e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_wide_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_wide_1.c @@ -30,8 +30,8 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svadalp_m (pg, s16, s8); svadalp_m (pg, f32, f16); /* { dg-error {'svadalp_m' has no form that takes 'svfloat32_t' arguments} } */ svadalp_m (pg, f16, f32); /* { dg-error {'svadalp_m' has no form that takes 'svfloat16_t' arguments} } */ - svadalp_m (pg, 0, u32); /* { dg-error {passing 'int' to argument 2 of 'svadalp_m', which expects an SVE vector type} } */ - svadalp_m (pg, 0, u64); /* { dg-error {passing 'int' to argument 2 of 'svadalp_m', which expects an SVE vector type} } */ - svadalp_m (pg, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svadalp_m', which expects an SVE vector type} } */ - svadalp_m (pg, u16, 0); /* { dg-error {passing 'int' to argument 3 of 'svadalp_m', which expects an SVE vector type} } */ + svadalp_m (pg, 0, u32); /* { dg-error {passing 'int' to argument 2 of 'svadalp_m', which expects an SVE type rather than a scalar} } */ + svadalp_m (pg, 0, u64); /* { dg-error {passing 'int' to argument 2 of 'svadalp_m', which expects an SVE type rather than a scalar} 
} */ + svadalp_m (pg, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svadalp_m', which expects an SVE type rather than a scalar} } */ + svadalp_m (pg, u16, 0); /* { dg-error {passing 'int' to argument 3 of 'svadalp_m', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_wide_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_wide_opt_n_1.c index 5a58211a09a..fd27d855916 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_wide_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_wide_opt_n_1.c @@ -27,8 +27,8 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svaddwb (s16, s8); svaddwb (f32, f16); /* { dg-error {'svaddwb' has no form that takes 'svfloat32_t' arguments} } */ svaddwb (f16, f32); /* { dg-error {'svaddwb' has no form that takes 'svfloat16_t' arguments} } */ - svaddwb (0, u32); /* { dg-error {passing 'int' to argument 1 of 'svaddwb', which expects an SVE vector type} } */ - svaddwb (0, u64); /* { dg-error {passing 'int' to argument 1 of 'svaddwb', which expects an SVE vector type} } */ + svaddwb (0, u32); /* { dg-error {passing 'int' to argument 1 of 'svaddwb', which expects an SVE type rather than a scalar} } */ + svaddwb (0, u64); /* { dg-error {passing 'int' to argument 1 of 'svaddwb', which expects an SVE type rather than a scalar} } */ svaddwb (u8, 0); /* { dg-error {'svaddwb' has no form that takes 'svuint8_t' arguments} } */ svaddwb (u16, 0); svaddwb (u32, 0); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/clast_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/clast_1.c index cb9ac946c0c..ba1b2520f7a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/clast_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/clast_1.c @@ -6,10 +6,10 @@ test (svbool_t pg, svint32_t s32, svint64_t s64, int i) svclasta (pg, 1); /* { dg-error {too few arguments to function 
'svclasta'} } */ svclasta (pg, 1, s32, 1); /* { dg-error {too many arguments to function 'svclasta'} } */ svclasta (1, 1, s32); /* { dg-error {passing 'int' to argument 1 of 'svclasta', which expects 'svbool_t'} } */ - svclasta (pg, 1, 1); /* { dg-error {passing 'int' to argument 3 of 'svclasta', which expects an SVE vector type} } */ + svclasta (pg, 1, 1); /* { dg-error {passing 'int' to argument 3 of 'svclasta', which expects an SVE type rather than a scalar} } */ svclasta (pg, 1, pg); /* { dg-error {'svclasta' has no form that takes 'svbool_t' arguments} } */ svclasta (pg, i, s32); - svclasta (pg, s32, 1); /* { dg-error {passing 'int' to argument 3 of 'svclasta', which expects an SVE vector type} } */ + svclasta (pg, s32, 1); /* { dg-error {passing 'int' to argument 3 of 'svclasta', which expects an SVE type rather than a scalar} } */ svclasta (pg, s32, s64); /* { dg-error {passing 'svint64_t' to argument 3 of 'svclasta', but previous arguments had type 'svint32_t'} } */ svclasta (pg, pg, pg); /* { dg-error {'svclasta' has no form that takes 'svbool_t' arguments} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_1.c index 12511a85beb..5474124cc46 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_1.c @@ -12,14 +12,14 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svmatch (pg, u8, u8, u8); /* { dg-error {too many arguments to function 'svmatch'} } */ svmatch (u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svmatch', which expects 'svbool_t'} } */ svmatch (pg, pg, pg); /* { dg-error {'svmatch' has no form that takes 'svbool_t' arguments} } */ - svmatch (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svmatch', which expects an SVE vector type} } */ + svmatch (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svmatch', which expects an SVE 
type rather than a scalar} } */ svmatch (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ svmatch (pg, u8, u8); svmatch (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ svmatch (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ svmatch (pg, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ svmatch (pg, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ - svmatch (pg, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svmatch', which expects an SVE vector type} } */ + svmatch (pg, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svmatch', which expects an SVE type rather than a scalar} } */ svmatch (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmatch', but previous arguments had type 'svfloat16_t'} } */ svmatch (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmatch', but previous arguments had type 'svfloat16_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_opt_n_1.c index 71c8e86d5da..6faa73972f5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_opt_n_1.c @@ -10,7 +10,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svcmpeq (pg, u8, u8, u8); /* { dg-error {too many arguments to function 'svcmpeq'} } */ svcmpeq (u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svcmpeq', which expects 'svbool_t'} } */ svcmpeq (pg, pg, pg); /* { dg-error {'svcmpeq' has no form that takes 'svbool_t' arguments} } */ - svcmpeq (pg, 1, u8); /* { 
dg-error {passing 'int' to argument 2 of 'svcmpeq', which expects an SVE vector type} } */ + svcmpeq (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svcmpeq', which expects an SVE type rather than a scalar} } */ svcmpeq (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svuint8_t'} } */ svcmpeq (pg, u8, u8); svcmpeq (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svuint8_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_wide_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_wide_opt_n_1.c index fc5e4566361..655f03360d3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_wide_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_wide_opt_n_1.c @@ -9,7 +9,7 @@ f1 (svbool_t pg, svuint8_t u8, svint8_t s8, svint64_t s64, svuint64_t u64, svcmpeq_wide (pg, s8); /* { dg-error {too few arguments to function 'svcmpeq_wide'} } */ svcmpeq_wide (pg, s8, s64, s8); /* { dg-error {too many arguments to function 'svcmpeq_wide'} } */ svcmpeq_wide (s8, s8, s64); /* { dg-error {passing 'svint8_t' to argument 1 of 'svcmpeq_wide', which expects 'svbool_t'} } */ - svcmpeq_wide (pg, 0, s64); /* { dg-error {passing 'int' to argument 2 of 'svcmpeq_wide', which expects an SVE vector type} } */ + svcmpeq_wide (pg, 0, s64); /* { dg-error {passing 'int' to argument 2 of 'svcmpeq_wide', which expects an SVE type rather than a scalar} } */ svcmpeq_wide (pg, s8, 0); svcmpeq_wide (pg, s8, x); svcmpeq_wide (pg, s8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svcmpeq_wide', which expects a vector of 64-bit elements} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/count_vector_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/count_vector_1.c index daf9e0d5beb..b57d9de1d29 100644 --- 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/count_vector_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/count_vector_1.c @@ -7,7 +7,7 @@ f1 (svbool_t pg, svuint32_t u32, svuint32x2_t u32x2) { svlen (); /* { dg-error {too few arguments to function 'svlen'} } */ svlen (u32, u32); /* { dg-error {too many arguments to function 'svlen'} } */ - svlen (0); /* { dg-error {passing 'int' to argument 1 of 'svlen', which expects an SVE vector type} } */ + svlen (0); /* { dg-error {passing 'int' to argument 1 of 'svlen', which expects an SVE type rather than a scalar} } */ svlen (pg); /* { dg-error {'svlen' has no form that takes 'svbool_t' arguments} } */ svlen (u32x2); /* { dg-error {passing 'svuint32x2_t' to argument 1 of 'svlen', which expects a single SVE vector rather than a tuple} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_1.c index 31321a04649..83e4a5600cb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_1.c @@ -12,8 +12,8 @@ f1 (svuint8x2_t *ptr, svbool_t pg, svuint8_t u8, svfloat64_t f64, *ptr = svcreate2 (u8x2, u8x2); /* { dg-error {passing 'svuint8x2_t' to argument 1 of 'svcreate2', which expects a single SVE vector rather than a tuple} } */ *ptr = svcreate2 (u8, f64); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svcreate2', but previous arguments had type 'svuint8_t'} } */ *ptr = svcreate2 (u8, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svcreate2', but previous arguments had type 'svuint8_t'} } */ - *ptr = svcreate2 (u8, x); /* { dg-error {passing 'int' to argument 2 of 'svcreate2', which expects an SVE vector type} } */ - *ptr = svcreate2 (x, u8); /* { dg-error {passing 'int' to argument 1 of 'svcreate2', which expects an SVE vector type} } */ + *ptr = svcreate2 (u8, x); /* { dg-error {passing 'int' to argument 2 of 
'svcreate2', which expects an SVE type rather than a scalar} } */ + *ptr = svcreate2 (x, u8); /* { dg-error {passing 'int' to argument 1 of 'svcreate2', which expects an SVE type rather than a scalar} } */ *ptr = svcreate2 (pg, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svcreate2', but previous arguments had type 'svbool_t'} } */ *ptr = svcreate2 (pg, pg); /* { dg-error {'svcreate2' has no form that takes 'svbool_t' arguments} } */ *ptr = svcreate2 (u8, u8); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_3.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_3.c index a88e56b318d..e3302f7e7db 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_3.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_3.c @@ -13,8 +13,8 @@ f1 (svfloat16x3_t *ptr, svbool_t pg, svfloat16_t f16, svfloat64_t f64, *ptr = svcreate3 (f16x3, f16x3, f16x3); /* { dg-error {passing 'svfloat16x3_t' to argument 1 of 'svcreate3', which expects a single SVE vector rather than a tuple} } */ *ptr = svcreate3 (f16, f16, f64); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcreate3', but previous arguments had type 'svfloat16_t'} } */ *ptr = svcreate3 (f16, pg, f16); /* { dg-error {passing 'svbool_t' to argument 2 of 'svcreate3', but previous arguments had type 'svfloat16_t'} } */ - *ptr = svcreate3 (f16, x, f16); /* { dg-error {passing 'int' to argument 2 of 'svcreate3', which expects an SVE vector type} } */ - *ptr = svcreate3 (x, f16, f16); /* { dg-error {passing 'int' to argument 1 of 'svcreate3', which expects an SVE vector type} } */ + *ptr = svcreate3 (f16, x, f16); /* { dg-error {passing 'int' to argument 2 of 'svcreate3', which expects an SVE type rather than a scalar} } */ + *ptr = svcreate3 (x, f16, f16); /* { dg-error {passing 'int' to argument 1 of 'svcreate3', which expects an SVE type rather than a scalar} } */ *ptr = svcreate3 (pg, f16, f16); /* { dg-error {passing 'svfloat16_t' to argument 2 of 
'svcreate3', but previous arguments had type 'svbool_t'} } */ *ptr = svcreate3 (pg, pg, pg); /* { dg-error {'svcreate3' has no form that takes 'svbool_t' arguments} } */ *ptr = svcreate3 (f16, f16, f16); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c index fed12450627..c850c94f0d2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c @@ -14,8 +14,8 @@ f1 (svint32x4_t *ptr, svbool_t pg, svint32_t s32, svfloat64_t f64, *ptr = svcreate4 (s32x4, s32x4, s32x4, s32x4); /* { dg-error {passing 'svint32x4_t' to argument 1 of 'svcreate4', which expects a single SVE vector rather than a tuple} } */ *ptr = svcreate4 (s32, s32, s32, f64); /* { dg-error {passing 'svfloat64_t' to argument 4 of 'svcreate4', but previous arguments had type 'svint32_t'} } */ *ptr = svcreate4 (s32, s32, pg, s32); /* { dg-error {passing 'svbool_t' to argument 3 of 'svcreate4', but previous arguments had type 'svint32_t'} } */ - *ptr = svcreate4 (s32, x, s32, s32); /* { dg-error {passing 'int' to argument 2 of 'svcreate4', which expects an SVE vector type} } */ - *ptr = svcreate4 (x, s32, s32, s32); /* { dg-error {passing 'int' to argument 1 of 'svcreate4', which expects an SVE vector type} } */ + *ptr = svcreate4 (s32, x, s32, s32); /* { dg-error {passing 'int' to argument 2 of 'svcreate4', which expects an SVE type rather than a scalar} } */ + *ptr = svcreate4 (x, s32, s32, s32); /* { dg-error {passing 'int' to argument 1 of 'svcreate4', which expects an SVE type rather than a scalar} } */ *ptr = svcreate4 (pg, s32, s32, s32); /* { dg-error {passing 'svint32_t' to argument 2 of 'svcreate4', but previous arguments had type 'svbool_t'} } */ *ptr = svcreate4 (pg, pg, pg, pg); /* { dg-error {'svcreate4' has no form that takes 'svbool_t' arguments} } */ *ptr = svcreate4 (s32, s32, s32, s32); diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/fold_left_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/fold_left_1.c index 1d292786df9..181d1b01b1b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/fold_left_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/fold_left_1.c @@ -15,7 +15,7 @@ f1 (svbool_t pg, int i, float f, double d, void *ptr, svfloat32_t f32, svadda (pg, ptr, f32); /* { dg-error {incompatible type for argument 2 of 'svadda_f32'} } */ svadda (pg, pg, f32); /* { dg-error {passing 'svbool_t' to argument 2 of 'svadda', which expects a scalar element} } */ svadda (pg, f32, f32); /* { dg-error {passing 'svfloat32_t' to argument 2 of 'svadda', which expects a scalar element} } */ - svadda (pg, f, f); /* { dg-error {passing 'float' to argument 3 of 'svadda', which expects an SVE vector type} } */ + svadda (pg, f, f); /* { dg-error {passing 'float' to argument 3 of 'svadda', which expects an SVE type rather than a scalar} } */ svadda (pg, i, i32); /* { dg-error {'svadda' has no form that takes 'svint32_t' arguments} } */ - svadda (pg, i, i); /* { dg-error {passing 'int' to argument 3 of 'svadda', which expects an SVE vector type} } */ + svadda (pg, i, i); /* { dg-error {passing 'int' to argument 3 of 'svadda', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/inc_dec_pred_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/inc_dec_pred_1.c index a61afcd2db6..4de082d014c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/inc_dec_pred_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/inc_dec_pred_1.c @@ -7,7 +7,7 @@ test (svbool_t pg, svint8_t s8, svuint8_t u8, { svqincp (s32); /* { dg-error {too few arguments to function 'svqincp'} } */ svqincp (s32, pg, pg); /* { dg-error {too many arguments to function 'svqincp'} } */ - svqincp (i, pg); /* { dg-error {passing 'int' to argument 1 of 'svqincp', which expects an 
SVE vector type} } */ + svqincp (i, pg); /* { dg-error {passing 'int' to argument 1 of 'svqincp', which expects an SVE type rather than a scalar} } */ svqincp (pg, pg); /* { dg-error {'svqincp' has no form that takes 'svbool_t' arguments} } */ svqincp (s8, pg); /* { dg-error {'svqincp' has no form that takes 'svint8_t' arguments} } */ svqincp (u8, pg); /* { dg-error {'svqincp' has no form that takes 'svuint8_t' arguments} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/mmla_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/mmla_1.c index 5b0b00e96b5..7fc7bb67b75 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/mmla_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/mmla_1.c @@ -23,22 +23,22 @@ f2 (svbool_t pg, svint8_t s8, svuint8_t u8, svuint32_t u32, svint32_t s32, { svmmla (s32, s8); /* { dg-error {too few arguments to function 'svmmla'} } */ svmmla (s32, s8, s8, s8); /* { dg-error {too many arguments to function 'svmmla'} } */ - svmmla (0, s8, s8); /* { dg-error {passing 'int' to argument 1 of 'svmmla', which expects an SVE vector type} } */ + svmmla (0, s8, s8); /* { dg-error {passing 'int' to argument 1 of 'svmmla', which expects an SVE type rather than a scalar} } */ svmmla (pg, s8, s8); /* { dg-error {'svmmla' has no form that takes 'svbool_t' arguments} } */ svmmla (u8, s8, s8); /* { dg-error {'svmmla' has no form that takes 'svuint8_t' arguments} } */ - svmmla (s32, 0, s8); /* { dg-error {passing 'int' to argument 2 of 'svmmla', which expects an SVE vector type} } */ + svmmla (s32, 0, s8); /* { dg-error {passing 'int' to argument 2 of 'svmmla', which expects an SVE type rather than a scalar} } */ svmmla (s32, u8, s8); /* { dg-error {arguments 1 and 2 of 'svmmla' must have the same signedness, but the values passed here have type 'svint32_t' and 'svuint8_t' respectively} } */ svmmla (s32, s8, u8); /* { dg-error {arguments 1 and 3 of 'svmmla' must have the same signedness, but the values passed here 
have type 'svint32_t' and 'svuint8_t' respectively} } */ - svmmla (s32, s8, 0); /* { dg-error {passing 'int' to argument 3 of 'svmmla', which expects an SVE vector type} } */ + svmmla (s32, s8, 0); /* { dg-error {passing 'int' to argument 3 of 'svmmla', which expects an SVE type rather than a scalar} } */ svmmla (s32, s8, s8); svmmla (s32, s32, s32); /* { dg-error {passing 'svint32_t' instead of the expected 'svint8_t' to argument 2 of 'svmmla', after passing 'svint32_t' to argument 1} } */ svmmla (s32, u32, u32); /* { dg-error {passing 'svuint32_t' instead of the expected 'svint8_t' to argument 2 of 'svmmla', after passing 'svint32_t' to argument 1} } */ - svmmla (u32, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svmmla', which expects an SVE vector type} } */ + svmmla (u32, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svmmla', which expects an SVE type rather than a scalar} } */ svmmla (u32, s8, u8); /* { dg-error {arguments 1 and 2 of 'svmmla' must have the same signedness, but the values passed here have type 'svuint32_t' and 'svint8_t' respectively} } */ svmmla (u32, u8, s8); /* { dg-error {arguments 1 and 3 of 'svmmla' must have the same signedness, but the values passed here have type 'svuint32_t' and 'svint8_t' respectively} } */ - svmmla (u32, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svmmla', which expects an SVE vector type} } */ + svmmla (u32, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svmmla', which expects an SVE type rather than a scalar} } */ svmmla (u32, u8, u8); svmmla (u32, s32, s32); /* { dg-error {passing 'svint32_t' instead of the expected 'svuint8_t' to argument 2 of 'svmmla', after passing 'svuint32_t' to argument 1} } */ svmmla (u32, u32, u32); /* { dg-error {passing 'svuint32_t' instead of the expected 'svuint8_t' to argument 2 of 'svmmla', after passing 'svuint32_t' to argument 1} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/prefetch_gather_offset_2.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/prefetch_gather_offset_2.c index b74721fadce..88e0c35e7af 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/prefetch_gather_offset_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/prefetch_gather_offset_2.c @@ -12,7 +12,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svprfb_gather (pg, u32); /* { dg-error {too few arguments to function 'svprfb_gather'} } */ svprfb_gather (pg, u32, SV_PLDL1KEEP, 0); /* { dg-error {too many arguments to function 'svprfb_gather'} } */ svprfb_gather (0, u32, SV_PLDL1KEEP); /* { dg-error {passing 'int' to argument 1 of 'svprfb_gather', which expects 'svbool_t'} } */ - svprfb_gather (pg, 0, SV_PLDL1KEEP); /* { dg-error {passing 'int' to argument 2 of 'svprfb_gather', which expects an SVE vector type} } */ + svprfb_gather (pg, 0, SV_PLDL1KEEP); /* { dg-error {passing 'int' to argument 2 of 'svprfb_gather', which expects an SVE type rather than a scalar} } */ svprfb_gather (pg, s8, SV_PLDL1KEEP); /* { dg-error {passing 'svint8_t' to argument 2 of 'svprfb_gather', which expects 'svuint32_t' or 'svuint64_t'} } */ svprfb_gather (pg, u8, SV_PLDL1KEEP); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svprfb_gather', which expects 'svuint32_t' or 'svuint64_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/reduction_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/reduction_1.c index ab0ef304a31..025795e3da6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/reduction_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/reduction_1.c @@ -10,7 +10,7 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32, svfloat32_t f32, svorv (pg, u32, u32); /* { dg-error {too many arguments to function 'svorv'} } */ svorv (0, u32); /* { dg-error {passing 'int' to argument 1 of 'svorv', which expects 'svbool_t'} } */ svorv (u32, u32); /* { dg-error {passing 'svuint32_t' to argument 1 of 'svorv', which expects 'svbool_t'} } */ 
- svorv (pg, 0); /* { dg-error {passing 'int' to argument 2 of 'svorv', which expects an SVE vector type} } */ + svorv (pg, 0); /* { dg-error {passing 'int' to argument 2 of 'svorv', which expects an SVE type rather than a scalar} } */ svorv (pg, pg); /* { dg-error {'svorv' has no form that takes 'svbool_t' arguments} } */ svorv (pg, s32); svorv (pg, u32); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/reduction_wide_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/reduction_wide_1.c index f99a2887bf6..68bacd0a3df 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/reduction_wide_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/reduction_wide_1.c @@ -10,7 +10,7 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32, svfloat32_t f32, svaddv (pg, u32, u32); /* { dg-error {too many arguments to function 'svaddv'} } */ svaddv (0, u32); /* { dg-error {passing 'int' to argument 1 of 'svaddv', which expects 'svbool_t'} } */ svaddv (u32, u32); /* { dg-error {passing 'svuint32_t' to argument 1 of 'svaddv', which expects 'svbool_t'} } */ - svaddv (pg, 0); /* { dg-error {passing 'int' to argument 2 of 'svaddv', which expects an SVE vector type} } */ + svaddv (pg, 0); /* { dg-error {passing 'int' to argument 2 of 'svaddv', which expects an SVE type rather than a scalar} } */ svaddv (pg, pg); /* { dg-error {'svaddv' has no form that takes 'svbool_t' arguments} } */ svaddv (pg, s32); svaddv (pg, u32); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowb_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowb_1.c index 6536679d5be..c5942c70110 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowb_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowb_1.c @@ -66,5 +66,5 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svshrnb (f32, 1); /* { dg-error {'svshrnb' has no form that takes 'svfloat32_t' arguments} } 
*/ - svshrnb (1, 1); /* { dg-error {passing 'int' to argument 1 of 'svshrnb', which expects an SVE vector type} } */ + svshrnb (1, 1); /* { dg-error {passing 'int' to argument 1 of 'svshrnb', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowb_to_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowb_to_uint_1.c index 51f9388bfb3..3ecd20a2279 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowb_to_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowb_to_uint_1.c @@ -54,5 +54,5 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svqshrunb (f32, 1); /* { dg-error {'svqshrunb' has no form that takes 'svfloat32_t' arguments} } */ - svqshrunb (1, 1); /* { dg-error {passing 'int' to argument 1 of 'svqshrunb', which expects an SVE vector type} } */ + svqshrunb (1, 1); /* { dg-error {passing 'int' to argument 1 of 'svqshrunb', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowt_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowt_1.c index 6c31cf8ec31..e9d1d13371e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowt_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowt_1.c @@ -76,6 +76,6 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svshrnt (f32, f32, 1); /* { dg-error {'svshrnt' has no form that takes 'svfloat32_t' arguments} } */ - svshrnt (1, s32, 1); /* { dg-error {passing 'int' to argument 1 of 'svshrnt', which expects an SVE vector type} } */ - svshrnt (s32, 1, 1); /* { dg-error {passing 'int' to argument 2 of 'svshrnt', which expects an SVE vector type} } */ + svshrnt (1, s32, 1); /* { dg-error {passing 'int' to argument 1 of 'svshrnt', which expects an SVE type rather than a scalar} } */ + 
svshrnt (s32, 1, 1); /* { dg-error {passing 'int' to argument 2 of 'svshrnt', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowt_to_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowt_to_uint_1.c index 2e35ad304bf..7414956099e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowt_to_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowt_to_uint_1.c @@ -59,6 +59,6 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svqshrunt (u16, f32, 1); /* { dg-error {'svqshrunt' has no form that takes 'svfloat32_t' arguments} } */ - svqshrunt (1, u32, 1); /* { dg-error {passing 'int' to argument 1 of 'svqshrunt', which expects an SVE vector type} } */ - svqshrunt (u32, 1, 1); /* { dg-error {passing 'int' to argument 2 of 'svqshrunt', which expects an SVE vector type} } */ + svqshrunt (1, u32, 1); /* { dg-error {passing 'int' to argument 1 of 'svqshrunt', which expects an SVE type rather than a scalar} } */ + svqshrunt (u32, 1, 1); /* { dg-error {passing 'int' to argument 2 of 'svqshrunt', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_1.c index 3669b3088a7..6011ab05414 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_1.c @@ -13,7 +13,7 @@ f1 (svbool_t pg, signed char *s8_ptr, void *void_ptr, struct s *s_ptr, svst1 (pg, s8_ptr); /* { dg-error {too few arguments to function 'svst1'} } */ svst1 (pg, s8_ptr, s8, 0); /* { dg-error {too many arguments to function 'svst1'} } */ svst1 (0, s8_ptr, s8); /* { dg-error {passing 'int' to argument 1 of 'svst1', which expects 'svbool_t'} } */ - svst1 (pg, void_ptr, 0); /* { dg-error {passing 'int' to argument 3 of 'svst1', 
which expects an SVE vector type} } */ + svst1 (pg, void_ptr, 0); /* { dg-error {passing 'int' to argument 3 of 'svst1', which expects an SVE type rather than a scalar} } */ svst1 (pg, void_ptr, pg); /* { dg-error {'svst1' has no form that takes 'svbool_t' arguments} } */ svst1 (pg, 0, s8); svst1 (pg, (int32_t *) 0, s8); /* { dg-error "passing argument 2 of 'svst1_s8' from incompatible pointer type" } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_2.c index 30a0a4c8586..552540bf7ff 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_2.c @@ -15,7 +15,7 @@ f1 (svbool_t pg, signed char *s8_ptr, void *void_ptr, struct s *s_ptr, svst1_vnum (pg, s8_ptr, pg, s8); /* { dg-error {passing 'svbool_t' to argument 3 of 'svst1_vnum', which expects 'int64_t'} } */ svst1_vnum (pg, s8_ptr, s8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svst1_vnum', which expects 'int64_t'} } */ svst1_vnum (pg, s8_ptr, void_ptr, s8); /* { dg-error "passing argument 3 of 'svst1_vnum_s8' makes integer from pointer without a cast" } */ - svst1_vnum (pg, void_ptr, 0, 0); /* { dg-error {passing 'int' to argument 4 of 'svst1_vnum', which expects an SVE vector type} } */ + svst1_vnum (pg, void_ptr, 0, 0); /* { dg-error {passing 'int' to argument 4 of 'svst1_vnum', which expects an SVE type rather than a scalar} } */ svst1_vnum (pg, void_ptr, 0, pg); /* { dg-error {'svst1_vnum' has no form that takes 'svbool_t' arguments} } */ svst1_vnum (pg, 0, 0, s8); svst1_vnum (pg, (int32_t *) 0, 0, s8); /* { dg-error "passing argument 2 of 'svst1_vnum_s8' from incompatible pointer type" } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_1.c index 10abf758c2f..3b3b5622267 100644 --- 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_1.c @@ -13,8 +13,8 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16, svst1_scatter (pg, u32); /* { dg-error {too few arguments to function 'svst1_scatter'} } */ svst1_scatter (pg, u32, u32, 0); /* { dg-error {too many arguments to function 'svst1_scatter'} } */ svst1_scatter (0, u32, u32); /* { dg-error {passing 'int' to argument 1 of 'svst1_scatter', which expects 'svbool_t'} } */ - svst1_scatter (pg, 0, u32); /* { dg-error {passing 'int' to argument 2 of 'svst1_scatter', which expects an SVE vector type} } */ - svst1_scatter (pg, u32, 0); /* { dg-error {passing 'int' to argument 3 of 'svst1_scatter', which expects an SVE vector type} } */ + svst1_scatter (pg, 0, u32); /* { dg-error {passing 'int' to argument 2 of 'svst1_scatter', which expects an SVE type rather than a scalar} } */ + svst1_scatter (pg, u32, 0); /* { dg-error {passing 'int' to argument 3 of 'svst1_scatter', which expects an SVE type rather than a scalar} } */ svst1_scatter (pg, u32, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svst1_scatter', which expects a vector of 32-bit or 64-bit elements} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_1.c index a9233324c56..9a554f54f3d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_1.c @@ -10,7 +10,7 @@ f1 (svbool_t pg, svuint8_t u8, svuint16_t u16, svint32_t s32, { svbfmmla (f32, bf16); /* { dg-error {too few arguments to function 'svbfmmla'} } */ svbfmmla (f32, bf16, bf16, 0); /* { dg-error {too many arguments to function 'svbfmmla'} } */ - svbfmmla (0, bf16, bf16); /* { dg-error {passing 'int' to argument 1 of 'svbfmmla', which expects 
an SVE vector type} } */ + svbfmmla (0, bf16, bf16); /* { dg-error {passing 'int' to argument 1 of 'svbfmmla', which expects an SVE type rather than a scalar} } */ svbfmmla (pg, bf16, bf16); /* { dg-error {'svbfmmla' has no form that takes 'svbool_t' arguments} } */ svbfmmla (u8, bf16, bf16); /* { dg-error {'svbfmmla' has no form that takes 'svuint8_t' arguments} } */ svbfmmla (u16, bf16, bf16); /* { dg-error {'svbfmmla' has no form that takes 'svuint16_t' arguments} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lane_1.c index 23f027f2d70..87e74fbcfb4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lane_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lane_1.c @@ -10,7 +10,7 @@ f1 (svbool_t pg, svuint8_t u8, svuint16_t u16, svint32_t s32, { svbfmlalb_lane (f32, bf16, bf16); /* { dg-error {too few arguments to function 'svbfmlalb_lane'} } */ svbfmlalb_lane (f32, bf16, bf16, 0, 0); /* { dg-error {too many arguments to function 'svbfmlalb_lane'} } */ - svbfmlalb_lane (0, bf16, bf16, 0); /* { dg-error {passing 'int' to argument 1 of 'svbfmlalb_lane', which expects an SVE vector type} } */ + svbfmlalb_lane (0, bf16, bf16, 0); /* { dg-error {passing 'int' to argument 1 of 'svbfmlalb_lane', which expects an SVE type rather than a scalar} } */ svbfmlalb_lane (pg, bf16, bf16, 0); /* { dg-error {'svbfmlalb_lane' has no form that takes 'svbool_t' arguments} } */ svbfmlalb_lane (u8, bf16, bf16, 0); /* { dg-error {'svbfmlalb_lane' has no form that takes 'svuint8_t' arguments} } */ svbfmlalb_lane (u16, bf16, bf16, 0); /* { dg-error {'svbfmlalb_lane' has no form that takes 'svuint16_t' arguments} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lanex2_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lanex2_1.c index 
4755ca79ac2..ca1852644dd 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lanex2_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lanex2_1.c
@@ -10,7 +10,7 @@ f1 (svbool_t pg, svuint8_t u8, svuint16_t u16, svint32_t s32,
 {
 svbfdot_lane (f32, bf16, bf16); /* { dg-error {too few arguments to function 'svbfdot_lane'} } */
 svbfdot_lane (f32, bf16, bf16, 0, 0); /* { dg-error {too many arguments to function 'svbfdot_lane'} } */
- svbfdot_lane (0, bf16, bf16, 0); /* { dg-error {passing 'int' to argument 1 of 'svbfdot_lane', which expects an SVE vector type} } */
+ svbfdot_lane (0, bf16, bf16, 0); /* { dg-error {passing 'int' to argument 1 of 'svbfdot_lane', which expects an SVE type rather than a scalar} } */
 svbfdot_lane (pg, bf16, bf16, 0); /* { dg-error {'svbfdot_lane' has no form that takes 'svbool_t' arguments} } */
 svbfdot_lane (u8, bf16, bf16, 0); /* { dg-error {'svbfdot_lane' has no form that takes 'svuint8_t' arguments} } */
 svbfdot_lane (u16, bf16, bf16, 0); /* { dg-error {'svbfdot_lane' has no form that takes 'svuint16_t' arguments} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_opt_n_1.c
index cb0605b9a0f..d831fc927c4 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_opt_n_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_opt_n_1.c
@@ -10,7 +10,7 @@ f1 (svbool_t pg, svuint8_t u8, svuint16_t u16, svint32_t s32,
 {
 svbfdot (f32, bf16); /* { dg-error {too few arguments to function 'svbfdot'} } */
 svbfdot (f32, bf16, bf16, 0); /* { dg-error {too many arguments to function 'svbfdot'} } */
- svbfdot (0, bf16, bf16); /* { dg-error {passing 'int' to argument 1 of 'svbfdot', which expects an SVE vector type} } */
+ svbfdot (0, bf16, bf16); /* { dg-error {passing 'int' to argument 1 of 'svbfdot', which expects an SVE type rather than a scalar} } */
 svbfdot (pg, bf16, bf16); /* { dg-error {'svbfdot' has no form that takes 'svbool_t' arguments} } */
 svbfdot (u8, bf16, bf16); /* { dg-error {'svbfdot' has no form that takes 'svuint8_t' arguments} } */
 svbfdot (u16, bf16, bf16); /* { dg-error {'svbfdot' has no form that takes 'svuint16_t' arguments} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_lane_1.c
index 600be05a88d..934b7bd6010 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_lane_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_lane_1.c
@@ -10,14 +10,14 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16,
 {
 svsudot_lane (s32, s8, u8); /* { dg-error {too few arguments to function 'svsudot_lane'} } */
 svsudot_lane (s32, s8, u8, 0, 0); /* { dg-error {too many arguments to function 'svsudot_lane'} } */
- svsudot_lane (0, s8, u8, 0); /* { dg-error {passing 'int' to argument 1 of 'svsudot_lane', which expects an SVE vector type} } */
+ svsudot_lane (0, s8, u8, 0); /* { dg-error {passing 'int' to argument 1 of 'svsudot_lane', which expects an SVE type rather than a scalar} } */
 svsudot_lane (pg, s8, u8, 0); /* { dg-error {'svsudot_lane' has no form that takes 'svbool_t' arguments} } */
 svsudot_lane (u8, s8, u8, 0); /* { dg-error {'svsudot_lane' has no form that takes 'svuint8_t' arguments} } */
 svsudot_lane (f32, s8, u8, 0); /* { dg-error {'svsudot_lane' has no form that takes 'svfloat32_t' arguments} } */
 svsudot_lane (u32, s8, u8, 0); /* { dg-error {'svsudot_lane' has no form that takes 'svuint32_t' arguments} } */
 svsudot_lane (s32, s8, u8, 0);
- svsudot_lane (s32, 0, u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svsudot_lane', which expects an SVE vector type} } */
- svsudot_lane (s32, s8, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svsudot_lane', which expects an SVE vector type} } */
+ svsudot_lane (s32, 0, u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svsudot_lane', which expects an SVE type rather than a scalar} } */
+ svsudot_lane (s32, s8, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svsudot_lane', which expects an SVE type rather than a scalar} } */
 svsudot_lane (s32, s8, u8, 0);
 svsudot_lane (s32, u8, u8, 0); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svsudot_lane', which expects a vector of signed integers} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_opt_n_1.c
index f95ac582ffe..c481996d3de 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_opt_n_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_opt_n_1.c
@@ -23,12 +23,12 @@ f2 (svbool_t pg, svint8_t s8, svuint8_t u8, svuint32_t u32,
 {
 svsudot (s32, s8); /* { dg-error {too few arguments to function 'svsudot'} } */
 svsudot (s32, s8, u8, u8); /* { dg-error {too many arguments to function 'svsudot'} } */
- svsudot (0, s8, u8); /* { dg-error {passing 'int' to argument 1 of 'svsudot', which expects an SVE vector type} } */
+ svsudot (0, s8, u8); /* { dg-error {passing 'int' to argument 1 of 'svsudot', which expects an SVE type rather than a scalar} } */
 svsudot (pg, s8, u8); /* { dg-error {'svsudot' has no form that takes 'svbool_t' arguments} } */
 svsudot (u8, s8, u8); /* { dg-error {'svsudot' has no form that takes 'svuint8_t' arguments} } */
 svsudot (f32, s8, u8); /* { dg-error {'svsudot' has no form that takes 'svfloat32_t' arguments} } */
 svsudot (s32, s8, u8);
- svsudot (s32, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svsudot', which expects an SVE vector type} } */
+ svsudot (s32, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svsudot', which expects an SVE type rather than a scalar} } */
 svsudot (s32, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svsudot', which expects a vector of signed integers} } */
 svsudot (s32, s8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svsudot', which expects a vector of unsigned integers} } */
 svsudot (s32, s8, 0);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_1.c
index d59ffab40fb..520c11f792b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_1.c
@@ -10,9 +10,9 @@ f1 (svbool_t pg, svfloat16_t f16, svfloat32_t f32, svfloat64_t f64,
 svmla_lane (f32, f32, f32, 0, 0); /* { dg-error {too many arguments to function 'svmla_lane'} } */
 svmla_lane (pg, pg, pg, 0); /* { dg-error {'svmla_lane' has no form that takes 'svbool_t' arguments} } */
 svmla_lane (s32, s32, s32, 0); /* { dg-error {ACLE function 'svmla_lane_s32' requires ISA extension 'sve2'} "" { xfail aarch64_sve2 } } */
- svmla_lane (1, f32, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmla_lane', which expects an SVE vector type} } */
- svmla_lane (f32, 1, f32, 0); /* { dg-error {passing 'int' to argument 2 of 'svmla_lane', which expects an SVE vector type} } */
- svmla_lane (f32, f32, 1, 0); /* { dg-error {passing 'int' to argument 3 of 'svmla_lane', which expects an SVE vector type} } */
+ svmla_lane (1, f32, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmla_lane', which expects an SVE type rather than a scalar} } */
+ svmla_lane (f32, 1, f32, 0); /* { dg-error {passing 'int' to argument 2 of 'svmla_lane', which expects an SVE type rather than a scalar} } */
+ svmla_lane (f32, f32, 1, 0); /* { dg-error {passing 'int' to argument 3 of 'svmla_lane', which expects an SVE type rather than a scalar} } */
 svmla_lane (f32, f64, f32, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svmla_lane', but previous arguments had type 'svfloat32_t'} } */
 svmla_lane (f32, f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svmla_lane', but previous arguments had type 'svfloat32_t'} } */
 svmla_lane (f32, f32, f32, s32); /* { dg-error {argument 4 of 'svmla_lane' must be an integer constant expression} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_rotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_rotate_1.c
index 68e51724c6a..3163d130c59 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_rotate_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_rotate_1.c
@@ -11,9 +11,9 @@ f1 (svbool_t pg, svfloat16_t f16, svfloat32_t f32, svfloat64_t f64,
 svcmla_lane (pg, pg, pg, 0, 90); /* { dg-error {'svcmla_lane' has no form that takes 'svbool_t' arguments} } */
 svcmla_lane (s32, s32, s32, 0, 90); /* { dg-error {ACLE function 'svcmla_lane_s32' requires ISA extension 'sve2'} "" { xfail aarch64_sve2 } } */
 svcmla_lane (f64, f64, f64, 0, 90); /* { dg-error {'svcmla_lane' has no form that takes 'svfloat64_t' arguments} } */
- svcmla_lane (1, f32, f32, 0, 90); /* { dg-error {passing 'int' to argument 1 of 'svcmla_lane', which expects an SVE vector type} } */
- svcmla_lane (f32, 1, f32, 0, 90); /* { dg-error {passing 'int' to argument 2 of 'svcmla_lane', which expects an SVE vector type} } */
- svcmla_lane (f32, f32, 1, 0, 90); /* { dg-error {passing 'int' to argument 3 of 'svcmla_lane', which expects an SVE vector type} } */
+ svcmla_lane (1, f32, f32, 0, 90); /* { dg-error {passing 'int' to argument 1 of 'svcmla_lane', which expects an SVE type rather than a scalar} } */
+ svcmla_lane (f32, 1, f32, 0, 90); /* { dg-error {passing 'int' to argument 2 of 'svcmla_lane', which expects an SVE type rather than a scalar} } */
+ svcmla_lane (f32, f32, 1, 0, 90); /* { dg-error {passing 'int' to argument 3 of 'svcmla_lane', which expects an SVE type rather than a scalar} } */
 svcmla_lane (f32, f64, f32, 0, 90); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svcmla_lane', but previous arguments had type 'svfloat32_t'} } */
 svcmla_lane (f32, f32, f64, 0, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcmla_lane', but previous arguments had type 'svfloat32_t'} } */
 svcmla_lane (f32, f32, f32, s32, 0); /* { dg-error {argument 4 of 'svcmla_lane' must be an integer constant expression} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_long_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_long_lane_1.c
index e20e1a12257..dd67b4e4e23 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_long_lane_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_long_lane_1.c
@@ -11,16 +11,16 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16,
 {
 svmlalb_lane (u64, u32, u32); /* { dg-error {too few arguments to function 'svmlalb_lane'} } */
 svmlalb_lane (u64, u32, u32, 0, 0); /* { dg-error {too many arguments to function 'svmlalb_lane'} } */
- svmlalb_lane (0, u16, u16, 0); /* { dg-error {passing 'int' to argument 1 of 'svmlalb_lane', which expects an SVE vector type} } */
+ svmlalb_lane (0, u16, u16, 0); /* { dg-error {passing 'int' to argument 1 of 'svmlalb_lane', which expects an SVE type rather than a scalar} } */
 svmlalb_lane (pg, u16, u16, 0); /* { dg-error {'svmlalb_lane' has no form that takes 'svbool_t' arguments} } */
 svmlalb_lane (u8, u8, u8, 0); /* { dg-error {'svmlalb_lane' has no form that takes 'svuint8_t' arguments} } */
 svmlalb_lane (u16, u8, u8, 0); /* { dg-error {'svmlalb_lane' has no form that takes 'svuint16_t' arguments} } */
 svmlalb_lane (f16, u16, u16, 0); /* { dg-error {'svmlalb_lane' has no form that takes 'svfloat16_t' arguments} } */
 svmlalb_lane (f32, f16, f16, 0);
 svmlalb_lane (u32, u16, u16, 0);
- svmlalb_lane (u32, 0, u16, 0); /* { dg-error {passing 'int' to argument 2 of 'svmlalb_lane', which expects an SVE vector type} } */
+ svmlalb_lane (u32, 0, u16, 0); /* { dg-error {passing 'int' to argument 2 of 'svmlalb_lane', which expects an SVE type rather than a scalar} } */
 svmlalb_lane (u32, s16, u16, 0); /* { dg-error {arguments 1 and 2 of 'svmlalb_lane' must have the same signedness, but the values passed here have type 'svuint32_t' and 'svint16_t' respectively} } */
- svmlalb_lane (u32, u16, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svmlalb_lane', which expects an SVE vector type} } */
+ svmlalb_lane (u32, u16, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svmlalb_lane', which expects an SVE type rather than a scalar} } */
 svmlalb_lane (u32, u16, s16, 0); /* { dg-error {arguments 1 and 3 of 'svmlalb_lane' must have the same signedness, but the values passed here have type 'svuint32_t' and 'svint16_t' respectively} } */
 svmlalb_lane (u32, u32, u32, 0); /* { dg-error {passing 'svuint32_t' instead of the expected 'svuint16_t' to argument 2 of 'svmlalb_lane', after passing 'svuint32_t' to argument 1} } */
 svmlalb_lane (u32, u8, u16, 0); /* { dg-error {passing 'svuint8_t' instead of the expected 'svuint16_t' to argument 2 of 'svmlalb_lane', after passing 'svuint32_t' to argument 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_long_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_long_opt_n_1.c
index c6718cf3715..157fd7cd503 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_long_opt_n_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_long_opt_n_1.c
@@ -10,13 +10,13 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svuint16_t u16, svuint32_t u32,
 {
 svabalb (u16, u8); /* { dg-error {too few arguments to function 'svabalb'} } */
 svabalb (u16, u8, u8, u8); /* { dg-error {too many arguments to function 'svabalb'} } */
- svabalb (0, u8, u8); /* { dg-error {passing 'int' to argument 1 of 'svabalb', which expects an SVE vector type} } */
+ svabalb (0, u8, u8); /* { dg-error {passing 'int' to argument 1 of 'svabalb', which expects an SVE type rather than a scalar} } */
 svabalb (pg, u8, u8); /* { dg-error {'svabalb' has no form that takes 'svbool_t' arguments} } */
 svabalb (u8, u8, u8); /* { dg-error {'svabalb' has no form that takes 'svuint8_t' arguments} } */
 svabalb (f16, u8, u8); /* { dg-error {'svabalb' has no form that takes 'svfloat16_t' arguments} } */
 svabalb (f32, f16, f16); /* { dg-error {'svabalb' has no form that takes 'svfloat32_t' arguments} } */
 svabalb (u16, u8, u8);
- svabalb (u16, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svabalb', which expects an SVE vector type} } */
+ svabalb (u16, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svabalb', which expects an SVE type rather than a scalar} } */
 svabalb (u16, s8, u8); /* { dg-error {arguments 1 and 2 of 'svabalb' must have the same signedness, but the values passed here have type 'svuint16_t' and 'svint8_t' respectively} } */
 svabalb (u16, u8, 0);
 svabalb (u16, u8, s8); /* { dg-error {arguments 1 and 3 of 'svabalb' must have the same signedness, but the values passed here have type 'svuint16_t' and 'svint8_t' respectively} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_opt_n_1.c
index c4a80e9daa2..ac789c2beca 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_opt_n_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_opt_n_1.c
@@ -10,14 +10,14 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8,
 svmla_x (pg, u8, u8, u8, u8); /* { dg-error {too many arguments to function 'svmla_x'} } */
 svmla_x (u8, u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svmla_x', which expects 'svbool_t'} } */
 svmla_x (pg, pg, pg, pg); /* { dg-error {'svmla_x' has no form that takes 'svbool_t' arguments} } */
- svmla_x (pg, 1, u8, u8); /* { dg-error {passing 'int' to argument 2 of 'svmla_x', which expects an SVE vector type} } */
+ svmla_x (pg, 1, u8, u8); /* { dg-error {passing 'int' to argument 2 of 'svmla_x', which expects an SVE type rather than a scalar} } */
 svmla_x (pg, u8, s8, u8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */
 svmla_x (pg, u8, u8, u8);
 svmla_x (pg, u8, s16, u8); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */
 svmla_x (pg, u8, u16, u8); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */
 svmla_x (pg, u8, f16, u8); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */
 svmla_x (pg, u8, pg, u8); /* { dg-error {passing 'svbool_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */
- svmla_x (pg, u8, 0, u8); /* { dg-error {passing 'int' to argument 3 of 'svmla_x', which expects an SVE vector type} } */
+ svmla_x (pg, u8, 0, u8); /* { dg-error {passing 'int' to argument 3 of 'svmla_x', which expects an SVE type rather than a scalar} } */
 svmla_x (pg, u8, u8, s8); /* { dg-error {passing 'svint8_t' to argument 4 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */
 svmla_x (pg, u8, u8, s16); /* { dg-error {passing 'svint16_t' to argument 4 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */
 svmla_x (pg, u8, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 4 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_1.c
index e81552b64a9..c69b2d57503 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_1.c
@@ -9,13 +9,13 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16,
 {
 svdot_lane (u32, u8, u8); /* { dg-error {too few arguments to function 'svdot_lane'} } */
 svdot_lane (u32, u8, u8, 0, 0); /* { dg-error {too many arguments to function 'svdot_lane'} } */
- svdot_lane (0, u8, u8, 0); /* { dg-error {passing 'int' to argument 1 of 'svdot_lane', which expects an SVE vector type} } */
+ svdot_lane (0, u8, u8, 0); /* { dg-error {passing 'int' to argument 1 of 'svdot_lane', which expects an SVE type rather than a scalar} } */
 svdot_lane (pg, u8, u8, 0); /* { dg-error {'svdot_lane' has no form that takes 'svbool_t' arguments} } */
 svdot_lane (u8, u8, u8, 0); /* { dg-error {'svdot_lane' has no form that takes 'svuint8_t' arguments} } */
 svdot_lane (f32, u8, u8, 0); /* { dg-error {'svdot_lane' has no form that takes 'svfloat32_t' arguments} } */
 svdot_lane (u32, u8, u8, 0);
- svdot_lane (u32, 0, u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svdot_lane', which expects an SVE vector type} } */
- svdot_lane (u32, u8, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svdot_lane', which expects an SVE vector type} } */
+ svdot_lane (u32, 0, u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svdot_lane', which expects an SVE type rather than a scalar} } */
+ svdot_lane (u32, u8, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svdot_lane', which expects an SVE type rather than a scalar} } */
 svdot_lane (s32, s8, s8, 0);
 svdot_lane (s32, u8, s8, 0); /* { dg-error {arguments 1 and 2 of 'svdot_lane' must have the same signedness, but the values passed here have type 'svint32_t' and 'svuint8_t' respectively} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_rotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_rotate_1.c
index a748a8627c1..9e84e7a8961 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_rotate_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_lane_rotate_1.c
@@ -11,13 +11,13 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16,
 {
 svcdot_lane (u32, u8, u8, 0); /* { dg-error {too few arguments to function 'svcdot_lane'} } */
 svcdot_lane (u32, u8, u8, 0, 0, 0); /* { dg-error {too many arguments to function 'svcdot_lane'} } */
- svcdot_lane (0, u8, u8, 0, 0); /* { dg-error {passing 'int' to argument 1 of 'svcdot_lane', which expects an SVE vector type} } */
+ svcdot_lane (0, u8, u8, 0, 0); /* { dg-error {passing 'int' to argument 1 of 'svcdot_lane', which expects an SVE type rather than a scalar} } */
 svcdot_lane (pg, u8, u8, 0, 0); /* { dg-error {'svcdot_lane' has no form that takes 'svbool_t' arguments} } */
 svcdot_lane (s8, s8, s8, 0, 0); /* { dg-error {'svcdot_lane' has no form that takes 'svint8_t' arguments} } */
 svcdot_lane (f32, s8, s8, 0, 0); /* { dg-error {'svcdot_lane' has no form that takes 'svfloat32_t' arguments} } */
 svcdot_lane (s32, s8, s8, 0, 0);
- svcdot_lane (s32, 0, s8, 0, 0); /* { dg-error {passing 'int' to argument 2 of 'svcdot_lane', which expects an SVE vector type} } */
- svcdot_lane (s32, s8, 0, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svcdot_lane', which expects an SVE vector type} } */
+ svcdot_lane (s32, 0, s8, 0, 0); /* { dg-error {passing 'int' to argument 2 of 'svcdot_lane', which expects an SVE type rather than a scalar} } */
+ svcdot_lane (s32, s8, 0, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svcdot_lane', which expects an SVE type rather than a scalar} } */
 svcdot_lane (s32, s8, s8, 0, 0);
 svcdot_lane (s32, u8, s8, 0, 0); /* { dg-error {arguments 1 and 2 of 'svcdot_lane' must have the same signedness, but the values passed here have type 'svint32_t' and 'svuint8_t' respectively} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_opt_n_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_opt_n_2.c
index fee4096fe0e..85d4b2dd8d5 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_opt_n_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_opt_n_2.c
@@ -8,12 +8,12 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svuint32_t u32,
 {
 svdot (u32, u8); /* { dg-error {too few arguments to function 'svdot'} } */
 svdot (u32, u8, u8, u8); /* { dg-error {too many arguments to function 'svdot'} } */
- svdot (0, u8, u8); /* { dg-error {passing 'int' to argument 1 of 'svdot', which expects an SVE vector type} } */
+ svdot (0, u8, u8); /* { dg-error {passing 'int' to argument 1 of 'svdot', which expects an SVE type rather than a scalar} } */
 svdot (pg, u8, u8); /* { dg-error {'svdot' has no form that takes 'svbool_t' arguments} } */
 svdot (u8, u8, u8); /* { dg-error {'svdot' has no form that takes 'svuint8_t' arguments} } */
 svdot (f32, u8, u8); /* { dg-error {'svdot' has no form that takes 'svfloat32_t' arguments} } */
 svdot (u32, u8, u8);
- svdot (u32, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svdot', which expects an SVE vector type} } */
+ svdot (u32, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svdot', which expects an SVE type rather than a scalar} } */
 svdot (u32, s8, u8); /* { dg-error {arguments 1 and 2 of 'svdot' must have the same signedness, but the values passed here have type 'svuint32_t' and 'svint8_t' respectively} } */
 svdot (u32, u8, 0);
 svdot (u32, u8, s8); /* { dg-error {arguments 1 and 3 of 'svdot' must have the same signedness, but the values passed here have type 'svuint32_t' and 'svint8_t' respectively} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_rotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_rotate_1.c
index 65e749ba7ac..9dd7eaf3ce7 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_rotate_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_qq_rotate_1.c
@@ -11,13 +11,13 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16,
 {
 svcdot (u32, u8, u8); /* { dg-error {too few arguments to function 'svcdot'} } */
 svcdot (u32, u8, u8, 0, 0); /* { dg-error {too many arguments to function 'svcdot'} } */
- svcdot (0, u8, u8, 0); /* { dg-error {passing 'int' to argument 1 of 'svcdot', which expects an SVE vector type} } */
+ svcdot (0, u8, u8, 0); /* { dg-error {passing 'int' to argument 1 of 'svcdot', which expects an SVE type rather than a scalar} } */
 svcdot (pg, u8, u8, 0); /* { dg-error {'svcdot' has no form that takes 'svbool_t' arguments} } */
 svcdot (s8, s8, s8, 0); /* { dg-error {'svcdot' has no form that takes 'svint8_t' arguments} } */
 svcdot (f32, s8, s8, 0); /* { dg-error {'svcdot' has no form that takes 'svfloat32_t' arguments} } */
 svcdot (s32, s8, s8, 0);
- svcdot (s32, 0, s8, 0); /* { dg-error {passing 'int' to argument 2 of 'svcdot', which expects an SVE vector type} } */
- svcdot (s32, s8, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svcdot', which expects an SVE vector type} } */
+ svcdot (s32, 0, s8, 0); /* { dg-error {passing 'int' to argument 2 of 'svcdot', which expects an SVE type rather than a scalar} } */
+ svcdot (s32, s8, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svcdot', which expects an SVE type rather than a scalar} } */
 svcdot (s32, s8, s8, 0);
 svcdot (s32, u8, s8, 0); /* { dg-error {arguments 1 and 2 of 'svcdot' must have the same signedness, but the values passed here have type 'svint32_t' and 'svuint8_t' respectively} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_rotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_rotate_1.c
index f340e3d1e75..bb67402897d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_rotate_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_rotate_1.c
@@ -10,9 +10,9 @@ f1 (svbool_t pg, svfloat32_t f32, svfloat64_t f64, svint32_t s32, int i)
 svcmla_x (f32, f32, f32, f32, 90); /* { dg-error {passing 'svfloat32_t' to argument 1 of 'svcmla_x', which expects 'svbool_t'} } */
 svcmla_x (pg, pg, pg, pg, 90); /* { dg-error {'svcmla_x' has no form that takes 'svbool_t' arguments} } */
 svcmla_x (pg, s32, s32, s32, 90); /* { dg-error {'svcmla_x' has no form that takes 'svint32_t' arguments} } */
- svcmla_x (pg, 1, f32, f32, 90); /* { dg-error {passing 'int' to argument 2 of 'svcmla_x', which expects an SVE vector type} } */
- svcmla_x (pg, f32, 1, f32, 90); /* { dg-error {passing 'int' to argument 3 of 'svcmla_x', which expects an SVE vector type} } */
- svcmla_x (pg, f32, f32, 1, 90); /* { dg-error {passing 'int' to argument 4 of 'svcmla_x', which expects an SVE vector type} } */
+ svcmla_x (pg, 1, f32, f32, 90); /* { dg-error {passing 'int' to argument 2 of 'svcmla_x', which expects an SVE type rather than a scalar} } */
+ svcmla_x (pg, f32, 1, f32, 90); /* { dg-error {passing 'int' to argument 3 of 'svcmla_x', which expects an SVE type rather than a scalar} } */
+ svcmla_x (pg, f32, f32, 1, 90); /* { dg-error {passing 'int' to argument 4 of 'svcmla_x', which expects an SVE type rather than a scalar} } */
 svcmla_x (pg, f32, f64, f32, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcmla_x', but previous arguments had type 'svfloat32_t'} } */
 svcmla_x (pg, f32, f32, f64, 90); /* { dg-error {passing 'svfloat64_t' to argument 4 of 'svcmla_x', but previous arguments had type 'svfloat32_t'} } */
 svcmla_x (pg, f32, f32, f32, s32); /* { dg-error {argument 5 of 'svcmla_x' must be an integer constant expression} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_shift_right_imm_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_shift_right_imm_1.c
index 28111375f26..cfe601631ea 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_shift_right_imm_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_shift_right_imm_1.c
@@ -12,10 +12,10 @@ f1 (svbool_t pg, svuint8_t u8, svint8_t s8, svint16_t s16,
 const int one = 1;
 pg = svsra (pg, pg, 1); /* { dg-error {'svsra' has no form that takes 'svbool_t' arguments} } */
 pg = svsra (pg, s8, 1); /* { dg-error {passing 'svint8_t' to argument 2 of 'svsra', but previous arguments had type 'svbool_t'} } */
- s8 = svsra (1, s8, 1); /* { dg-error {passing 'int' to argument 1 of 'svsra', which expects an SVE vector type} } */
+ s8 = svsra (1, s8, 1); /* { dg-error {passing 'int' to argument 1 of 'svsra', which expects an SVE type rather than a scalar} } */
 s8 = svsra (s8, u8, 1); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svsra', but previous arguments had type 'svint8_t'} } */
 s8 = svsra (s8, pg, 1); /* { dg-error {passing 'svbool_t' to argument 2 of 'svsra', but previous arguments had type 'svint8_t'} } */
- s8 = svsra (s8, 1, 1); /* { dg-error {passing 'int' to argument 2 of 'svsra', which expects an SVE vector type} } */
+ s8 = svsra (s8, 1, 1); /* { dg-error {passing 'int' to argument 2 of 'svsra', which expects an SVE type rather than a scalar} } */
 s8 = svsra (s8, s8, x); /* { dg-error {argument 3 of 'svsra' must be an integer constant expression} } */
 s8 = svsra (s8, s8, one); /* { dg-error {argument 3 of 'svsra' must be an integer constant expression} } */
 s8 = svsra (s8, s8, 0.4); /* { dg-error {passing 0 to argument 3 of 'svsra', which expects a value in the range \[1, 8\]} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uint_1.c
index 711b6a133be..5fb49770173 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uint_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uint_1.c
@@ -13,8 +13,8 @@ f1 (svbool_t pg, svuint8_t u8, svint8_t s8, svuint16_t u16, svint16_t s16,
 svtbx (pg, pg, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */
 svtbx (pg, pg, u8); /* { dg-error {'svtbx' has no form that takes 'svbool_t' arguments} } */
- svtbx (u8, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svtbx', which expects an SVE vector type} } */
- svtbx (u8, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svtbx', which expects an SVE vector type} } */
+ svtbx (u8, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svtbx', which expects an SVE type rather than a scalar} } */
+ svtbx (u8, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svtbx', which expects an SVE type rather than a scalar} } */
 svtbx (u8, s8, u8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svtbx', but previous arguments had type 'svuint8_t'} } */
 svtbx (u8, u8, u8);
 svtbx (u8, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */
@@ -29,7 +29,7 @@ f1 (svbool_t pg, svuint8_t u8, svint8_t s8, svuint16_t u16, svint16_t s16,
 svtbx (s8, s8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */
 svtbx (s8, s8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */
- svtbx (u16, 0, u16); /* { dg-error {passing 'int' to argument 2 of 'svtbx', which expects an SVE vector type} } */
+ svtbx (u16, 0, u16); /* { dg-error {passing 'int' to argument 2 of 'svtbx', which expects an SVE type rather than a scalar} } */
 svtbx (u16, u16, u8); /* { dg-error {arguments 1 and 3 of 'svtbx' must have the same element size, but the values passed here have type 'svuint16_t' and 'svuint8_t' respectively} } */
 svtbx (u16, u16, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */
 svtbx (u16, u16, u16);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_1.c
index f52fb39bf4d..d1aad1de1ce 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_1.c
@@ -23,15 +23,15 @@ f2 (svbool_t pg, svint8_t s8, svuint8_t u8, svuint32_t u32,
 {
 svusmmla (s32, u8); /* { dg-error {too few arguments to function 'svusmmla'} } */
 svusmmla (s32, u8, s8, u8); /* { dg-error {too many arguments to function 'svusmmla'} } */
- svusmmla (0, u8, s8); /* { dg-error {passing 'int' to argument 1 of 'svusmmla', which expects an SVE vector type} } */
+ svusmmla (0, u8, s8); /* { dg-error {passing 'int' to argument 1 of 'svusmmla', which expects an SVE type rather than a scalar} } */
 svusmmla (pg, u8, s8); /* { dg-error {'svusmmla' has no form that takes 'svbool_t' arguments} } */
 svusmmla (u8, u8, s8); /* { dg-error {'svusmmla' has no form that takes 'svuint8_t' arguments} } */
 svusmmla (f32, u8, s8); /* { dg-error {'svusmmla' has no form that takes 'svfloat32_t' arguments} } */
 svusmmla (s32, u8, s8);
- svusmmla (s32, 0, s8); /* { dg-error {passing 'int' to argument 2 of 'svusmmla', which expects an SVE vector type} } */
+ svusmmla (s32, 0, s8); /* { dg-error {passing 'int' to argument 2 of 'svusmmla', which expects an SVE type rather than a scalar} } */
 svusmmla (s32, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svusmmla', which expects a vector of signed integers} } */
 svusmmla (s32, s8, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svusmmla', which expects a vector of unsigned integers} } */
- svusmmla (s32, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svusmmla', which expects an SVE vector type} } */
+ svusmmla (s32, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svusmmla', which expects an SVE type rather than a scalar} } */
 svusmmla (s32, u8, s8);
 svusmmla (s32, u32, u32); /* { dg-error {passing 'svuint32_t' instead of the expected 'svuint8_t' to argument 2 of 'svusmmla', after passing 'svint32_t' to argument 1} } */
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_lane_1.c
index b40cfe9e8e0..0cc5c74979e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_lane_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_lane_1.c
@@ -10,14 +10,14 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16,
 {
 svusdot_lane (s32, u8, s8); /* { dg-error {too few arguments to function 'svusdot_lane'} } */
 svusdot_lane (s32, u8, s8, 0, 0); /* { dg-error {too many arguments to function 'svusdot_lane'} } */
- svusdot_lane (0, u8, s8, 0); /* { dg-error {passing 'int' to argument 1 of 'svusdot_lane', which expects an SVE vector type} } */
+ svusdot_lane (0, u8, s8, 0); /* { dg-error {passing 'int' to argument 1 of 'svusdot_lane', which expects an SVE type rather than a scalar} } */
 svusdot_lane (pg, u8, s8, 0); /* { dg-error {'svusdot_lane' has no form that takes 'svbool_t' arguments} } */
 svusdot_lane (u8, u8, s8, 0); /* { dg-error {'svusdot_lane' has no form that takes 'svuint8_t' arguments} } */
 svusdot_lane (f32, u8, s8, 0); /* { dg-error {'svusdot_lane' has no form that takes 'svfloat32_t' arguments} } */
 svusdot_lane (u32, u8, s8, 0); /* { dg-error {'svusdot_lane' has no form that takes 'svuint32_t' arguments} } */
 svusdot_lane (s32, u8, s8, 0);
- svusdot_lane (s32, 0, s8, 0); /* { dg-error {passing 'int' to argument 2 of 'svusdot_lane', which expects an SVE vector type} } */
- svusdot_lane (s32, u8, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svusdot_lane', which expects an SVE vector type} } */
+ svusdot_lane (s32, 0, s8, 0); /* { dg-error {passing 'int' to argument 2 of 'svusdot_lane', which expects an SVE type rather than a scalar} } */
+ svusdot_lane (s32, u8, 0, 0); /* { dg-error {passing 'int' to argument 3 of 'svusdot_lane', which expects an SVE type rather than a scalar} } */
 svusdot_lane (s32, u8, s8, 0);
 svusdot_lane (s32, s8, s8, 0); /* { dg-error {passing 'svint8_t' to argument 2 of 'svusdot_lane', which expects a vector of unsigned integers} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_opt_n_1.c
index 896b80390a2..f6585ae77c5 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_opt_n_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_opt_n_1.c
@@ -23,12 +23,12 @@ f2 (svbool_t pg, svint8_t s8, svuint8_t u8, svuint32_t u32,
 {
 svusdot (s32, u8); /* { dg-error {too few arguments to function 'svusdot'} } */
 svusdot (s32, u8, s8, u8); /* { dg-error {too many arguments to function 'svusdot'} } */
- svusdot (0, u8, s8); /* { dg-error {passing 'int' to argument 1 of 'svusdot', which expects an SVE vector type} } */
+ svusdot (0, u8, s8); /* { dg-error {passing 'int' to argument 1 of 'svusdot', which expects an SVE type rather than a scalar} } */
 svusdot (pg, u8, s8); /* { dg-error {'svusdot' has no form that takes 'svbool_t' arguments} } */
 svusdot (u8, u8, s8); /* { dg-error {'svusdot' has no form that takes 'svuint8_t' arguments} } */
 svusdot (f32, u8, s8); /* { dg-error {'svusdot' has no form that takes 'svfloat32_t' arguments} } */
 svusdot (s32, u8, s8);
- svusdot (s32, 0, s8); /* { dg-error {passing 'int' to argument 2 of 'svusdot', which expects an SVE vector type} } */
+ svusdot (s32, 0, s8); /* { dg-error {passing 'int' to argument 2 of 'svusdot', which expects an SVE type rather than a scalar} } */
 svusdot (s32, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svusdot', which expects a vector of signed integers} } */
 svusdot (s32, s8, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svusdot', which expects a vector of unsigned integers} } */
 svusdot (s32, u8, 0);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/tmad_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/tmad_1.c
index 8b98fc24d66..c2eda93e363 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/tmad_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/tmad_1.c
@@ -9,8 +9,8 @@ f1 (svbool_t pg, svfloat32_t f32, svfloat64_t f64, svint32_t s32, int i)
 svtmad (f32, f32, 0, 0); /* { dg-error {too many arguments to function 'svtmad'} } */
 svtmad (pg, pg, 0); /* { dg-error {'svtmad' has no form that takes 'svbool_t' arguments} } */
 svtmad (s32, s32, 0); /* { dg-error {'svtmad' has no form that takes 'svint32_t' arguments} } */
- svtmad (1, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svtmad', which expects an SVE vector type} } */
- svtmad (f32, 1, 0); /* { dg-error {passing 'int' to argument 2 of 'svtmad', which expects an SVE vector type} } */
+ svtmad (1, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svtmad', which expects an SVE type rather than a scalar} } */
+ svtmad (f32, 1, 0); /* { dg-error {passing 'int' to argument 2 of 'svtmad', which expects an SVE type rather than a scalar} } */
 svtmad (f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svtmad', but previous arguments had type 'svfloat32_t'} } */
 svtmad (f32, f32, s32); /* { dg-error {argument 3 of 'svtmad' must be an integer constant expression} } */
 svtmad (f32, f32, i); /* { dg-error {argument 3 of 'svtmad' must be an integer constant expression} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_1.c
index eef85a01d9e..8c865a0e67d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_1.c
@@ -7,7 +7,7 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32, svfloat32_t f32)
 {
 svabs_m (s32, pg); /* { dg-error {too few arguments to function 'svabs_m'} } */
 svabs_m (s32, pg, s32, s32); /* { dg-error {too many arguments to function 'svabs_m'} } */
- svabs_m (0, pg, s32); /* { dg-error {passing 'int' to argument 1 of 'svabs_m', which expects an SVE vector type} } */
+ svabs_m (0, pg, s32); /* { dg-error {passing 'int' to argument 1 of 'svabs_m', which expects an SVE type rather than a scalar} } */
 svabs_m (s32, s32, s32); /* { dg-error {passing 'svint32_t' to argument 2 of 'svabs_m', which expects 'svbool_t'} } */
 svabs_m (s32, 0, s32); /* { dg-error {passing 'int' to argument 2 of 'svabs_m', which expects 'svbool_t'} } */
 svabs_m (s32, pg, s32);
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_2.c
index e94673a66f5..bf93e21a40a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_2.c
@@ -9,7 +9,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8)
 svabs_x (pg, s8, s8); /* { dg-error {too many arguments to function 'svabs_x'} } */
 svabs_x (s8, s8); /* { dg-error {passing 'svint8_t' to argument 1 of 'svabs_x', which expects 'svbool_t'} } */
 svabs_x (pg, pg); /* { dg-error {'svabs_x' has no form that takes 'svbool_t' arguments} } */
- svabs_x (pg, 1); /* { dg-error {passing 'int' to argument 2 of 'svabs_x', which expects an SVE vector type} } */
+ svabs_x (pg, 1); /* { dg-error {passing 'int' to argument 2 of 'svabs_x', which expects an SVE type rather than a scalar} } */
 svabs_x (pg, s8);
 svabs_x (pg, u8); /* { dg-error {'svabs_x' has no form that takes 'svuint8_t' arguments} } */
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_1.c
index caa4e623d3f..f59ad590ba4 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_1.c
@@ -9,7 +9,7 @@ test (svbool_t pg, svint8_t s8, svuint8_t u8,
 svcvt_f64_x (pg); /* { dg-error {too few arguments to function 'svcvt_f64_x'} } */
 svcvt_f64_x (pg, s32, 0); /* {
dg-error {too many arguments to function 'svcvt_f64_x'} } */ svcvt_f64_x (s32, s32); /* { dg-error {passing 'svint32_t' to argument 1 of 'svcvt_f64_x', which expects 'svbool_t'} } */ - svcvt_f64_x (pg, 0); /* { dg-error {passing 'int' to argument 2 of 'svcvt_f64_x', which expects an SVE vector type} } */ + svcvt_f64_x (pg, 0); /* { dg-error {passing 'int' to argument 2 of 'svcvt_f64_x', which expects an SVE type rather than a scalar} } */ svcvt_f64_x (pg, s8); /* { dg-error {'svcvt_f64_x' has no form that takes 'svint8_t' arguments} } */ svcvt_f64_x (pg, s16); /* { dg-error {'svcvt_f64_x' has no form that takes 'svint16_t' arguments} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_2.c index ddbd93b697c..2649fd69467 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_2.c @@ -12,7 +12,7 @@ test (svbool_t pg, svint8_t s8, svuint8_t u8, svcvt_f64_m (0, pg, s32); /* { dg-error {passing 'int' to argument 1 of 'svcvt_f64_m', which expects 'svfloat64_t'} } */ svcvt_f64_m (pg, pg, s32); /* { dg-error {passing 'svbool_t' to argument 1 of 'svcvt_f64_m', which expects 'svfloat64_t'} } */ svcvt_f64_m (f64, s32, s32); /* { dg-error {passing 'svint32_t' to argument 2 of 'svcvt_f64_m', which expects 'svbool_t'} } */ - svcvt_f64_m (f64, pg, 0); /* { dg-error {passing 'int' to argument 3 of 'svcvt_f64_m', which expects an SVE vector type} } */ + svcvt_f64_m (f64, pg, 0); /* { dg-error {passing 'int' to argument 3 of 'svcvt_f64_m', which expects an SVE type rather than a scalar} } */ svcvt_f64_m (f64, pg, s8); /* { dg-error {'svcvt_f64_m' has no form that takes 'svint8_t' arguments} } */ svcvt_f64_m (f64, pg, s16); /* { dg-error {'svcvt_f64_m' has no form that takes 'svint16_t' arguments} } */ diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_narrowt_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_narrowt_1.c index 92c07b8c139..a5d56dec08b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_narrowt_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convert_narrowt_1.c @@ -14,7 +14,7 @@ test (svbool_t pg, svint8_t s8, svuint8_t u8, svcvtnt_f32_m (0, pg, f64); /* { dg-error {passing 'int' to argument 1 of 'svcvtnt_f32_m', which expects 'svfloat32_t'} } */ svcvtnt_f32_m (pg, pg, f64); /* { dg-error {passing 'svbool_t' to argument 1 of 'svcvtnt_f32_m', which expects 'svfloat32_t'} } */ svcvtnt_f32_m (f32, s32, f64); /* { dg-error {passing 'svint32_t' to argument 2 of 'svcvtnt_f32_m', which expects 'svbool_t'} } */ - svcvtnt_f32_m (f32, pg, 0); /* { dg-error {passing 'int' to argument 3 of 'svcvtnt_f32_m', which expects an SVE vector type} } */ + svcvtnt_f32_m (f32, pg, 0); /* { dg-error {passing 'int' to argument 3 of 'svcvtnt_f32_m', which expects an SVE type rather than a scalar} } */ svcvtnt_f32_m (f32, pg, s8); /* { dg-error {'svcvtnt_f32_m' has no form that takes 'svint8_t' arguments} } */ svcvtnt_f32_m (f32, pg, s16); /* { dg-error {'svcvtnt_f32_m' has no form that takes 'svint16_t' arguments} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowb_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowb_1.c index c03d644ed4d..c2465e3e2d6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowb_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowb_1.c @@ -23,5 +23,5 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svqxtnb (u64); svqxtnb (s64); svqxtnb (f32); /* { dg-error {'svqxtnb' has no form that takes 'svfloat32_t' arguments} } */ - svqxtnb (1); /* { dg-error {passing 'int' to argument 1 of 'svqxtnb', which expects an SVE vector type} } */ + svqxtnb (1); /* { dg-error {passing 
'int' to argument 1 of 'svqxtnb', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowb_to_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowb_to_uint_1.c index c3e21038071..60051f80c82 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowb_to_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowb_to_uint_1.c @@ -23,5 +23,5 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svqxtunb (u64); /* { dg-error {'svqxtunb' has no form that takes 'svuint64_t' arguments} } */ svqxtunb (s64); svqxtunb (f32); /* { dg-error {'svqxtunb' has no form that takes 'svfloat32_t' arguments} } */ - svqxtunb (1); /* { dg-error {passing 'int' to argument 1 of 'svqxtunb', which expects an SVE vector type} } */ + svqxtunb (1); /* { dg-error {passing 'int' to argument 1 of 'svqxtunb', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowt_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowt_1.c index 4ed179cb3b4..a0612dcb7c2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowt_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowt_1.c @@ -26,6 +26,6 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svqxtnt (u32, u64); svqxtnt (s32, s64); svqxtnt (f16, f32); /* { dg-error {'svqxtnt' has no form that takes 'svfloat32_t' arguments} } */ - svqxtnt (1, u16); /* { dg-error {passing 'int' to argument 1 of 'svqxtnt', which expects an SVE vector type} } */ - svqxtnt (u8, 1); /* { dg-error {passing 'int' to argument 2 of 'svqxtnt', which expects an SVE vector type} } */ + svqxtnt (1, u16); /* { dg-error {passing 'int' to argument 1 of 'svqxtnt', which expects an SVE type rather than a scalar} } */ + svqxtnt (u8, 1); /* { dg-error {passing 'int' to argument 2 of 'svqxtnt', which expects an SVE type rather than a 
scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowt_to_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowt_to_uint_1.c index acaa546eee9..8e5fa5b3dc5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowt_to_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_narrowt_to_uint_1.c @@ -26,6 +26,6 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svqxtunt (u32, u64); /* { dg-error {'svqxtunt' has no form that takes 'svuint64_t' arguments} } */ svqxtunt (u32, s64); svqxtunt (u16, f32); /* { dg-error {'svqxtunt' has no form that takes 'svfloat32_t' arguments} } */ - svqxtunt (1, u16); /* { dg-error {passing 'int' to argument 1 of 'svqxtunt', which expects an SVE vector type} } */ - svqxtunt (u8, 1); /* { dg-error {passing 'int' to argument 2 of 'svqxtunt', which expects an SVE vector type} } */ + svqxtunt (1, u16); /* { dg-error {passing 'int' to argument 1 of 'svqxtunt', which expects an SVE type rather than a scalar} } */ + svqxtunt (u8, 1); /* { dg-error {passing 'int' to argument 2 of 'svqxtunt', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_int_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_int_1.c index 517d11ff0f8..e2e172d2db4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_int_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_int_1.c @@ -10,7 +10,7 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32, svfloat32_t f32, { svlogb_m (s32, pg); /* { dg-error {too few arguments to function 'svlogb_m'} } */ svlogb_m (s32, pg, f32, s32); /* { dg-error {too many arguments to function 'svlogb_m'} } */ - svlogb_m (0, pg, f32); /* { dg-error {passing 'int' to argument 1 of 'svlogb_m', which expects an SVE vector type} } */ + svlogb_m (0, pg, f32); /* { dg-error {passing 'int' to argument 1 of 'svlogb_m', which expects an 
SVE type rather than a scalar} } */ svlogb_m (s32, u32, f32); /* { dg-error {passing 'svuint32_t' to argument 2 of 'svlogb_m', which expects 'svbool_t'} } */ svlogb_m (s32, 0, f32); /* { dg-error {passing 'int' to argument 2 of 'svlogb_m', which expects 'svbool_t'} } */ svlogb_m (s32, pg, s32); /* { dg-error {'svlogb_m' has no form that takes 'svint32_t' arguments} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_1.c index 888b52513ef..b3cf0b9f5b4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_1.c @@ -8,7 +8,7 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32, svfloat32_t f32, { svclz_m (u32, pg); /* { dg-error {too few arguments to function 'svclz_m'} } */ svclz_m (u32, pg, s32, s32); /* { dg-error {too many arguments to function 'svclz_m'} } */ - svclz_m (0, pg, f32); /* { dg-error {passing 'int' to argument 1 of 'svclz_m', which expects an SVE vector type} } */ + svclz_m (0, pg, f32); /* { dg-error {passing 'int' to argument 1 of 'svclz_m', which expects an SVE type rather than a scalar} } */ svclz_m (u32, u32, f32); /* { dg-error {passing 'svuint32_t' to argument 2 of 'svclz_m', which expects 'svbool_t'} } */ svclz_m (u32, 0, f32); /* { dg-error {passing 'int' to argument 2 of 'svclz_m', which expects 'svbool_t'} } */ svclz_m (u32, pg, s32); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_2.c index 233e847e903..da02d12fba1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_2.c @@ -9,7 +9,7 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32, svfloat32_t f32, { svclz_m (u32, pg); /* { dg-error {too few arguments to function 'svclz_m'} } */ svclz_m 
(u32, pg, s32, s32); /* { dg-error {too many arguments to function 'svclz_m'} } */ - svclz_m (0, pg, f32); /* { dg-error {passing 'int' to argument 1 of 'svclz_m', which expects an SVE vector type} } */ + svclz_m (0, pg, f32); /* { dg-error {passing 'int' to argument 1 of 'svclz_m', which expects an SVE type rather than a scalar} } */ svclz_m (u32, u32, f32); /* { dg-error {passing 'svuint32_t' to argument 2 of 'svclz_m', which expects 'svbool_t'} } */ svclz_m (u32, 0, f32); /* { dg-error {passing 'int' to argument 2 of 'svclz_m', which expects 'svbool_t'} } */ svclz_m (u32, pg, s32); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_3.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_3.c index da57b07ea84..858a2a5e03e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_3.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_to_uint_3.c @@ -9,6 +9,6 @@ f1 (svbool_t pg, svuint8_t u8) svcnt_x (pg, u8, u8); /* { dg-error {too many arguments to function 'svcnt_x'} } */ svcnt_x (u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svcnt_x', which expects 'svbool_t'} } */ svcnt_x (pg, pg); /* { dg-error {'svcnt_x' has no form that takes 'svbool_t' arguments} } */ - svcnt_x (pg, 1); /* { dg-error {passing 'int' to argument 2 of 'svcnt_x', which expects an SVE vector type} } */ + svcnt_x (pg, 1); /* { dg-error {passing 'int' to argument 2 of 'svcnt_x', which expects an SVE type rather than a scalar} } */ svcnt_x (pg, u8); } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_uint_1.c index 9c8acdf2d11..e3275a8ced2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_uint_1.c @@ -8,7 +8,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, { svexpa (); /* { dg-error {too few arguments to function 'svexpa'} } 
*/ svexpa (u16, u16); /* { dg-error {too many arguments to function 'svexpa'} } */ - svexpa (1); /* { dg-error {passing 'int' to argument 1 of 'svexpa', which expects an SVE vector type} } */ + svexpa (1); /* { dg-error {passing 'int' to argument 1 of 'svexpa', which expects an SVE type rather than a scalar} } */ svexpa (pg); /* { dg-error {passing 'svbool_t' to argument 1 of 'svexpa', which expects a vector of unsigned integers} } */ svexpa (s8); /* { dg-error {passing 'svint8_t' to argument 1 of 'svexpa', which expects a vector of unsigned integers} } */ svexpa (s16); /* { dg-error {passing 'svint16_t' to argument 1 of 'svexpa', which expects a vector of unsigned integers} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_widen_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_widen_1.c index 95a97a72efd..a194bd6ab4b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_widen_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_widen_1.c @@ -8,8 +8,8 @@ test (svbool_t pg, svint8_t s8, svuint8_t u8, { svunpklo (); /* { dg-error {too few arguments to function 'svunpklo'} } */ svunpklo (pg, s8); /* { dg-error {too many arguments to function 'svunpklo'} } */ - svunpklo (i); /* { dg-error {passing 'int' to argument 1 of 'svunpklo', which expects an SVE vector type} } */ - svunpklo (f); /* { dg-error {passing 'float' to argument 1 of 'svunpklo', which expects an SVE vector type} } */ + svunpklo (i); /* { dg-error {passing 'int' to argument 1 of 'svunpklo', which expects an SVE type rather than a scalar} } */ + svunpklo (f); /* { dg-error {passing 'float' to argument 1 of 'svunpklo', which expects an SVE type rather than a scalar} } */ svunpklo (pg); svunpklo (s8); svunpklo (s16); From patchwork Tue Dec 5 10:13:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872032 Return-Path: 
X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxHt2jL2z1ySd for ; Tue, 5 Dec 2023 21:15:38 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4F646385E451 for ; Tue, 5 Dec 2023 10:15:22 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id E040B385843A for ; Tue, 5 Dec 2023 10:13:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E040B385843A Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E040B385843A Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771224; cv=none; b=g4yrcuyVp4KE2QLlpTemwmaFGWdTeHGqHY04cKmGGkgYXJ68gxR+6lYCWLuzt+8SMggFn9QRKprnIVPdPYC8bcf3y4R044CMXZgo8jNuBLWZHGr1MXjUaWCcxvsXxNueFwARnH3x0IOnSR+GnONNhDSfFnohC4aeykrOL1GxzmI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771224; c=relaxed/simple; bh=d5TxKS8vttgiBj1T7kcnEW29XeEp8NzsQ25MWbDS9z8=; h=From:To:Subject:Date:Message-Id:MIME-Version; 
b=XEriCVJyZ13vTzxHQwGhB/Q7+ZfvXenVgkvjAAbaN+92y2G57Yu8cKoLfwTEbGCYGZ6LQtlDyJDf1gojunVrEeCU80ex8fWp8p5cxqqSVafvZ2AWodOtmnAekQ1PJsuoDFow9pxx3OP4utv/Ni00Oz0rz9mDI5alF3YyBx6REiE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 116A01477; Tue, 5 Dec 2023 02:14:25 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 02B5D3F5A1; Tue, 5 Dec 2023 02:13:37 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 07/25] aarch64: Replace vague "previous arguments" message Date: Tue, 5 Dec 2023 10:13:05 +0000 Message-Id: <20231205101323.1914247-8-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org If an SVE ACLE intrinsic requires two arguments to have the same type, the C resolver would report mismatches as "argument N has type T2, but previous arguments had type T1". This patch makes the message say which argument had type T1. This is needed to give decent error messages for some SME cases. 
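As a rough illustration of the new wording, here is a minimal standalone sketch. It is not the GCC implementation: mismatch_message and its string parameters are hypothetical stand-ins for the error_at call in function_resolver::require_matching_vector_type. The only things taken from the patch are the message text and the convention that argument numbers are zero-based internally and printed one-based.

```cpp
#include <string>

// Hypothetical helper, for illustration only; GCC itself builds this
// diagnostic with error_at and %qT/%qE format specifiers.
// ARGNO is the mismatching argument and FIRST_ARGNO is the earlier
// argument it is compared against.  Both are zero-based, as in the
// patch, and are printed one-based.
std::string
mismatch_message (const std::string &fn_name,
                  unsigned int argno, unsigned int first_argno,
                  const std::string &new_type,
                  const std::string &first_type)
{
  return ("passing '" + new_type + "' to argument "
          + std::to_string (argno + 1) + " of '" + fn_name
          + "', but argument " + std::to_string (first_argno + 1)
          + " had type '" + first_type + "'");
}
```

Under these assumptions, mismatch_message ("svzip1", 1, 0, "svint16_t", "svuint8_t") yields the text that the updated binary_1.c test expects for svzip1 (u8, s16).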
gcc/ * config/aarch64/aarch64-sve-builtins.h (function_resolver::require_matching_vector_type): Add a parameter that specifies the number of the earlier argument that is being matched against. * config/aarch64/aarch64-sve-builtins.cc (function_resolver::require_matching_vector_type): Likewise. (require_derived_vector_type): Update calls accordingly. (function_resolver::resolve_unary): Likewise. (function_resolver::resolve_uniform): Likewise. (function_resolver::resolve_uniform_opt_n): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (binary_long_lane_def::resolve): Likewise. (clast_def::resolve, ternary_uint_def::resolve): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/acle/general-c/*: Replace "but previous arguments had" with "but argument N had". --- .../aarch64/aarch64-sve-builtins-shapes.cc | 6 ++-- gcc/config/aarch64/aarch64-sve-builtins.cc | 17 +++++------ gcc/config/aarch64/aarch64-sve-builtins.h | 3 +- .../aarch64/sve/acle/general-c/binary_1.c | 6 ++-- .../sve/acle/general-c/binary_lane_1.c | 2 +- .../sve/acle/general-c/binary_long_lane_1.c | 2 +- .../sve/acle/general-c/binary_long_opt_n_1.c | 8 +++--- .../acle/general-c/binary_narrowb_opt_n_1.c | 8 +++--- .../acle/general-c/binary_narrowt_opt_n_1.c | 8 +++--- .../sve/acle/general-c/binary_opt_n_2.c | 14 +++++----- .../sve/acle/general-c/binary_opt_n_3.c | 16 +++++------ .../sve/acle/general-c/binary_rotate_1.c | 2 +- .../sve/acle/general-c/binary_to_uint_1.c | 4 +-- .../aarch64/sve/acle/general-c/clast_1.c | 2 +- .../aarch64/sve/acle/general-c/compare_1.c | 14 +++++----- .../sve/acle/general-c/compare_opt_n_1.c | 14 +++++----- .../aarch64/sve/acle/general-c/create_1.c | 6 ++-- .../aarch64/sve/acle/general-c/create_3.c | 6 ++-- .../aarch64/sve/acle/general-c/create_5.c | 6 ++-- .../aarch64/sve/acle/general-c/mmla_1.c | 14 +++++----- .../sve/acle/general-c/ternary_lane_1.c | 4 +-- .../acle/general-c/ternary_lane_rotate_1.c | 4 +-- .../sve/acle/general-c/ternary_opt_n_1.c | 28 +++++++++---------- 
.../sve/acle/general-c/ternary_rotate_1.c | 4 +-- .../general-c/ternary_shift_right_imm_1.c | 6 ++-- .../sve/acle/general-c/ternary_uint_1.c | 6 ++-- .../aarch64/sve/acle/general-c/tmad_1.c | 2 +- .../aarch64/sve/acle/general-c/unary_1.c | 8 +++--- .../aarch64/sve/acle/general-c/undeclared_2.c | 2 +- 29 files changed, 112 insertions(+), 110 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index dc255fc59f2..7ab94a9cb31 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -1153,7 +1153,7 @@ struct binary_long_lane_def : public overloaded_base<0> type_suffix_index type, result_type; if (!r.check_gp_argument (3, i, nargs) || (type = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES - || !r.require_matching_vector_type (i + 1, type) + || !r.require_matching_vector_type (i + 1, i, type) || !r.require_integer_immediate (i + 2) || (result_type = long_type_suffix (r, type)) == NUM_TYPE_SUFFIXES) return error_mark_node; @@ -1608,7 +1608,7 @@ struct clast_def : public overloaded_base<0> { type_suffix_index type; if ((type = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES - || !r.require_matching_vector_type (i + 1, type)) + || !r.require_matching_vector_type (i + 1, i, type)) return error_mark_node; return r.resolve_to (MODE_none, type); } @@ -3108,7 +3108,7 @@ struct ternary_uint_def : public overloaded_base<0> type_suffix_index type; if (!r.check_gp_argument (3, i, nargs) || (type = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES - || !r.require_matching_vector_type (i + 1, type) + || !r.require_matching_vector_type (i + 1, i, type) || !r.require_derived_vector_type (i + 2, i, type, TYPE_unsigned)) return error_mark_node; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 1ecd8fd5db9..4203ff4fc41 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ 
b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -1561,11 +1561,12 @@ function_resolver::require_vector_type (unsigned int argno, return true; } -/* Like require_vector_type, but TYPE is inferred from previous arguments +/* Like require_vector_type, but TYPE is inferred from argument FIRST_ARGNO rather than being a fixed part of the function signature. This changes the nature of the error messages. */ bool function_resolver::require_matching_vector_type (unsigned int argno, + unsigned int first_argno, type_suffix_index type) { type_suffix_index new_type = infer_vector_type (argno); @@ -1575,9 +1576,9 @@ function_resolver::require_matching_vector_type (unsigned int argno, if (type != new_type) { error_at (location, "passing %qT to argument %d of %qE, but" - " previous arguments had type %qT", + " argument %d had type %qT", get_vector_type (new_type), argno + 1, fndecl, - get_vector_type (type)); + first_argno + 1, get_vector_type (type)); return false; } return true; @@ -1626,7 +1627,7 @@ require_derived_vector_type (unsigned int argno, { /* There's no need to resolve this case out of order. */ gcc_assert (argno > first_argno); - return require_matching_vector_type (argno, first_type); + return require_matching_vector_type (argno, first_argno, first_type); } /* Use FIRST_TYPE to get the expected type class and element size. */ @@ -2314,7 +2315,7 @@ function_resolver::resolve_unary (type_class_index merge_tclass, so we can use normal left-to-right resolution. 
*/ if ((type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES || !require_vector_type (1, VECTOR_TYPE_svbool_t) - || !require_matching_vector_type (2, type)) + || !require_matching_vector_type (2, 0, type)) return error_mark_node; } else @@ -2359,9 +2360,9 @@ function_resolver::resolve_uniform (unsigned int nops, unsigned int nimm) || (type = infer_vector_type (i)) == NUM_TYPE_SUFFIXES) return error_mark_node; - i += 1; + unsigned int first_arg = i++; for (; i < nargs - nimm; ++i) - if (!require_matching_vector_type (i, type)) + if (!require_matching_vector_type (i, first_arg, type)) return error_mark_node; for (; i < nargs; ++i) @@ -2390,7 +2391,7 @@ function_resolver::resolve_uniform_opt_n (unsigned int nops) unsigned int first_arg = i++; for (; i < nargs - 1; ++i) - if (!require_matching_vector_type (i, type)) + if (!require_matching_vector_type (i, first_arg, type)) return error_mark_node; return finish_opt_n_resolution (i, first_arg, type); diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index bba3a87f7bc..f959b6f6ab3 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -476,7 +476,8 @@ public: bool require_vector_or_scalar_type (unsigned int); bool require_vector_type (unsigned int, vector_type_index); - bool require_matching_vector_type (unsigned int, type_suffix_index); + bool require_matching_vector_type (unsigned int, unsigned int, + type_suffix_index); bool require_derived_vector_type (unsigned int, unsigned int, type_suffix_index, type_class_index = SAME_TYPE_CLASS, diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_1.c index 4343146de05..2e919d287ad 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_1.c @@ -7,8 +7,8 @@ f1 (svbool_t pg, svuint8_t u8, svint16_t s16) { svzip1 (pg); 
/* { dg-error {too few arguments to function 'svzip1'} } */ svzip1 (pg, u8, u8); /* { dg-error {too many arguments to function 'svzip1'} } */ - svzip1 (pg, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svzip1', but previous arguments had type 'svbool_t'} } */ - svzip1 (u8, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svzip1', but previous arguments had type 'svuint8_t'} } */ - svzip1 (u8, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svzip1', but previous arguments had type 'svuint8_t'} } */ + svzip1 (pg, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svzip1', but argument 1 had type 'svbool_t'} } */ + svzip1 (u8, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svzip1', but argument 1 had type 'svuint8_t'} } */ + svzip1 (u8, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svzip1', but argument 1 had type 'svuint8_t'} } */ svzip1 (u8, 0); /* { dg-error {passing 'int' to argument 2 of 'svzip1', which expects an SVE type rather than a scalar} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_lane_1.c index 10b6b7e81e7..81533b25daf 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_lane_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_lane_1.c @@ -12,7 +12,7 @@ f1 (svbool_t pg, svfloat16_t f16, svfloat32_t f32, svfloat64_t f64, svmul_lane (s32, s32, 0); /* { dg-error {ACLE function 'svmul_lane_s32' requires ISA extension 'sve2'} "" { xfail aarch64_sve2 } } */ svmul_lane (1, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmul_lane', which expects an SVE type rather than a scalar} } */ svmul_lane (f32, 1, 0); /* { dg-error {passing 'int' to argument 2 of 'svmul_lane', which expects an SVE type rather than a scalar} } */ - svmul_lane (f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svmul_lane', but previous arguments had type 
'svfloat32_t'} } */ + svmul_lane (f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svmul_lane', but argument 1 had type 'svfloat32_t'} } */ svmul_lane (f32, f32, s32); /* { dg-error {argument 3 of 'svmul_lane' must be an integer constant expression} } */ svmul_lane (f32, f32, i); /* { dg-error {argument 3 of 'svmul_lane' must be an integer constant expression} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_lane_1.c index 805863f76bc..25b620877de 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_lane_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_lane_1.c @@ -21,7 +21,7 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svuint16_t u16, svmullb_lane (f64, f64, 0); /* { dg-error {'svmullb_lane' has no form that takes 'svfloat64_t' arguments} } */ svmullb_lane (1, u32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmullb_lane', which expects an SVE type rather than a scalar} } */ svmullb_lane (u32, 1, 0); /* { dg-error {passing 'int' to argument 2 of 'svmullb_lane', which expects an SVE type rather than a scalar} } */ - svmullb_lane (u32, s32, 0); /* { dg-error {passing 'svint32_t' to argument 2 of 'svmullb_lane', but previous arguments had type 'svuint32_t'} } */ + svmullb_lane (u32, s32, 0); /* { dg-error {passing 'svint32_t' to argument 2 of 'svmullb_lane', but argument 1 had type 'svuint32_t'} } */ svmullb_lane (u32, u32, s32); /* { dg-error {argument 3 of 'svmullb_lane' must be an integer constant expression} } */ svmullb_lane (u32, u32, i); /* { dg-error {argument 3 of 'svmullb_lane' must be an integer constant expression} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_opt_n_1.c index ee704eeaefb..1f513dde933 100644 --- 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_long_opt_n_1.c @@ -24,10 +24,10 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svaddlb (s64, s64); /* { dg-error {'svaddlb' has no form that takes 'svint64_t' arguments} } */ svaddlb (f16, f16); /* { dg-error {'svaddlb' has no form that takes 'svfloat16_t' arguments} } */ svaddlb (1, u8); /* { dg-error {passing 'int' to argument 1 of 'svaddlb', which expects an SVE type rather than a scalar} } */ - svaddlb (u8, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svaddlb', but previous arguments had type 'svuint8_t'} } */ - svaddlb (u8, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svaddlb', but previous arguments had type 'svuint8_t'} } */ - svaddlb (u8, u16); /* { dg-error {passing 'svuint16_t' to argument 2 of 'svaddlb', but previous arguments had type 'svuint8_t'} } */ - svaddlb (u16, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svaddlb', but previous arguments had type 'svuint16_t'} } */ + svaddlb (u8, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svaddlb', but argument 1 had type 'svuint8_t'} } */ + svaddlb (u8, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svaddlb', but argument 1 had type 'svuint8_t'} } */ + svaddlb (u8, u16); /* { dg-error {passing 'svuint16_t' to argument 2 of 'svaddlb', but argument 1 had type 'svuint8_t'} } */ + svaddlb (u16, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svaddlb', but argument 1 had type 'svuint16_t'} } */ svaddlb (u8, 0); svaddlb (u16, 0); svaddlb (u32, 0); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowb_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowb_opt_n_1.c index 8ca549ba93f..4a29b5c4395 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowb_opt_n_1.c +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowb_opt_n_1.c @@ -24,10 +24,10 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svaddhnb (s64, s64); svaddhnb (f32, f32); /* { dg-error {'svaddhnb' has no form that takes 'svfloat32_t' arguments} } */ svaddhnb (1, u16); /* { dg-error {passing 'int' to argument 1 of 'svaddhnb', which expects an SVE type rather than a scalar} } */ - svaddhnb (u16, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svaddhnb', but previous arguments had type 'svuint16_t'} } */ - svaddhnb (u16, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svaddhnb', but previous arguments had type 'svuint16_t'} } */ - svaddhnb (u16, u32); /* { dg-error {passing 'svuint32_t' to argument 2 of 'svaddhnb', but previous arguments had type 'svuint16_t'} } */ - svaddhnb (u16, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svaddhnb', but previous arguments had type 'svuint16_t'} } */ + svaddhnb (u16, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svaddhnb', but argument 1 had type 'svuint16_t'} } */ + svaddhnb (u16, s16); /* { dg-error {passing 'svint16_t' to argument 2 of 'svaddhnb', but argument 1 had type 'svuint16_t'} } */ + svaddhnb (u16, u32); /* { dg-error {passing 'svuint32_t' to argument 2 of 'svaddhnb', but argument 1 had type 'svuint16_t'} } */ + svaddhnb (u16, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svaddhnb', but argument 1 had type 'svuint16_t'} } */ svaddhnb (u8, 0); /* { dg-error {'svaddhnb' has no form that takes 'svuint8_t' arguments} } */ svaddhnb (u16, 0); svaddhnb (u32, 0); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowt_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowt_opt_n_1.c index 2b537965bc6..4a442616eeb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowt_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_narrowt_opt_n_1.c @@ -28,10 +28,10 @@ f1 (svbool_t 
pg, svint8_t s8, svuint8_t u8, svaddhnt (f16, f32, f32); /* { dg-error {'svaddhnt' has no form that takes 'svfloat32_t' arguments} } */ svaddhnt (1, u16, u16); /* { dg-error {passing 'int' to argument 1 of 'svaddhnt', which expects an SVE type rather than a scalar} } */ svaddhnt (u8, 1, u16); /* { dg-error {passing 'int' to argument 2 of 'svaddhnt', which expects an SVE type rather than a scalar} } */ - svaddhnt (u8, u16, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svaddhnt', but previous arguments had type 'svuint16_t'} } */ - svaddhnt (u8, u16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svaddhnt', but previous arguments had type 'svuint16_t'} } */ - svaddhnt (u8, u16, u32); /* { dg-error {passing 'svuint32_t' to argument 3 of 'svaddhnt', but previous arguments had type 'svuint16_t'} } */ - svaddhnt (u8, u16, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svaddhnt', but previous arguments had type 'svuint16_t'} } */ + svaddhnt (u8, u16, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svaddhnt', but argument 2 had type 'svuint16_t'} } */ + svaddhnt (u8, u16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svaddhnt', but argument 2 had type 'svuint16_t'} } */ + svaddhnt (u8, u16, u32); /* { dg-error {passing 'svuint32_t' to argument 3 of 'svaddhnt', but argument 2 had type 'svuint16_t'} } */ + svaddhnt (u8, u16, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svaddhnt', but argument 2 had type 'svuint16_t'} } */ svaddhnt (u8, u8, 0); /* { dg-error {'svaddhnt' has no form that takes 'svuint8_t' arguments} } */ svaddhnt (u16, u16, 0); /* { dg-error {passing 'svuint16_t' instead of the expected 'svuint8_t' to argument 1 of 'svaddhnt', after passing 'svuint16_t' to argument 2} } */ svaddhnt (s8, u16, 0); /* { dg-error {arguments 1 and 2 of 'svaddhnt' must have the same signedness, but the values passed here have type 'svint8_t' and 'svuint16_t' respectively} } */ diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_2.c index a151f90d170..40447cf83f5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_2.c @@ -11,16 +11,16 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svadd_x (u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svadd_x', which expects 'svbool_t'} } */ svadd_x (pg, pg, pg); /* { dg-error {'svadd_x' has no form that takes 'svbool_t' arguments} } */ svadd_x (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svadd_x', which expects an SVE type rather than a scalar} } */ - svadd_x (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svadd_x', but previous arguments had type 'svuint8_t'} } */ + svadd_x (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svadd_x', but argument 2 had type 'svuint8_t'} } */ svadd_x (pg, u8, u8); - svadd_x (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svadd_x', but previous arguments had type 'svuint8_t'} } */ - svadd_x (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svadd_x', but previous arguments had type 'svuint8_t'} } */ - svadd_x (pg, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svadd_x', but previous arguments had type 'svuint8_t'} } */ - svadd_x (pg, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svadd_x', but previous arguments had type 'svuint8_t'} } */ + svadd_x (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svadd_x', but argument 2 had type 'svuint8_t'} } */ + svadd_x (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svadd_x', but argument 2 had type 'svuint8_t'} } */ + svadd_x (pg, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svadd_x', but argument 2 had type 'svuint8_t'} } */ + svadd_x (pg, u8, pg); /* { 
dg-error {passing 'svbool_t' to argument 3 of 'svadd_x', but argument 2 had type 'svuint8_t'} } */ svadd_x (pg, u8, 0); - svadd_x (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svadd_x', but previous arguments had type 'svfloat16_t'} } */ - svadd_x (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svadd_x', but previous arguments had type 'svfloat16_t'} } */ + svadd_x (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svadd_x', but argument 2 had type 'svfloat16_t'} } */ + svadd_x (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svadd_x', but argument 2 had type 'svfloat16_t'} } */ svadd_x (pg, f16, f16); svadd_x (pg, f16, 1); } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_3.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_3.c index 70ec9c58518..94e20bc919f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_3.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_opt_n_3.c @@ -11,19 +11,19 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svand_z (u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svand_z', which expects 'svbool_t'} } */ svand_z (pg, pg, pg); svand_z (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svand_z', which expects an SVE type rather than a scalar} } */ - svand_z (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svand_z', but previous arguments had type 'svuint8_t'} } */ + svand_z (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svand_z', but argument 2 had type 'svuint8_t'} } */ svand_z (pg, u8, u8); - svand_z (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svand_z', but previous arguments had type 'svuint8_t'} } */ - svand_z (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svand_z', but previous arguments had type 'svuint8_t'} } */ - svand_z (pg, u8, f16); /* { dg-error 
{passing 'svfloat16_t' to argument 3 of 'svand_z', but previous arguments had type 'svuint8_t'} } */ - svand_z (pg, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svand_z', but previous arguments had type 'svuint8_t'} } */ + svand_z (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svand_z', but argument 2 had type 'svuint8_t'} } */ + svand_z (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svand_z', but argument 2 had type 'svuint8_t'} } */ + svand_z (pg, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svand_z', but argument 2 had type 'svuint8_t'} } */ + svand_z (pg, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svand_z', but argument 2 had type 'svuint8_t'} } */ svand_z (pg, u8, 0); - svand_z (pg, pg, u8); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svand_z', but previous arguments had type 'svbool_t'} } */ + svand_z (pg, pg, u8); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svand_z', but argument 2 had type 'svbool_t'} } */ svand_z (pg, pg, 0); /* { dg-error {passing 'int' to argument 3 of 'svand_z', but its 'svbool_t' form does not accept scalars} } */ - svand_z (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svand_z', but previous arguments had type 'svfloat16_t'} } */ - svand_z (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svand_z', but previous arguments had type 'svfloat16_t'} } */ + svand_z (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svand_z', but argument 2 had type 'svfloat16_t'} } */ + svand_z (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svand_z', but argument 2 had type 'svfloat16_t'} } */ svand_z (pg, f16, f16); /* { dg-error {'svand_z' has no form that takes 'svfloat16_t' arguments} } */ svand_z (pg, f16, 1); /* { dg-error {'svand_z' has no form that takes 'svfloat16_t' arguments} } */ } diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_rotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_rotate_1.c index 7669e4a0261..8939ce25839 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_rotate_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_rotate_1.c @@ -12,7 +12,7 @@ f1 (svbool_t pg, svfloat32_t f32, svfloat64_t f64, svint32_t s32, int i) svcadd_x (pg, s32, s32, 90); /* { dg-error {'svcadd_x' has no form that takes 'svint32_t' arguments} } */ svcadd_x (pg, 1, f32, 90); /* { dg-error {passing 'int' to argument 2 of 'svcadd_x', which expects an SVE type rather than a scalar} } */ svcadd_x (pg, f32, 1, 90); /* { dg-error {passing 'int' to argument 3 of 'svcadd_x', which expects an SVE type rather than a scalar} } */ - svcadd_x (pg, f32, f64, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcadd_x', but previous arguments had type 'svfloat32_t'} } */ + svcadd_x (pg, f32, f64, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcadd_x', but argument 2 had type 'svfloat32_t'} } */ svcadd_x (pg, f32, f32, s32); /* { dg-error {argument 4 of 'svcadd_x' must be an integer constant expression} } */ svcadd_x (pg, f32, f32, i); /* { dg-error {argument 4 of 'svcadd_x' must be an integer constant expression} } */ svcadd_x (pg, f32, f32, -90); /* { dg-error {passing -90 to argument 4 of 'svcadd_x', which expects either 90 or 270} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_to_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_to_uint_1.c index 154662487e3..2c3fe5df178 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_to_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_to_uint_1.c @@ -12,8 +12,8 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32) svhistcnt_z (0, s32, s32); /* { dg-error {passing 'int' to argument 1 of 'svhistcnt_z', which expects 'svbool_t'} } */ svhistcnt_z (s32, 
s32, s32); /* { dg-error {passing 'svint32_t' to argument 1 of 'svhistcnt_z', which expects 'svbool_t'} } */ svhistcnt_z (pg, 0, s32); /* { dg-error {passing 'int' to argument 2 of 'svhistcnt_z', which expects an SVE type rather than a scalar} } */ - svhistcnt_z (pg, pg, s32); /* { dg-error {passing 'svint32_t' to argument 3 of 'svhistcnt_z', but previous arguments had type 'svbool_t'} } */ - svhistcnt_z (pg, s32, u32); /* { dg-error {passing 'svuint32_t' to argument 3 of 'svhistcnt_z', but previous arguments had type 'svint32_t'} } */ + svhistcnt_z (pg, pg, s32); /* { dg-error {passing 'svint32_t' to argument 3 of 'svhistcnt_z', but argument 2 had type 'svbool_t'} } */ + svhistcnt_z (pg, s32, u32); /* { dg-error {passing 'svuint32_t' to argument 3 of 'svhistcnt_z', but argument 2 had type 'svint32_t'} } */ svhistcnt_z (pg, s32, 0); /* { dg-error {passing 'int' to argument 3 of 'svhistcnt_z', which expects an SVE type rather than a scalar} } */ svhistcnt_z (pg, pg, pg); /* { dg-error {'svhistcnt_z' has no form that takes 'svbool_t' arguments} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/clast_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/clast_1.c index ba1b2520f7a..47ce473287a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/clast_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/clast_1.c @@ -10,6 +10,6 @@ test (svbool_t pg, svint32_t s32, svint64_t s64, int i) svclasta (pg, 1, pg); /* { dg-error {'svclasta' has no form that takes 'svbool_t' arguments} } */ svclasta (pg, i, s32); svclasta (pg, s32, 1); /* { dg-error {passing 'int' to argument 3 of 'svclasta', which expects an SVE type rather than a scalar} } */ - svclasta (pg, s32, s64); /* { dg-error {passing 'svint64_t' to argument 3 of 'svclasta', but previous arguments had type 'svint32_t'} } */ + svclasta (pg, s32, s64); /* { dg-error {passing 'svint64_t' to argument 3 of 'svclasta', but argument 2 had type 'svint32_t'} } */ svclasta (pg, pg, 
pg); /* { dg-error {'svclasta' has no form that takes 'svbool_t' arguments} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_1.c index 5474124cc46..0dd0ad91048 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_1.c @@ -13,15 +13,15 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svmatch (u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svmatch', which expects 'svbool_t'} } */ svmatch (pg, pg, pg); /* { dg-error {'svmatch' has no form that takes 'svbool_t' arguments} } */ svmatch (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svmatch', which expects an SVE type rather than a scalar} } */ - svmatch (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ + svmatch (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svmatch', but argument 2 had type 'svuint8_t'} } */ svmatch (pg, u8, u8); - svmatch (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ - svmatch (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ - svmatch (pg, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ - svmatch (pg, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svmatch', but previous arguments had type 'svuint8_t'} } */ + svmatch (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmatch', but argument 2 had type 'svuint8_t'} } */ + svmatch (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmatch', but argument 2 had type 'svuint8_t'} } */ + svmatch (pg, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 
'svmatch', but argument 2 had type 'svuint8_t'} } */ + svmatch (pg, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svmatch', but argument 2 had type 'svuint8_t'} } */ svmatch (pg, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svmatch', which expects an SVE type rather than a scalar} } */ - svmatch (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmatch', but previous arguments had type 'svfloat16_t'} } */ - svmatch (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmatch', but previous arguments had type 'svfloat16_t'} } */ + svmatch (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmatch', but argument 2 had type 'svfloat16_t'} } */ + svmatch (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmatch', but argument 2 had type 'svfloat16_t'} } */ svmatch (pg, f16, f16); /* { dg-error {'svmatch' has no form that takes 'svfloat16_t' arguments} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_opt_n_1.c index 6faa73972f5..cfa50d38701 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/compare_opt_n_1.c @@ -11,16 +11,16 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svcmpeq (u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svcmpeq', which expects 'svbool_t'} } */ svcmpeq (pg, pg, pg); /* { dg-error {'svcmpeq' has no form that takes 'svbool_t' arguments} } */ svcmpeq (pg, 1, u8); /* { dg-error {passing 'int' to argument 2 of 'svcmpeq', which expects an SVE type rather than a scalar} } */ - svcmpeq (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svuint8_t'} } */ + svcmpeq (pg, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svcmpeq', but argument 2 had type 'svuint8_t'} } */ svcmpeq (pg, u8, 
u8); - svcmpeq (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svuint8_t'} } */ - svcmpeq (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svuint8_t'} } */ - svcmpeq (pg, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svuint8_t'} } */ - svcmpeq (pg, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svuint8_t'} } */ + svcmpeq (pg, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svcmpeq', but argument 2 had type 'svuint8_t'} } */ + svcmpeq (pg, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svcmpeq', but argument 2 had type 'svuint8_t'} } */ + svcmpeq (pg, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svcmpeq', but argument 2 had type 'svuint8_t'} } */ + svcmpeq (pg, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svcmpeq', but argument 2 had type 'svuint8_t'} } */ svcmpeq (pg, u8, 0); - svcmpeq (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svfloat16_t'} } */ - svcmpeq (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svcmpeq', but previous arguments had type 'svfloat16_t'} } */ + svcmpeq (pg, f16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svcmpeq', but argument 2 had type 'svfloat16_t'} } */ + svcmpeq (pg, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svcmpeq', but argument 2 had type 'svfloat16_t'} } */ svcmpeq (pg, f16, f16); svcmpeq (pg, f16, 1); } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_1.c index 83e4a5600cb..7a617aa1563 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_1.c +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_1.c @@ -10,11 +10,11 @@ f1 (svuint8x2_t *ptr, svbool_t pg, svuint8_t u8, svfloat64_t f64, *ptr = svcreate2 (u8); /* { dg-error {too few arguments to function 'svcreate2'} } */ *ptr = svcreate2 (u8, u8, u8); /* { dg-error {too many arguments to function 'svcreate2'} } */ *ptr = svcreate2 (u8x2, u8x2); /* { dg-error {passing 'svuint8x2_t' to argument 1 of 'svcreate2', which expects a single SVE vector rather than a tuple} } */ - *ptr = svcreate2 (u8, f64); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svcreate2', but previous arguments had type 'svuint8_t'} } */ - *ptr = svcreate2 (u8, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svcreate2', but previous arguments had type 'svuint8_t'} } */ + *ptr = svcreate2 (u8, f64); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svcreate2', but argument 1 had type 'svuint8_t'} } */ + *ptr = svcreate2 (u8, pg); /* { dg-error {passing 'svbool_t' to argument 2 of 'svcreate2', but argument 1 had type 'svuint8_t'} } */ *ptr = svcreate2 (u8, x); /* { dg-error {passing 'int' to argument 2 of 'svcreate2', which expects an SVE type rather than a scalar} } */ *ptr = svcreate2 (x, u8); /* { dg-error {passing 'int' to argument 1 of 'svcreate2', which expects an SVE type rather than a scalar} } */ - *ptr = svcreate2 (pg, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svcreate2', but previous arguments had type 'svbool_t'} } */ + *ptr = svcreate2 (pg, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svcreate2', but argument 1 had type 'svbool_t'} } */ *ptr = svcreate2 (pg, pg); /* { dg-error {'svcreate2' has no form that takes 'svbool_t' arguments} } */ *ptr = svcreate2 (u8, u8); *ptr = svcreate2 (f64, f64); /* { dg-error {incompatible types when assigning to type 'svuint8x2_t' from type 'svfloat64x2_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_3.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_3.c index e3302f7e7db..40f3a1fedcb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_3.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_3.c @@ -11,11 +11,11 @@ f1 (svfloat16x3_t *ptr, svbool_t pg, svfloat16_t f16, svfloat64_t f64, *ptr = svcreate3 (f16, f16); /* { dg-error {too few arguments to function 'svcreate3'} } */ *ptr = svcreate3 (f16, f16, f16, f16); /* { dg-error {too many arguments to function 'svcreate3'} } */ *ptr = svcreate3 (f16x3, f16x3, f16x3); /* { dg-error {passing 'svfloat16x3_t' to argument 1 of 'svcreate3', which expects a single SVE vector rather than a tuple} } */ - *ptr = svcreate3 (f16, f16, f64); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcreate3', but previous arguments had type 'svfloat16_t'} } */ - *ptr = svcreate3 (f16, pg, f16); /* { dg-error {passing 'svbool_t' to argument 2 of 'svcreate3', but previous arguments had type 'svfloat16_t'} } */ + *ptr = svcreate3 (f16, f16, f64); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcreate3', but argument 1 had type 'svfloat16_t'} } */ + *ptr = svcreate3 (f16, pg, f16); /* { dg-error {passing 'svbool_t' to argument 2 of 'svcreate3', but argument 1 had type 'svfloat16_t'} } */ *ptr = svcreate3 (f16, x, f16); /* { dg-error {passing 'int' to argument 2 of 'svcreate3', which expects an SVE type rather than a scalar} } */ *ptr = svcreate3 (x, f16, f16); /* { dg-error {passing 'int' to argument 1 of 'svcreate3', which expects an SVE type rather than a scalar} } */ - *ptr = svcreate3 (pg, f16, f16); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svcreate3', but previous arguments had type 'svbool_t'} } */ + *ptr = svcreate3 (pg, f16, f16); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svcreate3', but argument 1 had type 'svbool_t'} } */ *ptr = svcreate3 (pg, pg, pg); /* { dg-error {'svcreate3' has no form that takes 'svbool_t' arguments} } */ *ptr = svcreate3 
(f16, f16, f16); *ptr = svcreate3 (f64, f64, f64); /* { dg-error {incompatible types when assigning to type 'svfloat16x3_t' from type 'svfloat64x3_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c index c850c94f0d2..bf3dd5d7514 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/create_5.c @@ -12,11 +12,11 @@ f1 (svint32x4_t *ptr, svbool_t pg, svint32_t s32, svfloat64_t f64, *ptr = svcreate4 (s32, s32, s32); /* { dg-error {too few arguments to function 'svcreate4'} } */ *ptr = svcreate4 (s32, s32, s32, s32, s32); /* { dg-error {too many arguments to function 'svcreate4'} } */ *ptr = svcreate4 (s32x4, s32x4, s32x4, s32x4); /* { dg-error {passing 'svint32x4_t' to argument 1 of 'svcreate4', which expects a single SVE vector rather than a tuple} } */ - *ptr = svcreate4 (s32, s32, s32, f64); /* { dg-error {passing 'svfloat64_t' to argument 4 of 'svcreate4', but previous arguments had type 'svint32_t'} } */ - *ptr = svcreate4 (s32, s32, pg, s32); /* { dg-error {passing 'svbool_t' to argument 3 of 'svcreate4', but previous arguments had type 'svint32_t'} } */ + *ptr = svcreate4 (s32, s32, s32, f64); /* { dg-error {passing 'svfloat64_t' to argument 4 of 'svcreate4', but argument 1 had type 'svint32_t'} } */ + *ptr = svcreate4 (s32, s32, pg, s32); /* { dg-error {passing 'svbool_t' to argument 3 of 'svcreate4', but argument 1 had type 'svint32_t'} } */ *ptr = svcreate4 (s32, x, s32, s32); /* { dg-error {passing 'int' to argument 2 of 'svcreate4', which expects an SVE type rather than a scalar} } */ *ptr = svcreate4 (x, s32, s32, s32); /* { dg-error {passing 'int' to argument 1 of 'svcreate4', which expects an SVE type rather than a scalar} } */ - *ptr = svcreate4 (pg, s32, s32, s32); /* { dg-error {passing 'svint32_t' to argument 2 of 'svcreate4', but previous arguments had type 'svbool_t'} } */ 
+ *ptr = svcreate4 (pg, s32, s32, s32); /* { dg-error {passing 'svint32_t' to argument 2 of 'svcreate4', but argument 1 had type 'svbool_t'} } */ *ptr = svcreate4 (pg, pg, pg, pg); /* { dg-error {'svcreate4' has no form that takes 'svbool_t' arguments} } */ *ptr = svcreate4 (s32, s32, s32, s32); *ptr = svcreate4 (f64, f64, f64, f64); /* { dg-error {incompatible types when assigning to type 'svint32x4_t' from type 'svfloat64x4_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/mmla_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/mmla_1.c index 7fc7bb67b75..ca2ab8a6f3f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/mmla_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/mmla_1.c @@ -44,13 +44,13 @@ f2 (svbool_t pg, svint8_t s8, svuint8_t u8, svuint32_t u32, svint32_t s32, svmmla (u32, u32, u32); /* { dg-error {passing 'svuint32_t' instead of the expected 'svuint8_t' to argument 2 of 'svmmla', after passing 'svuint32_t' to argument 1} } */ svmmla (f16, s8, s8); /* { dg-error {'svmmla' has no form that takes 'svfloat16_t' arguments} } */ - svmmla (f32, s8, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svmmla', but previous arguments had type 'svfloat32_t'} } */ - svmmla (f32, s32, s32); /* { dg-error {passing 'svint32_t' to argument 2 of 'svmmla', but previous arguments had type 'svfloat32_t'} } */ - svmmla (f32, f16, f16); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svmmla', but previous arguments had type 'svfloat32_t'} } */ - svmmla (f64, f16, f16); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svmmla', but previous arguments had type 'svfloat64_t'} } */ - svmmla (f32, f32, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svmmla', but previous arguments had type 'svfloat32_t'} } */ - svmmla (f64, f32, f16); /* { dg-error {passing 'svfloat32_t' to argument 2 of 'svmmla', but previous arguments had type 'svfloat64_t'} } */ - svmmla (f64, f64, f16); /* { dg-error 
{passing 'svfloat16_t' to argument 3 of 'svmmla', but previous arguments had type 'svfloat64_t'} } */ + svmmla (f32, s8, s8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svmmla', but argument 1 had type 'svfloat32_t'} } */ + svmmla (f32, s32, s32); /* { dg-error {passing 'svint32_t' to argument 2 of 'svmmla', but argument 1 had type 'svfloat32_t'} } */ + svmmla (f32, f16, f16); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svmmla', but argument 1 had type 'svfloat32_t'} } */ + svmmla (f64, f16, f16); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svmmla', but argument 1 had type 'svfloat64_t'} } */ + svmmla (f32, f32, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svmmla', but argument 1 had type 'svfloat32_t'} } */ + svmmla (f64, f32, f16); /* { dg-error {passing 'svfloat32_t' to argument 2 of 'svmmla', but argument 1 had type 'svfloat64_t'} } */ + svmmla (f64, f64, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svmmla', but argument 1 had type 'svfloat64_t'} } */ svmmla (f16, f16, f16); /* { dg-error {'svmmla' has no form that takes 'svfloat16_t' arguments} } */ svmmla (f32, f32, f32); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_1.c index 520c11f792b..0a67f82bf3b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_1.c @@ -13,8 +13,8 @@ f1 (svbool_t pg, svfloat16_t f16, svfloat32_t f32, svfloat64_t f64, svmla_lane (1, f32, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svmla_lane', which expects an SVE type rather than a scalar} } */ svmla_lane (f32, 1, f32, 0); /* { dg-error {passing 'int' to argument 2 of 'svmla_lane', which expects an SVE type rather than a scalar} } */ svmla_lane (f32, f32, 1, 0); /* { dg-error {passing 'int' to argument 3 of 'svmla_lane', which expects an SVE type rather than a 
scalar} } */ - svmla_lane (f32, f64, f32, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svmla_lane', but previous arguments had type 'svfloat32_t'} } */ - svmla_lane (f32, f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svmla_lane', but previous arguments had type 'svfloat32_t'} } */ + svmla_lane (f32, f64, f32, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svmla_lane', but argument 1 had type 'svfloat32_t'} } */ + svmla_lane (f32, f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svmla_lane', but argument 1 had type 'svfloat32_t'} } */ svmla_lane (f32, f32, f32, s32); /* { dg-error {argument 4 of 'svmla_lane' must be an integer constant expression} } */ svmla_lane (f32, f32, f32, i); /* { dg-error {argument 4 of 'svmla_lane' must be an integer constant expression} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_rotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_rotate_1.c index 3163d130c59..60c9c466e22 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_rotate_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_lane_rotate_1.c @@ -14,8 +14,8 @@ f1 (svbool_t pg, svfloat16_t f16, svfloat32_t f32, svfloat64_t f64, svcmla_lane (1, f32, f32, 0, 90); /* { dg-error {passing 'int' to argument 1 of 'svcmla_lane', which expects an SVE type rather than a scalar} } */ svcmla_lane (f32, 1, f32, 0, 90); /* { dg-error {passing 'int' to argument 2 of 'svcmla_lane', which expects an SVE type rather than a scalar} } */ svcmla_lane (f32, f32, 1, 0, 90); /* { dg-error {passing 'int' to argument 3 of 'svcmla_lane', which expects an SVE type rather than a scalar} } */ - svcmla_lane (f32, f64, f32, 0, 90); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svcmla_lane', but previous arguments had type 'svfloat32_t'} } */ - svcmla_lane (f32, f32, f64, 0, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 
'svcmla_lane', but previous arguments had type 'svfloat32_t'} } */ + svcmla_lane (f32, f64, f32, 0, 90); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svcmla_lane', but argument 1 had type 'svfloat32_t'} } */ + svcmla_lane (f32, f32, f64, 0, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcmla_lane', but argument 1 had type 'svfloat32_t'} } */ svcmla_lane (f32, f32, f32, s32, 0); /* { dg-error {argument 4 of 'svcmla_lane' must be an integer constant expression} } */ svcmla_lane (f32, f32, f32, i, 0); /* { dg-error {argument 4 of 'svcmla_lane' must be an integer constant expression} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_opt_n_1.c index ac789c2beca..6ca223475f1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_opt_n_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_opt_n_1.c @@ -11,24 +11,24 @@ f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svmla_x (u8, u8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 1 of 'svmla_x', which expects 'svbool_t'} } */ svmla_x (pg, pg, pg, pg); /* { dg-error {'svmla_x' has no form that takes 'svbool_t' arguments} } */ svmla_x (pg, 1, u8, u8); /* { dg-error {passing 'int' to argument 2 of 'svmla_x', which expects an SVE type rather than a scalar} } */ - svmla_x (pg, u8, s8, u8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ + svmla_x (pg, u8, s8, u8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ svmla_x (pg, u8, u8, u8); - svmla_x (pg, u8, s16, u8); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ - svmla_x (pg, u8, u16, u8); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ - svmla_x (pg, u8, f16, u8); /* { 
dg-error {passing 'svfloat16_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ - svmla_x (pg, u8, pg, u8); /* { dg-error {passing 'svbool_t' to argument 3 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ + svmla_x (pg, u8, s16, u8); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ + svmla_x (pg, u8, u16, u8); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ + svmla_x (pg, u8, f16, u8); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ + svmla_x (pg, u8, pg, u8); /* { dg-error {passing 'svbool_t' to argument 3 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ svmla_x (pg, u8, 0, u8); /* { dg-error {passing 'int' to argument 3 of 'svmla_x', which expects an SVE type rather than a scalar} } */ - svmla_x (pg, u8, u8, s8); /* { dg-error {passing 'svint8_t' to argument 4 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ - svmla_x (pg, u8, u8, s16); /* { dg-error {passing 'svint16_t' to argument 4 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ - svmla_x (pg, u8, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 4 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ - svmla_x (pg, u8, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 4 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ - svmla_x (pg, u8, u8, pg); /* { dg-error {passing 'svbool_t' to argument 4 of 'svmla_x', but previous arguments had type 'svuint8_t'} } */ + svmla_x (pg, u8, u8, s8); /* { dg-error {passing 'svint8_t' to argument 4 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ + svmla_x (pg, u8, u8, s16); /* { dg-error {passing 'svint16_t' to argument 4 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ + svmla_x (pg, u8, u8, u16); /* { dg-error {passing 'svuint16_t' to argument 4 of 'svmla_x', but 
argument 2 had type 'svuint8_t'} } */ + svmla_x (pg, u8, u8, f16); /* { dg-error {passing 'svfloat16_t' to argument 4 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ + svmla_x (pg, u8, u8, pg); /* { dg-error {passing 'svbool_t' to argument 4 of 'svmla_x', but argument 2 had type 'svuint8_t'} } */ svmla_x (pg, u8, u8, 0); - svmla_x (pg, f16, s16, f16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmla_x', but previous arguments had type 'svfloat16_t'} } */ - svmla_x (pg, f16, u16, f16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmla_x', but previous arguments had type 'svfloat16_t'} } */ - svmla_x (pg, f16, f16, s16); /* { dg-error {passing 'svint16_t' to argument 4 of 'svmla_x', but previous arguments had type 'svfloat16_t'} } */ - svmla_x (pg, f16, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 4 of 'svmla_x', but previous arguments had type 'svfloat16_t'} } */ + svmla_x (pg, f16, s16, f16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svmla_x', but argument 2 had type 'svfloat16_t'} } */ + svmla_x (pg, f16, u16, f16); /* { dg-error {passing 'svuint16_t' to argument 3 of 'svmla_x', but argument 2 had type 'svfloat16_t'} } */ + svmla_x (pg, f16, f16, s16); /* { dg-error {passing 'svint16_t' to argument 4 of 'svmla_x', but argument 2 had type 'svfloat16_t'} } */ + svmla_x (pg, f16, f16, u16); /* { dg-error {passing 'svuint16_t' to argument 4 of 'svmla_x', but argument 2 had type 'svfloat16_t'} } */ svmla_x (pg, f16, f16, f16); svmla_x (pg, f16, f16, 1); } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_rotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_rotate_1.c index bb67402897d..68b2cfc1d72 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_rotate_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_rotate_1.c @@ -13,8 +13,8 @@ f1 (svbool_t pg, svfloat32_t f32, svfloat64_t f64, svint32_t s32, int i) svcmla_x (pg, 1, f32, f32, 
90); /* { dg-error {passing 'int' to argument 2 of 'svcmla_x', which expects an SVE type rather than a scalar} } */ svcmla_x (pg, f32, 1, f32, 90); /* { dg-error {passing 'int' to argument 3 of 'svcmla_x', which expects an SVE type rather than a scalar} } */ svcmla_x (pg, f32, f32, 1, 90); /* { dg-error {passing 'int' to argument 4 of 'svcmla_x', which expects an SVE type rather than a scalar} } */ - svcmla_x (pg, f32, f64, f32, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcmla_x', but previous arguments had type 'svfloat32_t'} } */ - svcmla_x (pg, f32, f32, f64, 90); /* { dg-error {passing 'svfloat64_t' to argument 4 of 'svcmla_x', but previous arguments had type 'svfloat32_t'} } */ + svcmla_x (pg, f32, f64, f32, 90); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svcmla_x', but argument 2 had type 'svfloat32_t'} } */ + svcmla_x (pg, f32, f32, f64, 90); /* { dg-error {passing 'svfloat64_t' to argument 4 of 'svcmla_x', but argument 2 had type 'svfloat32_t'} } */ svcmla_x (pg, f32, f32, f32, s32); /* { dg-error {argument 5 of 'svcmla_x' must be an integer constant expression} } */ svcmla_x (pg, f32, f32, f32, i); /* { dg-error {argument 5 of 'svcmla_x' must be an integer constant expression} } */ svcmla_x (pg, f32, f32, f32, -90); /* { dg-error {passing -90 to argument 5 of 'svcmla_x', which expects 0, 90, 180 or 270} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_shift_right_imm_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_shift_right_imm_1.c index cfe601631ea..134cf98fd4f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_shift_right_imm_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_shift_right_imm_1.c @@ -11,10 +11,10 @@ f1 (svbool_t pg, svuint8_t u8, svint8_t s8, svint16_t s16, { const int one = 1; pg = svsra (pg, pg, 1); /* { dg-error {'svsra' has no form that takes 'svbool_t' arguments} } */ - pg = svsra (pg, s8, 1); /* { dg-error {passing 
'svint8_t' to argument 2 of 'svsra', but previous arguments had type 'svbool_t'} } */ + pg = svsra (pg, s8, 1); /* { dg-error {passing 'svint8_t' to argument 2 of 'svsra', but argument 1 had type 'svbool_t'} } */ s8 = svsra (1, s8, 1); /* { dg-error {passing 'int' to argument 1 of 'svsra', which expects an SVE type rather than a scalar} } */ - s8 = svsra (s8, u8, 1); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svsra', but previous arguments had type 'svint8_t'} } */ - s8 = svsra (s8, pg, 1); /* { dg-error {passing 'svbool_t' to argument 2 of 'svsra', but previous arguments had type 'svint8_t'} } */ + s8 = svsra (s8, u8, 1); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svsra', but argument 1 had type 'svint8_t'} } */ + s8 = svsra (s8, pg, 1); /* { dg-error {passing 'svbool_t' to argument 2 of 'svsra', but argument 1 had type 'svint8_t'} } */ s8 = svsra (s8, 1, 1); /* { dg-error {passing 'int' to argument 2 of 'svsra', which expects an SVE type rather than a scalar} } */ s8 = svsra (s8, s8, x); /* { dg-error {argument 3 of 'svsra' must be an integer constant expression} } */ s8 = svsra (s8, s8, one); /* { dg-error {argument 3 of 'svsra' must be an integer constant expression} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uint_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uint_1.c index 5fb49770173..a639562b170 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uint_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_uint_1.c @@ -15,14 +15,14 @@ f1 (svbool_t pg, svuint8_t u8, svint8_t s8, svuint16_t u16, svint16_t s16, svtbx (u8, 0, u8); /* { dg-error {passing 'int' to argument 2 of 'svtbx', which expects an SVE type rather than a scalar} } */ svtbx (u8, u8, 0); /* { dg-error {passing 'int' to argument 3 of 'svtbx', which expects an SVE type rather than a scalar} } */ - svtbx (u8, s8, u8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svtbx', but previous 
arguments had type 'svuint8_t'} } */ + svtbx (u8, s8, u8); /* { dg-error {passing 'svint8_t' to argument 2 of 'svtbx', but argument 1 had type 'svuint8_t'} } */ svtbx (u8, u8, u8); svtbx (u8, u8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */ svtbx (u8, u8, u16); /* { dg-error {arguments 1 and 3 of 'svtbx' must have the same element size, but the values passed here have type 'svuint8_t' and 'svuint16_t' respectively} } */ svtbx (u8, u8, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */ svtbx (u8, u8, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */ - svtbx (s8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svtbx', but previous arguments had type 'svint8_t'} } */ + svtbx (s8, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svtbx', but argument 1 had type 'svint8_t'} } */ svtbx (s8, s8, u8); svtbx (s8, s8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */ svtbx (s8, s8, u16); /* { dg-error {arguments 1 and 3 of 'svtbx' must have the same element size, but the values passed here have type 'svint8_t' and 'svuint16_t' respectively} } */ @@ -36,7 +36,7 @@ f1 (svbool_t pg, svuint8_t u8, svint8_t s8, svuint16_t u16, svint16_t s16, svtbx (u16, u16, s16); /* { dg-error {passing 'svint16_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */ svtbx (u16, u16, f16); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */ - svtbx (s16, u16, u16); /* { dg-error {passing 'svuint16_t' to argument 2 of 'svtbx', but previous arguments had type 'svint16_t'} } */ + svtbx (s16, u16, u16); /* { dg-error {passing 'svuint16_t' to argument 2 of 'svtbx', but argument 1 had type 'svint16_t'} } */ svtbx (s16, s16, u8); /* { 
dg-error {arguments 1 and 3 of 'svtbx' must have the same element size, but the values passed here have type 'svint16_t' and 'svuint8_t' respectively} } */ svtbx (s16, s16, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svtbx', which expects a vector of unsigned integers} } */ svtbx (s16, s16, u16); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/tmad_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/tmad_1.c index c2eda93e363..992b50199da 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/tmad_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/tmad_1.c @@ -11,7 +11,7 @@ f1 (svbool_t pg, svfloat32_t f32, svfloat64_t f64, svint32_t s32, int i) svtmad (s32, s32, 0); /* { dg-error {'svtmad' has no form that takes 'svint32_t' arguments} } */ svtmad (1, f32, 0); /* { dg-error {passing 'int' to argument 1 of 'svtmad', which expects an SVE type rather than a scalar} } */ svtmad (f32, 1, 0); /* { dg-error {passing 'int' to argument 2 of 'svtmad', which expects an SVE type rather than a scalar} } */ - svtmad (f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svtmad', but previous arguments had type 'svfloat32_t'} } */ + svtmad (f32, f64, 0); /* { dg-error {passing 'svfloat64_t' to argument 2 of 'svtmad', but argument 1 had type 'svfloat32_t'} } */ svtmad (f32, f32, s32); /* { dg-error {argument 3 of 'svtmad' must be an integer constant expression} } */ svtmad (f32, f32, i); /* { dg-error {argument 3 of 'svtmad' must be an integer constant expression} } */ svtmad (f32, f32, -1); /* { dg-error {passing -1 to argument 3 of 'svtmad', which expects a value in the range \[0, 7\]} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_1.c index 8c865a0e67d..9c9c383dd1e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_1.c @@ -13,9 
+13,9 @@ f1 (svbool_t pg, svint32_t s32, svuint32_t u32, svfloat32_t f32) svabs_m (s32, pg, s32); svabs_m (u32, pg, u32); /* { dg-error {'svabs_m' has no form that takes 'svuint32_t' arguments} } */ svabs_m (f32, pg, f32); - svabs_m (s32, pg, u32); /* { dg-error {passing 'svuint32_t' to argument 3 of 'svabs_m', but previous arguments had type 'svint32_t'} } */ - svabs_m (s32, pg, f32); /* { dg-error {passing 'svfloat32_t' to argument 3 of 'svabs_m', but previous arguments had type 'svint32_t'} } */ - svabs_m (s32, pg, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svabs_m', but previous arguments had type 'svint32_t'} } */ - svabs_m (pg, pg, s32); /* { dg-error {passing 'svint32_t' to argument 3 of 'svabs_m', but previous arguments had type 'svbool_t'} } */ + svabs_m (s32, pg, u32); /* { dg-error {passing 'svuint32_t' to argument 3 of 'svabs_m', but argument 1 had type 'svint32_t'} } */ + svabs_m (s32, pg, f32); /* { dg-error {passing 'svfloat32_t' to argument 3 of 'svabs_m', but argument 1 had type 'svint32_t'} } */ + svabs_m (s32, pg, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svabs_m', but argument 1 had type 'svint32_t'} } */ + svabs_m (pg, pg, s32); /* { dg-error {passing 'svint32_t' to argument 3 of 'svabs_m', but argument 1 had type 'svbool_t'} } */ svabs_m (pg, pg, pg); /* { dg-error {'svabs_m' has no form that takes 'svbool_t' arguments} } */ } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/undeclared_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/undeclared_2.c index 7e869bda8a1..6ffd3d9e8ef 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/undeclared_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/undeclared_2.c @@ -9,7 +9,7 @@ f (svint8_t s8, svuint16_t u16, svfloat32_t f32, u16 = svneg_x (pg, u16); /* { dg-error {'svneg_x' has no form that takes 'svuint16_t' arguments} } */ f32 = svclz_x (pg, f32); /* { dg-error {'svclz_x' has no form that takes 'svfloat32_t' arguments} } 
*/ s16x2 = svcreate2 (s8); /* { dg-error {too few arguments to function 'svcreate2'} } */ - u32x3 = svcreate3 (u16, u16, f32); /* { dg-error {passing 'svfloat32_t' to argument 3 of 'svcreate3', but previous arguments had type 'svuint16_t'} } */ + u32x3 = svcreate3 (u16, u16, f32); /* { dg-error {passing 'svfloat32_t' to argument 3 of 'svcreate3', but argument 1 had type 'svuint16_t'} } */ f64x4 = svcreate4 (f32, f32, f32, f32, f32); /* { dg-error {too many arguments to function 'svcreate4'} } */ pg = svadd_x (pg, pg, pg); /* { dg-error {'svadd_x' has no form that takes 'svbool_t' arguments} } */ } From patchwork Tue Dec 5 10:13:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872030 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxH85Xcxz23mf for ; Tue, 5 Dec 2023 21:15:00 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 47A3C384CB9C for ; Tue, 5 Dec 2023 10:14:49 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 52B23385DC19 for ; Tue, 5 Dec 2023 10:13:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter 
v1.4.2 sourceware.org 52B23385DC19 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 52B23385DC19 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771222; cv=none; b=aewXZYbb06tWcpJg2V0ap+kitCuAi8H+LyhHpJC9lVT0j/+5BFSfE9B7YHWLseguF4z8y4n49fY5C1bTMbSNw7aO1t6jwlD0TWMQJpXsbOiV03mTAyHwrfi1mKTB5QgpSI/WrOrn3uG/A1Fa/uBcuTtumQWwBDxJt2IElWGZDxE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771222; c=relaxed/simple; bh=RAO+z2OFtNddeRCjCNG5irgwPfx4dWi7dR0VqN8C2Go=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=ZPvYMm+InMuZXZfOGqXa/8I7ePKKKPSvJC+Z7XsW+idvQEt5kqwtbkpZKKghe/Zavuw5KTOtn7wQKNcU1mnZVm7PZzMeP37z4Ojk21xDHROIBmaHKACA9E4/BCYq9986bxCmnLv/oEzy5k+fSPcR3GrD4dU8H2K1r9hXWr7Ngqo= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B5D3D1476; Tue, 5 Dec 2023 02:14:25 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id A75303F5A1; Tue, 5 Dec 2023 02:13:38 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 08/25] aarch64: Make more use of sve_type in ACLE code Date: Tue, 5 Dec 2023 10:13:06 +0000 Message-Id: <20231205101323.1914247-9-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, 
TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

This patch makes some functions operate on sve_type, rather than just
on type suffixes.  It also allows an overload to be resolved based on
a mode and sve_type.  In this case the sve_type is used to derive the
group size as well as a type suffix.

This is needed for the SME2 intrinsics and the new tuple forms of
svreinterpret.  No functional change intended on its own.

gcc/
	* config/aarch64/aarch64-sve-builtins.h
	(function_resolver::lookup_form): Add an overload that takes
	an sve_type rather than type and group suffixes.
	(function_resolver::resolve_to): Likewise.
	(function_resolver::infer_vector_or_tuple_type): Return an sve_type.
	(function_resolver::infer_tuple_type): Likewise.
	(function_resolver::require_matching_vector_type): Take an sve_type
	rather than a type_suffix_index.
	(function_resolver::require_derived_vector_type): Likewise.
	* config/aarch64/aarch64-sve-builtins.cc (num_vectors_to_group):
	New function.
	(function_resolver::lookup_form): Add an overload that takes
	an sve_type rather than type and group suffixes.
	(function_resolver::resolve_to): Likewise.
	(function_resolver::infer_vector_or_tuple_type): Return an sve_type.
	(function_resolver::infer_tuple_type): Likewise.
	(function_resolver::infer_vector_type): Update accordingly.
	(function_resolver::require_matching_vector_type): Take an sve_type
	rather than a type_suffix_index.
	(function_resolver::require_derived_vector_type): Likewise.
	* config/aarch64/aarch64-sve-builtins-shapes.cc (get_def::resolve)
	(set_def::resolve, store_def::resolve, tbl_tuple_def::resolve): Update
	calls accordingly.
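
For readers who want to see the suffix derivation in isolation, the following is a
minimal standalone sketch of how the new lookup_form overload appears to turn a
(mode, sve_type) pair into type and group suffixes.  It is not the GCC code:
the enums are cut down to a few entries, sve_type is reduced to two fields, and
resolved_suffixes and suffixes_for are hypothetical names invented for this
illustration.  Only num_vectors_to_group and the slot-selection logic mirror
what the patch adds.

```cpp
#include <cassert>

// Cut-down stand-ins for GCC's enums (assumption: the real ones have many
// more entries; only the sentinel and group names are taken from the patch).
enum type_suffix_index { TYPE_SUFFIX_s8, TYPE_SUFFIX_f32, NUM_TYPE_SUFFIXES };
enum group_suffix_index { GROUP_none, GROUP_x2, GROUP_x3, GROUP_x4 };

// Reduced model of sve_type: an element type suffix plus a vector count.
// An invalid type is represented by NUM_TYPE_SUFFIXES and tests as false.
struct sve_type
{
  type_suffix_index type = NUM_TYPE_SUFFIXES;
  unsigned int num_vectors = 1;
  explicit operator bool () const { return type != NUM_TYPE_SUFFIXES; }
};

// Map a vector count to the implicit group suffix, as in the patch.
// The patch uses gcc_unreachable () for other counts; assert stands in here.
group_suffix_index
num_vectors_to_group (unsigned int nvectors)
{
  switch (nvectors)
    {
    case 1: return GROUP_none;
    case 2: return GROUP_x2;
    case 3: return GROUP_x3;
    case 4: return GROUP_x4;
    }
  assert (false && "unsupported number of vectors");
  return GROUP_none;
}

// Hypothetical holder for the suffixes that would be passed on to the
// existing four-argument lookup_form.
struct resolved_suffixes
{
  type_suffix_index type0;
  type_suffix_index type1;
  group_suffix_index group;
};

// Hypothetical free-function version of the derivation.  EXPLICIT_* model the
// suffixes already present on the overloaded function, and VECTORS_PER_TUPLE
// models function_resolver::vectors_per_tuple ().
resolved_suffixes
suffixes_for (type_suffix_index explicit_type0,
	      type_suffix_index explicit_type1,
	      group_suffix_index explicit_group,
	      unsigned int vectors_per_tuple, sve_type type)
{
  type_suffix_index type0 = explicit_type0;
  type_suffix_index type1 = explicit_type1;
  // Fill the first type suffix if it is still unset, otherwise the second.
  (type0 == NUM_TYPE_SUFFIXES ? type0 : type1) = type.type;

  // Only infer a group suffix when none is explicit and the vector count
  // is not already implied by the function itself.
  group_suffix_index group = explicit_group;
  if (group == GROUP_none && type.num_vectors != vectors_per_tuple)
    group = num_vectors_to_group (type.num_vectors);

  return {type0, type1, group};
}
```

Under these assumptions, a function with no explicit suffixes that is passed a
two-vector f32 tuple would get f32 as its first type suffix and GROUP_x2 as its
group, while a function that already operates on two-vector tuples (so that
vectors_per_tuple is 2) would keep GROUP_none.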
--- .../aarch64/aarch64-sve-builtins-shapes.cc | 16 +-- gcc/config/aarch64/aarch64-sve-builtins.cc | 111 +++++++++++++----- gcc/config/aarch64/aarch64-sve-builtins.h | 12 +- 3 files changed, 95 insertions(+), 44 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 7ab94a9cb31..86ec29a5caf 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -1904,9 +1904,9 @@ struct get_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + sve_type type; if (!r.check_gp_argument (2, i, nargs) - || (type = r.infer_tuple_type (i)) == NUM_TYPE_SUFFIXES + || !(type = r.infer_tuple_type (i)) || !r.require_integer_immediate (i + 1)) return error_mark_node; @@ -2417,9 +2417,9 @@ struct set_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + sve_type type; if (!r.check_gp_argument (3, i, nargs) - || (type = r.infer_tuple_type (i)) == NUM_TYPE_SUFFIXES + || !(type = r.infer_tuple_type (i)) || !r.require_integer_immediate (i + 1) || !r.require_derived_vector_type (i + 2, i, type)) return error_mark_node; @@ -2592,11 +2592,11 @@ struct store_def : public overloaded_base<0> gcc_assert (r.mode_suffix_id == MODE_none || vnum_p); unsigned int i, nargs; - type_suffix_index type; + sve_type type; if (!r.check_gp_argument (vnum_p ? 
3 : 2, i, nargs) || !r.require_pointer_type (i) || (vnum_p && !r.require_scalar_type (i + 1, "int64_t")) - || ((type = r.infer_tuple_type (nargs - 1)) == NUM_TYPE_SUFFIXES)) + || !(type = r.infer_tuple_type (nargs - 1))) return error_mark_node; return r.resolve_to (r.mode_suffix_id, type); @@ -2713,9 +2713,9 @@ struct tbl_tuple_def : public overloaded_base<0> resolve (function_resolver &r) const override { unsigned int i, nargs; - type_suffix_index type; + sve_type type; if (!r.check_gp_argument (2, i, nargs) - || (type = r.infer_tuple_type (i)) == NUM_TYPE_SUFFIXES + || !(type = r.infer_tuple_type (i)) || !r.require_derived_vector_type (i + 1, i, type, TYPE_unsigned)) return error_mark_node; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 4203ff4fc41..cdae77272ab 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -659,6 +659,21 @@ find_type_suffix_for_scalar_type (const_tree type) return NUM_TYPE_SUFFIXES; } +/* Return the implicit group suffix for intrinsics that operate on NVECTORS + vectors. */ +static group_suffix_index +num_vectors_to_group (unsigned int nvectors) +{ + switch (nvectors) + { + case 1: return GROUP_none; + case 2: return GROUP_x2; + case 3: return GROUP_x3; + case 4: return GROUP_x4; + } + gcc_unreachable (); +} + /* Return the vector type associated with TYPE. */ static tree get_vector_type (sve_type type) @@ -1282,6 +1297,27 @@ function_resolver::lookup_form (mode_suffix_index mode, return rfn ? rfn->decl : NULL_TREE; } +/* Silently check whether there is an instance of the function that has the + mode suffix given by MODE and the type and group suffixes implied by TYPE. + If the overloaded function has an explicit first type suffix (like + conversions do), TYPE describes the implicit second type suffix. + Otherwise, TYPE describes the only type suffix. + + Return the decl of the function if it exists, otherwise return null. 
*/ +tree +function_resolver::lookup_form (mode_suffix_index mode, sve_type type) +{ + type_suffix_index type0 = type_suffix_ids[0]; + type_suffix_index type1 = type_suffix_ids[1]; + (type0 == NUM_TYPE_SUFFIXES ? type0 : type1) = type.type; + + group_suffix_index group = group_suffix_id; + if (group == GROUP_none && type.num_vectors != vectors_per_tuple ()) + group = num_vectors_to_group (type.num_vectors); + + return lookup_form (mode, type0, type1, group); +} + /* Resolve the function to one with the mode suffix given by MODE, the type suffixes given by TYPE0 and TYPE1, and group suffix given by GROUP. Return its function decl on success, otherwise report an @@ -1305,6 +1341,19 @@ function_resolver::resolve_to (mode_suffix_index mode, return res; } +/* Resolve the function to one that has the suffixes associated with MODE + and TYPE; see lookup_form for how TYPE is interpreted. Return the + function decl on success, otherwise report an error and return + error_mark_node. */ +tree +function_resolver::resolve_to (mode_suffix_index mode, sve_type type) +{ + if (tree res = lookup_form (mode, type)) + return res; + + return report_no_such_form (type); +} + /* Require argument ARGNO to be a 32-bit or 64-bit scalar integer type. Return the associated type suffix on success, otherwise report an error and return NUM_TYPE_SUFFIXES. */ @@ -1424,21 +1473,20 @@ function_resolver::infer_sve_type (unsigned int argno) /* Require argument ARGNO to be a single vector or a tuple of NUM_VECTORS vectors; NUM_VECTORS is 1 for the former. Return the associated type - suffix on success, using TYPE_SUFFIX_b for predicates. Report an error - and return NUM_TYPE_SUFFIXES on failure. */ -type_suffix_index + on success. Report an error on failure. 
*/ +sve_type function_resolver::infer_vector_or_tuple_type (unsigned int argno, unsigned int num_vectors) { auto type = infer_sve_type (argno); if (!type) - return NUM_TYPE_SUFFIXES; + return type; if (type.num_vectors == num_vectors) - return type.type; + return type; report_incorrect_num_vectors (argno, type, num_vectors); - return NUM_TYPE_SUFFIXES; + return {}; } /* Require argument ARGNO to have some form of vector type. Return the @@ -1447,7 +1495,9 @@ function_resolver::infer_vector_or_tuple_type (unsigned int argno, type_suffix_index function_resolver::infer_vector_type (unsigned int argno) { - return infer_vector_or_tuple_type (argno, 1); + if (auto type = infer_vector_or_tuple_type (argno, 1)) + return type.type; + return NUM_TYPE_SUFFIXES; } /* Like infer_vector_type, but also require the type to be integral. */ @@ -1512,10 +1562,9 @@ function_resolver::infer_sd_vector_type (unsigned int argno) /* If the function operates on tuples of vectors, require argument ARGNO to be a tuple with the appropriate number of vectors, otherwise require it to be - a single vector. Return the associated type suffix on success, using - TYPE_SUFFIX_b for predicates. Report an error and return NUM_TYPE_SUFFIXES + a single vector. Return the associated type on success. Report an error on failure. 
*/ -type_suffix_index +sve_type function_resolver::infer_tuple_type (unsigned int argno) { return infer_vector_or_tuple_type (argno, vectors_per_tuple ()); @@ -1567,10 +1616,10 @@ function_resolver::require_vector_type (unsigned int argno, bool function_resolver::require_matching_vector_type (unsigned int argno, unsigned int first_argno, - type_suffix_index type) + sve_type type) { - type_suffix_index new_type = infer_vector_type (argno); - if (new_type == NUM_TYPE_SUFFIXES) + sve_type new_type = infer_sve_type (argno); + if (!new_type) return false; if (type != new_type) @@ -1613,15 +1662,13 @@ function_resolver::require_matching_vector_type (unsigned int argno, bool function_resolver:: require_derived_vector_type (unsigned int argno, unsigned int first_argno, - type_suffix_index first_type, + sve_type first_type, type_class_index expected_tclass, unsigned int expected_bits) { /* If the type needs to match FIRST_ARGNO exactly, use the preferred - error message for that case. The VECTOR_TYPE_P test excludes tuple - types, which we handle below instead. */ - bool both_vectors_p = VECTOR_TYPE_P (get_argument_type (first_argno)); - if (both_vectors_p + error message for that case. */ + if (first_type.num_vectors == 1 && expected_tclass == SAME_TYPE_CLASS && expected_bits == SAME_SIZE) { @@ -1631,17 +1678,18 @@ require_derived_vector_type (unsigned int argno, } /* Use FIRST_TYPE to get the expected type class and element size. 
*/ + auto &first_type_suffix = type_suffixes[first_type.type]; type_class_index orig_expected_tclass = expected_tclass; if (expected_tclass == NUM_TYPE_CLASSES) - expected_tclass = type_suffixes[first_type].tclass; + expected_tclass = first_type_suffix.tclass; unsigned int orig_expected_bits = expected_bits; if (expected_bits == SAME_SIZE) - expected_bits = type_suffixes[first_type].element_bits; + expected_bits = first_type_suffix.element_bits; else if (expected_bits == HALF_SIZE) - expected_bits = type_suffixes[first_type].element_bits / 2; + expected_bits = first_type_suffix.element_bits / 2; else if (expected_bits == QUARTER_SIZE) - expected_bits = type_suffixes[first_type].element_bits / 4; + expected_bits = first_type_suffix.element_bits / 4; /* If the expected type doesn't depend on FIRST_TYPE at all, just check for the fixed choice of vector type. */ @@ -1655,13 +1703,14 @@ require_derived_vector_type (unsigned int argno, /* Require the argument to be some form of SVE vector type, without being specific about the type of vector we want. */ - type_suffix_index actual_type = infer_vector_type (argno); - if (actual_type == NUM_TYPE_SUFFIXES) + sve_type actual_type = infer_vector_type (argno); + if (!actual_type) return false; /* Exit now if we got the right type. */ - bool tclass_ok_p = (type_suffixes[actual_type].tclass == expected_tclass); - bool size_ok_p = (type_suffixes[actual_type].element_bits == expected_bits); + auto &actual_type_suffix = type_suffixes[actual_type.type]; + bool tclass_ok_p = (actual_type_suffix.tclass == expected_tclass); + bool size_ok_p = (actual_type_suffix.element_bits == expected_bits); if (tclass_ok_p && size_ok_p) return true; @@ -1701,7 +1750,9 @@ require_derived_vector_type (unsigned int argno, /* If the arguments have consistent type classes, but a link between the sizes has been broken, try to describe the error in those terms. 
*/ - if (both_vectors_p && tclass_ok_p && orig_expected_bits == SAME_SIZE) + if (first_type.num_vectors == 1 + && tclass_ok_p + && orig_expected_bits == SAME_SIZE) { if (argno < first_argno) { @@ -1718,11 +1769,11 @@ require_derived_vector_type (unsigned int argno, /* Likewise in reverse: look for cases in which the sizes are consistent but a link between the type classes has been broken. */ - if (both_vectors_p + if (first_type.num_vectors == 1 && size_ok_p && orig_expected_tclass == SAME_TYPE_CLASS - && type_suffixes[first_type].integer_p - && type_suffixes[actual_type].integer_p) + && first_type_suffix.integer_p + && actual_type_suffix.integer_p) { if (argno < first_argno) { diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index f959b6f6ab3..0b40ad7b7cd 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -458,28 +458,28 @@ public: type_suffix_index = NUM_TYPE_SUFFIXES, type_suffix_index = NUM_TYPE_SUFFIXES, group_suffix_index = GROUP_none); + tree lookup_form (mode_suffix_index, sve_type); tree resolve_to (mode_suffix_index, type_suffix_index = NUM_TYPE_SUFFIXES, type_suffix_index = NUM_TYPE_SUFFIXES, group_suffix_index = GROUP_none); + tree resolve_to (mode_suffix_index, sve_type); type_suffix_index infer_integer_scalar_type (unsigned int); type_suffix_index infer_pointer_type (unsigned int, bool = false); sve_type infer_sve_type (unsigned int); - type_suffix_index infer_vector_or_tuple_type (unsigned int, unsigned int); + sve_type infer_vector_or_tuple_type (unsigned int, unsigned int); type_suffix_index infer_vector_type (unsigned int); type_suffix_index infer_integer_vector_type (unsigned int); type_suffix_index infer_unsigned_vector_type (unsigned int); type_suffix_index infer_sd_vector_type (unsigned int); - type_suffix_index infer_tuple_type (unsigned int); + sve_type infer_tuple_type (unsigned int); bool require_vector_or_scalar_type (unsigned int); 
bool require_vector_type (unsigned int, vector_type_index); - bool require_matching_vector_type (unsigned int, unsigned int, - type_suffix_index); - bool require_derived_vector_type (unsigned int, unsigned int, - type_suffix_index, + bool require_matching_vector_type (unsigned int, unsigned int, sve_type); + bool require_derived_vector_type (unsigned int, unsigned int, sve_type, type_class_index = SAME_TYPE_CLASS, unsigned int = SAME_SIZE); From patchwork Tue Dec 5 10:13:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872028 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxGt4fkxz23mf for ; Tue, 5 Dec 2023 21:14:46 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2CAB53845BF6 for ; Tue, 5 Dec 2023 10:14:32 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id F12CE385E00F for ; Tue, 5 Dec 2023 10:13:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F12CE385E00F Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com 
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org F12CE385E00F Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771222; cv=none; b=FlFhFvZ3ye78UMSYAGmbsq2uv6Mpp2VjAxJ97vY8vCaQ5SJiih4AcBp6vGBsbn4o47o4jKHaahJ3jUe17P1Wb8cIECiS/0KTnJ6WDGJJKUZrrW48kHTJbmcghMqZj51+n4/90yma87SEdD7XH7A8gqqa7q7+0h5AXUgwBAfU9nw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771222; c=relaxed/simple; bh=i/8CrIMnOcZyejfIjiLZ31yUehq78800JJRlP+Ys4To=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=RYE9BmPdULvFmSb8LupMHNSCCrLTnTunQf766iTsVqSFF5lEhn8NJbXCPq0CJrX2IcuKas0tu65DsL4TLywYJb3leEUXw/r+5V2V/q9lFKFVC41ty8PenoK6xrGH4DC1AOXgPq6e21SWbV/wI2ggzmZXmo4vAlYuMwN6uxmySkY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 658D5FEC; Tue, 5 Dec 2023 02:14:26 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 574A83F5A1; Tue, 5 Dec 2023 02:13:39 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 09/25] aarch64: Tweak error message for (tuple, vector) pairs Date: Tue, 5 Dec 2023 10:13:07 +0000 Message-Id: <20231205101323.1914247-10-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: 
gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

SME2 adds more intrinsics that take a tuple of vectors followed by a
single vector, with the two arguments expected to have the same
element type.  Unlike with the existing svset* intrinsics, the size
of the tuple is not fixed by the overloaded function name.

This patch adds an error message that (hopefully) copes better
with that combination.

gcc/
	* config/aarch64/aarch64-sve-builtins.cc
	(function_resolver::require_derived_vector_type): Add a specific
	error message for the case in which the caller wants a single
	vector whose element type matches a previous tuple argument.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general-c/set_1.c: Tweak expected
	error message.
	* gcc.target/aarch64/sve/acle/general-c/set_3.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/set_5.c: Likewise.
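
As a rough illustration of the check described above (a sketch only, not
the GCC implementation), the following self-contained C++ models the
(tuple, vector) comparison.  The names model_sve_type and
check_vector_against_tuple are hypothetical and do not exist in GCC;
only the wording of the diagnostic is taken from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for GCC's sve_type: the name of the single-vector
// type for the element type, plus the number of vectors in the tuple.
struct model_sve_type
{
  std::string vector_name;  // e.g. "svuint8_t"
  unsigned int num_vectors; // 1 for a single vector, 2 to 4 for a tuple
};

// Hypothetical helper.  FIRST is the tuple passed as argument FIRST_ARGNO
// and ACTUAL is the single vector passed as argument ARGNO (both argument
// numbers are zero-based, as in the resolver).  Returns an empty string if
// the element types agree, otherwise the text of the new-style message.
std::string
check_vector_against_tuple (const model_sve_type &first,
                            unsigned int first_argno,
                            const model_sve_type &actual,
                            unsigned int argno,
                            const std::string &function_name)
{
  if (actual.vector_name == first.vector_name)
    return {};

  return ("passing '" + actual.vector_name + "' to argument "
          + std::to_string (argno + 1) + " of '" + function_name
          + "', but argument " + std::to_string (first_argno + 1)
          + " was a tuple of '" + first.vector_name + "'");
}
```

Under these assumptions, passing an svfloat64_t as the third argument of
svset2 after an svuint8x2_t first argument gives the same text that
set_1.c expects after this patch.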
--- gcc/config/aarch64/aarch64-sve-builtins.cc | 13 +++++++++++++ .../gcc.target/aarch64/sve/acle/general-c/set_1.c | 4 ++-- .../gcc.target/aarch64/sve/acle/general-c/set_3.c | 4 ++-- .../gcc.target/aarch64/sve/acle/general-c/set_5.c | 4 ++-- 4 files changed, 19 insertions(+), 6 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index cdae77272ab..55bd2662d1a 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -1707,6 +1707,19 @@ require_derived_vector_type (unsigned int argno, if (!actual_type) return false; + if (orig_expected_tclass == SAME_TYPE_CLASS + && orig_expected_bits == SAME_SIZE) + { + if (actual_type.type == first_type.type) + return true; + + error_at (location, "passing %qT to argument %d of %qE, but" + " argument %d was a tuple of %qT", + get_vector_type (actual_type), argno + 1, fndecl, + first_argno + 1, get_vector_type (first_type.type)); + return false; + } + /* Exit now if we got the right type. 
*/ auto &actual_type_suffix = type_suffixes[actual_type.type]; bool tclass_ok_p = (actual_type_suffix.tclass == expected_tclass); diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_1.c index f07c76102ca..f2a6da5360f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_1.c @@ -16,8 +16,8 @@ f1 (svbool_t pg, svuint8_t u8, svuint8x2_t u8x2, svuint8x3_t u8x3, int x) u8x2 = svset2 (u8x3, 0, u8); /* { dg-error {passing 'svuint8x3_t' to argument 1 of 'svset2', which expects a tuple of 2 vectors} } */ u8x2 = svset2 (pg, 0, u8); /* { dg-error {passing 'svbool_t' to argument 1 of 'svset2', which expects a tuple of 2 vectors} } */ u8x2 = svset2 (u8x2, 0, u8x2); /* { dg-error {passing 'svuint8x2_t' to argument 3 of 'svset2', which expects a single SVE vector rather than a tuple} } */ - u8x2 = svset2 (u8x2, 0, f64); /* { dg-error {passing 'svfloat64_t' instead of the expected 'svuint8_t' to argument 3 of 'svset2', after passing 'svuint8x2_t' to argument 1} } */ - u8x2 = svset2 (u8x2, 0, pg); /* { dg-error {passing 'svbool_t' instead of the expected 'svuint8_t' to argument 3 of 'svset2', after passing 'svuint8x2_t' to argument 1} } */ + u8x2 = svset2 (u8x2, 0, f64); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svset2', but argument 1 was a tuple of 'svuint8_t'} } */ + u8x2 = svset2 (u8x2, 0, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svset2', but argument 1 was a tuple of 'svuint8_t'} } */ u8x2 = svset2 (u8x2, x, u8); /* { dg-error {argument 2 of 'svset2' must be an integer constant expression} } */ u8x2 = svset2 (u8x2, 0, u8); f64 = svset2 (u8x2, 0, u8); /* { dg-error {incompatible types when assigning to type 'svfloat64_t' from type 'svuint8x2_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_3.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_3.c index 
543a1bea8f3..92b955f8355 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_3.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_3.c @@ -17,8 +17,8 @@ f1 (svbool_t pg, svfloat16_t f16, svfloat16x3_t f16x3, svfloat16x4_t f16x4, f16x3 = svset3 (f16x4, 0, f16); /* { dg-error {passing 'svfloat16x4_t' to argument 1 of 'svset3', which expects a tuple of 3 vectors} } */ f16x3 = svset3 (pg, 0, f16); /* { dg-error {passing 'svbool_t' to argument 1 of 'svset3', which expects a tuple of 3 vectors} } */ f16x3 = svset3 (f16x3, 0, f16x3); /* { dg-error {passing 'svfloat16x3_t' to argument 3 of 'svset3', which expects a single SVE vector rather than a tuple} } */ - f16x3 = svset3 (f16x3, 0, f64); /* { dg-error {passing 'svfloat64_t' instead of the expected 'svfloat16_t' to argument 3 of 'svset3', after passing 'svfloat16x3_t' to argument 1} } */ - f16x3 = svset3 (f16x3, 0, pg); /* { dg-error {passing 'svbool_t' instead of the expected 'svfloat16_t' to argument 3 of 'svset3', after passing 'svfloat16x3_t' to argument 1} } */ + f16x3 = svset3 (f16x3, 0, f64); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svset3', but argument 1 was a tuple of 'svfloat16_t'} } */ + f16x3 = svset3 (f16x3, 0, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svset3', but argument 1 was a tuple of 'svfloat16_t'} } */ f16x3 = svset3 (f16x3, x, f16); /* { dg-error {argument 2 of 'svset3' must be an integer constant expression} } */ f16x3 = svset3 (f16x3, 0, f16); f64 = svset3 (f16x3, 0, f16); /* { dg-error {incompatible types when assigning to type 'svfloat64_t' from type 'svfloat16x3_t'} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_5.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_5.c index be911a73176..f0696fb07c7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_5.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/set_5.c @@ -16,8 +16,8 @@ f1 (svbool_t pg, svint32_t s32, svint32x4_t 
s32x4, svint32x2_t s32x2, int x) s32x4 = svset4 (s32x2, 0, s32); /* { dg-error {passing 'svint32x2_t' to argument 1 of 'svset4', which expects a tuple of 4 vectors} } */ s32x4 = svset4 (pg, 0, s32); /* { dg-error {passing 'svbool_t' to argument 1 of 'svset4', which expects a tuple of 4 vectors} } */ s32x4 = svset4 (s32x4, 0, s32x4); /* { dg-error {passing 'svint32x4_t' to argument 3 of 'svset4', which expects a single SVE vector rather than a tuple} } */ - s32x4 = svset4 (s32x4, 0, f64); /* { dg-error {passing 'svfloat64_t' instead of the expected 'svint32_t' to argument 3 of 'svset4', after passing 'svint32x4_t' to argument 1} } */ - s32x4 = svset4 (s32x4, 0, pg); /* { dg-error {passing 'svbool_t' instead of the expected 'svint32_t' to argument 3 of 'svset4', after passing 'svint32x4_t' to argument 1} } */ + s32x4 = svset4 (s32x4, 0, f64); /* { dg-error {passing 'svfloat64_t' to argument 3 of 'svset4', but argument 1 was a tuple of 'svint32_t'} } */ + s32x4 = svset4 (s32x4, 0, pg); /* { dg-error {passing 'svbool_t' to argument 3 of 'svset4', but argument 1 was a tuple of 'svint32_t'} } */ s32x4 = svset4 (s32x4, x, s32); /* { dg-error {argument 2 of 'svset4' must be an integer constant expression} } */ s32x4 = svset4 (s32x4, 0, s32); f64 = svset4 (s32x4, 0, s32); /* { dg-error {incompatible types when assigning to type 'svfloat64_t' from type 'svint32x4_t'} } */ From patchwork Tue Dec 5 10:13:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872031 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from 
server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxHj57Y6z1ySd for ; Tue, 5 Dec 2023 21:15:29 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7FB9D3871011 for ; Tue, 5 Dec 2023 10:15:14 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id BBF8A385DC01 for ; Tue, 5 Dec 2023 10:13:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BBF8A385DC01 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BBF8A385DC01 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771225; cv=none; b=Txex4Xo/XEvJMNuluFpODHeSFZ2z7Tzdv8WEDvZWBUb0Z1TcAbuUHdzy+6dBT2u2eJuoSbzML7TXPjAXuiHFR/4dmP+slE/b0lSvgPagKOAEjddUzaBH361PG3K+IUb91/l50ibdTOc4haAKfgFtNYkKOz56aapLMDEEEyahcD0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771225; c=relaxed/simple; bh=Lk0YDezrKuqLwHSIf+dVXOQKG3jttYQiN3+rr/7bTtI=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=ve2Djy8LDnPBqPeqlOT4dv+z7x6+wyGsMHoeYDjwVEIMqfZ+Pv50EwrFAv67tFcNGhtQKrWNJkUI0ujPBoiS2IpASX3UbT1Y/lQytFFfTYi6GNKU73qg04lbQ2b602oj1fHbrU664QguQjUje+KVuMbz8Vqwtfd5nFzAqaXNjgE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 17171139F; Tue, 5 Dec 2023 02:14:27 -0800 (PST) 
Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 089693F5A1; Tue, 5 Dec 2023 02:13:39 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 10/25] aarch64: Add tuple forms of svreinterpret Date: Tue, 5 Dec 2023 10:13:08 +0000 Message-Id: <20231205101323.1914247-11-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

SME2 adds a number of intrinsics that operate on tuples of 2 and 4
vectors.  The ACLE therefore extends the existing svreinterpret
intrinsics to handle tuples as well.

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svreinterpret_impl::fold): Punt on tuple forms.
	(svreinterpret_impl::expand): Use tuple_mode instead of vector_mode.
	* config/aarch64/aarch64-sve-builtins-base.def (svreinterpret):
	Extend to x1234 groups.
	* config/aarch64/aarch64-sve-builtins-functions.h
	(multi_vector_function::vectors_per_tuple): If the function has
	a group suffix, get the number of vectors from there.
	* config/aarch64/aarch64-sve-builtins-shapes.h (reinterpret): Declare.
	* config/aarch64/aarch64-sve-builtins-shapes.cc (reinterpret_def)
	(reinterpret): New function shape.
	* config/aarch64/aarch64-sve-builtins.cc (function_groups): Handle
	DEF_SVE_FUNCTION_GS.
	* config/aarch64/aarch64-sve-builtins.def (DEF_SVE_FUNCTION_GS): New
	macro.
	(DEF_SVE_FUNCTION): Forward to DEF_SVE_FUNCTION_GS by default.
	* config/aarch64/aarch64-sve-builtins.h
	(function_instance::tuple_mode): New member function.
	(function_base::vectors_per_tuple): Take the function instance
	as argument and get the number from the group suffix.
	(function_instance::vectors_per_tuple): Update accordingly.
	* config/aarch64/iterators.md (SVE_FULLx2, SVE_FULLx3, SVE_FULLx4)
	(SVE_ALL_STRUCT): New mode iterators.
	(SVE_STRUCT): Redefine in terms of SVE_FULL*.
	* config/aarch64/aarch64-sve.md (@aarch64_sve_reinterpret)
	(*aarch64_sve_reinterpret): Extend to SVE structure modes.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/asm/test_sve_acle.h (TEST_DUAL_XN):
	New macro.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c: Add tests for
	tuple forms.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c: Likewise.
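
The DEF_SVE_FUNCTION to DEF_SVE_FUNCTION_GS forwarding listed above can
be pictured with a small stand-alone C++ sketch.  This is only a model
under stated assumptions: MODEL_FUNCTION_LIST stands in for the contents
of a .def file, and model_group_info, DEF_FUNCTION and DEF_FUNCTION_GS
are hypothetical names with fewer fields than the real macros.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a .def file: one entry uses the plain form,
// the other names its group list explicitly.
#define MODEL_FUNCTION_LIST \
  DEF_FUNCTION (svrev, unary) \
  DEF_FUNCTION_GS (svreinterpret, reinterpret, x1234)

// Hypothetical, cut-down version of a function group record.
struct model_group_info
{
  const char *name;
  const char *shape;
  const char *groups;
};

// The consumer only defines the _GS form.  The plain form forwards to it
// with the group list "none", so existing entries need no changes.
#define DEF_FUNCTION_GS(NAME, SHAPE, GROUPS) { #NAME, #SHAPE, #GROUPS },
#define DEF_FUNCTION(NAME, SHAPE) DEF_FUNCTION_GS (NAME, SHAPE, none)

static const model_group_info model_groups[] = { MODEL_FUNCTION_LIST };

#undef DEF_FUNCTION
#undef DEF_FUNCTION_GS
```

In this model the svrev entry ends up with the group list "none" while
svreinterpret gets "x1234", which mirrors how only functions that opt in
through the _GS form gain tuple variants.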
--- .../aarch64/aarch64-sve-builtins-base.cc | 5 +- .../aarch64/aarch64-sve-builtins-base.def | 2 +- .../aarch64/aarch64-sve-builtins-functions.h | 7 ++- .../aarch64/aarch64-sve-builtins-shapes.cc | 28 +++++++++ .../aarch64/aarch64-sve-builtins-shapes.h | 1 + gcc/config/aarch64/aarch64-sve-builtins.cc | 8 ++- gcc/config/aarch64/aarch64-sve-builtins.def | 8 ++- gcc/config/aarch64/aarch64-sve-builtins.h | 20 +++++- gcc/config/aarch64/aarch64-sve.md | 8 +-- gcc/config/aarch64/iterators.md | 26 +++++--- .../aarch64/sve/acle/asm/reinterpret_bf16.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_f16.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_f32.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_f64.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_s16.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_s32.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_s64.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_s8.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_u16.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_u32.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_u64.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/reinterpret_u8.c | 62 +++++++++++++++++++ .../aarch64/sve/acle/asm/test_sve_acle.h | 14 +++++ 23 files changed, 851 insertions(+), 20 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index 6e108de54ea..a219c88085a 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -2148,6 +2148,9 @@ public: gimple * fold (gimple_folder &f) const override { + if (f.vectors_per_tuple () > 1) + return NULL; + /* Punt to rtl if the effect of the reinterpret on registers does not conform to GCC's endianness model. 
*/ if (!targetm.can_change_mode_class (f.vector_mode (0), @@ -2164,7 +2167,7 @@ public: rtx expand (function_expander &e) const override { - machine_mode mode = e.vector_mode (0); + machine_mode mode = e.tuple_mode (0); return e.use_exact_insn (code_for_aarch64_sve_reinterpret (mode)); } }; diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def index 0484863d3f7..4e31f67ac47 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def @@ -248,7 +248,7 @@ DEF_SVE_FUNCTION (svrdffr, rdffr, none, z_or_none) DEF_SVE_FUNCTION (svrecpe, unary, all_float, none) DEF_SVE_FUNCTION (svrecps, binary, all_float, none) DEF_SVE_FUNCTION (svrecpx, unary, all_float, mxz) -DEF_SVE_FUNCTION (svreinterpret, unary_convert, reinterpret, none) +DEF_SVE_FUNCTION_GS (svreinterpret, reinterpret, reinterpret, x1234, none) DEF_SVE_FUNCTION (svrev, unary, all_data, none) DEF_SVE_FUNCTION (svrev, unary_pred, all_pred, none) DEF_SVE_FUNCTION (svrevb, unary, hsd_integer, mxz) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-functions.h b/gcc/config/aarch64/aarch64-sve-builtins-functions.h index 2729877d914..4a10102038a 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-functions.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-functions.h @@ -48,8 +48,13 @@ public: : m_vectors_per_tuple (vectors_per_tuple) {} unsigned int - vectors_per_tuple () const override + vectors_per_tuple (const function_instance &fi) const override { + if (fi.group_suffix_id != GROUP_none) + { + gcc_checking_assert (m_vectors_per_tuple == 1); + return fi.group_suffix ().vectors_per_tuple; + } return m_vectors_per_tuple; } diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 86ec29a5caf..2c25b122f05 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -2400,6 +2400,34 @@ 
struct reduction_wide_def : public overloaded_base<0> }; SHAPE (reduction_wide) +/* svx_t svfoo_t0[_t1_g](svx_t) + + where the target type must be specified explicitly but the source + type can be inferred. */ +struct reinterpret_def : public overloaded_base<1> +{ + bool explicit_group_suffix_p () const override { return false; } + + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "t0,t1", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + sve_type type; + if (!r.check_num_arguments (1) + || !(type = r.infer_sve_type (0))) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, type); + } +}; +SHAPE (reinterpret) + /* svxN_t svfoo[_t0](svxN_t, uint64_t, sv_t) where the second argument is an integer constant expression in the diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h index 7483c1d04b8..38d494761ae 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h @@ -133,6 +133,7 @@ namespace aarch64_sve extern const function_shape *const rdffr; extern const function_shape *const reduction; extern const function_shape *const reduction_wide; + extern const function_shape *const reinterpret; extern const function_shape *const set; extern const function_shape *const setffr; extern const function_shape *const shift_left_imm_long; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 55bd2662d1a..ecee554a890 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -494,6 +494,10 @@ static const group_suffix_index groups_none[] = { GROUP_none, NUM_GROUP_SUFFIXES }; +static const group_suffix_index groups_x1234[] = { + GROUP_none, GROUP_x2, GROUP_x3, GROUP_x4, NUM_GROUP_SUFFIXES +}; + /* Used by 
functions that have no governing predicate. */ static const predication_index preds_none[] = { PRED_none, NUM_PREDS }; @@ -534,8 +538,8 @@ static const predication_index preds_z[] = { PRED_z, NUM_PREDS }; /* A list of all SVE ACLE functions. */ static CONSTEXPR const function_group_info function_groups[] = { -#define DEF_SVE_FUNCTION(NAME, SHAPE, TYPES, PREDS) \ - { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, groups_none, \ +#define DEF_SVE_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \ + { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, groups_##GROUPS, \ preds_##PREDS, REQUIRED_EXTENSIONS }, #include "aarch64-sve-builtins.def" }; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def b/gcc/config/aarch64/aarch64-sve-builtins.def index 5fbd486d74e..14d12f07415 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.def +++ b/gcc/config/aarch64/aarch64-sve-builtins.def @@ -33,8 +33,13 @@ #define DEF_SVE_GROUP_SUFFIX(A, B, C) #endif +#ifndef DEF_SVE_FUNCTION_GS +#define DEF_SVE_FUNCTION_GS(A, B, C, D, E) +#endif + #ifndef DEF_SVE_FUNCTION -#define DEF_SVE_FUNCTION(A, B, C, D) +#define DEF_SVE_FUNCTION(NAME, SHAPE, TYPES, PREDS) \ + DEF_SVE_FUNCTION_GS (NAME, SHAPE, TYPES, none, PREDS) #endif DEF_SVE_MODE (n, none, none, none) @@ -107,6 +112,7 @@ DEF_SVE_GROUP_SUFFIX (x4, 0, 4) #include "aarch64-sve-builtins-sve2.def" #undef DEF_SVE_FUNCTION +#undef DEF_SVE_FUNCTION_GS #undef DEF_SVE_GROUP_SUFFIX #undef DEF_SVE_TYPE_SUFFIX #undef DEF_SVE_TYPE diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 0b40ad7b7cd..e770a4042fe 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -364,6 +364,7 @@ public: tree tuple_type (unsigned int) const; unsigned int elements_per_vq (unsigned int i) const; machine_mode vector_mode (unsigned int) const; + machine_mode tuple_mode (unsigned int) const; machine_mode gp_mode (unsigned int) const; /* The properties of 
the function. */ @@ -666,7 +667,7 @@ public: /* If the function operates on tuples of vectors, return the number of vectors in the tuples, otherwise return 1. */ - virtual unsigned int vectors_per_tuple () const { return 1; } + virtual unsigned int vectors_per_tuple (const function_instance &) const; /* If the function addresses memory, return the type of a single scalar memory element. */ @@ -841,7 +842,7 @@ function_instance::operator!= (const function_instance &other) const inline unsigned int function_instance::vectors_per_tuple () const { - return base->vectors_per_tuple (); + return base->vectors_per_tuple (*this); } /* If the function addresses memory, return the type of a single @@ -945,6 +946,15 @@ function_instance::vector_mode (unsigned int i) const return type_suffix (i).vector_mode; } +/* Return the mode of tuple_type (I). */ +inline machine_mode +function_instance::tuple_mode (unsigned int i) const +{ + if (group_suffix ().vectors_per_tuple > 1) + return TYPE_MODE (tuple_type (i)); + return vector_mode (i); +} + /* Return the mode of the governing predicate to use when operating on type suffix I. */ inline machine_mode @@ -971,6 +981,12 @@ function_base::call_properties (const function_instance &instance) const return flags; } +inline unsigned int +function_base::vectors_per_tuple (const function_instance &instance) const +{ + return instance.group_suffix ().vectors_per_tuple; +} + /* Return the mode of the result of a call. */ inline machine_mode function_expander::result_mode () const diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index cfadac4f1be..e9cebffe3e0 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -787,8 +787,8 @@ (define_insn_and_split "*aarch64_sve_mov_subreg_be" ;; This is equivalent to a subreg on little-endian targets but not for ;; big-endian; see the comment at the head of the file for details. 
(define_expand "@aarch64_sve_reinterpret" - [(set (match_operand:SVE_ALL 0 "register_operand") - (unspec:SVE_ALL + [(set (match_operand:SVE_ALL_STRUCT 0 "register_operand") + (unspec:SVE_ALL_STRUCT [(match_operand 1 "aarch64_any_register_operand")] UNSPEC_REINTERPRET))] "TARGET_SVE" @@ -805,8 +805,8 @@ (define_expand "@aarch64_sve_reinterpret" ;; A pattern for handling type punning on big-endian targets. We use a ;; special predicate for operand 1 to reduce the number of patterns. (define_insn_and_split "*aarch64_sve_reinterpret" - [(set (match_operand:SVE_ALL 0 "register_operand" "=w") - (unspec:SVE_ALL + [(set (match_operand:SVE_ALL_STRUCT 0 "register_operand" "=w") + (unspec:SVE_ALL_STRUCT [(match_operand 1 "aarch64_any_register_operand" "w")] UNSPEC_REINTERPRET))] "TARGET_SVE" diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index a920de99ffc..e7aa7e35ae1 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -430,14 +430,6 @@ (define_mode_iterator VNx4SF_ONLY [VNx4SF]) (define_mode_iterator VNx2DI_ONLY [VNx2DI]) (define_mode_iterator VNx2DF_ONLY [VNx2DF]) -;; All SVE vector structure modes. -(define_mode_iterator SVE_STRUCT [VNx32QI VNx16HI VNx8SI VNx4DI - VNx16BF VNx16HF VNx8SF VNx4DF - VNx48QI VNx24HI VNx12SI VNx6DI - VNx24BF VNx24HF VNx12SF VNx6DF - VNx64QI VNx32HI VNx16SI VNx8DI - VNx32BF VNx32HF VNx16SF VNx8DF]) - ;; All fully-packed SVE vector modes. (define_mode_iterator SVE_FULL [VNx16QI VNx8HI VNx4SI VNx2DI VNx8BF VNx8HF VNx4SF VNx2DF]) @@ -509,6 +501,24 @@ (define_mode_iterator SVE_ALL [VNx16QI VNx8QI VNx4QI VNx2QI VNx2DI VNx2DF]) +;; All SVE 2-vector modes. +(define_mode_iterator SVE_FULLx2 [VNx32QI VNx16HI VNx8SI VNx4DI + VNx16BF VNx16HF VNx8SF VNx4DF]) + +;; All SVE 3-vector modes. +(define_mode_iterator SVE_FULLx3 [VNx48QI VNx24HI VNx12SI VNx6DI + VNx24BF VNx24HF VNx12SF VNx6DF]) + +;; All SVE 4-vector modes. 
+(define_mode_iterator SVE_FULLx4 [VNx64QI VNx32HI VNx16SI VNx8DI + VNx32BF VNx32HF VNx16SF VNx8DF]) + +;; All SVE vector structure modes. +(define_mode_iterator SVE_STRUCT [SVE_FULLx2 SVE_FULLx3 SVE_FULLx4]) + +;; All SVE vector and structure modes. +(define_mode_iterator SVE_ALL_STRUCT [SVE_ALL SVE_STRUCT]) + ;; All SVE integer vector modes. (define_mode_iterator SVE_I [VNx16QI VNx8QI VNx4QI VNx2QI VNx8HI VNx4HI VNx2HI diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c index 2d2c2a714b9..dd0daf2eff0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_bf16_u64_tied1, svbfloat16_t, svuint64_t, TEST_DUAL_Z (reinterpret_bf16_u64_untied, svbfloat16_t, svuint64_t, z0 = svreinterpret_bf16_u64 (z4), z0 = svreinterpret_bf16 (z4)) + +/* +** reinterpret_bf16_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_bf16_bf16_x2_tied1, svbfloat16x2_t, svbfloat16x2_t, + z0_res = svreinterpret_bf16_bf16_x2 (z0), + z0_res = svreinterpret_bf16 (z0)) + +/* +** reinterpret_bf16_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_bf16_f32_x2_untied, svbfloat16x2_t, svfloat32x2_t, z0, + svreinterpret_bf16_f32_x2 (z4), + svreinterpret_bf16 (z4)) + +/* +** reinterpret_bf16_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_bf16_s64_x3_tied1, svbfloat16x3_t, svint64x3_t, + z0_res = svreinterpret_bf16_s64_x3 (z0), + z0_res = svreinterpret_bf16 (z0)) + +/* +** reinterpret_bf16_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_bf16_u8_x3_untied, svbfloat16x3_t, svuint8x3_t, z18, + svreinterpret_bf16_u8_x3 (z23), + 
svreinterpret_bf16 (z23)) + +/* +** reinterpret_bf16_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_bf16_u32_x4_tied1, svbfloat16x4_t, svuint32x4_t, + z0_res = svreinterpret_bf16_u32_x4 (z0), + z0_res = svreinterpret_bf16 (z0)) + +/* +** reinterpret_bf16_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_bf16_f64_x4_untied, svbfloat16x4_t, svfloat64x4_t, z28, + svreinterpret_bf16_f64_x4 (z4), + svreinterpret_bf16 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c index 60705e62879..9b6f8227d2a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_f16_u64_tied1, svfloat16_t, svuint64_t, TEST_DUAL_Z (reinterpret_f16_u64_untied, svfloat16_t, svuint64_t, z0 = svreinterpret_f16_u64 (z4), z0 = svreinterpret_f16 (z4)) + +/* +** reinterpret_f16_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f16_bf16_x2_tied1, svfloat16x2_t, svbfloat16x2_t, + z0_res = svreinterpret_f16_bf16_x2 (z0), + z0_res = svreinterpret_f16 (z0)) + +/* +** reinterpret_f16_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_f16_f32_x2_untied, svfloat16x2_t, svfloat32x2_t, z0, + svreinterpret_f16_f32_x2 (z4), + svreinterpret_f16 (z4)) + +/* +** reinterpret_f16_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f16_s64_x3_tied1, svfloat16x3_t, svint64x3_t, + z0_res = svreinterpret_f16_s64_x3 (z0), + z0_res = svreinterpret_f16 (z0)) + +/* +** reinterpret_f16_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, 
(z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_f16_u8_x3_untied, svfloat16x3_t, svuint8x3_t, z18, + svreinterpret_f16_u8_x3 (z23), + svreinterpret_f16 (z23)) + +/* +** reinterpret_f16_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f16_u32_x4_tied1, svfloat16x4_t, svuint32x4_t, + z0_res = svreinterpret_f16_u32_x4 (z0), + z0_res = svreinterpret_f16 (z0)) + +/* +** reinterpret_f16_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_f16_f64_x4_untied, svfloat16x4_t, svfloat64x4_t, z28, + svreinterpret_f16_f64_x4 (z4), + svreinterpret_f16 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c index 06fc46f25de..ce981fce9d8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_f32_u64_tied1, svfloat32_t, svuint64_t, TEST_DUAL_Z (reinterpret_f32_u64_untied, svfloat32_t, svuint64_t, z0 = svreinterpret_f32_u64 (z4), z0 = svreinterpret_f32 (z4)) + +/* +** reinterpret_f32_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f32_bf16_x2_tied1, svfloat32x2_t, svbfloat16x2_t, + z0_res = svreinterpret_f32_bf16_x2 (z0), + z0_res = svreinterpret_f32 (z0)) + +/* +** reinterpret_f32_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_f32_f32_x2_untied, svfloat32x2_t, svfloat32x2_t, z0, + svreinterpret_f32_f32_x2 (z4), + svreinterpret_f32 (z4)) + +/* +** reinterpret_f32_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f32_s64_x3_tied1, svfloat32x3_t, svint64x3_t, + z0_res = svreinterpret_f32_s64_x3 (z0), + z0_res = svreinterpret_f32 (z0)) + +/* +** 
reinterpret_f32_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_f32_u8_x3_untied, svfloat32x3_t, svuint8x3_t, z18, + svreinterpret_f32_u8_x3 (z23), + svreinterpret_f32 (z23)) + +/* +** reinterpret_f32_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f32_u32_x4_tied1, svfloat32x4_t, svuint32x4_t, + z0_res = svreinterpret_f32_u32_x4 (z0), + z0_res = svreinterpret_f32 (z0)) + +/* +** reinterpret_f32_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_f32_f64_x4_untied, svfloat32x4_t, svfloat64x4_t, z28, + svreinterpret_f32_f64_x4 (z4), + svreinterpret_f32 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c index 003ee3fe220..4f51824ab7e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_f64_u64_tied1, svfloat64_t, svuint64_t, TEST_DUAL_Z (reinterpret_f64_u64_untied, svfloat64_t, svuint64_t, z0 = svreinterpret_f64_u64 (z4), z0 = svreinterpret_f64 (z4)) + +/* +** reinterpret_f64_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f64_bf16_x2_tied1, svfloat64x2_t, svbfloat16x2_t, + z0_res = svreinterpret_f64_bf16_x2 (z0), + z0_res = svreinterpret_f64 (z0)) + +/* +** reinterpret_f64_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_f64_f32_x2_untied, svfloat64x2_t, svfloat32x2_t, z0, + svreinterpret_f64_f32_x2 (z4), + svreinterpret_f64 (z4)) + +/* +** reinterpret_f64_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV 
(reinterpret_f64_s64_x3_tied1, svfloat64x3_t, svint64x3_t, + z0_res = svreinterpret_f64_s64_x3 (z0), + z0_res = svreinterpret_f64 (z0)) + +/* +** reinterpret_f64_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_f64_u8_x3_untied, svfloat64x3_t, svuint8x3_t, z18, + svreinterpret_f64_u8_x3 (z23), + svreinterpret_f64 (z23)) + +/* +** reinterpret_f64_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f64_u32_x4_tied1, svfloat64x4_t, svuint32x4_t, + z0_res = svreinterpret_f64_u32_x4 (z0), + z0_res = svreinterpret_f64 (z0)) + +/* +** reinterpret_f64_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_f64_f64_x4_untied, svfloat64x4_t, svfloat64x4_t, z28, + svreinterpret_f64_f64_x4 (z4), + svreinterpret_f64 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c index d62817c2cac..7e15f3e9bd3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_s16_u64_tied1, svint16_t, svuint64_t, TEST_DUAL_Z (reinterpret_s16_u64_untied, svint16_t, svuint64_t, z0 = svreinterpret_s16_u64 (z4), z0 = svreinterpret_s16 (z4)) + +/* +** reinterpret_s16_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s16_bf16_x2_tied1, svint16x2_t, svbfloat16x2_t, + z0_res = svreinterpret_s16_bf16_x2 (z0), + z0_res = svreinterpret_s16 (z0)) + +/* +** reinterpret_s16_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_s16_f32_x2_untied, svint16x2_t, svfloat32x2_t, z0, + 
svreinterpret_s16_f32_x2 (z4), + svreinterpret_s16 (z4)) + +/* +** reinterpret_s16_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s16_s64_x3_tied1, svint16x3_t, svint64x3_t, + z0_res = svreinterpret_s16_s64_x3 (z0), + z0_res = svreinterpret_s16 (z0)) + +/* +** reinterpret_s16_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_s16_u8_x3_untied, svint16x3_t, svuint8x3_t, z18, + svreinterpret_s16_u8_x3 (z23), + svreinterpret_s16 (z23)) + +/* +** reinterpret_s16_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s16_u32_x4_tied1, svint16x4_t, svuint32x4_t, + z0_res = svreinterpret_s16_u32_x4 (z0), + z0_res = svreinterpret_s16 (z0)) + +/* +** reinterpret_s16_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_s16_f64_x4_untied, svint16x4_t, svfloat64x4_t, z28, + svreinterpret_s16_f64_x4 (z4), + svreinterpret_s16 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c index e1068f244ed..60da8aef333 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_s32_u64_tied1, svint32_t, svuint64_t, TEST_DUAL_Z (reinterpret_s32_u64_untied, svint32_t, svuint64_t, z0 = svreinterpret_s32_u64 (z4), z0 = svreinterpret_s32 (z4)) + +/* +** reinterpret_s32_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s32_bf16_x2_tied1, svint32x2_t, svbfloat16x2_t, + z0_res = svreinterpret_s32_bf16_x2 (z0), + z0_res = svreinterpret_s32 (z0)) + +/* +** reinterpret_s32_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov 
z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_s32_f32_x2_untied, svint32x2_t, svfloat32x2_t, z0, + svreinterpret_s32_f32_x2 (z4), + svreinterpret_s32 (z4)) + +/* +** reinterpret_s32_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s32_s64_x3_tied1, svint32x3_t, svint64x3_t, + z0_res = svreinterpret_s32_s64_x3 (z0), + z0_res = svreinterpret_s32 (z0)) + +/* +** reinterpret_s32_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_s32_u8_x3_untied, svint32x3_t, svuint8x3_t, z18, + svreinterpret_s32_u8_x3 (z23), + svreinterpret_s32 (z23)) + +/* +** reinterpret_s32_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s32_u32_x4_tied1, svint32x4_t, svuint32x4_t, + z0_res = svreinterpret_s32_u32_x4 (z0), + z0_res = svreinterpret_s32 (z0)) + +/* +** reinterpret_s32_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_s32_f64_x4_untied, svint32x4_t, svfloat64x4_t, z28, + svreinterpret_s32_f64_x4 (z4), + svreinterpret_s32 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c index cada7533c53..d705c60dfd7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_s64_u64_tied1, svint64_t, svuint64_t, TEST_DUAL_Z (reinterpret_s64_u64_untied, svint64_t, svuint64_t, z0 = svreinterpret_s64_u64 (z4), z0 = svreinterpret_s64 (z4)) + +/* +** reinterpret_s64_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s64_bf16_x2_tied1, svint64x2_t, svbfloat16x2_t, + z0_res = svreinterpret_s64_bf16_x2 (z0), + z0_res = svreinterpret_s64 (z0)) + +/* +** 
reinterpret_s64_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_s64_f32_x2_untied, svint64x2_t, svfloat32x2_t, z0, + svreinterpret_s64_f32_x2 (z4), + svreinterpret_s64 (z4)) + +/* +** reinterpret_s64_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s64_s64_x3_tied1, svint64x3_t, svint64x3_t, + z0_res = svreinterpret_s64_s64_x3 (z0), + z0_res = svreinterpret_s64 (z0)) + +/* +** reinterpret_s64_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_s64_u8_x3_untied, svint64x3_t, svuint8x3_t, z18, + svreinterpret_s64_u8_x3 (z23), + svreinterpret_s64 (z23)) + +/* +** reinterpret_s64_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s64_u32_x4_tied1, svint64x4_t, svuint32x4_t, + z0_res = svreinterpret_s64_u32_x4 (z0), + z0_res = svreinterpret_s64 (z0)) + +/* +** reinterpret_s64_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_s64_f64_x4_untied, svint64x4_t, svfloat64x4_t, z28, + svreinterpret_s64_f64_x4 (z4), + svreinterpret_s64 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c index 23a40d0bab7..ab90a54d746 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_s8_u64_tied1, svint8_t, svuint64_t, TEST_DUAL_Z (reinterpret_s8_u64_untied, svint8_t, svuint64_t, z0 = svreinterpret_s8_u64 (z4), z0 = svreinterpret_s8 (z4)) + +/* +** reinterpret_s8_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s8_bf16_x2_tied1, svint8x2_t, 
svbfloat16x2_t, + z0_res = svreinterpret_s8_bf16_x2 (z0), + z0_res = svreinterpret_s8 (z0)) + +/* +** reinterpret_s8_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_s8_f32_x2_untied, svint8x2_t, svfloat32x2_t, z0, + svreinterpret_s8_f32_x2 (z4), + svreinterpret_s8 (z4)) + +/* +** reinterpret_s8_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s8_s64_x3_tied1, svint8x3_t, svint64x3_t, + z0_res = svreinterpret_s8_s64_x3 (z0), + z0_res = svreinterpret_s8 (z0)) + +/* +** reinterpret_s8_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_s8_u8_x3_untied, svint8x3_t, svuint8x3_t, z18, + svreinterpret_s8_u8_x3 (z23), + svreinterpret_s8 (z23)) + +/* +** reinterpret_s8_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s8_u32_x4_tied1, svint8x4_t, svuint32x4_t, + z0_res = svreinterpret_s8_u32_x4 (z0), + z0_res = svreinterpret_s8 (z0)) + +/* +** reinterpret_s8_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_s8_f64_x4_untied, svint8x4_t, svfloat64x4_t, z28, + svreinterpret_s8_f64_x4 (z4), + svreinterpret_s8 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c index 48e8ecaff44..fcfc0eb9da5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_u16_u64_tied1, svuint16_t, svuint64_t, TEST_DUAL_Z (reinterpret_u16_u64_untied, svuint16_t, svuint64_t, z0 = svreinterpret_u16_u64 (z4), z0 = svreinterpret_u16 (z4)) + +/* +** 
reinterpret_u16_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u16_bf16_x2_tied1, svuint16x2_t, svbfloat16x2_t, + z0_res = svreinterpret_u16_bf16_x2 (z0), + z0_res = svreinterpret_u16 (z0)) + +/* +** reinterpret_u16_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_u16_f32_x2_untied, svuint16x2_t, svfloat32x2_t, z0, + svreinterpret_u16_f32_x2 (z4), + svreinterpret_u16 (z4)) + +/* +** reinterpret_u16_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u16_s64_x3_tied1, svuint16x3_t, svint64x3_t, + z0_res = svreinterpret_u16_s64_x3 (z0), + z0_res = svreinterpret_u16 (z0)) + +/* +** reinterpret_u16_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u16_u8_x3_untied, svuint16x3_t, svuint8x3_t, z18, + svreinterpret_u16_u8_x3 (z23), + svreinterpret_u16 (z23)) + +/* +** reinterpret_u16_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u16_u32_x4_tied1, svuint16x4_t, svuint32x4_t, + z0_res = svreinterpret_u16_u32_x4 (z0), + z0_res = svreinterpret_u16 (z0)) + +/* +** reinterpret_u16_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u16_f64_x4_untied, svuint16x4_t, svfloat64x4_t, z28, + svreinterpret_u16_f64_x4 (z4), + svreinterpret_u16 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c index 1d4e857120e..6d7e05857fe 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_u32_u64_tied1, svuint32_t, svuint64_t, TEST_DUAL_Z 
(reinterpret_u32_u64_untied, svuint32_t, svuint64_t, z0 = svreinterpret_u32_u64 (z4), z0 = svreinterpret_u32 (z4)) + +/* +** reinterpret_u32_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u32_bf16_x2_tied1, svuint32x2_t, svbfloat16x2_t, + z0_res = svreinterpret_u32_bf16_x2 (z0), + z0_res = svreinterpret_u32 (z0)) + +/* +** reinterpret_u32_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_u32_f32_x2_untied, svuint32x2_t, svfloat32x2_t, z0, + svreinterpret_u32_f32_x2 (z4), + svreinterpret_u32 (z4)) + +/* +** reinterpret_u32_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u32_s64_x3_tied1, svuint32x3_t, svint64x3_t, + z0_res = svreinterpret_u32_s64_x3 (z0), + z0_res = svreinterpret_u32 (z0)) + +/* +** reinterpret_u32_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u32_u8_x3_untied, svuint32x3_t, svuint8x3_t, z18, + svreinterpret_u32_u8_x3 (z23), + svreinterpret_u32 (z23)) + +/* +** reinterpret_u32_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u32_u32_x4_tied1, svuint32x4_t, svuint32x4_t, + z0_res = svreinterpret_u32_u32_x4 (z0), + z0_res = svreinterpret_u32 (z0)) + +/* +** reinterpret_u32_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u32_f64_x4_untied, svuint32x4_t, svfloat64x4_t, z28, + svreinterpret_u32_f64_x4 (z4), + svreinterpret_u32 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c index 07af69dce8d..55c0baefb6f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_u64_u64_tied1, svuint64_t, svuint64_t, TEST_DUAL_Z (reinterpret_u64_u64_untied, svuint64_t, svuint64_t, z0 = svreinterpret_u64_u64 (z4), z0 = svreinterpret_u64 (z4)) + +/* +** reinterpret_u64_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u64_bf16_x2_tied1, svuint64x2_t, svbfloat16x2_t, + z0_res = svreinterpret_u64_bf16_x2 (z0), + z0_res = svreinterpret_u64 (z0)) + +/* +** reinterpret_u64_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_u64_f32_x2_untied, svuint64x2_t, svfloat32x2_t, z0, + svreinterpret_u64_f32_x2 (z4), + svreinterpret_u64 (z4)) + +/* +** reinterpret_u64_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u64_s64_x3_tied1, svuint64x3_t, svint64x3_t, + z0_res = svreinterpret_u64_s64_x3 (z0), + z0_res = svreinterpret_u64 (z0)) + +/* +** reinterpret_u64_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u64_u8_x3_untied, svuint64x3_t, svuint8x3_t, z18, + svreinterpret_u64_u8_x3 (z23), + svreinterpret_u64 (z23)) + +/* +** reinterpret_u64_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u64_u32_x4_tied1, svuint64x4_t, svuint32x4_t, + z0_res = svreinterpret_u64_u32_x4 (z0), + z0_res = svreinterpret_u64 (z0)) + +/* +** reinterpret_u64_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u64_f64_x4_untied, svuint64x4_t, svfloat64x4_t, z28, + svreinterpret_u64_f64_x4 (z4), + svreinterpret_u64 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c index a4c7f4c8d21..f7302196162 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_u8_u64_tied1, svuint8_t, svuint64_t, TEST_DUAL_Z (reinterpret_u8_u64_untied, svuint8_t, svuint64_t, z0 = svreinterpret_u8_u64 (z4), z0 = svreinterpret_u8 (z4)) + +/* +** reinterpret_u8_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u8_bf16_x2_tied1, svuint8x2_t, svbfloat16x2_t, + z0_res = svreinterpret_u8_bf16_x2 (z0), + z0_res = svreinterpret_u8 (z0)) + +/* +** reinterpret_u8_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_u8_f32_x2_untied, svuint8x2_t, svfloat32x2_t, z0, + svreinterpret_u8_f32_x2 (z4), + svreinterpret_u8 (z4)) + +/* +** reinterpret_u8_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u8_s64_x3_tied1, svuint8x3_t, svint64x3_t, + z0_res = svreinterpret_u8_s64_x3 (z0), + z0_res = svreinterpret_u8 (z0)) + +/* +** reinterpret_u8_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u8_u8_x3_untied, svuint8x3_t, svuint8x3_t, z18, + svreinterpret_u8_u8_x3 (z23), + svreinterpret_u8 (z23)) + +/* +** reinterpret_u8_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u8_u32_x4_tied1, svuint8x4_t, svuint32x4_t, + z0_res = svreinterpret_u8_u32_x4 (z0), + z0_res = svreinterpret_u8 (z0)) + +/* +** reinterpret_u8_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u8_f64_x4_untied, svuint8x4_t, svfloat64x4_t, z28, + svreinterpret_u8_f64_x4 (z4), + svreinterpret_u8 
(z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h index fbf392b3ed4..2da61ff5c0b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h @@ -421,4 +421,18 @@ return z0_res; \ } +#define TEST_DUAL_XN(NAME, TTYPE1, TTYPE2, RES, CODE1, CODE2) \ + PROTO (NAME, void, ()) \ + { \ + register TTYPE1 z0 __asm ("z0"); \ + register TTYPE2 z4 __asm ("z4"); \ + register TTYPE1 z18 __asm ("z18"); \ + register TTYPE2 z23 __asm ("z23"); \ + register TTYPE1 z28 __asm ("z28"); \ + __asm volatile ("" : "=w" (z0), "=w" (z4), "=w" (z18), \ + "=w" (z23), "=w" (z28)); \ + INVOKE (RES = CODE1, RES = CODE2); \ + __asm volatile ("" :: "w" (RES)); \ + } + #endif From patchwork Tue Dec 5 10:13:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872037 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxJd1D9pz1ySd for ; Tue, 5 Dec 2023 21:16:17 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E1E3E3875DFB for ; Tue, 5 Dec 2023 10:15:53 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: 
from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 813ED385C322 for ; Tue, 5 Dec 2023 10:13:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 813ED385C322 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 813ED385C322 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771227; cv=none; b=pUR1eImkdqEuQjco9TCmr8udQQFpCCWmu5VViAAuoXvIH7U1r82Zn7DJ658PaqPkNWegM72gNVmxMDvvIEqzn/WdKfiw9rzz6pttusM+LoJg870hZLMEnXZYw+vg83Tbpda20R5ryLPV8Qau7pS5jxaZqb1d+lbZyl0xRmT9Ymo= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771227; c=relaxed/simple; bh=ldzIIh4rS+EIqzsdVZSy94rsSrvWFFjoB/BGru1r4P0=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=uMCHsnhuTtDgGpR3czd8GbynVtFccVJAdsQdy9Mf1coH64FN60zCfHPpEY9zbMt4HitFpT/hVyl2Ahu3p4I3cyhOhS8uNnOQKC3sQFu+aPhaI94gayM8F3pFEkBCWg0KZ0KGWMAvDsySFOuARJPp+WhHIe1ZiDXgmkN7dL8HdGE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BD2711570; Tue, 5 Dec 2023 02:14:27 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AE9403F5A1; Tue, 5 Dec 2023 02:13:40 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 11/25] aarch64: Add arm_streaming(_compatible) attributes Date: Tue, 5 Dec 2023 10:13:09 +0000 Message-Id: <20231205101323.1914247-12-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: 
<20231205101323.1914247-1-richard.sandiford@arm.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-21.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, KAM_STOCKGEN, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

This patch adds support for recognising the SME arm::streaming
and arm::streaming_compatible attributes.  arm::streaming says that
the processor is definitely in "streaming mode" (PSTATE.SM==1) and
arm::streaming_compatible says that we don't know at compile time
either way.  A function type with neither attribute requires the
processor to be definitely not in streaming mode (PSTATE.SM==0).

As far as the compiler is concerned, this effectively creates three
ISA submodes: streaming mode enables things that are not available
in non-streaming mode, non-streaming mode enables things that are not
available in streaming mode, and streaming-compatible mode has to stick
to the common subset.  This means that some instructions are conditional
on PSTATE.SM==1 and some are conditional on PSTATE.SM==0.

I wondered about recording the streaming state in a new variable.
However, the set of available instructions is also influenced by
PSTATE.ZA (added later), so I think it makes sense to view this
as an instance of a more general mechanism.  Also, keeping the
PSTATE.SM state in the same flag variable as the other ISA
features makes it possible to sum up the requirements of an
ACLE function in a single value.

The patch therefore adds a new set of feature flags called "ISA modes".
Unlike the other two sets of flags (optional features and
architecture-level features), these ISA modes are not controlled
directly by command-line parameters or "target" attributes.

arm::streaming and arm::streaming_compatible are function type
attributes rather than function declaration attributes.  This means
that we need to find somewhere to copy the type information across
to a function's target options.  The patch does this in
aarch64_set_current_function.

We also need to record which ISA mode a callee expects/requires
to be active on entry.  (The same mode is then active on return.)
The patch extends the current UNSPEC_CALLEE_ABI cookie to include
this information, as well as the PCS variant that it recorded
previously.

The attributes can also be written __arm_streaming and
__arm_streaming_compatible.  This has two advantages: it triggers
an error on compilers that don't understand the attributes,
and it eases use in C, where [[...]] attributes were only added
in C23.

gcc/
	* config/aarch64/aarch64-isa-modes.def: New file.
	* config/aarch64/aarch64.h: Include it in the feature enumerations.
	(AARCH64_FL_SM_STATE, AARCH64_FL_ISA_MODES): New constants.
	(AARCH64_FL_DEFAULT_ISA_MODE): Likewise.
	(AARCH64_ISA_MODE): New macro.
	(CUMULATIVE_ARGS): Add an isa_mode field.
	* config/aarch64/aarch64-protos.h (aarch64_gen_callee_cookie): Declare.
	(aarch64_tlsdesc_abi_id): Return an arm_pcs.
	* config/aarch64/aarch64.cc (attr_streaming_exclusions)
	(aarch64_gnu_attributes, aarch64_gnu_attribute_table)
	(aarch64_arm_attributes, aarch64_arm_attribute_table): New tables.
	(aarch64_attribute_table): Redefine to include the gnu and arm
	attributes.
	(aarch64_fntype_pstate_sm, aarch64_fntype_isa_mode): New functions.
	(aarch64_fndecl_pstate_sm, aarch64_fndecl_isa_mode): Likewise.
	(aarch64_gen_callee_cookie, aarch64_callee_abi): Likewise.
	(aarch64_insn_callee_cookie, aarch64_insn_callee_abi): Use them.
	(aarch64_function_arg, aarch64_output_mi_thunk): Likewise.
	(aarch64_init_cumulative_args): Initialize the isa_mode field.
	(aarch64_output_mi_thunk): Use aarch64_gen_callee_cookie to get
	the ABI cookie.
	(aarch64_override_options): Add the ISA mode to the feature set.
	(aarch64_temporary_target::copy_from_fndecl): Likewise.
	(aarch64_fndecl_options, aarch64_handle_attr_arch): Likewise.
	(aarch64_set_current_function): Maintain the correct ISA mode.
	(aarch64_tlsdesc_abi_id): Return an arm_pcs.
	(aarch64_comp_type_attributes): Handle arm::streaming and
	arm::streaming_compatible.
	* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
	Define __arm_streaming and __arm_streaming_compatible.
	* config/aarch64/aarch64.md (tlsdesc_small_): Use
	aarch64_gen_callee_cookie to get the ABI cookie.
	* config/aarch64/t-aarch64 (TM_H): Add all feature-related .def files.

gcc/testsuite/
	* gcc.target/aarch64/sme/aarch64-sme.exp: New harness.
	* gcc.target/aarch64/sme/streaming_mode_1.c: New test.
	* gcc.target/aarch64/sme/streaming_mode_2.c: Likewise.
	* gcc.target/aarch64/sme/keyword_macros_1.c: Likewise.
	* g++.target/aarch64/sme/aarch64-sme.exp: New harness.
	* g++.target/aarch64/sme/streaming_mode_1.C: New test.
	* g++.target/aarch64/sme/streaming_mode_2.C: Likewise.
	* g++.target/aarch64/sme/keyword_macros_1.C: Likewise.
	* gcc.target/aarch64/auto-init-1.c: Only expect the call insn
	to contain 1 (const_int 0), not 2.
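
Illustration only, not part of the patch: the idea of carrying both
the callee's ISA mode and its PCS variant in a single cookie value can
be sketched as below.  This is a minimal sketch, assuming that the ISA
mode bits occupy the low bits of the cookie and that the PCS variant
sits directly above them.  The names isa_mode, pcs_variant,
make_callee_cookie, cookie_isa_mode and cookie_pcs, and the bit layout
itself, are hypothetical stand-ins; they are not claimed to match what
aarch64_gen_callee_cookie actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ISA mode bits (illustrative names, not GCC's).
   At most one of the two is set; if neither is set, the state of
   PSTATE.SM is unknown at compile time (streaming-compatible).  */
enum isa_mode
{
  ISA_MODE_SM_ON = 1u << 0,	/* PSTATE.SM is known to be 1.  */
  ISA_MODE_SM_OFF = 1u << 1	/* PSTATE.SM is known to be 0.  */
};

/* Assumed width of the ISA mode field in this sketch.  */
#define NUM_ISA_MODE_BITS 2u
#define ISA_MODE_MASK ((1u << NUM_ISA_MODE_BITS) - 1u)

/* Hypothetical PCS variants (illustrative names and values).  */
enum pcs_variant
{
  PCS_BASE = 0,
  PCS_SIMD = 1,
  PCS_SVE = 2
};

/* Pack the ISA mode into the low bits and the PCS variant above it.  */
static inline uint64_t
make_callee_cookie (unsigned int isa_mode, enum pcs_variant pcs)
{
  return (uint64_t) isa_mode | ((uint64_t) pcs << NUM_ISA_MODE_BITS);
}

/* Recover the ISA mode bits from a cookie.  */
static inline unsigned int
cookie_isa_mode (uint64_t cookie)
{
  return (unsigned int) (cookie & ISA_MODE_MASK);
}

/* Recover the PCS variant from a cookie.  */
static inline enum pcs_variant
cookie_pcs (uint64_t cookie)
{
  return (enum pcs_variant) (cookie >> NUM_ISA_MODE_BITS);
}
```

With a layout like this, a caller-side check would only need to compare
cookie_isa_mode of the callee's cookie with its own mode to decide
whether a PSTATE.SM switch is required around the call, while
cookie_pcs continues to select the register-preservation rules.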
--- gcc/config/aarch64/aarch64-c.cc | 14 ++ gcc/config/aarch64/aarch64-isa-modes.def | 35 +++ gcc/config/aarch64/aarch64-protos.h | 3 +- gcc/config/aarch64/aarch64.cc | 233 +++++++++++++++--- gcc/config/aarch64/aarch64.h | 24 +- gcc/config/aarch64/aarch64.md | 3 +- gcc/config/aarch64/t-aarch64 | 5 +- .../g++.target/aarch64/sme/aarch64-sme.exp | 40 +++ .../g++.target/aarch64/sme/keyword_macros_1.C | 4 + .../g++.target/aarch64/sme/streaming_mode_1.C | 142 +++++++++++ .../g++.target/aarch64/sme/streaming_mode_2.C | 25 ++ .../gcc.target/aarch64/auto-init-1.c | 3 +- .../gcc.target/aarch64/sme/aarch64-sme.exp | 40 +++ .../gcc.target/aarch64/sme/keyword_macros_1.c | 4 + .../gcc.target/aarch64/sme/streaming_mode_1.c | 130 ++++++++++ .../gcc.target/aarch64/sme/streaming_mode_2.c | 25 ++ 16 files changed, 685 insertions(+), 45 deletions(-) create mode 100644 gcc/config/aarch64/aarch64-isa-modes.def create mode 100644 gcc/testsuite/g++.target/aarch64/sme/aarch64-sme.exp create mode 100644 gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C create mode 100644 gcc/testsuite/g++.target/aarch64/sme/streaming_mode_1.C create mode 100644 gcc/testsuite/g++.target/aarch64/sme/streaming_mode_2.C create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_2.c diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc index ab8844f6049..1603621b30d 100644 --- a/gcc/config/aarch64/aarch64-c.cc +++ b/gcc/config/aarch64/aarch64-c.cc @@ -72,6 +72,20 @@ aarch64_define_unconditional_macros (cpp_reader *pfile) builtin_define_with_int_value ("__ARM_SIZEOF_WCHAR_T", WCHAR_TYPE_SIZE / 8); builtin_define ("__GCC_ASM_FLAG_OUTPUTS__"); + + /* Define keyword attributes like __arm_streaming as macros that expand + to the associated [[...]] 
attribute. Use __extension__ in the attribute + for C, since the [[...]] syntax was only added in C23. */ +#define DEFINE_ARM_KEYWORD_MACRO(NAME) \ + builtin_define_with_value ("__arm_" NAME, \ + lang_GNU_CXX () \ + ? "[[arm::" NAME "]]" \ + : "[[__extension__ arm::" NAME "]]", 0); + + DEFINE_ARM_KEYWORD_MACRO ("streaming"); + DEFINE_ARM_KEYWORD_MACRO ("streaming_compatible"); + +#undef DEFINE_ARM_KEYWORD_MACRO } /* Undefine/redefine macros that depend on the current backend state and may diff --git a/gcc/config/aarch64/aarch64-isa-modes.def b/gcc/config/aarch64/aarch64-isa-modes.def new file mode 100644 index 00000000000..5915c98a896 --- /dev/null +++ b/gcc/config/aarch64/aarch64-isa-modes.def @@ -0,0 +1,35 @@ +/* Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +/* This file defines a set of "ISA modes"; in other words, it defines + various bits of runtime state that control the set of available + instructions or that affect the semantics of instructions in some way. + + Before using #include to read this file, define a macro: + + DEF_AARCH64_ISA_MODE(NAME) + + where NAME is the name of the mode. */ + +/* Indicates that PSTATE.SM is known to be 1 or 0 respectively. These + modes are mutually exclusive. If neither mode is active then the state + of PSTATE.SM is not known at compile time. 
*/ +DEF_AARCH64_ISA_MODE(SM_ON) +DEF_AARCH64_ISA_MODE(SM_OFF) + +#undef DEF_AARCH64_ISA_MODE diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 765c42916f6..f629c1c383e 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -767,6 +767,7 @@ bool aarch64_constant_address_p (rtx); bool aarch64_emit_approx_div (rtx, rtx, rtx); bool aarch64_emit_approx_sqrt (rtx, rtx, bool); tree aarch64_vector_load_decl (tree); +rtx aarch64_gen_callee_cookie (aarch64_feature_flags, arm_pcs); void aarch64_expand_call (rtx, rtx, rtx, bool); bool aarch64_expand_cpymem_mops (rtx *, bool); bool aarch64_expand_cpymem (rtx *); @@ -852,7 +853,7 @@ bool aarch64_use_return_insn_p (void); const char *aarch64_output_casesi (rtx *); const char *aarch64_output_load_tp (rtx); -unsigned int aarch64_tlsdesc_abi_id (); +arm_pcs aarch64_tlsdesc_abi_id (); enum aarch64_symbol_type aarch64_classify_symbol (rtx, HOST_WIDE_INT); enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx); enum reg_class aarch64_regno_regclass (unsigned); diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 7a5d0d325e9..b60728b3b5d 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -464,8 +464,18 @@ handle_aarch64_vector_pcs_attribute (tree *node, tree name, tree, gcc_unreachable (); } +/* Mutually-exclusive function type attributes for controlling PSTATE.SM. */ +static const struct attribute_spec::exclusions attr_streaming_exclusions[] = +{ + /* Attribute name exclusion applies to: + function, type, variable */ + { "streaming", false, true, false }, + { "streaming_compatible", false, true, false }, + { NULL, false, false, false } +}; + /* Table of machine attributes. 
*/ -TARGET_GNU_ATTRIBUTES (aarch64_attribute_table, +static const attribute_spec aarch64_gnu_attributes[] = { /* { name, min_len, max_len, decl_req, type_req, fn_type_req, affects_type_identity, handler, exclude } */ @@ -477,7 +487,31 @@ TARGET_GNU_ATTRIBUTES (aarch64_attribute_table, { "Advanced SIMD type", 1, 1, false, true, false, true, NULL, NULL }, { "SVE type", 3, 3, false, true, false, true, NULL, NULL }, { "SVE sizeless type", 0, 0, false, true, false, true, NULL, NULL } -}); +}; + +static const scoped_attribute_specs aarch64_gnu_attribute_table = +{ + "gnu", aarch64_gnu_attributes +}; + +static const attribute_spec aarch64_arm_attributes[] = +{ + { "streaming", 0, 0, false, true, true, true, + NULL, attr_streaming_exclusions }, + { "streaming_compatible", 0, 0, false, true, true, true, + NULL, attr_streaming_exclusions }, +}; + +static const scoped_attribute_specs aarch64_arm_attribute_table = +{ + "arm", aarch64_arm_attributes +}; + +static const scoped_attribute_specs *const aarch64_attribute_table[] = +{ + &aarch64_gnu_attribute_table, + &aarch64_arm_attribute_table +}; typedef enum aarch64_cond_code { @@ -1715,6 +1749,48 @@ aarch64_fntype_abi (const_tree fntype) return default_function_abi; } +/* Return the state of PSTATE.SM on entry to functions of type FNTYPE. */ + +static aarch64_feature_flags +aarch64_fntype_pstate_sm (const_tree fntype) +{ + if (lookup_attribute ("arm", "streaming", TYPE_ATTRIBUTES (fntype))) + return AARCH64_FL_SM_ON; + + if (lookup_attribute ("arm", "streaming_compatible", + TYPE_ATTRIBUTES (fntype))) + return 0; + + return AARCH64_FL_SM_OFF; +} + +/* Return the ISA mode on entry to functions of type FNTYPE. */ + +static aarch64_feature_flags +aarch64_fntype_isa_mode (const_tree fntype) +{ + return aarch64_fntype_pstate_sm (fntype); +} + +/* Return the state of PSTATE.SM when compiling the body of + function FNDECL. This might be different from the state of + PSTATE.SM on entry. 
*/ + +static aarch64_feature_flags +aarch64_fndecl_pstate_sm (const_tree fndecl) +{ + return aarch64_fntype_pstate_sm (TREE_TYPE (fndecl)); +} + +/* Return the ISA mode that should be used to compile the body of + function FNDECL. */ + +static aarch64_feature_flags +aarch64_fndecl_isa_mode (const_tree fndecl) +{ + return aarch64_fndecl_pstate_sm (fndecl); +} + /* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P. */ static bool @@ -1777,17 +1853,46 @@ aarch64_reg_save_mode (unsigned int regno) gcc_unreachable (); } -/* Implement TARGET_INSN_CALLEE_ABI. */ +/* Given the ISA mode on entry to a callee and the ABI of the callee, + return the CONST_INT that should be placed in an UNSPEC_CALLEE_ABI rtx. */ -const predefined_function_abi & -aarch64_insn_callee_abi (const rtx_insn *insn) +rtx +aarch64_gen_callee_cookie (aarch64_feature_flags isa_mode, arm_pcs pcs_variant) +{ + return gen_int_mode ((unsigned int) isa_mode + | (unsigned int) pcs_variant << AARCH64_NUM_ISA_MODES, + DImode); +} + +/* COOKIE is a CONST_INT from an UNSPEC_CALLEE_ABI rtx. Return the + callee's ABI. */ + +static const predefined_function_abi & +aarch64_callee_abi (rtx cookie) +{ + return function_abis[UINTVAL (cookie) >> AARCH64_NUM_ISA_MODES]; +} + +/* INSN is a call instruction. Return the CONST_INT stored in its + UNSPEC_CALLEE_ABI rtx. */ + +static rtx +aarch64_insn_callee_cookie (const rtx_insn *insn) { rtx pat = PATTERN (insn); gcc_assert (GET_CODE (pat) == PARALLEL); rtx unspec = XVECEXP (pat, 0, 1); gcc_assert (GET_CODE (unspec) == UNSPEC && XINT (unspec, 1) == UNSPEC_CALLEE_ABI); - return function_abis[INTVAL (XVECEXP (unspec, 0, 0))]; + return XVECEXP (unspec, 0, 0); +} + +/* Implement TARGET_INSN_CALLEE_ABI. */ + +const predefined_function_abi & +aarch64_insn_callee_abi (const rtx_insn *insn) +{ + return aarch64_callee_abi (aarch64_insn_callee_cookie (insn)); } /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED. 
The callee only saves @@ -5712,7 +5817,7 @@ aarch64_function_arg (cumulative_args_t pcum_v, const function_arg_info &arg) || pcum->pcs_variant == ARM_PCS_SVE); if (arg.end_marker_p ()) - return gen_int_mode (pcum->pcs_variant, DImode); + return aarch64_gen_callee_cookie (pcum->isa_mode, pcum->pcs_variant); aarch64_layout_arg (pcum_v, arg); return pcum->aapcs_reg; @@ -5733,9 +5838,15 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, pcum->aapcs_nextnvrn = 0; pcum->aapcs_nextnprn = 0; if (fntype) - pcum->pcs_variant = (arm_pcs) fntype_abi (fntype).id (); + { + pcum->pcs_variant = (arm_pcs) fntype_abi (fntype).id (); + pcum->isa_mode = aarch64_fntype_isa_mode (fntype); + } else - pcum->pcs_variant = ARM_PCS_AAPCS64; + { + pcum->pcs_variant = ARM_PCS_AAPCS64; + pcum->isa_mode = AARCH64_FL_DEFAULT_ISA_MODE; + } pcum->aapcs_reg = NULL_RTX; pcum->aapcs_arg_processed = false; pcum->aapcs_stack_words = 0; @@ -8306,7 +8417,9 @@ aarch64_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, } funexp = XEXP (DECL_RTL (function), 0); funexp = gen_rtx_MEM (FUNCTION_MODE, funexp); - rtx callee_abi = gen_int_mode (fndecl_abi (function).id (), DImode); + auto isa_mode = aarch64_fntype_isa_mode (TREE_TYPE (function)); + auto pcs_variant = arm_pcs (fndecl_abi (function).id ()); + rtx callee_abi = aarch64_gen_callee_cookie (isa_mode, pcs_variant); insn = emit_call_insn (gen_sibcall (funexp, const0_rtx, callee_abi)); SIBLING_CALL_P (insn) = 1; @@ -16485,6 +16598,7 @@ aarch64_override_options (void) SUBTARGET_OVERRIDE_OPTIONS; #endif + auto isa_mode = AARCH64_FL_DEFAULT_ISA_MODE; if (cpu && arch) { /* If both -mcpu and -march are specified, warn if they are not @@ -16507,25 +16621,25 @@ aarch64_override_options (void) } selected_arch = arch->arch; - aarch64_set_asm_isa_flags (arch_isa); + aarch64_set_asm_isa_flags (arch_isa | isa_mode); } else if (cpu) { selected_arch = cpu->arch; - aarch64_set_asm_isa_flags (cpu_isa); + aarch64_set_asm_isa_flags (cpu_isa | isa_mode); } else if 
(arch) { cpu = &all_cores[arch->ident]; selected_arch = arch->arch; - aarch64_set_asm_isa_flags (arch_isa); + aarch64_set_asm_isa_flags (arch_isa | isa_mode); } else { /* No -mcpu or -march specified, so use the default CPU. */ cpu = &all_cores[TARGET_CPU_DEFAULT]; selected_arch = cpu->arch; - aarch64_set_asm_isa_flags (cpu->flags); + aarch64_set_asm_isa_flags (cpu->flags | isa_mode); } selected_tune = tune ? tune->ident : cpu->ident; @@ -16698,6 +16812,21 @@ aarch64_save_restore_target_globals (tree new_tree) TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts (); } +/* Return the target_option_node for FNDECL, or the current options + if FNDECL is null. */ + +static tree +aarch64_fndecl_options (tree fndecl) +{ + if (!fndecl) + return target_option_current_node; + + if (tree options = DECL_FUNCTION_SPECIFIC_TARGET (fndecl)) + return options; + + return target_option_default_node; +} + /* Implement TARGET_SET_CURRENT_FUNCTION. Unpack the codegen decisions like tuning and ISA features from the DECL_FUNCTION_SPECIFIC_TARGET of the function, if such exists. This function may be called multiple @@ -16707,25 +16836,24 @@ aarch64_save_restore_target_globals (tree new_tree) static void aarch64_set_current_function (tree fndecl) { - if (!fndecl || fndecl == aarch64_previous_fndecl) - return; - - tree old_tree = (aarch64_previous_fndecl - ? DECL_FUNCTION_SPECIFIC_TARGET (aarch64_previous_fndecl) - : NULL_TREE); - - tree new_tree = DECL_FUNCTION_SPECIFIC_TARGET (fndecl); + tree old_tree = aarch64_fndecl_options (aarch64_previous_fndecl); + tree new_tree = aarch64_fndecl_options (fndecl); - /* If current function has no attributes but the previous one did, - use the default node. */ - if (!new_tree && old_tree) - new_tree = target_option_default_node; + auto new_isa_mode = (fndecl + ? aarch64_fndecl_isa_mode (fndecl) + : AARCH64_FL_DEFAULT_ISA_MODE); + auto isa_flags = TREE_TARGET_OPTION (new_tree)->x_aarch64_isa_flags; /* If nothing to do, return. 
#pragma GCC reset or #pragma GCC pop to the default have been handled by aarch64_save_restore_target_globals from aarch64_pragma_target_parse. */ - if (old_tree == new_tree) - return; + if (old_tree == new_tree + && (!fndecl || aarch64_previous_fndecl) + && (isa_flags & AARCH64_FL_ISA_MODES) == new_isa_mode) + { + gcc_assert (AARCH64_ISA_MODE == new_isa_mode); + return; + } aarch64_previous_fndecl = fndecl; @@ -16733,7 +16861,28 @@ aarch64_set_current_function (tree fndecl) cl_target_option_restore (&global_options, &global_options_set, TREE_TARGET_OPTION (new_tree)); + /* The ISA mode can vary based on function type attributes and + function declaration attributes. Make sure that the target + options correctly reflect these attributes. */ + if ((isa_flags & AARCH64_FL_ISA_MODES) != new_isa_mode) + { + auto base_flags = (aarch64_asm_isa_flags & ~AARCH64_FL_ISA_MODES); + aarch64_set_asm_isa_flags (base_flags | new_isa_mode); + + aarch64_override_options_internal (&global_options); + new_tree = build_target_option_node (&global_options, + &global_options_set); + DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_tree; + + tree new_optimize = build_optimization_node (&global_options, + &global_options_set); + if (new_optimize != optimization_default_node) + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) = new_optimize; + } + aarch64_save_restore_target_globals (new_tree); + + gcc_assert (AARCH64_ISA_MODE == new_isa_mode); } /* Enum describing the various ways we can handle attributes. 
@@ -16783,7 +16932,7 @@ aarch64_handle_attr_arch (const char *str) { gcc_assert (tmp_arch); selected_arch = tmp_arch->arch; - aarch64_set_asm_isa_flags (tmp_flags); + aarch64_set_asm_isa_flags (tmp_flags | AARCH64_ISA_MODE); return true; } @@ -16824,7 +16973,7 @@ aarch64_handle_attr_cpu (const char *str) gcc_assert (tmp_cpu); selected_tune = tmp_cpu->ident; selected_arch = tmp_cpu->arch; - aarch64_set_asm_isa_flags (tmp_flags); + aarch64_set_asm_isa_flags (tmp_flags | AARCH64_ISA_MODE); return true; } @@ -16924,7 +17073,7 @@ aarch64_handle_attr_isa_flags (char *str) features if the user wants to handpick specific features. */ if (strncmp ("+nothing", str, 8) == 0) { - isa_flags = 0; + isa_flags = AARCH64_ISA_MODE; str += 8; } @@ -17417,7 +17566,7 @@ aarch64_can_inline_p (tree caller, tree callee) /* Return the ID of the TLDESC ABI, initializing the descriptor if hasn't been already. */ -unsigned int +arm_pcs aarch64_tlsdesc_abi_id () { predefined_function_abi &tlsdesc_abi = function_abis[ARM_PCS_TLSDESC]; @@ -17431,7 +17580,7 @@ aarch64_tlsdesc_abi_id () SET_HARD_REG_BIT (full_reg_clobbers, regno); tlsdesc_abi.initialize (ARM_PCS_TLSDESC, full_reg_clobbers); } - return tlsdesc_abi.id (); + return ARM_PCS_TLSDESC; } /* Return true if SYMBOL_REF X binds locally. 
*/ @@ -25386,22 +25535,26 @@ aarch64_simd_clone_usable (struct cgraph_node *node) static int aarch64_comp_type_attributes (const_tree type1, const_tree type2) { - auto check_attr = [&](const char *name) { - tree attr1 = lookup_attribute (name, TYPE_ATTRIBUTES (type1)); - tree attr2 = lookup_attribute (name, TYPE_ATTRIBUTES (type2)); + auto check_attr = [&](const char *ns, const char *name) { + tree attr1 = lookup_attribute (ns, name, TYPE_ATTRIBUTES (type1)); + tree attr2 = lookup_attribute (ns, name, TYPE_ATTRIBUTES (type2)); if (!attr1 && !attr2) return true; return attr1 && attr2 && attribute_value_equal (attr1, attr2); }; - if (!check_attr ("aarch64_vector_pcs")) + if (!check_attr ("gnu", "aarch64_vector_pcs")) + return 0; + if (!check_attr ("gnu", "Advanced SIMD type")) + return 0; + if (!check_attr ("gnu", "SVE type")) return 0; - if (!check_attr ("Advanced SIMD type")) + if (!check_attr ("gnu", "SVE sizeless type")) return 0; - if (!check_attr ("SVE type")) + if (!check_attr ("arm", "streaming")) return 0; - if (!check_attr ("SVE sizeless type")) + if (!check_attr ("arm", "streaming_compatible")) return 0; return 1; } diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 43b56748b36..08d135d9a74 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -157,10 +157,13 @@ #ifndef USED_FOR_TARGET -/* Define an enum of all features (architectures and extensions). */ +/* Define an enum of all features (ISA modes, architectures and extensions). + The ISA modes must come first. 
*/ enum class aarch64_feature : unsigned char { +#define DEF_AARCH64_ISA_MODE(IDENT) IDENT, #define AARCH64_OPT_EXTENSION(A, IDENT, C, D, E, F) IDENT, #define AARCH64_ARCH(A, B, IDENT, D, E) IDENT, +#include "aarch64-isa-modes.def" #include "aarch64-option-extensions.def" #include "aarch64-arches.def" }; @@ -169,16 +172,34 @@ enum class aarch64_feature : unsigned char { #define HANDLE(IDENT) \ constexpr auto AARCH64_FL_##IDENT \ = aarch64_feature_flags (1) << int (aarch64_feature::IDENT); +#define DEF_AARCH64_ISA_MODE(IDENT) HANDLE (IDENT) #define AARCH64_OPT_EXTENSION(A, IDENT, C, D, E, F) HANDLE (IDENT) #define AARCH64_ARCH(A, B, IDENT, D, E) HANDLE (IDENT) +#include "aarch64-isa-modes.def" #include "aarch64-option-extensions.def" #include "aarch64-arches.def" #undef HANDLE +constexpr auto AARCH64_FL_SM_STATE = AARCH64_FL_SM_ON | AARCH64_FL_SM_OFF; + +constexpr unsigned int AARCH64_NUM_ISA_MODES = (0 +#define DEF_AARCH64_ISA_MODE(IDENT) + 1 +#include "aarch64-isa-modes.def" +); + +/* The mask of all ISA modes. */ +constexpr auto AARCH64_FL_ISA_MODES + = (aarch64_feature_flags (1) << AARCH64_NUM_ISA_MODES) - 1; + +/* The default ISA mode, for functions with no attributes that specify + something to the contrary. */ +constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; + #endif /* Macros to test ISA flags. */ +#define AARCH64_ISA_MODE (aarch64_isa_flags & AARCH64_FL_ISA_MODES) #define AARCH64_ISA_CRC (aarch64_isa_flags & AARCH64_FL_CRC) #define AARCH64_ISA_CRYPTO (aarch64_isa_flags & AARCH64_FL_CRYPTO) #define AARCH64_ISA_FP (aarch64_isa_flags & AARCH64_FL_FP) @@ -935,6 +956,7 @@ enum arm_pcs typedef struct { enum arm_pcs pcs_variant; + aarch64_feature_flags isa_mode; int aapcs_arg_processed; /* No need to lay out this argument again. */ int aapcs_ncrn; /* Next Core register number. */ int aapcs_nextncrn; /* Next next core register number. 
*/ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index d843f472dc2..e6b19b962b1 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -7335,7 +7335,8 @@ (define_expand "tlsdesc_small_" { if (TARGET_SVE) { - rtx abi = gen_int_mode (aarch64_tlsdesc_abi_id (), DImode); + rtx abi = aarch64_gen_callee_cookie (AARCH64_ISA_MODE, + aarch64_tlsdesc_abi_id ()); rtx_insn *call = emit_call_insn (gen_tlsdesc_small_sve_ (operands[0], abi)); RTL_CONST_CALL_P (call) = 1; diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64 index a9a244ab6d6..a4e0aa03274 100644 --- a/gcc/config/aarch64/t-aarch64 +++ b/gcc/config/aarch64/t-aarch64 @@ -20,7 +20,10 @@ TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \ $(srcdir)/config/aarch64/aarch64-tuning-flags.def \ - $(srcdir)/config/aarch64/aarch64-option-extensions.def + $(srcdir)/config/aarch64/aarch64-option-extensions.def \ + $(srcdir)/config/aarch64/aarch64-cores.def \ + $(srcdir)/config/aarch64/aarch64-isa-modes.def \ + $(srcdir)/config/aarch64/aarch64-arches.def OPTIONS_H_EXTRA += $(srcdir)/config/aarch64/aarch64-cores.def \ $(srcdir)/config/aarch64/aarch64-arches.def diff --git a/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme.exp b/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme.exp new file mode 100644 index 00000000000..72fcd0bd982 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme.exp @@ -0,0 +1,40 @@ +# Specific regression driver for AArch64 SME. +# Copyright (C) 2009-2023 Free Software Foundation, Inc. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. 
+# +# GCC is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . */ + +# GCC testsuite that uses the `dg.exp' driver. + +# Exit immediately if this isn't an AArch64 target. +if {![istarget aarch64*-*-*] } { + return +} + +# Load support procs. +load_lib g++-dg.exp + +# Initialize `dg'. +dg-init + +aarch64-with-arch-dg-options "" { + # Main loop. + dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \ + "" "" +} + +# All done. +dg-finish diff --git a/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C b/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C new file mode 100644 index 00000000000..032485adf95 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C @@ -0,0 +1,4 @@ +/* { dg-options "-std=c++11 -pedantic-errors" } */ + +void f1 () __arm_streaming; +void f2 () __arm_streaming_compatible; diff --git a/gcc/testsuite/g++.target/aarch64/sme/streaming_mode_1.C b/gcc/testsuite/g++.target/aarch64/sme/streaming_mode_1.C new file mode 100644 index 00000000000..c3de726e726 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sme/streaming_mode_1.C @@ -0,0 +1,142 @@ +// { dg-options "" } + +void sc_a () [[arm::streaming_compatible]]; +void sc_a (); // { dg-error "ambiguating new declaration" "" { xfail *-*-* } } + +void sc_b (); +void sc_b () [[arm::streaming_compatible]]; // { dg-error "ambiguating new declaration" } + +void sc_c () [[arm::streaming_compatible]]; +void sc_c () {} // Inherits attribute from declaration (confusingly). 
+ +void sc_d (); +void sc_d () [[arm::streaming_compatible]] {} // { dg-error "ambiguating new declaration" } + +void sc_e () [[arm::streaming_compatible]] {} +void sc_e (); // { dg-error "ambiguating new declaration" "" { xfail *-*-* } } + +void sc_f () {} +void sc_f () [[arm::streaming_compatible]]; // { dg-error "ambiguating new declaration" } + +extern void (*sc_g) (); +extern void (*sc_g) () [[arm::streaming_compatible]]; // { dg-error "conflicting declaration" } + +extern void (*sc_h) () [[arm::streaming_compatible]]; +extern void (*sc_h) (); // { dg-error "conflicting declaration" } + +//---------------------------------------------------------------------------- + +void s_a () [[arm::streaming]]; +void s_a (); // { dg-error "ambiguating new declaration" "" { xfail *-*-* } } + +void s_b (); +void s_b () [[arm::streaming]]; // { dg-error "ambiguating new declaration" } + +void s_c () [[arm::streaming]]; +void s_c () {} // Inherits attribute from declaration (confusingly). + +void s_d (); +void s_d () [[arm::streaming]] {} // { dg-error "ambiguating new declaration" } + +void s_e () [[arm::streaming]] {} +void s_e (); // { dg-error "ambiguating new declaration" "" { xfail *-*-* } } + +void s_f () {} +void s_f () [[arm::streaming]]; // { dg-error "ambiguating new declaration" } + +extern void (*s_g) (); +extern void (*s_g) () [[arm::streaming]]; // { dg-error "conflicting declaration" } + +extern void (*s_h) () [[arm::streaming]]; +extern void (*s_h) (); // { dg-error "conflicting declaration" } + +//---------------------------------------------------------------------------- + +void mixed_a () [[arm::streaming]]; +void mixed_a () [[arm::streaming_compatible]]; // { dg-error "ambiguating new declaration" } + +void mixed_b () [[arm::streaming_compatible]]; +void mixed_b () [[arm::streaming]]; // { dg-error "ambiguating new declaration" } + +void mixed_c () [[arm::streaming]]; +void mixed_c () [[arm::streaming_compatible]] {} // { dg-error "ambiguating new 
declaration" } + +void mixed_d () [[arm::streaming_compatible]]; +void mixed_d () [[arm::streaming]] {} // { dg-error "ambiguating new declaration" } + +void mixed_e () [[arm::streaming]] {} +void mixed_e () [[arm::streaming_compatible]]; // { dg-error "ambiguating new declaration" } + +void mixed_f () [[arm::streaming_compatible]] {} +void mixed_f () [[arm::streaming]]; // { dg-error "ambiguating new declaration" } + +extern void (*mixed_g) () [[arm::streaming_compatible]]; +extern void (*mixed_g) () [[arm::streaming]]; // { dg-error "conflicting declaration" } + +extern void (*mixed_h) () [[arm::streaming]]; +extern void (*mixed_h) () [[arm::streaming_compatible]]; // { dg-error "conflicting declaration" } + +//---------------------------------------------------------------------------- + +void contradiction_1 () [[arm::streaming, arm::streaming_compatible]]; // { dg-warning "conflicts with attribute" } +void contradiction_2 () [[arm::streaming_compatible, arm::streaming]]; // { dg-warning "conflicts with attribute" } + +int [[arm::streaming_compatible]] int_attr; // { dg-warning "attribute ignored" } +void [[arm::streaming_compatible]] ret_attr (); // { dg-warning "attribute ignored" } +void *[[arm::streaming]] ptr_attr; // { dg-warning "only applies to function types" } + +typedef void s_callback () [[arm::streaming]]; +typedef void sc_callback () [[arm::streaming_compatible]]; + +typedef void contradiction_callback_1 () [[arm::streaming, arm::streaming_compatible]]; // { dg-warning "conflicts with attribute" } +typedef void contradiction_callback_2 () [[arm::streaming_compatible, arm::streaming]]; // { dg-warning "conflicts with attribute" } + +void (*contradiction_callback_ptr_1) () [[arm::streaming, arm::streaming_compatible]]; // { dg-warning "conflicts with attribute" } +void (*contradiction_callback_ptr_2) () [[arm::streaming_compatible, arm::streaming]]; // { dg-warning "conflicts with attribute" } + +struct s { + void (*contradiction_callback_ptr_1) () 
[[arm::streaming, arm::streaming_compatible]]; // { dg-warning "conflicts with attribute" } + void (*contradiction_callback_ptr_2) () [[arm::streaming_compatible, arm::streaming]]; // { dg-warning "conflicts with attribute" } +}; + +//---------------------------------------------------------------------------- + +void keyword_ok_1 () __arm_streaming; +void keyword_ok_1 () __arm_streaming; + +void keyword_ok_2 () __arm_streaming; +void keyword_ok_2 () [[arm::streaming]]; + +void keyword_ok_3 () [[arm::streaming]]; +void keyword_ok_3 () __arm_streaming; + +void keyword_ok_4 () __arm_streaming [[arm::streaming]]; + +void keyword_ok_5 () __arm_streaming_compatible; +void keyword_ok_5 () [[arm::streaming_compatible]]; + +//---------------------------------------------------------------------------- + +void keyword_contradiction_1 () __arm_streaming; +void keyword_contradiction_1 (); // { dg-error "ambiguating new declaration" "" { xfail *-*-* } } + +void keyword_contradiction_2 (); +void keyword_contradiction_2 () __arm_streaming; // { dg-error "ambiguating new declaration" } + +void keyword_contradiction_3 () __arm_streaming; +void keyword_contradiction_3 () [[arm::streaming_compatible]]; // { dg-error "ambiguating new declaration" } + +void keyword_contradiction_4 () [[arm::streaming_compatible]]; +void keyword_contradiction_4 () __arm_streaming; // { dg-error "ambiguating new declaration" } + +//---------------------------------------------------------------------------- + +struct s1 +{ + virtual void f () [[arm::streaming]]; +}; + +struct s2 : public s1 +{ + void f () override; // { dg-error "conflicting type attributes" } +}; diff --git a/gcc/testsuite/g++.target/aarch64/sme/streaming_mode_2.C b/gcc/testsuite/g++.target/aarch64/sme/streaming_mode_2.C new file mode 100644 index 00000000000..f2dd2db9b6f --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sme/streaming_mode_2.C @@ -0,0 +1,25 @@ +// { dg-options "" } + +void sc_fn () [[arm::streaming_compatible]]; 
+void s_fn () [[arm::streaming]]; +void ns_fn (); + +void (*sc_fn_ptr) () [[arm::streaming_compatible]]; +void (*s_fn_ptr) () [[arm::streaming]]; +void (*ns_fn_ptr) (); + +void +f () +{ + sc_fn_ptr = sc_fn; + sc_fn_ptr = s_fn; // { dg-error "invalid conversion" } + sc_fn_ptr = ns_fn; // { dg-error "invalid conversion" } + + s_fn_ptr = sc_fn; // { dg-error "invalid conversion" } + s_fn_ptr = s_fn; + s_fn_ptr = ns_fn; // { dg-error "invalid conversion" } + + ns_fn_ptr = sc_fn; // { dg-error "invalid conversion" } + ns_fn_ptr = s_fn; // { dg-error "invalid conversion" } + ns_fn_ptr = ns_fn; +} diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-1.c b/gcc/testsuite/gcc.target/aarch64/auto-init-1.c index 0fa470880bf..45bb02561ed 100644 --- a/gcc/testsuite/gcc.target/aarch64/auto-init-1.c +++ b/gcc/testsuite/gcc.target/aarch64/auto-init-1.c @@ -29,4 +29,5 @@ void foo() return; } -/* { dg-final { scan-rtl-dump-times "const_int 0" 11 "expand" } } */ +/* Includes 1 for the call instruction and 1 for a nop. */ +/* { dg-final { scan-rtl-dump-times "const_int 0" 10 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp new file mode 100644 index 00000000000..c990e59247a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp @@ -0,0 +1,40 @@ +# Specific regression driver for AArch64 SME. +# Copyright (C) 2009-2023 Free Software Foundation, Inc. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. 
+# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . */ + +# GCC testsuite that uses the `dg.exp' driver. + +# Exit immediately if this isn't an AArch64 target. +if {![istarget aarch64*-*-*] } { + return +} + +# Load support procs. +load_lib gcc-dg.exp + +# Initialize `dg'. +dg-init + +aarch64-with-arch-dg-options "" { + # Main loop. + dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \ + "" "" +} + +# All done. +dg-finish diff --git a/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c b/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c new file mode 100644 index 00000000000..8f1b836764e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c @@ -0,0 +1,4 @@ +/* { dg-options "-std=c90 -pedantic-errors" } */ + +void f1 () __arm_streaming; +void f2 () __arm_streaming_compatible; diff --git a/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_1.c b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_1.c new file mode 100644 index 00000000000..8874b05b882 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_1.c @@ -0,0 +1,130 @@ +// { dg-options "" } + +void sc_a () [[arm::streaming_compatible]]; +void sc_a (); // { dg-error "conflicting types" } + +void sc_b (); +void sc_b () [[arm::streaming_compatible]]; // { dg-error "conflicting types" } + +void sc_c () [[arm::streaming_compatible]]; +void sc_c () {} // Inherits attribute from declaration (confusingly). 
+ +void sc_d (); +void sc_d () [[arm::streaming_compatible]] {} // { dg-error "conflicting types" } + +void sc_e () [[arm::streaming_compatible]] {} +void sc_e (); // { dg-error "conflicting types" } + +void sc_f () {} +void sc_f () [[arm::streaming_compatible]]; // { dg-error "conflicting types" } + +extern void (*sc_g) (); +extern void (*sc_g) () [[arm::streaming_compatible]]; // { dg-error "conflicting types" } + +extern void (*sc_h) () [[arm::streaming_compatible]]; +extern void (*sc_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void s_a () [[arm::streaming]]; +void s_a (); // { dg-error "conflicting types" } + +void s_b (); +void s_b () [[arm::streaming]]; // { dg-error "conflicting types" } + +void s_c () [[arm::streaming]]; +void s_c () {} // Inherits attribute from declaration (confusingly). + +void s_d (); +void s_d () [[arm::streaming]] {} // { dg-error "conflicting types" } + +void s_e () [[arm::streaming]] {} +void s_e (); // { dg-error "conflicting types" } + +void s_f () {} +void s_f () [[arm::streaming]]; // { dg-error "conflicting types" } + +extern void (*s_g) (); +extern void (*s_g) () [[arm::streaming]]; // { dg-error "conflicting types" } + +extern void (*s_h) () [[arm::streaming]]; +extern void (*s_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void mixed_a () [[arm::streaming]]; +void mixed_a () [[arm::streaming_compatible]]; // { dg-error "conflicting types" } + +void mixed_b () [[arm::streaming_compatible]]; +void mixed_b () [[arm::streaming]]; // { dg-error "conflicting types" } + +void mixed_c () [[arm::streaming]]; +void mixed_c () [[arm::streaming_compatible]] {} // { dg-error "conflicting types" } + +void mixed_d () [[arm::streaming_compatible]]; +void mixed_d () [[arm::streaming]] {} // { dg-error "conflicting types" } + +void mixed_e () [[arm::streaming]] {} +void mixed_e () 
[[arm::streaming_compatible]]; // { dg-error "conflicting types" } + +void mixed_f () [[arm::streaming_compatible]] {} +void mixed_f () [[arm::streaming]]; // { dg-error "conflicting types" } + +extern void (*mixed_g) () [[arm::streaming_compatible]]; +extern void (*mixed_g) () [[arm::streaming]]; // { dg-error "conflicting types" } + +extern void (*mixed_h) () [[arm::streaming]]; +extern void (*mixed_h) () [[arm::streaming_compatible]]; // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void contradiction_1 () [[arm::streaming, arm::streaming_compatible]]; // { dg-warning "conflicts with attribute" } +void contradiction_2 () [[arm::streaming_compatible, arm::streaming]]; // { dg-warning "conflicts with attribute" } + +int [[arm::streaming_compatible]] int_attr; // { dg-warning "only applies to function types" } +void [[arm::streaming_compatible]] ret_attr (); // { dg-warning "only applies to function types" } +void *[[arm::streaming]] ptr_attr; // { dg-warning "only applies to function types" } + +typedef void s_callback () [[arm::streaming]]; +typedef void sc_callback () [[arm::streaming_compatible]]; + +typedef void contradiction_callback_1 () [[arm::streaming, arm::streaming_compatible]]; // { dg-warning "conflicts with attribute" } +typedef void contradiction_callback_2 () [[arm::streaming_compatible, arm::streaming]]; // { dg-warning "conflicts with attribute" } + +void (*contradiction_callback_ptr_1) () [[arm::streaming, arm::streaming_compatible]]; // { dg-warning "conflicts with attribute" } +void (*contradiction_callback_ptr_2) () [[arm::streaming_compatible, arm::streaming]]; // { dg-warning "conflicts with attribute" } + +struct s { + void (*contradiction_callback_ptr_1) () [[arm::streaming, arm::streaming_compatible]]; // { dg-warning "conflicts with attribute" } + void (*contradiction_callback_ptr_2) () [[arm::streaming_compatible, arm::streaming]]; // { dg-warning "conflicts with 
attribute" } +}; + +//---------------------------------------------------------------------------- + +void keyword_ok_1 () __arm_streaming; +void keyword_ok_1 () __arm_streaming; + +void keyword_ok_2 () __arm_streaming; +void keyword_ok_2 () [[arm::streaming]]; + +void keyword_ok_3 () [[arm::streaming]]; +void keyword_ok_3 () __arm_streaming; + +void keyword_ok_4 () __arm_streaming [[arm::streaming]]; + +void keyword_ok_5 () __arm_streaming_compatible; +void keyword_ok_5 () [[arm::streaming_compatible]]; + +//---------------------------------------------------------------------------- + +void keyword_contradiction_1 () __arm_streaming; +void keyword_contradiction_1 (); // { dg-error "conflicting types" } + +void keyword_contradiction_2 (); +void keyword_contradiction_2 () __arm_streaming; // { dg-error "conflicting types" } + +void keyword_contradiction_3 () __arm_streaming; +void keyword_contradiction_3 () [[arm::streaming_compatible]]; // { dg-error "conflicting types" } + +void keyword_contradiction_4 () [[arm::streaming_compatible]]; +void keyword_contradiction_4 () __arm_streaming; // { dg-error "conflicting types" } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_2.c b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_2.c new file mode 100644 index 00000000000..e8be0f82176 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_2.c @@ -0,0 +1,25 @@ +// { dg-options "" } + +void sc_fn () [[arm::streaming_compatible]]; +void s_fn () [[arm::streaming]]; +void ns_fn (); + +void (*sc_fn_ptr) () [[arm::streaming_compatible]]; +void (*s_fn_ptr) () [[arm::streaming]]; +void (*ns_fn_ptr) (); + +void +f () +{ + sc_fn_ptr = sc_fn; + sc_fn_ptr = s_fn; // { dg-error "incompatible pointer type" } + sc_fn_ptr = ns_fn; // { dg-error "incompatible pointer type" } + + s_fn_ptr = sc_fn; // { dg-error "incompatible pointer type" } + s_fn_ptr = s_fn; + s_fn_ptr = ns_fn; // { dg-error "incompatible pointer type" } + + ns_fn_ptr = sc_fn; // { 
dg-error "incompatible pointer type" } + ns_fn_ptr = s_fn; // { dg-error "incompatible pointer type" } + ns_fn_ptr = ns_fn; +}
From patchwork Tue Dec 5 10:13:10 2023 X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872040 From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 12/25] aarch64: Add +sme Date: Tue, 5 Dec 2023 10:13:10 +0000 Message-Id: <20231205101323.1914247-13-richard.sandiford@arm.com> In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> This patch adds the +sme
ISA feature and requires it to be present when compiling arm_streaming code. (arm_streaming_compatible code does not necessarily assume the presence of SME. It just has to work when SME is present and streaming mode is enabled.) gcc/ * doc/invoke.texi: Document SME. * doc/sourcebuild.texi: Document aarch64_sme. * config/aarch64/aarch64-option-extensions.def (sme): Define. * config/aarch64/aarch64.h (AARCH64_ISA_SME): New macro. (TARGET_SME): Likewise. * config/aarch64/aarch64.cc (aarch64_override_options_internal): Ensure that SME is present when compiling streaming code. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_aarch64_sme): New target test. * gcc.target/aarch64/sme/aarch64-sme.exp: Force SME to be enabled if it isn't by default. * g++.target/aarch64/sme/aarch64-sme.exp: Likewise. * gcc.target/aarch64/sme/streaming_mode_3.c: New test. * gcc.target/aarch64/sme/streaming_mode_4.c: Likewise. --- .../aarch64/aarch64-option-extensions.def | 2 + gcc/config/aarch64/aarch64.cc | 33 ++++++++++ gcc/config/aarch64/aarch64.h | 5 ++ gcc/doc/invoke.texi | 2 + gcc/doc/sourcebuild.texi | 2 + .../g++.target/aarch64/sme/aarch64-sme.exp | 10 ++- .../gcc.target/aarch64/sme/aarch64-sme.exp | 10 ++- .../gcc.target/aarch64/sme/streaming_mode_3.c | 63 +++++++++++++++++++ .../gcc.target/aarch64/sme/streaming_mode_4.c | 22 +++++++ gcc/testsuite/lib/target-supports.exp | 12 ++++ 10 files changed, 157 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_4.c diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def index 825f3bf7758..fb9ff1b66b2 100644 --- a/gcc/config/aarch64/aarch64-option-extensions.def +++ b/gcc/config/aarch64/aarch64-option-extensions.def @@ -151,4 +151,6 @@ AARCH64_OPT_EXTENSION("mops", MOPS, (), (), (), "") AARCH64_OPT_EXTENSION("cssc", CSSC, (), (), (), "cssc") +AARCH64_OPT_EXTENSION("sme", SME, (BF16, SVE2), (), (), "sme")
+ #undef AARCH64_OPT_EXTENSION diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index b60728b3b5d..3792f1e99fd 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -9328,6 +9328,23 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2) return true; } +/* Implement TARGET_START_CALL_ARGS. */ + +static void +aarch64_start_call_args (cumulative_args_t ca_v) +{ + CUMULATIVE_ARGS *ca = get_cumulative_args (ca_v); + + if (!TARGET_SME && (ca->isa_mode & AARCH64_FL_SM_ON)) + { + error ("calling a streaming function requires the ISA extension %qs", + "sme"); + inform (input_location, "you can enable %qs using the command-line" + " option %<-march%>, or by using the %" + " attribute or pragma", "sme"); + } +} + /* This function is used by the call expanders of the machine description. RESULT is the register in which the result is returned. It's NULL for "call" and "sibcall". @@ -16147,6 +16164,19 @@ aarch64_override_options_internal (struct gcc_options *opts) && !fixed_regs[R18_REGNUM]) error ("%<-fsanitize=shadow-call-stack%> requires %<-ffixed-x18%>"); + if ((opts->x_aarch64_isa_flags & AARCH64_FL_SM_ON) + && !(opts->x_aarch64_isa_flags & AARCH64_FL_SME)) + { + error ("streaming functions require the ISA extension %qs", "sme"); + inform (input_location, "you can enable %qs using the command-line" + " option %<-march%>, or by using the %" + " attribute or pragma", "sme"); + opts->x_target_flags &= ~MASK_GENERAL_REGS_ONLY; + auto new_flags = (opts->x_aarch64_asm_isa_flags + | feature_deps::SME ().enable); + aarch64_set_asm_isa_flags (opts, new_flags); + } + initialize_aarch64_code_model (opts); initialize_aarch64_tls_size (opts); aarch64_tpidr_register = opts->x_aarch64_tpidr_reg; @@ -26248,6 +26278,9 @@ aarch64_run_selftests (void) #undef TARGET_FUNCTION_VALUE_REGNO_P #define TARGET_FUNCTION_VALUE_REGNO_P aarch64_function_value_regno_p +#undef TARGET_START_CALL_ARGS +#define TARGET_START_CALL_ARGS 
aarch64_start_call_args + #undef TARGET_GIMPLE_FOLD_BUILTIN #define TARGET_GIMPLE_FOLD_BUILTIN aarch64_gimple_fold_builtin diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 08d135d9a74..aa908ced7cd 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -214,6 +214,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define AARCH64_ISA_SVE2_BITPERM (aarch64_isa_flags & AARCH64_FL_SVE2_BITPERM) #define AARCH64_ISA_SVE2_SHA3 (aarch64_isa_flags & AARCH64_FL_SVE2_SHA3) #define AARCH64_ISA_SVE2_SM4 (aarch64_isa_flags & AARCH64_FL_SVE2_SM4) +#define AARCH64_ISA_SME (aarch64_isa_flags & AARCH64_FL_SME) #define AARCH64_ISA_V8_3A (aarch64_isa_flags & AARCH64_FL_V8_3A) #define AARCH64_ISA_DOTPROD (aarch64_isa_flags & AARCH64_FL_DOTPROD) #define AARCH64_ISA_AES (aarch64_isa_flags & AARCH64_FL_AES) @@ -293,6 +294,10 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* SVE2 SM4 instructions, enabled through +sve2-sm4. */ #define TARGET_SVE2_SM4 (AARCH64_ISA_SVE2_SM4) +/* SME instructions, enabled through +sme. Note that this does not + imply anything about the state of PSTATE.SM. */ +#define TARGET_SME (AARCH64_ISA_SME) + /* ARMv8.3-A features. */ #define TARGET_ARMV8_3 (AARCH64_ISA_V8_3A) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 681e3f3f466..b138a74cc2b 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21273,6 +21273,8 @@ Enable the Flag Manipulation instructions Extension. Enable the Pointer Authentication Extension. @item cssc Enable the Common Short Sequence Compression instructions. +@item sme +Enable the Scalable Matrix Extension. @end table diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index d3f68f35371..123e73508b6 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -2316,6 +2316,8 @@ AArch64 target which generates instruction sequences for big endian. 
@item aarch64_small_fpic Binutils installed on test system supports relocation types required by -fpic for AArch64 small memory model. +@item aarch64_sme +AArch64 target that generates instructions for SME. @item aarch64_sve_hw AArch64 target that is able to generate and execute SVE code (regardless of whether it does so by default). diff --git a/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme.exp b/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme.exp index 72fcd0bd982..1c3e69cde12 100644 --- a/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme.exp +++ b/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme.exp @@ -30,10 +30,16 @@ load_lib g++-dg.exp # Initialize `dg'. dg-init -aarch64-with-arch-dg-options "" { +if { [check_effective_target_aarch64_sme] } { + set sme_flags "" +} else { + set sme_flags "-march=armv9-a+sme" +} + +aarch64-with-arch-dg-options $sme_flags { # Main loop. dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \ - "" "" + "" $sme_flags } # All done. diff --git a/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp index c990e59247a..011310e8061 100644 --- a/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp +++ b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp @@ -30,10 +30,16 @@ load_lib gcc-dg.exp # Initialize `dg'. dg-init -aarch64-with-arch-dg-options "" { +if { [check_effective_target_aarch64_sme] } { + set sme_flags "" +} else { + set sme_flags "-march=armv9-a+sme" +} + +aarch64-with-arch-dg-options $sme_flags { # Main loop. dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \ - "" "" + "" $sme_flags } # All done. 
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_3.c b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_3.c new file mode 100644 index 00000000000..45ec92321b2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_3.c @@ -0,0 +1,63 @@ +// { dg-options "" } + +#pragma GCC target "+nosme" + +void sc_a () [[arm::streaming_compatible]] {} +void s_a () [[arm::streaming]] {} // { dg-error "streaming functions require the ISA extension 'sme'" } +void ns_a () {} + +void sc_b () [[arm::streaming_compatible]] {} +void ns_b () {} +void s_b () [[arm::streaming]] {} // { dg-error "streaming functions require the ISA extension 'sme'" } + +void sc_c () [[arm::streaming_compatible]] {} +void sc_d () [[arm::streaming_compatible]] {} + +void s_c () [[arm::streaming]] {} // { dg-error "streaming functions require the ISA extension 'sme'" } +void s_d () [[arm::streaming]] {} // { dg-error "streaming functions require the ISA extension 'sme'" } + +void ns_c () {} +void ns_d () {} + +void sc_e () [[arm::streaming_compatible]]; +void s_e () [[arm::streaming]]; +void ns_e (); + +#pragma GCC target "+sme" + +void sc_f () [[arm::streaming_compatible]] {} +void s_f () [[arm::streaming]] {} +void ns_f () {} + +void sc_g () [[arm::streaming_compatible]] {} +void ns_g () {} +void s_g () [[arm::streaming]] {} + +void sc_h () [[arm::streaming_compatible]] {} +void sc_i () [[arm::streaming_compatible]] {} + +void s_h () [[arm::streaming]] {} +void s_i () [[arm::streaming]] {} + +void ns_h () {} +void ns_i () {} + +void sc_j () [[arm::streaming_compatible]]; +void s_j () [[arm::streaming]]; +void ns_j (); + +#pragma GCC target "+sme" + +void sc_k () [[arm::streaming_compatible]] {} + +#pragma GCC target "+nosme" +#pragma GCC target "+sme" + +void s_k () [[arm::streaming]] {} + +#pragma GCC target "+nosme" +#pragma GCC target "+sme" + +void ns_k () {} + +#pragma GCC target "+nosme" diff --git a/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_4.c 
b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_4.c new file mode 100644 index 00000000000..50e92f2e18a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_4.c @@ -0,0 +1,22 @@ +// { dg-options "-mgeneral-regs-only" } + +void sc_a () [[arm::streaming_compatible]] {} +void s_a () [[arm::streaming]] {} // { dg-error "streaming functions require the ISA extension 'sme'" } +void ns_a () {} + +void sc_b () [[arm::streaming_compatible]] {} +void ns_b () {} +void s_b () [[arm::streaming]] {} // { dg-error "streaming functions require the ISA extension 'sme'" } + +void sc_c () [[arm::streaming_compatible]] {} +void sc_d () [[arm::streaming_compatible]] {} + +void s_c () [[arm::streaming]] {} // { dg-error "streaming functions require the ISA extension 'sme'" } +void s_d () [[arm::streaming]] {} // { dg-error "streaming functions require the ISA extension 'sme'" } + +void ns_c () {} +void ns_d () {} + +void sc_e () [[arm::streaming_compatible]]; +void s_e () [[arm::streaming]]; +void ns_e (); diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 83eb08ba54e..c9bbab9007b 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -4386,6 +4386,18 @@ proc aarch64_sve_bits { } { }] } +# Return 1 if this is an AArch64 target that generates instructions for SME. 
+proc check_effective_target_aarch64_sme { } { + if { ![istarget aarch64*-*-*] } { + return 0 + } + return [check_no_compiler_messages aarch64_sme assembly { + #if !defined (__ARM_FEATURE_SME) + #error FOO + #endif + }] +} + # Return 1 if this is a compiler supporting ARC atomic operations proc check_effective_target_arc_atomic { } { return [check_no_compiler_messages arc_atomic assembly {
From patchwork Tue Dec 5 10:13:11 2023 X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872042 From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 13/25] aarch64: Distinguish streaming-compatible AdvSIMD insns Date: Tue, 5 Dec 2023 10:13:11 +0000 Message-Id: <20231205101323.1914247-14-richard.sandiford@arm.com> In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com>
The vast majority of Advanced SIMD instructions are not available in streaming mode, but some of the load/store/move instructions are. This patch adds a new target feature macro called TARGET_BASE_SIMD for this streaming-compatible subset. The vector-to-vector move instructions are not streaming-compatible, so we need to use the SVE move instructions where enabled, or fall back to the nofp16 handling otherwise. I haven't found a good way of testing the SVE EXT alternative in aarch64_simd_mov_from_high, but I'd rather provide it than not. gcc/ * config/aarch64/aarch64.h (TARGET_BASE_SIMD): New macro. (TARGET_SIMD): Require PSTATE.SM to be 0. (AARCH64_ISA_SM_OFF): New macro. * config/aarch64/aarch64.cc (aarch64_array_mode_supported_p): Allow Advanced SIMD structure modes for TARGET_BASE_SIMD. (aarch64_print_operand): Support '%Z'. (aarch64_secondary_reload): Expect SVE moves to be used for Advanced SIMD modes if SVE is enabled and non-streaming Advanced SIMD isn't. (aarch64_register_move_cost): Likewise. (aarch64_simd_container_mode): Extend Advanced SIMD mode handling to TARGET_BASE_SIMD. (aarch64_expand_cpymem): Expand commentary. * config/aarch64/aarch64.md (arches): Add base_simd and nobase_simd. (arch_enabled): Handle it. (*mov_aarch64): Extend UMOV alternative to TARGET_BASE_SIMD. (*movti_aarch64): Use an SVE move instruction if non-streaming SIMD isn't available. (*mov_aarch64): Likewise. (load_pair_dw_tftf): Extend to TARGET_BASE_SIMD. (store_pair_dw_tftf): Likewise. (loadwb_pair_): Likewise. (storewb_pair_): Likewise. * config/aarch64/aarch64-simd.md (*aarch64_simd_mov): Allow UMOV in streaming mode. (*aarch64_simd_mov): Use an SVE move instruction if non-streaming SIMD isn't available.
(aarch64_store_lane0): Depend on TARGET_FLOAT rather than TARGET_SIMD. (aarch64_simd_mov_from_low): Likewise. Use fmov if Advanced SIMD is completely disabled. (aarch64_simd_mov_from_high): Use SVE EXT instructions if non-streaming SIMD isn't available. gcc/testsuite/ * gcc.target/aarch64/movdf_2.c: New test. * gcc.target/aarch64/movdi_3.c: Likewise. * gcc.target/aarch64/movhf_2.c: Likewise. * gcc.target/aarch64/movhi_2.c: Likewise. * gcc.target/aarch64/movqi_2.c: Likewise. * gcc.target/aarch64/movsf_2.c: Likewise. * gcc.target/aarch64/movsi_2.c: Likewise. * gcc.target/aarch64/movtf_3.c: Likewise. * gcc.target/aarch64/movtf_4.c: Likewise. * gcc.target/aarch64/movti_3.c: Likewise. * gcc.target/aarch64/movti_4.c: Likewise. * gcc.target/aarch64/movv16qi_4.c: Likewise. * gcc.target/aarch64/movv16qi_5.c: Likewise. * gcc.target/aarch64/movv8qi_4.c: Likewise. * gcc.target/aarch64/sme/arm_neon_1.c: Likewise. * gcc.target/aarch64/sme/arm_neon_2.c: Likewise. * gcc.target/aarch64/sme/arm_neon_3.c: Likewise. 
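The relationship between TARGET_BASE_SIMD and TARGET_SIMD, and the resulting choice of vector-to-vector move, can be sketched in plain C. This is only an illustration under stated assumptions, not code from the patch: the sketch_* and SKETCH_* names and the flag values are hypothetical stand-ins for the AARCH64_FL_* bits, and the three-way choice is a simplification of the instruction alternatives and secondary-reload handling changed in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the ISA flag bits.  The real bits are the
   AARCH64_FL_* constants; these values are invented for the sketch.  */
enum
{
  SKETCH_FL_SIMD = 1 << 0,   /* Advanced SIMD is enabled.  */
  SKETCH_FL_SVE = 1 << 1,    /* SVE is enabled.  */
  SKETCH_FL_SM_OFF = 1 << 2  /* PSTATE.SM is known to be 0.  */
};

/* Models TARGET_BASE_SIMD: the subset usable in both streaming and
   non-streaming mode.  */
bool
sketch_base_simd_p (unsigned int flags)
{
  return (flags & SKETCH_FL_SIMD) != 0;
}

/* Models TARGET_SIMD: the full set, which additionally needs
   PSTATE.SM to be 0.  */
bool
sketch_simd_p (unsigned int flags)
{
  return sketch_base_simd_p (flags) && (flags & SKETCH_FL_SM_OFF) != 0;
}

/* Ways of copying one 128-bit vector register to another.  */
enum sketch_q_move
{
  SKETCH_MOVE_ADVSIMD, /* Advanced SIMD register move.  */
  SKETCH_MOVE_SVE,     /* SVE register move (the %Z operand form).  */
  SKETCH_MOVE_SCRATCH  /* No direct move; go through a scratch register.  */
};

/* Pick the move strategy in the order the commit message describes:
   full Advanced SIMD first, then SVE, then the scratch fallback.  */
enum sketch_q_move
sketch_q_move_kind (unsigned int flags)
{
  if (sketch_simd_p (flags))
    return SKETCH_MOVE_ADVSIMD;
  if (flags & SKETCH_FL_SVE)
    return SKETCH_MOVE_SVE;
  return SKETCH_MOVE_SCRATCH;
}
```

With these definitions, a flag set that has Advanced SIMD but not SKETCH_FL_SM_OFF still satisfies the base check while failing the full check, which is the case the new base_simd alternatives are meant to cover.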
--- gcc/config/aarch64/aarch64-simd.md | 48 +++++------ gcc/config/aarch64/aarch64.cc | 16 ++-- gcc/config/aarch64/aarch64.h | 12 ++- gcc/config/aarch64/aarch64.md | 79 +++++++++-------- gcc/testsuite/gcc.target/aarch64/movdf_2.c | 51 +++++++++++ gcc/testsuite/gcc.target/aarch64/movdi_3.c | 59 +++++++++++++ gcc/testsuite/gcc.target/aarch64/movhf_2.c | 53 ++++++++++++ gcc/testsuite/gcc.target/aarch64/movhi_2.c | 61 +++++++++++++ gcc/testsuite/gcc.target/aarch64/movqi_2.c | 59 +++++++++++++ gcc/testsuite/gcc.target/aarch64/movsf_2.c | 51 +++++++++++ gcc/testsuite/gcc.target/aarch64/movsi_2.c | 59 +++++++++++++ gcc/testsuite/gcc.target/aarch64/movtf_3.c | 81 +++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movtf_4.c | 78 +++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movti_3.c | 86 +++++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movti_4.c | 83 ++++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movv16qi_4.c | 82 ++++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movv16qi_5.c | 79 +++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movv8qi_4.c | 55 ++++++++++++ .../gcc.target/aarch64/sme/arm_neon_1.c | 13 +++ .../gcc.target/aarch64/sme/arm_neon_2.c | 11 +++ .../gcc.target/aarch64/sme/arm_neon_3.c | 11 +++ 21 files changed, 1060 insertions(+), 67 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/movdf_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movdi_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movhf_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movhi_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movqi_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movsf_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movsi_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movtf_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movtf_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movti_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movti_4.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/movv16qi_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movv16qi_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movv8qi_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/arm_neon_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/arm_neon_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/arm_neon_3.c diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index ad79a8110a5..50b68552fe4 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -149,20 +149,20 @@ (define_insn_and_split "*aarch64_simd_mov" && (register_operand (operands[0], mode) || aarch64_simd_reg_or_zero (operands[1], mode))" {@ [cons: =0, 1; attrs: type, arch, length] - [w , m ; neon_load1_1reg , * , *] ldr\t%d0, %1 - [r , m ; load_8 , * , *] ldr\t%x0, %1 - [m , Dz; store_8 , * , *] str\txzr, %0 - [m , w ; neon_store1_1reg, * , *] str\t%d1, %0 - [m , r ; store_8 , * , *] str\t%x1, %0 - [w , w ; neon_logic , simd, *] mov\t%0., %1. - [w , w ; neon_logic , * , *] fmov\t%d0, %d1 - [?r, w ; neon_to_gp , simd, *] umov\t%0, %1.d[0] - [?r, w ; neon_to_gp , * , *] fmov\t%x0, %d1 - [?w, r ; f_mcr , * , *] fmov\t%d0, %1 - [?r, r ; mov_reg , * , *] mov\t%0, %1 - [w , Dn; neon_move , simd, *] << aarch64_output_simd_mov_immediate (operands[1], 64); - [w , Dz; f_mcr , * , *] fmov\t%d0, xzr - [w , Dx; neon_move , simd, 8] # + [w , m ; neon_load1_1reg , * , *] ldr\t%d0, %1 + [r , m ; load_8 , * , *] ldr\t%x0, %1 + [m , Dz; store_8 , * , *] str\txzr, %0 + [m , w ; neon_store1_1reg, * , *] str\t%d1, %0 + [m , r ; store_8 , * , *] str\t%x1, %0 + [w , w ; neon_logic , simd , *] mov\t%0., %1. 
+ [w , w ; neon_logic , * , *] fmov\t%d0, %d1 + [?r, w ; neon_to_gp , base_simd, *] umov\t%0, %1.d[0] + [?r, w ; neon_to_gp , * , *] fmov\t%x0, %d1 + [?w, r ; f_mcr , * , *] fmov\t%d0, %1 + [?r, r ; mov_reg , * , *] mov\t%0, %1 + [w , Dn; neon_move , simd , *] << aarch64_output_simd_mov_immediate (operands[1], 64); + [w , Dz; f_mcr , * , *] fmov\t%d0, xzr + [w , Dx; neon_move , simd , 8] # } "CONST_INT_P (operands[1]) && aarch64_simd_special_constant_p (operands[1], mode) @@ -185,6 +185,7 @@ (define_insn_and_split "*aarch64_simd_mov" [Umn, Dz; store_16 , * , 4] stp\txzr, xzr, %0 [m , w ; neon_store1_1reg, * , 4] str\t%q1, %0 [w , w ; neon_logic , simd, 4] mov\t%0., %1. + [w , w ; * , sve , 4] mov\t%Z0.d, %Z1.d [?r , w ; multiple , * , 8] # [?w , r ; multiple , * , 8] # [?r , r ; multiple , * , 8] # @@ -225,7 +226,7 @@ (define_insn "aarch64_store_lane0" [(set (match_operand: 0 "memory_operand" "=m") (vec_select: (match_operand:VALL_F16 1 "register_operand" "w") (parallel [(match_operand 2 "const_int_operand" "n")])))] - "TARGET_SIMD + "TARGET_FLOAT && ENDIAN_LANE_N (, INTVAL (operands[2])) == 0" "str\\t%1, %0" [(set_attr "type" "neon_store1_1reg")] @@ -374,18 +375,18 @@ (define_insn_and_split "aarch64_simd_mov_from_low" (vec_select: (match_operand:VQMOV_NO2E 1 "register_operand") (match_operand:VQMOV_NO2E 2 "vect_par_cnst_lo_half")))] - "TARGET_SIMD" - {@ [ cons: =0 , 1 ; attrs: type ] - [ w , w ; mov_reg ] # - [ ?r , w ; neon_to_gp ] umov\t%0, %1.d[0] + "TARGET_FLOAT" + {@ [ cons: =0 , 1 ; attrs: type , arch ] + [ w , w ; mov_reg , simd ] # + [ ?r , w ; neon_to_gp , base_simd ] umov\t%0, %1.d[0] + [ ?r , w ; f_mrc , * ] fmov\t%0, %d1 } "&& reload_completed && aarch64_simd_register (operands[0], mode)" [(set (match_dup 0) (match_dup 1))] { operands[1] = aarch64_replace_reg_mode (operands[1], mode); } - [ - (set_attr "length" "4")] + [(set_attr "length" "4")] ) (define_insn "aarch64_simd_mov_from_high" @@ -396,12 +397,11 @@ (define_insn "aarch64_simd_mov_from_high" 
"TARGET_FLOAT" {@ [ cons: =0 , 1 ; attrs: type , arch ] [ w , w ; neon_dup , simd ] dup\t%d0, %1.d[1] + [ w , w ; * , sve ] ext\t%Z0.b, %Z0.b, %Z0.b, #8 [ ?r , w ; neon_to_gp , simd ] umov\t%0, %1.d[1] [ ?r , w ; f_mrc , * ] fmov\t%0, %1.d[1] } - [ - - (set_attr "length" "4")] + [(set_attr "length" "4")] ) (define_insn "orn3" diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 3792f1e99fd..ea00ec192ee 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -1400,7 +1400,7 @@ static bool aarch64_array_mode_supported_p (machine_mode mode, unsigned HOST_WIDE_INT nelems) { - if (TARGET_SIMD + if (TARGET_BASE_SIMD && (AARCH64_VALID_SIMD_QREG_MODE (mode) || AARCH64_VALID_SIMD_DREG_MODE (mode)) && (nelems >= 2 && nelems <= 4)) @@ -10762,8 +10762,8 @@ aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x, return NO_REGS; } - /* Without the TARGET_SIMD instructions we cannot move a Q register - to a Q register directly. We need a scratch. */ + /* Without the TARGET_SIMD or TARGET_SVE instructions we cannot move a + Q register to a Q register directly. We need a scratch. */ if (REG_P (x) && (mode == TFmode || mode == TImode @@ -13368,7 +13368,7 @@ aarch64_register_move_cost (machine_mode mode, secondary reload. A general register is used as a scratch to move the upper DI value and the lower DI value is moved directly, hence the cost is the sum of three moves. */ - if (! 
TARGET_SIMD) + if (!TARGET_SIMD && !TARGET_SVE) return regmove_cost->GP2FP + regmove_cost->FP2GP + regmove_cost->FP2FP; return regmove_cost->FP2FP; @@ -18996,7 +18996,7 @@ aarch64_simd_container_mode (scalar_mode mode, poly_int64 width) return aarch64_full_sve_mode (mode).else_mode (word_mode); gcc_assert (known_eq (width, 64) || known_eq (width, 128)); - if (TARGET_SIMD) + if (TARGET_BASE_SIMD) { if (known_eq (width, 128)) return aarch64_vq_mode (mode).else_mode (word_mode); @@ -23409,7 +23409,11 @@ aarch64_expand_cpymem (rtx *operands) int copy_bits = 256; /* Default to 256-bit LDP/STP on large copies, however small copies, no SIMD - support or slow 256-bit LDP/STP fall back to 128-bit chunks. */ + support or slow 256-bit LDP/STP fall back to 128-bit chunks. + + ??? Although it would be possible to use LDP/STP Qn in streaming mode + (so using TARGET_BASE_SIMD instead of TARGET_SIMD), it isn't clear + whether that would improve performance. */ if (size <= 24 || !TARGET_SIMD || (aarch64_tune_params.extra_tuning_flags diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index aa908ced7cd..808e2044009 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -61,8 +61,15 @@ #define WORDS_BIG_ENDIAN (BYTES_BIG_ENDIAN) /* AdvSIMD is supported in the default configuration, unless disabled by - -mgeneral-regs-only or by the +nosimd extension. */ -#define TARGET_SIMD (AARCH64_ISA_SIMD) + -mgeneral-regs-only or by the +nosimd extension. The set of available + instructions is then subdivided into: + + - the "base" set, available both in SME streaming mode and in + non-streaming mode + + - the full set, available only in non-streaming mode. */ +#define TARGET_BASE_SIMD (AARCH64_ISA_SIMD) +#define TARGET_SIMD (AARCH64_ISA_SIMD && AARCH64_ISA_SM_OFF) #define TARGET_FLOAT (AARCH64_ISA_FP) #define UNITS_PER_WORD 8 @@ -199,6 +206,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* Macros to test ISA flags. 
*/ +#define AARCH64_ISA_SM_OFF (aarch64_isa_flags & AARCH64_FL_SM_OFF) #define AARCH64_ISA_MODE (aarch64_isa_flags & AARCH64_FL_ISA_MODES) #define AARCH64_ISA_CRC (aarch64_isa_flags & AARCH64_FL_CRC) #define AARCH64_ISA_CRYPTO (aarch64_isa_flags & AARCH64_FL_CRYPTO) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index e6b19b962b1..ddfd17bd2dd 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -366,7 +366,8 @@ (define_constants ;; As a convenience, "fp_q" means "fp" + the ability to move between ;; Q registers and is equivalent to "simd". -(define_enum "arches" [ any rcpc8_4 fp fp_q simd nosimd sve fp16]) +(define_enum "arches" [any rcpc8_4 fp fp_q base_simd nobase_simd + simd nosimd sve fp16]) (define_enum_attr "arch" "arches" (const_string "any")) @@ -394,6 +395,12 @@ (define_attr "arch_enabled" "no,yes" (and (eq_attr "arch" "fp") (match_test "TARGET_FLOAT")) + (and (eq_attr "arch" "base_simd") + (match_test "TARGET_BASE_SIMD")) + + (and (eq_attr "arch" "nobase_simd") + (match_test "!TARGET_BASE_SIMD")) + (and (eq_attr "arch" "fp_q, simd") (match_test "TARGET_SIMD")) @@ -1224,23 +1231,23 @@ (define_insn "*mov_aarch64" "(register_operand (operands[0], mode) || aarch64_reg_or_zero (operands[1], mode))" {@ [cons: =0, 1; attrs: type, arch] - [w, Z ; neon_move , simd ] movi\t%0., #0 - [r, r ; mov_reg , * ] mov\t%w0, %w1 - [r, M ; mov_imm , * ] mov\t%w0, %1 - [w, D; neon_move , simd ] << aarch64_output_scalar_simd_mov_immediate (operands[1], mode); + [w, Z ; neon_move , simd ] movi\t%0., #0 + [r, r ; mov_reg , * ] mov\t%w0, %w1 + [r, M ; mov_imm , * ] mov\t%w0, %1 + [w, D; neon_move , simd ] << aarch64_output_scalar_simd_mov_immediate (operands[1], mode); /* The "mov_imm" type for CNT is just a placeholder. 
*/ - [r, Usv ; mov_imm , sve ] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]); - [r, Usr ; mov_imm , sve ] << aarch64_output_sve_rdvl (operands[1]); - [r, m ; load_4 , * ] ldr\t%w0, %1 - [w, m ; load_4 , * ] ldr\t%0, %1 - [m, r Z ; store_4 , * ] str\\t%w1, %0 - [m, w ; store_4 , * ] str\t%1, %0 - [r, w ; neon_to_gp , simd ] umov\t%w0, %1.[0] - [r, w ; neon_to_gp , nosimd] fmov\t%w0, %s1 - [w, r Z ; neon_from_gp, simd ] dup\t%0., %w1 - [w, r Z ; neon_from_gp, nosimd] fmov\t%s0, %w1 - [w, w ; neon_dup , simd ] dup\t%0, %1.[0] - [w, w ; neon_dup , nosimd] fmov\t%s0, %s1 + [r, Usv ; mov_imm , sve ] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]); + [r, Usr ; mov_imm , sve ] << aarch64_output_sve_rdvl (operands[1]); + [r, m ; load_4 , * ] ldr\t%w0, %1 + [w, m ; load_4 , * ] ldr\t%0, %1 + [m, r Z ; store_4 , * ] str\\t%w1, %0 + [m, w ; store_4 , * ] str\t%1, %0 + [r, w ; neon_to_gp , base_simd ] umov\t%w0, %1.[0] + [r, w ; neon_to_gp , nobase_simd] fmov\t%w0, %s1 + [w, r Z ; neon_from_gp, simd ] dup\t%0., %w1 + [w, r Z ; neon_from_gp, nosimd ] fmov\t%s0, %w1 + [w, w ; neon_dup , simd ] dup\t%0, %1.[0] + [w, w ; neon_dup , nosimd ] fmov\t%s0, %s1 } ) @@ -1405,9 +1412,9 @@ (define_expand "movti" (define_insn "*movti_aarch64" [(set (match_operand:TI 0 - "nonimmediate_operand" "= r,w,w,w, r,w,r,m,m,w,m") + "nonimmediate_operand" "= r,w,w,w, r,w,w,r,m,m,w,m") (match_operand:TI 1 - "aarch64_movti_operand" " rUti,Z,Z,r, w,w,m,r,Z,m,w"))] + "aarch64_movti_operand" " rUti,Z,Z,r, w,w,w,m,r,Z,m,w"))] "(register_operand (operands[0], TImode) || aarch64_reg_or_zero (operands[1], TImode))" "@ @@ -1417,16 +1424,17 @@ (define_insn "*movti_aarch64" # # mov\\t%0.16b, %1.16b + mov\\t%Z0.d, %Z1.d ldp\\t%0, %H0, %1 stp\\t%1, %H1, %0 stp\\txzr, xzr, %0 ldr\\t%q0, %1 str\\t%q1, %0" - [(set_attr "type" "multiple,neon_move,f_mcr,f_mcr,f_mrc,neon_logic_q, \ + [(set_attr "type" "multiple,neon_move,f_mcr,f_mcr,f_mrc,neon_logic_q,*,\ load_16,store_16,store_16,\ 
load_16,store_16") - (set_attr "length" "8,4,4,8,8,4,4,4,4,4,4") - (set_attr "arch" "*,simd,*,*,*,simd,*,*,*,fp,fp")] + (set_attr "length" "8,4,4,8,8,4,4,4,4,4,4,4") + (set_attr "arch" "*,simd,*,*,*,simd,sve,*,*,*,fp,fp")] ) ;; Split a TImode register-register or register-immediate move into @@ -1553,13 +1561,14 @@ (define_split (define_insn "*mov_aarch64" [(set (match_operand:TFD 0 - "nonimmediate_operand" "=w,?r ,w ,?r,w,?w,w,m,?r,m ,m") + "nonimmediate_operand" "=w,w,?r ,w ,?r,w,?w,w,m,?r,m ,m") (match_operand:TFD 1 - "general_operand" " w,?rY,?r,w ,Y,Y ,m,w,m ,?r,Y"))] + "general_operand" " w,w,?rY,?r,w ,Y,Y ,m,w,m ,?r,Y"))] "TARGET_FLOAT && (register_operand (operands[0], mode) || aarch64_reg_or_fp_zero (operands[1], mode))" "@ mov\\t%0.16b, %1.16b + mov\\t%Z0.d, %Z1.d # # # @@ -1570,10 +1579,10 @@ (define_insn "*mov_aarch64" ldp\\t%0, %H0, %1 stp\\t%1, %H1, %0 stp\\txzr, xzr, %0" - [(set_attr "type" "logic_reg,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\ + [(set_attr "type" "logic_reg,*,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\ f_loadd,f_stored,load_16,store_16,store_16") - (set_attr "length" "4,8,8,8,4,4,4,4,4,4,4") - (set_attr "arch" "simd,*,*,*,simd,*,*,*,*,*,*")] + (set_attr "length" "4,4,8,8,8,4,4,4,4,4,4,4") + (set_attr "arch" "simd,sve,*,*,*,simd,*,*,*,*,*,*")] ) (define_split @@ -1767,7 +1776,7 @@ (define_insn "load_pair_dw_" (match_operand:TX 1 "aarch64_mem_pair_operand" "Ump")) (set (match_operand:TX2 2 "register_operand" "=w") (match_operand:TX2 3 "memory_operand" "m"))] - "TARGET_SIMD + "TARGET_BASE_SIMD && rtx_equal_p (XEXP (operands[3], 0), plus_constant (Pmode, XEXP (operands[1], 0), @@ -1815,11 +1824,11 @@ (define_insn "store_pair_dw_" (match_operand:TX 1 "register_operand" "w")) (set (match_operand:TX2 2 "memory_operand" "=m") (match_operand:TX2 3 "register_operand" "w"))] - "TARGET_SIMD && - rtx_equal_p (XEXP (operands[2], 0), - plus_constant (Pmode, - XEXP (operands[0], 0), - GET_MODE_SIZE (TFmode)))" + "TARGET_BASE_SIMD + && rtx_equal_p (XEXP 
(operands[2], 0), + plus_constant (Pmode, + XEXP (operands[0], 0), + GET_MODE_SIZE (TFmode)))" "stp\\t%q1, %q3, %z0" [(set_attr "type" "neon_stp_q") (set_attr "fp" "yes")] @@ -1867,7 +1876,7 @@ (define_insn "loadwb_pair_" (set (match_operand:TX 3 "register_operand" "=w") (mem:TX (plus:P (match_dup 1) (match_operand:P 5 "const_int_operand" "n"))))])] - "TARGET_SIMD && INTVAL (operands[5]) == GET_MODE_SIZE (mode)" + "TARGET_BASE_SIMD && INTVAL (operands[5]) == GET_MODE_SIZE (mode)" "ldp\\t%q2, %q3, [%1], %4" [(set_attr "type" "neon_ldp_q")] ) @@ -1917,7 +1926,7 @@ (define_insn "storewb_pair_" (set (mem:TX (plus:P (match_dup 0) (match_operand:P 5 "const_int_operand" "n"))) (match_operand:TX 3 "register_operand" "w"))])] - "TARGET_SIMD + "TARGET_BASE_SIMD && INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (mode)" "stp\\t%q2, %q3, [%0, %4]!" diff --git a/gcc/testsuite/gcc.target/aarch64/movdf_2.c b/gcc/testsuite/gcc.target/aarch64/movdf_2.c new file mode 100644 index 00000000000..0d459d31760 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movdf_2.c @@ -0,0 +1,51 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** fpr_to_fpr: +** fmov d0, d1 +** ret +*/ +double +fpr_to_fpr (double q0, double q1) [[arm::streaming_compatible]] +{ + return q1; +} + +/* +** gpr_to_fpr: +** fmov d0, x0 +** ret +*/ +double +gpr_to_fpr () [[arm::streaming_compatible]] +{ + register double x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +double +zero_to_fpr () [[arm::streaming_compatible]] +{ + return 0; +} + +/* +** fpr_to_gpr: +** fmov x0, d0 +** ret +*/ +void +fpr_to_gpr (double q0) [[arm::streaming_compatible]] +{ + register double x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movdi_3.c b/gcc/testsuite/gcc.target/aarch64/movdi_3.c new file mode 100644 index 
00000000000..31b2cbbaeb0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movdi_3.c @@ -0,0 +1,59 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +/* +** fpr_to_fpr: +** fmov d0, d1 +** ret +*/ +void +fpr_to_fpr (void) [[arm::streaming_compatible]] +{ + register uint64_t q0 asm ("q0"); + register uint64_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: +** fmov d0, x0 +** ret +*/ +void +gpr_to_fpr (uint64_t x0) [[arm::streaming_compatible]] +{ + register uint64_t q0 asm ("q0"); + q0 = x0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +void +zero_to_fpr () [[arm::streaming_compatible]] +{ + register uint64_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: +** fmov x0, d0 +** ret +*/ +uint64_t +fpr_to_gpr () [[arm::streaming_compatible]] +{ + register uint64_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movhf_2.c b/gcc/testsuite/gcc.target/aarch64/movhf_2.c new file mode 100644 index 00000000000..3292b0de8d1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movhf_2.c @@ -0,0 +1,53 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nothing+simd" + +/* +** fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +_Float16 +fpr_to_fpr (_Float16 q0, _Float16 q1) [[arm::streaming_compatible]] +{ + return q1; +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +_Float16 +gpr_to_fpr () [[arm::streaming_compatible]] +{ + register _Float16 w0 asm ("w0"); + asm volatile ("" : "=r" (w0)); + return w0; +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +_Float16 +zero_to_fpr () [[arm::streaming_compatible]] +{ + return 0; +} + +/* +** fpr_to_gpr: +** fmov w0, s0 +** ret +*/ +void 
+fpr_to_gpr (_Float16 q0) [[arm::streaming_compatible]] +{ + register _Float16 w0 asm ("w0"); + w0 = q0; + asm volatile ("" :: "r" (w0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movhi_2.c b/gcc/testsuite/gcc.target/aarch64/movhi_2.c new file mode 100644 index 00000000000..dbbf3486f58 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movhi_2.c @@ -0,0 +1,61 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nothing+simd" + +#include + +/* +** fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +void +fpr_to_fpr (void) [[arm::streaming_compatible]] +{ + register uint16_t q0 asm ("q0"); + register uint16_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +void +gpr_to_fpr (uint16_t w0) [[arm::streaming_compatible]] +{ + register uint16_t q0 asm ("q0"); + q0 = w0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +void +zero_to_fpr () [[arm::streaming_compatible]] +{ + register uint16_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: +** umov w0, v0.h\[0\] +** ret +*/ +uint16_t +fpr_to_gpr () [[arm::streaming_compatible]] +{ + register uint16_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movqi_2.c b/gcc/testsuite/gcc.target/aarch64/movqi_2.c new file mode 100644 index 00000000000..aec087e4e2c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movqi_2.c @@ -0,0 +1,59 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +/* +** fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +void +fpr_to_fpr (void) [[arm::streaming_compatible]] +{ + register uint8_t q0 asm ("q0"); + register uint8_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: 
"w" (q0)); +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +void +gpr_to_fpr (uint8_t w0) [[arm::streaming_compatible]] +{ + register uint8_t q0 asm ("q0"); + q0 = w0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +void +zero_to_fpr () [[arm::streaming_compatible]] +{ + register uint8_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: +** umov w0, v0.b\[0\] +** ret +*/ +uint8_t +fpr_to_gpr () [[arm::streaming_compatible]] +{ + register uint8_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movsf_2.c b/gcc/testsuite/gcc.target/aarch64/movsf_2.c new file mode 100644 index 00000000000..7fed4b22f7a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movsf_2.c @@ -0,0 +1,51 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +float +fpr_to_fpr (float q0, float q1) [[arm::streaming_compatible]] +{ + return q1; +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +float +gpr_to_fpr () [[arm::streaming_compatible]] +{ + register float w0 asm ("w0"); + asm volatile ("" : "=r" (w0)); + return w0; +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +float +zero_to_fpr () [[arm::streaming_compatible]] +{ + return 0; +} + +/* +** fpr_to_gpr: +** fmov w0, s0 +** ret +*/ +void +fpr_to_gpr (float q0) [[arm::streaming_compatible]] +{ + register float w0 asm ("w0"); + w0 = q0; + asm volatile ("" :: "r" (w0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movsi_2.c b/gcc/testsuite/gcc.target/aarch64/movsi_2.c new file mode 100644 index 00000000000..c14d2468af3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movsi_2.c @@ -0,0 +1,59 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +/* +** fpr_to_fpr: +** fmov s0, s1 +** 
ret +*/ +void +fpr_to_fpr (void) [[arm::streaming_compatible]] +{ + register uint32_t q0 asm ("q0"); + register uint32_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +void +gpr_to_fpr (uint32_t w0) [[arm::streaming_compatible]] +{ + register uint32_t q0 asm ("q0"); + q0 = w0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +void +zero_to_fpr () [[arm::streaming_compatible]] +{ + register uint32_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: +** fmov w0, s0 +** ret +*/ +uint32_t +fpr_to_gpr () [[arm::streaming_compatible]] +{ + register uint32_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movtf_3.c b/gcc/testsuite/gcc.target/aarch64/movtf_3.c new file mode 100644 index 00000000000..dd164a41855 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movtf_3.c @@ -0,0 +1,81 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target large_long_double } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nosve" + +/* +** fpr_to_fpr: +** sub sp, sp, #16 +** str q1, \[sp\] +** ldr q0, \[sp\] +** add sp, sp, #?16 +** ret +*/ +long double +fpr_to_fpr (long double q0, long double q1) [[arm::streaming_compatible]] +{ + return q1; +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +long double +gpr_to_fpr () [[arm::streaming_compatible]] +{ + register long double x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +long double +zero_to_fpr () [[arm::streaming_compatible]] +{ + return 0; +} + +/* +** fpr_to_gpr: { target 
aarch64_little_endian } +** ( +** fmov x0, d0 +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** fmov x0, d0 +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** fmov x1, d0 +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** fmov x1, d0 +** ) +** ret +*/ +void +fpr_to_gpr (long double q0) [[arm::streaming_compatible]] +{ + register long double x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movtf_4.c b/gcc/testsuite/gcc.target/aarch64/movtf_4.c new file mode 100644 index 00000000000..faf9703e2b6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movtf_4.c @@ -0,0 +1,78 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target large_long_double } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+sve" + +/* +** fpr_to_fpr: +** mov z0.d, z1.d +** ret +*/ +long double +fpr_to_fpr (long double q0, long double q1) [[arm::streaming_compatible]] +{ + return q1; +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +long double +gpr_to_fpr () [[arm::streaming_compatible]] +{ + register long double x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +long double +zero_to_fpr () [[arm::streaming_compatible]] +{ + return 0; +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** fmov x0, d0 +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** fmov x0, d0 +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** fmov x1, d0 +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** fmov x1, d0 +** ) +** ret +*/ +void +fpr_to_gpr (long double q0) [[arm::streaming_compatible]] +{ + register long double x0 asm ("x0"); + x0 = q0; + asm 
volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movti_3.c b/gcc/testsuite/gcc.target/aarch64/movti_3.c new file mode 100644 index 00000000000..243109181d6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movti_3.c @@ -0,0 +1,86 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nosve" + +/* +** fpr_to_fpr: +** sub sp, sp, #16 +** str q1, \[sp\] +** ldr q0, \[sp\] +** add sp, sp, #?16 +** ret +*/ +void +fpr_to_fpr (void) [[arm::streaming_compatible]] +{ + register __int128_t q0 asm ("q0"); + register __int128_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +void +gpr_to_fpr (__int128_t x0) [[arm::streaming_compatible]] +{ + register __int128_t q0 asm ("q0"); + q0 = x0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +void +zero_to_fpr () [[arm::streaming_compatible]] +{ + register __int128_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** fmov x0, d0 +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** fmov x0, d0 +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** fmov x1, d0 +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** fmov x1, d0 +** ) +** ret +*/ +__int128_t +fpr_to_gpr () [[arm::streaming_compatible]] +{ + register __int128_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movti_4.c b/gcc/testsuite/gcc.target/aarch64/movti_4.c new file mode 100644 index 00000000000..a70feccb0e3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movti_4.c @@ -0,0 
+1,83 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+sve" + +/* +** fpr_to_fpr: +** mov z0\.d, z1\.d +** ret +*/ +void +fpr_to_fpr (void) [[arm::streaming_compatible]] +{ + register __int128_t q0 asm ("q0"); + register __int128_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +void +gpr_to_fpr (__int128_t x0) [[arm::streaming_compatible]] +{ + register __int128_t q0 asm ("q0"); + q0 = x0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +void +zero_to_fpr () [[arm::streaming_compatible]] +{ + register __int128_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** fmov x0, d0 +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** fmov x0, d0 +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** fmov x1, d0 +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** fmov x1, d0 +** ) +** ret +*/ +__int128_t +fpr_to_gpr () [[arm::streaming_compatible]] +{ + register __int128_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movv16qi_4.c b/gcc/testsuite/gcc.target/aarch64/movv16qi_4.c new file mode 100644 index 00000000000..7bec888b71d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movv16qi_4.c @@ -0,0 +1,82 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nosve" + +typedef unsigned char v16qi __attribute__((vector_size(16))); + +/* +** fpr_to_fpr: +** sub sp, sp, #16 +** str q1, 
\[sp\] +** ldr q0, \[sp\] +** add sp, sp, #?16 +** ret +*/ +v16qi +fpr_to_fpr (v16qi q0, v16qi q1) [[arm::streaming_compatible]] +{ + return q1; +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +v16qi +gpr_to_fpr () [[arm::streaming_compatible]] +{ + register v16qi x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +v16qi +zero_to_fpr () [[arm::streaming_compatible]] +{ + return (v16qi) {}; +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** umov x0, v0.d\[0\] +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** umov x0, v0.d\[0\] +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** umov x1, v0.d\[0\] +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** umov x1, v0.d\[0\] +** ) +** ret +*/ +void +fpr_to_gpr (v16qi q0) [[arm::streaming_compatible]] +{ + register v16qi x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movv16qi_5.c b/gcc/testsuite/gcc.target/aarch64/movv16qi_5.c new file mode 100644 index 00000000000..2d36342b3f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movv16qi_5.c @@ -0,0 +1,79 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+sve" + +typedef unsigned char v16qi __attribute__((vector_size(16))); + +/* +** fpr_to_fpr: +** mov z0.d, z1.d +** ret +*/ +v16qi +fpr_to_fpr (v16qi q0, v16qi q1) [[arm::streaming_compatible]] +{ + return q1; +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +v16qi +gpr_to_fpr () [[arm::streaming_compatible]] +{ + 
register v16qi x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +v16qi +zero_to_fpr () [[arm::streaming_compatible]] +{ + return (v16qi) {}; +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** umov x0, v0.d\[0\] +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** umov x0, v0.d\[0\] +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** umov x1, v0.d\[0\] +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** umov x1, v0.d\[0\] +** ) +** ret +*/ +void +fpr_to_gpr (v16qi q0) [[arm::streaming_compatible]] +{ + register v16qi x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movv8qi_4.c b/gcc/testsuite/gcc.target/aarch64/movv8qi_4.c new file mode 100644 index 00000000000..12ae25a3a4a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movv8qi_4.c @@ -0,0 +1,55 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nosve" + +typedef unsigned char v8qi __attribute__((vector_size(8))); + +/* +** fpr_to_fpr: +** fmov d0, d1 +** ret +*/ +v8qi +fpr_to_fpr (v8qi q0, v8qi q1) [[arm::streaming_compatible]] +{ + return q1; +} + +/* +** gpr_to_fpr: +** fmov d0, x0 +** ret +*/ +v8qi +gpr_to_fpr () [[arm::streaming_compatible]] +{ + register v8qi x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +v8qi +zero_to_fpr () [[arm::streaming_compatible]] +{ + return (v8qi) {}; +} + +/* +** fpr_to_gpr: +** umov x0, v0\.d\[0\] +** ret +*/ +void +fpr_to_gpr (v8qi q0) [[arm::streaming_compatible]] +{ + register v8qi x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_1.c b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_1.c new file mode 100644 index 00000000000..5b5346cf435 --- 
/dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_1.c @@ -0,0 +1,13 @@ +// { dg-options "" } + +#include + +#pragma GCC target "+nosme" + +// { dg-error {inlining failed.*'vhaddq_s32'} "" { target *-*-* } 0 } + +int32x4_t +foo (int32x4_t x, int32x4_t y) [[arm::streaming_compatible]] +{ + return vhaddq_s32 (x, y); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_2.c b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_2.c new file mode 100644 index 00000000000..2092c4471f0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_2.c @@ -0,0 +1,11 @@ +// { dg-options "" } + +#include + +// { dg-error {inlining failed.*'vhaddq_s32'} "" { target *-*-* } 0 } + +int32x4_t +foo (int32x4_t x, int32x4_t y) [[arm::streaming_compatible]] +{ + return vhaddq_s32 (x, y); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_3.c b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_3.c new file mode 100644 index 00000000000..36794e5b0df --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_3.c @@ -0,0 +1,11 @@ +// { dg-options "" } + +#include + +// { dg-error {inlining failed.*'vhaddq_s32'} "" { target *-*-* } 0 } + +int32x4_t +foo (int32x4_t x, int32x4_t y) [[arm::streaming]] +{ + return vhaddq_s32 (x, y); +} From patchwork Tue Dec 5 10:13:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872045 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 
(256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxLT16llz1ySd for ; Tue, 5 Dec 2023 21:17:53 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EF2473870919 for ; Tue, 5 Dec 2023 10:17:26 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 88A363857C4E for ; Tue, 5 Dec 2023 10:13:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 88A363857C4E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 88A363857C4E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771231; cv=none; b=j+a04o+wYa9Xeg0ZCsgqMqOvb9MJZgwjB8gIYLOR39FcNxAynZJgYirArva0DwWPPTHspGKNQNffJioxEjBAMkXlDIK0nfnf8Uzz6uJAhoqRD78wmec5bFRZDNriGydVhUZZF4EUq2A5Iao2i3xjMf9fiSAWFtC1dRFpfoXmrR4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771231; c=relaxed/simple; bh=FraR2fudXHMiTlUaow3eytrwV6L+hOT2VZ7QAoK9sbY=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=m29DsFXhIC8YvNkGJmskDsPpY8cmrRAPAHG6LNZs9j6Ot2B0l72O4GfN5sD/8kWlCaJWccSItPz5G4395lpnwoLdv9640Z+8gcRdkGFJIM8bGnyoNfF/juj92D3NBNJ8X+J97xp+02UBBIJ6zW1Y4iEe1N+kEAxM1MUisazPP0k= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E4C4C1476; Tue, 5 Dec 2023 02:14:29 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) 
with ESMTPSA id BB2123F5A1; Tue, 5 Dec 2023 02:13:42 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 14/25] aarch64: Mark relevant SVE instructions as non-streaming Date: Tue, 5 Dec 2023 10:13:12 +0000 Message-Id: <20231205101323.1914247-15-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Following on from the previous Advanced SIMD patch, this one divides SVE instructions into non-streaming and streaming-compatible groups. gcc/ * config/aarch64/aarch64.h (TARGET_NON_STREAMING): New macro. (TARGET_SVE2_AES, TARGET_SVE2_BITPERM): Use it. (TARGET_SVE2_SHA3, TARGET_SVE2_SM4): Likewise. * config/aarch64/aarch64-sve-builtins-base.def: Separate out the functions that require PSTATE.SM to be 0 and guard them with AARCH64_FL_SM_OFF. * config/aarch64/aarch64-sve-builtins-sve2.def: Likewise. * config/aarch64/aarch64-sve-builtins.cc (check_required_extensions): Enforce AARCH64_FL_SM_OFF requirements. * config/aarch64/aarch64-sve.md (aarch64_wrffr): Require TARGET_NON_STREAMING (aarch64_rdffr, aarch64_rdffr_z, *aarch64_rdffr_z_ptest): Likewise. (*aarch64_rdffr_ptest, *aarch64_rdffr_z_cc, *aarch64_rdffr_cc) (@aarch64_ldf1): Likewise.
	(@aarch64_ldf1_)
	(gather_load): Likewise
	(mask_gather_load): Likewise.
	(mask_gather_load): Likewise.
	(*mask_gather_load_xtw_unpacked): Likewise.
	(*mask_gather_load_sxtw): Likewise.
	(*mask_gather_load_uxtw): Likewise.
	(@aarch64_gather_load_)
	(@aarch64_gather_load_ ): Likewise.
	(*aarch64_gather_load_ _xtw_unpacked)
	(*aarch64_gather_load_ _sxtw): Likewise.
	(*aarch64_gather_load_ _uxtw): Likewise.
	(@aarch64_ldff1_gather, @aarch64_ldff1_gather): Likewise.
	(*aarch64_ldff1_gather_sxtw): Likewise.
	(*aarch64_ldff1_gather_uxtw): Likewise.
	(@aarch64_ldff1_gather_ ): Likewise.
	(@aarch64_ldff1_gather_ ): Likewise.
	(*aarch64_ldff1_gather_ _sxtw): Likewise.
	(*aarch64_ldff1_gather_ _uxtw): Likewise.
	(@aarch64_sve_gather_prefetch)
	(@aarch64_sve_gather_prefetch)
	(*aarch64_sve_gather_prefetch_sxtw)
	(*aarch64_sve_gather_prefetch_uxtw)
	(scatter_store): Likewise.
	(mask_scatter_store): Likewise.
	(*mask_scatter_store_xtw_unpacked)
	(*mask_scatter_store_sxtw): Likewise.
	(*mask_scatter_store_uxtw): Likewise.
	(@aarch64_scatter_store_trunc)
	(@aarch64_scatter_store_trunc)
	(*aarch64_scatter_store_trunc_sxtw)
	(*aarch64_scatter_store_trunc_uxtw)
	(@aarch64_sve_ld1ro, @aarch64_adr): Likewise.
	(*aarch64_adr_sxtw, *aarch64_adr_uxtw_unspec): Likewise.
	(*aarch64_adr_uxtw_and, @aarch64_adr_shift): Likewise.
	(*aarch64_adr_shift, *aarch64_adr_shift_sxtw): Likewise.
	(*aarch64_adr_shift_uxtw, @aarch64_sve_add_): Likewise.
	(@aarch64_sve_, fold_left_plus_): Likewise.
	(mask_fold_left_plus_, @aarch64_sve_compact): Likewise.
	* config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt)
	(@aarch64_gather_ldnt_ ): Likewise.
	(@aarch64_sve2_histcnt, @aarch64_sve2_histseg): Likewise.
	(@aarch64_pred_): Likewise.
	(*aarch64_pred__cc): Likewise.
	(*aarch64_pred__ptest): Likewise.
	* config/aarch64/iterators.md (SVE_FP_UNARY_INT): Make FEXPA
	depend on TARGET_NON_STREAMING.
	(SVE_BFLOAT_TERNARY_LONG): Likewise BFMMLA.

gcc/testsuite/
	* g++.target/aarch64/sve/aarch64-ssve.exp: New harness.
* g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Add -DSTREAMING_COMPATIBLE to the list of options. * g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Likewise. * gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise. * gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Likewise. Fix pasto in variable name. * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Mark functions as streaming-compatible if STREAMING_COMPATIBLE is defined. * gcc.target/aarch64/sve/acle/asm/adda_f16.c: Disable for streaming-compatible code. * gcc.target/aarch64/sve/acle/asm/adda_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adda_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adrb.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adrd.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adrh.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adrw.c: Likewise. * gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/expa_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/expa_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/expa_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_f32.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/ldff1_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mmla_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mmla_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mmla_s32.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/mmla_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfb_gather.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfd_gather.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfh_gather.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfw_gather.c: Likewise. * gcc.target/aarch64/sve/acle/asm/rdffr_1.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tmad_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tmad_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tmad_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tsmul_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tsmul_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tsmul_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tssel_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tssel_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tssel_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/usmmla_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/aesd_u8.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/aese_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bdep_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bdep_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bdep_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bdep_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bext_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bext_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bext_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bext_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histseg_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histseg_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/match_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/match_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/match_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/match_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/rax1_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/rax1_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c: Likewise. --- .../aarch64/aarch64-sve-builtins-base.def | 158 +++++---- .../aarch64/aarch64-sve-builtins-sve2.def | 63 ++-- gcc/config/aarch64/aarch64-sve-builtins.cc | 7 + gcc/config/aarch64/aarch64-sve.md | 124 +++---- gcc/config/aarch64/aarch64-sve2.md | 14 +- gcc/config/aarch64/aarch64.h | 11 +- gcc/config/aarch64/iterators.md | 4 +- .../g++.target/aarch64/sve/aarch64-ssve.exp | 308 ++++++++++++++++++ .../aarch64/sve/acle/aarch64-sve-acle-asm.exp | 1 + .../sve2/acle/aarch64-sve2-acle-asm.exp | 1 + .../aarch64/sve/acle/aarch64-sve-acle-asm.exp | 1 + .../aarch64/sve/acle/asm/adda_f16.c | 1 + .../aarch64/sve/acle/asm/adda_f32.c | 1 + .../aarch64/sve/acle/asm/adda_f64.c | 1 + .../gcc.target/aarch64/sve/acle/asm/adrb.c | 1 + .../gcc.target/aarch64/sve/acle/asm/adrd.c | 1 + .../gcc.target/aarch64/sve/acle/asm/adrh.c | 1 + .../gcc.target/aarch64/sve/acle/asm/adrw.c | 1 + .../aarch64/sve/acle/asm/bfmmla_f32.c | 1 + .../aarch64/sve/acle/asm/compact_f32.c | 1 + .../aarch64/sve/acle/asm/compact_f64.c | 1 + .../aarch64/sve/acle/asm/compact_s32.c | 1 + 
.../aarch64/sve/acle/asm/compact_s64.c | 1 + .../aarch64/sve/acle/asm/compact_u32.c | 1 + .../aarch64/sve/acle/asm/compact_u64.c | 1 + .../aarch64/sve/acle/asm/expa_f16.c | 1 + .../aarch64/sve/acle/asm/expa_f32.c | 1 + .../aarch64/sve/acle/asm/expa_f64.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_f32.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_f64.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1ro_bf16.c | 1 + .../aarch64/sve/acle/asm/ld1ro_f16.c | 1 + .../aarch64/sve/acle/asm/ld1ro_f32.c | 1 + .../aarch64/sve/acle/asm/ld1ro_f64.c | 1 + .../aarch64/sve/acle/asm/ld1ro_s16.c | 1 + .../aarch64/sve/acle/asm/ld1ro_s32.c | 1 + .../aarch64/sve/acle/asm/ld1ro_s64.c | 1 + .../aarch64/sve/acle/asm/ld1ro_s8.c | 1 + .../aarch64/sve/acle/asm/ld1ro_u16.c | 1 + .../aarch64/sve/acle/asm/ld1ro_u32.c | 1 + .../aarch64/sve/acle/asm/ld1ro_u64.c | 1 + .../aarch64/sve/acle/asm/ld1ro_u8.c | 1 + .../aarch64/sve/acle/asm/ld1sb_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1sb_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1sb_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1sb_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1sh_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1sh_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1sh_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1sh_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1sw_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1sw_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1ub_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1ub_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1ub_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1ub_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1uh_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1uh_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1uh_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1uh_gather_u64.c | 1 + 
.../aarch64/sve/acle/asm/ld1uw_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1uw_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1_bf16.c | 1 + .../aarch64/sve/acle/asm/ldff1_f16.c | 1 + .../aarch64/sve/acle/asm/ldff1_f32.c | 1 + .../aarch64/sve/acle/asm/ldff1_f64.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_f32.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_f64.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1_s16.c | 1 + .../aarch64/sve/acle/asm/ldff1_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1_s8.c | 1 + .../aarch64/sve/acle/asm/ldff1_u16.c | 1 + .../aarch64/sve/acle/asm/ldff1_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1_u8.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_s16.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_u16.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sw_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sw_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sw_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sw_u64.c | 1 + 
.../aarch64/sve/acle/asm/ldff1ub_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_s16.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_u16.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1uw_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1uw_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1uw_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1uw_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1_bf16.c | 1 + .../aarch64/sve/acle/asm/ldnf1_f16.c | 1 + .../aarch64/sve/acle/asm/ldnf1_f32.c | 1 + .../aarch64/sve/acle/asm/ldnf1_f64.c | 1 + .../aarch64/sve/acle/asm/ldnf1_s16.c | 1 + .../aarch64/sve/acle/asm/ldnf1_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1_s8.c | 1 + .../aarch64/sve/acle/asm/ldnf1_u16.c | 1 + .../aarch64/sve/acle/asm/ldnf1_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1_u8.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_s16.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_u16.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sh_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1sh_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sh_u32.c | 1 + 
.../aarch64/sve/acle/asm/ldnf1sh_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sw_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sw_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_s16.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_u16.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1uh_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1uh_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1uh_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1uh_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1uw_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1uw_u64.c | 1 + .../aarch64/sve/acle/asm/mmla_f32.c | 1 + .../aarch64/sve/acle/asm/mmla_f64.c | 1 + .../aarch64/sve/acle/asm/mmla_s32.c | 1 + .../aarch64/sve/acle/asm/mmla_u32.c | 1 + .../aarch64/sve/acle/asm/prfb_gather.c | 1 + .../aarch64/sve/acle/asm/prfd_gather.c | 1 + .../aarch64/sve/acle/asm/prfh_gather.c | 1 + .../aarch64/sve/acle/asm/prfw_gather.c | 1 + .../gcc.target/aarch64/sve/acle/asm/rdffr_1.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_f32.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_f64.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_s32.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_s64.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_u32.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_u64.c | 1 + .../aarch64/sve/acle/asm/st1b_scatter_s32.c | 1 + .../aarch64/sve/acle/asm/st1b_scatter_s64.c | 1 + .../aarch64/sve/acle/asm/st1b_scatter_u32.c | 1 + .../aarch64/sve/acle/asm/st1b_scatter_u64.c | 1 + .../aarch64/sve/acle/asm/st1h_scatter_s32.c | 1 + .../aarch64/sve/acle/asm/st1h_scatter_s64.c | 1 + .../aarch64/sve/acle/asm/st1h_scatter_u32.c | 1 + .../aarch64/sve/acle/asm/st1h_scatter_u64.c | 1 + .../aarch64/sve/acle/asm/st1w_scatter_s64.c | 1 + .../aarch64/sve/acle/asm/st1w_scatter_u64.c | 1 + .../aarch64/sve/acle/asm/test_sve_acle.h | 11 +- .../aarch64/sve/acle/asm/tmad_f16.c | 1 + .../aarch64/sve/acle/asm/tmad_f32.c | 1 
+ .../aarch64/sve/acle/asm/tmad_f64.c | 1 + .../aarch64/sve/acle/asm/tsmul_f16.c | 1 + .../aarch64/sve/acle/asm/tsmul_f32.c | 1 + .../aarch64/sve/acle/asm/tsmul_f64.c | 1 + .../aarch64/sve/acle/asm/tssel_f16.c | 1 + .../aarch64/sve/acle/asm/tssel_f32.c | 1 + .../aarch64/sve/acle/asm/tssel_f64.c | 1 + .../aarch64/sve/acle/asm/usmmla_s32.c | 1 + .../sve2/acle/aarch64-sve2-acle-asm.exp | 1 + .../aarch64/sve2/acle/asm/aesd_u8.c | 1 + .../aarch64/sve2/acle/asm/aese_u8.c | 1 + .../aarch64/sve2/acle/asm/aesimc_u8.c | 1 + .../aarch64/sve2/acle/asm/aesmc_u8.c | 1 + .../aarch64/sve2/acle/asm/bdep_u16.c | 1 + .../aarch64/sve2/acle/asm/bdep_u32.c | 1 + .../aarch64/sve2/acle/asm/bdep_u64.c | 1 + .../aarch64/sve2/acle/asm/bdep_u8.c | 1 + .../aarch64/sve2/acle/asm/bext_u16.c | 1 + .../aarch64/sve2/acle/asm/bext_u32.c | 1 + .../aarch64/sve2/acle/asm/bext_u64.c | 1 + .../aarch64/sve2/acle/asm/bext_u8.c | 1 + .../aarch64/sve2/acle/asm/bgrp_u16.c | 1 + .../aarch64/sve2/acle/asm/bgrp_u32.c | 1 + .../aarch64/sve2/acle/asm/bgrp_u64.c | 1 + .../aarch64/sve2/acle/asm/bgrp_u8.c | 1 + .../aarch64/sve2/acle/asm/histcnt_s32.c | 1 + .../aarch64/sve2/acle/asm/histcnt_s64.c | 1 + .../aarch64/sve2/acle/asm/histcnt_u32.c | 1 + .../aarch64/sve2/acle/asm/histcnt_u64.c | 1 + .../aarch64/sve2/acle/asm/histseg_s8.c | 1 + .../aarch64/sve2/acle/asm/histseg_u8.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_f32.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_f64.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_s32.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_s64.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_u32.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1sb_gather_s32.c | 1 + .../sve2/acle/asm/ldnt1sb_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1sb_gather_u32.c | 1 + .../sve2/acle/asm/ldnt1sb_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1sh_gather_s32.c | 1 + .../sve2/acle/asm/ldnt1sh_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1sh_gather_u32.c | 1 + 
.../sve2/acle/asm/ldnt1sh_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1sw_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1sw_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1ub_gather_s32.c | 1 + .../sve2/acle/asm/ldnt1ub_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1ub_gather_u32.c | 1 + .../sve2/acle/asm/ldnt1ub_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1uh_gather_s32.c | 1 + .../sve2/acle/asm/ldnt1uh_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1uh_gather_u32.c | 1 + .../sve2/acle/asm/ldnt1uh_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1uw_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1uw_gather_u64.c | 1 + .../aarch64/sve2/acle/asm/match_s16.c | 1 + .../aarch64/sve2/acle/asm/match_s8.c | 1 + .../aarch64/sve2/acle/asm/match_u16.c | 1 + .../aarch64/sve2/acle/asm/match_u8.c | 1 + .../aarch64/sve2/acle/asm/nmatch_s16.c | 1 + .../aarch64/sve2/acle/asm/nmatch_s8.c | 1 + .../aarch64/sve2/acle/asm/nmatch_u16.c | 1 + .../aarch64/sve2/acle/asm/nmatch_u8.c | 1 + .../aarch64/sve2/acle/asm/pmullb_pair_u64.c | 1 + .../aarch64/sve2/acle/asm/pmullt_pair_u64.c | 1 + .../aarch64/sve2/acle/asm/rax1_s64.c | 1 + .../aarch64/sve2/acle/asm/rax1_u64.c | 1 + .../aarch64/sve2/acle/asm/sm4e_u32.c | 1 + .../aarch64/sve2/acle/asm/sm4ekey_u32.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_f32.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_f64.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_s32.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_s64.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_u32.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_u64.c | 1 + .../sve2/acle/asm/stnt1b_scatter_s32.c | 1 + .../sve2/acle/asm/stnt1b_scatter_s64.c | 1 + .../sve2/acle/asm/stnt1b_scatter_u32.c | 1 + .../sve2/acle/asm/stnt1b_scatter_u64.c | 1 + .../sve2/acle/asm/stnt1h_scatter_s32.c | 1 + .../sve2/acle/asm/stnt1h_scatter_s64.c | 1 + .../sve2/acle/asm/stnt1h_scatter_u32.c | 1 + .../sve2/acle/asm/stnt1h_scatter_u64.c | 1 + .../sve2/acle/asm/stnt1w_scatter_s64.c | 1 + .../sve2/acle/asm/stnt1w_scatter_u64.c | 1 + 279 files changed, 805 
insertions(+), 165 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/sve/aarch64-ssve.exp

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def
index 4e31f67ac47..ac53f35220d 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def
@@ -25,12 +25,7 @@ DEF_SVE_FUNCTION (svacgt, compare_opt_n, all_float, implicit)
 DEF_SVE_FUNCTION (svacle, compare_opt_n, all_float, implicit)
 DEF_SVE_FUNCTION (svaclt, compare_opt_n, all_float, implicit)
 DEF_SVE_FUNCTION (svadd, binary_opt_n, all_arith, mxz)
-DEF_SVE_FUNCTION (svadda, fold_left, all_float, implicit)
 DEF_SVE_FUNCTION (svaddv, reduction_wide, all_arith, implicit)
-DEF_SVE_FUNCTION (svadrb, adr_offset, none, none)
-DEF_SVE_FUNCTION (svadrd, adr_index, none, none)
-DEF_SVE_FUNCTION (svadrh, adr_index, none, none)
-DEF_SVE_FUNCTION (svadrw, adr_index, none, none)
 DEF_SVE_FUNCTION (svand, binary_opt_n, all_integer, mxz)
 DEF_SVE_FUNCTION (svand, binary_opt_n, b, z)
 DEF_SVE_FUNCTION (svandv, reduction, all_integer, implicit)
@@ -75,7 +70,6 @@ DEF_SVE_FUNCTION (svcnth_pat, count_pat, none, none)
 DEF_SVE_FUNCTION (svcntp, count_pred, all_pred, implicit)
 DEF_SVE_FUNCTION (svcntw, count_inherent, none, none)
 DEF_SVE_FUNCTION (svcntw_pat, count_pat, none, none)
-DEF_SVE_FUNCTION (svcompact, unary, sd_data, implicit)
 DEF_SVE_FUNCTION (svcreate2, create, all_data, none)
 DEF_SVE_FUNCTION (svcreate3, create, all_data, none)
 DEF_SVE_FUNCTION (svcreate4, create, all_data, none)
@@ -93,7 +87,6 @@ DEF_SVE_FUNCTION (svdupq_lane, binary_uint64_n, all_data, none)
 DEF_SVE_FUNCTION (sveor, binary_opt_n, all_integer, mxz)
 DEF_SVE_FUNCTION (sveor, binary_opt_n, b, z)
 DEF_SVE_FUNCTION (sveorv, reduction, all_integer, implicit)
-DEF_SVE_FUNCTION (svexpa, unary_uint, all_float, none)
 DEF_SVE_FUNCTION (svext, ext, all_data, none)
 DEF_SVE_FUNCTION (svextb, unary, hsd_integer, mxz)
 DEF_SVE_FUNCTION (svexth, unary, sd_integer, mxz)
@@ -106,51 +99,13 @@ DEF_SVE_FUNCTION (svinsr, binary_n, all_data, none)
 DEF_SVE_FUNCTION (svlasta, reduction, all_data, implicit)
 DEF_SVE_FUNCTION (svlastb, reduction, all_data, implicit)
 DEF_SVE_FUNCTION (svld1, load, all_data, implicit)
-DEF_SVE_FUNCTION (svld1_gather, load_gather_sv, sd_data, implicit)
-DEF_SVE_FUNCTION (svld1_gather, load_gather_vs, sd_data, implicit)
 DEF_SVE_FUNCTION (svld1rq, load_replicate, all_data, implicit)
 DEF_SVE_FUNCTION (svld1sb, load_ext, hsd_integer, implicit)
-DEF_SVE_FUNCTION (svld1sb_gather, load_ext_gather_offset, sd_integer, implicit)
 DEF_SVE_FUNCTION (svld1sh, load_ext, sd_integer, implicit)
-DEF_SVE_FUNCTION (svld1sh_gather, load_ext_gather_offset, sd_integer, implicit)
-DEF_SVE_FUNCTION (svld1sh_gather, load_ext_gather_index, sd_integer, implicit)
 DEF_SVE_FUNCTION (svld1sw, load_ext, d_integer, implicit)
-DEF_SVE_FUNCTION (svld1sw_gather, load_ext_gather_offset, d_integer, implicit)
-DEF_SVE_FUNCTION (svld1sw_gather, load_ext_gather_index, d_integer, implicit)
 DEF_SVE_FUNCTION (svld1ub, load_ext, hsd_integer, implicit)
-DEF_SVE_FUNCTION (svld1ub_gather, load_ext_gather_offset, sd_integer, implicit)
 DEF_SVE_FUNCTION (svld1uh, load_ext, sd_integer, implicit)
-DEF_SVE_FUNCTION (svld1uh_gather, load_ext_gather_offset, sd_integer, implicit)
-DEF_SVE_FUNCTION (svld1uh_gather, load_ext_gather_index, sd_integer, implicit)
 DEF_SVE_FUNCTION (svld1uw, load_ext, d_integer, implicit)
-DEF_SVE_FUNCTION (svld1uw_gather, load_ext_gather_offset, d_integer, implicit)
-DEF_SVE_FUNCTION (svld1uw_gather, load_ext_gather_index, d_integer, implicit)
-DEF_SVE_FUNCTION (svldff1, load, all_data, implicit)
-DEF_SVE_FUNCTION (svldff1_gather, load_gather_sv, sd_data, implicit)
-DEF_SVE_FUNCTION (svldff1_gather, load_gather_vs, sd_data, implicit)
-DEF_SVE_FUNCTION (svldff1sb, load_ext, hsd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1sb_gather, load_ext_gather_offset, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1sh, load_ext, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1sh_gather, load_ext_gather_offset, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1sh_gather, load_ext_gather_index, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1sw, load_ext, d_integer, implicit)
-DEF_SVE_FUNCTION (svldff1sw_gather, load_ext_gather_offset, d_integer, implicit)
-DEF_SVE_FUNCTION (svldff1sw_gather, load_ext_gather_index, d_integer, implicit)
-DEF_SVE_FUNCTION (svldff1ub, load_ext, hsd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1ub_gather, load_ext_gather_offset, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1uh, load_ext, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1uh_gather, load_ext_gather_offset, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1uh_gather, load_ext_gather_index, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldff1uw, load_ext, d_integer, implicit)
-DEF_SVE_FUNCTION (svldff1uw_gather, load_ext_gather_offset, d_integer, implicit)
-DEF_SVE_FUNCTION (svldff1uw_gather, load_ext_gather_index, d_integer, implicit)
-DEF_SVE_FUNCTION (svldnf1, load, all_data, implicit)
-DEF_SVE_FUNCTION (svldnf1sb, load_ext, hsd_integer, implicit)
-DEF_SVE_FUNCTION (svldnf1sh, load_ext, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldnf1sw, load_ext, d_integer, implicit)
-DEF_SVE_FUNCTION (svldnf1ub, load_ext, hsd_integer, implicit)
-DEF_SVE_FUNCTION (svldnf1uh, load_ext, sd_integer, implicit)
-DEF_SVE_FUNCTION (svldnf1uw, load_ext, d_integer, implicit)
 DEF_SVE_FUNCTION (svldnt1, load, all_data, implicit)
 DEF_SVE_FUNCTION (svld2, load, all_data, implicit)
 DEF_SVE_FUNCTION (svld3, load, all_data, implicit)
@@ -173,7 +128,6 @@ DEF_SVE_FUNCTION (svmla, ternary_opt_n, all_arith, mxz)
 DEF_SVE_FUNCTION (svmla_lane, ternary_lane, all_float, none)
 DEF_SVE_FUNCTION (svmls, ternary_opt_n, all_arith, mxz)
 DEF_SVE_FUNCTION (svmls_lane, ternary_lane, all_float, none)
-DEF_SVE_FUNCTION (svmmla, mmla, none, none)
 DEF_SVE_FUNCTION (svmov, unary, b, z)
 DEF_SVE_FUNCTION (svmsb, ternary_opt_n, all_arith, mxz)
 DEF_SVE_FUNCTION
(svmul, binary_opt_n, all_arith, mxz) @@ -197,13 +151,9 @@ DEF_SVE_FUNCTION (svpfalse, inherent_b, b, none) DEF_SVE_FUNCTION (svpfirst, unary, b, implicit) DEF_SVE_FUNCTION (svpnext, unary_pred, all_pred, implicit) DEF_SVE_FUNCTION (svprfb, prefetch, none, implicit) -DEF_SVE_FUNCTION (svprfb_gather, prefetch_gather_offset, none, implicit) DEF_SVE_FUNCTION (svprfd, prefetch, none, implicit) -DEF_SVE_FUNCTION (svprfd_gather, prefetch_gather_index, none, implicit) DEF_SVE_FUNCTION (svprfh, prefetch, none, implicit) -DEF_SVE_FUNCTION (svprfh_gather, prefetch_gather_index, none, implicit) DEF_SVE_FUNCTION (svprfw, prefetch, none, implicit) -DEF_SVE_FUNCTION (svprfw_gather, prefetch_gather_index, none, implicit) DEF_SVE_FUNCTION (svptest_any, ptest, none, implicit) DEF_SVE_FUNCTION (svptest_first, ptest, none, implicit) DEF_SVE_FUNCTION (svptest_last, ptest, none, implicit) @@ -244,7 +194,6 @@ DEF_SVE_FUNCTION (svqincw_pat, inc_dec_pat, s_integer, none) DEF_SVE_FUNCTION (svqincw_pat, inc_dec_pat, sd_integer, none) DEF_SVE_FUNCTION (svqsub, binary_opt_n, all_integer, none) DEF_SVE_FUNCTION (svrbit, unary, all_integer, mxz) -DEF_SVE_FUNCTION (svrdffr, rdffr, none, z_or_none) DEF_SVE_FUNCTION (svrecpe, unary, all_float, none) DEF_SVE_FUNCTION (svrecps, binary, all_float, none) DEF_SVE_FUNCTION (svrecpx, unary, all_float, mxz) @@ -269,20 +218,12 @@ DEF_SVE_FUNCTION (svsel, binary, b, implicit) DEF_SVE_FUNCTION (svset2, set, all_data, none) DEF_SVE_FUNCTION (svset3, set, all_data, none) DEF_SVE_FUNCTION (svset4, set, all_data, none) -DEF_SVE_FUNCTION (svsetffr, setffr, none, none) DEF_SVE_FUNCTION (svsplice, binary, all_data, implicit) DEF_SVE_FUNCTION (svsqrt, unary, all_float, mxz) DEF_SVE_FUNCTION (svst1, store, all_data, implicit) -DEF_SVE_FUNCTION (svst1_scatter, store_scatter_index, sd_data, implicit) -DEF_SVE_FUNCTION (svst1_scatter, store_scatter_offset, sd_data, implicit) DEF_SVE_FUNCTION (svst1b, store, hsd_integer, implicit) -DEF_SVE_FUNCTION (svst1b_scatter, 
store_scatter_offset, sd_integer, implicit) DEF_SVE_FUNCTION (svst1h, store, sd_integer, implicit) -DEF_SVE_FUNCTION (svst1h_scatter, store_scatter_index, sd_integer, implicit) -DEF_SVE_FUNCTION (svst1h_scatter, store_scatter_offset, sd_integer, implicit) DEF_SVE_FUNCTION (svst1w, store, d_integer, implicit) -DEF_SVE_FUNCTION (svst1w_scatter, store_scatter_index, d_integer, implicit) -DEF_SVE_FUNCTION (svst1w_scatter, store_scatter_offset, d_integer, implicit) DEF_SVE_FUNCTION (svst2, store, all_data, implicit) DEF_SVE_FUNCTION (svst3, store, all_data, implicit) DEF_SVE_FUNCTION (svst4, store, all_data, implicit) @@ -290,13 +231,10 @@ DEF_SVE_FUNCTION (svstnt1, store, all_data, implicit) DEF_SVE_FUNCTION (svsub, binary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svsubr, binary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svtbl, binary_uint, all_data, none) -DEF_SVE_FUNCTION (svtmad, tmad, all_float, none) DEF_SVE_FUNCTION (svtrn1, binary, all_data, none) DEF_SVE_FUNCTION (svtrn1, binary_pred, all_pred, none) DEF_SVE_FUNCTION (svtrn2, binary, all_data, none) DEF_SVE_FUNCTION (svtrn2, binary_pred, all_pred, none) -DEF_SVE_FUNCTION (svtsmul, binary_uint, all_float, none) -DEF_SVE_FUNCTION (svtssel, binary_uint, all_float, none) DEF_SVE_FUNCTION (svundef, inherent, all_data, none) DEF_SVE_FUNCTION (svundef2, inherent, all_data, none) DEF_SVE_FUNCTION (svundef3, inherent, all_data, none) @@ -311,13 +249,78 @@ DEF_SVE_FUNCTION (svuzp2, binary, all_data, none) DEF_SVE_FUNCTION (svuzp2, binary_pred, all_pred, none) DEF_SVE_FUNCTION (svwhilele, compare_scalar, while, none) DEF_SVE_FUNCTION (svwhilelt, compare_scalar, while, none) -DEF_SVE_FUNCTION (svwrffr, setffr, none, implicit) DEF_SVE_FUNCTION (svzip1, binary, all_data, none) DEF_SVE_FUNCTION (svzip1, binary_pred, all_pred, none) DEF_SVE_FUNCTION (svzip2, binary, all_data, none) DEF_SVE_FUNCTION (svzip2, binary_pred, all_pred, none) #undef REQUIRED_EXTENSIONS +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_SM_OFF 
+DEF_SVE_FUNCTION (svadda, fold_left, all_float, implicit) +DEF_SVE_FUNCTION (svadrb, adr_offset, none, none) +DEF_SVE_FUNCTION (svadrd, adr_index, none, none) +DEF_SVE_FUNCTION (svadrh, adr_index, none, none) +DEF_SVE_FUNCTION (svadrw, adr_index, none, none) +DEF_SVE_FUNCTION (svcompact, unary, sd_data, implicit) +DEF_SVE_FUNCTION (svexpa, unary_uint, all_float, none) +DEF_SVE_FUNCTION (svld1_gather, load_gather_sv, sd_data, implicit) +DEF_SVE_FUNCTION (svld1_gather, load_gather_vs, sd_data, implicit) +DEF_SVE_FUNCTION (svld1sb_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1sh_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1sh_gather, load_ext_gather_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1sw_gather, load_ext_gather_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svld1sw_gather, load_ext_gather_index, d_integer, implicit) +DEF_SVE_FUNCTION (svld1ub_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1uh_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1uh_gather, load_ext_gather_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1uw_gather, load_ext_gather_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svld1uw_gather, load_ext_gather_index, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1, load, all_data, implicit) +DEF_SVE_FUNCTION (svldff1_gather, load_gather_sv, sd_data, implicit) +DEF_SVE_FUNCTION (svldff1_gather, load_gather_vs, sd_data, implicit) +DEF_SVE_FUNCTION (svldff1sb, load_ext, hsd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sb_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sh, load_ext, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sh_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sh_gather, load_ext_gather_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sw, load_ext, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1sw_gather, load_ext_gather_offset, 
d_integer, implicit) +DEF_SVE_FUNCTION (svldff1sw_gather, load_ext_gather_index, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1ub, load_ext, hsd_integer, implicit) +DEF_SVE_FUNCTION (svldff1ub_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1uh, load_ext, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1uh_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1uh_gather, load_ext_gather_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1uw, load_ext, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1uw_gather, load_ext_gather_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1uw_gather, load_ext_gather_index, d_integer, implicit) +DEF_SVE_FUNCTION (svldnf1, load, all_data, implicit) +DEF_SVE_FUNCTION (svldnf1sb, load_ext, hsd_integer, implicit) +DEF_SVE_FUNCTION (svldnf1sh, load_ext, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnf1sw, load_ext, d_integer, implicit) +DEF_SVE_FUNCTION (svldnf1ub, load_ext, hsd_integer, implicit) +DEF_SVE_FUNCTION (svldnf1uh, load_ext, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnf1uw, load_ext, d_integer, implicit) +DEF_SVE_FUNCTION (svmmla, mmla, none, none) +DEF_SVE_FUNCTION (svprfb_gather, prefetch_gather_offset, none, implicit) +DEF_SVE_FUNCTION (svprfd_gather, prefetch_gather_index, none, implicit) +DEF_SVE_FUNCTION (svprfh_gather, prefetch_gather_index, none, implicit) +DEF_SVE_FUNCTION (svprfw_gather, prefetch_gather_index, none, implicit) +DEF_SVE_FUNCTION (svrdffr, rdffr, none, z_or_none) +DEF_SVE_FUNCTION (svsetffr, setffr, none, none) +DEF_SVE_FUNCTION (svst1_scatter, store_scatter_index, sd_data, implicit) +DEF_SVE_FUNCTION (svst1_scatter, store_scatter_offset, sd_data, implicit) +DEF_SVE_FUNCTION (svst1b_scatter, store_scatter_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svst1h_scatter, store_scatter_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svst1h_scatter, store_scatter_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svst1w_scatter, 
store_scatter_index, d_integer, implicit) +DEF_SVE_FUNCTION (svst1w_scatter, store_scatter_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svtmad, tmad, all_float, none) +DEF_SVE_FUNCTION (svtsmul, binary_uint, all_float, none) +DEF_SVE_FUNCTION (svtssel, binary_uint, all_float, none) +DEF_SVE_FUNCTION (svwrffr, setffr, none, implicit) +#undef REQUIRED_EXTENSIONS + #define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_BF16 DEF_SVE_FUNCTION (svbfdot, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfdot_lane, ternary_bfloat_lanex2, s_float, none) @@ -325,27 +328,37 @@ DEF_SVE_FUNCTION (svbfmlalb, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfmlalb_lane, ternary_bfloat_lane, s_float, none) DEF_SVE_FUNCTION (svbfmlalt, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfmlalt_lane, ternary_bfloat_lane, s_float, none) -DEF_SVE_FUNCTION (svbfmmla, ternary_bfloat, s_float, none) DEF_SVE_FUNCTION (svcvt, unary_convert, cvt_bfloat, mxz) DEF_SVE_FUNCTION (svcvtnt, unary_convert_narrowt, cvt_bfloat, mx) #undef REQUIRED_EXTENSIONS +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_BF16 \ + | AARCH64_FL_SM_OFF) +DEF_SVE_FUNCTION (svbfmmla, ternary_bfloat, s_float, none) +#undef REQUIRED_EXTENSIONS + #define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_I8MM -DEF_SVE_FUNCTION (svmmla, mmla, s_integer, none) -DEF_SVE_FUNCTION (svusmmla, ternary_uintq_intq, s_signed, none) DEF_SVE_FUNCTION (svsudot, ternary_intq_uintq_opt_n, s_signed, none) DEF_SVE_FUNCTION (svsudot_lane, ternary_intq_uintq_lane, s_signed, none) DEF_SVE_FUNCTION (svusdot, ternary_uintq_intq_opt_n, s_signed, none) DEF_SVE_FUNCTION (svusdot_lane, ternary_uintq_intq_lane, s_signed, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_F32MM +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_I8MM \ + | AARCH64_FL_SM_OFF) +DEF_SVE_FUNCTION (svmmla, mmla, s_integer, none) +DEF_SVE_FUNCTION (svusmmla, ternary_uintq_intq, s_signed, 
none) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_F32MM \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svmmla, mmla, s_float, none) #undef REQUIRED_EXTENSIONS #define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_F64MM -DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit) -DEF_SVE_FUNCTION (svmmla, mmla, d_float, none) DEF_SVE_FUNCTION (svtrn1q, binary, all_data, none) DEF_SVE_FUNCTION (svtrn2q, binary, all_data, none) DEF_SVE_FUNCTION (svuzp1q, binary, all_data, none) @@ -353,3 +366,10 @@ DEF_SVE_FUNCTION (svuzp2q, binary, all_data, none) DEF_SVE_FUNCTION (svzip1q, binary, all_data, none) DEF_SVE_FUNCTION (svzip2q, binary, all_data, none) #undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_F64MM \ + | AARCH64_FL_SM_OFF) +DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit) +DEF_SVE_FUNCTION (svmmla, mmla, d_float, none) +#undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index 565393f3081..4aac1ac942a 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -51,24 +51,9 @@ DEF_SVE_FUNCTION (sveor3, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (sveorbt, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (sveortb, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (svhadd, binary_opt_n, all_integer, mxz) -DEF_SVE_FUNCTION (svhistcnt, binary_to_uint, sd_integer, z) -DEF_SVE_FUNCTION (svhistseg, binary_to_uint, b_integer, none) DEF_SVE_FUNCTION (svhsub, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svhsubr, binary_opt_n, all_integer, mxz) -DEF_SVE_FUNCTION (svldnt1_gather, load_gather_sv_restricted, sd_data, implicit) -DEF_SVE_FUNCTION (svldnt1_gather, load_gather_vs, sd_data, implicit) -DEF_SVE_FUNCTION (svldnt1sb_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION 
(svldnt1sh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_index_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_offset_restricted, d_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_index_restricted, d_integer, implicit) -DEF_SVE_FUNCTION (svldnt1ub_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_index_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_offset_restricted, d_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_index_restricted, d_integer, implicit) DEF_SVE_FUNCTION (svlogb, unary_to_int, all_float, mxz) -DEF_SVE_FUNCTION (svmatch, compare, bh_integer, implicit) DEF_SVE_FUNCTION (svmaxp, binary, all_arith, mx) DEF_SVE_FUNCTION (svmaxnmp, binary, all_float, mx) DEF_SVE_FUNCTION (svmla_lane, ternary_lane, hsd_integer, none) @@ -91,7 +76,6 @@ DEF_SVE_FUNCTION (svmullb_lane, binary_long_lane, sd_integer, none) DEF_SVE_FUNCTION (svmullt, binary_long_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svmullt_lane, binary_long_lane, sd_integer, none) DEF_SVE_FUNCTION (svnbsl, ternary_opt_n, all_integer, none) -DEF_SVE_FUNCTION (svnmatch, compare, bh_integer, implicit) DEF_SVE_FUNCTION (svpmul, binary_opt_n, b_unsigned, none) DEF_SVE_FUNCTION (svpmullb, binary_long_opt_n, hd_unsigned, none) DEF_SVE_FUNCTION (svpmullb_pair, binary_opt_n, bs_unsigned, none) @@ -164,13 +148,6 @@ DEF_SVE_FUNCTION (svsli, ternary_shift_left_imm, all_integer, none) DEF_SVE_FUNCTION (svsqadd, binary_int_opt_n, all_unsigned, mxz) DEF_SVE_FUNCTION (svsra, ternary_shift_right_imm, all_integer, none) DEF_SVE_FUNCTION (svsri, ternary_shift_right_imm, all_integer, none) -DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_index_restricted, sd_data, 
implicit) -DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_offset_restricted, sd_data, implicit) -DEF_SVE_FUNCTION (svstnt1b_scatter, store_scatter_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svstnt1h_scatter, store_scatter_index_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svstnt1h_scatter, store_scatter_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_index_restricted, d_integer, implicit) -DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_offset_restricted, d_integer, implicit) DEF_SVE_FUNCTION (svsubhnb, binary_narrowb_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svsubhnt, binary_narrowt_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svsublb, binary_long_opt_n, hsd_integer, none) @@ -191,7 +168,36 @@ DEF_SVE_FUNCTION (svxar, ternary_shift_right_imm, all_integer, none) #define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ | AARCH64_FL_SVE2 \ - | AARCH64_FL_SVE2_AES) + | AARCH64_FL_SM_OFF) +DEF_SVE_FUNCTION (svhistcnt, binary_to_uint, sd_integer, z) +DEF_SVE_FUNCTION (svhistseg, binary_to_uint, b_integer, none) +DEF_SVE_FUNCTION (svldnt1_gather, load_gather_sv_restricted, sd_data, implicit) +DEF_SVE_FUNCTION (svldnt1_gather, load_gather_vs, sd_data, implicit) +DEF_SVE_FUNCTION (svldnt1sb_gather, load_ext_gather_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_index_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_offset_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_index_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1ub_gather, load_ext_gather_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_index_restricted, sd_integer, implicit) 
+DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_offset_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_index_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svmatch, compare, bh_integer, implicit) +DEF_SVE_FUNCTION (svnmatch, compare, bh_integer, implicit) +DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_index_restricted, sd_data, implicit) +DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_offset_restricted, sd_data, implicit) +DEF_SVE_FUNCTION (svstnt1b_scatter, store_scatter_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svstnt1h_scatter, store_scatter_index_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svstnt1h_scatter, store_scatter_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_index_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_offset_restricted, d_integer, implicit) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_AES \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svaesd, binary, b_unsigned, none) DEF_SVE_FUNCTION (svaese, binary, b_unsigned, none) DEF_SVE_FUNCTION (svaesmc, unary, b_unsigned, none) @@ -202,7 +208,8 @@ DEF_SVE_FUNCTION (svpmullt_pair, binary_opt_n, d_unsigned, none) #define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ | AARCH64_FL_SVE2 \ - | AARCH64_FL_SVE2_BITPERM) + | AARCH64_FL_SVE2_BITPERM \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svbdep, binary_opt_n, all_unsigned, none) DEF_SVE_FUNCTION (svbext, binary_opt_n, all_unsigned, none) DEF_SVE_FUNCTION (svbgrp, binary_opt_n, all_unsigned, none) @@ -210,13 +217,15 @@ DEF_SVE_FUNCTION (svbgrp, binary_opt_n, all_unsigned, none) #define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ | AARCH64_FL_SVE2 \ - | AARCH64_FL_SVE2_SHA3) + | AARCH64_FL_SVE2_SHA3 \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svrax1, binary, d_integer, none) #undef REQUIRED_EXTENSIONS #define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ 
| AARCH64_FL_SVE2 \ - | AARCH64_FL_SVE2_SM4) + | AARCH64_FL_SVE2_SM4 \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svsm4e, binary, s_unsigned, none) DEF_SVE_FUNCTION (svsm4ekey, binary, s_unsigned, none) #undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index ecee554a890..d5ac1dc76c5 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -738,6 +738,13 @@ check_required_extensions (location_t location, tree fndecl, if (missing_extensions == 0) return check_required_registers (location, fndecl); + if (missing_extensions & AARCH64_FL_SM_OFF) + { + error_at (location, "ACLE function %qD cannot be called when" + " SME streaming mode is enabled", fndecl); + return false; + } + static const struct { aarch64_feature_flags flag; const char *name; diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index e9cebffe3e0..3f48e4cdf26 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -1086,7 +1086,7 @@ (define_insn "aarch64_wrffr" (match_operand:VNx16BI 0 "aarch64_simd_reg_or_minus_one")) (set (reg:VNx16BI FFRT_REGNUM) (unspec:VNx16BI [(match_dup 0)] UNSPEC_WRFFR))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 0 ] [ Dm ] setffr [ Upa ] wrffr\t%0.b @@ -1128,7 +1128,7 @@ (define_insn "aarch64_copy_ffr_to_ffrt" (define_insn "aarch64_rdffr" [(set (match_operand:VNx16BI 0 "register_operand" "=Upa") (reg:VNx16BI FFRT_REGNUM))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffr\t%0.b" ) @@ -1138,7 +1138,7 @@ (define_insn "aarch64_rdffr_z" (and:VNx16BI (reg:VNx16BI FFRT_REGNUM) (match_operand:VNx16BI 1 "register_operand" "Upa")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffr\t%0.b, %1/z" ) @@ -1154,7 +1154,7 @@ (define_insn "*aarch64_rdffr_z_ptest" (match_dup 1))] UNSPEC_PTEST)) (clobber (match_scratch:VNx16BI 0 "=Upa"))] - "TARGET_SVE" + 
"TARGET_SVE && TARGET_NON_STREAMING" "rdffrs\t%0.b, %1/z" ) @@ -1168,7 +1168,7 @@ (define_insn "*aarch64_rdffr_ptest" (reg:VNx16BI FFRT_REGNUM)] UNSPEC_PTEST)) (clobber (match_scratch:VNx16BI 0 "=Upa"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffrs\t%0.b, %1/z" ) @@ -1187,7 +1187,7 @@ (define_insn "*aarch64_rdffr_z_cc" (and:VNx16BI (reg:VNx16BI FFRT_REGNUM) (match_dup 1)))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffrs\t%0.b, %1/z" ) @@ -1202,7 +1202,7 @@ (define_insn "*aarch64_rdffr_cc" UNSPEC_PTEST)) (set (match_operand:VNx16BI 0 "register_operand" "=Upa") (reg:VNx16BI FFRT_REGNUM))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffrs\t%0.b, %1/z" ) @@ -1332,7 +1332,7 @@ (define_insn "@aarch64_ldf1" (match_operand:SVE_FULL 1 "aarch64_sve_ldf1_operand" "Ut") (reg:VNx16BI FFRT_REGNUM)] SVE_LDFF1_LDNF1))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "ldf1\t%0., %2/z, %1" ) @@ -1366,7 +1366,9 @@ (define_insn_and_rewrite "@aarch64_ldf1_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" "ldf1\t%0., %2/z, %1" "&& !CONSTANT_P (operands[3])" { @@ -1414,7 +1416,7 @@ (define_expand "gather_load" (match_operand:DI 4 "aarch64_gather_scale_operand_") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { operands[5] = aarch64_ptrue_reg (mode); } @@ -1432,7 +1434,7 @@ (define_insn "mask_gather_load" (match_operand:DI 4 "aarch64_gather_scale_operand_") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5 ] [&w, Z, w, Ui1, Ui1, Upl] ld1\t%0.s, %5/z, [%2.s] [?w, Z, 0, Ui1, Ui1, Upl] ^ @@ -1461,7 +1463,7 @@ (define_insn "mask_gather_load" (match_operand:DI 4 "aarch64_gather_scale_operand_") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5] [&w, Z, w, i, Ui1, Upl] ld1\t%0.d, %5/z, [%2.d] [?w, Z, 0, i, Ui1, Upl] ^ @@ -1489,7 
+1491,7 @@ (define_insn_and_rewrite "*mask_gather_load_xtw_unpac (match_operand:DI 4 "aarch64_gather_scale_operand_") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5] [&w, rk, w, i, Ui1, Upl ] ld1\t%0.d, %5/z, [%1, %2.d, xtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1519,7 +1521,7 @@ (define_insn_and_rewrite "*mask_gather_load_sxtw" (match_operand:DI 4 "aarch64_gather_scale_operand_") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5] [&w, rk, w, i, Ui1, Upl ] ld1\t%0.d, %5/z, [%1, %2.d, sxtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1546,7 +1548,7 @@ (define_insn "*mask_gather_load_uxtw" (match_operand:DI 4 "aarch64_gather_scale_operand_") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5] [&w, rk, w, i, Ui1, Upl ] ld1\t%0.d, %5/z, [%1, %2.d, uxtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1583,7 +1585,9 @@ (define_insn_and_rewrite "@aarch64_gather_load_ (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] UNSPEC_PRED_X))] - "TARGET_SVE && (~ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" {@ [cons: =0, 1, 2, 3, 4, 5, 6] [&w, Z, w, Ui1, Ui1, Upl, UplDnm] ld1\t%0.s, %5/z, [%2.s] [?w, Z, 0, Ui1, Ui1, Upl, UplDnm] ^ @@ -1620,7 +1624,9 @@ (define_insn_and_rewrite "@aarch64_gather_load_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" {@ [cons: =0, 1, 2, 3, 4, 5, 6] [&w, Z, w, i, Ui1, Upl, UplDnm] ld1\t%0.d, %5/z, [%2.d] [?w, Z, 0, i, Ui1, Upl, UplDnm] ^ @@ -1656,7 +1662,9 @@ (define_insn_and_rewrite "*aarch64_gather_load_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" {@ [cons: =0, 1, 2, 3, 4, 5] [&w, rk, w, i, Ui1, Upl ] ld1\t%0.d, %5/z, [%1, %2.d, xtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1691,7 +1699,9 @@ (define_insn_and_rewrite "*aarch64_gather_load_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" {@ [cons: 
=0, 1, 2, 3, 4, 5] [&w, rk, w, i, Ui1, Upl ] ld1\t%0.d, %5/z, [%1, %2.d, sxtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1723,7 +1733,9 @@ (define_insn_and_rewrite "*aarch64_gather_load_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" {@ [cons: =0, 1, 2, 3, 4, 5] [&w, rk, w, i, Ui1, Upl ] ld1\t%0.d, %5/z, [%1, %2.d, uxtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1757,7 +1769,7 @@ (define_insn "@aarch64_ldff1_gather" (mem:BLK (scratch)) (reg:VNx16BI FFRT_REGNUM)] UNSPEC_LDFF1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5 ] [&w, Z, w, i, Ui1, Upl] ldff1w\t%0.s, %5/z, [%2.s] [?w, Z, 0, i, Ui1, Upl] ^ @@ -1787,7 +1799,7 @@ (define_insn "@aarch64_ldff1_gather" (mem:BLK (scratch)) (reg:VNx16BI FFRT_REGNUM)] UNSPEC_LDFF1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5 ] [&w, Z, w, i, Ui1, Upl ] ldff1d\t%0.d, %5/z, [%2.d] [?w, Z, 0, i, Ui1, Upl ] ^ @@ -1817,7 +1829,7 @@ (define_insn_and_rewrite "*aarch64_ldff1_gather_sxtw" (mem:BLK (scratch)) (reg:VNx16BI FFRT_REGNUM)] UNSPEC_LDFF1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5] [&w, rk, w, i, Ui1, Upl ] ldff1d\t%0.d, %5/z, [%1, %2.d, sxtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1844,7 +1856,7 @@ (define_insn "*aarch64_ldff1_gather_uxtw" (mem:BLK (scratch)) (reg:VNx16BI FFRT_REGNUM)] UNSPEC_LDFF1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3, 4, 5] [&w, rk, w, i, Ui1, Upl ] ldff1d\t%0.d, %5/z, [%1, %2.d, uxtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1882,7 +1894,7 @@ (define_insn_and_rewrite "@aarch64_ldff1_gather_\t%0.s, %5/z, [%2.s] [?w, Z, 0, i, Ui1, Upl, UplDnm] ^ @@ -1920,7 +1932,7 @@ (define_insn_and_rewrite "@aarch64_ldff1_gather_\t%0.d, %5/z, [%2.d] [?w, Z, 0, i, Ui1, Upl, UplDnm] ^ @@ -1958,7 +1970,7 @@ (define_insn_and_rewrite "*aarch64_ldff1_gather_\t%0.d, %5/z, [%1, %2.d, sxtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -1990,7 +2002,7 @@ (define_insn_and_rewrite 
"*aarch64_ldff1_gather_\t%0.d, %5/z, [%1, %2.d, uxtw] [?w, rk, 0, i, Ui1, Upl ] ^ @@ -2068,7 +2080,7 @@ (define_insn "@aarch64_sve_gather_prefetch" UNSPEC_SVE_PREFETCH_GATHER) (match_operand:DI 7 "const_int_operand") (match_operand:DI 8 "const_int_operand"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { static const char *const insns[][2] = { "prf", "%0, [%2.s]", @@ -2097,7 +2109,7 @@ (define_insn "@aarch64_sve_gather_prefetch" UNSPEC_SVE_PREFETCH_GATHER) (match_operand:DI 7 "const_int_operand") (match_operand:DI 8 "const_int_operand"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { static const char *const insns[][2] = { "prf", "%0, [%2.d]", @@ -2128,7 +2140,7 @@ (define_insn_and_rewrite "*aarch64_sve_gather_prefetch_ux UNSPEC_SVE_PREFETCH_GATHER) (match_operand:DI 7 "const_int_operand") (match_operand:DI 8 "const_int_operand"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { static const char *const insns[][2] = { "prfb", "%0, [%1, %2.d, uxtw]", @@ -2325,7 +2337,7 @@ (define_expand "scatter_store" (match_operand:DI 3 "aarch64_gather_scale_operand_") (match_operand:SVE_24 4 "register_operand")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { operands[5] = aarch64_ptrue_reg (mode); } @@ -2343,7 +2355,7 @@ (define_insn "mask_scatter_store" (match_operand:DI 3 "aarch64_gather_scale_operand_") (match_operand:SVE_4 4 "register_operand")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 0 , 1 , 2 , 3 , 4 , 5 ] [ Z , w , Ui1 , Ui1 , w , Upl ] st1\t%4.s, %5, [%1.s] [ vgw , w , Ui1 , Ui1 , w , Upl ] st1\t%4.s, %5, [%1.s, #%0] @@ -2366,7 +2378,7 @@ (define_insn "mask_scatter_store" (match_operand:DI 3 "aarch64_gather_scale_operand_") (match_operand:SVE_2 4 "register_operand")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 0 , 1 , 3 , 4 , 5 ] [ Z , w , Ui1 , w , Upl ] st1\t%4.d, %5, [%1.d] [ vgd , w , Ui1 , w , Upl ] st1\t%4.d, %5, [%1.d, 
#%0] @@ -2390,7 +2402,7 @@ (define_insn_and_rewrite "*mask_scatter_store_xtw_unp (match_operand:DI 3 "aarch64_gather_scale_operand_") (match_operand:SVE_2 4 "register_operand")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 0 , 1 , 3 , 4 , 5 ] [ rk , w , Ui1 , w , Upl ] st1\t%4.d, %5, [%0, %1.d, xtw] [ rk , w , i , w , Upl ] st1\t%4.d, %5, [%0, %1.d, xtw %p3] @@ -2418,7 +2430,7 @@ (define_insn_and_rewrite "*mask_scatter_store_sxtw" (match_operand:DI 3 "aarch64_gather_scale_operand_") (match_operand:SVE_2 4 "register_operand")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 0 , 1 , 3 , 4 , 5 ] [ rk , w , Ui1 , w , Upl ] st1\t%4.d, %5, [%0, %1.d, sxtw] [ rk , w , i , w , Upl ] st1\t%4.d, %5, [%0, %1.d, sxtw %p3] @@ -2443,7 +2455,7 @@ (define_insn "*mask_scatter_store_uxtw" (match_operand:DI 3 "aarch64_gather_scale_operand_") (match_operand:SVE_2 4 "register_operand")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 0 , 1 , 3 , 4 , 5 ] [ rk , w , Ui1 , w , Upl ] st1\t%4.d, %5, [%0, %1.d, uxtw] [ rk , w , i , w , Upl ] st1\t%4.d, %5, [%0, %1.d, uxtw %p3] @@ -2472,7 +2484,7 @@ (define_insn "@aarch64_scatter_store_trunc" (truncate:VNx4_NARROW (match_operand:VNx4_WIDE 4 "register_operand"))] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 1 , 2 , 4 , 5 ] [ w , Ui1 , w , Upl ] st1\t%4.s, %5, [%1.s] [ w , Ui1 , w , Upl ] st1\t%4.s, %5, [%1.s, #%0] @@ -2496,7 +2508,7 @@ (define_insn "@aarch64_scatter_store_trunc" (truncate:VNx2_NARROW (match_operand:VNx2_WIDE 4 "register_operand"))] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 1 , 4 , 5 ] [ w , w , Upl ] st1\t%4.d, %5, [%1.d] [ w , w , Upl ] st1\t%4.d, %5, [%1.d, #%0] @@ -2522,7 +2534,7 @@ (define_insn_and_rewrite "*aarch64_scatter_store_trunc\t%4.d, %5, [%0, %1.d, sxtw] [ rk , w , w , Upl ] st1\t%4.d, %5, [%0, %1.d, sxtw %p3] 
@@ -2547,7 +2559,7 @@ (define_insn "*aarch64_scatter_store_trunc_uxt (truncate:VNx2_NARROW (match_operand:VNx2_WIDE 4 "register_operand"))] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 0 , 1 , 4 , 5 ] [ rk , w , w , Upl ] st1\t%4.d, %5, [%0, %1.d, uxtw] [ rk , w , w , Upl ] st1\t%4.d, %5, [%0, %1.d, uxtw %p3] @@ -2727,7 +2739,7 @@ (define_insn "@aarch64_sve_ld1ro" (match_operand:OI 1 "aarch64_sve_ld1ro_operand_" "UO")] UNSPEC_LD1RO))] - "TARGET_SVE_F64MM" + "TARGET_SVE_F64MM && TARGET_NON_STREAMING" { operands[1] = gen_rtx_MEM (mode, XEXP (operands[1], 0)); return "ld1ro\t%0., %2/z, %1"; @@ -3971,7 +3983,7 @@ (define_insn "@aarch64_adr" [(match_operand:SVE_FULL_SDI 1 "register_operand" "w") (match_operand:SVE_FULL_SDI 2 "register_operand" "w")] UNSPEC_ADR))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0., [%1., %2.]" ) @@ -3987,7 +3999,7 @@ (define_insn_and_rewrite "*aarch64_adr_sxtw" (match_operand:VNx2DI 2 "register_operand" "w")))] UNSPEC_PRED_X)] UNSPEC_ADR))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, sxtw]" "&& !CONSTANT_P (operands[3])" { @@ -4004,7 +4016,7 @@ (define_insn "*aarch64_adr_uxtw_unspec" (match_operand:VNx2DI 2 "register_operand" "w") (match_operand:VNx2DI 3 "aarch64_sve_uxtw_immediate"))] UNSPEC_ADR))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, uxtw]" ) @@ -4016,7 +4028,7 @@ (define_insn "*aarch64_adr_uxtw_and" (match_operand:VNx2DI 2 "register_operand" "w") (match_operand:VNx2DI 3 "aarch64_sve_uxtw_immediate")) (match_operand:VNx2DI 1 "register_operand" "w")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, uxtw]" ) @@ -4031,7 +4043,7 @@ (define_expand "@aarch64_adr_shift" (match_operand:SVE_FULL_SDI 3 "const_1_to_3_operand"))] UNSPEC_PRED_X) (match_operand:SVE_FULL_SDI 1 "register_operand")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { operands[4] = CONSTM1_RTX (mode); } 
@@ -4047,7 +4059,7 @@ (define_insn_and_rewrite "*aarch64_adr_shift" (match_operand:SVE_24I 3 "const_1_to_3_operand"))] UNSPEC_PRED_X) (match_operand:SVE_24I 1 "register_operand" "w")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0., [%1., %2., lsl %3]" "&& !CONSTANT_P (operands[4])" { @@ -4071,7 +4083,7 @@ (define_insn_and_rewrite "*aarch64_adr_shift_sxtw" (match_operand:VNx2DI 3 "const_1_to_3_operand"))] UNSPEC_PRED_X) (match_operand:VNx2DI 1 "register_operand" "w")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, sxtw %3]" "&& (!CONSTANT_P (operands[4]) || !CONSTANT_P (operands[5]))" { @@ -4092,7 +4104,7 @@ (define_insn_and_rewrite "*aarch64_adr_shift_uxtw" (match_operand:VNx2DI 3 "const_1_to_3_operand"))] UNSPEC_PRED_X) (match_operand:VNx2DI 1 "register_operand" "w")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, uxtw %3]" "&& !CONSTANT_P (operands[5])" { @@ -7197,7 +7209,7 @@ (define_insn "@aarch64_sve_add_" (match_operand: 3 "register_operand")] MATMUL) (match_operand:VNx4SI_ONLY 1 "register_operand")))] - "TARGET_SVE_I8MM" + "TARGET_SVE_I8MM && TARGET_NON_STREAMING" {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] [ w , 0 , w , w ; * ] mmla\t%0.s, %2.b, %3.b [ ?&w , w , w , w ; yes ] movprfx\t%0, %1\;mmla\t%0.s, %2.b, %3.b @@ -7772,7 +7784,7 @@ (define_insn "@aarch64_sve_" (match_operand:SVE_MATMULF 3 "register_operand") (match_operand:SVE_MATMULF 1 "register_operand")] FMMLA))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] [ w , 0 , w , w ; * ] \t%0., %2., %3. [ ?&w , w , w , w ; yes ] movprfx\t%0, %1\;\t%0., %2., %3. 
@@ -8841,7 +8853,7 @@ (define_expand "fold_left_plus_" (match_operand: 1 "register_operand") (match_operand:SVE_FULL_F 2 "register_operand")] UNSPEC_FADDA))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { operands[3] = aarch64_ptrue_reg (mode); } @@ -8854,7 +8866,7 @@ (define_insn "mask_fold_left_plus_" (match_operand: 1 "register_operand" "0") (match_operand:SVE_FULL_F 2 "register_operand" "w")] UNSPEC_FADDA))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "fadda\t%0, %3, %0, %2." ) @@ -8908,7 +8920,7 @@ (define_insn "@aarch64_sve_compact" [(match_operand: 1 "register_operand" "Upl") (match_operand:SVE_FULL_SD 2 "register_operand" "w")] UNSPEC_SVE_COMPACT))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "compact\t%0., %1, %2." ) diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index ffa964d6060..79e19699bc4 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -109,7 +109,7 @@ (define_insn "@aarch64_gather_ldnt" (match_operand: 3 "register_operand") (mem:BLK (scratch))] UNSPEC_LDNT1_GATHER))] - "TARGET_SVE2" + "TARGET_SVE2 && TARGET_NON_STREAMING" {@ [cons: =0, 1, 2, 3] [&w, Upl, Z, w ] ldnt1\t%0., %1/z, [%3.] [?w, Upl, Z, 0 ] ^ @@ -132,6 +132,7 @@ (define_insn_and_rewrite "@aarch64_gather_ldnt_ & ) == 0" {@ [cons: =0, 1, 2, 3, 4] [&w, Upl, Z, w, UplDnm] ldnt1\t%0., %1/z, [%3.] @@ -165,7 +166,7 @@ (define_insn "@aarch64_scatter_stnt" (match_operand:SVE_FULL_SD 3 "register_operand")] UNSPEC_STNT1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" {@ [ cons: 0 , 1 , 2 , 3 ] [ Upl , Z , w , w ] stnt1\t%3., %0, [%2.] [ Upl , r , w , w ] stnt1\t%3., %0, [%2., %1] @@ -183,6 +184,7 @@ (define_insn "@aarch64_scatter_stnt_" (match_operand:SVE_FULL_SDI 3 "register_operand"))] UNSPEC_STNT1_SCATTER))] "TARGET_SVE2 + && TARGET_NON_STREAMING && (~ & ) == 0" {@ [ cons: 0 , 1 , 2 , 3 ] [ Upl , Z , w , w ] stnt1\t%3., %0, [%2.] 
@@ -2469,7 +2471,7 @@ (define_insn "@aarch64_sve2_histcnt" (match_operand:SVE_FULL_SDI 2 "register_operand" "w") (match_operand:SVE_FULL_SDI 3 "register_operand" "w")] UNSPEC_HISTCNT))] - "TARGET_SVE2" + "TARGET_SVE2 && TARGET_NON_STREAMING" "histcnt\t%0., %1/z, %2., %3." ) @@ -2479,7 +2481,7 @@ (define_insn "@aarch64_sve2_histseg" [(match_operand:VNx16QI_ONLY 1 "register_operand" "w") (match_operand:VNx16QI_ONLY 2 "register_operand" "w")] UNSPEC_HISTSEG))] - "TARGET_SVE2" + "TARGET_SVE2 && TARGET_NON_STREAMING" "histseg\t%0., %1., %2." ) @@ -2503,7 +2505,7 @@ (define_insn "@aarch64_pred_" SVE2_MATCH)] UNSPEC_PRED_Z)) (clobber (reg:CC_NZC CC_REGNUM))] - "TARGET_SVE2" + "TARGET_SVE2 && TARGET_NON_STREAMING" "\t%0., %1/z, %3., %4." ) @@ -2534,6 +2536,7 @@ (define_insn_and_rewrite "*aarch64_pred__cc" SVE2_MATCH)] UNSPEC_PRED_Z))] "TARGET_SVE2 + && TARGET_NON_STREAMING && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])" "\t%0., %1/z, %2., %3." "&& !rtx_equal_p (operands[4], operands[6])" @@ -2561,6 +2564,7 @@ (define_insn_and_rewrite "*aarch64_pred__ptest" UNSPEC_PTEST)) (clobber (match_scratch: 0 "=Upa"))] "TARGET_SVE2 + && TARGET_NON_STREAMING && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])" "\t%0., %1/z, %2., %3." "&& !rtx_equal_p (operands[4], operands[6])" diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 808e2044009..a88d35000df 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -253,6 +253,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define AARCH64_ISA_LS64 (aarch64_isa_flags & AARCH64_FL_LS64) #define AARCH64_ISA_CSSC (aarch64_isa_flags & AARCH64_FL_CSSC) +/* The current function is a normal non-streaming function. */ +#define TARGET_NON_STREAMING (AARCH64_ISA_SM_OFF) + /* Crypto is an optional extension to AdvSIMD. 
*/ #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO) @@ -291,16 +294,16 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define TARGET_SVE2 (AARCH64_ISA_SVE2) /* SVE2 AES instructions, enabled through +sve2-aes. */ -#define TARGET_SVE2_AES (AARCH64_ISA_SVE2_AES) +#define TARGET_SVE2_AES (AARCH64_ISA_SVE2_AES && TARGET_NON_STREAMING) /* SVE2 BITPERM instructions, enabled through +sve2-bitperm. */ -#define TARGET_SVE2_BITPERM (AARCH64_ISA_SVE2_BITPERM) +#define TARGET_SVE2_BITPERM (AARCH64_ISA_SVE2_BITPERM && TARGET_NON_STREAMING) /* SVE2 SHA3 instructions, enabled through +sve2-sha3. */ -#define TARGET_SVE2_SHA3 (AARCH64_ISA_SVE2_SHA3) +#define TARGET_SVE2_SHA3 (AARCH64_ISA_SVE2_SHA3 && TARGET_NON_STREAMING) /* SVE2 SM4 instructions, enabled through +sve2-sm4. */ -#define TARGET_SVE2_SM4 (AARCH64_ISA_SVE2_SM4) +#define TARGET_SVE2_SM4 (AARCH64_ISA_SVE2_SM4 && TARGET_NON_STREAMING) /* SME instructions, enabled through +sme. Note that this does not imply anything about the state of PSTATE.SM. 
*/ diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index e7aa7e35ae1..5f7cd886283 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -2707,7 +2707,7 @@ (define_int_iterator SVE_INT_UNARY [UNSPEC_RBIT UNSPEC_REVB (define_int_iterator SVE_FP_UNARY [UNSPEC_FRECPE UNSPEC_RSQRTE]) -(define_int_iterator SVE_FP_UNARY_INT [UNSPEC_FEXPA]) +(define_int_iterator SVE_FP_UNARY_INT [(UNSPEC_FEXPA "TARGET_NON_STREAMING")]) (define_int_iterator SVE_INT_SHIFT_IMM [UNSPEC_ASRD (UNSPEC_SQSHLU "TARGET_SVE2") @@ -2721,7 +2721,7 @@ (define_int_iterator SVE_FP_BINARY_INT [UNSPEC_FTSMUL UNSPEC_FTSSEL]) (define_int_iterator SVE_BFLOAT_TERNARY_LONG [UNSPEC_BFDOT UNSPEC_BFMLALB UNSPEC_BFMLALT - UNSPEC_BFMMLA]) + (UNSPEC_BFMMLA "TARGET_NON_STREAMING")]) (define_int_iterator SVE_BFLOAT_TERNARY_LONG_LANE [UNSPEC_BFDOT UNSPEC_BFMLALB diff --git a/gcc/testsuite/g++.target/aarch64/sve/aarch64-ssve.exp b/gcc/testsuite/g++.target/aarch64/sve/aarch64-ssve.exp new file mode 100644 index 00000000000..d6a5a561a33 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sve/aarch64-ssve.exp @@ -0,0 +1,308 @@ +# Specific regression driver for AArch64 SME. +# Copyright (C) 2009-2023 Free Software Foundation, Inc. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . */ + +# Test whether certain SVE instructions are accepted or rejected in +# SME streaming mode. 
+ +# Exit immediately if this isn't an AArch64 target. +if {![istarget aarch64*-*-*] } { + return +} + +load_lib gcc-defs.exp + +gcc_parallel_test_enable 0 + +# Code shared by all tests. +set preamble { +#include <arm_sve.h> + +#pragma GCC target "+i8mm+f32mm+f64mm+sve2+sve2-bitperm+sve2-sm4+sve2-aes+sve2-sha3+sme" + +extern svbool_t &pred; + +extern svint8_t &s8; +extern svint32_t &s32; + +extern svuint8_t &u8; +extern svuint16_t &u16; +extern svuint32_t &u32; +extern svuint64_t &u64; + +extern svbfloat16_t &bf16; +extern svfloat32_t &f32; + +extern void *void_ptr; + +extern int8_t *s8_ptr; +extern int16_t *s16_ptr; +extern int32_t *s32_ptr; + +extern uint8_t *u8_ptr; +extern uint16_t *u16_ptr; +extern uint32_t *u32_ptr; +extern uint64_t *u64_ptr; + +extern uint64_t indx; +} + +# Wrap a standalone call in a streaming-compatible function. +set sc_harness { +void +foo () [[arm::streaming_compatible]] +{ + $CALL; +} +} + +# HARNESS is some source code that should be appended to the preamble +# variable defined above. It includes the string "$CALL", which should be +# replaced by the function call in CALL. The result after both steps is +# a complete C++ translation unit. +# +# Try compiling the C++ code and see what output GCC produces. +# The expected output is either: +# +# - empty, if SHOULD_PASS is true +# - a message rejecting CALL in streaming mode, if SHOULD_PASS is false +# +# CALL is simple enough that it can be used in test names. 
+proc check_ssve_call { harness name call should_pass } { + global preamble + + set filename test-[pid] + set fd [open $filename.cc w] + puts $fd $preamble + puts -nonewline $fd [string map [list {$CALL} $call] $harness] + close $fd + remote_download host $filename.cc + + set test "streaming SVE call $name" + + set gcc_output [g++_target_compile $filename.cc $filename.s assembly ""] + remote_file build delete $filename.cc $filename.s + + if { [string equal $gcc_output ""] } { + if { $should_pass } { + pass $test + } else { + fail $test + } + return + } + + set lines [split $gcc_output "\n"] + set error_text "cannot be called when SME streaming mode is enabled" + if { [llength $lines] == 3 + && [string first "In function" [lindex $lines 0]] >= 0 + && [string first $error_text [lindex $lines 1]] >= 0 + && [string equal [lindex $lines 2] ""] } { + if { $should_pass } { + fail $test + } else { + pass $test + } + return + } + + verbose -log "$test: unexpected output" + fail $test +} + +# Apply check_ssve_call to each line in CALLS. The other arguments are +# as for check_ssve_call. +proc check_ssve_calls { harness calls should_pass } { + foreach line [split $calls "\n"] { + set call [string trim $line] + if { [string equal $call ""] } { + continue + } + check_ssve_call $harness "$call" $call $should_pass + } +} + +# A small selection of things that are valid in streaming mode. +set streaming_ok { + s8 = svadd_x (pred, s8, s8) + s8 = svld1 (pred, s8_ptr) +} + +# This order follows the list in the SME manual. 
+set nonstreaming_only { + u32 = svadrb_offset (u32, u32) + u64 = svadrb_offset (u64, u64) + u32 = svadrh_index (u32, u32) + u64 = svadrh_index (u64, u64) + u32 = svadrw_index (u32, u32) + u64 = svadrw_index (u64, u64) + u32 = svadrd_index (u32, u32) + u64 = svadrd_index (u64, u64) + u8 = svaesd (u8, u8) + u8 = svaese (u8, u8) + u8 = svaesimc (u8) + u8 = svaesmc (u8) + u8 = svbdep (u8, u8) + u8 = svbext (u8, u8) + f32 = svbfmmla (f32, bf16, bf16) + u8 = svbgrp (u8, u8) + u32 = svcompact (pred, u32) + f32 = svadda (pred, 1.0f, f32) + f32 = svexpa (u32) + f32 = svmmla (f32, f32, f32) + f32 = svtmad (f32, f32, 0) + f32 = svtsmul (f32, u32) + f32 = svtssel (f32, u32) + u32 = svhistcnt_z (pred, u32, u32) + u8 = svhistseg (u8, u8) + u32 = svld1ub_gather_offset_u32 (pred, u8_ptr, u32) + u32 = svld1ub_gather_offset_u32 (pred, u32, 1) + u64 = svld1_gather_index (pred, u64_ptr, u64) + u64 = svld1_gather_index_u64 (pred, u64, 1) + u32 = svld1uh_gather_index_u32 (pred, u16_ptr, u32) + u32 = svld1uh_gather_index_u32 (pred, u32, 1) + u8 = svld1ro (pred, u8_ptr + indx) + u8 = svld1ro (pred, u8_ptr + 1) + u16 = svld1ro (pred, u16_ptr + indx) + u16 = svld1ro (pred, u16_ptr + 1) + u32 = svld1ro (pred, u32_ptr + indx) + u32 = svld1ro (pred, u32_ptr + 1) + u64 = svld1ro (pred, u64_ptr + indx) + u64 = svld1ro (pred, u64_ptr + 1) + u32 = svld1sb_gather_offset_u32 (pred, s8_ptr, u32) + u32 = svld1sb_gather_offset_u32 (pred, u32, 1) + u32 = svld1sh_gather_index_u32 (pred, s16_ptr, u32) + u32 = svld1sh_gather_index_u32 (pred, u32, 1) + u64 = svld1sw_gather_index_u64 (pred, s32_ptr, u64) + u64 = svld1sw_gather_index_u64 (pred, u64, 1) + u64 = svld1uw_gather_index_u64 (pred, u32_ptr, u64) + u64 = svld1uw_gather_index_u64 (pred, u64, 1) + u32 = svld1_gather_index (pred, u32_ptr, u32) + u32 = svld1_gather_index_u32 (pred, u32, 1) + u8 = svldff1(pred, u8_ptr) + u16 = svldff1ub_u16(pred, u8_ptr) + u32 = svldff1ub_u32(pred, u8_ptr) + u64 = svldff1ub_u64(pred, u8_ptr) + u32 = 
svldff1ub_gather_offset_u32 (pred, u8_ptr, u32) + u32 = svldff1ub_gather_offset_u32 (pred, u32, 1) + u64 = svldff1(pred, u64_ptr) + u64 = svldff1_gather_index (pred, u64_ptr, u64) + u64 = svldff1_gather_index_u64 (pred, u64, 1) + u16 = svldff1(pred, u16_ptr) + u32 = svldff1uh_u32(pred, u16_ptr) + u64 = svldff1uh_u64(pred, u16_ptr) + u32 = svldff1uh_gather_offset_u32 (pred, u16_ptr, u32) + u32 = svldff1uh_gather_offset_u32 (pred, u32, 1) + u16 = svldff1sb_u16(pred, s8_ptr) + u32 = svldff1sb_u32(pred, s8_ptr) + u64 = svldff1sb_u64(pred, s8_ptr) + u32 = svldff1sb_gather_offset_u32 (pred, s8_ptr, u32) + u32 = svldff1sb_gather_offset_u32 (pred, u32, 1) + u32 = svldff1sh_u32(pred, s16_ptr) + u64 = svldff1sh_u64(pred, s16_ptr) + u32 = svldff1sh_gather_offset_u32 (pred, s16_ptr, u32) + u32 = svldff1sh_gather_offset_u32 (pred, u32, 1) + u64 = svldff1sw_u64(pred, s32_ptr) + u64 = svldff1sw_gather_offset_u64 (pred, s32_ptr, u64) + u64 = svldff1sw_gather_offset_u64 (pred, u64, 1) + u32 = svldff1(pred, u32_ptr) + u32 = svldff1_gather_index (pred, u32_ptr, u32) + u32 = svldff1_gather_index_u32 (pred, u32, 1) + u64 = svldff1uw_u64(pred, u32_ptr) + u64 = svldff1uw_gather_offset_u64 (pred, u32_ptr, u64) + u64 = svldff1uw_gather_offset_u64 (pred, u64, 1) + u8 = svldnf1(pred, u8_ptr) + u16 = svldnf1ub_u16(pred, u8_ptr) + u32 = svldnf1ub_u32(pred, u8_ptr) + u64 = svldnf1ub_u64(pred, u8_ptr) + u64 = svldnf1(pred, u64_ptr) + u16 = svldnf1(pred, u16_ptr) + u32 = svldnf1uh_u32(pred, u16_ptr) + u64 = svldnf1uh_u64(pred, u16_ptr) + u16 = svldnf1sb_u16(pred, s8_ptr) + u32 = svldnf1sb_u32(pred, s8_ptr) + u64 = svldnf1sb_u64(pred, s8_ptr) + u32 = svldnf1sh_u32(pred, s16_ptr) + u64 = svldnf1sh_u64(pred, s16_ptr) + u64 = svldnf1sw_u64(pred, s32_ptr) + u32 = svldnf1(pred, u32_ptr) + u64 = svldnf1uw_u64(pred, u32_ptr) + u32 = svldnt1ub_gather_offset_u32 (pred, u8_ptr, u32) + u32 = svldnt1ub_gather_offset_u32 (pred, u32, 1) + u64 = svldnt1_gather_index (pred, u64_ptr, u64) + u64 = 
svldnt1_gather_index_u64 (pred, u64, 1) + u32 = svldnt1uh_gather_offset_u32 (pred, u16_ptr, u32) + u32 = svldnt1uh_gather_offset_u32 (pred, u32, 1) + u32 = svldnt1sb_gather_offset_u32 (pred, s8_ptr, u32) + u32 = svldnt1sb_gather_offset_u32 (pred, u32, 1) + u32 = svldnt1sh_gather_offset_u32 (pred, s16_ptr, u32) + u32 = svldnt1sh_gather_offset_u32 (pred, u32, 1) + u64 = svldnt1sw_gather_offset_u64 (pred, s32_ptr, u64) + u64 = svldnt1sw_gather_offset_u64 (pred, u64, 1) + u64 = svldnt1uw_gather_offset_u64 (pred, u32_ptr, u64) + u64 = svldnt1uw_gather_offset_u64 (pred, u64, 1) + u32 = svldnt1_gather_offset (pred, u32_ptr, u32) + u32 = svldnt1_gather_offset_u32 (pred, u32, 1) + pred = svmatch (pred, u8, u8) + pred = svnmatch (pred, u8, u8) + u64 = svpmullb_pair (u64, u64) + u64 = svpmullt_pair (u64, u64) + svprfb_gather_offset (pred, void_ptr, u64, SV_PLDL1KEEP) + svprfb_gather_offset (pred, u64, 1, SV_PLDL1KEEP) + svprfd_gather_index (pred, void_ptr, u64, SV_PLDL1KEEP) + svprfd_gather_index (pred, u64, 1, SV_PLDL1KEEP) + svprfh_gather_index (pred, void_ptr, u64, SV_PLDL1KEEP) + svprfh_gather_index (pred, u64, 1, SV_PLDL1KEEP) + svprfw_gather_index (pred, void_ptr, u64, SV_PLDL1KEEP) + svprfw_gather_index (pred, u64, 1, SV_PLDL1KEEP) + u64 = svrax1 (u64, u64) + pred = svrdffr () + pred = svrdffr_z (pred) + svsetffr () + u32 = svsm4e (u32, u32) + u32 = svsm4ekey (u32, u32) + s32 = svmmla (s32, s8, s8) + svst1b_scatter_offset (pred, u8_ptr, u32, u32) + svst1b_scatter_offset (pred, u32, 1, u32) + svst1_scatter_index (pred, u64_ptr, u64, u64) + svst1_scatter_index (pred, u64, 1, u64) + svst1h_scatter_index (pred, u16_ptr, u32, u32) + svst1h_scatter_index (pred, u32, 1, u32) + svst1w_scatter_index (pred, u32_ptr, u64, u64) + svst1w_scatter_index (pred, u64, 1, u64) + svst1_scatter_index (pred, u32_ptr, u32, u32) + svst1_scatter_index (pred, u32, 1, u32) + svstnt1b_scatter_offset (pred, u8_ptr, u32, u32) + svstnt1b_scatter_offset (pred, u32, 1, u32) + svstnt1_scatter_offset 
(pred, u64_ptr, u64, u64) + svstnt1_scatter_offset (pred, u64, 1, u64) + svstnt1h_scatter_offset (pred, u16_ptr, u32, u32) + svstnt1h_scatter_offset (pred, u32, 1, u32) + svstnt1w_scatter_offset (pred, u32_ptr, u64, u64) + svstnt1w_scatter_offset (pred, u64, 1, u64) + svstnt1_scatter_offset (pred, u32_ptr, u32, u32) + svstnt1_scatter_offset (pred, u32, 1, u32) + u32 = svmmla (u32, u8, u8) + s32 = svusmmla (s32, u8, s8) + svwrffr (pred) +} + +check_ssve_calls $sc_harness $streaming_ok 1 +check_ssve_calls $sc_harness $nonstreaming_only 0 + +gcc_parallel_test_enable 1 diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp b/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp index 5b40d0d5c39..4b4ee10a014 100644 --- a/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp +++ b/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp @@ -50,6 +50,7 @@ if { [info exists gcc_runtest_parallelize_limit_minor] } { torture-init set-torture-options { "-std=c++98 -O0 -g" + "-std=c++11 -O0 -DSTREAMING_COMPATIBLE" "-std=c++98 -O1 -g" "-std=c++11 -O2 -g" "-std=c++14 -O3 -g" diff --git a/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp b/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp index b605da8770b..9cd2efd05cb 100644 --- a/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp +++ b/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp @@ -53,6 +53,7 @@ if { [info exists gcc_runtest_parallelize_limit_minor] } { torture-init set-torture-options { "-std=c++98 -O0 -g" + "-std=c++11 -O0 -DSTREAMING_COMPATIBLE" "-std=c++98 -O1 -g" "-std=c++11 -O2 -g" "-std=c++14 -O3 -g" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp b/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp index ba4704e54f4..eee7c420ffd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp @@ -50,6 +50,7 @@ if { [info exists gcc_runtest_parallelize_limit_minor] } { torture-init set-torture-options { "-std=c90 -O0 -g" + "-std=c90 -O0 -DSTREAMING_COMPATIBLE" "-std=c90 -O1 -g" "-std=c99 -O2 -g" "-std=c11 -O3 -g" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f16.c index 642c45ab492..d381d881d82 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f32.c index 79bdd3d8048..e0b908837a0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f64.c index c8f56772218..fd730c85153 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrb.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrb.c index a61eec9712e..5dcdc54b007 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrb.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrb.c @@ -1,3 
+1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrd.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrd.c index 970485bd67d..d9d16ce3f7d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrd.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrd.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrh.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrh.c index d06f51fe35b..a358c240389 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrh.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrh.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrw.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrw.c index b23f25a1125..bd1e9af0a6d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrw.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrw.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c index b1d98fbf536..4bb2912a45a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-additional-options "-march=armv8.2-a+sve+bf16" } */ /* { 
dg-require-effective-target aarch64_asm_bf16_ok } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f32.c index 2e80d6830ca..d261ec00b92 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f64.c index e0bc33efec2..024b0510faa 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s32.c index e4634982bf6..0b32dfb609c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s64.c index 71cb97b8a2a..38688dbca73 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { 
check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u32.c index 954329a0b2f..a3e89cc97a1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u64.c index ec664845f4a..602ab048c99 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f16.c index 5a5411e46cb..87c26e6ea6b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f32.c index 4ded1c5756e..5e9839537c7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f64.c index c31f9ccb5b2..b117df2a4b1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c index 00b68ff290c..8b972f61b49 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c index 47127960c0d..413d4d62d4e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c index 9b6335547f5..b3df7d154cf 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c index c9cea3ad8c7..0da1e52966b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c index 2cccc8d4906..a3304c4197a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c index 6ee1d48ab0c..73ef94805dc 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c index cb1801778d4..fe909b666c9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c index 86081edbd65..30ba3063900 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c index c8df00f8a02..cf62fada91a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c index 2fb9d5b7486..b9fde4dac69 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c index 3cd211b1646..35b7dd1d27e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c index 44b16ed5f72..57b6a6567c0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c index 3aa9a15eeee..bd7e28478e2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c index 49aff5146f2..1438000038e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c index 00bf9e129f5..145b0b7f3aa 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c index 9e9b3290a12..9f150631b94 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c index 64ec628714b..8dd75d13607 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c index 22701320bf7..f154545868b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c index 16a5316a9e4..06249ad4c5c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c index 3f953247ea1..8d141e133e6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c index 424de65a6fe..77836cbf652 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c index aa375bea2e3..f4b24ab419a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c index ed07b4dfcfa..1b978236845 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c index 20ca4272059..2009dec812e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c index e3a85a23fb6..0e1d4896665 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c index 3a0094fba59..115d7d3a996 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c index 4d076b4861a..5dc44421ca4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c index ffa85eb3e73..fac4ec41c00 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c index a9c4182659e..f57df42266d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c index 99af86ddf82..0c069fa4f44 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c index 77c7e0a2dff..98102e01393 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c index b605f8b67e3..f86a34d1248 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c index 84fb5c335d7..13937187895 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c index 44700179322..f0338aae6b4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c index 09d3cc8c298..5810bc0accb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c index f3dcf03cd81..52e95abb9b4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c index f4e9d5db970..0889eefdddd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c index 854d19233f5..fb144d756ab 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c index 80f6468700e..1f997480ea8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f16.c index 13ce863c96a..60405d0a0ed 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f32.c index 2fcc633906c..225e9969dd2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f64.c index cc15b927aba..366e36afdbe 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c index 7e330c04221..b84b9bcdda7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c index d0e47f0bf19..e779b071283 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c index 66bf0f74630..17e0f9aa2d8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c index faf71bf9dd5..030f187b152 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c index 41c7dc9cf31..fb86530166f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c index 8b53ce94f85..5be30a2d842 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s16.c index 1d5fde0e639..61d242c074b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s32.c index 97a36e88499..afe748ef939 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s64.c index c018a4c1ca6..bee22285539 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s8.c index cf620d1f4b0..ccaac2ca4eb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u16.c index 1fa819296cb..c8416f99df9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u32.c index 5224ec40ac8..ec26a82ca19 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u64.c index 18e87f2b805..e211f179486 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u8.c index 83883fca43a..24dfe452f03 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c index c2a676807a5..f7e3977bfcf 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c index 2f2a04d24bb..7f2a829a8e4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c index e3e83a205cb..685f628088d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c index 769f2c266e9..49a7a85367f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c index e0a748c6a6b..1d30c7ba618 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c index 86716da9ba1..c2b3f42cb5b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c index e7a4aa6e93d..585a6241e0b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c index 69ba96d52e2..ebb2f0f66f0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c index e1a1873f0a4..f4ea96cf91c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c index 0a49cbcc07f..e3735239c4e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c index b633335dc71..67e70361b5c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c index 32a4309b633..5755c79bc1a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c index 73a9be8923b..a5848999573 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c index 94ea73b6306..b1875120980 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c index 81b64e836b8..bffac936527 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c index 453b3ff244a..a4acb1e5ea9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c index bbbed79dc35..828288cd825 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c index 5430e256b46..e3432c46c27 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c index e5da8a83dc3..78aa34ec055 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c index 41142875673..9dad1212c81 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c index d795ace6391..33b6c10ddc5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c index 6caf2f5045d..e8c9c845f95 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c index af0be08d21c..b1c9c81357f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c index 43124dd8930..9ab776a218f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c index 90c4e58a275..745740dfa3f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c index 302623a400b..3a7bd6a436b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c index 88ad2d1dc61..ade0704f7ad 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c index e8e06411f98..5d3e0ce95e5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c index 21d02ddb721..08ae802ee26 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c index 904cb027e3e..d8dc5e15738 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c index a400123188b..042ae5a9f02 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c index a9a98a68362..d0844fa5197 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c index d02e443428a..12460105d0e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c index 663a73d2715..536331371b0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c index 5e0ef067f54..602e6a686e6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c index 1cfae1b9532..4b307b3416e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c index abb3d769a74..db205b1ef7b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c index 6e330e8e8a8..0eac877eb82 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c index 4eb5323e957..266ecf167fe 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c index ebac26e7d37..bdd725e4a35 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c index 6c0daea52b5..ab2c79da782 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c index 0e400c6790f..361d7de05d8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c index ac97798991c..8adcec3d512 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c index c7ab0617106..781fc1a9c66 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c index 947a896e778..93b4425ecb5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c index cf017868839..d47d748c76c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c index 83b73ec8e09..e390d685797 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c index 778096e826b..97a0e39e7c8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c index 592c8237de3..21008d7f9ca 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c index 634092af8ea..8a3d795b309 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c index 4a03f66767a..c0b57a2f3fc 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c index 162ee176ad5..6714152d93c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c index e920ac43b45..3df404d77bb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c index 65e28c5c206..e899a4a6ff4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c index 70d3f27d87a..ab69656cfa8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c index 5c29f1d196a..5d7b074973e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c index e04b9a7887f..5b53c885d6a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c index 0553fc98da4..992eba7cc2f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c index 61a474fdf52..99e0f8bd091 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c index be63d8bf9b2..fe23913f23c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c index 4f52490b4a8..6deb39770a1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c index 73f50d182a5..e76457da6cd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c index 08c7dc6dd4d..e49a7f8ed49 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c index 6a41bc26b7f..00b40281c24 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c index 2f7718730f1..41560af330f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c index d7f1a68a4cd..0acf4b34916 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c index 5b483e4aa1d..5782128982c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c index 62121ce0a44..8249c4c3f79 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c index 8fe13411f31..e59c451f790 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c index 50122e3b786..d788576e275 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c index d7cce11b60c..b21fdb96491 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c index 7bf82c3b6c0..1ae41b002ff 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c index e2fef064b47..e3d8fb3b5f0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c index 57c61e122ac..df9a0c07fa7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c index ed9686c4ed5..c3467d84675 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c index a3107f562b8..bf3355e9986 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c index 93d5abaf76e..bcc3eb3fd8f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c index 32d36a84ce3..4c01c13ac3f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c index 373922791d0..3c655659115 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c index b3c3be1d01f..b222a0dc648 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f32.c index f66dbf397c4..e1c7f47dc96 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_f32mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+f32mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f64.c index 49dc0607cff..c45caa70001 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+f64mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_s32.c index e7ce009acfc..dc155461c61 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_i8mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+sve+i8mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_u32.c index 81f5166fbf9..43d601a471d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_u32.c +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_i8mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+sve+i8mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb_gather.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb_gather.c index c4bfbbbf7d7..f32cfbfcb19 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb_gather.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb_gather.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd_gather.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd_gather.c index a84acb1a106..8a4293b6253 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd_gather.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd_gather.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh_gather.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh_gather.c index 04b7a15758c..6beca4b8e0f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh_gather.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh_gather.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw_gather.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw_gather.c index 2bbae1b9e02..6af44ac8290 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw_gather.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw_gather.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rdffr_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rdffr_1.c index 5564e967fcf..7e28ef6412f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rdffr_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rdffr_1.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c index cb6774ad04f..1efd4344532 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c index fe978bbe5f1..f50c43e8309 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c index d244e701a81..bb6fb10b83f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c index 5c4ebf440bc..19ec78e9e6e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c index fe3f7259f24..57fbb91b0ef 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c index 23212356625..60018be5b80 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c index d59033356be..fb1bb29dbe2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c index c7a35f1b470..65ee9a071fd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c index e098cb9b77e..ceec6193952 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c index 058d1313fc2..aeedbc6d7a7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c index 2a23d41f3a1..2d69d085bc0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c index 6a1adb05609..3e5733ef9bb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c index 12197315d09..5cd330a3dec 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c index 7021ea68f49..0ee9948cb4e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c index 2363f592b19..f18bedce1ca 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c index 767c009b4f7..6850865ec9a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */

 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
index 2da61ff5c0b..d8916809b8e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
@@ -11,10 +11,17 @@
 #error "Please define -DTEST_OVERLOADS or -DTEST_FULL"
 #endif

+#ifdef STREAMING_COMPATIBLE
+#define ATTR __arm_streaming_compatible
+#else
+#define ATTR
+#endif
+
 #ifdef __cplusplus
-#define PROTO(NAME, RET, ARGS) extern "C" RET NAME ARGS; RET NAME ARGS
+#define PROTO(NAME, RET, ARGS) \
+  extern "C" RET NAME ARGS ATTR; RET NAME ARGS ATTR
 #else
-#define PROTO(NAME, RET, ARGS) RET NAME ARGS
+#define PROTO(NAME, RET, ARGS) RET NAME ARGS ATTR
 #endif

 #define TEST_UNIFORM_Z(NAME, TYPE, CODE1, CODE2) \
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f16.c
index 3a00716e37f..c0b03a0d331 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f16.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */

 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f32.c
index b73d420fbac..8eef8a12ca8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f32.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */

 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f64.c
index fc31928a6c3..5c96c55796c 100644
--- 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f16.c index 94bc696eb07..9deed667f89 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f32.c index d0ec91882d2..749ea8664be 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f64.c index 23e0da3f7a0..053abcb26e9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f16.c index e7c3ea03b81..3ab251fe04a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f16.c +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f32.c index 022573a191d..6c6471c5e56 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f64.c index ffcdf4224b3..9559e0f352d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/usmmla_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/usmmla_s32.c index 9440f3fd919..a0dd7e334aa 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/usmmla_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/usmmla_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_i8mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+sve+i8mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp b/gcc/testsuite/gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp index 0ad6463d832..f62782ef40b 100644 --- 
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp
@@ -52,6 +52,7 @@ if { [info exists gcc_runtest_parallelize_limit_minor] } {
 torture-init
 set-torture-options {
     "-std=c90 -O0 -g"
+    "-std=c90 -O0 -DSTREAMING_COMPATIBLE"
     "-std=c90 -O1 -g"
     "-std=c99 -O2 -g"
     "-std=c11 -O3 -g"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesd_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesd_u8.c
index 384b6ffc9aa..65ba09471ac 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesd_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesd_u8.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */

 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aese_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aese_u8.c
index 6381bce1661..f902c3c1d32 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aese_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aese_u8.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */

 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c
index 76259326467..dab06b79a95 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */

 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c
index 30e83d381dc..7e7cc65be5d 100644
--- 
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u16.c index 14230850f70..c1a4e10614f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u32.c index 7f08df4baa2..4f14cc4c432 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u64.c index 7f7cbbeebad..091253ec60b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u8.c index b420323b906..deb1ad27d90 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u8.c +++ 
b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u16.c index 50a647918e5..9efa501efa8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u32.c index 9f98b843c1a..18963da5bd3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u64.c index 9dbaec1b762..91591f93b88 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u8.c index 81ed5a463a0..1211587ef41 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if 
"" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c index 70aeae3f329..72868bea7f6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c index 6e19e38d897..c8923816fe4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c index 27fa40f4777..86989529faf 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c index b667e03e3a4..5cd941a7a6e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies 
"**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c index 7bf783a7c18..53d6c5c5636 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c index 001f5f0f187..c6d9862e31f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c index d93091adc55..cb11a00261b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c index 3b889802395..0bb06cdb45d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include 
"test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_s8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_s8.c index 380ccdf85a5..ce3458e5ef6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_u8.c index f43292f0ccd..7b1eff811c5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c index 102810e25c8..17e3673a4a7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c index a0ed71227e8..8ce32e9f9ff 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c index 94c64971c77..b7e1d7a99c8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c index a0aa6703f9c..b0789ad21ce 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c index e1479684e82..df09eaa7680 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c index 77cdcfebafe..5f185ea824b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c index bb729483fcd..71fece575d9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c index de5b693140c..1183e72f0fb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c index d01ec18e442..4d5e6e7716f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c index b96e94353f1..ed329a23f19 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c index 1dcfbc0fb95..6dbd6cea0f6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c index 4166ed0a6c8..4ea3335a29f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c index 7680344da28..d5545151994 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c index 2427c83ab67..18c8ca44e7b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c index 2f538e847c2..41bff31d021 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c index ace1c2f2fe5..30b8f6948f7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c index d3b29eb193d..8750d11af0f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c index 3bc406620d7..f7981991a6a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c index 0af4b40b851..4d5ee4ef4ef 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c index fe28d78ed46..005c29c0644 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c index 985432615ca..92613b16685 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c index 3c5baeee60e..be2e6d126e8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c index 4d945e9f994..4d122059f72 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c index 680238ac4f7..e3bc1044cd7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c index 787ae9defb2..9efa4b2cbf0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c index 4810bc3c45c..4ded4454df1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s16.c index baebc7693c6..d0ce8129475 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s8.c index f35a753791d..03473906aa2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u16.c index 0bdf4462f3d..2a8b4d250ab 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ 
#include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u8.c index 6d78692bdb4..8409276d905 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c index 935b19a1040..044ba1de397 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c index 8a00b30f308..6c2d890fa41 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c index 868c20a11e5..863e31054e2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c index af6b5816513..a62783db763 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c index 944609214a1..1fd85e0ce80 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c index 90e2e991f9b..300d885abb0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_s64.c index ea80d40dbdf..9dbc7183992 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_u64.c index b237c7edd5a..5caa2a5443b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c index cf6a2a95235..96c20dcaac4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c index 58ad33c5ddb..e72384108e6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c index 3f928e20eac..75539f6928f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c index 8a35c76b90a..c0d47d0c13f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c index bd600268228..80fb3e8695b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c index 0bfa2616ef5..edd2bc41832 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c index fbfa008c1d5..a6e5059def9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c index c283135c4ec..067e5b109c3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c index bf6ba597362..498fe82e5c2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c index a24d0c89c76..614f5fb1a49 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c index 2b05a7720bd..ce2c482afbd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c index a13c5f5bb9d..593dc193975 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c index 4e012f61f34..b9d06c1c5ab 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c index e934a708d89..006e0e24dec 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c index db21821eb58..8cd7cb86ab3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c index 53f930da1fc..972ee36896b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c index ec6c837d907..368a17c4769 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c index 3c5d96de4f8..57d60a350de 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" From patchwork Tue Dec 5 10:13:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872041 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxKS4dr4z1ySd for ; Tue, 5 Dec 2023 21:17:00 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1730A386185C for ; Tue, 5 Dec 2023 10:16:31 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 49AAE3858032 for ; Tue, 5 Dec 2023 10:13:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 49AAE3858032 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 49AAE3858032 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771230; cv=none; 
b=tz6HYYj3MBq4b7QTbmMS9mVwvLcpcTgrI1wIN6nNj2Le4jbbU4Enor0+CGCgHdxqaKEfkv3ic9fmIOnqu322Je15TYm2Xj+zCSm3eUHer9snZktE9lvYbhtr6PLLQJ00Xel2CbVZLFoedQdvaZPwgfSJ/kjk2vNIt+bHW6vlIco= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771230; c=relaxed/simple; bh=cRbSDLE+ClQ05QeUqBDEOpEPqUfFSobUzq3xHf2aB70=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=h3wuhv2g3mOSjlXSU3aV0uFe45Bk0iXDEjzLgO64c3GPoVkfmYFXnT76q3Qxk40CzEuImGREbnWBVmwISz6bj/cqlT9LX/4U9zqdMpddx0IBbTTAHVUzmQvUWa1eg1SFkdlNgHfgqovocbm3zBLR1xQ98XzRErgm99/PwQObSI8= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 987881596; Tue, 5 Dec 2023 02:14:30 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 899713F5A1; Tue, 5 Dec 2023 02:13:43 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 15/25] aarch64: Switch PSTATE.SM around calls Date: Tue, 5 Dec 2023 10:13:13 +0000 Message-Id: <20231205101323.1914247-16-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-21.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

This patch adds support for switching to the appropriate SME mode
for each call.  Switching to streaming mode requires an SMSTART SM
instruction and switching to non-streaming mode requires an SMSTOP SM
instruction.  If the call is being made from streaming-compatible code,
these switches are conditional on the current mode being the opposite
of the one that the call needs.

Since changing PSTATE.SM changes the vector length and effectively
changes the ISA, the code to do the switching has to be emitted late.
The patch does this using a new pass that runs next to late
prologue/epilogue insertion.  (It doesn't use md_reorg because later
additions need the CFG.)

If a streaming-compatible function needs to switch mode for a call,
it must restore the original mode afterwards.  The old mode must
therefore be available immediately after the call.  The easiest
way of ensuring this is to force the use of a hard frame pointer
and ensure that the old state is saved at an in-range offset
from there.

Changing modes clobbers the Z and P registers, so we need to
save and restore live Z and P state around each mode switch.
However, mode switches are not expected to be performance
critical, so it seemed better to err on the side of being
correct rather than trying to optimise the save and restore
with surrounding code.

gcc/
	* config/aarch64/aarch64-passes.def (pass_switch_pstate_sm): New pass.
	* config/aarch64/aarch64-sme.md: New file.
	* config/aarch64/aarch64.md: Include it.
	(*tb1): Rename to...
	(@aarch64_tb): ...this.
	(call, call_value, sibcall, sibcall_value): Don't require operand 2
	to be a CONST_INT.
	* config/aarch64/aarch64-protos.h (aarch64_emit_call_insn): Return
	the insn.
	(make_pass_switch_pstate_sm): Declare.
	* config/aarch64/aarch64.h (TARGET_STREAMING_COMPATIBLE): New macro.
	(CALL_USED_REGISTER): Mark VG as call-preserved.
	(aarch64_frame::old_svcr_offset): New member variable.
	(machine_function::call_switches_pstate_sm): Likewise.
	(CUMULATIVE_ARGS::num_sme_mode_switch_args): Likewise.
	(CUMULATIVE_ARGS::sme_mode_switch_args): Likewise.
	* config/aarch64/aarch64.cc: Include tree-pass.h and cfgbuild.h.
	(aarch64_cfun_incoming_pstate_sm): New function.
	(aarch64_call_switches_pstate_sm): Likewise.
	(aarch64_reg_save_mode): Return DImode for VG_REGNUM.
	(aarch64_callee_isa_mode): New function.
	(aarch64_insn_callee_isa_mode): Likewise.
	(aarch64_guard_switch_pstate_sm): Likewise.
	(aarch64_switch_pstate_sm): Likewise.
	(aarch64_sme_mode_switch_regs): New class.
	(aarch64_record_sme_mode_switch_args): New function.
	(aarch64_finish_sme_mode_switch_args): Likewise.
	(aarch64_function_arg): Handle the end marker by returning a
	PARALLEL that contains the ABI cookie that we used previously
	alongside the result of aarch64_finish_sme_mode_switch_args.
	(aarch64_init_cumulative_args): Initialize num_sme_mode_switch_args.
	(aarch64_function_arg_advance): If a call would switch SM state,
	record all argument registers that would need to be saved around
	the mode switch.
	(aarch64_need_old_pstate_sm): New function.
	(aarch64_layout_frame): Decide whether the frame needs to store the
	incoming value of PSTATE.SM and allocate a save slot for it if so.
	If a function switches SME state, arrange to save the old value
	of the DWARF VG register.  Handle the case where this is the only
	register save slot above the FP.
	(aarch64_save_callee_saves): Handle saves of the DWARF VG register.
	(aarch64_get_separate_components): Prevent such saves from being
	shrink-wrapped.
	(aarch64_old_svcr_mem): New function.
	(aarch64_read_old_svcr): Likewise.
	(aarch64_guard_switch_pstate_sm): Likewise.
	(aarch64_expand_prologue): Handle saves of the DWARF VG register.
	Initialize any SVCR save slot.
	(aarch64_expand_call): Allow the cookie to be a PARALLEL that contains
	both the UNSPEC_CALLEE_ABI value and a list of registers that
	need to be preserved across a change to PSTATE.SM.  If the call
	does involve such a change to PSTATE.SM, record the registers
	that would be clobbered by this process.  Also emit an instruction
	to mark the temporary change in VG.  Update call_switches_pstate_sm.
	(aarch64_emit_call_insn): Return the emitted instruction.
	(aarch64_frame_pointer_required): New function.
	(aarch64_conditional_register_usage): Prevent VG_REGNUM from being
	treated as a register operand.
	(aarch64_switch_pstate_sm_for_call): New function.
	(pass_data_switch_pstate_sm): New pass variable.
	(pass_switch_pstate_sm): New pass class.
	(make_pass_switch_pstate_sm): New function.
	(TARGET_FRAME_POINTER_REQUIRED): Define.
	* config/aarch64/t-aarch64 (s-check-sve-md): Add aarch64-sme.md.

gcc/testsuite/
	* gcc.target/aarch64/sme/call_sm_switch_1.c: New test.
	* gcc.target/aarch64/sme/call_sm_switch_2.c: Likewise.
	* gcc.target/aarch64/sme/call_sm_switch_3.c: Likewise.
	* gcc.target/aarch64/sme/call_sm_switch_4.c: Likewise.
	* gcc.target/aarch64/sme/call_sm_switch_5.c: Likewise.
	* gcc.target/aarch64/sme/call_sm_switch_6.c: Likewise.
	* gcc.target/aarch64/sme/call_sm_switch_7.c: Likewise.
	* gcc.target/aarch64/sme/call_sm_switch_8.c: Likewise.
	* gcc.target/aarch64/sme/call_sm_switch_9.c: Likewise.
	* gcc.target/aarch64/sme/call_sm_switch_10.c: Likewise.
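
The switching rule described in the commit message can be sketched as a small
stand-alone model.  This is only an illustration under stated assumptions:
SmState and wrap_call are hypothetical names rather than GCC interfaces, the
strings are pseudo-assembly ("old_svcr" stands for whichever register holds
the saved SVCR value), and the save and restore of live Z and P registers
around each switch is left out.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the PSTATE.SM requirement of a function:
// known-off, known-on, or streaming-compatible (either mode at run time).
enum class SmState { Off, On, Compatible };

// Return a pseudo-assembly sequence for a call from CALLER to CALLEE.
// This is a sketch of the rule in the commit message, not GCC's code.
std::vector<std::string> wrap_call(SmState caller, SmState callee)
{
  // No switch is needed if the callee accepts either mode or already
  // shares the caller's known mode.
  if (callee == SmState::Compatible || caller == callee)
    return {"bl callee"};

  const bool to_streaming = (callee == SmState::On);
  const std::string enter = to_streaming ? "smstart sm" : "smstop sm";
  const std::string leave = to_streaming ? "smstop sm" : "smstart sm";

  // Caller mode known at compile time: switch unconditionally.
  if (caller != SmState::Compatible)
    return {enter, "bl callee", leave};

  // Streaming-compatible caller: test bit 0 of the saved SVCR value and
  // branch around each switch if the mode is already the one required.
  const std::string guard = to_streaming ? "tbnz" : "tbz";
  return {guard + " old_svcr, #0, 1f", enter, "1:", "bl callee",
          guard + " old_svcr, #0, 2f", leave, "2:"};
}
```

For example, wrap_call (SmState::Off, SmState::On) gives an unconditional
SMSTART SM / BL / SMSTOP SM sequence, while a streaming-compatible caller
gets the same switches guarded by a test of the saved PSTATE.SM bit.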
---
 gcc/config/aarch64/aarch64-passes.def         |   1 +
 gcc/config/aarch64/aarch64-protos.h           |   3 +-
 gcc/config/aarch64/aarch64-sme.md             | 171 ++++
 gcc/config/aarch64/aarch64.cc                 | 883 +++++++++++++++++-
 gcc/config/aarch64/aarch64.h                  |  25 +-
 gcc/config/aarch64/aarch64.md                 |  13 +-
 gcc/config/aarch64/t-aarch64                  |   5 +-
 .../gcc.target/aarch64/sme/call_sm_switch_1.c | 233 +++++
 .../aarch64/sme/call_sm_switch_10.c           |  37 +
 .../gcc.target/aarch64/sme/call_sm_switch_2.c |  43 +
 .../gcc.target/aarch64/sme/call_sm_switch_3.c | 166 ++++
 .../gcc.target/aarch64/sme/call_sm_switch_4.c |  43 +
 .../gcc.target/aarch64/sme/call_sm_switch_5.c | 318 +++++++
 .../gcc.target/aarch64/sme/call_sm_switch_6.c |  45 +
 .../gcc.target/aarch64/sme/call_sm_switch_7.c | 516 ++++++++++
 .../gcc.target/aarch64/sme/call_sm_switch_8.c |  87 ++
 .../gcc.target/aarch64/sme/call_sm_switch_9.c | 103 ++
 17 files changed, 2668 insertions(+), 24 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-sme.md
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_10.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_9.c

diff --git a/gcc/config/aarch64/aarch64-passes.def b/gcc/config/aarch64/aarch64-passes.def
index 6ace797b738..662a13fd5e6 100644
--- a/gcc/config/aarch64/aarch64-passes.def
+++ b/gcc/config/aarch64/aarch64-passes.def
@@ -20,6 +20,7 @@
 
 INSERT_PASS_AFTER (pass_regrename, 1, pass_fma_steering);
 INSERT_PASS_BEFORE (pass_reorder_blocks, 1, pass_track_speculation);
+INSERT_PASS_BEFORE (pass_late_thread_prologue_and_epilogue, 1, pass_switch_pstate_sm);
 INSERT_PASS_AFTER (pass_machine_reorg, 1, pass_tag_collision_avoidance);
 INSERT_PASS_BEFORE (pass_shorten_branches, 1, pass_insert_bti);
 INSERT_PASS_AFTER (pass_if_after_combine, 1, pass_cc_fusion);
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index f629c1c383e..be929e0a774 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -912,7 +912,7 @@ void aarch64_sve_expand_vector_init (rtx, rtx);
 void aarch64_init_cumulative_args (CUMULATIVE_ARGS *, const_tree, rtx,
                                    const_tree, unsigned, bool = false);
 void aarch64_init_expanders (void);
-void aarch64_emit_call_insn (rtx);
+rtx_call_insn *aarch64_emit_call_insn (rtx);
 void aarch64_register_pragmas (void);
 void aarch64_relayout_simd_types (void);
 void aarch64_reset_previous_fndecl (void);
@@ -1053,6 +1053,7 @@ rtl_opt_pass *make_pass_track_speculation (gcc::context *);
 rtl_opt_pass *make_pass_tag_collision_avoidance (gcc::context *);
 rtl_opt_pass *make_pass_insert_bti (gcc::context *ctxt);
 rtl_opt_pass *make_pass_cc_fusion (gcc::context *ctxt);
+rtl_opt_pass *make_pass_switch_pstate_sm (gcc::context *ctxt);
 
 poly_uint64 aarch64_regmode_natural_size (machine_mode);
 
diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
new file mode 100644
index 00000000000..52427b4f17a
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-sme.md
@@ -0,0 +1,171 @@
+;; Machine description for AArch64 SME.
+;; Copyright (C) 2023 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; The file is organised into the following sections (search for the full
+;; line):
+;;
+;; == State management
+;; ---- Test current state
+;; ---- PSTATE.SM management
+
+;; =========================================================================
+;; == State management
+;; =========================================================================
+;;
+;; Many of the instructions in this section are only valid when SME is
+;; present.  However, they don't have a TARGET_SME condition since
+;; (a) they are only emitted under direct control of aarch64 code and
+;; (b) they are sometimes used conditionally, particularly in streaming-
+;; compatible code.
+;;
+;; =========================================================================
+
+;; -------------------------------------------------------------------------
+;; ---- Test current state
+;; -------------------------------------------------------------------------
+
+(define_c_enum "unspec" [
+  UNSPEC_OLD_VG_SAVED
+  UNSPEC_UPDATE_VG
+  UNSPEC_GET_SME_STATE
+  UNSPEC_READ_SVCR
+])
+
+;; A marker instruction to say that the old value of the DWARF VG register
+;; has been saved to the stack, for CFI purposes.  Operand 0 is the old
+;; value of the register and operand 1 is the save slot.
+(define_insn "aarch64_old_vg_saved"
+  [(set (reg:DI VG_REGNUM)
+        (unspec:DI [(match_operand 0)
+                    (match_operand 1)] UNSPEC_OLD_VG_SAVED))]
+  ""
+  ""
+  [(set_attr "type" "no_insn")]
+)
+
+;; A marker to indicate places where a call temporarily changes VG.
+(define_insn "aarch64_update_vg"
+  [(set (reg:DI VG_REGNUM)
+        (unspec:DI [(reg:DI VG_REGNUM)] UNSPEC_UPDATE_VG))]
+  ""
+  ""
+  [(set_attr "type" "no_insn")]
+)
+
+(define_insn "aarch64_get_sme_state"
+  [(set (reg:TI R0_REGNUM)
+        (unspec_volatile:TI [(const_int 0)] UNSPEC_GET_SME_STATE))
+   (clobber (reg:DI R16_REGNUM))
+   (clobber (reg:DI R17_REGNUM))
+   (clobber (reg:DI R18_REGNUM))
+   (clobber (reg:DI R30_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  ""
+  "bl\t__arm_sme_state"
+)
+
+(define_insn "aarch64_read_svcr"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (unspec_volatile:DI [(const_int 0)] UNSPEC_READ_SVCR))]
+  ""
+  "mrs\t%0, svcr"
+)
+
+;; -------------------------------------------------------------------------
+;; ---- PSTATE.SM management
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - SMSTART SM
+;; - SMSTOP SM
+;; -------------------------------------------------------------------------
+
+(define_c_enum "unspec" [
+  UNSPEC_SMSTART_SM
+  UNSPEC_SMSTOP_SM
+])
+
+;; Turn on streaming mode.  This clobbers all SVE state.
+;;
+;; Depend on VG_REGNUM to ensure that the VG save slot has already been
+;; initialized.
+(define_insn "aarch64_smstart_sm"
+  [(unspec_volatile [(const_int 0)] UNSPEC_SMSTART_SM)
+   (use (reg:DI VG_REGNUM))
+   (clobber (reg:V4x16QI V0_REGNUM))
+   (clobber (reg:V4x16QI V4_REGNUM))
+   (clobber (reg:V4x16QI V8_REGNUM))
+   (clobber (reg:V4x16QI V12_REGNUM))
+   (clobber (reg:V4x16QI V16_REGNUM))
+   (clobber (reg:V4x16QI V20_REGNUM))
+   (clobber (reg:V4x16QI V24_REGNUM))
+   (clobber (reg:V4x16QI V28_REGNUM))
+   (clobber (reg:VNx16BI P0_REGNUM))
+   (clobber (reg:VNx16BI P1_REGNUM))
+   (clobber (reg:VNx16BI P2_REGNUM))
+   (clobber (reg:VNx16BI P3_REGNUM))
+   (clobber (reg:VNx16BI P4_REGNUM))
+   (clobber (reg:VNx16BI P5_REGNUM))
+   (clobber (reg:VNx16BI P6_REGNUM))
+   (clobber (reg:VNx16BI P7_REGNUM))
+   (clobber (reg:VNx16BI P8_REGNUM))
+   (clobber (reg:VNx16BI P9_REGNUM))
+   (clobber (reg:VNx16BI P10_REGNUM))
+   (clobber (reg:VNx16BI P11_REGNUM))
+   (clobber (reg:VNx16BI P12_REGNUM))
+   (clobber (reg:VNx16BI P13_REGNUM))
+   (clobber (reg:VNx16BI P14_REGNUM))
+   (clobber (reg:VNx16BI P15_REGNUM))]
+  ""
+  "smstart\tsm"
+)
+
+;; Turn off streaming mode.  This clobbers all SVE state.
+;;
+;; Depend on VG_REGNUM to ensure that the VG save slot has already been
+;; initialized.
+(define_insn "aarch64_smstop_sm"
+  [(unspec_volatile [(const_int 0)] UNSPEC_SMSTOP_SM)
+   (use (reg:DI VG_REGNUM))
+   (clobber (reg:V4x16QI V0_REGNUM))
+   (clobber (reg:V4x16QI V4_REGNUM))
+   (clobber (reg:V4x16QI V8_REGNUM))
+   (clobber (reg:V4x16QI V12_REGNUM))
+   (clobber (reg:V4x16QI V16_REGNUM))
+   (clobber (reg:V4x16QI V20_REGNUM))
+   (clobber (reg:V4x16QI V24_REGNUM))
+   (clobber (reg:V4x16QI V28_REGNUM))
+   (clobber (reg:VNx16BI P0_REGNUM))
+   (clobber (reg:VNx16BI P1_REGNUM))
+   (clobber (reg:VNx16BI P2_REGNUM))
+   (clobber (reg:VNx16BI P3_REGNUM))
+   (clobber (reg:VNx16BI P4_REGNUM))
+   (clobber (reg:VNx16BI P5_REGNUM))
+   (clobber (reg:VNx16BI P6_REGNUM))
+   (clobber (reg:VNx16BI P7_REGNUM))
+   (clobber (reg:VNx16BI P8_REGNUM))
+   (clobber (reg:VNx16BI P9_REGNUM))
+   (clobber (reg:VNx16BI P10_REGNUM))
+   (clobber (reg:VNx16BI P11_REGNUM))
+   (clobber (reg:VNx16BI P12_REGNUM))
+   (clobber (reg:VNx16BI P13_REGNUM))
+   (clobber (reg:VNx16BI P14_REGNUM))
+   (clobber (reg:VNx16BI P15_REGNUM))]
+  ""
+  "smstop\tsm"
+)
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index ea00ec192ee..0bee2a8e373 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -85,6 +85,8 @@
 #include "config/arm/aarch-common.h"
 #include "config/arm/aarch-common-protos.h"
 #include "ssa.h"
+#include "tree-pass.h"
+#include "cfgbuild.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -1791,6 +1793,26 @@ aarch64_fndecl_isa_mode (const_tree fndecl)
   return aarch64_fndecl_pstate_sm (fndecl);
 }
 
+/* Return the state of PSTATE.SM on entry to the current function.
+   This might be different from the state of PSTATE.SM in the function
+   body.  */
+
+static aarch64_feature_flags
+aarch64_cfun_incoming_pstate_sm ()
+{
+  return aarch64_fntype_pstate_sm (TREE_TYPE (cfun->decl));
+}
+
+/* Return true if a call from the current function to a function with
+   ISA mode CALLEE_MODE would involve a change to PSTATE.SM around
+   the BL instruction.  */
+
+static bool
+aarch64_call_switches_pstate_sm (aarch64_feature_flags callee_mode)
+{
+  return (callee_mode & ~AARCH64_ISA_MODE & AARCH64_FL_SM_STATE) != 0;
+}
+
 /* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
 
 static bool
@@ -1814,7 +1836,7 @@ aarch64_emit_cfi_for_reg_p (unsigned int regno)
 static machine_mode
 aarch64_reg_save_mode (unsigned int regno)
 {
-  if (GP_REGNUM_P (regno))
+  if (GP_REGNUM_P (regno) || regno == VG_REGNUM)
     return DImode;
 
   if (FP_REGNUM_P (regno))
@@ -1873,6 +1895,16 @@ aarch64_callee_abi (rtx cookie)
   return function_abis[UINTVAL (cookie) >> AARCH64_NUM_ISA_MODES];
 }
 
+/* COOKIE is a CONST_INT from an UNSPEC_CALLEE_ABI rtx.  Return the
+   required ISA mode on entry to the callee, which is also the ISA
+   mode on return from the callee.  */
+
+static aarch64_feature_flags
+aarch64_callee_isa_mode (rtx cookie)
+{
+  return UINTVAL (cookie) & AARCH64_FL_ISA_MODES;
+}
+
 /* INSN is a call instruction.  Return the CONST_INT stored in its
    UNSPEC_CALLEE_ABI rtx.  */
 
@@ -1895,6 +1927,15 @@ aarch64_insn_callee_abi (const rtx_insn *insn)
   return aarch64_callee_abi (aarch64_insn_callee_cookie (insn));
 }
 
+/* INSN is a call instruction.  Return the required ISA mode on entry to
+   the callee, which is also the ISA mode on return from the callee.  */
+
+static aarch64_feature_flags
+aarch64_insn_callee_isa_mode (const rtx_insn *insn)
+{
+  return aarch64_callee_isa_mode (aarch64_insn_callee_cookie (insn));
+}
+
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  The callee only saves
    the lower 64 bits of a 128-bit register.  Tell the compiler the callee
    clobbers the top 64 bits when restoring the bottom 64 bits.  */
@@ -4108,6 +4149,437 @@ aarch64_sub_sp (rtx temp1, rtx temp2, poly_int64 delta, bool frame_related_p,
                   temp1, temp2, frame_related_p, emit_move_imm);
 }
 
+/* A streaming-compatible function needs to switch temporarily to the known
+   PSTATE.SM mode described by LOCAL_MODE.  The low bit of OLD_SVCR contains
+   the runtime state of PSTATE.SM in the streaming-compatible code, before
+   the start of the switch to LOCAL_MODE.
+
+   Emit instructions to branch around the mode switch if PSTATE.SM already
+   matches LOCAL_MODE.  Return the label that the branch jumps to.  */
+
+static rtx_insn *
+aarch64_guard_switch_pstate_sm (rtx old_svcr, aarch64_feature_flags local_mode)
+{
+  local_mode &= AARCH64_FL_SM_STATE;
+  gcc_assert (local_mode != 0);
+  auto already_ok_cond = (local_mode & AARCH64_FL_SM_ON ? NE : EQ);
+  auto *label = gen_label_rtx ();
+  auto *jump = emit_jump_insn (gen_aarch64_tb (already_ok_cond, DImode, DImode,
+                                               old_svcr, const0_rtx, label));
+  JUMP_LABEL (jump) = label;
+  return label;
+}
+
+/* Emit code to switch from the PSTATE.SM state in OLD_MODE to the PSTATE.SM
+   state in NEW_MODE.  This is known to involve either an SMSTART SM or
+   an SMSTOP SM.  */
+
+static void
+aarch64_switch_pstate_sm (aarch64_feature_flags old_mode,
+                          aarch64_feature_flags new_mode)
+{
+  old_mode &= AARCH64_FL_SM_STATE;
+  new_mode &= AARCH64_FL_SM_STATE;
+  gcc_assert (old_mode != new_mode);
+
+  if ((new_mode & AARCH64_FL_SM_ON)
+      || (new_mode == 0 && (old_mode & AARCH64_FL_SM_OFF)))
+    emit_insn (gen_aarch64_smstart_sm ());
+  else
+    emit_insn (gen_aarch64_smstop_sm ());
+}
+
+/* As a side-effect, SMSTART SM and SMSTOP SM clobber the contents of all
+   FP and predicate registers.  This class emits code to preserve any
+   necessary registers around the mode switch.
+
+   The class uses four approaches to saving and restoring contents, enumerated
+   by group_type:
+
+   - GPR: save and restore the contents of FP registers using GPRs.
+     This is used if the FP register contains no more than 64 significant
+     bits.  The registers used are FIRST_GPR onwards.
+
+   - MEM_128: save and restore 128-bit SIMD registers using memory.
+
+   - MEM_SVE_PRED: save and restore full SVE predicate registers using memory.
+
+   - MEM_SVE_DATA: save and restore full SVE vector registers using memory.
+
+   The save slots within each memory group are consecutive, with the
+   MEM_SVE_PRED slots occupying a region below the MEM_SVE_DATA slots.
+
+   There will only be two mode switches for each use of SME, so they should
+   not be particularly performance-sensitive.  It's also rare for SIMD, SVE
+   or predicate registers to be live across mode switches.  We therefore
+   don't preallocate the save slots but instead allocate them locally on
+   demand.  This makes the code emitted by the class self-contained.  */
+
+class aarch64_sme_mode_switch_regs
+{
+public:
+  static const unsigned int FIRST_GPR = R10_REGNUM;
+
+  void add_reg (machine_mode, unsigned int);
+  void add_call_args (rtx_call_insn *);
+  void add_call_result (rtx_call_insn *);
+
+  void emit_prologue ();
+  void emit_epilogue ();
+
+  /* The number of GPRs needed to save FP registers, starting from
+     FIRST_GPR.  */
+  unsigned int num_gprs () { return m_group_count[GPR]; }
+
+private:
+  enum sequence { PROLOGUE, EPILOGUE };
+  enum group_type { GPR, MEM_128, MEM_SVE_PRED, MEM_SVE_DATA, NUM_GROUPS };
+
+  /* Information about the save location for one FP, SIMD, SVE data, or
+     SVE predicate register.  */
+  struct save_location {
+    /* The register to be saved.  */
+    rtx reg;
+
+    /* Which group the save location belongs to.  */
+    group_type group;
+
+    /* A zero-based index of the register within the group.  */
+    unsigned int index;
+  };
+
+  unsigned int sve_data_headroom ();
+  rtx get_slot_mem (machine_mode, poly_int64);
+  void emit_stack_adjust (sequence, poly_int64);
+  void emit_mem_move (sequence, const save_location &, poly_int64);
+
+  void emit_gpr_moves (sequence);
+  void emit_mem_128_moves (sequence);
+  void emit_sve_sp_adjust (sequence);
+  void emit_sve_pred_moves (sequence);
+  void emit_sve_data_moves (sequence);
+
+  /* All save locations, in no particular order.  */
+  auto_vec<save_location, 12> m_save_locations;
+
+  /* The number of registers in each group.  */
+  unsigned int m_group_count[NUM_GROUPS] = {};
+};
+
+/* Record that (reg:MODE REGNO) needs to be preserved around the mode
+   switch.  */
+
+void
+aarch64_sme_mode_switch_regs::add_reg (machine_mode mode, unsigned int regno)
+{
+  if (!FP_REGNUM_P (regno) && !PR_REGNUM_P (regno))
+    return;
+
+  unsigned int end_regno = end_hard_regno (mode, regno);
+  unsigned int vec_flags = aarch64_classify_vector_mode (mode);
+  gcc_assert ((vec_flags & VEC_STRUCT) || end_regno == regno + 1);
+  for (; regno < end_regno; regno++)
+    {
+      machine_mode submode = mode;
+      if (vec_flags & VEC_STRUCT)
+        {
+          if (vec_flags & VEC_SVE_DATA)
+            submode = SVE_BYTE_MODE;
+          else if (vec_flags & VEC_PARTIAL)
+            submode = V8QImode;
+          else
+            submode = V16QImode;
+        }
+      save_location loc;
+      loc.reg = gen_rtx_REG (submode, regno);
+      if (vec_flags == VEC_SVE_PRED)
+        {
+          gcc_assert (PR_REGNUM_P (regno));
+          loc.group = MEM_SVE_PRED;
+        }
+      else
+        {
+          gcc_assert (FP_REGNUM_P (regno));
+          if (known_le (GET_MODE_SIZE (submode), 8))
+            loc.group = GPR;
+          else if (known_eq (GET_MODE_SIZE (submode), 16))
+            loc.group = MEM_128;
+          else
+            loc.group = MEM_SVE_DATA;
+        }
+      loc.index = m_group_count[loc.group]++;
+      m_save_locations.quick_push (loc);
+    }
+}
+
+/* Record that the arguments to CALL_INSN need to be preserved around
+   the mode switch.  */
+
+void
+aarch64_sme_mode_switch_regs::add_call_args (rtx_call_insn *call_insn)
+{
+  for (rtx node = CALL_INSN_FUNCTION_USAGE (call_insn);
+       node; node = XEXP (node, 1))
+    {
+      rtx item = XEXP (node, 0);
+      if (GET_CODE (item) != USE)
+        continue;
+      item = XEXP (item, 0);
+      if (!REG_P (item))
+        continue;
+      add_reg (GET_MODE (item), REGNO (item));
+    }
+}
+
+/* Record that the return value from CALL_INSN (if any) needs to be
+   preserved around the mode switch.  */
+
+void
+aarch64_sme_mode_switch_regs::add_call_result (rtx_call_insn *call_insn)
+{
+  rtx pat = PATTERN (call_insn);
+  gcc_assert (GET_CODE (pat) == PARALLEL);
+  pat = XVECEXP (pat, 0, 0);
+  if (GET_CODE (pat) == CALL)
+    return;
+  rtx dest = SET_DEST (pat);
+  if (GET_CODE (dest) == PARALLEL)
+    for (int i = 0; i < XVECLEN (dest, 0); ++i)
+      {
+        rtx x = XVECEXP (dest, 0, i);
+        gcc_assert (GET_CODE (x) == EXPR_LIST);
+        rtx reg = XEXP (x, 0);
+        add_reg (GET_MODE (reg), REGNO (reg));
+      }
+  else
+    add_reg (GET_MODE (dest), REGNO (dest));
+}
+
+/* Emit code to save registers before the mode switch.  */
+
+void
+aarch64_sme_mode_switch_regs::emit_prologue ()
+{
+  emit_sve_sp_adjust (PROLOGUE);
+  emit_sve_pred_moves (PROLOGUE);
+  emit_sve_data_moves (PROLOGUE);
+  emit_mem_128_moves (PROLOGUE);
+  emit_gpr_moves (PROLOGUE);
+}
+
+/* Emit code to restore registers after the mode switch.  */
+
+void
+aarch64_sme_mode_switch_regs::emit_epilogue ()
+{
+  emit_gpr_moves (EPILOGUE);
+  emit_mem_128_moves (EPILOGUE);
+  emit_sve_pred_moves (EPILOGUE);
+  emit_sve_data_moves (EPILOGUE);
+  emit_sve_sp_adjust (EPILOGUE);
+}
+
+/* The SVE predicate registers are stored below the SVE data registers,
+   with the predicate save area being padded to a data-register-sized
+   boundary.  Return the size of this padded area as a whole number
+   of data register slots.  */
+
+unsigned int
+aarch64_sme_mode_switch_regs::sve_data_headroom ()
+{
+  return CEIL (m_group_count[MEM_SVE_PRED], 8);
+}
+
+/* Return a memory reference of mode MODE to OFFSET bytes from the
+   stack pointer.  */
+
+rtx
+aarch64_sme_mode_switch_regs::get_slot_mem (machine_mode mode,
+                                            poly_int64 offset)
+{
+  rtx addr = plus_constant (Pmode, stack_pointer_rtx, offset);
+  return gen_rtx_MEM (mode, addr);
+}
+
+/* Allocate or deallocate SIZE bytes of stack space: SEQ decides which.  */
+
+void
+aarch64_sme_mode_switch_regs::emit_stack_adjust (sequence seq,
+                                                 poly_int64 size)
+{
+  if (seq == PROLOGUE)
+    size = -size;
+  emit_insn (gen_rtx_SET (stack_pointer_rtx,
+                          plus_constant (Pmode, stack_pointer_rtx, size)));
+}
+
+/* Save or restore the register in LOC, whose slot is OFFSET bytes from
+   the stack pointer.  SEQ chooses between saving and restoring.  */
+
+void
+aarch64_sme_mode_switch_regs::emit_mem_move (sequence seq,
+                                             const save_location &loc,
+                                             poly_int64 offset)
+{
+  rtx mem = get_slot_mem (GET_MODE (loc.reg), offset);
+  if (seq == PROLOGUE)
+    emit_move_insn (mem, loc.reg);
+  else
+    emit_move_insn (loc.reg, mem);
+}
+
+/* Emit instructions to save or restore the GPR group.  SEQ chooses between
+   saving and restoring.  */
+
+void
+aarch64_sme_mode_switch_regs::emit_gpr_moves (sequence seq)
+{
+  for (auto &loc : m_save_locations)
+    if (loc.group == GPR)
+      {
+        gcc_assert (loc.index < 8);
+        rtx gpr = gen_rtx_REG (GET_MODE (loc.reg), FIRST_GPR + loc.index);
+        if (seq == PROLOGUE)
+          emit_move_insn (gpr, loc.reg);
+        else
+          emit_move_insn (loc.reg, gpr);
+      }
+}
+
+/* Emit instructions to save or restore the MEM_128 group.  SEQ chooses
+   between saving and restoring.  */
+
+void
+aarch64_sme_mode_switch_regs::emit_mem_128_moves (sequence seq)
+{
+  HOST_WIDE_INT count = m_group_count[MEM_128];
+  if (count == 0)
+    return;
+
+  auto sp = stack_pointer_rtx;
+  auto sp_adjust = (seq == PROLOGUE ? -count : count) * 16;
+
+  /* Pick a common mode that supports LDR & STR with pre/post-modification
+     and LDP & STP with pre/post-modification.  */
+  auto mode = TFmode;
+
+  /* An instruction pattern that should be emitted at the end.  */
+  rtx last_pat = NULL_RTX;
+
+  /* A previous MEM_128 location that hasn't been handled yet.  */
+  save_location *prev_loc = nullptr;
+
+  /* Look for LDP/STPs and record any leftover LDR/STR in PREV_LOC.  */
+  for (auto &loc : m_save_locations)
+    if (loc.group == MEM_128)
+      {
+        if (!prev_loc)
+          {
+            prev_loc = &loc;
+            continue;
+          }
+        gcc_assert (loc.index == prev_loc->index + 1);
+
+        /* The offset of the base of the save area from the current
+           stack pointer.  */
+        HOST_WIDE_INT bias = 0;
+        if (prev_loc->index == 0 && seq == PROLOGUE)
+          bias = sp_adjust;
+
+        /* Get the two sets in the LDP/STP.  */
+        rtx ops[] = {
+          gen_rtx_REG (mode, REGNO (prev_loc->reg)),
+          get_slot_mem (mode, prev_loc->index * 16 + bias),
+          gen_rtx_REG (mode, REGNO (loc.reg)),
+          get_slot_mem (mode, loc.index * 16 + bias)
+        };
+        unsigned int lhs = (seq == PROLOGUE);
+        rtx set1 = gen_rtx_SET (ops[lhs], ops[1 - lhs]);
+        rtx set2 = gen_rtx_SET (ops[lhs + 2], ops[3 - lhs]);
+
+        /* Combine the sets with any stack allocation/deallocation.  */
+        rtvec vec;
+        if (prev_loc->index == 0)
+          {
+            rtx plus_sp = plus_constant (Pmode, sp, sp_adjust);
+            vec = gen_rtvec (3, gen_rtx_SET (sp, plus_sp), set1, set2);
+          }
+        else
+          vec = gen_rtvec (2, set1, set2);
+        rtx pat = gen_rtx_PARALLEL (VOIDmode, vec);
+
+        /* Queue a deallocation to the end, otherwise emit the
+           instruction now.  */
+        if (seq == EPILOGUE && prev_loc->index == 0)
+          last_pat = pat;
+        else
+          emit_insn (pat);
+        prev_loc = nullptr;
+      }
+
+  /* Handle any leftover LDR/STR.  */
+  if (prev_loc)
+    {
+      rtx reg = gen_rtx_REG (mode, REGNO (prev_loc->reg));
+      rtx addr;
+      if (prev_loc->index != 0)
+        addr = plus_constant (Pmode, sp, prev_loc->index * 16);
+      else if (seq == PROLOGUE)
+        {
+          rtx allocate = plus_constant (Pmode, sp, -count * 16);
+          addr = gen_rtx_PRE_MODIFY (Pmode, sp, allocate);
+        }
+      else
+        {
+          rtx deallocate = plus_constant (Pmode, sp, count * 16);
+          addr = gen_rtx_POST_MODIFY (Pmode, sp, deallocate);
+        }
+      rtx mem = gen_rtx_MEM (mode, addr);
+      if (seq == PROLOGUE)
+        emit_move_insn (mem, reg);
+      else
+        emit_move_insn (reg, mem);
+    }
+
+  if (last_pat)
+    emit_insn (last_pat);
+}
+
+/* Allocate or deallocate the stack space needed by the SVE groups.
+ SEQ chooses between allocating and deallocating. */ + +void +aarch64_sme_mode_switch_regs::emit_sve_sp_adjust (sequence seq) +{ + if (unsigned int count = m_group_count[MEM_SVE_DATA] + sve_data_headroom ()) + emit_stack_adjust (seq, count * BYTES_PER_SVE_VECTOR); +} + +/* Save or restore the MEM_SVE_DATA group. SEQ chooses between saving + and restoring. */ + +void +aarch64_sme_mode_switch_regs::emit_sve_data_moves (sequence seq) +{ + for (auto &loc : m_save_locations) + if (loc.group == MEM_SVE_DATA) + { + auto index = loc.index + sve_data_headroom (); + emit_mem_move (seq, loc, index * BYTES_PER_SVE_VECTOR); + } +} + +/* Save or restore the MEM_SVE_PRED group. SEQ chooses between saving + and restoring. */ + +void +aarch64_sme_mode_switch_regs::emit_sve_pred_moves (sequence seq) +{ + for (auto &loc : m_save_locations) + if (loc.group == MEM_SVE_PRED) + emit_mem_move (seq, loc, loc.index * BYTES_PER_SVE_PRED); +} + /* Set DEST to (vec_series BASE STEP). */ static void @@ -5806,6 +6278,40 @@ on_stack: return; } +/* Add the current argument register to the set of those that need + to be saved and restored around a change to PSTATE.SM. */ + +static void +aarch64_record_sme_mode_switch_args (CUMULATIVE_ARGS *pcum) +{ + subrtx_var_iterator::array_type array; + FOR_EACH_SUBRTX_VAR (iter, array, pcum->aapcs_reg, NONCONST) + { + rtx x = *iter; + if (REG_P (x) && (FP_REGNUM_P (REGNO (x)) || PR_REGNUM_P (REGNO (x)))) + { + unsigned int i = pcum->num_sme_mode_switch_args++; + gcc_assert (i < ARRAY_SIZE (pcum->sme_mode_switch_args)); + pcum->sme_mode_switch_args[i] = x; + } + } +} + +/* Return a parallel that contains all the registers that need to be + saved around a change to PSTATE.SM. Return const0_rtx if there is + no such mode switch, or if no registers need to be saved. 
*/ + +static rtx +aarch64_finish_sme_mode_switch_args (CUMULATIVE_ARGS *pcum) +{ + if (!pcum->num_sme_mode_switch_args) + return const0_rtx; + + auto argvec = gen_rtvec_v (pcum->num_sme_mode_switch_args, + pcum->sme_mode_switch_args); + return gen_rtx_PARALLEL (VOIDmode, argvec); +} + /* Implement TARGET_FUNCTION_ARG. */ static rtx @@ -5817,7 +6323,13 @@ aarch64_function_arg (cumulative_args_t pcum_v, const function_arg_info &arg) || pcum->pcs_variant == ARM_PCS_SVE); if (arg.end_marker_p ()) - return aarch64_gen_callee_cookie (pcum->isa_mode, pcum->pcs_variant); + { + rtx abi_cookie = aarch64_gen_callee_cookie (pcum->isa_mode, + pcum->pcs_variant); + rtx sme_mode_switch_args = aarch64_finish_sme_mode_switch_args (pcum); + return gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, abi_cookie, + sme_mode_switch_args)); + } aarch64_layout_arg (pcum_v, arg); return pcum->aapcs_reg; @@ -5852,6 +6364,7 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, pcum->aapcs_stack_words = 0; pcum->aapcs_stack_size = 0; pcum->silent_p = silent_p; + pcum->num_sme_mode_switch_args = 0; if (!silent_p && !TARGET_FLOAT @@ -5892,6 +6405,10 @@ aarch64_function_arg_advance (cumulative_args_t pcum_v, aarch64_layout_arg (pcum_v, arg); gcc_assert ((pcum->aapcs_reg != NULL_RTX) != (pcum->aapcs_stack_words != 0)); + if (pcum->aapcs_reg + && aarch64_call_switches_pstate_sm (pcum->isa_mode)) + aarch64_record_sme_mode_switch_args (pcum); + pcum->aapcs_arg_processed = false; pcum->aapcs_ncrn = pcum->aapcs_nextncrn; pcum->aapcs_nvrn = pcum->aapcs_nextnvrn; @@ -6345,6 +6862,30 @@ aarch64_save_regs_above_locals_p () return crtl->stack_protect_guard; } +/* Return true if the current function needs to record the incoming + value of PSTATE.SM. */ +static bool +aarch64_need_old_pstate_sm () +{ + /* Exit early if the incoming value of PSTATE.SM is known at + compile time. 
*/ + if (aarch64_cfun_incoming_pstate_sm () != 0) + return false; + + if (cfun->machine->call_switches_pstate_sm) + for (auto insn = get_insns (); insn; insn = NEXT_INSN (insn)) + if (auto *call = dyn_cast<rtx_call_insn *> (insn)) + if (!SIBLING_CALL_P (call)) + { + /* Return true if there is a call to a non-streaming-compatible + function. */ + auto callee_isa_mode = aarch64_insn_callee_isa_mode (call); + if (aarch64_call_switches_pstate_sm (callee_isa_mode)) + return true; + } + return false; +} + /* Mark the registers that need to be saved by the callee and calculate the size of the callee-saved registers area and frame record (both FP and LR may be omitted). */ @@ -6378,6 +6919,7 @@ aarch64_layout_frame (void) /* First mark all the registers that really need to be saved... */ for (regno = 0; regno <= LAST_SAVED_REGNUM; regno++) frame.reg_offset[regno] = SLOT_NOT_REQUIRED; + frame.old_svcr_offset = SLOT_NOT_REQUIRED; /* ... that includes the eh data registers (if needed)... */ if (crtl->calls_eh_return) @@ -6530,6 +7072,21 @@ aarch64_layout_frame (void) if (known_eq (frame.reg_offset[regno], SLOT_REQUIRED)) allocate_gpr_slot (regno); + if (aarch64_need_old_pstate_sm ()) + { + frame.old_svcr_offset = offset; + offset += UNITS_PER_WORD; + } + + /* If the current function changes the SVE vector length, ensure that the + old value of the DWARF VG register is saved and available in the CFI, + so that outer frames with VL-sized offsets can be processed correctly.
*/ + if (cfun->machine->call_switches_pstate_sm) + { + frame.reg_offset[VG_REGNUM] = offset; + offset += UNITS_PER_WORD; + } + poly_int64 max_int_offset = offset; offset = aligned_upper_bound (offset, STACK_BOUNDARY / BITS_PER_UNIT); bool has_align_gap = maybe_ne (offset, max_int_offset); @@ -6567,8 +7124,6 @@ aarch64_layout_frame (void) if (push_regs.size () > 1) frame.wb_push_candidate2 = push_regs[1]; } - else - gcc_assert (known_eq (saved_regs_size, below_hard_fp_saved_regs_size)); /* With stack-clash, a register must be saved in non-leaf functions. The saving of the bottommost register counts as an implicit probe, @@ -6676,7 +7231,8 @@ aarch64_layout_frame (void) frame.initial_adjust = frame.frame_size - frame.bytes_below_saved_regs; frame.final_adjust = frame.bytes_below_saved_regs; } - else if (frame.bytes_above_hard_fp.is_constant (&const_above_fp) + else if (frame.wb_push_candidate1 != INVALID_REGNUM + && frame.bytes_above_hard_fp.is_constant (&const_above_fp) && const_above_fp < max_push_offset) { /* Frame with large area below the saved registers, or with SVE saves, @@ -7100,7 +7656,13 @@ aarch64_save_callee_saves (poly_int64 bytes_below_sp, machine_mode mode = aarch64_reg_save_mode (regno); rtx reg = gen_rtx_REG (mode, regno); + rtx move_src = reg; offset = frame.reg_offset[regno] - bytes_below_sp; + if (regno == VG_REGNUM) + { + move_src = gen_rtx_REG (DImode, IP0_REGNUM); + emit_move_insn (move_src, gen_int_mode (aarch64_sve_vg, DImode)); + } rtx base_rtx = stack_pointer_rtx; poly_int64 sp_offset = offset; @@ -7108,7 +7670,7 @@ aarch64_save_callee_saves (poly_int64 bytes_below_sp, if (mode == VNx2DImode && BYTES_BIG_ENDIAN) aarch64_adjust_sve_callee_save_base (mode, base_rtx, anchor_reg, offset, ptrue); - else if (GP_REGNUM_P (regno) + else if (GP_REGNUM_P (REGNO (reg)) && (!offset.is_constant (&const_offset) || const_offset >= 512)) { poly_int64 fp_offset = frame.bytes_below_hard_fp - bytes_below_sp; @@ -7131,6 +7693,7 @@ aarch64_save_callee_saves 
(poly_int64 bytes_below_sp, unsigned int regno2; if (!aarch64_sve_mode_p (mode) + && reg == move_src && i + 1 < regs.size () && (regno2 = regs[i + 1], !skip_save_p (regno2)) && known_eq (GET_MODE_SIZE (mode), @@ -7162,17 +7725,24 @@ aarch64_save_callee_saves (poly_int64 bytes_below_sp, } else if (mode == VNx2DImode && BYTES_BIG_ENDIAN) { - insn = emit_insn (gen_aarch64_pred_mov (mode, mem, ptrue, reg)); + insn = emit_insn (gen_aarch64_pred_mov (mode, mem, ptrue, move_src)); need_cfa_note_p = true; } else if (aarch64_sve_mode_p (mode)) - insn = emit_insn (gen_rtx_SET (mem, reg)); + insn = emit_insn (gen_rtx_SET (mem, move_src)); else - insn = emit_move_insn (mem, reg); + insn = emit_move_insn (mem, move_src); RTX_FRAME_RELATED_P (insn) = frame_related_p; if (frame_related_p && need_cfa_note_p) aarch64_add_cfa_expression (insn, reg, stack_pointer_rtx, sp_offset); + else if (frame_related_p && move_src != reg) + add_reg_note (insn, REG_FRAME_RELATED_EXPR, gen_rtx_SET (mem, reg)); + + /* Emit a fake instruction to indicate that the VG save slot has + been initialized. */ + if (regno == VG_REGNUM) + emit_insn (gen_aarch64_old_vg_saved (move_src, mem)); } } @@ -7395,6 +7965,10 @@ aarch64_get_separate_components (void) bitmap_clear_bit (components, frame.hard_fp_save_and_probe); } + /* The VG save sequence needs a temporary GPR. Punt for now on trying + to find one. */ + bitmap_clear_bit (components, VG_REGNUM); + return components; } @@ -7890,6 +8464,47 @@ aarch64_epilogue_uses (int regno) return 0; } +/* The current function's frame has a save slot for the incoming state + of SVCR. Return a legitimate memory for the slot, based on the hard + frame pointer. */ + +static rtx +aarch64_old_svcr_mem () +{ + gcc_assert (frame_pointer_needed + && known_ge (cfun->machine->frame.old_svcr_offset, 0)); + rtx base = hard_frame_pointer_rtx; + poly_int64 offset = (0 + /* hard fp -> bottom of frame. */ + - cfun->machine->frame.bytes_below_hard_fp + /* bottom of frame -> save slot. 
*/ + + cfun->machine->frame.old_svcr_offset); + return gen_frame_mem (DImode, plus_constant (Pmode, base, offset)); +} + +/* The current function's frame has a save slot for the incoming state + of SVCR. Load the slot into register REGNO and return the register. */ + +static rtx +aarch64_read_old_svcr (unsigned int regno) +{ + rtx svcr = gen_rtx_REG (DImode, regno); + emit_move_insn (svcr, aarch64_old_svcr_mem ()); + return svcr; +} + +/* Like the rtx version of aarch64_guard_switch_pstate_sm, but first + load the incoming value of SVCR from its save slot into temporary + register REGNO. */ + +static rtx_insn * +aarch64_guard_switch_pstate_sm (unsigned int regno, + aarch64_feature_flags local_mode) +{ + rtx old_svcr = aarch64_read_old_svcr (regno); + return aarch64_guard_switch_pstate_sm (old_svcr, local_mode); +} + /* AArch64 stack frames generated by this compiler look like: +-------------------------------+ @@ -8104,6 +8719,12 @@ aarch64_expand_prologue (void) aarch64_save_callee_saves (bytes_below_sp, frame.saved_gprs, true, emit_frame_chain); + if (maybe_ge (frame.reg_offset[VG_REGNUM], 0)) + { + unsigned int saved_regs[] = { VG_REGNUM }; + aarch64_save_callee_saves (bytes_below_sp, saved_regs, true, + emit_frame_chain); + } if (maybe_ne (sve_callee_adjust, 0)) { gcc_assert (!flag_stack_clash_protection @@ -8125,6 +8746,40 @@ aarch64_expand_prologue (void) !frame_pointer_needed, true); if (emit_frame_chain && maybe_ne (final_adjust, 0)) aarch64_emit_stack_tie (hard_frame_pointer_rtx); + + /* Save the incoming value of PSTATE.SM, if required. 
*/ + if (known_ge (frame.old_svcr_offset, 0)) + { + rtx mem = aarch64_old_svcr_mem (); + MEM_VOLATILE_P (mem) = 1; + if (TARGET_SME) + { + rtx reg = gen_rtx_REG (DImode, IP0_REGNUM); + emit_insn (gen_aarch64_read_svcr (reg)); + emit_move_insn (mem, reg); + } + else + { + rtx old_r0 = NULL_RTX, old_r1 = NULL_RTX; + auto &args = crtl->args.info; + if (args.aapcs_ncrn > 0) + { + old_r0 = gen_rtx_REG (DImode, PROBE_STACK_FIRST_REGNUM); + emit_move_insn (old_r0, gen_rtx_REG (DImode, R0_REGNUM)); + } + if (args.aapcs_ncrn > 1) + { + old_r1 = gen_rtx_REG (DImode, PROBE_STACK_SECOND_REGNUM); + emit_move_insn (old_r1, gen_rtx_REG (DImode, R1_REGNUM)); + } + emit_insn (gen_aarch64_get_sme_state ()); + emit_move_insn (mem, gen_rtx_REG (DImode, R0_REGNUM)); + if (old_r0) + emit_move_insn (gen_rtx_REG (DImode, R0_REGNUM), old_r0); + if (old_r1) + emit_move_insn (gen_rtx_REG (DImode, R1_REGNUM), old_r1); + } + } } /* Return TRUE if we can use a simple_return insn. @@ -9349,17 +10004,33 @@ aarch64_start_call_args (cumulative_args_t ca_v) RESULT is the register in which the result is returned. It's NULL for "call" and "sibcall". MEM is the location of the function call. - CALLEE_ABI is a const_int that gives the arm_pcs of the callee. + COOKIE is either: + - a const_int that gives the argument to the call's UNSPEC_CALLEE_ABI. + - a PARALLEL that contains such a const_int as its first element. + The second element is a PARALLEL that lists all the argument + registers that need to be saved and restored around a change + in PSTATE.SM, or const0_rtx if no such switch is needed. SIBCALL indicates whether this function call is normal call or sibling call. It will generate different pattern accordingly. 
*/ void -aarch64_expand_call (rtx result, rtx mem, rtx callee_abi, bool sibcall) +aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall) { rtx call, callee, tmp; rtvec vec; machine_mode mode; + rtx callee_abi = cookie; + rtx sme_mode_switch_args = const0_rtx; + if (GET_CODE (cookie) == PARALLEL) + { + callee_abi = XVECEXP (cookie, 0, 0); + sme_mode_switch_args = XVECEXP (cookie, 0, 1); + } + + gcc_assert (CONST_INT_P (callee_abi)); + auto callee_isa_mode = aarch64_callee_isa_mode (callee_abi); + gcc_assert (MEM_P (mem)); callee = XEXP (mem, 0); mode = GET_MODE (callee); @@ -9384,26 +10055,75 @@ aarch64_expand_call (rtx result, rtx mem, rtx callee_abi, bool sibcall) else tmp = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, LR_REGNUM)); - gcc_assert (CONST_INT_P (callee_abi)); callee_abi = gen_rtx_UNSPEC (DImode, gen_rtvec (1, callee_abi), UNSPEC_CALLEE_ABI); vec = gen_rtvec (3, call, callee_abi, tmp); call = gen_rtx_PARALLEL (VOIDmode, vec); - aarch64_emit_call_insn (call); + auto call_insn = aarch64_emit_call_insn (call); + + /* Check whether the call requires a change to PSTATE.SM. We can't + emit the instructions to change PSTATE.SM yet, since they involve + a change in vector length and a change in instruction set, which + cannot be represented in RTL. + + For now, just record which registers will be clobbered and used + by the changes to PSTATE.SM. 
*/ + if (!sibcall && aarch64_call_switches_pstate_sm (callee_isa_mode)) + { + aarch64_sme_mode_switch_regs args_switch; + if (sme_mode_switch_args != const0_rtx) + { + unsigned int num_args = XVECLEN (sme_mode_switch_args, 0); + for (unsigned int i = 0; i < num_args; ++i) + { + rtx x = XVECEXP (sme_mode_switch_args, 0, i); + args_switch.add_reg (GET_MODE (x), REGNO (x)); + } + } + + aarch64_sme_mode_switch_regs result_switch; + if (result) + result_switch.add_call_result (call_insn); + + unsigned int num_gprs = MAX (args_switch.num_gprs (), + result_switch.num_gprs ()); + for (unsigned int i = 0; i < num_gprs; ++i) + clobber_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (DImode, args_switch.FIRST_GPR + i)); + + for (int regno = V0_REGNUM; regno < V0_REGNUM + 32; regno += 4) + clobber_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (V4x16QImode, regno)); + + for (int regno = P0_REGNUM; regno < P0_REGNUM + 16; regno += 1) + clobber_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (VNx16BImode, regno)); + + /* Ensure that the VG save slot has been initialized. Also emit + an instruction to model the effect of the temporary clobber + of VG, so that the prologue/epilogue pass sees the need to + save the old value. */ + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (DImode, VG_REGNUM)); + emit_insn_before (gen_aarch64_update_vg (), call_insn); + + cfun->machine->call_switches_pstate_sm = true; + } } /* Emit call insn with PAT and do aarch64-specific handling. 
*/ -void +rtx_call_insn * aarch64_emit_call_insn (rtx pat) { - rtx insn = emit_call_insn (pat); + auto insn = emit_call_insn (pat); rtx *fusage = &CALL_INSN_FUNCTION_USAGE (insn); clobber_reg (fusage, gen_rtx_REG (word_mode, IP0_REGNUM)); clobber_reg (fusage, gen_rtx_REG (word_mode, IP1_REGNUM)); + return as_a<rtx_call_insn *> (insn); } machine_mode @@ -10815,6 +11535,16 @@ aarch64_secondary_memory_needed (machine_mode mode, reg_class_t class1, return false; } +/* Implement TARGET_FRAME_POINTER_REQUIRED. */ + +static bool +aarch64_frame_pointer_required () +{ + /* If the function needs to record the incoming value of PSTATE.SM, + make sure that the slot is accessible from the frame pointer. */ + return aarch64_need_old_pstate_sm (); +} + static bool aarch64_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to) { @@ -18427,7 +19157,8 @@ aarch64_conditional_register_usage (void) call_used_regs[i] = 1; } - /* Only allow the FFR and FFRT to be accessed via special patterns. */ + /* Only allow these registers to be accessed via special patterns. */ + CLEAR_HARD_REG_BIT (operand_reg_set, VG_REGNUM); CLEAR_HARD_REG_BIT (operand_reg_set, FFR_REGNUM); CLEAR_HARD_REG_BIT (operand_reg_set, FFRT_REGNUM); @@ -26017,6 +26748,123 @@ aarch64_pars_overlap_p (rtx par1, rtx par2) return false; } +/* If CALL involves a change in PSTATE.SM, emit the instructions needed + to switch to the new mode and the instructions needed to restore the + original mode. Return true if something changed. */ +static bool +aarch64_switch_pstate_sm_for_call (rtx_call_insn *call) +{ + /* Mode switches for sibling calls are handled via the epilogue. */ + if (SIBLING_CALL_P (call)) + return false; + + auto callee_isa_mode = aarch64_insn_callee_isa_mode (call); + if (!aarch64_call_switches_pstate_sm (callee_isa_mode)) + return false; + + /* Switch mode before the call, preserving any argument registers + across the switch.
*/ + start_sequence (); + rtx_insn *args_guard_label = nullptr; + if (TARGET_STREAMING_COMPATIBLE) + args_guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, + callee_isa_mode); + aarch64_sme_mode_switch_regs args_switch; + args_switch.add_call_args (call); + args_switch.emit_prologue (); + aarch64_switch_pstate_sm (AARCH64_ISA_MODE, callee_isa_mode); + args_switch.emit_epilogue (); + if (args_guard_label) + emit_label (args_guard_label); + auto args_seq = get_insns (); + end_sequence (); + emit_insn_before (args_seq, call); + + if (find_reg_note (call, REG_NORETURN, NULL_RTX)) + return true; + + /* Switch mode after the call, preserving any return registers across + the switch. */ + start_sequence (); + rtx_insn *return_guard_label = nullptr; + if (TARGET_STREAMING_COMPATIBLE) + return_guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, + callee_isa_mode); + aarch64_sme_mode_switch_regs return_switch; + return_switch.add_call_result (call); + return_switch.emit_prologue (); + aarch64_switch_pstate_sm (callee_isa_mode, AARCH64_ISA_MODE); + return_switch.emit_epilogue (); + if (return_guard_label) + emit_label (return_guard_label); + auto result_seq = get_insns (); + end_sequence (); + emit_insn_after (result_seq, call); + return true; +} + +namespace { + +const pass_data pass_data_switch_pstate_sm = +{ + RTL_PASS, // type + "smstarts", // name + OPTGROUP_NONE, // optinfo_flags + TV_NONE, // tv_id + 0, // properties_required + 0, // properties_provided + 0, // properties_destroyed + 0, // todo_flags_start + TODO_df_finish, // todo_flags_finish +}; + +class pass_switch_pstate_sm : public rtl_opt_pass +{ +public: + pass_switch_pstate_sm (gcc::context *ctxt) + : rtl_opt_pass (pass_data_switch_pstate_sm, ctxt) + {} + + // opt_pass methods: + bool gate (function *) override final; + unsigned int execute (function *) override final; +}; + +bool +pass_switch_pstate_sm::gate (function *) +{ + return cfun->machine->call_switches_pstate_sm; +} + +/* Emit any 
instructions needed to switch PSTATE.SM. */ +unsigned int +pass_switch_pstate_sm::execute (function *fn) +{ + basic_block bb; + + auto_sbitmap blocks (last_basic_block_for_fn (cfun)); + bitmap_clear (blocks); + FOR_EACH_BB_FN (bb, fn) + { + rtx_insn *insn; + FOR_BB_INSNS (bb, insn) + if (auto *call = dyn_cast<rtx_call_insn *> (insn)) + if (aarch64_switch_pstate_sm_for_call (call)) + bitmap_set_bit (blocks, bb->index); + } + find_many_sub_basic_blocks (blocks); + clear_aux_for_blocks (); + return 0; +} + +} + +rtl_opt_pass * +make_pass_switch_pstate_sm (gcc::context *ctxt) +{ + return new pass_switch_pstate_sm (ctxt); +} + /* Target-specific selftests. */ #if CHECKING_P @@ -26204,6 +27052,9 @@ aarch64_run_selftests (void) #undef TARGET_CALLEE_COPIES #define TARGET_CALLEE_COPIES hook_bool_CUMULATIVE_ARGS_arg_info_false +#undef TARGET_FRAME_POINTER_REQUIRED +#define TARGET_FRAME_POINTER_REQUIRED aarch64_frame_pointer_required + #undef TARGET_CAN_ELIMINATE #define TARGET_CAN_ELIMINATE aarch64_can_eliminate diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index a88d35000df..182f45005f9 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -256,6 +256,10 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* The current function is a normal non-streaming function. */ #define TARGET_NON_STREAMING (AARCH64_ISA_SM_OFF) +/* The current function has a streaming-compatible body. */ +#define TARGET_STREAMING_COMPATIBLE \ + ((aarch64_isa_flags & AARCH64_FL_SM_STATE) == 0) + /* Crypto is an optional extension to AdvSIMD.
*/ #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO) @@ -477,7 +481,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; 0, 0, 0, 0, 0, 0, 0, 0, /* V8 - V15 */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* V16 - V23 */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* V24 - V31 */ \ - 1, 1, 1, 1, /* SFP, AP, CC, VG */ \ + 1, 1, 1, 0, /* SFP, AP, CC, VG */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* P0 - P7 */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* P8 - P15 */ \ 1, 1 /* FFR and FFRT */ \ @@ -816,6 +820,13 @@ struct GTY (()) aarch64_frame vec *saved_fprs; vec *saved_prs; + /* The offset from the base of the frame of a 64-bit slot whose low + bit contains the incoming value of PSTATE.SM. This slot must be + within reach of the hard frame pointer. + + The offset is -1 if such a slot isn't needed. */ + poly_int64 old_svcr_offset; + /* The number of extra stack bytes taken up by register varargs. This area is allocated by the callee at the very top of the frame. This value is rounded up to a multiple of @@ -924,6 +935,12 @@ typedef struct GTY (()) machine_function /* One entry for each general purpose register. */ rtx call_via[SP_REGNUM]; bool label_is_assembled; + + /* True if we've expanded at least one call to a function that changes + PSTATE.SM. This should only be used for saving compile time: false + guarantees that no such mode switch exists. */ + bool call_switches_pstate_sm; + /* A set of all decls that have been passed to a vld1 intrinsic in the current function. This is used to help guide the vector cost model. */ hash_set *vector_load_decls; @@ -992,6 +1009,12 @@ typedef struct stack arg area so far. */ bool silent_p; /* True if we should act silently, rather than raise an error for invalid calls. */ + + /* A list of registers that need to be saved and restored around a + change to PSTATE.SM. An auto_vec would be more convenient, but those + can't be copied. 
*/ + unsigned int num_sme_mode_switch_args; + rtx sme_mode_switch_args[12]; } CUMULATIVE_ARGS; #endif diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index ddfd17bd2dd..0dac5df1b74 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -956,7 +956,7 @@ (define_expand "tbranch_3" operands[1]); }) -(define_insn "*tb1" +(define_insn "@aarch64_tb" [(set (pc) (if_then_else (EQL (zero_extract:GPI (match_operand:ALLI 0 "register_operand" "r") (const_int 1) @@ -1043,7 +1043,7 @@ (define_expand "call" [(parallel [(call (match_operand 0 "memory_operand") (match_operand 1 "general_operand")) - (unspec:DI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_ABI) + (unspec:DI [(match_operand 2)] UNSPEC_CALLEE_ABI) (clobber (reg:DI LR_REGNUM))])] "" " @@ -1070,7 +1070,7 @@ (define_expand "call_value" [(set (match_operand 0 "") (call (match_operand 1 "memory_operand") (match_operand 2 "general_operand"))) - (unspec:DI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_ABI) + (unspec:DI [(match_operand 3)] UNSPEC_CALLEE_ABI) (clobber (reg:DI LR_REGNUM))])] "" " @@ -1097,7 +1097,7 @@ (define_expand "sibcall" [(parallel [(call (match_operand 0 "memory_operand") (match_operand 1 "general_operand")) - (unspec:DI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_ABI) + (unspec:DI [(match_operand 2)] UNSPEC_CALLEE_ABI) (return)])] "" { @@ -1111,7 +1111,7 @@ (define_expand "sibcall_value" [(set (match_operand 0 "") (call (match_operand 1 "memory_operand") (match_operand 2 "general_operand"))) - (unspec:DI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_ABI) + (unspec:DI [(match_operand 3)] UNSPEC_CALLEE_ABI) (return)])] "" { @@ -8048,3 +8048,6 @@ (define_insn "patchable_area" ;; SVE2. 
(include "aarch64-sve2.md") + +;; SME and extensions +(include "aarch64-sme.md") diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64 index a4e0aa03274..cff56dc9f55 100644 --- a/gcc/config/aarch64/t-aarch64 +++ b/gcc/config/aarch64/t-aarch64 @@ -186,9 +186,12 @@ MULTILIB_DIRNAMES = $(subst $(comma), ,$(TM_MULTILIB_CONFIG)) insn-conditions.md: s-check-sve-md s-check-sve-md: $(srcdir)/config/aarch64/check-sve-md.awk \ $(srcdir)/config/aarch64/aarch64-sve.md \ - $(srcdir)/config/aarch64/aarch64-sve2.md + $(srcdir)/config/aarch64/aarch64-sve2.md \ + $(srcdir)/config/aarch64/aarch64-sme.md $(AWK) -f $(srcdir)/config/aarch64/check-sve-md.awk \ $(srcdir)/config/aarch64/aarch64-sve.md $(AWK) -f $(srcdir)/config/aarch64/check-sve-md.awk \ $(srcdir)/config/aarch64/aarch64-sve2.md + $(AWK) -f $(srcdir)/config/aarch64/check-sve-md.awk \ + $(srcdir)/config/aarch64/aarch64-sme.md $(STAMP) s-check-sve-md diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c new file mode 100644 index 00000000000..a2de55773af --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c @@ -0,0 +1,233 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +void ns_callee (); + void s_callee () [[arm::streaming]]; + void sc_callee () [[arm::streaming_compatible]]; + +void ns_callee_stack (int, int, int, int, int, int, int, int, int); + +struct callbacks { + void (*ns_ptr) (); + void (*s_ptr) () [[arm::streaming]]; + void (*sc_ptr) () [[arm::streaming_compatible]]; +}; + +/* +** n_caller: { target lp64 } +** stp x30, (x19|x2[0-8]), \[sp, #?-96\]! 
+** cntd x16 +** str x16, \[sp, #?16\] +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mov \1, x0 +** bl ns_callee +** smstart sm +** bl s_callee +** smstop sm +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** blr \2 +** ldr (x[0-9]+), \[\1, #?8\] +** smstart sm +** blr \3 +** smstop sm +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x30, \1, \[sp\], #?96 +** ret +*/ +void +n_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + c->sc_ptr (); +} + +/* +** s_caller: { target lp64 } +** stp x30, (x19|x2[0-8]), \[sp, #?-96\]! +** cntd x16 +** str x16, \[sp, #?16\] +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mov \1, x0 +** smstop sm +** bl ns_callee +** smstart sm +** bl s_callee +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** smstop sm +** blr \2 +** smstart sm +** ldr (x[0-9]+), \[\1, #?8\] +** blr \3 +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x30, \1, \[sp\], #?96 +** ret +*/ +void +s_caller (struct callbacks *c) [[arm::streaming]] +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + c->sc_ptr (); +} + +/* +** sc_caller_sme: +** stp x29, x30, \[sp, #?-96\]! 
+** mov x29, sp +** cntd x16 +** str x16, \[sp, #?24\] +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstop sm +** bl ns_callee +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstart sm +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** bl s_callee +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstop sm +** bl sc_callee +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +void +sc_caller_sme () [[arm::streaming_compatible]] +{ + ns_callee (); + s_callee (); + sc_callee (); +} + +#pragma GCC target "+nosme" + +/* +** sc_caller: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** cntd x16 +** str x16, \[sp, #?24\] +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** bl __arm_sme_state +** str x0, \[x29, #?16\] +** ... +** bl sc_callee +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +void +sc_caller () [[arm::streaming_compatible]] +{ + ns_callee (); + sc_callee (); +} + +/* +** sc_caller_x0: +** ... +** mov x10, x0 +** bl __arm_sme_state +** ... +** str wzr, \[x10\] +** ... +*/ +void +sc_caller_x0 (int *ptr) [[arm::streaming_compatible]] +{ + *ptr = 0; + ns_callee (); + sc_callee (); +} + +/* +** sc_caller_x1: +** ... +** mov x10, x0 +** mov x11, x1 +** bl __arm_sme_state +** ... +** str w11, \[x10\] +** ... +*/ +void +sc_caller_x1 (int *ptr, int a) [[arm::streaming_compatible]] +{ + *ptr = a; + ns_callee (); + sc_callee (); +} + +/* +** sc_caller_stack: +** sub sp, sp, #112 +** stp x29, x30, \[sp, #?16\] +** add x29, sp, #?16 +** ... +** stp d8, d9, \[sp, #?48\] +** ... 
+** bl __arm_sme_state +** str x0, \[x29, #?16\] +** ... +** bl ns_callee_stack +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstart sm +** ... +*/ +void +sc_caller_stack () [[arm::streaming_compatible]] +{ + ns_callee_stack (0, 0, 0, 0, 0, 0, 0, 0, 0); +} + +/* { dg-final { scan-assembler {n_caller:(?:(?!ret).)*\.cfi_offset 46, -80\n} } } */ +/* { dg-final { scan-assembler {s_caller:(?:(?!ret).)*\.cfi_offset 46, -80\n} } } */ +/* { dg-final { scan-assembler {sc_caller_sme:(?:(?!ret).)*\.cfi_offset 46, -72\n} } } */ +/* { dg-final { scan-assembler {sc_caller:(?:(?!ret).)*\.cfi_offset 46, -72\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_10.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_10.c new file mode 100644 index 00000000000..49c5e4a6acb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_10.c @@ -0,0 +1,37 @@ +// { dg-options "" } + +#pragma GCC target "+nosme" + +void ns_callee (); + void s_callee () [[arm::streaming]]; + void sc_callee () [[arm::streaming_compatible]]; + +struct callbacks { + void (*ns_ptr) (); + void (*s_ptr) () [[arm::streaming]]; + void (*sc_ptr) () [[arm::streaming_compatible]]; +}; + +void +n_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); // { dg-error "calling a streaming function requires the ISA extension 'sme'" } + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); // { dg-error "calling a streaming function requires the ISA extension 'sme'" } + c->sc_ptr (); +} + +void +sc_caller_sme (struct callbacks *c) [[arm::streaming_compatible]] +{ + ns_callee (); + s_callee (); // { dg-error "calling a streaming function requires the ISA extension 'sme'" } + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); // { dg-error "calling a streaming function requires the ISA extension 'sme'" } + c->sc_ptr (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_2.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_2.c new file mode 100644 index 
00000000000..890fcbc5b1a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_2.c @@ -0,0 +1,43 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } + +void ns_callee (); + void s_callee () [[arm::streaming]]; + void sc_callee () [[arm::streaming_compatible]]; + +struct callbacks { + void (*ns_ptr) (); + void (*s_ptr) () [[arm::streaming]]; + void (*sc_ptr) () [[arm::streaming_compatible]]; +}; + +void +n_caller (struct callbacks *c) +{ + ns_callee (); + sc_callee (); + + c->ns_ptr (); + c->sc_ptr (); +} + +void +s_caller (struct callbacks *c) [[arm::streaming]] +{ + s_callee (); + sc_callee (); + + c->s_ptr (); + c->sc_ptr (); +} + +void +sc_caller (struct callbacks *c) [[arm::streaming_compatible]] +{ + sc_callee (); + + c->sc_ptr (); +} + +// { dg-final { scan-assembler-not {[dpqz][0-9]+,} } } +// { dg-final { scan-assembler-not {smstart\tsm} } } +// { dg-final { scan-assembler-not {smstop\tsm} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_3.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_3.c new file mode 100644 index 00000000000..ed999d08560 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_3.c @@ -0,0 +1,166 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +__attribute__((aarch64_vector_pcs)) void ns_callee (); +__attribute__((aarch64_vector_pcs)) void s_callee () [[arm::streaming]]; +__attribute__((aarch64_vector_pcs)) void sc_callee () [[arm::streaming_compatible]]; + +struct callbacks { + __attribute__((aarch64_vector_pcs)) void (*ns_ptr) (); + __attribute__((aarch64_vector_pcs)) void (*s_ptr) () [[arm::streaming]]; + __attribute__((aarch64_vector_pcs)) void (*sc_ptr) () [[arm::streaming_compatible]]; +}; + +/* +** n_caller: { target lp64 } +** stp x30, (x19|x2[0-8]), \[sp, #?-288\]! 
+** cntd x16 +** str x16, \[sp, #?16\] +** stp q8, q9, \[sp, #?32\] +** stp q10, q11, \[sp, #?64\] +** stp q12, q13, \[sp, #?96\] +** stp q14, q15, \[sp, #?128\] +** stp q16, q17, \[sp, #?160\] +** stp q18, q19, \[sp, #?192\] +** stp q20, q21, \[sp, #?224\] +** stp q22, q23, \[sp, #?256\] +** mov \1, x0 +** bl ns_callee +** smstart sm +** bl s_callee +** smstop sm +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** blr \2 +** ldr (x[0-9]+), \[\1, #?8\] +** smstart sm +** blr \3 +** smstop sm +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldp q8, q9, \[sp, #?32\] +** ldp q10, q11, \[sp, #?64\] +** ldp q12, q13, \[sp, #?96\] +** ldp q14, q15, \[sp, #?128\] +** ldp q16, q17, \[sp, #?160\] +** ldp q18, q19, \[sp, #?192\] +** ldp q20, q21, \[sp, #?224\] +** ldp q22, q23, \[sp, #?256\] +** ldp x30, \1, \[sp\], #?288 +** ret +*/ +void __attribute__((aarch64_vector_pcs)) +n_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + c->sc_ptr (); +} + +/* +** s_caller: { target lp64 } +** stp x30, (x19|x2[0-8]), \[sp, #?-288\]! 
+** cntd x16 +** str x16, \[sp, #?16\] +** stp q8, q9, \[sp, #?32\] +** stp q10, q11, \[sp, #?64\] +** stp q12, q13, \[sp, #?96\] +** stp q14, q15, \[sp, #?128\] +** stp q16, q17, \[sp, #?160\] +** stp q18, q19, \[sp, #?192\] +** stp q20, q21, \[sp, #?224\] +** stp q22, q23, \[sp, #?256\] +** mov \1, x0 +** smstop sm +** bl ns_callee +** smstart sm +** bl s_callee +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** smstop sm +** blr \2 +** smstart sm +** ldr (x[0-9]+), \[\1, #?8\] +** blr \3 +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldp q8, q9, \[sp, #?32\] +** ldp q10, q11, \[sp, #?64\] +** ldp q12, q13, \[sp, #?96\] +** ldp q14, q15, \[sp, #?128\] +** ldp q16, q17, \[sp, #?160\] +** ldp q18, q19, \[sp, #?192\] +** ldp q20, q21, \[sp, #?224\] +** ldp q22, q23, \[sp, #?256\] +** ldp x30, \1, \[sp\], #?288 +** ret +*/ +void __attribute__((aarch64_vector_pcs)) +s_caller (struct callbacks *c) [[arm::streaming]] +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + c->sc_ptr (); +} + +/* +** sc_caller: +** stp x29, x30, \[sp, #?-288\]! 
+** mov x29, sp +** cntd x16 +** str x16, \[sp, #?24\] +** stp q8, q9, \[sp, #?32\] +** stp q10, q11, \[sp, #?64\] +** stp q12, q13, \[sp, #?96\] +** stp q14, q15, \[sp, #?128\] +** stp q16, q17, \[sp, #?160\] +** stp q18, q19, \[sp, #?192\] +** stp q20, q21, \[sp, #?224\] +** stp q22, q23, \[sp, #?256\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstop sm +** bl ns_callee +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstart sm +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** bl s_callee +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstop sm +** bl sc_callee +** ldp q8, q9, \[sp, #?32\] +** ldp q10, q11, \[sp, #?64\] +** ldp q12, q13, \[sp, #?96\] +** ldp q14, q15, \[sp, #?128\] +** ldp q16, q17, \[sp, #?160\] +** ldp q18, q19, \[sp, #?192\] +** ldp q20, q21, \[sp, #?224\] +** ldp q22, q23, \[sp, #?256\] +** ldp x29, x30, \[sp\], #?288 +** ret +*/ +void __attribute__((aarch64_vector_pcs)) +sc_caller () [[arm::streaming_compatible]] +{ + ns_callee (); + s_callee (); + sc_callee (); +} + +/* { dg-final { scan-assembler {n_caller:(?:(?!ret).)*\.cfi_offset 46, -272\n} } } */ +/* { dg-final { scan-assembler {s_caller:(?:(?!ret).)*\.cfi_offset 46, -272\n} } } */ +/* { dg-final { scan-assembler {sc_caller:(?:(?!ret).)*\.cfi_offset 46, -264\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_4.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_4.c new file mode 100644 index 00000000000..f93a67f974a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_4.c @@ -0,0 +1,43 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } + +__attribute__((aarch64_vector_pcs)) void ns_callee (); +__attribute__((aarch64_vector_pcs)) void s_callee () [[arm::streaming]]; +__attribute__((aarch64_vector_pcs)) void sc_callee () [[arm::streaming_compatible]]; + +struct callbacks { + __attribute__((aarch64_vector_pcs)) void (*ns_ptr) (); + 
__attribute__((aarch64_vector_pcs)) void (*s_ptr) () [[arm::streaming]]; + __attribute__((aarch64_vector_pcs)) void (*sc_ptr) () [[arm::streaming_compatible]]; +}; + +void __attribute__((aarch64_vector_pcs)) +n_caller (struct callbacks *c) +{ + ns_callee (); + sc_callee (); + + c->ns_ptr (); + c->sc_ptr (); +} + +void __attribute__((aarch64_vector_pcs)) +s_caller (struct callbacks *c) [[arm::streaming]] +{ + s_callee (); + sc_callee (); + + c->s_ptr (); + c->sc_ptr (); +} + +void __attribute__((aarch64_vector_pcs)) +sc_caller (struct callbacks *c) [[arm::streaming_compatible]] +{ + sc_callee (); + + c->sc_ptr (); +} + +// { dg-final { scan-assembler-not {[dpqz][0-9]+,} } } +// { dg-final { scan-assembler-not {smstart\tsm} } } +// { dg-final { scan-assembler-not {smstop\tsm} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_5.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_5.c new file mode 100644 index 00000000000..be9b5cc0410 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_5.c @@ -0,0 +1,318 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +#include + +svbool_t ns_callee (); + svbool_t s_callee () [[arm::streaming]]; + svbool_t sc_callee () [[arm::streaming_compatible]]; + +struct callbacks { + svbool_t (*ns_ptr) (); + svbool_t (*s_ptr) () [[arm::streaming]]; + svbool_t (*sc_ptr) () [[arm::streaming_compatible]]; +}; + +/* +** n_caller: { target lp64 } +** stp x30, (x19|x2[0-8]), \[sp, #?-32\]! 
+** cntd x16 +** str x16, \[sp, #?16\] +** addvl sp, sp, #-18 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str p12, \[sp, #8, mul vl\] +** str p13, \[sp, #9, mul vl\] +** str p14, \[sp, #10, mul vl\] +** str p15, \[sp, #11, mul vl\] +** str z8, \[sp, #2, mul vl\] +** str z9, \[sp, #3, mul vl\] +** str z10, \[sp, #4, mul vl\] +** str z11, \[sp, #5, mul vl\] +** str z12, \[sp, #6, mul vl\] +** str z13, \[sp, #7, mul vl\] +** str z14, \[sp, #8, mul vl\] +** str z15, \[sp, #9, mul vl\] +** str z16, \[sp, #10, mul vl\] +** str z17, \[sp, #11, mul vl\] +** str z18, \[sp, #12, mul vl\] +** str z19, \[sp, #13, mul vl\] +** str z20, \[sp, #14, mul vl\] +** str z21, \[sp, #15, mul vl\] +** str z22, \[sp, #16, mul vl\] +** str z23, \[sp, #17, mul vl\] +** mov \1, x0 +** bl ns_callee +** smstart sm +** bl s_callee +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** blr \2 +** ldr (x[0-9]+), \[\1, #?8\] +** smstart sm +** blr \3 +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldr z8, \[sp, #2, mul vl\] +** ldr z9, \[sp, #3, mul vl\] +** ldr z10, \[sp, #4, mul vl\] +** ldr z11, \[sp, #5, mul vl\] +** ldr z12, \[sp, #6, mul vl\] +** ldr z13, \[sp, #7, mul vl\] +** ldr z14, \[sp, #8, mul vl\] +** ldr z15, \[sp, #9, mul vl\] +** ldr z16, \[sp, #10, mul vl\] +** ldr z17, \[sp, #11, mul vl\] +** ldr z18, \[sp, #12, mul vl\] +** ldr z19, \[sp, #13, mul vl\] +** ldr z20, \[sp, #14, mul vl\] +** ldr z21, \[sp, #15, mul vl\] +** ldr z22, \[sp, #16, mul vl\] +** ldr z23, \[sp, #17, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, 
\[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** ldr p12, \[sp, #8, mul vl\] +** ldr p13, \[sp, #9, mul vl\] +** ldr p14, \[sp, #10, mul vl\] +** ldr p15, \[sp, #11, mul vl\] +** addvl sp, sp, #18 +** ldp x30, \1, \[sp\], #?32 +** ret +*/ +svbool_t +n_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + return c->sc_ptr (); +} + +/* +** s_caller: { target lp64 } +** stp x30, (x19|x2[0-8]), \[sp, #?-32\]! +** cntd x16 +** str x16, \[sp, #?16\] +** addvl sp, sp, #-18 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str p12, \[sp, #8, mul vl\] +** str p13, \[sp, #9, mul vl\] +** str p14, \[sp, #10, mul vl\] +** str p15, \[sp, #11, mul vl\] +** str z8, \[sp, #2, mul vl\] +** str z9, \[sp, #3, mul vl\] +** str z10, \[sp, #4, mul vl\] +** str z11, \[sp, #5, mul vl\] +** str z12, \[sp, #6, mul vl\] +** str z13, \[sp, #7, mul vl\] +** str z14, \[sp, #8, mul vl\] +** str z15, \[sp, #9, mul vl\] +** str z16, \[sp, #10, mul vl\] +** str z17, \[sp, #11, mul vl\] +** str z18, \[sp, #12, mul vl\] +** str z19, \[sp, #13, mul vl\] +** str z20, \[sp, #14, mul vl\] +** str z21, \[sp, #15, mul vl\] +** str z22, \[sp, #16, mul vl\] +** str z23, \[sp, #17, mul vl\] +** mov \1, x0 +** smstop sm +** bl ns_callee +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** bl s_callee +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** smstop sm +** blr \2 +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** ldr (x[0-9]+), \[\1, #?8\] +** blr \3 +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldr z8, \[sp, #2, mul vl\] +** ldr z9, \[sp, #3, mul vl\] +** ldr z10, \[sp, #4, mul vl\] +** ldr z11, 
\[sp, #5, mul vl\] +** ldr z12, \[sp, #6, mul vl\] +** ldr z13, \[sp, #7, mul vl\] +** ldr z14, \[sp, #8, mul vl\] +** ldr z15, \[sp, #9, mul vl\] +** ldr z16, \[sp, #10, mul vl\] +** ldr z17, \[sp, #11, mul vl\] +** ldr z18, \[sp, #12, mul vl\] +** ldr z19, \[sp, #13, mul vl\] +** ldr z20, \[sp, #14, mul vl\] +** ldr z21, \[sp, #15, mul vl\] +** ldr z22, \[sp, #16, mul vl\] +** ldr z23, \[sp, #17, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** ldr p12, \[sp, #8, mul vl\] +** ldr p13, \[sp, #9, mul vl\] +** ldr p14, \[sp, #10, mul vl\] +** ldr p15, \[sp, #11, mul vl\] +** addvl sp, sp, #18 +** ldp x30, \1, \[sp\], #?32 +** ret +*/ +svbool_t +s_caller (struct callbacks *c) [[arm::streaming]] +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + return c->sc_ptr (); +} + +/* +** sc_caller: +** stp x29, x30, \[sp, #?-32\]! 
+** mov x29, sp +** cntd x16 +** str x16, \[sp, #?24\] +** addvl sp, sp, #-18 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str p12, \[sp, #8, mul vl\] +** str p13, \[sp, #9, mul vl\] +** str p14, \[sp, #10, mul vl\] +** str p15, \[sp, #11, mul vl\] +** str z8, \[sp, #2, mul vl\] +** str z9, \[sp, #3, mul vl\] +** str z10, \[sp, #4, mul vl\] +** str z11, \[sp, #5, mul vl\] +** str z12, \[sp, #6, mul vl\] +** str z13, \[sp, #7, mul vl\] +** str z14, \[sp, #8, mul vl\] +** str z15, \[sp, #9, mul vl\] +** str z16, \[sp, #10, mul vl\] +** str z17, \[sp, #11, mul vl\] +** str z18, \[sp, #12, mul vl\] +** str z19, \[sp, #13, mul vl\] +** str z20, \[sp, #14, mul vl\] +** str z21, \[sp, #15, mul vl\] +** str z22, \[sp, #16, mul vl\] +** str z23, \[sp, #17, mul vl\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstop sm +** bl ns_callee +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** bl s_callee +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** bl sc_callee +** ldr z8, \[sp, #2, mul vl\] +** ldr z9, \[sp, #3, mul vl\] +** ldr z10, \[sp, #4, mul vl\] +** ldr z11, \[sp, #5, mul vl\] +** ldr z12, \[sp, #6, mul vl\] +** ldr z13, \[sp, #7, mul vl\] +** ldr z14, \[sp, #8, mul vl\] +** ldr z15, \[sp, #9, mul vl\] +** ldr z16, \[sp, #10, mul vl\] +** ldr z17, \[sp, #11, mul vl\] +** ldr z18, \[sp, #12, mul vl\] +** ldr z19, \[sp, #13, mul vl\] +** ldr z20, \[sp, #14, mul vl\] +** ldr z21, \[sp, #15, mul vl\] +** ldr z22, \[sp, #16, mul vl\] +** ldr z23, \[sp, #17, mul vl\] +** ldr p4, 
\[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** ldr p12, \[sp, #8, mul vl\] +** ldr p13, \[sp, #9, mul vl\] +** ldr p14, \[sp, #10, mul vl\] +** ldr p15, \[sp, #11, mul vl\] +** addvl sp, sp, #18 +** ldp x29, x30, \[sp\], #?32 +** ret +*/ +svbool_t +sc_caller () [[arm::streaming_compatible]] +{ + ns_callee (); + s_callee (); + return sc_callee (); +} + +/* { dg-final { scan-assembler {n_caller:(?:(?!ret).)*\.cfi_offset 46, -16\n} } } */ +/* { dg-final { scan-assembler {s_caller:(?:(?!ret).)*\.cfi_offset 46, -16\n} } } */ +/* { dg-final { scan-assembler {sc_caller:(?:(?!ret).)*\.cfi_offset 46, -8\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_6.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_6.c new file mode 100644 index 00000000000..0f6bc4f6c9a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_6.c @@ -0,0 +1,45 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } + +#include + +svbool_t ns_callee (); + svbool_t s_callee () [[arm::streaming]]; + svbool_t sc_callee () [[arm::streaming_compatible]]; + +struct callbacks { + svbool_t (*ns_ptr) (); + svbool_t (*s_ptr) () [[arm::streaming]]; + svbool_t (*sc_ptr) () [[arm::streaming_compatible]]; +}; + +svbool_t +n_caller (struct callbacks *c) +{ + ns_callee (); + sc_callee (); + + c->ns_ptr (); + return c->sc_ptr (); +} + +svbool_t +s_caller (struct callbacks *c) [[arm::streaming]] +{ + s_callee (); + sc_callee (); + + c->s_ptr (); + return c->sc_ptr (); +} + +svbool_t +sc_caller (struct callbacks *c) [[arm::streaming_compatible]] +{ + sc_callee (); + + return c->sc_ptr (); +} + +// { dg-final { scan-assembler-not {[dpqz][0-9]+,} } } +// { dg-final { scan-assembler-not {smstart\tsm} } } +// { dg-final { scan-assembler-not {smstop\tsm} } } diff --git 
a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c new file mode 100644 index 00000000000..6482a489fc5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c @@ -0,0 +1,516 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +#include +#include + +double produce_d0 (); +void consume_d0 (double); + +/* +** test_d0: +** ... +** smstop sm +** bl produce_d0 +** fmov x10, d0 +** smstart sm +** fmov d0, x10 +** fmov x10, d0 +** smstop sm +** fmov d0, x10 +** bl consume_d0 +** ... +*/ +void +test_d0 () [[arm::streaming]] +{ + double res = produce_d0 (); + asm volatile (""); + consume_d0 (res); +} + +int8x8_t produce_d0_vec (); +void consume_d0_vec (int8x8_t); + +/* +** test_d0_vec: +** ... +** smstop sm +** bl produce_d0_vec +** ( +** fmov x10, d0 +** | +** umov x10, v0.d\[0\] +** ) +** smstart sm +** fmov d0, x10 +** ( +** fmov x10, d0 +** | +** umov x10, v0.d\[0\] +** ) +** smstop sm +** fmov d0, x10 +** bl consume_d0_vec +** ... +*/ +void +test_d0_vec () [[arm::streaming]] +{ + int8x8_t res = produce_d0_vec (); + asm volatile (""); + consume_d0_vec (res); +} + +int8x16_t produce_q0 (); +void consume_q0 (int8x16_t); + +/* +** test_q0: +** ... +** smstop sm +** bl produce_q0 +** str q0, \[sp, #?-16\]! +** smstart sm +** ldr q0, \[sp\], #?16 +** str q0, \[sp, #?-16\]! +** smstop sm +** ldr q0, \[sp\], #?16 +** bl consume_q0 +** ... +*/ +void +test_q0 () [[arm::streaming]] +{ + int8x16_t res = produce_q0 (); + asm volatile (""); + consume_q0 (res); +} + +int8x16x2_t produce_q1 (); +void consume_q1 (int8x16x2_t); + +/* +** test_q1: +** ... +** smstop sm +** bl produce_q1 +** stp q0, q1, \[sp, #?-32\]! +** smstart sm +** ldp q0, q1, \[sp\], #?32 +** stp q0, q1, \[sp, #?-32\]! +** smstop sm +** ldp q0, q1, \[sp\], #?32 +** bl consume_q1 +** ... 
+*/ +void +test_q1 () [[arm::streaming]] +{ + int8x16x2_t res = produce_q1 (); + asm volatile (""); + consume_q1 (res); +} + +int8x16x3_t produce_q2 (); +void consume_q2 (int8x16x3_t); + +/* +** test_q2: +** ... +** smstop sm +** bl produce_q2 +** stp q0, q1, \[sp, #?-48\]! +** str q2, \[sp, #?32\] +** smstart sm +** ldr q2, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?48 +** stp q0, q1, \[sp, #?-48\]! +** str q2, \[sp, #?32\] +** smstop sm +** ldr q2, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?48 +** bl consume_q2 +** ... +*/ +void +test_q2 () [[arm::streaming]] +{ + int8x16x3_t res = produce_q2 (); + asm volatile (""); + consume_q2 (res); +} + +int8x16x4_t produce_q3 (); +void consume_q3 (int8x16x4_t); + +/* +** test_q3: +** ... +** smstop sm +** bl produce_q3 +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstart sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** bl consume_q3 +** ... +*/ +void +test_q3 () [[arm::streaming]] +{ + int8x16x4_t res = produce_q3 (); + asm volatile (""); + consume_q3 (res); +} + +svint8_t produce_z0 (); +void consume_z0 (svint8_t); + +/* +** test_z0: +** ... +** smstop sm +** bl produce_z0 +** addvl sp, sp, #-1 +** str z0, \[sp\] +** smstart sm +** ldr z0, \[sp\] +** addvl sp, sp, #1 +** addvl sp, sp, #-1 +** str z0, \[sp\] +** smstop sm +** ldr z0, \[sp\] +** addvl sp, sp, #1 +** bl consume_z0 +** ... +*/ +void +test_z0 () [[arm::streaming]] +{ + svint8_t res = produce_z0 (); + asm volatile (""); + consume_z0 (res); +} + +svint8x4_t produce_z3 (); +void consume_z3 (svint8x4_t); + +/* +** test_z3: +** ... 
+** smstop sm +** bl produce_z3 +** addvl sp, sp, #-4 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstart sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** addvl sp, sp, #4 +** addvl sp, sp, #-4 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** addvl sp, sp, #4 +** bl consume_z3 +** ... +*/ +void +test_z3 () [[arm::streaming]] +{ + svint8x4_t res = produce_z3 (); + asm volatile (""); + consume_z3 (res); +} + +svbool_t produce_p0 (); +void consume_p0 (svbool_t); + +/* +** test_p0: +** ... +** smstop sm +** bl produce_p0 +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** bl consume_p0 +** ... +*/ +void +test_p0 () [[arm::streaming]] +{ + svbool_t res = produce_p0 (); + asm volatile (""); + consume_p0 (res); +} + +void consume_d7 (double, double, double, double, double, double, double, + double); + +/* +** test_d7: +** ... +** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** smstop sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** bl consume_d7 +** ... +*/ +void +test_d7 () [[arm::streaming]] +{ + consume_d7 (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0); +} + +void consume_d7_vec (int8x8_t, int8x8_t, int8x8_t, int8x8_t, int8x8_t, + int8x8_t, int8x8_t, int8x8_t); + +/* +** test_d7_vec: +** ... 
+** ( +** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** | +** umov x10, v0.d\[0\] +** umov x11, v1.d\[0\] +** umov x12, v2.d\[0\] +** umov x13, v3.d\[0\] +** umov x14, v4.d\[0\] +** umov x15, v5.d\[0\] +** umov x16, v6.d\[0\] +** umov x17, v7.d\[0\] +** ) +** smstop sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** bl consume_d7_vec +** ... +*/ +void +test_d7_vec (int8x8_t *ptr) [[arm::streaming]] +{ + consume_d7_vec (*ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr); +} + +void consume_q7 (int8x16_t, int8x16_t, int8x16_t, int8x16_t, int8x16_t, + int8x16_t, int8x16_t, int8x16_t); + +/* +** test_q7: +** ... +** stp q0, q1, \[sp, #?-128\]! +** stp q2, q3, \[sp, #?32\] +** stp q4, q5, \[sp, #?64\] +** stp q6, q7, \[sp, #?96\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q4, q5, \[sp, #?64\] +** ldp q6, q7, \[sp, #?96\] +** ldp q0, q1, \[sp\], #?128 +** bl consume_q7 +** ... +*/ +void +test_q7 (int8x16_t *ptr) [[arm::streaming]] +{ + consume_q7 (*ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr); +} + +void consume_z7 (svint8_t, svint8_t, svint8_t, svint8_t, svint8_t, + svint8_t, svint8_t, svint8_t); + +/* +** test_z7: +** ... +** addvl sp, sp, #-8 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** str z4, \[sp, #4, mul vl\] +** str z5, \[sp, #5, mul vl\] +** str z6, \[sp, #6, mul vl\] +** str z7, \[sp, #7, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** ldr z4, \[sp, #4, mul vl\] +** ldr z5, \[sp, #5, mul vl\] +** ldr z6, \[sp, #6, mul vl\] +** ldr z7, \[sp, #7, mul vl\] +** addvl sp, sp, #8 +** bl consume_z7 +** ... 
+*/ +void +test_z7 (svint8_t *ptr) [[arm::streaming]] +{ + consume_z7 (*ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr); +} + +void consume_p3 (svbool_t, svbool_t, svbool_t, svbool_t); + +/* +** test_p3: +** ... +** addvl sp, sp, #-1 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** smstop sm +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** addvl sp, sp, #1 +** bl consume_p3 +** ... +*/ +void +test_p3 (svbool_t *ptr) [[arm::streaming]] +{ + consume_p3 (*ptr, *ptr, *ptr, *ptr); +} + +void consume_mixed (float, double, float32x4_t, svfloat32_t, + float, double, float64x2_t, svfloat64_t, + svbool_t, svbool_t, svbool_t, svbool_t); + +/* +** test_mixed: +** ... +** addvl sp, sp, #-3 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** str z3, \[sp, #1, mul vl\] +** str z7, \[sp, #2, mul vl\] +** stp q2, q6, \[sp, #?-32\]! +** fmov w10, s0 +** fmov x11, d1 +** fmov w12, s4 +** fmov x13, d5 +** smstop sm +** fmov s0, w10 +** fmov d1, x11 +** fmov s4, w12 +** fmov d5, x13 +** ldp q2, q6, \[sp\], #?32 +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** ldr z3, \[sp, #1, mul vl\] +** ldr z7, \[sp, #2, mul vl\] +** addvl sp, sp, #3 +** bl consume_mixed +** ... +*/ +void +test_mixed (float32x4_t *float32x4_ptr, + svfloat32_t *svfloat32_ptr, + float64x2_t *float64x2_ptr, + svfloat64_t *svfloat64_ptr, + svbool_t *svbool_ptr) [[arm::streaming]] +{ + consume_mixed (1.0f, 2.0, *float32x4_ptr, *svfloat32_ptr, + 3.0f, 4.0, *float64x2_ptr, *svfloat64_ptr, + *svbool_ptr, *svbool_ptr, *svbool_ptr, *svbool_ptr); +} + +void consume_varargs (float, ...); + +/* +** test_varargs: +** ... +** stp q3, q7, \[sp, #?-32\]! 
+** fmov w10, s0 +** fmov x11, d1 +** ( +** fmov x12, d2 +** | +** umov x12, v2.d\[0\] +** ) +** fmov x13, d4 +** fmov x14, d5 +** ( +** fmov x15, d6 +** | +** umov x15, v6.d\[0\] +** ) +** smstop sm +** fmov s0, w10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d4, x13 +** fmov d5, x14 +** fmov d6, x15 +** ldp q3, q7, \[sp\], #?32 +** bl consume_varargs +** ... +*/ +void +test_varargs (float32x2_t *float32x2_ptr, + float32x4_t *float32x4_ptr, + float64x1_t *float64x1_ptr, + float64x2_t *float64x2_ptr) [[arm::streaming]] +{ + consume_varargs (1.0f, 2.0, *float32x2_ptr, *float32x4_ptr, + 3.0f, 4.0, *float64x1_ptr, *float64x2_ptr); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_8.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_8.c new file mode 100644 index 00000000000..f44724df32f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_8.c @@ -0,0 +1,87 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls -msve-vector-bits=128" } +// { dg-final { check-function-bodies "**" "" } } + +#include + +svint8_t produce_z0 (); +void consume_z0 (svint8_t); + +/* +** test_z0: +** ... +** smstop sm +** bl produce_z0 +** str q0, \[sp, #?-16\]! +** smstart sm +** ldr q0, \[sp\], #?16 +** str q0, \[sp, #?-16\]! +** smstop sm +** ldr q0, \[sp\], #?16 +** bl consume_z0 +** ... +*/ +void +test_z0 () [[arm::streaming]] +{ + svint8_t res = produce_z0 (); + asm volatile (""); + consume_z0 (res); +} + +svint8x4_t produce_z3 (); +void consume_z3 (svint8x4_t); + +/* +** test_z3: +** ... +** smstop sm +** bl produce_z3 +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstart sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** bl consume_z3 +** ... 
+*/ +void +test_z3 () [[arm::streaming]] +{ + svint8x4_t res = produce_z3 (); + asm volatile (""); + consume_z3 (res); +} + +svbool_t produce_p0 (); +void consume_p0 (svbool_t); + +/* +** test_p0: +** ... +** smstop sm +** bl produce_p0 +** sub sp, sp, #?16 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** add sp, sp, #?16 +** sub sp, sp, #?16 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** add sp, sp, #?16 +** bl consume_p0 +** ... +*/ +void +test_p0 () [[arm::streaming]] +{ + svbool_t res = produce_p0 (); + asm volatile (""); + consume_p0 (res); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_9.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_9.c new file mode 100644 index 00000000000..83b4073eef3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_9.c @@ -0,0 +1,103 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls -msve-vector-bits=256" } +// { dg-final { check-function-bodies "**" "" } } + +#include + +svint8_t produce_z0 (); +void consume_z0 (svint8_t); + +/* +** test_z0: +** ... +** smstop sm +** bl produce_z0 +** sub sp, sp, #?32 +** str z0, \[sp\] +** smstart sm +** ldr z0, \[sp\] +** add sp, sp, #?32 +** sub sp, sp, #?32 +** str z0, \[sp\] +** smstop sm +** ldr z0, \[sp\] +** add sp, sp, #?32 +** bl consume_z0 +** ... +*/ +void +test_z0 () [[arm::streaming]] +{ + svint8_t res = produce_z0 (); + asm volatile (""); + consume_z0 (res); +} + +svint8x4_t produce_z3 (); +void consume_z3 (svint8x4_t); + +/* +** test_z3: +** ... 
+** smstop sm +** bl produce_z3 +** sub sp, sp, #?128 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstart sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** add sp, sp, #?128 +** sub sp, sp, #?128 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** add sp, sp, #?128 +** bl consume_z3 +** ... +*/ +void +test_z3 () [[arm::streaming]] +{ + svint8x4_t res = produce_z3 (); + asm volatile (""); + consume_z3 (res); +} + +svbool_t produce_p0 (); +void consume_p0 (svbool_t); + +/* +** test_p0: +** ... +** smstop sm +** bl produce_p0 +** sub sp, sp, #?32 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** add sp, sp, #?32 +** sub sp, sp, #?32 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** add sp, sp, #?32 +** bl consume_p0 +** ... 
+*/ +void +test_p0 () [[arm::streaming]] +{ + svbool_t res = produce_p0 (); + asm volatile (""); + consume_p0 (res); +} From patchwork Tue Dec 5 10:13:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872043 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxKz3gnrz1ySd for ; Tue, 5 Dec 2023 21:17:27 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E07143875DC3 for ; Tue, 5 Dec 2023 10:16:53 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id ECFC0386186D for ; Tue, 5 Dec 2023 10:13:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org ECFC0386186D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org ECFC0386186D Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771231; cv=none; 
b=OfnT8Wx2MvhMtlqedyShVxrjKiFsWZ8Lq6hM4IkRW3XFrLhp5VftYKH+jzAk9OhD5Tuc2nZ8J/sQhESUugCBgfJa3Jgia7+YDSMqlju9sZjE1wPCT+WOuBsAYHhdCKYeyqn9nL1FPsVlIIH+BxjR9VZk24QEkRzcRfBoVFgdpek= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771231; c=relaxed/simple; bh=8bC1245H6lwS8n3G/D7ZttE2gNX/VlLnViUl/pOA49A=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=XQqWuPX3cTc/uWMUC1SwduI9X3v/PdPwx+duG2PuXsGfGPYhDdxfuraTZdiiRFOS84a3+4Amem7YbB11CtIxtW0nXo3+wz0DE4LjawA9fR9jDVUDXImdiMUgRIusrYtSVgxDf8dTheuWKLzUafzZbjNaD+MlQjlbAZ/gjXYzplY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4D70C1480; Tue, 5 Dec 2023 02:14:31 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3EA983F5A1; Tue, 5 Dec 2023 02:13:44 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 16/25] aarch64: Add support for SME ZA attributes Date: Tue, 5 Dec 2023 10:13:14 +0000 Message-Id: <20231205101323.1914247-17-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-21.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_FILL_THIS_FORM_SHORT, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

SME has an array called ZA that can be enabled and disabled separately from streaming mode. A status bit called PSTATE.ZA indicates whether ZA is currently enabled or not.

In C and C++, the state of PSTATE.ZA is controlled using function attributes. There are four attributes that can be attached to function types to indicate that the function shares ZA with its caller. These are:

- arm::in("za")
- arm::out("za")
- arm::inout("za")
- arm::preserves("za")

If a function's type has one of these shared-ZA attributes, PSTATE.ZA is specified to be 1 on entry to the function and on return from the function. Otherwise, the caller and callee have separate ZA contexts; they do not use ZA to share data.

Although normal non-shared-ZA functions have a separate ZA context from their callers, nested uses of ZA are expected to be rare. The ABI therefore defines a cooperative lazy saving scheme that allows saves and restores of ZA to be kept to a minimum. (Callers still have the option of doing a full save and restore if they prefer.)

Functions that want to use ZA internally have an arm::new("za") attribute, which tells the compiler to enable PSTATE.ZA for the duration of the function body. It also tells the compiler to commit any lazy save initiated by a caller.

The patch uses various abstract hard registers to track dataflow relating to ZA. See the comments in the patch for details.

The lazy save scheme is intended to be transparent to most normal functions, so that they don't need to be recompiled for SME. This is reflected in the way that most normal functions ignore the new hard registers added in the patch.

As with arm::streaming and arm::streaming_compatible, the attributes are also available as __arm_. This has two advantages: it triggers an error on compilers that don't understand the attributes, and it eases use on C, where [[...]] attributes were only added in C23.
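The cooperative lazy saving scheme described above can be pictured with a small portable C model. This is only a sketch under stated assumptions, not the code the patch generates: the byte array `za`, the pointer `lazy_save_buf`, the counter `full_saves` and all function names are hypothetical stand-ins, and the real ABI data structures and entry points are not reproduced here. The point is only that a caller sets up a save cheaply, and ZA is copied out only if a callee that needs its own ZA state commits that save.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: names and sizes are hypothetical, not the real ABI.  */
enum { ZA_BYTES = 16 };

static unsigned char za[ZA_BYTES];    /* Stands in for the ZA array.  */
static unsigned char *lazy_save_buf;  /* Pending lazy save; NULL if none.  */
static int full_saves;                /* Times ZA was really copied out.  */

/* Commit a pending lazy save, as a function with its own ZA state
   (modelling arm::new("za")) would do before using ZA.  */
static void commit_lazy_save(void)
{
    if (lazy_save_buf != NULL) {
        memcpy(lazy_save_buf, za, ZA_BYTES);
        lazy_save_buf = NULL;
        full_saves++;
    }
}

/* A normal function that never touches ZA: it ignores the scheme.  */
static void plain_callee(void)
{
}

/* Models a function with arm::new("za"): commit, then clobber ZA.  */
static void new_za_callee(void)
{
    commit_lazy_save();
    memset(za, 0, ZA_BYTES);
    za[0] = 0xff;
}

/* Models a caller with live ZA contents calling a function that does
   not share ZA with it.  Setting up the save is cheap (no copy).  */
static void call_with_lazy_save(void (*callee)(void))
{
    unsigned char buf[ZA_BYTES];

    lazy_save_buf = buf;
    callee();
    if (lazy_save_buf == NULL)
        memcpy(za, buf, ZA_BYTES);  /* Save was committed: restore ZA.  */
    else
        lazy_save_buf = NULL;       /* Save was never needed: cancel it.  */
}
```

In this model, calling `call_with_lazy_save (plain_callee)` leaves `full_saves` unchanged, while calling it with `new_za_callee` performs exactly one copy out and one restore, which is the "kept to a minimum" property the scheme is aiming for.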

gcc/
	* config/aarch64/aarch64-isa-modes.def (ZA_ON): New ISA mode.
	* config/aarch64/aarch64-protos.h (aarch64_rdsvl_immediate_p)
	(aarch64_output_rdsvl, aarch64_optimize_mode_switching)
	(aarch64_restore_za): Declare.
	* config/aarch64/constraints.md (UsR): New constraint.
	* config/aarch64/aarch64.md (LOWERING_REGNUM, TPIDR2_BLOCK_REGNUM)
	(SME_STATE_REGNUM, TPIDR2_SETUP_REGNUM, ZA_FREE_REGNUM)
	(ZA_SAVED_REGNUM, ZA_REGNUM, FIRST_FAKE_REGNUM): New constants.
	(LAST_FAKE_REGNUM): Likewise.
	(UNSPEC_SAVE_NZCV, UNSPEC_RESTORE_NZCV, UNSPEC_SME_VQ): New unspecs.
	(arches): Add sme.
	(arch_enabled): Handle it.
	(*cb<optab><mode>1): Rename to...
	(aarch64_cb<optab><mode>1): ...this.
	(*movsi_aarch64): Add an alternative for RDSVL.
	(*movdi_aarch64): Likewise.
	(aarch64_save_nzcv, aarch64_restore_nzcv): New insns.
	* config/aarch64/aarch64-sme.md (UNSPEC_SMSTOP_ZA)
	(UNSPEC_INITIAL_ZERO_ZA, UNSPEC_TPIDR2_SAVE, UNSPEC_TPIDR2_RESTORE)
	(UNSPEC_READ_TPIDR2, UNSPEC_WRITE_TPIDR2, UNSPEC_SETUP_LOCAL_TPIDR2)
	(UNSPEC_RESTORE_ZA, UNSPEC_START_PRIVATE_ZA_CALL): New unspecs.
	(UNSPEC_END_PRIVATE_ZA_CALL, UNSPEC_COMMIT_LAZY_SAVE): Likewise.
	(UNSPECV_ASM_UPDATE_ZA): New unspecv.
	(aarch64_tpidr2_save, aarch64_smstart_za, aarch64_smstop_za)
	(aarch64_initial_zero_za, aarch64_setup_local_tpidr2)
	(aarch64_clear_tpidr2, aarch64_write_tpidr2, aarch64_read_tpidr2)
	(aarch64_tpidr2_restore, aarch64_restore_za, aarch64_asm_update_za)
	(aarch64_start_private_za_call, aarch64_end_private_za_call)
	(aarch64_commit_lazy_save): New patterns.
	* config/aarch64/aarch64.h (AARCH64_ISA_ZA_ON, TARGET_ZA): New macros.
	(FIXED_REGISTERS, REGISTER_NAMES): Add the new fake ZA registers.
	(CALL_USED_REGISTERS): Replace with...
	(CALL_REALLY_USED_REGISTERS): ...this and add the fake ZA registers.
	(FIRST_PSEUDO_REGISTER): Bump to include the fake ZA registers.
	(FAKE_REGS): New register class.
	(REG_CLASS_NAMES): Update accordingly.
	(REG_CLASS_CONTENTS): Likewise.
	(machine_function::tpidr2_block): New member variable.
	(machine_function::tpidr2_block_ptr): Likewise.
	(machine_function::za_save_buffer): Likewise.
	(machine_function::next_asm_update_za_id): Likewise.
	(CUMULATIVE_ARGS::shared_za_flags): Likewise.
	(aarch64_mode_entity, aarch64_local_sme_state): New enums.
	(aarch64_tristate_mode): Likewise.
	(OPTIMIZE_MODE_SWITCHING, NUM_MODES_FOR_MODE_SWITCHING): Define.
	* config/aarch64/aarch64.cc (AARCH64_STATE_SHARED, AARCH64_STATE_IN)
	(AARCH64_STATE_OUT): New constants.
	(aarch64_attribute_shared_state_flags): New function.
	(aarch64_lookup_shared_state_flags, aarch64_fndecl_has_new_state)
	(aarch64_check_state_string, cmp_string_csts): Likewise.
	(aarch64_merge_string_arguments, aarch64_check_arm_new_against_type)
	(handle_arm_new, handle_arm_shared): Likewise.
	(handle_arm_new_za_attribute): New
	(aarch64_arm_attribute_table): Add new, preserves, in, out, and inout.
	(aarch64_hard_regno_nregs): Handle FAKE_REGS.
	(aarch64_hard_regno_mode_ok): Likewise.
	(aarch64_fntype_shared_flags, aarch64_fntype_pstate_za): New functions.
	(aarch64_fntype_isa_mode): Include aarch64_fntype_pstate_za.
	(aarch64_fndecl_has_state, aarch64_fndecl_pstate_za): New functions.
	(aarch64_fndecl_isa_mode): Include aarch64_fndecl_pstate_za.
	(aarch64_cfun_incoming_pstate_za, aarch64_cfun_shared_flags)
	(aarch64_cfun_has_new_state, aarch64_cfun_has_state): New functions.
	(aarch64_sme_vq_immediate, aarch64_sme_vq_unspec_p): Likewise.
	(aarch64_rdsvl_immediate_p, aarch64_output_rdsvl): Likewise.
	(aarch64_expand_mov_immediate): Handle RDSVL immediates.
	(aarch64_function_arg): Add the ZA sharing flags as a third limb
	of the PARALLEL.
	(aarch64_init_cumulative_args): Record the ZA sharing flags.
	(aarch64_extra_live_on_entry): New function.  Handle the new
	ZA-related fake registers.
	(aarch64_epilogue_uses): Handle the new ZA-related fake registers.
	(aarch64_cannot_force_const_mem): Handle UNSPEC_SME_VQ constants.
	(aarch64_get_tpidr2_block, aarch64_get_tpidr2_ptr): New functions.
	(aarch64_init_tpidr2_block, aarch64_restore_za): Likewise.
	(aarch64_layout_frame): Check whether the current function creates
	new ZA state.  Record that it clobbers LR if so.
	(aarch64_expand_prologue): Handle functions that create new ZA state.
	(aarch64_expand_epilogue): Likewise.
	(aarch64_create_tpidr2_block): New function.
	(aarch64_restore_za): Likewise.
	(aarch64_start_call_args): Disallow calls to shared-ZA functions
	from functions that have no ZA state.  Emit a marker instruction
	before calls to private-ZA functions from functions that have
	SME state.
	(aarch64_expand_call): Add return registers for state that is
	managed via attributes.  Record the use and clobber information
	for the ZA registers.
	(aarch64_end_call_args): New function.
	(aarch64_regno_regclass): Handle FAKE_REGS.
	(aarch64_class_max_nregs): Likewise.
	(aarch64_override_options_internal): Require TARGET_SME for
	functions that have ZA state.
	(aarch64_conditional_register_usage): Handle FAKE_REGS.
	(aarch64_mov_operand_p): Handle RDSVL immediates.
	(aarch64_comp_type_attributes): Check that the ZA sharing flags
	are equal.
	(aarch64_merge_decl_attributes): New function.
	(aarch64_optimize_mode_switching, aarch64_mode_emit_za_save_buffer)
	(aarch64_mode_emit_local_sme_state, aarch64_mode_emit): Likewise.
	(aarch64_insn_references_sme_state_p): Likewise.
	(aarch64_mode_needed_local_sme_state): Likewise.
	(aarch64_mode_needed_za_save_buffer, aarch64_mode_needed): Likewise.
	(aarch64_mode_after_local_sme_state, aarch64_mode_after): Likewise.
	(aarch64_local_sme_confluence, aarch64_mode_confluence): Likewise.
	(aarch64_one_shot_backprop, aarch64_local_sme_backprop): Likewise.
	(aarch64_mode_backprop, aarch64_mode_entry): Likewise.
	(aarch64_mode_exit, aarch64_mode_eh_handler): Likewise.
	(aarch64_mode_priority, aarch64_md_asm_adjust): Likewise.
	(TARGET_END_CALL_ARGS, TARGET_MERGE_DECL_ATTRIBUTES): Define.
	(TARGET_MODE_EMIT, TARGET_MODE_NEEDED, TARGET_MODE_AFTER): Likewise.
	(TARGET_MODE_CONFLUENCE, TARGET_MODE_BACKPROP): Likewise.
	(TARGET_MODE_ENTRY, TARGET_MODE_EXIT): Likewise.
	(TARGET_MODE_EH_HANDLER, TARGET_MODE_PRIORITY): Likewise.
	(TARGET_EXTRA_LIVE_ON_ENTRY): Likewise.
	(TARGET_MD_ASM_ADJUST): Use aarch64_md_asm_adjust.
	* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
	Define __arm_new, __arm_preserves, __arm_in, __arm_out, and __arm_inout.

gcc/testsuite/
	* gcc.target/aarch64/sme/za_state_1.c: New test.
	* gcc.target/aarch64/sme/za_state_2.c: Likewise.
	* gcc.target/aarch64/sme/za_state_3.c: Likewise.
	* gcc.target/aarch64/sme/za_state_4.c: Likewise.
	* gcc.target/aarch64/sme/za_state_5.c: Likewise.
	* gcc.target/aarch64/sme/za_state_6.c: Likewise.
	* g++.target/aarch64/sme/exceptions_1.C: Likewise.
	* gcc.target/aarch64/sme/keyword_macros_1.c: Add ZA macros.
	* g++.target/aarch64/sme/keyword_macros_1.C: Likewise.
---
 gcc/config/aarch64/aarch64-c.cc               |   32 +
 gcc/config/aarch64/aarch64-isa-modes.def      |    5 +
 gcc/config/aarch64/aarch64-protos.h           |    5 +
 gcc/config/aarch64/aarch64-sme.md             |  287 ++++
 gcc/config/aarch64/aarch64.cc                 | 1371 ++++++++++++++++-
 gcc/config/aarch64/aarch64.h                  |   98 +-
 gcc/config/aarch64/aarch64.md                 |   81 +-
 gcc/config/aarch64/constraints.md             |    6 +
 .../g++.target/aarch64/sme/exceptions_1.C     |  189 +++
 .../g++.target/aarch64/sme/keyword_macros_1.C |    5 +
 .../gcc.target/aarch64/sme/keyword_macros_1.c |    5 +
 .../gcc.target/aarch64/sme/za_state_1.c       |  154 ++
 .../gcc.target/aarch64/sme/za_state_2.c       |   73 +
 .../gcc.target/aarch64/sme/za_state_3.c       |   31 +
 .../gcc.target/aarch64/sme/za_state_4.c       |  585 +++++++
 .../gcc.target/aarch64/sme/za_state_5.c       |  595 +++++++
 .../gcc.target/aarch64/sme/za_state_6.c       |   23 +
 17 files changed, 3523 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/sme/exceptions_1.C
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_6.c

diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index 1603621b30d..9494e560be0 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -73,6 +73,8 @@ aarch64_define_unconditional_macros (cpp_reader *pfile)
 
   builtin_define ("__GCC_ASM_FLAG_OUTPUTS__");
 
+  builtin_define ("__ARM_STATE_ZA");
+
   /* Define keyword attributes like __arm_streaming as macros that expand
      to the associated [[...]] attribute.  Use __extension__ in the attribute
      for C, since the [[...]] syntax was only added in C23.  */
@@ -86,6 +88,36 @@ aarch64_define_unconditional_macros (cpp_reader *pfile)
   DEFINE_ARM_KEYWORD_MACRO ("streaming_compatible");
 
 #undef DEFINE_ARM_KEYWORD_MACRO
+
+  /* Same for the keyword attributes that take arguments.  The snag here
+     is that some old modes warn about or reject variadic arguments.  */
+  auto *cpp_opts = cpp_get_options (parse_in);
+  if (!cpp_opts->traditional)
+    {
+      auto old_warn_variadic_macros = cpp_opts->warn_variadic_macros;
+      auto old_cpp_warn_c90_c99_compat = cpp_opts->cpp_warn_c90_c99_compat;
+
+      cpp_opts->warn_variadic_macros = false;
+      cpp_opts->cpp_warn_c90_c99_compat = 0;
+
+#define DEFINE_ARM_KEYWORD_MACRO_ARGS(NAME) \
+  builtin_define_with_value ("__arm_" NAME "(...)", \
+                             lang_GNU_CXX () \
+                             ? "[[arm::" NAME "(__VA_ARGS__)]]" \
+                             : "[[__extension__ arm::" NAME \
+                               "(__VA_ARGS__)]]", 0);
+
+      DEFINE_ARM_KEYWORD_MACRO_ARGS ("new");
+      DEFINE_ARM_KEYWORD_MACRO_ARGS ("preserves");
+      DEFINE_ARM_KEYWORD_MACRO_ARGS ("in");
+      DEFINE_ARM_KEYWORD_MACRO_ARGS ("out");
+      DEFINE_ARM_KEYWORD_MACRO_ARGS ("inout");
+
+#undef DEFINE_ARM_KEYWORD_MACRO_ARGS
+
+      cpp_opts->warn_variadic_macros = old_warn_variadic_macros;
+      cpp_opts->cpp_warn_c90_c99_compat = old_cpp_warn_c90_c99_compat;
+    }
 }
 
 /* Undefine/redefine macros that depend on the current backend state and may
diff --git a/gcc/config/aarch64/aarch64-isa-modes.def b/gcc/config/aarch64/aarch64-isa-modes.def
index 5915c98a896..c0ada35bd19 100644
--- a/gcc/config/aarch64/aarch64-isa-modes.def
+++ b/gcc/config/aarch64/aarch64-isa-modes.def
@@ -32,4 +32,9 @@
 DEF_AARCH64_ISA_MODE(SM_ON)
 DEF_AARCH64_ISA_MODE(SM_OFF)
 
+/* Indicates that PSTATE.ZA is known to be 1.  The converse is that
+   PSTATE.ZA might be 0 or 1, depending on whether there is an uncommitted
+   lazy save.  */
+DEF_AARCH64_ISA_MODE(ZA_ON)
+
 #undef DEF_AARCH64_ISA_MODE
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index be929e0a774..f42981bd507 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -804,6 +804,8 @@ bool aarch64_sve_addvl_addpl_immediate_p (rtx);
 bool aarch64_sve_vector_inc_dec_immediate_p (rtx);
 int aarch64_add_offset_temporaries (rtx);
 void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, rtx, rtx, rtx);
+bool aarch64_rdsvl_immediate_p (const_rtx);
+char *aarch64_output_rdsvl (const_rtx);
 bool aarch64_mov_operand_p (rtx, machine_mode);
 rtx aarch64_reverse_mask (machine_mode, unsigned int);
 bool aarch64_offset_7bit_signed_scaled_p (machine_mode, poly_int64);
@@ -1083,4 +1085,7 @@ extern void aarch64_output_patchable_area (unsigned int, bool);
 
 extern void aarch64_adjust_reg_alloc_order ();
 
+bool aarch64_optimize_mode_switching (aarch64_mode_entity);
+void aarch64_restore_za (rtx);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
index 52427b4f17a..d4973098e66 100644
--- a/gcc/config/aarch64/aarch64-sme.md
+++ b/gcc/config/aarch64/aarch64-sme.md
@@ -23,6 +23,7 @@
 ;; == State management
 ;; ---- Test current state
 ;; ---- PSTATE.SM management
+;; ---- PSTATE.ZA management
 
 ;; =========================================================================
 ;; == State management
@@ -169,3 +170,289 @@ (define_insn "aarch64_smstop_sm"
   ""
   "smstop\tsm"
 )
+
+;; -------------------------------------------------------------------------
+;; ---- PSTATE.ZA management
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - SMSTART ZA
+;; - SMSTOP ZA
+;; plus calls to support routines.
+;; -------------------------------------------------------------------------
+
+(define_c_enum "unspec" [
+  UNSPEC_SMSTOP_ZA
+  UNSPEC_INITIAL_ZERO_ZA
+  UNSPEC_TPIDR2_SAVE
+  UNSPEC_TPIDR2_RESTORE
+  UNSPEC_READ_TPIDR2
+  UNSPEC_WRITE_TPIDR2
+  UNSPEC_SETUP_LOCAL_TPIDR2
+  UNSPEC_RESTORE_ZA
+  UNSPEC_START_PRIVATE_ZA_CALL
+  UNSPEC_END_PRIVATE_ZA_CALL
+  UNSPEC_COMMIT_LAZY_SAVE
+])
+
+(define_c_enum "unspecv" [
+  UNSPECV_ASM_UPDATE_ZA
+])
+
+;; Use the ABI-defined routine to commit an uncommitted lazy save.
+;; This relies on the current PSTATE.ZA, so depends on SME_STATE_REGNUM.
+;; The fake TPIDR2_SETUP_REGNUM register initially holds the incoming
+;; value of the architected TPIDR2_EL0.
+(define_insn "aarch64_tpidr2_save"
+  [(set (reg:DI ZA_FREE_REGNUM)
+        (unspec:DI [(reg:DI SME_STATE_REGNUM)
+                    (reg:DI TPIDR2_SETUP_REGNUM)] UNSPEC_TPIDR2_SAVE))
+   (clobber (reg:DI R14_REGNUM))
+   (clobber (reg:DI R15_REGNUM))
+   (clobber (reg:DI R16_REGNUM))
+   (clobber (reg:DI R17_REGNUM))
+   (clobber (reg:DI R18_REGNUM))
+   (clobber (reg:DI R30_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  ""
+  "bl\t__arm_tpidr2_save"
+)
+
+;; Set PSTATE.ZA to 1.  If ZA was previously dormant or active,
+;; it remains in the same state afterwards, with the same contents.
+;; Otherwise, it goes from off to on with zeroed contents.
+;;
+;; Later writes of TPIDR2_EL0 to a nonzero value must not be moved
+;; up past this instruction, since that could create an invalid
+;; combination of having an active lazy save while ZA is off.
+;; Create an anti-dependence by reading the current contents
+;; of TPIDR2_SETUP_REGNUM.
+;;
+;; Making this depend on ZA_FREE_REGNUM ensures that contents belonging
+;; to the caller have already been saved.  That isn't necessary for this
+;; instruction itself, since PSTATE.ZA is already 1 if it contains data.
+;; But doing this here means that other uses of ZA can just depend on
+;; SME_STATE_REGNUM, rather than both SME_STATE_REGNUM and ZA_FREE_REGNUM.
+(define_insn "aarch64_smstart_za"
+  [(set (reg:DI SME_STATE_REGNUM)
+        (const_int 1))
+   (use (reg:DI TPIDR2_SETUP_REGNUM))
+   (use (reg:DI ZA_FREE_REGNUM))]
+  ""
+  "smstart\tza"
+)
+
+;; Disable ZA and discard its current contents.
+;;
+;; The ABI says that the ZA save buffer must be null whenever PSTATE.ZA
+;; is zero, so earlier writes to TPIDR2_EL0 must not be moved down past
+;; this instruction.  Depend on TPIDR2_SETUP_REGNUM to ensure this.
+;;
+;; We can only turn ZA off once we know that it is free (i.e. doesn't
+;; contain data belonging to the caller).  Depend on ZA_FREE_REGNUM
+;; to ensure this.
+;;
+;; We only turn ZA off when the current function's ZA state is dead,
+;; or perhaps if we're sure that the contents are saved.  Either way,
+;; we know whether ZA is saved or not.
+(define_insn "aarch64_smstop_za"
+  [(set (reg:DI SME_STATE_REGNUM)
+        (const_int 0))
+   (set (reg:DI ZA_SAVED_REGNUM)
+        (unspec:DI [(reg:DI TPIDR2_SETUP_REGNUM)
+                    (reg:DI ZA_FREE_REGNUM)] UNSPEC_SMSTOP_ZA))]
+  ""
+  "smstop\tza"
+)
+
+;; Zero ZA after committing a lazy save.  The sequencing is enforced
+;; by reading ZA_FREE_REGNUM.
+(define_insn "aarch64_initial_zero_za"
+  [(set (reg:DI ZA_REGNUM)
+        (unspec:DI [(reg:DI SME_STATE_REGNUM)
+                    (reg:DI ZA_FREE_REGNUM)] UNSPEC_INITIAL_ZERO_ZA))]
+  ""
+  "zero\t{ za }"
+)
+
+;; Initialize the abstract TPIDR2_BLOCK_REGNUM from the contents of
+;; the current function's TPIDR2 block.  Other instructions can then
+;; depend on TPIDR2_BLOCK_REGNUM rather than on the memory block.
+(define_insn "aarch64_setup_local_tpidr2"
+  [(set (reg:DI TPIDR2_BLOCK_REGNUM)
+        (unspec:DI [(match_operand:V16QI 0 "memory_operand" "m")]
+                   UNSPEC_SETUP_LOCAL_TPIDR2))]
+  ""
+  ""
+  [(set_attr "type" "no_insn")]
+)
+
+;; Clear TPIDR2_EL0, cancelling any uncommitted lazy save.
+(define_insn "aarch64_clear_tpidr2"
+  [(set (reg:DI TPIDR2_SETUP_REGNUM)
+        (const_int 0))]
+  ""
+  "msr\ttpidr2_el0, xzr"
+)
+
+;; Point TPIDR2_EL0 to the current function's TPIDR2 block, whose address
+;; is given by operand 0.  TPIDR2_BLOCK_REGNUM represents the contents of the
+;; pointed-to block.
+(define_insn "aarch64_write_tpidr2"
+  [(set (reg:DI TPIDR2_SETUP_REGNUM)
+        (unspec:DI [(match_operand 0 "pmode_register_operand" "r")
+                    (reg:DI TPIDR2_BLOCK_REGNUM)] UNSPEC_WRITE_TPIDR2))]
+  ""
+  "msr\ttpidr2_el0, %0"
+)
+
+;; Check whether ZA has been saved.  The system depends on the value that
+;; we wrote to TPIDR2_EL0 previously, so it depends on TPIDR2_SETUP_REGNUM.
+(define_insn "aarch64_read_tpidr2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (unspec:DI [(reg:DI TPIDR2_SETUP_REGNUM)
+                    (reg:DI ZA_SAVED_REGNUM)] UNSPEC_READ_TPIDR2))]
+  ""
+  "mrs\t%0, tpidr2_el0"
+)
+
+;; Use the ABI-defined routine to restore lazy-saved ZA contents
+;; from the TPIDR2 block pointed to by X0.  ZA must already be active.
+(define_insn "aarch64_tpidr2_restore"
+  [(set (reg:DI ZA_SAVED_REGNUM)
+        (unspec:DI [(reg:DI R0_REGNUM)] UNSPEC_TPIDR2_RESTORE))
+   (set (reg:DI SME_STATE_REGNUM)
+        (unspec:DI [(reg:DI SME_STATE_REGNUM)] UNSPEC_TPIDR2_RESTORE))
+   (clobber (reg:DI R14_REGNUM))
+   (clobber (reg:DI R15_REGNUM))
+   (clobber (reg:DI R16_REGNUM))
+   (clobber (reg:DI R17_REGNUM))
+   (clobber (reg:DI R18_REGNUM))
+   (clobber (reg:DI R30_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  ""
+  "bl\t__arm_tpidr2_restore"
+)
+
+;; Check whether a lazy save set up by aarch64_save_za was committed
+;; and restore the saved contents if so.
+;;
+;; Operand 0 is the address of the current function's TPIDR2 block.
+(define_insn_and_split "aarch64_restore_za"
+  [(set (reg:DI ZA_SAVED_REGNUM)
+        (unspec:DI [(match_operand 0 "pmode_register_operand" "r")
+                    (reg:DI SME_STATE_REGNUM)
+                    (reg:DI TPIDR2_SETUP_REGNUM)
+                    (reg:DI ZA_SAVED_REGNUM)] UNSPEC_RESTORE_ZA))
+   (clobber (reg:DI R0_REGNUM))
+   (clobber (reg:DI R14_REGNUM))
+   (clobber (reg:DI R15_REGNUM))
+   (clobber (reg:DI R16_REGNUM))
+   (clobber (reg:DI R17_REGNUM))
+   (clobber (reg:DI R18_REGNUM))
+   (clobber (reg:DI R30_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  ""
+  "#"
+  "&& epilogue_completed"
+  [(const_int 0)]
+  {
+    auto label = gen_label_rtx ();
+    auto tpidr2 = gen_rtx_REG (DImode, R16_REGNUM);
+    emit_insn (gen_aarch64_read_tpidr2 (tpidr2));
+    auto jump = emit_likely_jump_insn (gen_aarch64_cbnedi1 (tpidr2, label));
+    JUMP_LABEL (jump) = label;
+
+    aarch64_restore_za (operands[0]);
+    emit_label (label);
+    DONE;
+  }
+)
+
+;; This instruction is emitted after asms that alter ZA, in order to model
+;; the effect on dataflow.  The asm itself can't have ZA as an input or
+;; an output, since there is no associated data type.  Instead it retains
+;; the original "za" clobber, which on its own would indicate that ZA
+;; is dead.
+;;
+;; The operand is a unique identifier.
+(define_insn "aarch64_asm_update_za"
+  [(set (reg:VNx16QI ZA_REGNUM)
+        (unspec_volatile:VNx16QI
+          [(reg:VNx16QI ZA_REGNUM)
+           (reg:DI SME_STATE_REGNUM)
+           (match_operand 0 "const_int_operand")]
+          UNSPECV_ASM_UPDATE_ZA))]
+  ""
+  ""
+  [(set_attr "type" "no_insn")]
+)
+
+;; This pseudo-instruction is emitted as part of a call to a private-ZA
+;; function from a function with ZA state.  It marks a natural place to set
+;; up a lazy save, if that turns out to be necessary.  The save itself
+;; is managed by the mode-switching pass.
+(define_insn "aarch64_start_private_za_call"
+  [(set (reg:DI LOWERING_REGNUM)
+        (unspec:DI [(reg:DI LOWERING_REGNUM)] UNSPEC_START_PRIVATE_ZA_CALL))]
+  ""
+  ""
+  [(set_attr "type" "no_insn")]
+)
+
+;; This pseudo-instruction is emitted as part of a call to a private-ZA
+;; function from a function with ZA state.  It marks a natural place to restore
+;; the current function's ZA contents from the lazy save buffer, if that
+;; turns out to be necessary.  The save itself is managed by the
+;; mode-switching pass.
+(define_insn "aarch64_end_private_za_call"
+  [(set (reg:DI LOWERING_REGNUM)
+        (unspec:DI [(reg:DI LOWERING_REGNUM)] UNSPEC_END_PRIVATE_ZA_CALL))]
+  ""
+  ""
+  [(set_attr "type" "no_insn")]
+)
+
+;; This pseudo-instruction is emitted before a private-ZA function uses
+;; PSTATE.ZA state for the first time.  The instruction checks whether
+;; ZA currently contains data belonging to a caller and commits the
+;; lazy save if so.
+;;
+;; Operand 0 is the incoming value of TPIDR2_EL0.  Operand 1 is nonzero
+;; if ZA is live, and should therefore be zeroed after committing a save.
+;;
+;; The instruction is generated by the mode-switching pass.  It is a
+;; define_insn_and_split rather than a define_expand because of the
+;; internal control flow.
+(define_insn_and_split "aarch64_commit_lazy_save"
+  [(set (reg:DI ZA_FREE_REGNUM)
+        (unspec:DI [(match_operand 0 "pmode_register_operand" "r")
+                    (match_operand 1 "const_int_operand")
+                    (reg:DI SME_STATE_REGNUM)
+                    (reg:DI TPIDR2_SETUP_REGNUM)
+                    (reg:VNx16QI ZA_REGNUM)] UNSPEC_COMMIT_LAZY_SAVE))
+   (set (reg:DI ZA_REGNUM)
+        (unspec:DI [(reg:DI SME_STATE_REGNUM)
+                    (reg:DI ZA_FREE_REGNUM)] UNSPEC_INITIAL_ZERO_ZA))
+   (clobber (reg:DI R14_REGNUM))
+   (clobber (reg:DI R15_REGNUM))
+   (clobber (reg:DI R16_REGNUM))
+   (clobber (reg:DI R17_REGNUM))
+   (clobber (reg:DI R18_REGNUM))
+   (clobber (reg:DI R30_REGNUM))
+   (clobber (reg:CC CC_REGNUM))]
+  ""
+  "#"
+  "true"
+  [(const_int 0)]
+  {
+    auto label = gen_label_rtx ();
+    auto jump = emit_jump_insn (gen_aarch64_cbeqdi1 (operands[0], label));
+    JUMP_LABEL (jump) = label;
+    emit_insn (gen_aarch64_tpidr2_save ());
+    emit_insn (gen_aarch64_clear_tpidr2 ());
+    if (INTVAL (operands[1]) != 0)
+      emit_insn (gen_aarch64_initial_zero_za ());
+    emit_label (label);
+    DONE;
+  }
+)
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 0bee2a8e373..5d06c7fb411 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -94,6 +94,26 @@
 /* Defined for convenience.  */
 #define POINTER_BYTES (POINTER_SIZE / BITS_PER_UNIT)
 
+/* Flags that describe how a function shares certain architectural state
+   with its callers.
+
+   - AARCH64_STATE_SHARED indicates that the function does share the state
+     with callers.
+
+   - AARCH64_STATE_IN indicates that the function reads (or might read) the
+     incoming state.  The converse is that the function ignores the incoming
+     state.
+
+   - AARCH64_STATE_OUT indicates that the function returns new state.
+     The converse is that the state on return is the same as it was on entry.
+
+   A function that partially modifies the state treats it as both IN
+   and OUT (because the value on return depends to some extent on the
+   value on input).  */
+constexpr auto AARCH64_STATE_SHARED = 1U << 0;
+constexpr auto AARCH64_STATE_IN = 1U << 1;
+constexpr auto AARCH64_STATE_OUT = 1U << 2;
+
 /* Information about a legitimate vector immediate operand.  */
 struct simd_immediate_info
 {
@@ -438,6 +458,151 @@ static const struct processor all_cores[] =
 /* The current tuning set.  */
 struct tune_params aarch64_tune_params = generic_tunings;
 
+/* If NAME is the name of an arm:: attribute that describes shared state,
+   return its associated AARCH64_STATE_* flags, otherwise return 0.  */
+static unsigned int
+aarch64_attribute_shared_state_flags (const char *name)
+{
+  if (strcmp (name, "in") == 0)
+    return AARCH64_STATE_SHARED | AARCH64_STATE_IN;
+  if (strcmp (name, "inout") == 0)
+    return AARCH64_STATE_SHARED | AARCH64_STATE_IN | AARCH64_STATE_OUT;
+  if (strcmp (name, "out") == 0)
+    return AARCH64_STATE_SHARED | AARCH64_STATE_OUT;
+  if (strcmp (name, "preserves") == 0)
+    return AARCH64_STATE_SHARED;
+  return 0;
+}
+
+/* See whether attribute list ATTRS has any sharing information
+   for state STATE_NAME.  Return the associated state flags if so,
+   otherwise return 0.  */
+static unsigned int
+aarch64_lookup_shared_state_flags (tree attrs, const char *state_name)
+{
+  for (tree attr = attrs; attr; attr = TREE_CHAIN (attr))
+    {
+      if (!cxx11_attribute_p (attr))
+        continue;
+
+      auto ns = IDENTIFIER_POINTER (TREE_PURPOSE (TREE_PURPOSE (attr)));
+      if (strcmp (ns, "arm") != 0)
+        continue;
+
+      auto attr_name = IDENTIFIER_POINTER (TREE_VALUE (TREE_PURPOSE (attr)));
+      auto flags = aarch64_attribute_shared_state_flags (attr_name);
+      if (!flags)
+        continue;
+
+      for (tree arg = TREE_VALUE (attr); arg; arg = TREE_CHAIN (arg))
+        {
+          tree value = TREE_VALUE (arg);
+          if (TREE_CODE (value) == STRING_CST
+              && strcmp (TREE_STRING_POINTER (value), state_name) == 0)
+            return flags;
+        }
+    }
+  return 0;
+}
+
+/* Return true if DECL creates a new scope for state STATE_NAME.  */
+static bool
+aarch64_fndecl_has_new_state (const_tree decl, const char *state_name)
+{
+  if (tree attr = lookup_attribute ("arm", "new", DECL_ATTRIBUTES (decl)))
+    for (tree arg = TREE_VALUE (attr); arg; arg = TREE_CHAIN (arg))
+      {
+        tree value = TREE_VALUE (arg);
+        if (TREE_CODE (value) == STRING_CST
+            && strcmp (TREE_STRING_POINTER (value), state_name) == 0)
+          return true;
+      }
+  return false;
+}
+
+/* Return true if attribute argument VALUE is a recognized state string,
+   otherwise report an error.  NAME is the name of the attribute to which
+   VALUE is being passed.  */
+static bool
+aarch64_check_state_string (tree name, tree value)
+{
+  if (TREE_CODE (value) != STRING_CST)
+    {
+      error ("the arguments to %qE must be constant strings", name);
+      return false;
+    }
+
+  const char *state_name = TREE_STRING_POINTER (value);
+  if (strcmp (state_name, "za") != 0)
+    {
+      error ("unrecognized state string %qs", state_name);
+      return false;
+    }
+
+  return true;
+}
+
+/* qsort callback to compare two STRING_CSTs.  */
+static int
+cmp_string_csts (const void *a, const void *b)
+{
+  return strcmp (TREE_STRING_POINTER (*(const_tree const *) a),
+                 TREE_STRING_POINTER (*(const_tree const *) b));
+}
+
+/* Canonicalize a list of state strings.  ARGS contains the arguments to
+   a new attribute while OLD_ATTR, if nonnull, contains a previous attribute
+   of the same type.  If CAN_MERGE_IN_PLACE, it is safe to adjust OLD_ATTR's
+   arguments and drop the new attribute.  Otherwise, the new attribute must
+   be kept and ARGS must include the information in OLD_ATTR.
+
+   In both cases, the new arguments must be a sorted list of state strings
+   with duplicates removed.
+
+   Return true if new attribute should be kept, false if it should be
+   dropped.  */
+static bool
+aarch64_merge_string_arguments (tree args, tree old_attr,
+                                bool can_merge_in_place)
+{
+  /* Get a sorted list of all state strings (including duplicates).  */
+  auto add_args = [](vec<tree> &strings, const_tree args)
+    {
+      for (const_tree arg = args; arg; arg = TREE_CHAIN (arg))
+        if (TREE_CODE (TREE_VALUE (arg)) == STRING_CST)
+          strings.safe_push (TREE_VALUE (arg));
+    };
+  auto_vec<tree> strings;
+  add_args (strings, args);
+  if (old_attr)
+    add_args (strings, TREE_VALUE (old_attr));
+  strings.qsort (cmp_string_csts);
+
+  /* The list can be empty if there was no previous attribute and if all
+     the new arguments are erroneous.  Drop the attribute in that case.  */
+  if (strings.is_empty ())
+    return false;
+
+  /* Destructively modify one of the argument lists, removing duplicates
+     on the fly.  */
+  bool use_old_attr = old_attr && can_merge_in_place;
+  tree *end = use_old_attr ? &TREE_VALUE (old_attr) : &args;
+  tree prev = NULL_TREE;
+  for (tree arg : strings)
+    {
+      if (prev && simple_cst_equal (arg, prev))
+        continue;
+      prev = arg;
+      if (!*end)
+        *end = tree_cons (NULL_TREE, arg, NULL_TREE);
+      else
+        TREE_VALUE (*end) = arg;
+      end = &TREE_CHAIN (*end);
+    }
+  *end = NULL_TREE;
+  return !use_old_attr;
+}
+
 /* Check whether an 'aarch64_vector_pcs' attribute is valid.  */
 
 static tree
@@ -466,6 +631,101 @@ handle_aarch64_vector_pcs_attribute (tree *node, tree name, tree,
   gcc_unreachable ();
 }
 
+/* Return true if arm::new(ARGS) is compatible with the type of decl DECL,
+   otherwise report an error.  */
+static bool
+aarch64_check_arm_new_against_type (tree args, tree decl)
+{
+  tree type_attrs = TYPE_ATTRIBUTES (TREE_TYPE (decl));
+  for (tree arg = args; arg; arg = TREE_CHAIN (arg))
+    {
+      tree value = TREE_VALUE (arg);
+      if (TREE_CODE (value) == STRING_CST)
+        {
+          const char *state_name = TREE_STRING_POINTER (value);
+          if (aarch64_lookup_shared_state_flags (type_attrs, state_name))
+            {
+              error_at (DECL_SOURCE_LOCATION (decl),
+                        "cannot create a new %qs scope since %qs is shared"
+                        " with callers", state_name, state_name);
+              return false;
+            }
+        }
+    }
+  return true;
+}
+
+/* Callback for arm::new attributes.  */
+static tree
+handle_arm_new (tree *node, tree name, tree args, int, bool *no_add_attrs)
+{
+  tree decl = *node;
+  if (TREE_CODE (decl) != FUNCTION_DECL)
+    {
+      error ("%qE attribute applies only to function definitions", name);
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+  if (TREE_TYPE (decl) == error_mark_node)
+    {
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
+  for (tree arg = args; arg; arg = TREE_CHAIN (arg))
+    aarch64_check_state_string (name, TREE_VALUE (arg));
+
+  if (!aarch64_check_arm_new_against_type (args, decl))
+    {
+      *no_add_attrs = true;
+      return NULL_TREE;
+    }
+
+  /* If there is an old attribute, we should try to update it in-place,
+     so that there is only one (definitive) arm::new attribute on the decl.  */
+  tree old_attr = lookup_attribute ("arm", "new", DECL_ATTRIBUTES (decl));
+  if (!aarch64_merge_string_arguments (args, old_attr, true))
+    *no_add_attrs = true;
+
+  return NULL_TREE;
+}
+
+/* Callback for arm::{in,out,inout,preserves} attributes.  */
+static tree
+handle_arm_shared (tree *node, tree name, tree args,
+                   int, bool *no_add_attrs)
+{
+  tree type = *node;
+  tree old_attrs = TYPE_ATTRIBUTES (type);
+  auto flags = aarch64_attribute_shared_state_flags (IDENTIFIER_POINTER (name));
+  for (tree arg = args; arg; arg = TREE_CHAIN (arg))
+    {
+      tree value = TREE_VALUE (arg);
+      if (aarch64_check_state_string (name, value))
+        {
+          const char *state_name = TREE_STRING_POINTER (value);
+          auto old_flags = aarch64_lookup_shared_state_flags (old_attrs,
+                                                              state_name);
+          if (old_flags && old_flags != flags)
+            {
+              error ("inconsistent attributes for state %qs", state_name);
+              *no_add_attrs = true;
+              return NULL_TREE;
+            }
+        }
+    }
+
+  /* We can't update an old attribute in-place, since types are shared.
+     Instead make sure that this new attribute contains all the
+     information, so that the old attribute becomes redundant.  */
+  tree old_attr = lookup_attribute ("arm", IDENTIFIER_POINTER (name),
+                                    old_attrs);
+  if (!aarch64_merge_string_arguments (args, old_attr, false))
+    *no_add_attrs = true;
+
+  return NULL_TREE;
+}
+
 /* Mutually-exclusive function type attributes for controlling PSTATE.SM.  */
 static const struct attribute_spec::exclusions attr_streaming_exclusions[] =
 {
@@ -502,6 +762,16 @@ static const attribute_spec aarch64_arm_attributes[] =
                           NULL, attr_streaming_exclusions },
   { "streaming_compatible", 0, 0, false, true,  true,  true,
                           NULL, attr_streaming_exclusions },
+  { "new",                1, -1, true, false, false, false,
+                          handle_arm_new, NULL },
+  { "preserves",          1, -1, false, true,  true,  true,
+                          handle_arm_shared, NULL },
+  { "in",                 1, -1, false, true,  true,  true,
+                          handle_arm_shared, NULL },
+  { "out",                1, -1, false, true,  true,  true,
+                          handle_arm_shared, NULL },
+  { "inout",              1, -1, false, true,  true,  true,
+                          handle_arm_shared, NULL }
 };
 
 static const scoped_attribute_specs aarch64_arm_attribute_table =
@@ -1616,6 +1886,7 @@ aarch64_hard_regno_nregs (unsigned regno, machine_mode mode)
     case PR_HI_REGS:
     case FFR_REGS:
     case PR_AND_FFR_REGS:
+    case FAKE_REGS:
       return 1;
     default:
       return CEIL (lowest_size, UNITS_PER_WORD);
@@ -1646,6 +1917,10 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode)
   if (pr_or_ffr_regnum_p (regno))
     return false;
 
+  /* These registers are abstract; their modes don't matter.  */
+  if (FAKE_REGNUM_P (regno))
+    return true;
+
   if (regno == SP_REGNUM)
     /* The purpose of comparing with ptr_mode is to support the
        global register variable associated with the stack pointer
@@ -1766,12 +2041,34 @@ aarch64_fntype_pstate_sm (const_tree fntype)
   return AARCH64_FL_SM_OFF;
 }
 
+/* Return state flags that describe whether and how functions of type
+   FNTYPE share state STATE_NAME with their callers.  */
+
+static unsigned int
+aarch64_fntype_shared_flags (const_tree fntype, const char *state_name)
+{
+  return aarch64_lookup_shared_state_flags (TYPE_ATTRIBUTES (fntype),
+                                            state_name);
+}
+
+/* Return the state of PSTATE.ZA on entry to functions of type FNTYPE.  */
+
+static aarch64_feature_flags
+aarch64_fntype_pstate_za (const_tree fntype)
+{
+  if (aarch64_fntype_shared_flags (fntype, "za"))
+    return AARCH64_FL_ZA_ON;
+
+  return 0;
+}
+
 /* Return the ISA mode on entry to functions of type FNTYPE.  */
 
 static aarch64_feature_flags
 aarch64_fntype_isa_mode (const_tree fntype)
 {
-  return aarch64_fntype_pstate_sm (fntype);
+  return (aarch64_fntype_pstate_sm (fntype)
+          | aarch64_fntype_pstate_za (fntype));
 }
 
 /* Return the state of PSTATE.SM when compiling the body of
@@ -1784,13 +2081,37 @@ aarch64_fndecl_pstate_sm (const_tree fndecl)
   return aarch64_fntype_pstate_sm (TREE_TYPE (fndecl));
 }
 
+/* Return true if function FNDECL has state STATE_NAME, either by creating
+   new state itself or by sharing state with callers.  */
+
+static bool
+aarch64_fndecl_has_state (tree fndecl, const char *state_name)
+{
+  return (aarch64_fndecl_has_new_state (fndecl, state_name)
+          || aarch64_fntype_shared_flags (TREE_TYPE (fndecl),
+                                          state_name) != 0);
+}
+
+/* Return the state of PSTATE.ZA when compiling the body of function FNDECL.
+   This might be different from the state of PSTATE.ZA on entry.  */
+
+static aarch64_feature_flags
+aarch64_fndecl_pstate_za (const_tree fndecl)
+{
+  if (aarch64_fndecl_has_new_state (fndecl, "za"))
+    return AARCH64_FL_ZA_ON;
+
+  return aarch64_fntype_pstate_za (TREE_TYPE (fndecl));
+}
+
 /* Return the ISA mode that should be used to compile the body of
    function FNDECL.  */
 
 static aarch64_feature_flags
 aarch64_fndecl_isa_mode (const_tree fndecl)
 {
-  return aarch64_fndecl_pstate_sm (fndecl);
+  return (aarch64_fndecl_pstate_sm (fndecl)
+          | aarch64_fndecl_pstate_za (fndecl));
 }
 
 /* Return the state of PSTATE.SM on entry to the current function.
@@ -1803,6 +2124,44 @@ aarch64_cfun_incoming_pstate_sm () return aarch64_fntype_pstate_sm (TREE_TYPE (cfun->decl)); } +/* Return the state of PSTATE.ZA on entry to the current function. + This might be different from the state of PSTATE.ZA in the function + body. */ + +static aarch64_feature_flags +aarch64_cfun_incoming_pstate_za () +{ + return aarch64_fntype_pstate_za (TREE_TYPE (cfun->decl)); +} + +/* Return state flags that describe whether and how the current function shares + state STATE_NAME with callers. */ + +static unsigned int +aarch64_cfun_shared_flags (const char *state_name) +{ + return aarch64_fntype_shared_flags (TREE_TYPE (cfun->decl), state_name); +} + +/* Return true if the current function creates new state of type STATE_NAME + (as opposed to sharing the state with its callers or ignoring the state + altogether). */ + +static bool +aarch64_cfun_has_new_state (const char *state_name) +{ + return aarch64_fndecl_has_new_state (cfun->decl, state_name); +} + +/* Return true if the current function has state STATE_NAME, either by + creating new state itself or by sharing state with callers. */ + +static bool +aarch64_cfun_has_state (const char *state_name) +{ + return aarch64_fndecl_has_state (cfun->decl, state_name); +} + /* Return true if a call from the current function to a function with ISA mode CALLEE_MODE would involve a change to PSTATE.SM around the BL instruction. */ @@ -3366,6 +3725,74 @@ aarch64_output_sve_vector_inc_dec (const char *operands, rtx x) factor, nelts_per_vq); } +/* Return a constant that represents FACTOR multiplied by the + number of 128-bit quadwords in an SME vector. ISA_MODE is the + ISA mode in which the calculation is being performed. */ + +static rtx +aarch64_sme_vq_immediate (machine_mode mode, HOST_WIDE_INT factor, + aarch64_feature_flags isa_mode) +{ + gcc_assert (aarch64_sve_rdvl_factor_p (factor)); + if (isa_mode & AARCH64_FL_SM_ON) + /* We're in streaming mode, so we can use normal poly-int values. 
*/ + return gen_int_mode ({ factor, factor }, mode); + + rtvec vec = gen_rtvec (1, gen_int_mode (factor, SImode)); + rtx unspec = gen_rtx_UNSPEC (mode, vec, UNSPEC_SME_VQ); + return gen_rtx_CONST (mode, unspec); +} + +/* Return true if X is a constant that represents some number Y + multiplied by the number of quadwords in an SME vector. Store this Y + in *FACTOR if so. */ + +static bool +aarch64_sme_vq_unspec_p (const_rtx x, HOST_WIDE_INT *factor) +{ + if (!TARGET_SME || GET_CODE (x) != CONST) + return false; + + x = XEXP (x, 0); + if (GET_CODE (x) != UNSPEC + || XINT (x, 1) != UNSPEC_SME_VQ + || XVECLEN (x, 0) != 1) + return false; + + x = XVECEXP (x, 0, 0); + if (!CONST_INT_P (x)) + return false; + + *factor = INTVAL (x); + return true; +} + +/* Return true if X is a constant that represents some number Y + multiplied by the number of quadwords in an SME vector, and if + that Y is in the range of RDSVL. */ + +bool +aarch64_rdsvl_immediate_p (const_rtx x) +{ + HOST_WIDE_INT factor; + return (aarch64_sme_vq_unspec_p (x, &factor) + && aarch64_sve_rdvl_factor_p (factor)); +} + +/* Return the asm string for an RDSVL instruction that calculates X, + which is a constant that satisfies aarch64_rdsvl_immediate_p. */ + +char * +aarch64_output_rdsvl (const_rtx x) +{ + gcc_assert (aarch64_rdsvl_immediate_p (x)); + static char buffer[sizeof ("rdsvl\t%x0, #-") + 3 * sizeof (int)]; + x = XVECEXP (XEXP (x, 0), 0, 0); + snprintf (buffer, sizeof (buffer), "rdsvl\t%%x0, #%d", + (int) INTVAL (x) / 16); + return buffer; +} + /* Multipliers for repeating bitmasks of width 32, 16, 8, 4, and 2. */ static const unsigned HOST_WIDE_INT bitmask_imm_mul[] = @@ -5181,6 +5608,15 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) return; } + if (aarch64_rdsvl_immediate_p (base)) + { + /* We could handle non-constant offsets if they are ever + generated.
*/ + gcc_assert (const_offset == 0); + emit_insn (gen_rtx_SET (dest, imm)); + return; + } + sty = aarch64_classify_symbol (base, const_offset); switch (sty) { @@ -6327,8 +6763,10 @@ aarch64_function_arg (cumulative_args_t pcum_v, const function_arg_info &arg) rtx abi_cookie = aarch64_gen_callee_cookie (pcum->isa_mode, pcum->pcs_variant); rtx sme_mode_switch_args = aarch64_finish_sme_mode_switch_args (pcum); - return gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, abi_cookie, - sme_mode_switch_args)); + rtx shared_za_flags = gen_int_mode (pcum->shared_za_flags, SImode); + return gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, abi_cookie, + sme_mode_switch_args, + shared_za_flags)); } aarch64_layout_arg (pcum_v, arg); @@ -6339,7 +6777,7 @@ void aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, const_tree fntype, rtx libname ATTRIBUTE_UNUSED, - const_tree fndecl ATTRIBUTE_UNUSED, + const_tree fndecl, unsigned n_named ATTRIBUTE_UNUSED, bool silent_p) { @@ -6364,6 +6802,8 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, pcum->aapcs_stack_words = 0; pcum->aapcs_stack_size = 0; pcum->silent_p = silent_p; + pcum->shared_za_flags + = (fntype ? aarch64_fntype_shared_flags (fntype, "za") : 0U); pcum->num_sme_mode_switch_args = 0; if (!silent_p @@ -8444,14 +8884,31 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, } } +/* Implement TARGET_EXTRA_LIVE_ON_ENTRY. */ + +void +aarch64_extra_live_on_entry (bitmap regs) +{ + if (TARGET_ZA) + { + bitmap_set_bit (regs, LOWERING_REGNUM); + bitmap_set_bit (regs, SME_STATE_REGNUM); + bitmap_set_bit (regs, TPIDR2_SETUP_REGNUM); + bitmap_set_bit (regs, ZA_FREE_REGNUM); + bitmap_set_bit (regs, ZA_SAVED_REGNUM); + + /* The only time ZA can't have live contents on entry is when + the function explicitly treats it as a pure output. 
*/ + auto za_flags = aarch64_cfun_shared_flags ("za"); + if (za_flags != (AARCH64_STATE_SHARED | AARCH64_STATE_OUT)) + bitmap_set_bit (regs, ZA_REGNUM); + } +} + /* Return 1 if the register is used by the epilogue. We need to say the return register is used, but only after epilogue generation is complete. Note that in the case of sibcalls, the values "used by the epilogue" are - considered live at the start of the called function. - - For SIMD functions we need to return 1 for FP registers that are saved and - restored by a function but are not zero in call_used_regs. If we do not do - this optimizations may remove the restore of the register. */ + considered live at the start of the called function. */ int aarch64_epilogue_uses (int regno) @@ -8461,6 +8918,18 @@ aarch64_epilogue_uses (int regno) if (regno == LR_REGNUM) return 1; } + if (regno == LOWERING_REGNUM && TARGET_ZA) + return 1; + if (regno == SME_STATE_REGNUM && TARGET_ZA) + return 1; + if (regno == TPIDR2_SETUP_REGNUM && TARGET_ZA) + return 1; + /* If the function shares SME state with its caller, ensure that that + data is not in the lazy save buffer on exit. */ + if (regno == ZA_SAVED_REGNUM && aarch64_cfun_incoming_pstate_za () != 0) + return 1; + if (regno == ZA_REGNUM && aarch64_cfun_shared_flags ("za") != 0) + return 1; return 0; } @@ -9119,8 +9588,10 @@ aarch64_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x) /* There's no way to calculate VL-based values using relocations. */ subrtx_iterator::array_type array; + HOST_WIDE_INT factor; FOR_EACH_SUBRTX (iter, array, x, ALL) - if (GET_CODE (*iter) == CONST_POLY_INT) + if (GET_CODE (*iter) == CONST_POLY_INT + || aarch64_sme_vq_unspec_p (x, &factor)) return true; poly_int64 offset; @@ -9983,6 +10454,72 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2) return true; } +/* Return a fresh memory reference to the current function's TPIDR2 block, + creating a block if necessary. 
*/ + +static rtx +aarch64_get_tpidr2_block () +{ + if (!cfun->machine->tpidr2_block) + /* The TPIDR2 block is 16 bytes in size and must be aligned to a 128-bit + boundary. */ + cfun->machine->tpidr2_block = assign_stack_local (V16QImode, 16, 128); + return copy_rtx (cfun->machine->tpidr2_block); +} + +/* Return a fresh register that points to the current function's + TPIDR2 block, creating a block if necessary. */ + +static rtx +aarch64_get_tpidr2_ptr () +{ + rtx block = aarch64_get_tpidr2_block (); + return force_reg (Pmode, XEXP (block, 0)); +} + +/* Emit instructions to allocate a ZA lazy save buffer and initialize the + current function's TPIDR2 block. */ + +static void +aarch64_init_tpidr2_block () +{ + rtx block = aarch64_get_tpidr2_block (); + + /* The ZA save buffer is SVL.B*SVL.B bytes in size. */ + rtx svl_bytes = aarch64_sme_vq_immediate (Pmode, 16, AARCH64_ISA_MODE); + rtx svl_bytes_reg = force_reg (DImode, svl_bytes); + rtx za_size = expand_simple_binop (Pmode, MULT, svl_bytes_reg, + svl_bytes_reg, NULL, 0, OPTAB_LIB_WIDEN); + rtx za_save_buffer = allocate_dynamic_stack_space (za_size, 128, + BITS_PER_UNIT, -1, true); + za_save_buffer = force_reg (Pmode, za_save_buffer); + cfun->machine->za_save_buffer = za_save_buffer; + + /* The first word of the block points to the save buffer and the second + word is the number of ZA slices to save. */ + rtx block_0 = adjust_address (block, DImode, 0); + rtx block_8 = adjust_address (block, DImode, 8); + emit_insn (gen_store_pair_dw_didi (block_0, za_save_buffer, + block_8, svl_bytes_reg)); + + if (!memory_operand (block, V16QImode)) + block = replace_equiv_address (block, force_reg (Pmode, XEXP (block, 0))); + emit_insn (gen_aarch64_setup_local_tpidr2 (block)); +} + +/* Restore the contents of ZA from the lazy save buffer, given that + register TPIDR2_BLOCK points to the current function's TPIDR2 block. + PSTATE.ZA is known to be 0 and TPIDR2_EL0 is known to be null. 
*/ + +void +aarch64_restore_za (rtx tpidr2_block) +{ + emit_insn (gen_aarch64_smstart_za ()); + if (REGNO (tpidr2_block) != R0_REGNUM) + emit_move_insn (gen_rtx_REG (Pmode, R0_REGNUM), tpidr2_block); + emit_insn (gen_aarch64_tpidr2_restore ()); +} + /* Implement TARGET_START_CALL_ARGS. */ static void @@ -9998,6 +10535,20 @@ aarch64_start_call_args (cumulative_args_t ca_v) " option %<-march%>, or by using the %" " attribute or pragma", "sme"); } + + if ((ca->shared_za_flags & (AARCH64_STATE_IN | AARCH64_STATE_OUT)) + && !aarch64_cfun_has_state ("za")) + error ("call to a function that shares %qs state from a function" + " that has no %qs state", "za", "za"); + else if (!TARGET_ZA && (ca->isa_mode & AARCH64_FL_ZA_ON)) + error ("call to a function that shares SME state from a function" + " that has no SME state"); + + /* If this is a call to a private ZA function, emit a marker to + indicate where any necessary set-up code could be inserted. + The code itself is inserted by the mode-switching pass. */ + if (TARGET_ZA && !(ca->isa_mode & AARCH64_FL_ZA_ON)) + emit_insn (gen_aarch64_start_private_za_call ()); } /* This function is used by the call expanders of the machine description. @@ -10010,6 +10561,8 @@ aarch64_start_call_args (cumulative_args_t ca_v) The second element is a PARALLEL that lists all the argument registers that need to be saved and restored around a change in PSTATE.SM, or const0_rtx if no such switch is needed. + The third element is a const_int that contains the sharing flags + for ZA. SIBCALL indicates whether this function call is normal call or sibling call. It will generate different pattern accordingly. 
*/ @@ -10022,10 +10575,12 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall) rtx callee_abi = cookie; rtx sme_mode_switch_args = const0_rtx; + unsigned int shared_za_flags = 0; if (GET_CODE (cookie) == PARALLEL) { callee_abi = XVECEXP (cookie, 0, 0); sme_mode_switch_args = XVECEXP (cookie, 0, 1); + shared_za_flags = INTVAL (XVECEXP (cookie, 0, 2)); } gcc_assert (CONST_INT_P (callee_abi)); @@ -10045,6 +10600,41 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall) : !REG_P (callee)) XEXP (mem, 0) = force_reg (mode, callee); + /* Accumulate the return values, including state that is shared via + attributes. */ + auto_vec return_values; + if (result) + { + if (GET_CODE (result) == PARALLEL) + for (int i = 0; i < XVECLEN (result, 0); ++i) + return_values.safe_push (XVECEXP (result, 0, i)); + else + return_values.safe_push (result); + } + unsigned int orig_num_return_values = return_values.length (); + if (shared_za_flags & AARCH64_STATE_OUT) + return_values.safe_push (gen_rtx_REG (VNx16BImode, ZA_REGNUM)); + /* When calling private-ZA functions from functions with ZA state, + we want to know whether the call committed a lazy save. */ + if (TARGET_ZA && !shared_za_flags) + return_values.safe_push (gen_rtx_REG (VNx16BImode, ZA_SAVED_REGNUM)); + + /* Create the new return value, if necessary. */ + if (orig_num_return_values != return_values.length ()) + { + if (return_values.length () == 1) + result = return_values[0]; + else + { + for (rtx &x : return_values) + if (GET_CODE (x) != EXPR_LIST) + x = gen_rtx_EXPR_LIST (VOIDmode, x, const0_rtx); + rtvec v = gen_rtvec_v (return_values.length (), + return_values.address ()); + result = gen_rtx_PARALLEL (VOIDmode, v); + } + } + call = gen_rtx_CALL (VOIDmode, mem, const0_rtx); if (result != NULL_RTX) @@ -10111,6 +10701,50 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall) cfun->machine->call_switches_pstate_sm = true; } + + /* Add any ZA-related information. 
+ ZA_REGNUM represents the current function's ZA state, rather than + the contents of the ZA register itself. We ensure that the function's + ZA state is preserved by private-ZA call sequences, so the call itself + does not use or clobber ZA_REGNUM. */ + if (TARGET_ZA) + { + /* The callee requires ZA to be active if the callee is shared-ZA, + otherwise it requires ZA to be dormant or off. The state of ZA is + captured by a combination of SME_STATE_REGNUM, TPIDR2_SETUP_REGNUM, + and ZA_SAVED_REGNUM. */ + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (DImode, SME_STATE_REGNUM)); + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (DImode, TPIDR2_SETUP_REGNUM)); + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (VNx16BImode, ZA_SAVED_REGNUM)); + + /* Keep the aarch64_start/end_private_za_call markers live. */ + if (!(callee_isa_mode & AARCH64_FL_ZA_ON)) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (VNx16BImode, LOWERING_REGNUM)); + + /* If the callee is a shared-ZA function, record whether it uses the + current value of ZA. */ + if (shared_za_flags & AARCH64_STATE_IN) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (VNx16BImode, ZA_REGNUM)); + } +} + +/* Implement TARGET_END_CALL_ARGS. */ + +static void +aarch64_end_call_args (cumulative_args_t ca_v) +{ + CUMULATIVE_ARGS *ca = get_cumulative_args (ca_v); + + /* If this is a call to a private ZA function, emit a marker to + indicate where any necessary restoration code could be inserted. + The code itself is inserted by the mode-switching pass. */ + if (TARGET_ZA && !(ca->isa_mode & AARCH64_FL_ZA_ON)) + emit_insn (gen_aarch64_end_private_za_call ()); } /* Emit call insn with PAT and do aarch64-specific handling. 
*/ @@ -11348,6 +11982,9 @@ aarch64_regno_regclass (unsigned regno) if (regno == FFR_REGNUM || regno == FFRT_REGNUM) return FFR_REGS; + if (FAKE_REGNUM_P (regno)) + return FAKE_REGS; + return NO_REGS; } @@ -11703,12 +12340,14 @@ aarch64_class_max_nregs (reg_class_t regclass, machine_mode mode) return (vec_flags & VEC_ADVSIMD ? CEIL (lowest_size, UNITS_PER_VREG) : CEIL (lowest_size, UNITS_PER_WORD)); + case STACK_REG: case PR_REGS: case PR_LO_REGS: case PR_HI_REGS: case FFR_REGS: case PR_AND_FFR_REGS: + case FAKE_REGS: return 1; case NO_REGS: @@ -16894,10 +17533,14 @@ aarch64_override_options_internal (struct gcc_options *opts) && !fixed_regs[R18_REGNUM]) error ("%<-fsanitize=shadow-call-stack%> requires %<-ffixed-x18%>"); - if ((opts->x_aarch64_isa_flags & AARCH64_FL_SM_ON) + if ((opts->x_aarch64_isa_flags & (AARCH64_FL_SM_ON | AARCH64_FL_ZA_ON)) && !(opts->x_aarch64_isa_flags & AARCH64_FL_SME)) { - error ("streaming functions require the ISA extension %qs", "sme"); + if (opts->x_aarch64_isa_flags & AARCH64_FL_SM_ON) + error ("streaming functions require the ISA extension %qs", "sme"); + else + error ("functions with SME state require the ISA extension %qs", + "sme"); inform (input_location, "you can enable %qs using the command-line" " option %<-march%>, or by using the %" " attribute or pragma", "sme"); @@ -19161,6 +19804,8 @@ aarch64_conditional_register_usage (void) CLEAR_HARD_REG_BIT (operand_reg_set, VG_REGNUM); CLEAR_HARD_REG_BIT (operand_reg_set, FFR_REGNUM); CLEAR_HARD_REG_BIT (operand_reg_set, FFRT_REGNUM); + for (int i = FIRST_FAKE_REGNUM; i <= LAST_FAKE_REGNUM; ++i) + CLEAR_HARD_REG_BIT (operand_reg_set, i); /* When tracking speculation, we need a couple of call-clobbered registers to track the speculation state. 
It would be nice to just use @@ -20625,6 +21270,9 @@ aarch64_mov_operand_p (rtx x, machine_mode mode) || aarch64_sve_rdvl_immediate_p (x))) return true; + if (aarch64_rdsvl_immediate_p (x)) + return true; + return aarch64_classify_symbolic_expression (x) == SYMBOL_TINY_ABSOLUTE; } @@ -26321,9 +26969,45 @@ aarch64_comp_type_attributes (const_tree type1, const_tree type2) return 0; if (!check_attr ("arm", "streaming_compatible")) return 0; + if (aarch64_lookup_shared_state_flags (TYPE_ATTRIBUTES (type1), "za") + != aarch64_lookup_shared_state_flags (TYPE_ATTRIBUTES (type2), "za")) + return 0; return 1; } +/* Implement TARGET_MERGE_DECL_ATTRIBUTES. */ + +static tree +aarch64_merge_decl_attributes (tree olddecl, tree newdecl) +{ + tree old_attrs = DECL_ATTRIBUTES (olddecl); + tree old_new = lookup_attribute ("arm", "new", old_attrs); + + tree new_attrs = DECL_ATTRIBUTES (newdecl); + tree new_new = lookup_attribute ("arm", "new", new_attrs); + + if (DECL_INITIAL (olddecl) && new_new) + { + error ("cannot apply attribute %qs to %q+D after the function" + " has been defined", "new", newdecl); + inform (DECL_SOURCE_LOCATION (olddecl), "%q+D defined here", + newdecl); + } + else + { + if (old_new && new_new) + { + old_attrs = remove_attribute ("arm", "new", old_attrs); + TREE_VALUE (new_new) = chainon (TREE_VALUE (new_new), + TREE_VALUE (old_new)); + } + if (new_new) + aarch64_check_arm_new_against_type (TREE_VALUE (new_new), newdecl); + } + + return merge_attributes (old_attrs, new_attrs); +} + /* Implement TARGET_GET_MULTILIB_ABI_NAME */ static const char * @@ -26748,6 +27432,629 @@ aarch64_pars_overlap_p (rtx par1, rtx par2) return false; } +/* Implement OPTIMIZE_MODE_SWITCHING. 
*/ + +bool +aarch64_optimize_mode_switching (aarch64_mode_entity entity) +{ + bool have_sme_state = (aarch64_cfun_incoming_pstate_za () != 0 + || (aarch64_cfun_has_new_state ("za") + && df_regs_ever_live_p (ZA_REGNUM))); + + if (have_sme_state && nonlocal_goto_handler_labels) + { + static bool reported; + if (!reported) + { + sorry ("non-local gotos in functions with SME state"); + reported = true; + } + } + + switch (entity) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + case aarch64_mode_entity::LOCAL_SME_STATE: + return have_sme_state && !nonlocal_goto_handler_labels; + } + gcc_unreachable (); +} + +/* Implement TARGET_MODE_EMIT for ZA_SAVE_BUFFER. */ + +static void +aarch64_mode_emit_za_save_buffer (aarch64_tristate_mode mode, + aarch64_tristate_mode prev_mode) +{ + if (mode == aarch64_tristate_mode::YES) + { + gcc_assert (prev_mode == aarch64_tristate_mode::NO); + aarch64_init_tpidr2_block (); + } + else + gcc_unreachable (); +} + +/* Implement TARGET_MODE_EMIT for LOCAL_SME_STATE. */ + +static void +aarch64_mode_emit_local_sme_state (aarch64_local_sme_state mode, + aarch64_local_sme_state prev_mode) +{ + /* Back-propagation should ensure that we're always starting from + a known mode. */ + gcc_assert (prev_mode != aarch64_local_sme_state::ANY); + + if (prev_mode == aarch64_local_sme_state::INACTIVE_CALLER) + { + /* Commit any uncommitted lazy save. This leaves ZA either active + and zero (lazy save case) or off (normal case). 
+ + The sequence is: + + mrs , tpidr2_el0 + cbz , no_save + bl __arm_tpidr2_save + msr tpidr2_el0, xzr + zero { za } // Only if ZA is live + no_save: */ + bool is_active = (mode == aarch64_local_sme_state::ACTIVE_LIVE + || mode == aarch64_local_sme_state::ACTIVE_DEAD); + auto tmp_reg = gen_reg_rtx (DImode); + auto active_flag = gen_int_mode (is_active, DImode); + emit_insn (gen_aarch64_read_tpidr2 (tmp_reg)); + emit_insn (gen_aarch64_commit_lazy_save (tmp_reg, active_flag)); + } + + if (mode == aarch64_local_sme_state::ACTIVE_LIVE + || mode == aarch64_local_sme_state::ACTIVE_DEAD) + { + if (prev_mode == aarch64_local_sme_state::INACTIVE_LOCAL) + { + /* Make ZA active after being inactive. + + First handle the case in which the lazy save we set up was + committed by a callee. If the function's source-level ZA state + is live then we must conditionally restore it from the lazy + save buffer. Otherwise we can just force PSTATE.ZA to 1. */ + if (mode == aarch64_local_sme_state::ACTIVE_LIVE) + emit_insn (gen_aarch64_restore_za (aarch64_get_tpidr2_ptr ())); + else + emit_insn (gen_aarch64_smstart_za ()); + + /* Now handle the case in which the lazy save was not committed. + In that case, ZA still contains the current function's ZA state, + and we just need to cancel the lazy save. */ + emit_insn (gen_aarch64_clear_tpidr2 ()); + return; + } + + if (prev_mode == aarch64_local_sme_state::SAVED_LOCAL) + { + /* Retrieve the current function's ZA state from the lazy save + buffer. */ + aarch64_restore_za (aarch64_get_tpidr2_ptr ()); + return; + } + + if (prev_mode == aarch64_local_sme_state::INACTIVE_CALLER + || prev_mode == aarch64_local_sme_state::OFF) + { + /* INACTIVE_CALLER means that we are enabling ZA for the first + time in this function. The code above means that ZA is either + active and zero (if we committed a lazy save) or off. Handle + the latter case by forcing ZA on. + + OFF means that PSTATE.ZA is guaranteed to be 0. We just need + to force it to 1. 
+ + Both cases leave ZA zeroed. */ + emit_insn (gen_aarch64_smstart_za ()); + return; + } + + if (prev_mode == aarch64_local_sme_state::ACTIVE_DEAD + || prev_mode == aarch64_local_sme_state::ACTIVE_LIVE) + /* A simple change in liveness, such as in a CFG structure where + ZA is only conditionally defined. No code is needed. */ + return; + + gcc_unreachable (); + } + + if (mode == aarch64_local_sme_state::INACTIVE_LOCAL) + { + if (prev_mode == aarch64_local_sme_state::ACTIVE_LIVE + || prev_mode == aarch64_local_sme_state::ACTIVE_DEAD + || prev_mode == aarch64_local_sme_state::INACTIVE_CALLER) + { + /* A transition from ACTIVE_LIVE to INACTIVE_LOCAL is the usual + case of setting up a lazy save buffer before a call. + A transition from INACTIVE_CALLER is similar, except that + the contents of ZA are known to be zero. + + A transition from ACTIVE_DEAD means that ZA is live at the + point of the transition, but is dead on at least one incoming + edge. (That is, ZA is only conditionally initialized.) + For efficiency, we want to set up a lazy save even for + dead contents, since forcing ZA off would make later code + restore ZA from the lazy save buffer. */ + emit_insn (gen_aarch64_write_tpidr2 (aarch64_get_tpidr2_ptr ())); + return; + } + + if (prev_mode == aarch64_local_sme_state::SAVED_LOCAL + || prev_mode == aarch64_local_sme_state::OFF) + /* We're simply discarding the information about which inactive + state applies. */ + return; + + gcc_unreachable (); + } + + if (mode == aarch64_local_sme_state::INACTIVE_CALLER + || mode == aarch64_local_sme_state::OFF) + { + /* The transition to INACTIVE_CALLER is used before returning from + new("za") functions. Any state in ZA belongs to the current + function rather than a caller, but that state is no longer + needed. Clear any pending lazy save and turn ZA off. + + The transition to OFF is used before calling a private-ZA function. 
+ We committed any incoming lazy save above, so at this point any + contents in ZA belong to the current function. */ + if (prev_mode == aarch64_local_sme_state::INACTIVE_LOCAL) + emit_insn (gen_aarch64_clear_tpidr2 ()); + + if (prev_mode != aarch64_local_sme_state::OFF + && prev_mode != aarch64_local_sme_state::SAVED_LOCAL) + emit_insn (gen_aarch64_smstop_za ()); + + return; + } + + if (mode == aarch64_local_sme_state::SAVED_LOCAL) + { + /* This is a transition to an exception handler. */ + gcc_assert (prev_mode == aarch64_local_sme_state::OFF + || prev_mode == aarch64_local_sme_state::INACTIVE_LOCAL); + return; + } + + gcc_unreachable (); +} + +/* Implement TARGET_MODE_EMIT. */ + +static void +aarch64_mode_emit (int entity, int mode, int prev_mode, HARD_REG_SET live) +{ + if (mode == prev_mode) + return; + + start_sequence (); + switch (aarch64_mode_entity (entity)) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + aarch64_mode_emit_za_save_buffer (aarch64_tristate_mode (mode), + aarch64_tristate_mode (prev_mode)); + break; + + case aarch64_mode_entity::LOCAL_SME_STATE: + aarch64_mode_emit_local_sme_state (aarch64_local_sme_state (mode), + aarch64_local_sme_state (prev_mode)); + break; + } + rtx_insn *seq = get_insns (); + end_sequence (); + + /* Get the set of clobbered registers that are currently live. */ + HARD_REG_SET clobbers = {}; + for (rtx_insn *insn = seq; insn; insn = NEXT_INSN (insn)) + { + vec_rtx_properties properties; + properties.add_insn (insn, false); + for (rtx_obj_reference ref : properties.refs ()) + if (ref.is_write () && HARD_REGISTER_NUM_P (ref.regno)) + SET_HARD_REG_BIT (clobbers, ref.regno); + } + clobbers &= live; + + /* Emit instructions to save clobbered registers to pseudos. Queue + instructions to restore the registers afterwards. + + This should only be needed in rare situations.
*/ + auto_vec after; + for (unsigned int regno = R0_REGNUM; regno < R30_REGNUM; ++regno) + if (TEST_HARD_REG_BIT (clobbers, regno)) + { + rtx hard_reg = gen_rtx_REG (DImode, regno); + rtx pseudo_reg = gen_reg_rtx (DImode); + emit_move_insn (pseudo_reg, hard_reg); + after.quick_push (gen_move_insn (hard_reg, pseudo_reg)); + } + if (TEST_HARD_REG_BIT (clobbers, CC_REGNUM)) + { + rtx pseudo_reg = gen_reg_rtx (DImode); + emit_insn (gen_aarch64_save_nzcv (pseudo_reg)); + after.quick_push (gen_aarch64_restore_nzcv (pseudo_reg)); + } + + /* Emit the transition instructions themselves. */ + emit_insn (seq); + + /* Restore the clobbered registers. */ + for (auto *insn : after) + emit_insn (insn); +} + +/* Return true if INSN references the SME state represented by hard register + REGNO. */ + +static bool +aarch64_insn_references_sme_state_p (rtx_insn *insn, unsigned int regno) +{ + df_ref ref; + FOR_EACH_INSN_DEF (ref, insn) + if (!DF_REF_FLAGS_IS_SET (ref, DF_REF_MUST_CLOBBER) + && DF_REF_REGNO (ref) == regno) + return true; + FOR_EACH_INSN_USE (ref, insn) + if (DF_REF_REGNO (ref) == regno) + return true; + return false; +} + +/* Implement TARGET_MODE_NEEDED for LOCAL_SME_STATE. */ + +static aarch64_local_sme_state +aarch64_mode_needed_local_sme_state (rtx_insn *insn, HARD_REG_SET live) +{ + if (!CALL_P (insn) + && find_reg_note (insn, REG_EH_REGION, NULL_RTX)) + { + static bool reported; + if (!reported) + { + sorry ("catching non-call exceptions in functions with SME state"); + reported = true; + } + /* Aim for graceful error recovery by picking the value that is + least likely to generate an ICE. */ + return aarch64_local_sme_state::INACTIVE_LOCAL; + } + + /* A non-local goto is equivalent to a return. We disallow non-local + receivers in functions with SME state, so we know that the target + expects ZA to be dormant or off. 
*/ + if (JUMP_P (insn) + && find_reg_note (insn, REG_NON_LOCAL_GOTO, NULL_RTX)) + return aarch64_local_sme_state::INACTIVE_CALLER; + + /* start_private_za_call and end_private_za_call bracket a sequence + that calls a private-ZA function. Force ZA to be turned off if the + function doesn't have any live ZA state, otherwise require ZA to be + inactive. */ + auto icode = recog_memoized (insn); + if (icode == CODE_FOR_aarch64_start_private_za_call + || icode == CODE_FOR_aarch64_end_private_za_call) + return (TEST_HARD_REG_BIT (live, ZA_REGNUM) + ? aarch64_local_sme_state::INACTIVE_LOCAL + : aarch64_local_sme_state::OFF); + + /* Force ZA to contain the current function's ZA state if INSN wants + to access it. */ + if (aarch64_insn_references_sme_state_p (insn, ZA_REGNUM)) + return (TEST_HARD_REG_BIT (live, ZA_REGNUM) + ? aarch64_local_sme_state::ACTIVE_LIVE + : aarch64_local_sme_state::ACTIVE_DEAD); + + return aarch64_local_sme_state::ANY; +} + +/* Implement TARGET_MODE_NEEDED for ZA_SAVE_BUFFER. */ + +static aarch64_tristate_mode +aarch64_mode_needed_za_save_buffer (rtx_insn *insn, HARD_REG_SET live) +{ + /* We need to set up a lazy save buffer no later than the first + transition to INACTIVE_LOCAL (which involves setting up a lazy save). */ + if (aarch64_mode_needed_local_sme_state (insn, live) + == aarch64_local_sme_state::INACTIVE_LOCAL) + return aarch64_tristate_mode::YES; + + /* Also make sure that the lazy save buffer is set up before the first + insn that throws internally. The exception handler will sometimes + load from it. */ + if (find_reg_note (insn, REG_EH_REGION, NULL_RTX)) + return aarch64_tristate_mode::YES; + + return aarch64_tristate_mode::MAYBE; +} + +/* Implement TARGET_MODE_NEEDED. 
*/ + +static int +aarch64_mode_needed (int entity, rtx_insn *insn, HARD_REG_SET live) +{ + switch (aarch64_mode_entity (entity)) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + return int (aarch64_mode_needed_za_save_buffer (insn, live)); + + case aarch64_mode_entity::LOCAL_SME_STATE: + return int (aarch64_mode_needed_local_sme_state (insn, live)); + } + gcc_unreachable (); +} + +/* Implement TARGET_MODE_AFTER for LOCAL_SME_STATE. */ + +static aarch64_local_sme_state +aarch64_mode_after_local_sme_state (aarch64_local_sme_state mode, + HARD_REG_SET live) +{ + /* Note places where ZA dies, so that we can try to avoid saving and + restoring state that isn't needed. */ + if (mode == aarch64_local_sme_state::ACTIVE_LIVE + && !TEST_HARD_REG_BIT (live, ZA_REGNUM)) + return aarch64_local_sme_state::ACTIVE_DEAD; + + /* Note where ZA is born, e.g. when moving past an __arm_out("za") + function. */ + if (mode == aarch64_local_sme_state::ACTIVE_DEAD + && TEST_HARD_REG_BIT (live, ZA_REGNUM)) + return aarch64_local_sme_state::ACTIVE_LIVE; + + return mode; +} + +/* Implement TARGET_MODE_AFTER. */ + +static int +aarch64_mode_after (int entity, int mode, rtx_insn *, HARD_REG_SET live) +{ + switch (aarch64_mode_entity (entity)) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + return mode; + + case aarch64_mode_entity::LOCAL_SME_STATE: + return int (aarch64_mode_after_local_sme_state + (aarch64_local_sme_state (mode), live)); + } + gcc_unreachable (); +} + +/* Implement TARGET_MODE_CONFLUENCE for LOCAL_SME_STATE. */ + +static aarch64_local_sme_state +aarch64_local_sme_confluence (aarch64_local_sme_state mode1, + aarch64_local_sme_state mode2) +{ + /* Perform a symmetrical check for two values. */ + auto is_pair = [&](aarch64_local_sme_state val1, + aarch64_local_sme_state val2) + { + return ((mode1 == val1 && mode2 == val2) + || (mode1 == val2 && mode2 == val1)); + }; + + /* INACTIVE_CALLER means ZA is off or it has dormant contents belonging + to a caller. 
OFF is one of the options. */ + if (is_pair (aarch64_local_sme_state::INACTIVE_CALLER, + aarch64_local_sme_state::OFF)) + return aarch64_local_sme_state::INACTIVE_CALLER; + + /* Similarly for dormant contents belonging to the current function. */ + if (is_pair (aarch64_local_sme_state::INACTIVE_LOCAL, + aarch64_local_sme_state::OFF)) + return aarch64_local_sme_state::INACTIVE_LOCAL; + + /* Treat a conditionally-initialized value as a fully-initialized value. */ + if (is_pair (aarch64_local_sme_state::ACTIVE_LIVE, + aarch64_local_sme_state::ACTIVE_DEAD)) + return aarch64_local_sme_state::ACTIVE_LIVE; + + return aarch64_local_sme_state::ANY; +} + +/* Implement TARGET_MODE_CONFLUENCE. */ + +static int +aarch64_mode_confluence (int entity, int mode1, int mode2) +{ + gcc_assert (mode1 != mode2); + switch (aarch64_mode_entity (entity)) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + return int (aarch64_tristate_mode::MAYBE); + + case aarch64_mode_entity::LOCAL_SME_STATE: + return int (aarch64_local_sme_confluence + (aarch64_local_sme_state (mode1), + aarch64_local_sme_state (mode2))); + } + gcc_unreachable (); +} + +/* Implement TARGET_MODE_BACKPROP for an entity that either stays + NO throughout, or makes one transition from NO to YES. */ + +static aarch64_tristate_mode +aarch64_one_shot_backprop (aarch64_tristate_mode mode1, + aarch64_tristate_mode mode2) +{ + /* Keep bringing the transition forward until it starts from NO. */ + if (mode1 == aarch64_tristate_mode::MAYBE + && mode2 == aarch64_tristate_mode::YES) + return mode2; + + return aarch64_tristate_mode::MAYBE; +} + +/* Implement TARGET_MODE_BACKPROP for LOCAL_SME_STATE. */ + +static aarch64_local_sme_state +aarch64_local_sme_backprop (aarch64_local_sme_state mode1, + aarch64_local_sme_state mode2) +{ + /* We always need to know what the current state is when transitioning + to a new state. Force any location with indeterminate starting state + to be active.
*/ + if (mode1 == aarch64_local_sme_state::ANY) + switch (mode2) + { + case aarch64_local_sme_state::INACTIVE_CALLER: + case aarch64_local_sme_state::OFF: + case aarch64_local_sme_state::ACTIVE_DEAD: + /* The current function's ZA state is not live. */ + return aarch64_local_sme_state::ACTIVE_DEAD; + + case aarch64_local_sme_state::INACTIVE_LOCAL: + case aarch64_local_sme_state::ACTIVE_LIVE: + /* The current function's ZA state is live. */ + return aarch64_local_sme_state::ACTIVE_LIVE; + + case aarch64_local_sme_state::SAVED_LOCAL: + /* This is a transition to an exception handler. Since we don't + support non-call exceptions for SME functions, the source of + the transition must be known. We'll assert later if that's + not the case. */ + return aarch64_local_sme_state::ANY; + + case aarch64_local_sme_state::ANY: + return aarch64_local_sme_state::ANY; + } + + return aarch64_local_sme_state::ANY; +} + +/* Implement TARGET_MODE_BACKPROP. */ + +static int +aarch64_mode_backprop (int entity, int mode1, int mode2) +{ + switch (aarch64_mode_entity (entity)) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + return int (aarch64_one_shot_backprop (aarch64_tristate_mode (mode1), + aarch64_tristate_mode (mode2))); + + case aarch64_mode_entity::LOCAL_SME_STATE: + return int (aarch64_local_sme_backprop + (aarch64_local_sme_state (mode1), + aarch64_local_sme_state (mode2))); + } + gcc_unreachable (); +} + +/* Implement TARGET_MODE_ENTRY. */ + +static int +aarch64_mode_entry (int entity) +{ + switch (aarch64_mode_entity (entity)) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + return int (aarch64_tristate_mode::NO); + + case aarch64_mode_entity::LOCAL_SME_STATE: + return int (aarch64_cfun_shared_flags ("za") != 0 + ? aarch64_local_sme_state::ACTIVE_LIVE + : aarch64_local_sme_state::INACTIVE_CALLER); + } + gcc_unreachable (); +} + +/* Implement TARGET_MODE_EXIT. 
*/ + +static int +aarch64_mode_exit (int entity) +{ + switch (aarch64_mode_entity (entity)) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + return int (aarch64_tristate_mode::MAYBE); + + case aarch64_mode_entity::LOCAL_SME_STATE: + return int (aarch64_cfun_shared_flags ("za") != 0 + ? aarch64_local_sme_state::ACTIVE_LIVE + : aarch64_local_sme_state::INACTIVE_CALLER); + } + gcc_unreachable (); +} + +/* Implement TARGET_MODE_EH_HANDLER. */ + +static int +aarch64_mode_eh_handler (int entity) +{ + switch (aarch64_mode_entity (entity)) + { + case aarch64_mode_entity::HAVE_ZA_SAVE_BUFFER: + /* Require a lazy save buffer to be allocated before the first + insn that can throw. */ + return int (aarch64_tristate_mode::YES); + + case aarch64_mode_entity::LOCAL_SME_STATE: + return int (aarch64_local_sme_state::SAVED_LOCAL); + } + gcc_unreachable (); +} + +/* Implement TARGET_MODE_PRIORITY. */ + +static int +aarch64_mode_priority (int, int n) +{ + return n; +} + +/* Implement TARGET_MD_ASM_ADJUST. */ + +static rtx_insn * +aarch64_md_asm_adjust (vec &outputs, vec &inputs, + vec &input_modes, + vec &constraints, + vec &uses, vec &clobbers, + HARD_REG_SET &clobbered_regs, location_t loc) +{ + rtx_insn *seq = arm_md_asm_adjust (outputs, inputs, input_modes, constraints, + uses, clobbers, clobbered_regs, loc); + + /* "za" in the clobber list of a function with ZA state is defined to + mean that the asm can read from and write to ZA. We can model the + read using a USE, but unfortunately, it's not possible to model the + write directly. Use a separate insn to model the effect. + + We must ensure that ZA is active on entry, which is enforced by using + SME_STATE_REGNUM. The asm must ensure that ZA is active on return. 
*/ + if (TARGET_ZA) + for (unsigned int i = clobbers.length (); i-- > 0; ) + { + rtx x = clobbers[i]; + if (REG_P (x) && REGNO (x) == ZA_REGNUM) + { + auto id = cfun->machine->next_asm_update_za_id++; + + start_sequence (); + if (seq) + emit_insn (seq); + emit_insn (gen_aarch64_asm_update_za (gen_int_mode (id, SImode))); + seq = get_insns (); + end_sequence (); + + uses.safe_push (gen_rtx_REG (VNx16QImode, ZA_REGNUM)); + uses.safe_push (gen_rtx_REG (DImode, SME_STATE_REGNUM)); + + clobbers.ordered_remove (i); + CLEAR_HARD_REG_BIT (clobbered_regs, ZA_REGNUM); + } + } + return seq; +} + /* If CALL involves a change in PSTATE.SM, emit the instructions needed to switch to the new mode and the instructions needed to restore the original mode. Return true if something changed. */ @@ -27136,6 +28443,9 @@ aarch64_run_selftests (void) #undef TARGET_START_CALL_ARGS #define TARGET_START_CALL_ARGS aarch64_start_call_args +#undef TARGET_END_CALL_ARGS +#define TARGET_END_CALL_ARGS aarch64_end_call_args + #undef TARGET_GIMPLE_FOLD_BUILTIN #define TARGET_GIMPLE_FOLD_BUILTIN aarch64_gimple_fold_builtin @@ -27504,6 +28814,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_COMP_TYPE_ATTRIBUTES #define TARGET_COMP_TYPE_ATTRIBUTES aarch64_comp_type_attributes +#undef TARGET_MERGE_DECL_ATTRIBUTES +#define TARGET_MERGE_DECL_ATTRIBUTES aarch64_merge_decl_attributes + #undef TARGET_GET_MULTILIB_ABI_NAME #define TARGET_GET_MULTILIB_ABI_NAME aarch64_get_multilib_abi_name @@ -27524,8 +28837,35 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_STRICT_ARGUMENT_NAMING #define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true +#undef TARGET_MODE_EMIT +#define TARGET_MODE_EMIT aarch64_mode_emit + +#undef TARGET_MODE_NEEDED +#define TARGET_MODE_NEEDED aarch64_mode_needed + +#undef TARGET_MODE_AFTER +#define TARGET_MODE_AFTER aarch64_mode_after + +#undef TARGET_MODE_CONFLUENCE +#define TARGET_MODE_CONFLUENCE aarch64_mode_confluence + +#undef TARGET_MODE_BACKPROP +#define 
TARGET_MODE_BACKPROP aarch64_mode_backprop + +#undef TARGET_MODE_ENTRY +#define TARGET_MODE_ENTRY aarch64_mode_entry + +#undef TARGET_MODE_EXIT +#define TARGET_MODE_EXIT aarch64_mode_exit + +#undef TARGET_MODE_EH_HANDLER +#define TARGET_MODE_EH_HANDLER aarch64_mode_eh_handler + +#undef TARGET_MODE_PRIORITY +#define TARGET_MODE_PRIORITY aarch64_mode_priority + #undef TARGET_MD_ASM_ADJUST -#define TARGET_MD_ASM_ADJUST arm_md_asm_adjust +#define TARGET_MD_ASM_ADJUST aarch64_md_asm_adjust #undef TARGET_ASM_FILE_END #define TARGET_ASM_FILE_END aarch64_asm_file_end @@ -27539,6 +28879,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_CONST_ANCHOR #define TARGET_CONST_ANCHOR 0x1000000 +#undef TARGET_EXTRA_LIVE_ON_ENTRY +#define TARGET_EXTRA_LIVE_ON_ENTRY aarch64_extra_live_on_entry + #undef TARGET_EMIT_EPILOGUE_FOR_SIBCALL #define TARGET_EMIT_EPILOGUE_FOR_SIBCALL aarch64_expand_epilogue diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 182f45005f9..2d39b843d9c 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -207,6 +207,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* Macros to test ISA flags. */ #define AARCH64_ISA_SM_OFF (aarch64_isa_flags & AARCH64_FL_SM_OFF) +#define AARCH64_ISA_ZA_ON (aarch64_isa_flags & AARCH64_FL_ZA_ON) #define AARCH64_ISA_MODE (aarch64_isa_flags & AARCH64_FL_ISA_MODES) #define AARCH64_ISA_CRC (aarch64_isa_flags & AARCH64_FL_CRC) #define AARCH64_ISA_CRYPTO (aarch64_isa_flags & AARCH64_FL_CRYPTO) @@ -260,6 +261,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define TARGET_STREAMING_COMPATIBLE \ ((aarch64_isa_flags & AARCH64_FL_SM_STATE) == 0) +/* PSTATE.ZA is enabled in the current function body. */ +#define TARGET_ZA (AARCH64_ISA_ZA_ON) + /* Crypto is an optional extension to AdvSIMD. 
*/ #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO) @@ -461,7 +465,8 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; 1, 1, 1, 1, /* SFP, AP, CC, VG */ \ 0, 0, 0, 0, 0, 0, 0, 0, /* P0 - P7 */ \ 0, 0, 0, 0, 0, 0, 0, 0, /* P8 - P15 */ \ - 1, 1 /* FFR and FFRT */ \ + 1, 1, /* FFR and FFRT */ \ + 1, 1, 1, 1, 1, 1, 1 /* Fake registers */ \ } /* X30 is marked as caller-saved which is in line with regular function call @@ -471,7 +476,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; true but not until function epilogues have been generated. This ensures that X30 is available for use in leaf functions if needed. */ -#define CALL_USED_REGISTERS \ +#define CALL_REALLY_USED_REGISTERS \ { \ 1, 1, 1, 1, 1, 1, 1, 1, /* R0 - R7 */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* R8 - R15 */ \ @@ -484,7 +489,8 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; 1, 1, 1, 0, /* SFP, AP, CC, VG */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* P0 - P7 */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* P8 - P15 */ \ - 1, 1 /* FFR and FFRT */ \ + 1, 1, /* FFR and FFRT */ \ + 0, 0, 0, 0, 0, 0, 0 /* Fake registers */ \ } #define REGISTER_NAMES \ @@ -500,7 +506,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; "sfp", "ap", "cc", "vg", \ "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", \ "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15", \ - "ffr", "ffrt" \ + "ffr", "ffrt", \ + "lowering", "tpidr2_block", "sme_state", "tpidr2_setup", \ + "za_free", "za_saved", "za" \ } /* Generate the register aliases for core register N */ @@ -549,7 +557,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define FRAME_POINTER_REGNUM SFP_REGNUM #define STACK_POINTER_REGNUM SP_REGNUM #define ARG_POINTER_REGNUM AP_REGNUM -#define FIRST_PSEUDO_REGISTER (FFRT_REGNUM + 1) +#define FIRST_PSEUDO_REGISTER (LAST_FAKE_REGNUM + 1) /* The number of argument registers available for each class. 
*/ #define NUM_ARG_REGS 8 @@ -672,6 +680,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define FP_SIMD_SAVED_REGNUM_P(REGNO) \ (((unsigned) (REGNO - V8_REGNUM)) <= (V23_REGNUM - V8_REGNUM)) + +#define FAKE_REGNUM_P(REGNO) \ + IN_RANGE (REGNO, FIRST_FAKE_REGNUM, LAST_FAKE_REGNUM) /* Register and constant classes. */ @@ -692,6 +703,7 @@ enum reg_class PR_REGS, FFR_REGS, PR_AND_FFR_REGS, + FAKE_REGS, ALL_REGS, LIM_REG_CLASSES /* Last */ }; @@ -715,6 +727,7 @@ enum reg_class "PR_REGS", \ "FFR_REGS", \ "PR_AND_FFR_REGS", \ + "FAKE_REGS", \ "ALL_REGS" \ } @@ -735,6 +748,7 @@ enum reg_class { 0x00000000, 0x00000000, 0x000ffff0 }, /* PR_REGS */ \ { 0x00000000, 0x00000000, 0x00300000 }, /* FFR_REGS */ \ { 0x00000000, 0x00000000, 0x003ffff0 }, /* PR_AND_FFR_REGS */ \ + { 0x00000000, 0x00000000, 0x1fc00000 }, /* FAKE_REGS */ \ { 0xffffffff, 0xffffffff, 0x000fffff } /* ALL_REGS */ \ } @@ -934,6 +948,15 @@ typedef struct GTY (()) machine_function bool reg_is_wrapped_separately[LAST_SAVED_REGNUM]; /* One entry for each general purpose register. */ rtx call_via[SP_REGNUM]; + + /* A pseudo register that points to the function's TPIDR2 block, or null + if the function doesn't have a TPIDR2 block. */ + rtx tpidr2_block; + + /* A pseudo register that points to the function's ZA save buffer, + or null if none. */ + rtx za_save_buffer; + bool label_is_assembled; /* True if we've expanded at least one call to a function that changes @@ -941,6 +964,10 @@ typedef struct GTY (()) machine_function guarantees that no such mode switch exists. */ bool call_switches_pstate_sm; + /* Used to generate unique identifiers for each update to ZA by an + asm statement. */ + unsigned int next_asm_update_za_id; + /* A set of all decls that have been passed to a vld1 intrinsic in the current function. This is used to help guide the vector cost model.
*/ hash_set *vector_load_decls; @@ -1010,6 +1037,10 @@ typedef struct bool silent_p; /* True if we should act silently, rather than raise an error for invalid calls. */ + /* AARCH64_STATE_* flags that describe whether the function shares ZA + with its callers. */ + unsigned int shared_za_flags; + /* A list of registers that need to be saved and restored around a change to PSTATE.SM. An auto_vec would be more convenient, but those can't be copied. */ @@ -1381,4 +1412,61 @@ extern poly_uint16 aarch64_sve_vg; || ((T) == US_TRUNCATE && (S) == LSHIFTRT) \ || ((T) == SS_TRUNCATE && (S) == ASHIFTRT)) +#ifndef USED_FOR_TARGET + +/* Enumerates the mode-switching "entities" for AArch64. */ +enum class aarch64_mode_entity : int +{ + /* An aarch64_tristate_mode that says whether we have created a local + save buffer for the current function's ZA state. The only transition + is from NO to YES. */ + HAVE_ZA_SAVE_BUFFER, + + /* An aarch64_local_sme_state that reflects the state of all data + controlled by PSTATE.ZA. */ + LOCAL_SME_STATE +}; + +/* Describes the state of all data controlled by PSTATE.ZA */ +enum class aarch64_local_sme_state : int +{ + /* ZA is in the off or dormant state. If it is dormant, the contents + of ZA belong to a caller. */ + INACTIVE_CALLER, + + /* ZA is in the off state: PSTATE.ZA is 0 and TPIDR2_EL0 is null. */ + OFF, + + /* ZA is in the off or dormant state. If it is dormant, the contents + of ZA belong to the current function. */ + INACTIVE_LOCAL, + + /* ZA is in the off state and the current function's ZA contents are + stored in the lazy save buffer. This is the state on entry to + exception handlers. */ + SAVED_LOCAL, + + /* ZA is in the active state: PSTATE.ZA is 1 and TPIDR2_EL0 is null. + The contents of ZA are live. */ + ACTIVE_LIVE, + + /* ZA is in the active state: PSTATE.ZA is 1 and TPIDR2_EL0 is null. + The contents of ZA are dead. */ + ACTIVE_DEAD, + + /* ZA could be in multiple states. 
*/ + ANY +}; + +enum class aarch64_tristate_mode : int { NO, YES, MAYBE }; + +#define OPTIMIZE_MODE_SWITCHING(ENTITY) \ + aarch64_optimize_mode_switching (aarch64_mode_entity (ENTITY)) + +#define NUM_MODES_FOR_MODE_SWITCHING \ + { int (aarch64_tristate_mode::MAYBE), \ + int (aarch64_local_sme_state::ANY) } + +#endif + #endif /* GCC_AARCH64_H */ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 0dac5df1b74..14a401617f6 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -111,6 +111,56 @@ (define_constants ;; "FFR token": a fake register used for representing the scheduling ;; restrictions on FFR-related operations. (FFRT_REGNUM 85) + + ;; ---------------------------------------------------------------- + ;; Fake registers + ;; ---------------------------------------------------------------- + ;; These registers represent abstract things, rather than real + ;; architected registers. + + ;; Sometimes we use placeholder instructions to mark where later + ;; ABI-related lowering is needed. These placeholders read and + ;; write this register. Instructions that depend on the lowering + ;; read the register. + (LOWERING_REGNUM 86) + + ;; Represents the contents of the current function's TPIDR2 block, + ;; in abstract form. + (TPIDR2_BLOCK_REGNUM 87) + + ;; Holds the value that the current function wants PSTATE.ZA to be. + ;; The actual value can sometimes vary, because it does not track + ;; changes to PSTATE.ZA that happen during a lazy save and restore. + ;; Those effects are instead tracked by ZA_SAVED_REGNUM. + (SME_STATE_REGNUM 88) + + ;; Instructions write to this register if they set TPIDR2_EL0 to a + ;; well-defined value. Instructions read from the register if they + ;; depend on the result of such writes. + ;; + ;; The register does not model the architected TPIDR2_EL0, just the + ;; current function's management of it.
+ (TPIDR2_SETUP_REGNUM 89) + + ;; Represents the property "has an incoming lazy save been committed?". + (ZA_FREE_REGNUM 90) + + ;; Represents the property "are the current function's ZA contents + ;; stored in the lazy save buffer, rather than in ZA itself?". + (ZA_SAVED_REGNUM 91) + + ;; Represents the contents of the current function's ZA state in + ;; abstract form. At various times in the function, these contents + ;; might be stored in ZA itself, or in the function's lazy save buffer. + ;; + ;; The contents persist even when the architected ZA is off. Private-ZA + ;; functions have no effect on its contents. + (ZA_REGNUM 92) + ;; ---------------------------------------------------------------- + (FIRST_FAKE_REGNUM LOWERING_REGNUM) + (LAST_FAKE_REGNUM ZA_REGNUM) + ;; ---------------------------------------------------------------- + ;; The pair of scratch registers used for stack probing with -fstack-check. ;; Leave R9 alone as a possible choice for the static chain. ;; Note that the use of these registers is mutually exclusive with the use @@ -294,7 +344,12 @@ (define_c_enum "unspec" [ UNSPEC_TAG_SPACE ; Translate address to MTE tag address space. UNSPEC_LD1RO UNSPEC_SALT_ADDR + UNSPEC_SAVE_NZCV + UNSPEC_RESTORE_NZCV UNSPECV_PATCHABLE_AREA + ;; Wraps a constant integer that should be multiplied by the number + ;; of quadwords in an SME vector. + UNSPEC_SME_VQ ]) (define_c_enum "unspecv" [ @@ -367,7 +422,7 @@ (define_constants ;; Q registers and is equivalent to "simd". 
(define_enum "arches" [any rcpc8_4 fp fp_q base_simd nobase_simd - simd nosimd sve fp16]) + simd nosimd sve fp16 sme]) (define_enum_attr "arch" "arches" (const_string "any")) @@ -411,7 +466,10 @@ (define_attr "arch_enabled" "no,yes" (match_test "TARGET_FP_F16INST")) (and (eq_attr "arch" "sve") - (match_test "TARGET_SVE"))) + (match_test "TARGET_SVE")) + + (and (eq_attr "arch" "sme") + (match_test "TARGET_SME"))) (const_string "yes") (const_string "no"))) @@ -914,7 +972,7 @@ (define_insn "simple_return" (set_attr "sls_length" "retbr")] ) -(define_insn "*cb1" +(define_insn "aarch64_cb1" [(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r") (const_int 0)) (label_ref (match_operand 1 "" "")) @@ -1298,6 +1356,7 @@ (define_insn_and_split "*movsi_aarch64" /* The "mov_imm" type for CNT is just a placeholder. */ [r , Usv; mov_imm , sve , 4] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]); [r , Usr; mov_imm , sve, 4] << aarch64_output_sve_rdvl (operands[1]); + [r , UsR; mov_imm , sme, 4] << aarch64_output_rdsvl (operands[1]); [r , m ; load_4 , * , 4] ldr\t%w0, %1 [w , m ; load_4 , fp , 4] ldr\t%s0, %1 [m , r Z; store_4 , * , 4] str\t%w1, %0 @@ -1334,6 +1393,7 @@ (define_insn_and_split "*movdi_aarch64" /* The "mov_imm" type for CNT is just a placeholder. 
*/ [r, Usv; mov_imm , sve , 4] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]); [r, Usr; mov_imm , sve, 4] << aarch64_output_sve_rdvl (operands[1]); + [r, UsR; mov_imm , sme, 4] << aarch64_output_rdsvl (operands[1]); [r, m ; load_8 , * , 4] ldr\t%x0, %1 [w, m ; load_8 , fp , 4] ldr\t%d0, %1 [m, r Z; store_8 , * , 4] str\t%x1, %0 @@ -8034,6 +8094,21 @@ (define_insn "patchable_area" [(set (attr "length") (symbol_ref "INTVAL (operands[0])"))] ) +(define_insn "aarch64_save_nzcv" + [(set (match_operand:DI 0 "register_operand" "=r") + (unspec:DI [(reg:CC CC_REGNUM)] UNSPEC_SAVE_NZCV))] + "" + "mrs\t%0, nzcv" +) + +(define_insn "aarch64_restore_nzcv" + [(set (reg:CC CC_REGNUM) + (unspec:CC [(match_operand:DI 0 "register_operand" "r")] + UNSPEC_RESTORE_NZCV))] + "" + "msr\tnzcv, %0" +) + ;; AdvSIMD Stuff (include "aarch64-simd.md") diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index 5c02d15c77a..5dd50218b9f 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -225,6 +225,12 @@ (define_constraint "Usr" (and (match_code "const_poly_int") (match_test "aarch64_sve_rdvl_immediate_p (op)"))) +(define_constraint "UsR" + "@internal + A constraint that matches a value produced by RDSVL." 
+ (and (match_code "const") + (match_test "aarch64_rdsvl_immediate_p (op)"))) + (define_constraint "Usv" "@internal A constraint that matches a VG-based constant that can be loaded by diff --git a/gcc/testsuite/g++.target/aarch64/sme/exceptions_1.C b/gcc/testsuite/g++.target/aarch64/sme/exceptions_1.C new file mode 100644 index 00000000000..a245546d8b1 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sme/exceptions_1.C @@ -0,0 +1,189 @@ +// { dg-options "-O -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +void callee_inout() __arm_inout("za"); +void callee_in() noexcept __arm_in("za"); +void callee_out() noexcept __arm_out("za"); +void callee_normal(); + +/* +** _Z5test1v: +** ... +** bl __arm_tpidr2_save +** ... +** bl __cxa_begin_catch +** bl __cxa_end_catch +** mov w0, #?2 +** ... +*/ +__arm_new("za") int +test1 () +{ + try + { + callee_inout(); + return 1; + } + catch (...) + { + return 2; + } +} + +/* +** _Z5test2v: +** ... +** bl __arm_tpidr2_save +** ... +** bl __cxa_begin_catch +** smstart za +** bl _Z10callee_outv +** bl _Z9callee_inv +** smstop za +** bl __cxa_end_catch +** mov w0, #?2 +** ... +*/ +__arm_new("za") int +test2 () +{ + try + { + callee_inout(); + return 1; + } + catch (...) + { + callee_out(); + callee_in(); + return 2; + } +} + +/* +** _Z5test3v: +** ... +** bl __arm_tpidr2_save +** ... +** smstop za +** ... +** bl _Z13callee_normalv +** ... +** bl __cxa_begin_catch +** smstart za +** bl _Z10callee_outv +** bl _Z9callee_inv +** smstop za +** bl __cxa_end_catch +** mov w0, #?2 +** ... +*/ +__arm_new("za") int +test3 () +{ + try + { + callee_normal(); + return 1; + } + catch (...) + { + callee_out(); + callee_in(); + return 2; + } +} + +__arm_new("za") int +test4 () +{ + try + { + // No lazy save set up because this is a shared-ZA function. + callee_inout(); + return 1; + } + catch (...) 
+ { + callee_inout(); + return 2; + } +} +// { dg-final { scan-assembler {_Z5test4v:(?:(?!msr\ttpidr2_el0, x[0-9]+).)*\tret} } } + +/* +** _Z5test5v: +** ... +** bl __arm_tpidr2_save +** ... +** smstart za +** ... +** bl _Z12callee_inoutv +** add (x[0-9]+), [^\n]+ +** msr tpidr2_el0, \1 +** bl _Z13callee_normalv +** msr tpidr2_el0, xzr +** smstop za +** ... +** bl __cxa_begin_catch +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl _Z12callee_inoutv +** smstop za +** bl __cxa_end_catch +** mov w0, #?2 +** ... +*/ +__arm_new("za") int +test5 () +{ + try + { + callee_inout(); + callee_normal(); + return 1; + } + catch (...) + { + callee_inout(); + return 2; + } +} + +/* +** _Z5test6v: +** ... +** msr tpidr2_el0, x[0-9]+ +** bl _Z13callee_normalv +** msr tpidr2_el0, xzr +** ... +** bl __cxa_begin_catch +** bl __cxa_end_catch +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** ... +*/ +int +test6 () __arm_inout("za") +{ + try + { + callee_normal(); + callee_out(); + return 1; + } + catch (...) 
+ { + return 2; + } +} diff --git a/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C b/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C index 032485adf95..8b0755014cc 100644 --- a/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C +++ b/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C @@ -2,3 +2,8 @@ void f1 () __arm_streaming; void f2 () __arm_streaming_compatible; +void f3 () __arm_in("za"); +void f4 () __arm_out("za"); +void f5 () __arm_inout("za"); +void f6 () __arm_preserves("za"); +__arm_new("za") void f7 () {} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c b/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c index 8f1b836764e..fcabe3edc55 100644 --- a/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c @@ -2,3 +2,8 @@ void f1 () __arm_streaming; void f2 () __arm_streaming_compatible; +void f3 () __arm_in("za"); +void f4 () __arm_out("za"); +void f5 () __arm_inout("za"); +void f6 () __arm_preserves("za"); +__arm_new("za") void f7 () {} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_1.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_1.c new file mode 100644 index 00000000000..856880e2109 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_1.c @@ -0,0 +1,154 @@ +// { dg-options "" } + +void shared_a () [[arm::inout("za")]]; +void shared_a (); // { dg-error "conflicting types" } + +void shared_b (); +void shared_b () [[arm::inout("za")]]; // { dg-error "conflicting types" } + +void shared_c () [[arm::inout("za")]]; +void shared_c () {} // Inherits attribute from declaration (confusingly). 
+ +void shared_d (); +void shared_d () [[arm::inout("za")]] {} // { dg-error "conflicting types" } + +void shared_e () [[arm::inout("za")]] {} +void shared_e (); // { dg-error "conflicting types" } + +void shared_f () {} +void shared_f () [[arm::inout("za")]]; // { dg-error "conflicting types" } + +extern void (*shared_g) (); +extern void (*shared_g) () [[arm::inout("za")]]; // { dg-error "conflicting types" } + +extern void (*shared_h) () [[arm::inout("za")]]; +extern void (*shared_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void preserved_a () [[arm::preserves("za")]]; +void preserved_a (); // { dg-error "conflicting types" } + +void preserved_b (); +void preserved_b () [[arm::preserves("za")]]; // { dg-error "conflicting types" } + +void preserved_c () [[arm::preserves("za")]]; +void preserved_c () {} // Inherits attribute from declaration (confusingly). + +void preserved_d (); +void preserved_d () [[arm::preserves("za")]] {} // { dg-error "conflicting types" } + +void preserved_e () [[arm::preserves("za")]] {} +void preserved_e (); // { dg-error "conflicting types" } + +void preserved_f () {} +void preserved_f () [[arm::preserves("za")]]; // { dg-error "conflicting types" } + +extern void (*preserved_g) (); +extern void (*preserved_g) () [[arm::preserves("za")]]; // { dg-error "conflicting types" } + +extern void (*preserved_h) () [[arm::preserves("za")]]; +extern void (*preserved_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void replicated_1 () [[arm::in("za", "za"), arm::in("za")]]; +void replicated_2 () [[arm::out("za", "za"), arm::out("za")]]; +void replicated_3 () [[arm::inout("za", "za"), arm::inout("za")]]; +void replicated_4 () [[arm::preserves("za", "za"), arm::preserves("za")]]; + +//---------------------------------------------------------------------------- + +void invalid_1 () 
[[arm::in]]; // { dg-error "wrong number of arguments" } +void invalid_2 () [[arm::in()]]; // { dg-error "parentheses must be omitted" } + // { dg-error "wrong number of arguments" "" { target *-*-* } .-1 } +void invalid_3 () [[arm::in("")]]; // { dg-error "unrecognized state string ''" } +void invalid_4 () [[arm::in("foo")]]; // { dg-error "unrecognized state string 'foo'" } +void invalid_5 () [[arm::in(42)]]; // { dg-error "the arguments to 'in' must be constant strings" } +void invalid_6 () [[arm::in(*(int *)0 ? "za" : "za")]]; // { dg-error "the arguments to 'in' must be constant strings" } + +//---------------------------------------------------------------------------- + +void mixed_a () [[arm::preserves("za")]]; +void mixed_a () [[arm::inout("za")]]; // { dg-error "conflicting types" } + +void mixed_b () [[arm::inout("za")]]; +void mixed_b () [[arm::preserves("za")]]; // { dg-error "conflicting types" } + +void mixed_c () [[arm::preserves("za")]]; +void mixed_c () [[arm::in("za")]] {} // { dg-error "conflicting types" } + +void mixed_d () [[arm::inout("za")]]; +void mixed_d () [[arm::in("za")]] {} // { dg-error "conflicting types" } + +void mixed_e () [[arm::out("za")]] {} +void mixed_e () [[arm::in("za")]]; // { dg-error "conflicting types" } + +void mixed_f () [[arm::inout("za")]] {} +void mixed_f () [[arm::out("za")]]; // { dg-error "conflicting types" } + +extern void (*mixed_g) () [[arm::in("za")]]; +extern void (*mixed_g) () [[arm::preserves("za")]]; // { dg-error "conflicting types" } + +extern void (*mixed_h) () [[arm::preserves("za")]]; +extern void (*mixed_h) () [[arm::out("za")]]; // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void contradiction_1 () [[arm::preserves("za"), arm::inout("za")]]; // { dg-error "inconsistent attributes for state 'za'" } +void contradiction_2 () [[arm::inout("za"), arm::preserves("za")]]; // { dg-error "inconsistent attributes for state 'za'" } + 
+int [[arm::inout("za")]] int_attr; // { dg-warning "only applies to function types" } +void *[[arm::preserves("za")]] ptr_attr; // { dg-warning "only applies to function types" } + +typedef void preserved_callback () [[arm::preserves("za")]]; +typedef void shared_callback () [[arm::inout("za")]]; + +void (*preserved_callback_ptr) () [[arm::preserves("za")]]; +void (*shared_callback_ptr) () [[arm::inout("za")]]; + +typedef void contradiction_callback_1 () [[arm::preserves("za"), arm::inout("za")]]; // { dg-error "inconsistent attributes for state 'za'" } +typedef void contradiction_callback_2 () [[arm::inout("za"), arm::preserves("za")]]; // { dg-error "inconsistent attributes for state 'za'" } + +void (*contradiction_callback_ptr_1) () [[arm::preserves("za"), arm::inout("za")]]; // { dg-error "inconsistent attributes for state 'za'" } +void (*contradiction_callback_ptr_2) () [[arm::inout("za"), arm::preserves("za")]]; // { dg-error "inconsistent attributes for state 'za'" } + +struct s { + void (*contradiction_callback_ptr_1) () [[arm::preserves("za"), arm::inout("za")]]; // { dg-error "inconsistent attributes for state 'za'" } + void (*contradiction_callback_ptr_2) () [[arm::inout("za"), arm::preserves("za")]]; // { dg-error "inconsistent attributes for state 'za'" } +}; + +//---------------------------------------------------------------------------- + +void keyword_ok_1 () __arm_inout("za"); +void keyword_ok_1 () __arm_inout("za"); + +void keyword_ok_2 () __arm_in("za"); +void keyword_ok_2 () [[arm::in("za")]]; + +void keyword_ok_3 () [[arm::out("za")]]; +void keyword_ok_3 () __arm_out("za"); + +void keyword_ok_4 () __arm_inout("za") [[arm::inout("za")]]; + +void keyword_ok_5 () __arm_preserves("za"); +void keyword_ok_5 () [[arm::preserves("za")]]; + +__arm_new("za") void keyword_ok_6 () {} + +//---------------------------------------------------------------------------- + +void keyword_conflict_1 () __arm_inout("za"); +void keyword_conflict_1 (); // { dg-error 
"conflicting types" } + +void keyword_conflict_2 (); +void keyword_conflict_2 () __arm_inout("za"); // { dg-error "conflicting types" } + +void keyword_conflict_3 () __arm_inout("za"); +void keyword_conflict_3 () [[arm::preserves("za")]]; // { dg-error "conflicting types" } + +void keyword_conflict_4 () [[arm::preserves("za")]]; +void keyword_conflict_4 () __arm_inout("za"); // { dg-error "conflicting types" } + +__arm_new("za") void keyword_conflict_5 () __arm_inout("za") {} // { dg-error "cannot create a new 'za' scope since 'za' is shared with callers" } +__arm_new("za") void keyword_conflict_6 () __arm_preserves("za") {} // { dg-error "cannot create a new 'za' scope since 'za' is shared with callers" } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_2.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_2.c new file mode 100644 index 00000000000..572ff309f8d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_2.c @@ -0,0 +1,73 @@ +// { dg-options "" } + +[[arm::new("za")]] void new_za_a (); +void new_za_a (); + +void new_za_b (); +[[arm::new("za")]] void new_za_b (); + +[[arm::new("za")]] void new_za_c (); +void new_za_c () {} + +void new_za_d (); +[[arm::new("za")]] void new_za_d () {} + +[[arm::new("za")]] void new_za_e () {} +void new_za_e (); + +void new_za_f () {} +[[arm::new("za")]] void new_za_f (); // { dg-error "cannot apply attribute 'new' to 'new_za_f' after the function has been defined" } + +//---------------------------------------------------------------------------- + +[[arm::new("za")]] void shared_a (); +void shared_a () [[arm::inout("za")]]; // { dg-error "conflicting types" } + +void shared_b () [[arm::inout("za")]]; +[[arm::new("za")]] void shared_b (); // { dg-error "conflicting types" } + +[[arm::new("za")]] void shared_c (); +void shared_c () [[arm::in("za")]] {} // { dg-error "conflicting types" } + +void shared_d () [[arm::in("za")]]; +[[arm::new("za")]] void shared_d () {} // { dg-error "cannot create a new 'za' 
scope since 'za' is shared with callers" } + +[[arm::new("za")]] void shared_e () {} +void shared_e () [[arm::out("za")]]; // { dg-error "conflicting types" } + +void shared_f () [[arm::out("za")]] {} +[[arm::new("za")]] void shared_f (); // { dg-error "conflicting types" } + +[[arm::new("za")]] void shared_g () {} +void shared_g () [[arm::preserves("za")]]; // { dg-error "conflicting types" } + +void shared_h () [[arm::preserves("za")]] {} +[[arm::new("za")]] void shared_h (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +[[arm::new("za")]] void contradiction_1 () [[arm::inout("za")]]; // { dg-error "cannot create a new 'za' scope since 'za' is shared with callers" } +void contradiction_2 [[arm::new("za")]] () [[arm::inout("za")]]; // { dg-error "cannot create a new 'za' scope since 'za' is shared with callers" } +[[arm::new("za")]] void contradiction_3 () [[arm::preserves("za")]]; // { dg-error "cannot create a new 'za' scope since 'za' is shared with callers" } +void contradiction_4 [[arm::new("za")]] () [[arm::preserves("za")]]; // { dg-error "cannot create a new 'za' scope since 'za' is shared with callers" } + +int [[arm::new("za")]] int_attr; // { dg-warning "does not apply to types" } +[[arm::new("za")]] int int_var_attr; // { dg-error "applies only to function definitions" } +typedef void new_za_callback () [[arm::new("za")]]; // { dg-warning "does not apply to types" } +[[arm::new("za")]] void (*new_za_var_callback) (); // { dg-error "applies only to function definitions" } + +//---------------------------------------------------------------------------- + +[[arm::new("za")]] void complementary_1 () [[arm::streaming]] {} +void complementary_2 [[arm::new("za")]] () [[arm::streaming]] {} +[[arm::new("za")]] void complementary_3 () [[arm::streaming_compatible]] {} +void complementary_4 [[arm::new("za")]] () [[arm::streaming_compatible]] {} + 
+//---------------------------------------------------------------------------- + +#pragma GCC target "+nosme" + +[[arm::new("za")]] void bereft_1 (); +[[arm::new("za")]] void bereft_2 () {} // { dg-error "functions with SME state require the ISA extension 'sme'" } +void bereft_3 () [[arm::inout("za")]]; +void bereft_4 () [[arm::inout("za")]] {} // { dg-error "functions with SME state require the ISA extension 'sme'" } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_3.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_3.c new file mode 100644 index 00000000000..203f6ae8a07 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_3.c @@ -0,0 +1,31 @@ +// { dg-options "" } + +void normal_callee (); +void in_callee () [[arm::in("za")]]; +void out_callee () [[arm::out("za")]]; +void inout_callee () [[arm::inout("za")]]; +void preserves_callee () [[arm::preserves("za")]]; + +struct callbacks { + void (*normal_ptr) (); + void (*in_ptr) () [[arm::in("za")]]; + void (*out_ptr) () [[arm::out("za")]]; + void (*inout_ptr) () [[arm::inout("za")]]; + void (*preserves_ptr) () [[arm::preserves("za")]]; +}; + +void +normal_caller (struct callbacks *c) +{ + normal_callee (); + in_callee (); // { dg-error {call to a function that shares 'za' state from a function that has no 'za' state} } + out_callee (); // { dg-error {call to a function that shares 'za' state from a function that has no 'za' state} } + inout_callee (); // { dg-error {call to a function that shares 'za' state from a function that has no 'za' state} } + preserves_callee (); // { dg-error {call to a function that shares SME state from a function that has no SME state} } + + c->normal_ptr (); + c->in_ptr (); // { dg-error {call to a function that shares 'za' state from a function that has no 'za' state} } + c->out_ptr (); // { dg-error {call to a function that shares 'za' state from a function that has no 'za' state} } + c->inout_ptr (); // { dg-error {call to a function that shares 'za' state from 
a function that has no 'za' state} } + c->preserves_ptr (); // { dg-error {call to a function that shares SME state from a function that has no SME state} } +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_4.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_4.c new file mode 100644 index 00000000000..cec0abf0ea9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_4.c @@ -0,0 +1,585 @@ +// { dg-options "-O -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +void private_za(); +void out_za() __arm_out("za"); +void in_za() __arm_in("za"); +void inout_za() __arm_inout("za"); +void preserves_za() __arm_preserves("za"); + +/* +** test1: +** ret +*/ +__arm_new("za") void test1() +{ +} + +/* +** test2: +** ldr w0, \[x0\] +** ret +*/ +__arm_new("za") int test2(int *ptr) +{ + return *ptr; +} + +/* +** test3: +** stp [^\n]+ +** mov x29, sp +** bl private_za +** ( +** mov w0, 0 +** ldp [^\n]+ +** | +** ldp [^\n]+ +** mov w0, 0 +** ) +** ret +*/ +__arm_new("za") int test3() +{ + private_za(); + return 0; +} + +/* +** test4: +** ... +** mrs x0, tpidr2_el0 +** cbz x0, [^\n]+ +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** zero { za } +** smstart za +** bl in_za +** smstop za +** ldp [^\n]+ +** ret +*/ +__arm_new("za") void test4() +{ + in_za(); // Uses zeroed contents. +} + +/* +** test5: +** ... +** mrs x0, tpidr2_el0 +** cbz x0, [^\n]+ +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstop za +** bl private_za +** smstart za +** bl out_za +** bl in_za +** smstop za +** bl private_za +** ldp [^\n]+ +** ret +*/ +__arm_new("za") void test5() +{ + private_za(); + out_za(); + in_za(); + private_za(); +} + +// Despite the long test, there shouldn't be too much scope for variation +// here. The point is both to test correctness and code quality. 
+/* +** test6: +** stp [^\n]+ +** mov x29, sp +** mrs x0, tpidr2_el0 +** cbz x0, [^\n]+ +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstart za +** bl out_za +** rdsvl (x[0-9]+), #1 +** mul (x[0-9]+), \1, \1 +** sub sp, sp, \2 +** mov (x[0-9]+), sp +** stp \3, \1, \[x29, #?16\] +** add (x[0-9]+), x29, #?16 +** msr tpidr2_el0, \4 +** bl private_za +** ( +** add (x[0-9]+), x29, #?16 +** mrs (x[0-9]+), tpidr2_el0 +** cbnz \6, [^\n]+ +** smstart za +** mov x0, \5 +** | +** add x0, x29, #?16 +** mrs (x[0-9]+), tpidr2_el0 +** cbnz \6, [^\n]+ +** smstart za +** ) +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl in_za +** smstop za +** mov sp, x29 +** ldp [^\n]+ +** ret +*/ +__arm_new("za") void test6() +{ + out_za(); + private_za(); + in_za(); +} + +// Rely on previous tests for the part leading up to the smstart. +/* +** test7: +** ... +** smstart za +** bl out_za +** bl in_za +** smstop za +** bl private_za +** smstart za +** bl out_za +** bl in_za +** smstop za +** ldp [^\n]+ +** ret +*/ +__arm_new("za") void test7() +{ + out_za(); + in_za(); + private_za(); + out_za(); + in_za(); +} + +/* +** test8: +** ... +** smstart za +** bl out_za +** bl in_za +** smstop za +** bl private_za +** smstart za +** bl out_za +** bl in_za +** smstop za +** bl private_za +** ldp [^\n]+ +** ret +*/ +__arm_new("za") void test8() +{ + out_za(); + in_za(); + private_za(); + out_za(); + in_za(); + private_za(); +} + +/* +** test9: +** ... +** msr tpidr2_el0, x[0-9]+ +** bl private_za +** bl private_za +** bl private_za +** bl private_za +** add x[0-9]+, x29, #?16 +** mrs x[0-9]+, tpidr2_el0 +** ... +*/ +__arm_new("za") void test9() +{ + out_za(); + private_za(); + private_za(); + private_za(); + private_za(); + in_za(); +} + +/* +** test10: +** ldr (w[0-9]+), \[x0\] +** cbz \1, [^\n]+ +** ldr [^\n]+ +** add [^\n]+ +** str [^\n]+ +** ret +** ... 
+*/ +__arm_new("za") void test10(volatile int *ptr) +{ + if (__builtin_expect (*ptr != 0, 1)) + *ptr = *ptr + 1; + else + inout_za(); +} + +/* +** test11: +** ... +** ldr w[0-9]+, [^\n]+ +** add (w[0-9]+), [^\n]+ +** str \1, [^\n]+ +** ... +** ret +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** bl inout_za +** ldr (w[0-9]+), [^\n]+ +** cbnz \2, [^\n]+ +** smstop za +** ... +*/ +__arm_new("za") void test11(volatile int *ptr) +{ + if (__builtin_expect (*ptr == 0, 0)) + do + inout_za(); + while (*ptr); + else + *ptr += 1; +} + +__arm_new("za") void test12(volatile int *ptr) +{ + do + { + inout_za(); + private_za(); + } + while (*ptr); + out_za(); + in_za(); +} + +/* +** test13: +** stp [^\n]+ +** ... +** stp [^\n]+ +** ... +** bl __arm_tpidr2_save +** ... +** msr tpidr2_el0, x[0-9]+ +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** bl inout_za +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** cbnz [^\n]+ +** smstart za +** msr tpidr2_el0, xzr +** bl out_za +** bl in_za +** ... +** smstop za +** ... +*/ +__arm_new("za") void test13(volatile int *ptr) +{ + do + { + private_za(); + inout_za(); + private_za(); + } + while (*ptr); + out_za(); + in_za(); +} + +/* +** test14: +** ... +** bl __arm_tpidr2_save +** ... +** smstart za +** bl inout_za +** ldr [^\n]+ +** cbnz [^\n]+ +** bl out_za +** bl in_za +** smstop za +** ... +*/ +__arm_new("za") void test14(volatile int *ptr) +{ + do + inout_za(); + while (*ptr); + out_za(); + in_za(); +} + +/* +** test15: +** ... +** bl __arm_tpidr2_save +** ... +** smstart za +** bl out_za +** bl in_za +** ldr [^\n]+ +** cbnz [^\n]+ +** smstop za +** bl private_za +** ldr [^\n]+ +** ldp [^\n]+ +** ret +*/ +__arm_new("za") void test15(volatile int *ptr) +{ + do + { + out_za(); + in_za(); + } + while (*ptr); + private_za(); +} + +/* +** test16: +** ... +** bl __arm_tpidr2_save +** ... +** smstart za +** b [^\n]+ +-- loop: +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... 
+** msr tpidr2_el0, xzr +-- loop_entry: +** bl inout_za +** ... +** msr tpidr2_el0, x[0-9]+ +** bl private_za +** ldr [^\n]+ +** cbnz [^\n]+ +** msr tpidr2_el0, xzr +** smstop za +** bl private_za +** ... +*/ +__arm_new("za") void test16(volatile int *ptr) +{ + do + { + inout_za(); + private_za(); + } + while (*ptr); + private_za(); +} + +/* +** test17: +** ... +** bl private_za +** ldr [^\n]+ +** cbnz [^\n]+ +** ... +** msr tpidr2_el0, xzr +** ... +** smstop za +** ... +*/ +__arm_new("za") void test17(volatile int *ptr) +{ + do + { + inout_za(); + private_za(); + } + while (*ptr); +} + +/* +** test18: +** ldr w[0-9]+, [^\n]+ +** cbnz w[0-9]+, [^\n]+ +** ret +** ... +** smstop za +** bl private_za +** ... +*/ +__arm_new("za") void test18(volatile int *ptr) +{ + if (__builtin_expect (*ptr, 0)) + { + out_za(); + in_za(); + private_za(); + } +} + +/* +** test19: +** ... +** ldr w[0-9]+, [^\n]+ +** cbz w[0-9]+, [^\n]+ +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstop za +** bl private_za +** ... +*/ +__arm_new("za") void test19(volatile int *ptr) +{ + if (__builtin_expect (*ptr != 0, 1)) + private_za(); + else + do + { + inout_za(); + private_za(); + } + while (*ptr); +} + +/* +** test20: +** ... +** bl a20 +** (?:(?!x0).)* +** bl b20 +** ... +** mov ([wx][0-9]+), [wx]0 +** ... +** bl __arm_tpidr2_restore +** ... +** mov [wx]0, \1 +** ... +** bl c20 +** ... +*/ +__arm_new("za") void test20() +{ + extern int a20() __arm_inout("za"); + extern int b20(int); + extern void c20(int) __arm_inout("za"); + c20(b20(a20())); +} + +/* +** test21: +** ... +** bl a21 +** (?:(?!x0).)* +** bl b21 +** ... +** mov (x[0-9]+), x0 +** ... +** bl __arm_tpidr2_restore +** ... +** mov x0, \1 +** ... +** bl c21 +** ... 
+*/ +__arm_new("za") void test21() +{ + extern __UINT64_TYPE__ a21() __arm_inout("za"); + extern __UINT64_TYPE__ b21(__UINT64_TYPE__); + extern void c21(__UINT64_TYPE__) __arm_inout("za"); + c21(b21(a21())); +} + +/* +** test22: +** (?:(?!rdsvl).)* +** rdsvl x[0-9]+, #1 +** (?:(?!rdsvl).)* +*/ +__arm_new("za") void test22(volatile int *ptr) +{ + inout_za(); + if (*ptr) + *ptr += 1; + else + private_za(); + private_za(); + in_za(); +} + +/* +** test23: +** (?:(?!__arm_tpidr2_save).)* +** bl __arm_tpidr2_save +** (?:(?!__arm_tpidr2_save).)* +*/ +__arm_new("za") void test23(volatile int *ptr) +{ + if (*ptr) + *ptr += 1; + else + inout_za(); + inout_za(); +} + +/* +** test24: +** ... +** bl in_za +** ... +** incb x1 +** ... +** bl out_za +** bl inout_za +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** incb x1 +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** incb x1 +** ... +** smstop za +** ... +** bl private_za +** ... 
+** ret +*/ +__arm_new("za") void test24() +{ + in_za(); + asm ("incb\tx1" ::: "x1", "za"); + out_za(); + inout_za(); + private_za(); + asm ("incb\tx1" ::: "x1", "za"); + private_za(); + asm ("incb\tx1" ::: "x1", "za"); + in_za(); + private_za(); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_5.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_5.c new file mode 100644 index 00000000000..d54840d3d77 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_5.c @@ -0,0 +1,595 @@ +// { dg-options "-O2 -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +void private_za(); +void out_za() __arm_out("za"); +void in_za() __arm_in("za"); +void inout_za() __arm_inout("za"); +void preserves_za() __arm_preserves("za"); + +/* +** test1: +** ret +*/ +void test1() __arm_inout("za") +{ +} + +/* +** test2: +** ldr w0, \[x0\] +** ret +*/ +int test2(int *ptr) __arm_inout("za") +{ + return *ptr; +} + +/* +** test3: +** ... +** sub sp, sp, x[0-9]+ +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +*/ +int test3() __arm_inout("za") +{ + private_za(); + return 0; +} + +/* +** test4: +** stp [^\n]+ +** [^\n]+ +** bl in_za +** ldp [^\n]+ +** ret +*/ +void test4() __arm_inout("za") +{ + in_za(); +} + +/* +** test5: +** ... +** smstop za +** ... +** bl private_za +** smstart za +** bl out_za +** bl in_za +** ... +** sub sp, sp, x[0-9]+ +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +*/ +void test5() __arm_inout("za") +{ + private_za(); + out_za(); + in_za(); + private_za(); +} + +/* +** test6: +** ... +** bl out_za +** ... +** sub sp, sp, x[0-9]+ +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... 
+** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +** bl in_za +** ... +*/ +void test6() __arm_inout("za") +{ + out_za(); + private_za(); + in_za(); +} + +/* +** test7: +** stp [^\n]+ +** [^\n]+ +** bl out_za +** bl in_za +** smstop za +** bl private_za +** smstart za +** bl out_za +** bl in_za +** ldp [^\n]+ +** ret +*/ +void test7() __arm_inout("za") +{ + out_za(); + in_za(); + private_za(); + out_za(); + in_za(); +} + +/* +** test8: +** stp [^\n]+ +** [^\n]+ +** bl out_za +** bl in_za +** smstop za +** bl private_za +** smstart za +** bl out_za +** bl in_za +** ... +** sub sp, sp, x[0-9]+ +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +** ret +*/ +void test8() __arm_inout("za") +{ + out_za(); + in_za(); + private_za(); + out_za(); + in_za(); + private_za(); +} + +/* +** test9: +** stp [^\n]+ +** [^\n]+ +** bl out_za +** ... +** msr tpidr2_el0, x[0-9]+ +** bl private_za +** bl private_za +** bl private_za +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +*/ +void test9() __arm_inout("za") +{ + out_za(); + private_za(); + private_za(); + private_za(); + private_za(); + in_za(); +} + +/* +** test10: +** ldr (w[0-9]+), \[x0\] +** cbz \1, [^\n]+ +** ldr [^\n]+ +** add [^\n]+ +** str [^\n]+ +** ret +** ... 
+*/ +void test10(volatile int *ptr) __arm_inout("za") +{ + if (__builtin_expect (*ptr != 0, 1)) + *ptr = *ptr + 1; + else + inout_za(); +} + +/* +** test11: +** (?!.*(\t__arm|\tza|tpidr2_el0)).* +*/ +void test11(volatile int *ptr) __arm_inout("za") +{ + if (__builtin_expect (*ptr == 0, 0)) + do + inout_za(); + while (*ptr); + else + *ptr += 1; +} + +void test12(volatile int *ptr) __arm_inout("za") +{ + do + { + inout_za(); + private_za(); + } + while (*ptr); + out_za(); + in_za(); +} + +/* +** test13: +** stp [^\n]+ +** ... +** stp [^\n]+ +** ... +-- loop: +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** bl inout_za +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ldr [^\n]+ +** cbnz [^\n]+ +** smstart za +** msr tpidr2_el0, xzr +** bl out_za +** bl in_za +** [^\n]+ +** [^\n]+ +** ldp [^\n]+ +** ret +*/ +void test13(volatile int *ptr) __arm_inout("za") +{ + do + { + private_za(); + inout_za(); + private_za(); + } + while (*ptr); + out_za(); + in_za(); +} + +/* +** test14: +** ... +** bl inout_za +** ldr [^\n]+ +** cbnz [^\n]+ +** bl out_za +** bl in_za +** ... +*/ +void test14(volatile int *ptr) __arm_inout("za") +{ + do + inout_za(); + while (*ptr); + out_za(); + in_za(); +} + +/* +** test15: +** ... +** bl out_za +** bl in_za +** ldr [^\n]+ +** cbnz [^\n]+ +** ... +** stp [^\n]+ +** ... +** msr tpidr2_el0, [^\n]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +*/ +void test15(volatile int *ptr) __arm_inout("za") +{ + do + { + out_za(); + in_za(); + } + while (*ptr); + private_za(); +} + +/* +** test16: +** stp [^\n]+ +** ... +** stp [^\n]+ +** ... +** b [^\n]+ +-- loop: +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** msr tpidr2_el0, xzr +-- loop_entry: +** bl inout_za +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** bl private_za +** ... 
+** mrs x[0-9]+, tpidr2_el0 +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +*/ +void test16(volatile int *ptr) __arm_inout("za") +{ + do + { + inout_za(); + private_za(); + } + while (*ptr); + private_za(); +} + +/* +** test17: +** ... +-- loop: +** bl inout_za +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** smstart za +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +** cbnz [^\n]+ +** [^\n]+ +** [^\n]+ +** ldp [^\n]+ +** ret +*/ +void test17(volatile int *ptr) __arm_inout("za") +{ + do + { + inout_za(); + private_za(); + while (*ptr) + ptr += 1; + } + while (*ptr); +} + +/* +** test18: +** ldr w[0-9]+, [^\n]+ +** cbnz w[0-9]+, [^\n]+ +** ret +** ... +** bl out_za +** bl in_za +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** bl __arm_tpidr2_restore +** ... +** msr tpidr2_el0, xzr +** ... +*/ +void test18(volatile int *ptr) __arm_inout("za") +{ + if (__builtin_expect (*ptr, 0)) + { + out_za(); + in_za(); + private_za(); + } +} + +void test19(volatile int *ptr) __arm_inout("za") +{ + if (__builtin_expect (*ptr != 0, 1)) + private_za(); + else + do + { + inout_za(); + private_za(); + } + while (*ptr); +} + +/* +** test20: +** ... +** bl a20 +** (?:(?!x0).)* +** bl b20 +** ... +** mov ([wx][0-9]+), [wx]0 +** ... +** bl __arm_tpidr2_restore +** ... +** mov [wx]0, \1 +** ... +** bl c20 +** ... +*/ +void test20() __arm_inout("za") +{ + extern int a20() __arm_inout("za"); + extern int b20(int); + extern void c20(int) __arm_inout("za"); + c20(b20(a20())); +} + +/* +** test21: +** ... +** bl a21 +** (?:(?!x0).)* +** bl b21 +** ... +** mov (x[0-9]+), x0 +** ... +** bl __arm_tpidr2_restore +** ... +** mov x0, \1 +** ... +** bl c21 +** ... 
+*/ +void test21() __arm_inout("za") +{ + extern __UINT64_TYPE__ a21() __arm_inout("za"); + extern __UINT64_TYPE__ b21(__UINT64_TYPE__); + extern void c21(__UINT64_TYPE__) __arm_inout("za"); + c21(b21(a21())); +} + +/* +** test22: +** (?:(?!rdsvl).)* +** rdsvl x[0-9]+, #1 +** (?:(?!rdsvl).)* +*/ +void test22(volatile int *ptr) __arm_inout("za") +{ + inout_za(); + if (*ptr) + *ptr += 1; + else + private_za(); + private_za(); + in_za(); +} + +void test23(volatile int *ptr) __arm_inout("za") +{ + if (*ptr) + *ptr += 1; + else + inout_za(); + inout_za(); +} + +/* +** test24: +** ... +** bl in_za +** ... +** incb x1 +** ... +** bl out_za +** bl inout_za +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** incb x1 +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** incb x1 +** ... +** msr tpidr2_el0, x[0-9]+ +** ... +** bl private_za +** ... +** mrs x[0-9]+, tpidr2_el0 +** ... +** ret +*/ +void test24() __arm_inout("za") +{ + in_za(); + asm ("incb\tx1" ::: "x1", "za"); + out_za(); + inout_za(); + private_za(); + asm ("incb\tx1" ::: "x1", "za"); + private_za(); + asm ("incb\tx1" ::: "x1", "za"); + in_za(); + private_za(); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_6.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_6.c new file mode 100644 index 00000000000..d5b226ae158 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_6.c @@ -0,0 +1,23 @@ +// { dg-options "-O -fno-optimize-sibling-calls -fomit-frame-pointer" } + +void private_za(); +void out_za() __arm_out("za"); +void in_za() __arm_in("za"); + +__arm_new("za") void test20(volatile int *ptr) +{ + if (*ptr) + out_za(); + else + *ptr += 1; + *ptr += 1; + if (*ptr) + in_za(); + else + *ptr += 1; +} + +// { dg-final { scan-assembler {\tbl\t__arm_tpidr2_save\n} } } +// { dg-final { scan-assembler {\tsmstart\tza\n} } } +// { dg-final { scan-assembler {\tsmstop\tza\n} } } +// { 
dg-final { scan-assembler-not {\tsub\tsp, sp, x[0-9]+\n} } }

From patchwork Tue Dec 5 10:13:15 2023
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1872038
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 17/25] aarch64: Add a register class for w12-w15
Date: Tue, 5 Dec 2023 10:13:15 +0000
Message-Id: <20231205101323.1914247-18-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

Some SME instructions use w12-w15 to index ZA.  This patch
adds a register class for that range.

gcc/
	* config/aarch64/aarch64.h (W12_W15_REGNUM_P): New macro.
	(W12_W15_REGS): New register class.
	(REG_CLASS_NAMES, REG_CLASS_CONTENTS): Add entries for it.
	* config/aarch64/aarch64.cc (aarch64_regno_regclass)
	(aarch64_class_max_nregs, aarch64_register_move_cost): Handle
	W12_W15_REGS.
---
 gcc/config/aarch64/aarch64.cc | 12 +++++++-----
 gcc/config/aarch64/aarch64.h  |  6 ++++++
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 5d06c7fb411..9a97b71fedc 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -11959,6 +11959,9 @@ aarch64_label_mentioned_p (rtx x)
 enum reg_class
 aarch64_regno_regclass (unsigned regno)
 {
+  if (W12_W15_REGNUM_P (regno))
+    return W12_W15_REGS;
+
   if (STUB_REGNUM_P (regno))
     return STUB_REGS;
 
@@ -12323,6 +12326,7 @@ aarch64_class_max_nregs (reg_class_t regclass, machine_mode mode)
   unsigned int nregs, vec_flags;
   switch (regclass)
     {
+    case W12_W15_REGS:
     case STUB_REGS:
     case TAILCALL_ADDR_REGS:
     case POINTER_REGS:
@@ -14693,13 +14697,11 @@ aarch64_register_move_cost (machine_mode mode,
   const struct cpu_regmove_cost *regmove_cost
     = aarch64_tune_params.regmove_cost;
 
-  /* Caller save and pointer regs are equivalent to GENERAL_REGS.  */
-  if (to == TAILCALL_ADDR_REGS || to == POINTER_REGS
-      || to == STUB_REGS)
+  /* Treat any subset of POINTER_REGS as though it were GENERAL_REGS.  */
+  if (reg_class_subset_p (to, POINTER_REGS))
     to = GENERAL_REGS;
 
-  if (from == TAILCALL_ADDR_REGS || from == POINTER_REGS
-      || from == STUB_REGS)
+  if (reg_class_subset_p (from, POINTER_REGS))
     from = GENERAL_REGS;
 
   /* Make RDFFR very expensive.
In particular, if we know that the FFR
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 2d39b843d9c..57012a7c763 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -663,6 +663,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF;
    && (REGNO) != R17_REGNUM \
    && (REGNO) != R30_REGNUM) \
 
+#define W12_W15_REGNUM_P(REGNO) \
+  IN_RANGE (REGNO, R12_REGNUM, R15_REGNUM)
+
 #define FP_REGNUM_P(REGNO) \
   (((unsigned) (REGNO - V0_REGNUM)) <= (V31_REGNUM - V0_REGNUM))
 
@@ -689,6 +692,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF;
 enum reg_class
 {
   NO_REGS,
+  W12_W15_REGS,
   TAILCALL_ADDR_REGS,
   STUB_REGS,
   GENERAL_REGS,
@@ -713,6 +717,7 @@ enum reg_class
 #define REG_CLASS_NAMES \
 { \
   "NO_REGS", \
+  "W12_W15_REGS", \
   "TAILCALL_ADDR_REGS", \
   "STUB_REGS", \
   "GENERAL_REGS", \
@@ -734,6 +739,7 @@ enum reg_class
 #define REG_CLASS_CONTENTS \
 { \
   { 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS */ \
+  { 0x0000f000, 0x00000000, 0x00000000 }, /* W12_W15_REGS */ \
   { 0x00030000, 0x00000000, 0x00000000 }, /* TAILCALL_ADDR_REGS */\
   { 0x3ffcffff, 0x00000000, 0x00000000 }, /* STUB_REGS */ \
   { 0x7fffffff, 0x00000000, 0x00000003 }, /* GENERAL_REGS */ \

From patchwork Tue Dec 5 10:13:16 2023
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1872039
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 18/25] aarch64: Add a VNx1TI mode
Date: Tue, 5 Dec 2023 10:13:16 +0000
Message-Id: <20231205101323.1914247-19-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

Although TI isn't really a native SVE element mode, it's convenient
for SME if we define VNx1TI anyway, so that it can be used to
distinguish .Q ZA operations from others.  It's purely an RTL
convenience and isn't (yet) a valid storage mode.

gcc/
	* config/aarch64/aarch64-modes.def: Add VNx1TI.
---
 gcc/config/aarch64/aarch64-modes.def | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def
index 6b4f4e17dd5..a3efc5b8484 100644
--- a/gcc/config/aarch64/aarch64-modes.def
+++ b/gcc/config/aarch64/aarch64-modes.def
@@ -156,7 +156,7 @@ ADV_SIMD_Q_REG_STRUCT_MODES (4, V4x16, V4x8, V4x4, V4x2)
    for 8-bit, 16-bit, 32-bit and 64-bit elements respectively.  It isn't
    strictly necessary to set the alignment here, since the default would
    be clamped to BIGGEST_ALIGNMENT anyhow, but it seems clearer.
*/ -#define SVE_MODES(NVECS, VB, VH, VS, VD) \ +#define SVE_MODES(NVECS, VB, VH, VS, VD, VT) \ VECTOR_MODES_WITH_PREFIX (VNx, INT, 16 * NVECS, NVECS == 1 ? 1 : 4); \ VECTOR_MODES_WITH_PREFIX (VNx, FLOAT, 16 * NVECS, NVECS == 1 ? 1 : 4); \ \ @@ -164,6 +164,7 @@ ADV_SIMD_Q_REG_STRUCT_MODES (4, V4x16, V4x8, V4x4, V4x2) ADJUST_NUNITS (VH##HI, aarch64_sve_vg * NVECS * 4); \ ADJUST_NUNITS (VS##SI, aarch64_sve_vg * NVECS * 2); \ ADJUST_NUNITS (VD##DI, aarch64_sve_vg * NVECS); \ + ADJUST_NUNITS (VT##TI, exact_div (aarch64_sve_vg * NVECS, 2)); \ ADJUST_NUNITS (VH##BF, aarch64_sve_vg * NVECS * 4); \ ADJUST_NUNITS (VH##HF, aarch64_sve_vg * NVECS * 4); \ ADJUST_NUNITS (VS##SF, aarch64_sve_vg * NVECS * 2); \ @@ -173,17 +174,23 @@ ADV_SIMD_Q_REG_STRUCT_MODES (4, V4x16, V4x8, V4x4, V4x2) ADJUST_ALIGNMENT (VH##HI, 16); \ ADJUST_ALIGNMENT (VS##SI, 16); \ ADJUST_ALIGNMENT (VD##DI, 16); \ + ADJUST_ALIGNMENT (VT##TI, 16); \ ADJUST_ALIGNMENT (VH##BF, 16); \ ADJUST_ALIGNMENT (VH##HF, 16); \ ADJUST_ALIGNMENT (VS##SF, 16); \ ADJUST_ALIGNMENT (VD##DF, 16); -/* Give SVE vectors the names normally used for 256-bit vectors. - The actual number depends on command-line flags. */ -SVE_MODES (1, VNx16, VNx8, VNx4, VNx2) -SVE_MODES (2, VNx32, VNx16, VNx8, VNx4) -SVE_MODES (3, VNx48, VNx24, VNx12, VNx6) -SVE_MODES (4, VNx64, VNx32, VNx16, VNx8) +/* Give SVE vectors names of the form VNxX, where X describes what is + stored in each 128-bit unit. The actual size of the mode depends + on command-line flags. + + VNx1TI isn't really a native SVE mode, but it can be useful in some + limited situations. 
*/ +VECTOR_MODE_WITH_PREFIX (VNx, INT, TI, 1, 1); +SVE_MODES (1, VNx16, VNx8, VNx4, VNx2, VNx1) +SVE_MODES (2, VNx32, VNx16, VNx8, VNx4, VNx2) +SVE_MODES (3, VNx48, VNx24, VNx12, VNx6, VNx3) +SVE_MODES (4, VNx64, VNx32, VNx16, VNx8, VNx4) /* Partial SVE vectors: From patchwork Tue Dec 5 10:13:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872044 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxLN6hkBz1ySd for ; Tue, 5 Dec 2023 21:17:48 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C55DA385DC19 for ; Tue, 5 Dec 2023 10:17:22 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id DE96938618B1 for ; Tue, 5 Dec 2023 10:13:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DE96938618B1 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org DE96938618B1 Authentication-Results: server2.sourceware.org; arc=none 
smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771231; cv=none; b=L4kNvrTfScjK6KfILmjVKx4yZI3VfBGkACqOWBg3uNHivDqHg1os/Us0ECQV5kN38z4fLvmagVjCCp3LHJPhx8OF00B/zZaF7ZzI0GbbWlTi9iwxy9PL/T2t9eMvScjfI+PpY6/0XXXWxIOyqyep4l8s0/JTAIwvmm+eiVJMptM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771231; c=relaxed/simple; bh=ZX1X2RJGxu3F7uTrr2VOS91rb1yd2eAVuZJCjZflXIo=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=VqlOv66x02bTmhIQsNB6OHRxGaOX41jMm03vJy5WyX49jPSkSP7YwSHS1sqYqYVSTYYp5S3Q+4ka8A4+sUsbFc2a1Awv5OCF34svuYpimxiAXwJwB0g/BRSdW5HUYl3gWZqrzjU13YBO91BbuKf9aJZI6u8xgkA7K1B5sFbVWfE= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 50A981576; Tue, 5 Dec 2023 02:14:33 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 422603F5A1; Tue, 5 Dec 2023 02:13:46 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 19/25] aarch64: Generalise unspec_based_function_base Date: Tue, 5 Dec 2023 10:13:17 +0000 Message-Id: <20231205101323.1914247-20-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Until now, SVE intrinsics that map directly to unspecs have always used type suffix 0 to distinguish between signed integers, unsigned integers, and floating-point values. SME adds functions that need to use type suffix 1 instead. This patch generalises the classes accordingly. gcc/ * config/aarch64/aarch64-sve-builtins-functions.h (unspec_based_function_base): Allow type suffix 1 to determine the mode of the operation. (unspec_based_function): Update accordingly. (unspec_based_fused_function): Likewise. (unspec_based_fused_lane_function): Likewise. --- .../aarch64/aarch64-sve-builtins-functions.h | 29 ++++++++++++------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-functions.h b/gcc/config/aarch64/aarch64-sve-builtins-functions.h index 4a10102038a..be2561620f4 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-functions.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-functions.h @@ -234,18 +234,21 @@ class unspec_based_function_base : public function_base public: CONSTEXPR unspec_based_function_base (int unspec_for_sint, int unspec_for_uint, - int unspec_for_fp) + int unspec_for_fp, + unsigned int suffix_index = 0) : m_unspec_for_sint (unspec_for_sint), m_unspec_for_uint (unspec_for_uint), - m_unspec_for_fp (unspec_for_fp) + m_unspec_for_fp (unspec_for_fp), + m_suffix_index (suffix_index) {} /* Return the unspec code to use for INSTANCE, based on type suffix 0. */ int unspec_for (const function_instance &instance) const { - return (!instance.type_suffix (0).integer_p ? m_unspec_for_fp - : instance.type_suffix (0).unsigned_p ? m_unspec_for_uint + auto &suffix = instance.type_suffix (m_suffix_index); + return (!suffix.integer_p ? m_unspec_for_fp + : suffix.unsigned_p ? 
m_unspec_for_uint : m_unspec_for_sint); } @@ -254,6 +257,9 @@ public: int m_unspec_for_sint; int m_unspec_for_uint; int m_unspec_for_fp; + + /* Which type suffix is used to choose between the unspecs. */ + unsigned int m_suffix_index; }; /* A function_base for functions that have an associated unspec code. @@ -306,7 +312,8 @@ public: rtx expand (function_expander &e) const override { - return e.use_exact_insn (CODE (unspec_for (e), e.vector_mode (0))); + return e.use_exact_insn (CODE (unspec_for (e), + e.vector_mode (m_suffix_index))); } }; @@ -360,16 +367,16 @@ public: { int unspec = unspec_for (e); insn_code icode; - if (e.type_suffix (0).float_p) + if (e.type_suffix (m_suffix_index).float_p) { /* Put the operands in the normal (fma ...) order, with the accumulator last. This fits naturally since that's also the unprinted operand in the asm output. */ e.rotate_inputs_left (0, e.pred != PRED_none ? 4 : 3); - icode = code_for_aarch64_sve (unspec, e.vector_mode (0)); + icode = code_for_aarch64_sve (unspec, e.vector_mode (m_suffix_index)); } else - icode = INT_CODE (unspec, e.vector_mode (0)); + icode = INT_CODE (unspec, e.vector_mode (m_suffix_index)); return e.use_exact_insn (icode); } }; @@ -390,16 +397,16 @@ public: { int unspec = unspec_for (e); insn_code icode; - if (e.type_suffix (0).float_p) + if (e.type_suffix (m_suffix_index).float_p) { /* Put the operands in the normal (fma ...) order, with the accumulator last. This fits naturally since that's also the unprinted operand in the asm output. */ e.rotate_inputs_left (0, e.pred != PRED_none ? 
5 : 4); - icode = code_for_aarch64_lane (unspec, e.vector_mode (0)); + icode = code_for_aarch64_lane (unspec, e.vector_mode (m_suffix_index)); } else - icode = INT_CODE (unspec, e.vector_mode (0)); + icode = INT_CODE (unspec, e.vector_mode (m_suffix_index)); return e.use_exact_insn (icode); } }; From patchwork Tue Dec 5 10:13:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872048 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxMb5Bg2z1ySd for ; Tue, 5 Dec 2023 21:18:51 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8AA943870C36 for ; Tue, 5 Dec 2023 10:18:28 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 951D23861002 for ; Tue, 5 Dec 2023 10:13:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 951D23861002 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 951D23861002 Authentication-Results: server2.sourceware.org; 
arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771231; cv=none; b=CFxZukO0NH+8dB4VIWtClmxQU+maeFgJgdpUi0XKs2SUO7GKkTzdQOwVBqb8Ouw6wF2mvMMWOLuFGPtxMXNZ9FSyc8crOn+WlANWCKg76uedU3v7Ju6SXEUIP//vITzB5kFh5/Ynp7WwTSDWLG56fyxqKQFMP5iyF+KjTy5T76g= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771231; c=relaxed/simple; bh=vKuqfBh+EgtLebRva+K5zivycLgR0dDuBjS4Pbj8Q9w=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=hCQPlUVxSOzF6IlatUtw67HCT16hIaO519uW0xR5xwXykCytA1Qu28ONKNyf0UdsEl0PKoWb86BNa6Y1vzp3zWpMBWI1PevYSjZIGVjTkYa1vCcoI1JJWnXBPLBMHq1YQXp7qiUexQzQ2Imd6H+Fvg7xHhpoL9DeO87JZRz2vzA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 006AD1570; Tue, 5 Dec 2023 02:14:34 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E62293F5A1; Tue, 5 Dec 2023 02:13:46 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 20/25] aarch64: Generalise _m rules for SVE intrinsics Date: Tue, 5 Dec 2023 10:13:18 +0000 Message-Id: <20231205101323.1914247-21-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org In SVE there was a simple rule that unary merging (_m) intrinsics had a separate initial argument to specify the values of inactive lanes, whereas other merging functions took inactive lanes from the first operand to the operation. That rule began to break down in SVE2, and it continues to do so in SME. This patch therefore adds a virtual function to specify whether the separate initial argument is present or not. The old rule is still the default. gcc/ * config/aarch64/aarch64-sve-builtins.h (function_shape::has_merge_argument_p): New member function. * config/aarch64/aarch64-sve-builtins.cc: (function_resolver::check_gp_argument): Use it. (function_expander::get_fallback_value): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (apply_predication): Likewise. (unary_convert_narrowt_def::has_merge_argument_p): New function. --- gcc/config/aarch64/aarch64-sve-builtins-shapes.cc | 10 ++++++++-- gcc/config/aarch64/aarch64-sve-builtins.cc | 4 ++-- gcc/config/aarch64/aarch64-sve-builtins.h | 13 +++++++++++++ 3 files changed, 23 insertions(+), 4 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 2c25b122f05..68708712001 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -66,8 +66,8 @@ apply_predication (const function_instance &instance, tree return_type, the same type as the result. For unary_convert_narrowt it also provides the "bottom" half of active elements, and is present for all types of predication. 
*/ - if ((argument_types.length () == 2 && instance.pred == PRED_m) - || instance.shape == shapes::unary_convert_narrowt) + auto nargs = argument_types.length () - 1; + if (instance.shape->has_merge_argument_p (instance, nargs)) argument_types.quick_insert (0, return_type); } } @@ -3271,6 +3271,12 @@ SHAPE (unary_convert) predicate. */ struct unary_convert_narrowt_def : public overloaded_base<1> { + bool + has_merge_argument_p (const function_instance &, unsigned int) const override + { + return true; + } + void build (function_builder &b, const function_group_info &group) const override { diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index d5ac1dc76c5..7950977c14b 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -2287,7 +2287,7 @@ function_resolver::check_gp_argument (unsigned int nops, if (pred != PRED_none) { /* Unary merge operations should use resolve_unary instead. */ - gcc_assert (nops != 1 || pred != PRED_m); + gcc_assert (!shape->has_merge_argument_p (*this, nops)); nargs = nops + 1; if (!check_num_arguments (nargs) || !require_vector_type (i, VECTOR_TYPE_svbool_t)) @@ -2997,7 +2997,7 @@ function_expander::get_fallback_value (machine_mode mode, unsigned int nops, gcc_assert (pred == PRED_m || pred == PRED_x); if (merge_argno == DEFAULT_MERGE_ARGNO) - merge_argno = nops == 1 && pred == PRED_m ? 0 : 1; + merge_argno = shape->has_merge_argument_p (*this, nops) ? 
0 : 1; if (merge_argno == 0) return args[argno++]; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index e770a4042fe..b0218bbad6e 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -712,6 +712,9 @@ public: class function_shape { public: + virtual bool has_merge_argument_p (const function_instance &, + unsigned int) const; + virtual bool explicit_type_suffix_p (unsigned int) const = 0; /* True if the group suffix is present in overloaded names. @@ -987,6 +990,16 @@ function_base::vectors_per_tuple (const function_instance &instance) const return instance.group_suffix ().vectors_per_tuple; } +/* Return true if INSTANCE (which has NARGS arguments) has an initial + vector argument whose only purpose is to specify the values of + inactive lanes. */ +inline bool +function_shape::has_merge_argument_p (const function_instance &instance, + unsigned int nargs) const +{ + return nargs == 1 && instance.pred == PRED_m; +} + /* Return the mode of the result of a call. 
*/ inline machine_mode function_expander::result_mode () const From patchwork Tue Dec 5 10:13:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872050 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxNZ0qsyz1ySd for ; Tue, 5 Dec 2023 21:19:42 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0F7893858407 for ; Tue, 5 Dec 2023 10:19:40 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 67324384F9AE for ; Tue, 5 Dec 2023 10:13:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 67324384F9AE Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 67324384F9AE Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771236; cv=none; 
b=fCR2Ttg8NW46UBARcd/KhAHYrXJ9rtzjTt50QMA4KlEUCxdJUtZL+CkXFOCsjlkQ+66VHtjVcJ5qmcSlPCQ8bJaC1mveTiqzLfdab89gszetR6znPpEklZlbKWj++QIlcO4IE9UZGwjS6JuC8j6rYSgSfl8ud5Y4C7FIB8lgFZA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771236; c=relaxed/simple; bh=+49XlB4l5uEI4GBV0G+G+xP1E5Bh38Glh8Q6n2fn5Uk=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=DhkJNUCmWk7iwz/ngelTWVcAqEGRJeqHDv3eDRhJvOaEZ2H2oHUQ+/x2QOoNbf5T3eA/NNR52JpMXef4mkwzoYjk/HP/SZ6unZj1YI3Cw1gCxI6BFN89iuuZp0FRBXKl/Tt4/eeomW8hRoRJzQPJMrj2A0hmSFcZbaHeiIFwjEA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A97FD15A1; Tue, 5 Dec 2023 02:14:34 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 9ABAF3F5A1; Tue, 5 Dec 2023 02:13:47 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 21/25] aarch64: Add support for Date: Tue, 5 Dec 2023 10:13:19 +0000 Message-Id: <20231205101323.1914247-22-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-21.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

This adds support for the SME parts of arm_sme.h.

gcc/
	* doc/invoke.texi: Document +sme-i16i64 and +sme-f64f64.
	* config.gcc (aarch64*-*-*): Add arm_sme.h to the list of headers to install and aarch64-sve-builtins-sme.o to the list of objects to build.
	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define or undefine TARGET_SME, TARGET_SME_I16I64 and TARGET_SME_F64F64.
	(aarch64_pragma_aarch64): Handle arm_sme.h.
	* config/aarch64/aarch64-option-extensions.def (sme-i16i64)
	(sme-f64f64): New extensions.
	* config/aarch64/aarch64-protos.h (aarch64_sme_vq_immediate)
	(aarch64_addsvl_addspl_immediate_p, aarch64_output_addsvl_addspl)
	(aarch64_output_sme_zero_za): Declare.
	(aarch64_output_move_struct): Delete.
	(aarch64_sme_ldr_vnum_offset): Declare.
	(aarch64_sve::handle_arm_sme_h): Likewise.
	* config/aarch64/aarch64.h (AARCH64_ISA_SM_ON): New macro.
	(AARCH64_ISA_SME_I16I64, AARCH64_ISA_SME_F64F64): Likewise.
	(TARGET_STREAMING, TARGET_STREAMING_SME): Likewise.
	(TARGET_SME_I16I64, TARGET_SME_F64F64): Likewise.
	* config/aarch64/aarch64.cc (aarch64_sve_rdvl_factor_p): Rename to...
	(aarch64_sve_rdvl_addvl_factor_p): ...this.
	(aarch64_sve_rdvl_immediate_p): Update accordingly.
	(aarch64_rdsvl_immediate_p, aarch64_add_offset): Likewise.
	(aarch64_sme_vq_immediate): Likewise. Make public.
	(aarch64_sve_addpl_factor_p): New function.
	(aarch64_sve_addvl_addpl_immediate_p): Use aarch64_sve_rdvl_addvl_factor_p and aarch64_sve_addpl_factor_p.
	(aarch64_addsvl_addspl_immediate_p): New function.
	(aarch64_output_addsvl_addspl): Likewise.
	(aarch64_cannot_force_const_mem): Return true for RDSVL immediates.
	(aarch64_classify_index): Handle .Q scaling for VNx1TImode.
	(aarch64_classify_address): Likewise for vnum offsets.
	(aarch64_output_sme_zero_za): New function.
	(aarch64_sme_ldr_vnum_offset_p): Likewise.
	* config/aarch64/predicates.md (aarch64_addsvl_addspl_immediate): New predicate.
	(aarch64_pluslong_operand): Include it for SME.
	* config/aarch64/constraints.md (Ucj, Uav): New constraints.
	* config/aarch64/iterators.md (VNx1TI_ONLY): New mode iterator.
	(SME_ZA_I, SME_ZA_SDI, SME_ZA_SDF_I, SME_MOP_BHI): Likewise.
	(SME_MOP_HSDF): Likewise.
	(UNSPEC_SME_ADDHA, UNSPEC_SME_ADDVA, UNSPEC_SME_FMOPA)
	(UNSPEC_SME_FMOPS, UNSPEC_SME_LD1_HOR, UNSPEC_SME_LD1_VER)
	(UNSPEC_SME_READ_HOR, UNSPEC_SME_READ_VER, UNSPEC_SME_SMOPA)
	(UNSPEC_SME_SMOPS, UNSPEC_SME_ST1_HOR, UNSPEC_SME_ST1_VER)
	(UNSPEC_SME_SUMOPA, UNSPEC_SME_SUMOPS, UNSPEC_SME_UMOPA)
	(UNSPEC_SME_UMOPS, UNSPEC_SME_USMOPA, UNSPEC_SME_USMOPS)
	(UNSPEC_SME_WRITE_HOR, UNSPEC_SME_WRITE_VER): New unspecs.
	(elem_bits): Handle x2 and x4 structure modes, plus VNx1TI.
	(Vetype, Vesize, VPRED): Handle VNx1TI.
	(b): New mode attribute.
	(SME_LD1, SME_READ, SME_ST1, SME_WRITE, SME_BINARY_SDI, SME_INT_MOP)
	(SME_FP_MOP): New int iterators.
	(optab): Handle SME unspecs.
	(hv): New int attribute.
	* config/aarch64/aarch64.md (*add3_aarch64): Handle ADDSVL and ADDSPL.
	* config/aarch64/aarch64-sme.md (UNSPEC_SME_LDR): New unspec.
	(@aarch64_sme_, @aarch64_sme__plus)
	(aarch64_sme_ldr0, @aarch64_sme_ldrn): New patterns.
	(UNSPEC_SME_STR): New unspec.
	(@aarch64_sme_, @aarch64_sme__plus)
	(aarch64_sme_str0, @aarch64_sme_strn): New patterns.
	(@aarch64_sme_): Likewise.
	(*aarch64_sme__plus): Likewise.
	(@aarch64_sme_): Likewise.
	(@aarch64_sme_): Likewise.
	(*aarch64_sme__plus): Likewise.
	(@aarch64_sme_): Likewise.
	(UNSPEC_SME_ZERO): New unspec.
	(aarch64_sme_zero): New pattern.
	(@aarch64_sme_): Likewise.
	(@aarch64_sme_): Likewise.
	(@aarch64_sme_): Likewise.
	* config/aarch64/aarch64-sve-builtins.def: Add ZA type suffixes. Include aarch64-sve-builtins-sme.def.
	(DEF_SME_ZA_FUNCTION): New macro.
	* config/aarch64/aarch64-sve-builtins.h (CP_READ_ZA): New call property.
	(CP_WRITE_ZA): Likewise.
	(PRED_za_m): New predication type.
	(type_suffix_index): Handle DEF_SME_ZA_SUFFIX.
	(type_suffix_info): Add vector_p and za_p fields.
	(function_instance::num_za_tiles): New member function.
	(function_builder::get_attributes): Add an aarch64_feature_flags argument.
	(function_expander::get_contiguous_base): Take a base argument number, a vnum argument number, and an argument that indicates whether the vnum parameter is a factor of the SME vector length or the prevailing vector length.
	(function_expander::add_integer_operand): Take a poly_int64.
	(sve_switcher::sve_switcher): Take a base set of flags.
	(sme_switcher): New class.
	(scalar_types): Add a null entry for NUM_VECTOR_TYPES.
	* config/aarch64/aarch64-sve-builtins.cc: Include aarch64-sve-builtins-sme.h.
	(pred_suffixes): Add an entry for PRED_za_m.
	(type_suffixes): Initialize vector_p and za_p. Handle ZA suffixes.
	(TYPES_all_za, TYPES_d_za, TYPES_za_bhsd_data, TYPES_za_all_data)
	(TYPES_za_s_integer, TYPES_za_d_integer, TYPES_mop_base)
	(TYPES_mop_base_signed, TYPES_mop_base_unsigned, TYPES_mop_i16i64)
	(TYPES_mop_i16i64_signed, TYPES_mop_i16i64_unsigned, TYPES_za): New type suffix macros.
	(preds_m, preds_za_m): New predication lists.
	(function_groups): Handle DEF_SME_ZA_FUNCTION.
	(scalar_types): Add an entry for NUM_VECTOR_TYPES.
	(find_type_suffix_for_scalar_type): Check positively for vectors rather than negatively for predicates.
	(check_required_extensions): Handle PSTATE.SM and PSTATE.ZA requirements.
	(report_out_of_range): Handle the case where the minimum and maximum are the same.
	(function_instance::reads_global_state_p): Return true for functions that read ZA.
	(function_instance::modifies_global_state_p): Return true for functions that write to ZA.
	(sve_switcher::sve_switcher): Add a base flags argument.
	(function_builder::get_name): Handle "__arm_" prefixes.
	(add_attribute): Add an overload that takes a namespace.
	(add_shared_state_attribute): New function.
	(function_builder::get_attributes): Take the required feature flags as argument. Add streaming and ZA attributes where appropriate.
	(function_builder::add_unique_function): Update calls accordingly.
	(function_resolver::check_gp_argument): Assert that the predication isn't ZA _m predication.
	(function_checker::function_checker): Don't bias the argument number for ZA _m predication.
	(function_expander::get_contiguous_base): Add arguments that specify the base argument number, the vnum argument number, and an argument that indicates whether the vnum parameter is a factor of the SME vector length or the prevailing vector length. Handle the SME case.
	(function_expander::add_input_operand): Handle pmode_register_operand.
	(function_expander::add_integer_operand): Take a poly_int64.
	(init_builtins): Call handle_arm_sme_h for LTO.
	(handle_arm_sve_h): Skip SME intrinsics.
	(handle_arm_sme_h): New function.
	* config/aarch64/aarch64-sve-builtins-functions.h (read_write_za, write_za): New classes.
	(unspec_based_sme_function, za_arith_function): New using aliases.
	(quiet_za_arith_function): Likewise.
	* config/aarch64/aarch64-sve-builtins-shapes.h (binary_za_int_m, binary_za_m, binary_za_uint_m, bool_inherent)
	(inherent_za, inherent_mask_za, ldr_za, load_za, read_za_m, store_za)
	(str_za, unary_za_m, write_za_m): Declare.
	* config/aarch64/aarch64-sve-builtins-shapes.cc (apply_predication): Expect za_m functions to have an existing governing predicate.
	(binary_za_m_base, binary_za_int_m_def, binary_za_m_def): New classes.
	(binary_za_uint_m_def, bool_inherent_def, inherent_za_def): Likewise.
	(inherent_mask_za_def, ldr_za_def, load_za_def, read_za_m_def)
	(store_za_def, str_za_def, unary_za_m_def, write_za_m_def): Likewise.
	* config/aarch64/arm_sme.h: New file.
	* config/aarch64/aarch64-sve-builtins-sme.h: Likewise.
	* config/aarch64/aarch64-sve-builtins-sme.cc: Likewise.
	* config/aarch64/aarch64-sve-builtins-sme.def: Likewise.
	* config/aarch64/t-aarch64 (aarch64-sve-builtins.o): Depend on aarch64-sve-builtins-sme.def and aarch64-sve-builtins-sme.h.
	(aarch64-sve-builtins-sme.o): New rule.
gcc/testsuite/
	* lib/target-supports.exp: Add sme and sme-i16i64 features.
	* gcc.target/aarch64/pragma_cpp_predefs_4.c: Test __ARM_FEATURE_SME* macros.
	* gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Allow functions to be marked as __arm_streaming, __arm_streaming_compatible, and __arm_inout("za").
	* g++.target/aarch64/sve/acle/general-c++/func_redef_4.c: Mark the function as __arm_streaming_compatible.
	* g++.target/aarch64/sve/acle/general-c++/func_redef_5.c: Likewise.
	* g++.target/aarch64/sve/acle/general-c++/func_redef_7.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/func_redef_4.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/func_redef_5.c: Likewise.
	* g++.target/aarch64/sme/aarch64-sme-acle-asm.exp: New test harness.
	* gcc.target/aarch64/sme/aarch64-sme-acle-asm.exp: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/binary_za_int_m_1.c: New test.
	* gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/binary_za_m_2.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/binary_za_uint_m_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/read_za_m_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/unary_za_m_1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/write_za_m_1.c: Likewise.
---
 gcc/config.gcc | 4 +-
 gcc/config/aarch64/aarch64-c.cc | 6 +
 .../aarch64/aarch64-option-extensions.def | 4 +
 gcc/config/aarch64/aarch64-protos.h | 8 +-
 gcc/config/aarch64/aarch64-sme.md | 373 +++++++++++++++
 .../aarch64/aarch64-sve-builtins-functions.h | 64 +++
 .../aarch64/aarch64-sve-builtins-shapes.cc | 306 +++++++++++-
 .../aarch64/aarch64-sve-builtins-shapes.h | 13 +
 .../aarch64/aarch64-sve-builtins-sme.cc | 412 +++++++++++++++++
 .../aarch64/aarch64-sve-builtins-sme.def | 76 +++
 gcc/config/aarch64/aarch64-sve-builtins-sme.h | 57 +++
 gcc/config/aarch64/aarch64-sve-builtins.cc | 336 ++++++++++++--
 gcc/config/aarch64/aarch64-sve-builtins.def | 28 ++
 gcc/config/aarch64/aarch64-sve-builtins.h | 46 +-
 gcc/config/aarch64/aarch64.cc | 140 +++++-
 gcc/config/aarch64/aarch64.h | 15 +
 gcc/config/aarch64/aarch64.md | 1 +
 gcc/config/aarch64/arm_sme.h | 45 ++
 gcc/config/aarch64/constraints.md | 9 +
 gcc/config/aarch64/iterators.md | 94 +++-
 gcc/config/aarch64/predicates.md | 8 +-
 gcc/config/aarch64/t-aarch64 | 17 +-
 gcc/doc/invoke.texi | 4 +
 .../aarch64/sme/aarch64-sme-acle-asm.exp | 82 ++++
 .../sve/acle/general-c++/func_redef_4.c | 3 +-
 .../sve/acle/general-c++/func_redef_5.c | 1 +
 .../sve/acle/general-c++/func_redef_7.c | 1 +
 .../gcc.target/aarch64/pragma_cpp_predefs_4.c | 38 ++
 .../aarch64/sme/aarch64-sme-acle-asm.exp | 81 ++++
 .../aarch64/sme/acle-asm/addha_za32.c | 48 ++
 .../aarch64/sme/acle-asm/addha_za64.c | 50 ++
 .../aarch64/sme/acle-asm/addva_za32.c | 48 ++
 .../aarch64/sme/acle-asm/addva_za64.c | 50 ++
 .../aarch64/sme/acle-asm/arm_has_sme_sc.c | 25 +
 .../sme/acle-asm/arm_in_streaming_mode_ns.c | 11 +
 .../sme/acle-asm/arm_in_streaming_mode_s.c | 11 +
 .../sme/acle-asm/arm_in_streaming_mode_sc.c | 26 ++
 .../gcc.target/aarch64/sme/acle-asm/cntsb_s.c | 310 +++++++++++++
 .../aarch64/sme/acle-asm/cntsb_sc.c | 12 +
 .../gcc.target/aarch64/sme/acle-asm/cntsd_s.c | 277 +++++++++++
 .../aarch64/sme/acle-asm/cntsd_sc.c | 13 +
 .../gcc.target/aarch64/sme/acle-asm/cntsh_s.c | 279 +++++++++++
 .../aarch64/sme/acle-asm/cntsh_sc.c | 13 +
 .../gcc.target/aarch64/sme/acle-asm/cntsw_s.c | 278 +++++++++++
 .../aarch64/sme/acle-asm/cntsw_sc.c | 13 +
 .../aarch64/sme/acle-asm/ld1_hor_vnum_za128.c | 77 ++++
 .../aarch64/sme/acle-asm/ld1_hor_vnum_za16.c | 123 +++++
 .../aarch64/sme/acle-asm/ld1_hor_vnum_za32.c | 123 +++++
 .../aarch64/sme/acle-asm/ld1_hor_vnum_za64.c | 112 +++++
 .../aarch64/sme/acle-asm/ld1_hor_vnum_za8.c | 112 +++++
 .../aarch64/sme/acle-asm/ld1_hor_za128.c | 83 ++++
 .../aarch64/sme/acle-asm/ld1_hor_za16.c | 126 +++++
 .../aarch64/sme/acle-asm/ld1_hor_za32.c | 125 +++++
 .../aarch64/sme/acle-asm/ld1_hor_za64.c | 105 +++++
 .../aarch64/sme/acle-asm/ld1_hor_za8.c | 95 ++++
 .../aarch64/sme/acle-asm/ld1_ver_vnum_za128.c | 77 ++++
 .../aarch64/sme/acle-asm/ld1_ver_vnum_za16.c | 123 +++++
 .../aarch64/sme/acle-asm/ld1_ver_vnum_za32.c | 123 +++++
 .../aarch64/sme/acle-asm/ld1_ver_vnum_za64.c | 112 +++++
 .../aarch64/sme/acle-asm/ld1_ver_vnum_za8.c | 112 +++++
 .../aarch64/sme/acle-asm/ld1_ver_za128.c | 83 ++++
 .../aarch64/sme/acle-asm/ld1_ver_za16.c | 126 +++++
 .../aarch64/sme/acle-asm/ld1_ver_za32.c | 125 +++++
 .../aarch64/sme/acle-asm/ld1_ver_za64.c | 105 +++++
 .../aarch64/sme/acle-asm/ld1_ver_za8.c | 95 ++++
 .../aarch64/sme/acle-asm/ldr_vnum_za_s.c | 147 ++++++
 .../aarch64/sme/acle-asm/ldr_vnum_za_sc.c | 148 ++++++
 .../aarch64/sme/acle-asm/ldr_za_s.c | 124 +++++
 .../aarch64/sme/acle-asm/ldr_za_sc.c | 71 +++
 .../aarch64/sme/acle-asm/mopa_za32.c | 102 ++++
 .../aarch64/sme/acle-asm/mopa_za64.c | 70 +++
 .../aarch64/sme/acle-asm/mops_za32.c | 102 ++++
 .../aarch64/sme/acle-asm/mops_za64.c | 70 +++
 .../aarch64/sme/acle-asm/read_hor_za128.c | 435 ++++++++++++++++++
 .../aarch64/sme/acle-asm/read_hor_za16.c | 207 +++++++++
 .../aarch64/sme/acle-asm/read_hor_za32.c | 196 ++++++++
 .../aarch64/sme/acle-asm/read_hor_za64.c | 186 ++++++++
 .../aarch64/sme/acle-asm/read_hor_za8.c | 125 +++++
 .../aarch64/sme/acle-asm/read_ver_za128.c | 435 ++++++++++++++++++
 .../aarch64/sme/acle-asm/read_ver_za16.c | 207 +++++++++
 .../aarch64/sme/acle-asm/read_ver_za32.c | 196 ++++++++
 .../aarch64/sme/acle-asm/read_ver_za64.c | 186 ++++++++
 .../aarch64/sme/acle-asm/read_ver_za8.c | 125 +++++
 .../aarch64/sme/acle-asm/st1_hor_vnum_za128.c | 77 ++++
 .../aarch64/sme/acle-asm/st1_hor_vnum_za16.c | 123 +++++
 .../aarch64/sme/acle-asm/st1_hor_vnum_za32.c | 123 +++++
 .../aarch64/sme/acle-asm/st1_hor_vnum_za64.c | 112 +++++
 .../aarch64/sme/acle-asm/st1_hor_vnum_za8.c | 112 +++++
 .../aarch64/sme/acle-asm/st1_hor_za128.c | 83 ++++
 .../aarch64/sme/acle-asm/st1_hor_za16.c | 126 +++++
 .../aarch64/sme/acle-asm/st1_hor_za32.c | 125 +++++
 .../aarch64/sme/acle-asm/st1_hor_za64.c | 105 +++++
 .../aarch64/sme/acle-asm/st1_hor_za8.c | 95 ++++
 .../aarch64/sme/acle-asm/st1_ver_vnum_za128.c | 77 ++++
 .../aarch64/sme/acle-asm/st1_ver_vnum_za16.c | 123 +++++
 .../aarch64/sme/acle-asm/st1_ver_vnum_za32.c | 123 +++++
 .../aarch64/sme/acle-asm/st1_ver_vnum_za64.c | 112 +++++
 .../aarch64/sme/acle-asm/st1_ver_vnum_za8.c | 112 +++++
 .../aarch64/sme/acle-asm/st1_ver_za128.c | 83 ++++
 .../aarch64/sme/acle-asm/st1_ver_za16.c | 126 +++++
 .../aarch64/sme/acle-asm/st1_ver_za32.c | 125 +++++
 .../aarch64/sme/acle-asm/st1_ver_za64.c | 105 +++++
 .../aarch64/sme/acle-asm/st1_ver_za8.c | 95 ++++
 .../aarch64/sme/acle-asm/str_vnum_za_s.c | 147 ++++++
 .../aarch64/sme/acle-asm/str_vnum_za_sc.c | 148 ++++++
 .../aarch64/sme/acle-asm/str_za_s.c | 124 +++++
 .../aarch64/sme/acle-asm/str_za_sc.c | 71 +++
 .../aarch64/sme/acle-asm/sumopa_za32.c | 30 ++
 .../aarch64/sme/acle-asm/sumopa_za64.c | 32 ++
 .../aarch64/sme/acle-asm/sumops_za32.c | 30 ++
 .../aarch64/sme/acle-asm/sumops_za64.c | 32 ++
 .../aarch64/sme/acle-asm/test_sme_acle.h | 62 +++
 .../aarch64/sme/acle-asm/undef_za.c | 33 ++
 .../aarch64/sme/acle-asm/usmopa_za32.c | 30 ++
 .../aarch64/sme/acle-asm/usmopa_za64.c | 32 ++
 .../aarch64/sme/acle-asm/usmops_za32.c | 30 ++
 .../aarch64/sme/acle-asm/usmops_za64.c | 32 ++
.../aarch64/sme/acle-asm/write_hor_za128.c | 193 ++++++++ .../aarch64/sme/acle-asm/write_hor_za16.c | 133 ++++++ .../aarch64/sme/acle-asm/write_hor_za32.c | 143 ++++++ .../aarch64/sme/acle-asm/write_hor_za64.c | 133 ++++++ .../aarch64/sme/acle-asm/write_hor_za8.c | 93 ++++ .../aarch64/sme/acle-asm/write_ver_za128.c | 193 ++++++++ .../aarch64/sme/acle-asm/write_ver_za16.c | 133 ++++++ .../aarch64/sme/acle-asm/write_ver_za32.c | 143 ++++++ .../aarch64/sme/acle-asm/write_ver_za64.c | 133 ++++++ .../aarch64/sme/acle-asm/write_ver_za8.c | 93 ++++ .../aarch64/sme/acle-asm/zero_mask_za.c | 130 ++++++ .../gcc.target/aarch64/sme/acle-asm/zero_za.c | 11 + .../aarch64/sve/acle/asm/test_sve_acle.h | 14 +- .../sve/acle/general-c/binary_za_int_m_1.c | 50 ++ .../sve/acle/general-c/binary_za_m_1.c | 49 ++ .../sve/acle/general-c/binary_za_m_2.c | 11 + .../sve/acle/general-c/binary_za_uint_m_1.c | 50 ++ .../aarch64/sve/acle/general-c/func_redef_4.c | 3 +- .../aarch64/sve/acle/general-c/func_redef_5.c | 1 + .../aarch64/sve/acle/general-c/read_za_m_1.c | 48 ++ .../aarch64/sve/acle/general-c/unary_za_m_1.c | 49 ++ .../aarch64/sve/acle/general-c/write_za_m_1.c | 48 ++ gcc/testsuite/lib/target-supports.exp | 3 +- 140 files changed, 13810 insertions(+), 72 deletions(-) create mode 100644 gcc/config/aarch64/aarch64-sve-builtins-sme.cc create mode 100644 gcc/config/aarch64/aarch64-sve-builtins-sme.def create mode 100644 gcc/config/aarch64/aarch64-sve-builtins-sme.h create mode 100644 gcc/config/aarch64/arm_sme.h create mode 100644 gcc/testsuite/g++.target/aarch64/sme/aarch64-sme-acle-asm.exp create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme-acle-asm.exp create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za64.c create mode 
100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_has_sme_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_ns.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za16.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za64.c create 
mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_s.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/test_sme_acle.h create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/undef_za.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_za.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_int_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c 
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_uint_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/read_za_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_za_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/write_za_m_1.c [...tests snipped since they haven't changed since last time...] diff --git a/gcc/config.gcc b/gcc/config.gcc index 748430194f3..6450448f2f0 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -345,11 +345,11 @@ m32c*-*-*) ;; aarch64*-*-*) cpu_type=aarch64 - extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h" + extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h arm_sme.h" c_target_objs="aarch64-c.o" cxx_target_objs="aarch64-c.o" d_target_objs="aarch64-d.o" - extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o" + extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o aarch64-sve-builtins-sme.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o" target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.cc \$(srcdir)/config/aarch64/aarch64-sve-builtins.h \$(srcdir)/config/aarch64/aarch64-sve-builtins.cc" target_has_targetm_common=yes ;; diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc index 9494e560be0..f2fa5df1b82 100644 --- a/gcc/config/aarch64/aarch64-c.cc +++ b/gcc/config/aarch64/aarch64-c.cc @@ -253,6 +253,10 @@ aarch64_update_cpp_builtins (cpp_reader *pfile) "__ARM_FEATURE_LS64", 
pfile); aarch64_def_or_undef (AARCH64_ISA_RCPC, "__ARM_FEATURE_RCPC", pfile); + aarch64_def_or_undef (TARGET_SME, "__ARM_FEATURE_SME", pfile); + aarch64_def_or_undef (TARGET_SME_I16I64, "__ARM_FEATURE_SME_I16I64", pfile); + aarch64_def_or_undef (TARGET_SME_F64F64, "__ARM_FEATURE_SME_F64F64", pfile); + /* Not for ACLE, but required to keep "float.h" correct if we switch target between implementations that do or do not support ARMv8.2-A 16-bit floating-point extensions. */ @@ -337,6 +341,8 @@ aarch64_pragma_aarch64 (cpp_reader *) const char *name = TREE_STRING_POINTER (x); if (strcmp (name, "arm_sve.h") == 0) aarch64_sve::handle_arm_sve_h (); + else if (strcmp (name, "arm_sme.h") == 0) + aarch64_sve::handle_arm_sme_h (); else if (strcmp (name, "arm_neon.h") == 0) handle_arm_neon_h (); else if (strcmp (name, "arm_acle.h") == 0) diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def index fb9ff1b66b2..1480e498bbb 100644 --- a/gcc/config/aarch64/aarch64-option-extensions.def +++ b/gcc/config/aarch64/aarch64-option-extensions.def @@ -153,4 +153,8 @@ AARCH64_OPT_EXTENSION("cssc", CSSC, (), (), (), "cssc") AARCH64_OPT_EXTENSION("sme", SME, (BF16, SVE2), (), (), "sme") +AARCH64_OPT_EXTENSION("sme-i16i64", SME_I16I64, (SME), (), (), "") + +AARCH64_OPT_EXTENSION("sme-f64f64", SME_F64F64, (SME), (), (), "") + #undef AARCH64_OPT_EXTENSION diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index f42981bd507..ce7046b050e 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -805,7 +805,11 @@ bool aarch64_sve_vector_inc_dec_immediate_p (rtx); int aarch64_add_offset_temporaries (rtx); void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, rtx, rtx, rtx); bool aarch64_rdsvl_immediate_p (const_rtx); +rtx aarch64_sme_vq_immediate (machine_mode mode, HOST_WIDE_INT, + aarch64_feature_flags); char *aarch64_output_rdsvl (const_rtx); +bool 
aarch64_addsvl_addspl_immediate_p (const_rtx); +char *aarch64_output_addsvl_addspl (rtx); bool aarch64_mov_operand_p (rtx, machine_mode); rtx aarch64_reverse_mask (machine_mode, unsigned int); bool aarch64_offset_7bit_signed_scaled_p (machine_mode, poly_int64); @@ -854,6 +858,7 @@ bool aarch64_is_mov_xn_imm (unsigned HOST_WIDE_INT); bool aarch64_use_return_insn_p (void); const char *aarch64_output_casesi (rtx *); const char *aarch64_output_load_tp (rtx); +const char *aarch64_output_sme_zero_za (rtx); arm_pcs aarch64_tlsdesc_abi_id (); enum aarch64_symbol_type aarch64_classify_symbol (rtx, HOST_WIDE_INT); @@ -867,7 +872,6 @@ machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned, int aarch64_uxt_size (int, HOST_WIDE_INT); int aarch64_vec_fpconst_pow_of_2 (rtx); rtx aarch64_mask_from_zextract_ops (rtx, rtx); -const char *aarch64_output_move_struct (rtx *operands); rtx aarch64_return_addr_rtx (void); rtx aarch64_return_addr (int, rtx); rtx aarch64_simd_gen_const_vector_dup (machine_mode, HOST_WIDE_INT); @@ -881,6 +885,7 @@ bool aarch64_sve_ldnf1_operand_p (rtx); bool aarch64_sve_ldr_operand_p (rtx); bool aarch64_sve_prefetch_operand_p (rtx, machine_mode); bool aarch64_sve_struct_memory_operand_p (rtx); +bool aarch64_sme_ldr_vnum_offset_p (rtx, rtx); rtx aarch64_simd_vect_par_cnst_half (machine_mode, int, bool); rtx aarch64_gen_stepped_int_parallel (unsigned int, int, int); bool aarch64_stepped_int_parallel_p (rtx, int); @@ -997,6 +1002,7 @@ void handle_arm_neon_h (void); namespace aarch64_sve { void init_builtins (); void handle_arm_sve_h (); + void handle_arm_sme_h (); tree builtin_decl (unsigned, bool); bool builtin_type_p (const_tree); bool builtin_type_p (const_tree, unsigned int *, unsigned int *); diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md index d4973098e66..da0745f6570 100644 --- a/gcc/config/aarch64/aarch64-sme.md +++ b/gcc/config/aarch64/aarch64-sme.md @@ -24,6 +24,19 @@ ;; ---- Test current state ;; ---- 
PSTATE.SM management ;; ---- PSTATE.ZA management +;; +;; == Loads, stores and moves +;; ---- Single-vector loads +;; ---- Single-vector stores +;; ---- Single-vector moves +;; ---- Zeroing +;; +;; == Binary arithmetic +;; ---- Binary arithmetic on ZA tile +;; +;; == Ternary arithmetic +;; ---- [INT] Sum of outer products +;; ---- [FP] Sum of outer products ;; ========================================================================= ;; == State management @@ -456,3 +469,363 @@ (define_insn_and_split "aarch64_commit_lazy_save" DONE; } ) + +;; ========================================================================= +;; == Loads, stores and moves +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- Single-vector loads +;; ------------------------------------------------------------------------- +;; Includes: +;; - LD1 +;; - LDR +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [ + UNSPEC_SME_LDR +]) + +(define_insn "@aarch64_sme_" + [(set (reg:SME_ZA_I ZA_REGNUM) + (unspec:SME_ZA_I + [(reg:SME_ZA_I ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand:SI 1 "register_operand" "Ucj") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SME_ZA_I 3 "aarch64_sve_ldff1_operand" "Utf")] + SME_LD1))] + "TARGET_STREAMING_SME" + "ld1\t{ za%0.[%w1, 0] }, %2/z, %3" +) + +(define_insn "@aarch64_sme__plus" + [(set (reg:SME_ZA_I ZA_REGNUM) + (unspec:SME_ZA_I + [(reg:SME_ZA_I ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (plus:SI (match_operand:SI 1 "register_operand" "Ucj") + (match_operand:SI 2 "const_int_operand")) + (match_operand: 3 "register_operand" "Upl") + (match_operand:SME_ZA_I 4 "aarch64_sve_ldff1_operand" "Utf")] + SME_LD1))] + "TARGET_STREAMING_SME + && UINTVAL (operands[2]) < 128 / " + "ld1\t{ za%0.[%w1, %2] }, 
%3/z, %4" +) + +(define_insn "aarch64_sme_ldr0" + [(set (reg:VNx16QI ZA_REGNUM) + (unspec:VNx16QI + [(reg:VNx16QI ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:SI 0 "register_operand" "Ucj") + (mem:VNx16QI (match_operand 1 "pmode_register_operand" "rk"))] + UNSPEC_SME_LDR))] + "TARGET_SME" + "ldr\tza[%w0, 0], [%1, #0, mul vl]" +) + +(define_insn "@aarch64_sme_ldrn" + [(set (reg:VNx16QI ZA_REGNUM) + (unspec:VNx16QI + [(reg:VNx16QI ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (plus:SI (match_operand:SI 0 "register_operand" "Ucj") + (match_operand:SI 1 "const_int_operand")) + (mem:VNx16QI + (plus:P (match_operand:P 2 "register_operand" "rk") + (match_operand:P 3 "aarch64_mov_operand")))] + UNSPEC_SME_LDR))] + "TARGET_SME + && aarch64_sme_ldr_vnum_offset_p (operands[1], operands[3])" + "ldr\tza[%w0, %1], [%2, #%1, mul vl]" +) + +;; ------------------------------------------------------------------------- +;; ---- Single-vector stores +;; ------------------------------------------------------------------------- +;; Includes: +;; - ST1 +;; - STR +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [ + UNSPEC_SME_STR +]) + +(define_insn "@aarch64_sme_" + [(set (match_operand:SME_ZA_I 0 "aarch64_sve_ldff1_operand" "+Utf") + (unspec:SME_ZA_I + [(reg:SME_ZA_I ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_dup 0) + (match_operand:DI 1 "const_int_operand") + (match_operand:SI 2 "register_operand" "Ucj") + (match_operand: 3 "register_operand" "Upl")] + SME_ST1))] + "TARGET_STREAMING_SME" + "st1\t{ za%1.[%w2, 0] }, %3, %0" +) + +(define_insn "@aarch64_sme__plus" + [(set (match_operand:SME_ZA_I 0 "aarch64_sve_ldff1_operand" "+Utf") + (unspec:SME_ZA_I + [(reg:SME_ZA_I ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_dup 0) + (match_operand:DI 1 "const_int_operand") + (plus:SI (match_operand:SI 2 "register_operand" "Ucj") + (match_operand:SI 3 "const_int_operand")) + (match_operand: 4 "register_operand" "Upl")] + SME_ST1))] 
+ "TARGET_STREAMING_SME + && UINTVAL (operands[3]) < 128 / " + "st1\t{ za%1.[%w2, %3] }, %4, %0" +) + +(define_insn "aarch64_sme_str0" + [(set (mem:VNx16QI (match_operand 1 "pmode_register_operand" "rk")) + (unspec:VNx16QI + [(reg:VNx16QI ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (mem:VNx16QI (match_dup 1)) + (match_operand:SI 0 "register_operand" "Ucj")] + UNSPEC_SME_STR))] + "TARGET_SME" + "str\tza[%w0, 0], [%1, #0, mul vl]" +) + +(define_insn "@aarch64_sme_strn" + [(set (mem:VNx16QI + (plus:P (match_operand:P 2 "register_operand" "rk") + (match_operand:P 3 "aarch64_mov_operand"))) + (unspec:VNx16QI + [(reg:VNx16QI ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (mem:VNx16QI (plus:P (match_dup 2) (match_dup 3))) + (plus:SI (match_operand:SI 0 "register_operand" "Ucj") + (match_operand:SI 1 "const_int_operand"))] + UNSPEC_SME_STR))] + "TARGET_SME + && aarch64_sme_ldr_vnum_offset_p (operands[1], operands[3])" + "str\tza[%w0, %1], [%2, #%1, mul vl]" +) + +;; ------------------------------------------------------------------------- +;; ---- Single-vector moves +;; ------------------------------------------------------------------------- +;; Includes: +;; - MOVA +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_sme_" + [(set (match_operand:SVE_FULL 0 "register_operand" "=w") + (unspec:SVE_FULL + [(reg: ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:SVE_FULL 1 "register_operand" "0") + (match_operand: 2 "register_operand" "Upl") + (match_operand:DI 3 "const_int_operand") + (match_operand:SI 4 "register_operand" "Ucj")] + SME_READ))] + "TARGET_STREAMING_SME" + "mova\t%0., %2/m, za%3.[%w4, 0]" +) + +(define_insn "*aarch64_sme__plus" + [(set (match_operand:SVE_FULL 0 "register_operand" "=w") + (unspec:SVE_FULL + [(reg: ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:SVE_FULL 1 "register_operand" "0") + (match_operand: 2 "register_operand" "Upl") + (match_operand:DI 3 "const_int_operand") + (plus:SI 
(match_operand:SI 4 "register_operand" "Ucj") + (match_operand:SI 5 "const_int_operand"))] + SME_READ))] + "TARGET_STREAMING_SME + && UINTVAL (operands[5]) < 128 / " + "mova\t%0., %2/m, za%3.[%w4, %5]" +) + +(define_insn "@aarch64_sme_" + [(set (match_operand:SVE_FULL 0 "register_operand" "=w") + (unspec:SVE_FULL + [(reg:VNx1TI_ONLY ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:SVE_FULL 1 "register_operand" "0") + (match_operand:VNx2BI 2 "register_operand" "Upl") + (match_operand:DI 3 "const_int_operand") + (match_operand:SI 4 "register_operand" "Ucj")] + SME_READ))] + "TARGET_STREAMING_SME" + "mova\t%0.q, %2/m, za%3.q[%w4, 0]" +) + +(define_insn "@aarch64_sme_" + [(set (reg: ZA_REGNUM) + (unspec: + [(reg:SVE_FULL ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand:SI 1 "register_operand" "Ucj") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SVE_FULL 3 "register_operand" "w")] + SME_WRITE))] + "TARGET_STREAMING_SME" + "mova\tza%0.[%w1, 0], %2/m, %3." +) + +(define_insn "*aarch64_sme__plus" + [(set (reg: ZA_REGNUM) + (unspec: + [(reg:SVE_FULL ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (plus:SI (match_operand:SI 1 "register_operand" "Ucj") + (match_operand:SI 2 "const_int_operand")) + (match_operand: 3 "register_operand" "Upl") + (match_operand:SVE_FULL 4 "register_operand" "w")] + SME_WRITE))] + "TARGET_STREAMING_SME + && UINTVAL (operands[2]) < 128 / " + "mova\tza%0.[%w1, %2], %3/m, %4." 
+) + +(define_insn "@aarch64_sme_" + [(set (reg:VNx1TI_ONLY ZA_REGNUM) + (unspec:VNx1TI_ONLY + [(reg:VNx1TI_ONLY ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand:SI 1 "register_operand" "Ucj") + (match_operand:VNx2BI 2 "register_operand" "Upl") + (match_operand:SVE_FULL 3 "register_operand" "w")] + SME_WRITE))] + "TARGET_STREAMING_SME" + "mova\tza%0.q[%w1, 0], %2/m, %3.q" +) + +;; ------------------------------------------------------------------------- +;; ---- Zeroing +;; ------------------------------------------------------------------------- +;; Includes: +;; - ZERO +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [UNSPEC_SME_ZERO]) + +(define_insn "aarch64_sme_zero_za" + [(set (reg:VNx16QI ZA_REGNUM) + (unspec:VNx16QI [(reg:VNx16QI ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand")] + UNSPEC_SME_ZERO))] + "TARGET_SME" + { + return aarch64_output_sme_zero_za (operands[0]); + } +) + +;; ========================================================================= +;; == Binary arithmetic +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- Binary arithmetic on ZA tile +;; ------------------------------------------------------------------------- +;; Includes: +;; - ADDHA +;; - ADDVA +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_sme_" + [(set (reg:SME_ZA_SDI ZA_REGNUM) + (unspec:SME_ZA_SDI + [(reg:SME_ZA_SDI ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand: 1 "register_operand" "Upl") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SME_ZA_SDI 3 "register_operand" "w")] + SME_BINARY_SDI))] + "TARGET_STREAMING_SME" + "\tza%0., %1/m, %2/m, %3." 
+) + +;; ========================================================================= +;; == Ternary arithmetic +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- [INT] Sum of outer products +;; ------------------------------------------------------------------------- +;; Includes: +;; - SMOPA +;; - SMOPS +;; - SUMOPA +;; - SUMOPS +;; - UMOPA +;; - UMOPS +;; - USMOPA +;; - USMOPS +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_sme_" + [(set (reg:VNx4SI_ONLY ZA_REGNUM) + (unspec:VNx4SI_ONLY + [(reg:VNx4SI_ONLY ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand: 1 "register_operand" "Upl") + (match_operand: 2 "register_operand" "Upl") + (match_operand:VNx16QI_ONLY 3 "register_operand" "w") + (match_operand:VNx16QI_ONLY 4 "register_operand" "w")] + SME_INT_MOP))] + "TARGET_STREAMING_SME" + "\tza%0.s, %1/m, %2/m, %3.b, %4.b" +) + +(define_insn "@aarch64_sme_" + [(set (reg:VNx2DI_ONLY ZA_REGNUM) + (unspec:VNx2DI_ONLY + [(reg:VNx2DI_ONLY ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand: 1 "register_operand" "Upl") + (match_operand: 2 "register_operand" "Upl") + (match_operand:VNx8HI_ONLY 3 "register_operand" "w") + (match_operand:VNx8HI_ONLY 4 "register_operand" "w")] + SME_INT_MOP))] + "TARGET_STREAMING_SME && TARGET_SME_I16I64" + "\tza%0.d, %1/m, %2/m, %3.h, %4.h" +) + +;; ------------------------------------------------------------------------- +;; ---- [FP] Sum of outer products +;; ------------------------------------------------------------------------- +;; Includes: +;; - BFMOPA +;; - BFMOPS +;; - FMOPA +;; - FMOPS +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_sme_" + [(set (reg:SME_ZA_SDF_I ZA_REGNUM) + (unspec:SME_ZA_SDF_I + [(reg:SME_ZA_SDF_I 
ZA_REGNUM) + (reg:DI SME_STATE_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand: 1 "register_operand" "Upl") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SME_MOP_HSDF 3 "register_operand" "w") + (match_operand:SME_MOP_HSDF 4 "register_operand" "w")] + SME_FP_MOP))] + "TARGET_STREAMING_SME + && ( == 32) == ( <= 32)" + "\tza%0., %1/m, %2/m, %3., %4." +) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-functions.h b/gcc/config/aarch64/aarch64-sve-builtins-functions.h index be2561620f4..5bd200d9c0a 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-functions.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-functions.h @@ -39,6 +39,27 @@ public: } }; +/* Wrap T, which is derived from function_base, and indicate that it + additionally has the call properties in PROPERTIES. */ +template +class add_call_properties : public T +{ +public: + using T::T; + + unsigned int + call_properties (const function_instance &fi) const override + { + return T::call_properties (fi) | PROPERTIES; + } +}; + +template +using read_write_za = add_call_properties; + +template +using write_za = add_call_properties; + /* A function_base that sometimes or always operates on tuples of vectors. */ class multi_vector_function : public function_base @@ -353,6 +374,49 @@ typedef unspec_based_function_exact_insn typedef unspec_based_function_exact_insn unspec_based_sub_lane_function; +/* General SME unspec-based functions, parameterized on the vector mode. 
*/ +class sme_1mode_function : public read_write_za +{ +public: + using parent = read_write_za; + + CONSTEXPR sme_1mode_function (int unspec_for_sint, int unspec_for_uint, + int unspec_for_fp) + : parent (unspec_for_sint, unspec_for_uint, unspec_for_fp, 1) + {} + + rtx + expand (function_expander &e) const override + { + auto icode = code_for_aarch64_sme (unspec_for (e), e.tuple_mode (1)); + return e.use_exact_insn (icode); + } +}; + +/* General SME unspec-based functions, parameterized on both the ZA mode + and the vector mode. */ +template +class sme_2mode_function_t : public read_write_za +{ +public: + using parent = read_write_za; + + CONSTEXPR sme_2mode_function_t (int unspec_for_sint, int unspec_for_uint, + int unspec_for_fp) + : parent (unspec_for_sint, unspec_for_uint, unspec_for_fp, 1) + {} + + rtx + expand (function_expander &e) const override + { + insn_code icode = CODE (unspec_for (e), e.vector_mode (0), + e.tuple_mode (1)); + return e.use_exact_insn (icode); + } +}; + +using sme_2mode_function = sme_2mode_function_t; + /* A function that acts like unspec_based_function_exact_insn when operating on integers, but that expands to an (fma ...)-style aarch64_sve* operation when applied to floats. */ diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 68708712001..36c3c5005c4 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -59,7 +59,10 @@ static void apply_predication (const function_instance &instance, tree return_type, vec &argument_types) { - if (instance.pred != PRED_none) + /* There are currently no SME ZA instructions that have both merging and + unpredicated forms, so for simplicity, the predicates are always included + in the original format string. 
*/ + if (instance.pred != PRED_none && instance.pred != PRED_za_m) { argument_types.quick_insert (0, get_svbool_t ()); /* For unary merge operations, the first argument is a vector with @@ -589,6 +592,33 @@ struct binary_imm_long_base : public overloaded_base<0> } }; +/* Base class for binary_za_m and similar shapes. */ +template +struct binary_za_m_base : public overloaded_base<1> +{ + tree + resolve (function_resolver &r) const override + { + type_suffix_index type; + if (!r.check_num_arguments (5) + || !r.require_integer_immediate (0) + || !r.require_vector_type (1, VECTOR_TYPE_svbool_t) + || !r.require_vector_type (2, VECTOR_TYPE_svbool_t) + || (type = r.infer_vector_type (3)) == NUM_TYPE_SUFFIXES + || !r.require_derived_vector_type (4, 3, type, TCLASS, BITS)) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, r.type_suffix_ids[0], type); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; + /* Base class for inc_dec and inc_dec_pat. */ struct inc_dec_base : public overloaded_base<0> { @@ -1576,6 +1606,68 @@ struct binary_wide_opt_n_def : public overloaded_base<0> }; SHAPE (binary_wide_opt_n) +/* void svfoo_t0[_t1]_g(uint64_t, svbool_t, svbool_t, svx_t, + svx_t) + + where the first argument is a ZA tile. */ +struct binary_za_int_m_def : public binary_za_m_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "_,su64,vp,vp,t1,ts1", group, MODE_none); + } +}; +SHAPE (binary_za_int_m) + +/* void svfoo_t0[_t1]_g(uint64_t, svbool_t, svbool_t, svx_t, + svx_t) + + where the first argument is a ZA tile. 
*/ +struct binary_za_m_def : public binary_za_m_base<> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + /* Allow the overloaded form to be specified separately, with just + a single suffix. This is necessary for the 64-bit SME MOP intrinsics, + which have some forms dependent on FEAT_SME_I16I64 and some forms + dependent on FEAT_SME_F64F64. The resolver needs to be defined + for base SME. */ + if (group.types[0][1] != NUM_TYPE_SUFFIXES) + build_all (b, "_,su64,vp,vp,t1,t1", group, MODE_none); + } +}; +SHAPE (binary_za_m) + +/* void svfoo_t0[_t1]_g(uint64_t, svbool_t, svbool_t, svx_t, + svx_t) + + where the first argument is a ZA tile. */ +struct binary_za_uint_m_def : public binary_za_m_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "_,su64,vp,vp,t1,tu1", group, MODE_none); + } +}; +SHAPE (binary_za_uint_m) + +/* bool svfoo(). */ +struct bool_inherent_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "sp", group, MODE_none); + } +}; +SHAPE (bool_inherent) + /* sv_t svfoo[_t0](sv_t, sv_t) _t svfoo[_n_t0](_t, sv_t). */ struct clast_def : public overloaded_base<0> @@ -2055,6 +2147,51 @@ struct inherent_b_def : public overloaded_base<0> }; SHAPE (inherent_b) +/* void svfoo_t0(). */ +struct inherent_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_", group, MODE_none); + } +}; +SHAPE (inherent_za) + +/* void svfoo_t0(uint64_t) + + where the argument is an integer constant that specifies an 8-bit mask.
*/ +struct inherent_mask_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su64", group, MODE_none); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, 255); + } +}; +SHAPE (inherent_mask_za) + +/* void svfoo_t0(uint32_t, const void *) + void svfoo_vnum_t0(uint32_t, const void *, int64_t) + + where the first argument is a variable ZA slice. */ +struct ldr_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su32,al", group, MODE_none); + build_all (b, "_,su32,al,ss64", group, MODE_vnum); + } +}; +SHAPE (ldr_za) + /* sv[xN]_t svfoo[_t0](const _t *) sv[xN]_t svfoo_vnum[_t0](const _t *, int64_t). */ struct load_def : public load_contiguous_base @@ -2265,6 +2402,27 @@ struct load_replicate_def : public load_contiguous_base }; SHAPE (load_replicate) +/* void svfoo_t0(uint64_t, uint32_t, svbool_t, const void *) + void svfoo_vnum_t0(uint64_t, uint32_t, svbool_t, const void *, int64_t) + + where the first two fields form a (ZA tile, slice) pair. */ +struct load_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su64,su32,vp,al", group, MODE_none); + build_all (b, "_,su64,su32,vp,al,ss64", group, MODE_vnum); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (load_za) + /* svbool_t svfoo(enum svpattern). */ struct pattern_pred_def : public nonoverloaded_base { @@ -2359,6 +2517,48 @@ struct rdffr_def : public nonoverloaded_base }; SHAPE (rdffr) +/* sv_t svfoo_t0[_t1](uint64_t, uint32_t) + + where the first two fields form a (ZA tile, slice) pair. 
*/ +struct read_za_m_def : public overloaded_base<1> +{ + bool + has_merge_argument_p (const function_instance &, unsigned int) const override + { + return true; + } + + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "t1,su64,su32", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + gcc_assert (r.pred == PRED_m); + type_suffix_index type; + if (!r.check_num_arguments (4) + || (type = r.infer_vector_type (0)) == NUM_TYPE_SUFFIXES + || !r.require_vector_type (1, VECTOR_TYPE_svbool_t) + || !r.require_integer_immediate (2) + || !r.require_scalar_type (3, "uint32_t")) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, r.type_suffix_ids[0], type); + } + + bool + check (function_checker &c) const override + { + gcc_assert (c.pred == PRED_m); + return c.require_immediate_range (1, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (read_za_m) + /* _t svfoo[_t0](sv_t). */ struct reduction_def : public overloaded_base<0> { @@ -2727,6 +2927,42 @@ struct store_scatter_offset_restricted_def : public store_scatter_base }; SHAPE (store_scatter_offset_restricted) +/* void svfoo_t0(uint64_t, uint32_t, svbool_t, void *) + void svfoo_vnum_t0(uint64_t, uint32_t, svbool_t, void *, int64_t) + + where the first two fields form a (ZA tile, slice) pair. */ +struct store_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su64,su32,vp,as", group, MODE_none); + build_all (b, "_,su64,su32,vp,as,ss64", group, MODE_vnum); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (store_za) + +/* void svfoo_t0(uint32_t, void *) + void svfoo_vnum_t0(uint32_t, void *, int64_t) + + where the first argument is a variable ZA slice. 
*/ +struct str_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su32,as", group, MODE_none); + build_all (b, "_,su32,as,ss64", group, MODE_vnum); + } +}; +SHAPE (str_za) + /* sv_t svfoo[_t0](svxN_t, sv_t). */ struct tbl_tuple_def : public overloaded_base<0> { @@ -3487,4 +3723,72 @@ struct unary_widen_def : public overloaded_base<0> }; SHAPE (unary_widen) +/* void svfoo_t0[_t1](uint64_t, svbool_t, svbool_t, sv_t) + + where the first argument is a ZA tile. */ +struct unary_za_m_def : public overloaded_base<1> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "_,su64,vp,vp,t1", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + type_suffix_index type; + if (!r.check_num_arguments (4) + || !r.require_integer_immediate (0) + || !r.require_vector_type (1, VECTOR_TYPE_svbool_t) + || !r.require_vector_type (2, VECTOR_TYPE_svbool_t) + || (type = r.infer_vector_type (3)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, r.type_suffix_ids[0], type); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (unary_za_m) + +/* void svfoo_t0[_t1](uint64_t, uint32_t, svbool_t, sv_t) + + where the first two fields form a (ZA tile, slice) pair. 
*/ +struct write_za_m_def : public overloaded_base<1> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "_,su64,su32,vp,t1", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + type_suffix_index type; + if (!r.check_num_arguments (4) + || !r.require_integer_immediate (0) + || !r.require_scalar_type (1, "uint32_t") + || !r.require_vector_type (2, VECTOR_TYPE_svbool_t) + || (type = r.infer_vector_type (3)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, r.type_suffix_ids[0], type); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (write_za_m) + } diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h index 38d494761ae..d64ddca7358 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h @@ -93,6 +93,10 @@ namespace aarch64_sve extern const function_shape *const binary_uint64_opt_n; extern const function_shape *const binary_wide; extern const function_shape *const binary_wide_opt_n; + extern const function_shape *const binary_za_int_m; + extern const function_shape *const binary_za_m; + extern const function_shape *const binary_za_uint_m; + extern const function_shape *const bool_inherent; extern const function_shape *const clast; extern const function_shape *const compare; extern const function_shape *const compare_opt_n; @@ -114,6 +118,9 @@ namespace aarch64_sve extern const function_shape *const inc_dec_pred_scalar; extern const function_shape *const inherent; extern const function_shape *const inherent_b; + extern const function_shape *const inherent_za; + extern const function_shape *const inherent_mask_za; + extern const function_shape *const ldr_za; extern const 
function_shape *const load; extern const function_shape *const load_ext; extern const function_shape *const load_ext_gather_index; @@ -124,6 +131,7 @@ namespace aarch64_sve extern const function_shape *const load_gather_sv_restricted; extern const function_shape *const load_gather_vs; extern const function_shape *const load_replicate; + extern const function_shape *const load_za; extern const function_shape *const mmla; extern const function_shape *const pattern_pred; extern const function_shape *const prefetch; @@ -131,6 +139,7 @@ namespace aarch64_sve extern const function_shape *const prefetch_gather_offset; extern const function_shape *const ptest; extern const function_shape *const rdffr; + extern const function_shape *const read_za_m; extern const function_shape *const reduction; extern const function_shape *const reduction_wide; extern const function_shape *const reinterpret; @@ -148,6 +157,8 @@ namespace aarch64_sve extern const function_shape *const store_scatter_index_restricted; extern const function_shape *const store_scatter_offset; extern const function_shape *const store_scatter_offset_restricted; + extern const function_shape *const store_za; + extern const function_shape *const str_za; extern const function_shape *const tbl_tuple; extern const function_shape *const ternary_bfloat; extern const function_shape *const ternary_bfloat_lane; @@ -186,6 +197,8 @@ namespace aarch64_sve extern const function_shape *const unary_to_uint; extern const function_shape *const unary_uint; extern const function_shape *const unary_widen; + extern const function_shape *const unary_za_m; + extern const function_shape *const write_za_m; } } diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sme.cc b/gcc/config/aarch64/aarch64-sve-builtins-sme.cc new file mode 100644 index 00000000000..e1df6ce0d30 --- /dev/null +++ b/gcc/config/aarch64/aarch64-sve-builtins-sme.cc @@ -0,0 +1,412 @@ +/* ACLE support for AArch64 SME. + Copyright (C) 2023 Free Software Foundation, Inc. 
+ + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "tree.h" +#include "rtl.h" +#include "tm_p.h" +#include "memmodel.h" +#include "insn-codes.h" +#include "optabs.h" +#include "recog.h" +#include "expr.h" +#include "basic-block.h" +#include "function.h" +#include "fold-const.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimplify.h" +#include "explow.h" +#include "emit-rtl.h" +#include "aarch64-sve-builtins.h" +#include "aarch64-sve-builtins-shapes.h" +#include "aarch64-sve-builtins-base.h" +#include "aarch64-sve-builtins-sme.h" +#include "aarch64-sve-builtins-functions.h" + +using namespace aarch64_sve; + +namespace { + +class load_store_za_base : public function_base +{ +public: + tree + memory_scalar_type (const function_instance &) const override + { + return void_type_node; + } +}; + +class read_write_za_base : public function_base +{ +public: + constexpr read_write_za_base (int unspec) : m_unspec (unspec) {} + + rtx + expand (function_expander &e) const override + { + auto za_mode = e.vector_mode (0); + auto z_mode = e.vector_mode (1); + auto icode = (za_mode == VNx1TImode + ? 
code_for_aarch64_sme (m_unspec, za_mode, z_mode) + : code_for_aarch64_sme (m_unspec, z_mode, z_mode)); + return e.use_exact_insn (icode); + } + + int m_unspec; +}; + +using load_za_base = add_call_properties; + +using store_za_base = add_call_properties; + +/* E is a load or store intrinsic that accesses a ZA slice of mode MEM_MODE. + The intrinsic has a vnum parameter at index ARGNO. Return true if the + vnum argument is a constant that is a valid ZA offset for the underlying + instruction. */ + +static bool +has_in_range_vnum_arg (function_expander &e, machine_mode mem_mode, + unsigned int argno) +{ + return (e.mode_suffix_id == MODE_vnum + && CONST_INT_P (e.args[argno]) + && UINTVAL (e.args[argno]) < 16 / GET_MODE_UNIT_SIZE (mem_mode)); +} + +/* E is a ZA load or store intrinsic that uses instruction ICODE. Add a + 32-bit operand that gives the total ZA slice. (The instruction hard-codes + the constant offset to 0, so there is no operand for that.) + + Argument ARGNO is the intrinsic's slice argument. If the intrinsic is + a _vnum intrinsic, argument VNUM_ARGNO is the intrinsic's vnum operand, + which must be added to the slice argument. */ + +static void +add_load_store_slice_operand (function_expander &e, insn_code icode, + unsigned int argno, unsigned int vnum_argno) +{ + rtx base = e.args[argno]; + if (e.mode_suffix_id == MODE_vnum) + { + rtx vnum = lowpart_subreg (SImode, e.args[vnum_argno], DImode); + base = simplify_gen_binary (PLUS, SImode, base, vnum); + } + e.add_input_operand (icode, base); +} + +/* Add a memory operand for ZA LD1 or ST1 intrinsic E. BASE_ARGNO is + the index of the base argument. 
*/ + +static void +add_load_store_operand (function_expander &e, unsigned int base_argno) +{ + auto mode = e.vector_mode (0); + rtx base = e.get_contiguous_base (mode, base_argno, base_argno + 1, + AARCH64_FL_SM_ON); + auto mem = gen_rtx_MEM (mode, force_reg (Pmode, base)); + set_mem_align (mem, BITS_PER_UNIT); + e.add_fixed_operand (mem); +} + +/* Expand ZA LDR or STR intrinsic E. There are two underlying instructions: + + - BASE_CODE has a zero ZA slice offset + - VNUM_CODE has a constant operand for the ZA slice offset. */ + +static rtx +expand_ldr_str_za (function_expander &e, insn_code base_code, + insn_code vnum_code) +{ + if (has_in_range_vnum_arg (e, VNx16QImode, 2)) + { + rtx mem_offset = aarch64_sme_vq_immediate (Pmode, + UINTVAL (e.args[2]) * 16, + AARCH64_ISA_MODE); + e.add_input_operand (vnum_code, e.args[0]); + e.add_input_operand (vnum_code, e.args[2]); + e.add_input_operand (vnum_code, e.args[1]); + e.add_input_operand (vnum_code, mem_offset); + return e.generate_insn (vnum_code); + } + else + { + rtx base = e.get_contiguous_base (VNx16QImode, 1, 2, AARCH64_FL_SM_ON); + add_load_store_slice_operand (e, base_code, 0, 2); + e.add_input_operand (base_code, base); + return e.generate_insn (base_code); + } +} + +/* Expand ZA LD1 or ST1 intrinsic E. UNSPEC is the load or store unspec. + IS_LOAD is true if E is a load, false if it is a store. */ + +static rtx +expand_ld1_st1 (function_expander &e, int unspec, bool is_load) +{ + bool is_vnum = has_in_range_vnum_arg (e, e.vector_mode (0), 4); + auto icode = (is_vnum + ? 
code_for_aarch64_sme_plus (unspec, e.vector_mode (0)) + : code_for_aarch64_sme (unspec, e.vector_mode (0))); + if (!is_load) + add_load_store_operand (e, 3); + e.add_input_operand (icode, e.args[0]); + if (is_vnum) + { + e.add_input_operand (icode, e.args[1]); + e.add_input_operand (icode, e.args[4]); + } + else + add_load_store_slice_operand (e, icode, 1, 4); + e.add_input_operand (icode, e.args[2]); + if (is_load) + add_load_store_operand (e, 3); + return e.generate_insn (icode); +} + +class arm_has_sme_impl : public function_base +{ + gimple * + fold (gimple_folder &f) const override + { + if (TARGET_SME) + return f.fold_to_cstu (1); + return nullptr; + } + + rtx + expand (function_expander &e) const override + { + if (TARGET_SME) + return const1_rtx; + emit_insn (gen_aarch64_get_sme_state ()); + return expand_simple_binop (DImode, LSHIFTRT, + gen_rtx_REG (DImode, R0_REGNUM), + gen_int_mode (63, QImode), + e.possible_target, true, OPTAB_LIB_WIDEN); + } +}; + +class arm_in_streaming_mode_impl : public function_base +{ + gimple * + fold (gimple_folder &f) const override + { + if (TARGET_STREAMING) + return f.fold_to_cstu (1); + if (TARGET_NON_STREAMING) + return f.fold_to_cstu (0); + return nullptr; + } + + rtx + expand (function_expander &e) const override + { + if (TARGET_STREAMING) + return const1_rtx; + + if (TARGET_NON_STREAMING) + return const0_rtx; + + rtx reg; + if (TARGET_SME) + { + reg = gen_reg_rtx (DImode); + emit_insn (gen_aarch64_read_svcr (reg)); + } + else + { + emit_insn (gen_aarch64_get_sme_state ()); + reg = gen_rtx_REG (DImode, R0_REGNUM); + } + return expand_simple_binop (DImode, AND, reg, gen_int_mode (1, DImode), + e.possible_target, true, OPTAB_LIB_WIDEN); + } +}; + +/* Implements svcnts[bhwd]. 
*/ +class svcnts_bhwd_impl : public function_base +{ +public: + constexpr svcnts_bhwd_impl (machine_mode ref_mode) : m_ref_mode (ref_mode) {} + + unsigned int + get_shift () const + { + return exact_log2 (GET_MODE_UNIT_SIZE (m_ref_mode)); + } + + gimple * + fold (gimple_folder &f) const override + { + if (TARGET_STREAMING) + return f.fold_to_cstu (GET_MODE_NUNITS (m_ref_mode)); + return nullptr; + } + + rtx + expand (function_expander &e) const override + { + rtx cntsb = aarch64_sme_vq_immediate (DImode, 16, AARCH64_ISA_MODE); + auto shift = get_shift (); + if (!shift) + return cntsb; + + return expand_simple_binop (DImode, LSHIFTRT, cntsb, + gen_int_mode (shift, QImode), + e.possible_target, true, OPTAB_LIB_WIDEN); + } + + /* The mode of the vector associated with the [bhwd] suffix. */ + machine_mode m_ref_mode; +}; + +class svld1_za_impl : public load_za_base +{ +public: + constexpr svld1_za_impl (int unspec) : m_unspec (unspec) {} + + rtx + expand (function_expander &e) const override + { + return expand_ld1_st1 (e, m_unspec, true); + } + + int m_unspec; +}; + +class svldr_za_impl : public load_za_base +{ +public: + rtx + expand (function_expander &e) const override + { + return expand_ldr_str_za (e, CODE_FOR_aarch64_sme_ldr0, + code_for_aarch64_sme_ldrn (Pmode)); + } +}; + +using svread_za_tile_impl = add_call_properties; + +class svst1_za_impl : public store_za_base +{ +public: + constexpr svst1_za_impl (int unspec) : m_unspec (unspec) {} + + rtx + expand (function_expander &e) const override + { + return expand_ld1_st1 (e, m_unspec, false); + } + + int m_unspec; +}; + +class svstr_za_impl : public store_za_base +{ +public: + rtx + expand (function_expander &e) const override + { + return expand_ldr_str_za (e, CODE_FOR_aarch64_sme_str0, + code_for_aarch64_sme_strn (Pmode)); + } +}; + +class svundef_za_impl : public write_za +{ +public: + rtx + expand (function_expander &) const override + { + rtx target = gen_rtx_REG (VNx16QImode, ZA_REGNUM); + emit_clobber 
(copy_rtx (target)); + return const0_rtx; + } +}; + +using svwrite_za_tile_impl = add_call_properties; + +class svzero_mask_za_impl : public write_za +{ +public: + rtx + expand (function_expander &e) const override + { + return e.use_exact_insn (CODE_FOR_aarch64_sme_zero_za); + } +}; + +class svzero_za_impl : public write_za +{ +public: + rtx + expand (function_expander &) const override + { + emit_insn (gen_aarch64_sme_zero_za (gen_int_mode (0xff, SImode))); + return const0_rtx; + } +}; + +} /* end anonymous namespace */ + +namespace aarch64_sve { + +FUNCTION (arm_has_sme, arm_has_sme_impl, ) +FUNCTION (arm_in_streaming_mode, arm_in_streaming_mode_impl, ) +FUNCTION (svaddha_za, sme_1mode_function, (UNSPEC_SME_ADDHA, + UNSPEC_SME_ADDHA, -1)) +FUNCTION (svaddva_za, sme_1mode_function, (UNSPEC_SME_ADDVA, + UNSPEC_SME_ADDVA, -1)) +FUNCTION (svcntsb, svcnts_bhwd_impl, (VNx16QImode)) +FUNCTION (svcntsd, svcnts_bhwd_impl, (VNx2DImode)) +FUNCTION (svcntsh, svcnts_bhwd_impl, (VNx8HImode)) +FUNCTION (svcntsw, svcnts_bhwd_impl, (VNx4SImode)) +FUNCTION (svld1_hor_za, svld1_za_impl, (UNSPEC_SME_LD1_HOR)) +FUNCTION (svld1_ver_za, svld1_za_impl, (UNSPEC_SME_LD1_VER)) +FUNCTION (svldr_za, svldr_za_impl, ) +FUNCTION (svmopa_za, sme_2mode_function, (UNSPEC_SME_SMOPA, UNSPEC_SME_UMOPA, + UNSPEC_SME_FMOPA)) +FUNCTION (svmops_za, sme_2mode_function, (UNSPEC_SME_SMOPS, UNSPEC_SME_UMOPS, + UNSPEC_SME_FMOPS)) +FUNCTION (svread_hor_za, svread_za_tile_impl, (UNSPEC_SME_READ_HOR)) +FUNCTION (svread_ver_za, svread_za_tile_impl, (UNSPEC_SME_READ_VER)) +FUNCTION (svst1_hor_za, svst1_za_impl, (UNSPEC_SME_ST1_HOR)) +FUNCTION (svst1_ver_za, svst1_za_impl, (UNSPEC_SME_ST1_VER)) +FUNCTION (svstr_za, svstr_za_impl, ) +FUNCTION (svsumopa_za, sme_2mode_function, (UNSPEC_SME_SUMOPA, -1, -1)) +FUNCTION (svsumops_za, sme_2mode_function, (UNSPEC_SME_SUMOPS, -1, -1)) +FUNCTION (svundef_za, svundef_za_impl, ) +FUNCTION (svusmopa_za, sme_2mode_function, (-1, UNSPEC_SME_USMOPA, -1)) +FUNCTION (svusmops_za, 
sme_2mode_function, (-1, UNSPEC_SME_USMOPS, -1)) +FUNCTION (svwrite_hor_za, svwrite_za_tile_impl, (UNSPEC_SME_WRITE_HOR)) +FUNCTION (svwrite_ver_za, svwrite_za_tile_impl, (UNSPEC_SME_WRITE_VER)) +FUNCTION (svzero_mask_za, svzero_mask_za_impl, ) +FUNCTION (svzero_za, svzero_za_impl, ) + +} /* end namespace aarch64_sve */ diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sme.def b/gcc/config/aarch64/aarch64-sve-builtins-sme.def new file mode 100644 index 00000000000..5bdcc93f40f --- /dev/null +++ b/gcc/config/aarch64/aarch64-sve-builtins-sme.def @@ -0,0 +1,76 @@ +/* ACLE support for AArch64 SME. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . 
*/ + +#define REQUIRED_EXTENSIONS 0 +DEF_SVE_FUNCTION (arm_has_sme, bool_inherent, none, none) +DEF_SVE_FUNCTION (arm_in_streaming_mode, bool_inherent, none, none) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS AARCH64_FL_SME +DEF_SVE_FUNCTION (svcntsb, count_inherent, none, none) +DEF_SVE_FUNCTION (svcntsd, count_inherent, none, none) +DEF_SVE_FUNCTION (svcntsh, count_inherent, none, none) +DEF_SVE_FUNCTION (svcntsw, count_inherent, none, none) +DEF_SME_ZA_FUNCTION (svldr, ldr_za, za, none) +DEF_SME_ZA_FUNCTION (svstr, str_za, za, none) +DEF_SME_ZA_FUNCTION (svundef, inherent_za, za, none) +DEF_SME_ZA_FUNCTION (svzero, inherent_za, za, none) +DEF_SME_ZA_FUNCTION (svzero_mask, inherent_mask_za, za, none) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS AARCH64_FL_SME | AARCH64_FL_SM_ON +DEF_SME_ZA_FUNCTION (svaddha, unary_za_m, za_s_integer, za_m) +DEF_SME_ZA_FUNCTION (svaddva, unary_za_m, za_s_integer, za_m) +DEF_SME_ZA_FUNCTION (svld1_hor, load_za, all_za, none) +DEF_SME_ZA_FUNCTION (svld1_ver, load_za, all_za, none) +DEF_SME_ZA_FUNCTION (svmopa, binary_za_m, mop_base, za_m) +DEF_SME_ZA_FUNCTION (svmopa, binary_za_m, d_za, za_m) +DEF_SME_ZA_FUNCTION (svmops, binary_za_m, mop_base, za_m) +DEF_SME_ZA_FUNCTION (svmops, binary_za_m, d_za, za_m) +DEF_SME_ZA_FUNCTION (svread_hor, read_za_m, za_all_data, m) +DEF_SME_ZA_FUNCTION (svread_ver, read_za_m, za_all_data, m) +DEF_SME_ZA_FUNCTION (svst1_hor, store_za, all_za, none) +DEF_SME_ZA_FUNCTION (svst1_ver, store_za, all_za, none) +DEF_SME_ZA_FUNCTION (svsumopa, binary_za_uint_m, mop_base_signed, za_m) +DEF_SME_ZA_FUNCTION (svsumops, binary_za_uint_m, mop_base_signed, za_m) +DEF_SME_ZA_FUNCTION (svusmopa, binary_za_int_m, mop_base_unsigned, za_m) +DEF_SME_ZA_FUNCTION (svusmops, binary_za_int_m, mop_base_unsigned, za_m) +DEF_SME_ZA_FUNCTION (svwrite_hor, write_za_m, za_all_data, za_m) +DEF_SME_ZA_FUNCTION (svwrite_ver, write_za_m, za_all_data, za_m) +#undef REQUIRED_EXTENSIONS + +#define 
REQUIRED_EXTENSIONS (AARCH64_FL_SME \ + | AARCH64_FL_SME_I16I64 \ + | AARCH64_FL_SM_ON) +DEF_SME_ZA_FUNCTION (svaddha, unary_za_m, za_d_integer, za_m) +DEF_SME_ZA_FUNCTION (svaddva, unary_za_m, za_d_integer, za_m) +DEF_SME_ZA_FUNCTION (svmopa, binary_za_m, mop_i16i64, za_m) +DEF_SME_ZA_FUNCTION (svmops, binary_za_m, mop_i16i64, za_m) +DEF_SME_ZA_FUNCTION (svsumopa, binary_za_uint_m, mop_i16i64_signed, za_m) +DEF_SME_ZA_FUNCTION (svsumops, binary_za_uint_m, mop_i16i64_signed, za_m) +DEF_SME_ZA_FUNCTION (svusmopa, binary_za_int_m, mop_i16i64_unsigned, za_m) +DEF_SME_ZA_FUNCTION (svusmops, binary_za_int_m, mop_i16i64_unsigned, za_m) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS (AARCH64_FL_SME \ + | AARCH64_FL_SME_F64F64 \ + | AARCH64_FL_SM_ON) +DEF_SME_ZA_FUNCTION (svmopa, binary_za_m, za_d_float, za_m) +DEF_SME_ZA_FUNCTION (svmops, binary_za_m, za_d_float, za_m) +#undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sme.h b/gcc/config/aarch64/aarch64-sve-builtins-sme.h new file mode 100644 index 00000000000..acfed77006b --- /dev/null +++ b/gcc/config/aarch64/aarch64-sve-builtins-sme.h @@ -0,0 +1,57 @@ +/* ACLE support for AArch64 SME. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . 
*/ + +#ifndef GCC_AARCH64_SVE_BUILTINS_SME_H +#define GCC_AARCH64_SVE_BUILTINS_SME_H + +namespace aarch64_sve +{ + namespace functions + { + extern const function_base *const arm_has_sme; + extern const function_base *const arm_in_streaming_mode; + extern const function_base *const svaddha_za; + extern const function_base *const svaddva_za; + extern const function_base *const svcntsb; + extern const function_base *const svcntsd; + extern const function_base *const svcntsh; + extern const function_base *const svcntsw; + extern const function_base *const svld1_hor_za; + extern const function_base *const svld1_ver_za; + extern const function_base *const svldr_za; + extern const function_base *const svmopa_za; + extern const function_base *const svmops_za; + extern const function_base *const svread_hor_za; + extern const function_base *const svread_ver_za; + extern const function_base *const svst1_hor_za; + extern const function_base *const svst1_ver_za; + extern const function_base *const svstr_za; + extern const function_base *const svsumopa_za; + extern const function_base *const svsumops_za; + extern const function_base *const svusmopa_za; + extern const function_base *const svusmops_za; + extern const function_base *const svwrite_hor_za; + extern const function_base *const svwrite_ver_za; + extern const function_base *const svundef_za; + extern const function_base *const svzero_za; + extern const function_base *const svzero_mask_za; + } +} + +#endif diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 7950977c14b..a40d448685d 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -51,6 +51,7 @@ #include "aarch64-sve-builtins.h" #include "aarch64-sve-builtins-base.h" #include "aarch64-sve-builtins-sve2.h" +#include "aarch64-sve-builtins-sme.h" #include "aarch64-sve-builtins-shapes.h" namespace aarch64_sve { @@ -112,6 +113,7 @@ static const char *const 
pred_suffixes[NUM_PREDS + 1] = { "_m", "_x", "_z", + "_m", "" }; @@ -136,12 +138,28 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { TYPE_##CLASS == TYPE_signed || TYPE_##CLASS == TYPE_unsigned, \ TYPE_##CLASS == TYPE_unsigned, \ TYPE_##CLASS == TYPE_float, \ + TYPE_##CLASS != TYPE_bool, \ TYPE_##CLASS == TYPE_bool, \ + false, \ + 0, \ + MODE }, +#define DEF_SME_ZA_SUFFIX(NAME, BITS, MODE) \ + { "_" #NAME, \ + NUM_VECTOR_TYPES, \ + NUM_TYPE_CLASSES, \ + BITS, \ + BITS / BITS_PER_UNIT, \ + false, \ + false, \ + false, \ + false, \ + false, \ + true, \ 0, \ MODE }, #include "aarch64-sve-builtins.def" { "", NUM_VECTOR_TYPES, TYPE_bool, 0, 0, false, false, false, false, - 0, VOIDmode } + false, false, 0, VOIDmode } }; CONSTEXPR const group_suffix_info group_suffixes[] = { @@ -422,6 +440,79 @@ CONSTEXPR const group_suffix_info group_suffixes[] = { TYPES_while1 (D, b32), \ TYPES_while1 (D, b64) +/* _za8 _za16 _za32 _za64 _za128. */ +#define TYPES_all_za(S, D) \ + S (za8), S (za16), S (za32), S (za64), S (za128) + +/* _za64. */ +#define TYPES_d_za(S, D) \ + S (za64) + +/* { _za8 } x { _s8 _u8 } + + { _za16 } x { _bf16 _f16 _s16 _u16 } + + { _za32 } x { _f32 _s32 _u32 } + + { _za64 } x { _f64 _s64 _u64 }. */ +#define TYPES_za_bhsd_data(S, D) \ + D (za8, s8), D (za8, u8), \ + D (za16, bf16), D (za16, f16), D (za16, s16), D (za16, u16), \ + D (za32, f32), D (za32, s32), D (za32, u32), \ + D (za64, f64), D (za64, s64), D (za64, u64) + +/* Likewise, plus: + + { _za128 } x { _bf16 } + { _f16 _f32 _f64 } + { _s8 _s16 _s32 _s64 } + { _u8 _u16 _u32 _u64 }. */ + +#define TYPES_za_all_data(S, D) \ + TYPES_za_bhsd_data (S, D), \ + TYPES_reinterpret1 (D, za128) + +/* _za32 x { _s32 _u32 }. */ +#define TYPES_za_s_integer(S, D) \ + D (za32, s32), D (za32, u32) + + +/* _za64_f64. */ +#define TYPES_za_d_float(S, D) \ + D (za64, f64) + +/* _za64 x { _s64 _u64 }. 
*/ +#define TYPES_za_d_integer(S, D) \ + D (za64, s64), D (za64, u64) + +/* _za32 x { _s8 _u8 _bf16 _f16 _f32 }. */ +#define TYPES_mop_base(S, D) \ + D (za32, s8), D (za32, u8), D (za32, bf16), D (za32, f16), D (za32, f32) + +/* _za32_s8. */ +#define TYPES_mop_base_signed(S, D) \ + D (za32, s8) + +/* _za32_u8. */ +#define TYPES_mop_base_unsigned(S, D) \ + D (za32, u8) + +/* _za64 x { _s16 _u16 }. */ +#define TYPES_mop_i16i64(S, D) \ + D (za64, s16), D (za64, u16) + +/* _za64_s16. */ +#define TYPES_mop_i16i64_signed(S, D) \ + D (za64, s16) + +/* _za64_u16. */ +#define TYPES_mop_i16i64_unsigned(S, D) \ + D (za64, u16) + +/* _za. */ +#define TYPES_za(S, D) \ + S (za) + /* Describe a pair of type suffixes in which only the first is used. */ #define DEF_VECTOR_TYPE(X) { TYPE_SUFFIX_ ## X, NUM_TYPE_SUFFIXES } @@ -489,6 +580,19 @@ DEF_SVE_TYPES_ARRAY (cvt_narrow); DEF_SVE_TYPES_ARRAY (inc_dec_n); DEF_SVE_TYPES_ARRAY (reinterpret); DEF_SVE_TYPES_ARRAY (while); +DEF_SVE_TYPES_ARRAY (all_za); +DEF_SVE_TYPES_ARRAY (d_za); +DEF_SVE_TYPES_ARRAY (za_all_data); +DEF_SVE_TYPES_ARRAY (za_s_integer); +DEF_SVE_TYPES_ARRAY (za_d_float); +DEF_SVE_TYPES_ARRAY (za_d_integer); +DEF_SVE_TYPES_ARRAY (mop_base); +DEF_SVE_TYPES_ARRAY (mop_base_signed); +DEF_SVE_TYPES_ARRAY (mop_base_unsigned); +DEF_SVE_TYPES_ARRAY (mop_i16i64); +DEF_SVE_TYPES_ARRAY (mop_i16i64_signed); +DEF_SVE_TYPES_ARRAY (mop_i16i64_unsigned); +DEF_SVE_TYPES_ARRAY (za); static const group_suffix_index groups_none[] = { GROUP_none, NUM_GROUP_SUFFIXES @@ -505,6 +609,9 @@ static const predication_index preds_none[] = { PRED_none, NUM_PREDS }; explicit suffix. */ static const predication_index preds_implicit[] = { PRED_implicit, NUM_PREDS }; +/* Used by functions that only support "_m" predication. */ +static const predication_index preds_m[] = { PRED_m, NUM_PREDS }; + /* Used by functions that allow merging and "don't care" predication, but are not suitable for predicated MOVPRFX. 
*/ static const predication_index preds_mx[] = { @@ -536,17 +643,23 @@ static const predication_index preds_z_or_none[] = { /* Used by (mostly predicate) functions that only support "_z" predication. */ static const predication_index preds_z[] = { PRED_z, NUM_PREDS }; +/* Used by SME instructions that always merge into ZA. */ +static const predication_index preds_za_m[] = { PRED_za_m, NUM_PREDS }; + /* A list of all SVE ACLE functions. */ static CONSTEXPR const function_group_info function_groups[] = { #define DEF_SVE_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \ { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, groups_##GROUPS, \ preds_##PREDS, REQUIRED_EXTENSIONS }, +#define DEF_SME_ZA_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \ + { #NAME, &functions::NAME##_za, &shapes::SHAPE, types_##TYPES, \ + groups_##GROUPS, preds_##PREDS, (REQUIRED_EXTENSIONS | AARCH64_FL_ZA_ON) }, #include "aarch64-sve-builtins.def" }; /* The scalar type associated with each vector type. */ -extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; -tree scalar_types[NUM_VECTOR_TYPES]; +extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES + 1]; +tree scalar_types[NUM_VECTOR_TYPES + 1]; /* The single-predicate and single-vector types, with their built-in "__SV..._t" name. Allow an index of NUM_VECTOR_TYPES, which always @@ -654,7 +767,7 @@ find_type_suffix_for_scalar_type (const_tree type) /* A linear search should be OK here, since the code isn't hot and the number of types is only small. 
*/ for (unsigned int suffix_i = 0; suffix_i < NUM_TYPE_SUFFIXES; ++suffix_i) - if (!type_suffixes[suffix_i].bool_p) + if (type_suffixes[suffix_i].vector_p) { vector_type_index vector_i = type_suffixes[suffix_i].vector_type; if (matches_type_p (scalar_types[vector_i], type)) @@ -745,6 +858,20 @@ check_required_extensions (location_t location, tree fndecl, return false; } + if (missing_extensions & AARCH64_FL_SM_ON) + { + error_at (location, "ACLE function %qD can only be called when" + " SME streaming mode is enabled", fndecl); + return false; + } + + if (missing_extensions & AARCH64_FL_ZA_ON) + { + error_at (location, "ACLE function %qD can only be called from" + " a function that has %qs state", fndecl, "za"); + return false; + } + static const struct { aarch64_feature_flags flag; const char *name; @@ -780,9 +907,13 @@ report_out_of_range (location_t location, tree fndecl, unsigned int argno, HOST_WIDE_INT actual, HOST_WIDE_INT min, HOST_WIDE_INT max) { - error_at (location, "passing %wd to argument %d of %qE, which expects" - " a value in the range [%wd, %wd]", actual, argno + 1, fndecl, - min, max); + if (min == max) + error_at (location, "passing %wd to argument %d of %qE, which expects" + " the value %wd", actual, argno + 1, fndecl, min); + else + error_at (location, "passing %wd to argument %d of %qE, which expects" + " a value in the range [%wd, %wd]", actual, argno + 1, fndecl, + min, max); } /* Report that LOCATION has a call to FNDECL in which argument ARGNO has @@ -869,7 +1000,7 @@ function_instance::reads_global_state_p () const return true; /* Handle direct reads of global state. */ - return flags & (CP_READ_MEMORY | CP_READ_FFR); + return flags & (CP_READ_MEMORY | CP_READ_FFR | CP_READ_ZA); } /* Return true if calls to the function could modify some form of @@ -890,7 +1021,7 @@ function_instance::modifies_global_state_p () const return true; /* Handle direct modifications of global state. 
*/ - return flags & (CP_WRITE_MEMORY | CP_WRITE_FFR); + return flags & (CP_WRITE_MEMORY | CP_WRITE_FFR | CP_WRITE_ZA); } /* Return true if calls to the function could raise a signal. */ @@ -922,8 +1053,8 @@ registered_function_hasher::equal (value_type value, const compare_type &key) return value->instance == key; } -sve_switcher::sve_switcher () - : aarch64_simd_switcher (AARCH64_FL_F16 | AARCH64_FL_SVE) +sve_switcher::sve_switcher (aarch64_feature_flags flags) + : aarch64_simd_switcher (AARCH64_FL_F16 | AARCH64_FL_SVE | flags) { /* Changing the ISA flags and have_regs_of_mode should be enough here. We shouldn't need to pay the compile-time cost of a full target @@ -979,6 +1110,10 @@ char * function_builder::get_name (const function_instance &instance, bool overloaded_p) { + /* __arm_* functions are listed as arm_*, so that the associated GCC + code is not in the implementation namespace. */ + if (strncmp (instance.base_name, "arm_", 4) == 0) + append_name ("__"); append_name (instance.base_name); if (overloaded_p) switch (instance.displacement_units ()) @@ -1016,12 +1151,72 @@ add_attribute (const char *name, tree attrs) return tree_cons (get_identifier (name), NULL_TREE, attrs); } -/* Return the appropriate function attributes for INSTANCE. */ +/* Add attribute NS::NAME to ATTRS. */ +static tree +add_attribute (const char *ns, const char *name, tree value, tree attrs) +{ + return tree_cons (build_tree_list (get_identifier (ns), + get_identifier (name)), + value, attrs); +} + +/* Attribute arm::NAME describes shared state that is an input if IS_IN + and an output if IS_OUT. Check whether a call with call properties + CALL_FLAGS needs such an attribute. Add it to in-progress attribute + list ATTRS if so. Return the new attribute list. 
*/ +static tree +add_shared_state_attribute (const char *name, bool is_in, bool is_out, + unsigned int call_flags, tree attrs) +{ + struct state_flag_info + { + const char *name; + unsigned int read_flag; + unsigned int write_flag; + }; + static state_flag_info state_flags[] = + { + { "za", CP_READ_ZA, CP_WRITE_ZA } + }; + + tree args = NULL_TREE; + for (const auto &state_flag : state_flags) + { + auto all_flags = state_flag.read_flag | state_flag.write_flag; + auto these_flags = ((is_in ? state_flag.read_flag : 0) + | (is_out ? state_flag.write_flag : 0)); + if ((call_flags & all_flags) == these_flags) + { + tree value = build_string (strlen (state_flag.name) + 1, + state_flag.name); + args = tree_cons (NULL_TREE, value, args); + } + } + if (args) + attrs = add_attribute ("arm", name, args, attrs); + return attrs; +} + +/* Return the appropriate function attributes for INSTANCE, which requires + the feature flags in REQUIRED_EXTENSIONS. */ tree -function_builder::get_attributes (const function_instance &instance) +function_builder::get_attributes (const function_instance &instance, + aarch64_feature_flags required_extensions) { tree attrs = NULL_TREE; + if (required_extensions & AARCH64_FL_SM_ON) + attrs = add_attribute ("arm", "streaming", NULL_TREE, attrs); + else if (!(required_extensions & AARCH64_FL_SM_OFF)) + attrs = add_attribute ("arm", "streaming_compatible", NULL_TREE, attrs); + + attrs = add_shared_state_attribute ("in", true, false, + instance.call_properties (), attrs); + attrs = add_shared_state_attribute ("out", false, true, + instance.call_properties (), attrs); + attrs = add_shared_state_attribute ("inout", true, true, + instance.call_properties (), attrs); + if (!instance.modifies_global_state_p ()) { if (instance.reads_global_state_p ()) @@ -1097,7 +1292,7 @@ add_unique_function (const function_instance &instance, tree fntype = build_function_type_array (return_type, argument_types.length (), argument_types.address ()); - tree attrs = 
get_attributes (instance); + tree attrs = get_attributes (instance, required_extensions); registered_function &rfn = add_function (instance, name, fntype, attrs, required_extensions, false, false); @@ -1114,7 +1309,7 @@ add_unique_function (const function_instance &instance, if (strcmp (name, overload_name) != 0) { /* Attribute lists shouldn't be shared. */ - tree attrs = get_attributes (instance); + tree attrs = get_attributes (instance, required_extensions); bool placeholder_p = !(m_direct_overloads || force_direct_overloads); add_function (instance, overload_name, fntype, attrs, required_extensions, false, placeholder_p); @@ -2283,6 +2478,7 @@ bool function_resolver::check_gp_argument (unsigned int nops, unsigned int &i, unsigned int &nargs) { + gcc_assert (pred != PRED_za_m); i = 0; if (pred != PRED_none) { @@ -2488,9 +2684,7 @@ function_checker::function_checker (location_t location, unsigned int nargs, tree *args) : function_call_info (location, instance, fndecl), m_fntype (fntype), m_nargs (nargs), m_args (args), - /* We don't have to worry about unary _m operations here, since they - never have arguments that need checking. */ - m_base_arg (pred != PRED_none ? 1 : 0) + m_base_arg (pred != PRED_none && pred != PRED_za_m ? 1 : 0) { } @@ -2955,21 +3149,51 @@ function_expander::convert_to_pmode (rtx x) } /* Return the base address for a contiguous load or store function. - MEM_MODE is the mode of the addressed memory. */ + MEM_MODE is the mode of the addressed memory, BASE_ARGNO is + the index of the base argument, and VNUM_ARGNO is the index of + the vnum offset argument (if any). VL_ISA_MODE is AARCH64_FL_SM_ON + if the vnum argument is a factor of the SME vector length, 0 if it + is a factor of the current prevailing vector length. 
*/ rtx -function_expander::get_contiguous_base (machine_mode mem_mode) +function_expander::get_contiguous_base (machine_mode mem_mode, + unsigned int base_argno, + unsigned int vnum_argno, + aarch64_feature_flags vl_isa_mode) { - rtx base = convert_to_pmode (args[1]); + rtx base = convert_to_pmode (args[base_argno]); if (mode_suffix_id == MODE_vnum) { - /* Use the size of the memory mode for extending loads and truncating - stores. Use the size of a full vector for non-extending loads - and non-truncating stores (including svld[234] and svst[234]). */ - poly_int64 size = ordered_min (GET_MODE_SIZE (mem_mode), - BYTES_PER_SVE_VECTOR); - rtx offset = gen_int_mode (size, Pmode); - offset = simplify_gen_binary (MULT, Pmode, args[2], offset); - base = simplify_gen_binary (PLUS, Pmode, base, offset); + rtx vnum = args[vnum_argno]; + if (vnum != const0_rtx) + { + /* Use the size of the memory mode for extending loads and truncating + stores. Use the size of a full vector for non-extending loads + and non-truncating stores (including svld[234] and svst[234]). 
*/ + poly_int64 size = ordered_min (GET_MODE_SIZE (mem_mode), + BYTES_PER_SVE_VECTOR); + rtx offset; + if ((vl_isa_mode & AARCH64_FL_SM_ON) + && !TARGET_STREAMING + && !size.is_constant ()) + { + gcc_assert (known_eq (size, BYTES_PER_SVE_VECTOR)); + if (CONST_INT_P (vnum) && IN_RANGE (INTVAL (vnum), -32, 31)) + offset = aarch64_sme_vq_immediate (Pmode, INTVAL (vnum) * 16, + AARCH64_ISA_MODE); + else + { + offset = aarch64_sme_vq_immediate (Pmode, 16, + AARCH64_ISA_MODE); + offset = simplify_gen_binary (MULT, Pmode, vnum, offset); + } + } + else + { + offset = gen_int_mode (size, Pmode); + offset = simplify_gen_binary (MULT, Pmode, vnum, offset); + } + base = simplify_gen_binary (PLUS, Pmode, base, offset); + } } return base; } @@ -3057,11 +3281,18 @@ function_expander::add_input_operand (insn_code icode, rtx x) machine_mode mode = operand.mode; if (mode == VOIDmode) { - /* The only allowable use of VOIDmode is the wildcard - aarch64_any_register_operand, which is used to avoid - combinatorial explosion in the reinterpret patterns. */ - gcc_assert (operand.predicate == aarch64_any_register_operand); - mode = GET_MODE (x); + /* The only allowable uses of VOIDmode are: + + - the wildcard aarch64_any_register_operand, which is used + to avoid combinatorial explosion in the reinterpret patterns + + - pmode_register_operand, which always has mode Pmode. */ + if (operand.predicate == aarch64_any_register_operand) + mode = GET_MODE (x); + else if (operand.predicate == pmode_register_operand) + mode = Pmode; + else + gcc_unreachable (); } else if (!VECTOR_MODE_P (GET_MODE (x)) && VECTOR_MODE_P (mode)) x = expand_vector_broadcast (mode, x); @@ -3076,7 +3307,7 @@ function_expander::add_input_operand (insn_code icode, rtx x) /* Add an integer operand with value X to the instruction. 
*/ void -function_expander::add_integer_operand (HOST_WIDE_INT x) +function_expander::add_integer_operand (poly_int64 x) { m_ops.safe_grow (m_ops.length () + 1, true); create_integer_operand (&m_ops.last (), x); @@ -3621,7 +3852,10 @@ init_builtins () sve_switcher sve; register_builtin_types (); if (in_lto_p) - handle_arm_sve_h (); + { + handle_arm_sve_h (); + handle_arm_sme_h (); + } } /* Register vector type TYPE under its arm_sve.h name. */ @@ -3771,7 +4005,8 @@ handle_arm_sve_h () function_table = new hash_table (1023); function_builder builder; for (unsigned int i = 0; i < ARRAY_SIZE (function_groups); ++i) - builder.register_function_group (function_groups[i]); + if (!(function_groups[i].required_extensions & AARCH64_FL_SME)) + builder.register_function_group (function_groups[i]); } /* Return the function decl with SVE function subcode CODE, or error_mark_node @@ -3784,6 +4019,33 @@ builtin_decl (unsigned int code, bool) return (*registered_functions)[code]->decl; } +/* Implement #pragma GCC aarch64 "arm_sme.h". */ +void +handle_arm_sme_h () +{ + if (!function_table) + { + error ("%qs defined without first defining %qs", + "arm_sme.h", "arm_sve.h"); + return; + } + + static bool initialized_p; + if (initialized_p) + { + error ("duplicate definition of %qs", "arm_sme.h"); + return; + } + initialized_p = true; + + sme_switcher sme; + + function_builder builder; + for (unsigned int i = 0; i < ARRAY_SIZE (function_groups); ++i) + if (function_groups[i].required_extensions & AARCH64_FL_SME) + builder.register_function_group (function_groups[i]); +} + /* If we're implementing manual overloading, check whether the SVE function with subcode CODE is overloaded, and if so attempt to determine the corresponding non-overloaded function. 
The call diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def b/gcc/config/aarch64/aarch64-sve-builtins.def index 14d12f07415..5824dc797f9 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.def +++ b/gcc/config/aarch64/aarch64-sve-builtins.def @@ -29,6 +29,10 @@ #define DEF_SVE_TYPE_SUFFIX(A, B, C, D, E) #endif +#ifndef DEF_SME_ZA_SUFFIX +#define DEF_SME_ZA_SUFFIX(A, B, C) +#endif + #ifndef DEF_SVE_GROUP_SUFFIX #define DEF_SVE_GROUP_SUFFIX(A, B, C) #endif @@ -42,6 +46,16 @@ DEF_SVE_FUNCTION_GS (NAME, SHAPE, TYPES, none, PREDS) #endif +#ifndef DEF_SME_ZA_FUNCTION_GS +#define DEF_SME_ZA_FUNCTION_GS(NAME, SHAPE, TYPES, GROUP, PREDS) \ + DEF_SVE_FUNCTION_GS(NAME, SHAPE, TYPES, GROUP, PREDS) +#endif + +#ifndef DEF_SME_ZA_FUNCTION +#define DEF_SME_ZA_FUNCTION(NAME, SHAPE, TYPES, PREDS) \ + DEF_SME_ZA_FUNCTION_GS (NAME, SHAPE, TYPES, none, PREDS) +#endif + DEF_SVE_MODE (n, none, none, none) DEF_SVE_MODE (index, none, none, elements) DEF_SVE_MODE (offset, none, none, bytes) @@ -104,16 +118,30 @@ DEF_SVE_TYPE_SUFFIX (u16, svuint16_t, unsigned, 16, VNx8HImode) DEF_SVE_TYPE_SUFFIX (u32, svuint32_t, unsigned, 32, VNx4SImode) DEF_SVE_TYPE_SUFFIX (u64, svuint64_t, unsigned, 64, VNx2DImode) +/* Associate _za with bytes. This is needed for svldr_vnum_za and + svstr_vnum_za, whose ZA offset can be in the range [0, 15], as for za8. 
*/ +DEF_SME_ZA_SUFFIX (za, 8, VNx16QImode) + +DEF_SME_ZA_SUFFIX (za8, 8, VNx16QImode) +DEF_SME_ZA_SUFFIX (za16, 16, VNx8HImode) +DEF_SME_ZA_SUFFIX (za32, 32, VNx4SImode) +DEF_SME_ZA_SUFFIX (za64, 64, VNx2DImode) +DEF_SME_ZA_SUFFIX (za128, 128, VNx1TImode) + DEF_SVE_GROUP_SUFFIX (x2, 0, 2) DEF_SVE_GROUP_SUFFIX (x3, 0, 3) DEF_SVE_GROUP_SUFFIX (x4, 0, 4) #include "aarch64-sve-builtins-base.def" #include "aarch64-sve-builtins-sve2.def" +#include "aarch64-sve-builtins-sme.def" +#undef DEF_SME_ZA_FUNCTION #undef DEF_SVE_FUNCTION +#undef DEF_SME_ZA_FUNCTION_GS #undef DEF_SVE_FUNCTION_GS #undef DEF_SVE_GROUP_SUFFIX +#undef DEF_SME_ZA_SUFFIX #undef DEF_SVE_TYPE_SUFFIX #undef DEF_SVE_TYPE #undef DEF_SVE_MODE diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index b0218bbad6e..1cd31d2d733 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -97,6 +97,8 @@ const unsigned int CP_PREFETCH_MEMORY = 1U << 3; const unsigned int CP_WRITE_MEMORY = 1U << 4; const unsigned int CP_READ_FFR = 1U << 5; const unsigned int CP_WRITE_FFR = 1U << 6; +const unsigned int CP_READ_ZA = 1U << 7; +const unsigned int CP_WRITE_ZA = 1U << 8; /* Enumerates the SVE predicate and (data) vector types, together called "vector types" for brevity. */ @@ -142,6 +144,10 @@ enum predication_index /* Zero predication: set inactive lanes of the vector result to zero. */ PRED_z, + /* Merging predication for SME's ZA: merge into slices of the array + instead of overwriting the whole slices. */ + PRED_za_m, + NUM_PREDS }; @@ -176,6 +182,8 @@ enum type_suffix_index { #define DEF_SVE_TYPE_SUFFIX(NAME, ACLE_TYPE, CLASS, BITS, MODE) \ TYPE_SUFFIX_ ## NAME, +#define DEF_SME_ZA_SUFFIX(NAME, BITS, MODE) \ + TYPE_SUFFIX_ ## NAME, #include "aarch64-sve-builtins.def" NUM_TYPE_SUFFIXES }; @@ -240,9 +248,13 @@ struct type_suffix_info unsigned int unsigned_p : 1; /* True if the suffix is for a floating-point type. 
*/ unsigned int float_p : 1; + /* True if the suffix is for a vector type (integer or float). */ + unsigned int vector_p : 1; /* True if the suffix is for a boolean type. */ unsigned int bool_p : 1; - unsigned int spare : 12; + /* True if the suffix is for SME's ZA. */ + unsigned int za_p : 1; + unsigned int spare : 10; /* The associated vector or predicate mode. */ machine_mode vector_mode : 16; @@ -356,13 +368,15 @@ public: tree displacement_vector_type () const; units_index displacement_units () const; + unsigned int num_za_tiles () const; + const type_suffix_info &type_suffix (unsigned int) const; const group_suffix_info &group_suffix () const; tree scalar_type (unsigned int) const; tree vector_type (unsigned int) const; tree tuple_type (unsigned int) const; - unsigned int elements_per_vq (unsigned int i) const; + unsigned int elements_per_vq (unsigned int) const; machine_mode vector_mode (unsigned int) const; machine_mode tuple_mode (unsigned int) const; machine_mode gp_mode (unsigned int) const; @@ -401,7 +415,7 @@ private: char *get_name (const function_instance &, bool); - tree get_attributes (const function_instance &); + tree get_attributes (const function_instance &, aarch64_feature_flags); registered_function &add_function (const function_instance &, const char *, tree, tree, @@ -607,7 +621,8 @@ public: bool overlaps_input_p (rtx); rtx convert_to_pmode (rtx); - rtx get_contiguous_base (machine_mode); + rtx get_contiguous_base (machine_mode, unsigned int = 1, unsigned int = 2, + aarch64_feature_flags = 0); rtx get_fallback_value (machine_mode, unsigned int, unsigned int, unsigned int &); rtx get_reg_target (); @@ -615,7 +630,7 @@ public: void add_output_operand (insn_code); void add_input_operand (insn_code, rtx); - void add_integer_operand (HOST_WIDE_INT); + void add_integer_operand (poly_int64); void add_mem_operand (machine_mode, rtx); void add_address_operand (rtx); void add_fixed_operand (rtx); @@ -740,7 +755,7 @@ public: class sve_switcher : public 
aarch64_simd_switcher { public: - sve_switcher (); + sve_switcher (aarch64_feature_flags = 0); ~sve_switcher (); private: @@ -748,11 +763,18 @@ private: bool m_old_have_regs_of_mode[MAX_MACHINE_MODE]; }; +/* Extends sve_switch enough for defining arm_sme.h. */ +class sme_switcher : public sve_switcher +{ +public: + sme_switcher () : sve_switcher (AARCH64_FL_SME) {} +}; + extern const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1]; extern const mode_suffix_info mode_suffixes[MODE_none + 1]; extern const group_suffix_info group_suffixes[NUM_GROUP_SUFFIXES]; -extern tree scalar_types[NUM_VECTOR_TYPES]; +extern tree scalar_types[NUM_VECTOR_TYPES + 1]; extern tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1]; extern tree acle_svpattern; extern tree acle_svprfop; @@ -888,6 +910,16 @@ function_instance::displacement_vector_type () const return acle_vector_types[0][mode_suffix ().displacement_vector_type]; } +/* Return the number of ZA tiles associated with the _za suffix + (which is always the first type suffix). */ +inline unsigned int +function_instance::num_za_tiles () const +{ + auto &suffix = type_suffix (0); + gcc_checking_assert (suffix.za_p); + return suffix.element_bytes; +} + /* If the function takes a vector or scalar displacement, return the units in which the displacement is measured, otherwise return UNITS_none. */ inline units_index diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 9a97b71fedc..26d575f68ca 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -3574,15 +3574,26 @@ aarch64_output_sve_scalar_inc_dec (rtx offset) } /* Return true if a single RDVL instruction can multiply FACTOR by the - number of 128-bit quadwords in an SVE vector. */ + number of 128-bit quadwords in an SVE vector. This is also the + range of ADDVL. 
*/ static bool -aarch64_sve_rdvl_factor_p (HOST_WIDE_INT factor) +aarch64_sve_rdvl_addvl_factor_p (HOST_WIDE_INT factor) { return (multiple_p (factor, 16) && IN_RANGE (factor, -32 * 16, 31 * 16)); } +/* Return true if ADDPL can be used to add FACTOR multiplied by the number + of quadwords in an SVE vector. */ + +static bool +aarch64_sve_addpl_factor_p (HOST_WIDE_INT factor) +{ + return (multiple_p (factor, 2) + && IN_RANGE (factor, -32 * 2, 31 * 2)); +} + /* Return true if we can move VALUE into a register using a single RDVL instruction. */ @@ -3590,7 +3601,7 @@ static bool aarch64_sve_rdvl_immediate_p (poly_int64 value) { HOST_WIDE_INT factor = value.coeffs[0]; - return value.coeffs[1] == factor && aarch64_sve_rdvl_factor_p (factor); + return value.coeffs[1] == factor && aarch64_sve_rdvl_addvl_factor_p (factor); } /* Likewise for rtx X. */ @@ -3626,10 +3637,8 @@ aarch64_sve_addvl_addpl_immediate_p (poly_int64 value) HOST_WIDE_INT factor = value.coeffs[0]; if (factor == 0 || value.coeffs[1] != factor) return false; - /* FACTOR counts VG / 2, so a value of 2 is one predicate width - and a value of 16 is one vector width. */ - return (((factor & 15) == 0 && IN_RANGE (factor, -32 * 16, 31 * 16)) - || ((factor & 1) == 0 && IN_RANGE (factor, -32 * 2, 31 * 2))); + return (aarch64_sve_rdvl_addvl_factor_p (factor) + || aarch64_sve_addpl_factor_p (factor)); } /* Likewise for rtx X. */ @@ -3729,11 +3738,11 @@ aarch64_output_sve_vector_inc_dec (const char *operands, rtx x) number of 128-bit quadwords in an SME vector. ISA_MODE is the ISA mode in which the calculation is being performed. */ -static rtx +rtx aarch64_sme_vq_immediate (machine_mode mode, HOST_WIDE_INT factor, aarch64_feature_flags isa_mode) { - gcc_assert (aarch64_sve_rdvl_factor_p (factor)); + gcc_assert (aarch64_sve_rdvl_addvl_factor_p (factor)); if (isa_mode & AARCH64_FL_SM_ON) /* We're in streaming mode, so we can use normal poly-int values. 
*/ return gen_int_mode ({ factor, factor }, mode); @@ -3776,7 +3785,7 @@ aarch64_rdsvl_immediate_p (const_rtx x) { HOST_WIDE_INT factor; return (aarch64_sme_vq_unspec_p (x, &factor) - && aarch64_sve_rdvl_factor_p (factor)); + && aarch64_sve_rdvl_addvl_factor_p (factor)); } /* Return the asm string for an RDSVL instruction that calculates X, @@ -3793,6 +3802,38 @@ aarch64_output_rdsvl (const_rtx x) return buffer; } +/* Return true if X is a constant that can be added using ADDSVL or ADDSPL. */ + +bool +aarch64_addsvl_addspl_immediate_p (const_rtx x) +{ + HOST_WIDE_INT factor; + return (aarch64_sme_vq_unspec_p (x, &factor) + && (aarch64_sve_rdvl_addvl_factor_p (factor) + || aarch64_sve_addpl_factor_p (factor))); +} + +/* X is a constant that satisfies aarch64_addsvl_addspl_immediate_p. + Return the asm string for the associated instruction. */ + +char * +aarch64_output_addsvl_addspl (rtx x) +{ + static char buffer[sizeof ("addspl\t%x0, %x1, #-") + 3 * sizeof (int)]; + HOST_WIDE_INT factor; + if (!aarch64_sme_vq_unspec_p (x, &factor)) + gcc_unreachable (); + if (aarch64_sve_rdvl_addvl_factor_p (factor)) + snprintf (buffer, sizeof (buffer), "addsvl\t%%x0, %%x1, #%d", + (int) factor / 16); + else if (aarch64_sve_addpl_factor_p (factor)) + snprintf (buffer, sizeof (buffer), "addspl\t%%x0, %%x1, #%d", + (int) factor / 2); + else + gcc_unreachable (); + return buffer; +} + /* Multipliers for repeating bitmasks of width 32, 16, 8, 4, and 2. */ static const unsigned HOST_WIDE_INT bitmask_imm_mul[] = @@ -4428,7 +4469,7 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, { /* Try to use an unshifted CNT[BHWD] or RDVL. 
*/ if (aarch64_sve_cnt_factor_p (factor) - || aarch64_sve_rdvl_factor_p (factor)) + || aarch64_sve_rdvl_addvl_factor_p (factor)) { val = gen_int_mode (poly_int64 (factor, factor), mode); shift = 0; @@ -9803,7 +9844,7 @@ aarch64_classify_index (struct aarch64_address_info *info, rtx x, && contains_reg_of_mode[GENERAL_REGS][GET_MODE (SUBREG_REG (index))]) index = SUBREG_REG (index); - if (aarch64_sve_data_mode_p (mode)) + if (aarch64_sve_data_mode_p (mode) || mode == VNx1TImode) { if (type != ADDRESS_REG_REG || (1 << shift) != GET_MODE_UNIT_SIZE (mode)) @@ -9906,7 +9947,8 @@ aarch64_classify_address (struct aarch64_address_info *info, && ((vec_flags == 0 && known_lt (GET_MODE_SIZE (mode), 16)) || vec_flags == VEC_ADVSIMD - || vec_flags & VEC_SVE_DATA)); + || vec_flags & VEC_SVE_DATA + || mode == VNx1TImode)); /* For SVE, only accept [Rn], [Rn, #offset, MUL VL] and [Rn, Rm, LSL #shift]. The latter is not valid for SVE predicates, and that's rejected through @@ -10025,7 +10067,7 @@ aarch64_classify_address (struct aarch64_address_info *info, /* Make "m" use the LD1 offset range for SVE data modes, so that pre-RTL optimizers like ivopts will work to that instead of the wider LDR/STR range. */ - if (vec_flags == VEC_SVE_DATA) + if (vec_flags == VEC_SVE_DATA || mode == VNx1TImode) return (type == ADDR_QUERY_M ? offset_4bit_signed_scaled_p (mode, offset) : offset_9bit_signed_scaled_p (mode, offset)); @@ -12496,6 +12538,51 @@ aarch64_output_casesi (rtx *operands) return ""; } +/* Return the asm string for an SME ZERO instruction whose 8-bit mask + operand is MASK. */ +const char * +aarch64_output_sme_zero_za (rtx mask) +{ + auto mask_val = UINTVAL (mask); + if (mask_val == 0) + return "zero\t{}"; + + if (mask_val == 0xff) + return "zero\t{ za }"; + + static constexpr std::pair<unsigned int, char> tiles[] = { + { 0xff, 'b' }, + { 0x55, 'h' }, + { 0x11, 's' }, + { 0x01, 'd' } + }; + /* The last entry in the list has the form "za7.d }", but that's the + same length as "za7.d, ". 
*/ + static char buffer[sizeof("zero\t{ ") + sizeof ("za7.d, ") * 8 + 1]; + unsigned int i = 0; + i += snprintf (buffer + i, sizeof (buffer) - i, "zero\t"); + const char *prefix = "{ "; + for (auto &tile : tiles) + { + auto tile_mask = tile.first; + unsigned int tile_index = 0; + while (tile_mask < 0x100) + { + if ((mask_val & tile_mask) == tile_mask) + { + i += snprintf (buffer + i, sizeof (buffer) - i, "%sza%d.%c", + prefix, tile_index, tile.second); + prefix = ", "; + mask_val &= ~tile_mask; + } + tile_mask <<= 1; + tile_index += 1; + } + } + gcc_assert (mask_val == 0 && i + 3 <= sizeof (buffer)); + snprintf (buffer + i, sizeof (buffer) - i, " }"); + return buffer; +} /* Return size in bits of an arithmetic operand which is shifted/scaled and masked such that it is suitable for a UXTB, UXTH, or UXTW extend @@ -21586,6 +21673,31 @@ aarch64_sve_struct_memory_operand_p (rtx op) && offset_4bit_signed_scaled_p (SVE_BYTE_MODE, last)); } +/* Return true if OFFSET is a constant integer and if VNUM is + OFFSET * the number of bytes in an SVE vector. This is the requirement + that exists in SME LDR and STR instructions, where the VL offset must + equal the ZA slice offset. */ +bool +aarch64_sme_ldr_vnum_offset_p (rtx offset, rtx vnum) +{ + if (!CONST_INT_P (offset) || !IN_RANGE (INTVAL (offset), 0, 15)) + return false; + + if (TARGET_STREAMING) + { + poly_int64 const_vnum; + return (poly_int_rtx_p (vnum, &const_vnum) + && known_eq (const_vnum, + INTVAL (offset) * BYTES_PER_SVE_VECTOR)); + } + else + { + HOST_WIDE_INT factor; + return (aarch64_sme_vq_unspec_p (vnum, &factor) + && factor == INTVAL (offset) * 16); + } +} + /* Emit a register copy from operand to operand, taking care not to early-clobber source registers in the process. 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 57012a7c763..f9139a8e28f 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -207,6 +207,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* Macros to test ISA flags. */ #define AARCH64_ISA_SM_OFF (aarch64_isa_flags & AARCH64_FL_SM_OFF) +#define AARCH64_ISA_SM_ON (aarch64_isa_flags & AARCH64_FL_SM_ON) #define AARCH64_ISA_ZA_ON (aarch64_isa_flags & AARCH64_FL_ZA_ON) #define AARCH64_ISA_MODE (aarch64_isa_flags & AARCH64_FL_ISA_MODES) #define AARCH64_ISA_CRC (aarch64_isa_flags & AARCH64_FL_CRC) @@ -224,6 +225,8 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define AARCH64_ISA_SVE2_SHA3 (aarch64_isa_flags & AARCH64_FL_SVE2_SHA3) #define AARCH64_ISA_SVE2_SM4 (aarch64_isa_flags & AARCH64_FL_SVE2_SM4) #define AARCH64_ISA_SME (aarch64_isa_flags & AARCH64_FL_SME) +#define AARCH64_ISA_SME_I16I64 (aarch64_isa_flags & AARCH64_FL_SME_I16I64) +#define AARCH64_ISA_SME_F64F64 (aarch64_isa_flags & AARCH64_FL_SME_F64F64) #define AARCH64_ISA_V8_3A (aarch64_isa_flags & AARCH64_FL_V8_3A) #define AARCH64_ISA_DOTPROD (aarch64_isa_flags & AARCH64_FL_DOTPROD) #define AARCH64_ISA_AES (aarch64_isa_flags & AARCH64_FL_AES) @@ -257,6 +260,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* The current function is a normal non-streaming function. */ #define TARGET_NON_STREAMING (AARCH64_ISA_SM_OFF) +/* The current function has a streaming body. */ +#define TARGET_STREAMING (AARCH64_ISA_SM_ON) + /* The current function has a streaming-compatible body. */ #define TARGET_STREAMING_COMPATIBLE \ ((aarch64_isa_flags & AARCH64_FL_SM_STATE) == 0) @@ -317,6 +323,15 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; imply anything about the state of PSTATE.SM. */ #define TARGET_SME (AARCH64_ISA_SME) +/* Streaming-mode SME instructions. 
*/ +#define TARGET_STREAMING_SME (TARGET_STREAMING && TARGET_SME) + +/* The FEAT_SME_I16I64 extension to SME, enabled through +sme-i16i64. */ +#define TARGET_SME_I16I64 (AARCH64_ISA_SME_I16I64) + +/* The FEAT_SME_F64F64 extension to SME, enabled through +sme-f64f64. */ +#define TARGET_SME_F64F64 (AARCH64_ISA_SME_F64F64) + /* ARMv8.3-A features. */ #define TARGET_ARMV8_3 (AARCH64_ISA_V8_3A) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 14a401617f6..2036dccd250 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -2166,6 +2166,7 @@ (define_insn "*add3_aarch64" [ r , rk , Uaa ; multiple , * ] # [ r , 0 , Uai ; alu_imm , sve ] << aarch64_output_sve_scalar_inc_dec (operands[2]); [ rk , rk , Uav ; alu_imm , sve ] << aarch64_output_sve_addvl_addpl (operands[2]); + [ rk , rk , UaV ; alu_imm , sme ] << aarch64_output_addsvl_addspl (operands[2]); } ;; The "alu_imm" types for INC/DEC and ADDVL/ADDPL are just placeholders. ) diff --git a/gcc/config/aarch64/arm_sme.h b/gcc/config/aarch64/arm_sme.h new file mode 100644 index 00000000000..5ddd49f5778 --- /dev/null +++ b/gcc/config/aarch64/arm_sme.h @@ -0,0 +1,45 @@ +/* AArch64 SME intrinsics include file. + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. 
+ + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. */ + +#ifndef _ARM_SME_H_ +#define _ARM_SME_H_ + +#include <arm_sve.h> +#pragma GCC aarch64 "arm_sme.h" + +void __arm_za_disable(void) __arm_streaming_compatible; + +void *__arm_sc_memcpy(void *, const void *, __SIZE_TYPE__) + __arm_streaming_compatible; + +void *__arm_sc_memmove(void *, const void *, __SIZE_TYPE__) + __arm_streaming_compatible; + +void *__arm_sc_memset(void *, int, __SIZE_TYPE__) + __arm_streaming_compatible; + +void *__arm_sc_memchr(void *, int, __SIZE_TYPE__) + __arm_streaming_compatible; + +#endif diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index 5dd50218b9f..38ed927ec14 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -21,6 +21,9 @@ (define_register_constraint "k" "STACK_REG" "@internal The stack register.") +(define_register_constraint "Ucj" "W12_W15_REGS" + "@internal r12-r15, which can be used to index ZA.") + (define_register_constraint "Ucs" "TAILCALL_ADDR_REGS" "@internal Registers suitable for an indirect tail call") @@ -74,6 +77,12 @@ (define_constraint "Uav" a single ADDVL or ADDPL." (match_operand 0 "aarch64_sve_addvl_addpl_immediate")) +(define_constraint "UaV" + "@internal + A constraint that matches a VG-based constant that can be added by + a single ADDSVL or ADDSPL."
+ (match_operand 0 "aarch64_addsvl_addspl_immediate")) + (define_constraint "Uat" "@internal A constraint that matches a VG-based constant that can be added by diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 5f7cd886283..1a14069485d 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -429,6 +429,7 @@ (define_mode_iterator VNx4SI_ONLY [VNx4SI]) (define_mode_iterator VNx4SF_ONLY [VNx4SF]) (define_mode_iterator VNx2DI_ONLY [VNx2DI]) (define_mode_iterator VNx2DF_ONLY [VNx2DF]) +(define_mode_iterator VNx1TI_ONLY [VNx1TI]) ;; All fully-packed SVE vector modes. (define_mode_iterator SVE_FULL [VNx16QI VNx8HI VNx4SI VNx2DI @@ -587,6 +588,17 @@ (define_mode_iterator PRED_HSD [VNx8BI VNx4BI VNx2BI]) ;; Bfloat16 modes to which V4SF can be converted (define_mode_iterator V4SF_TO_BF [V4BF V8BF]) +;; The modes used to represent different ZA access sizes. +(define_mode_iterator SME_ZA_I [VNx16QI VNx8HI VNx4SI VNx2DI VNx1TI]) +(define_mode_iterator SME_ZA_SDI [VNx4SI (VNx2DI "TARGET_SME_I16I64")]) + +(define_mode_iterator SME_ZA_SDF_I [VNx4SI (VNx2DI "TARGET_SME_F64F64")]) + +;; The modes for which outer product instructions are supported. +(define_mode_iterator SME_MOP_BHI [VNx16QI (VNx8HI "TARGET_SME_I16I64")]) +(define_mode_iterator SME_MOP_HSDF [VNx8BF VNx8HF VNx4SF + (VNx2DF "TARGET_SME_F64F64")]) + ;; ------------------------------------------------------------------ ;; Unspec enumerations for Advance SIMD. These could well go into ;; aarch64.md but for their use in int_iterators here. @@ -948,6 +960,28 @@ (define_c_enum "unspec" UNSPEC_BFCVTN2 ; Used in aarch64-simd.md. UNSPEC_BFCVT ; Used in aarch64-simd.md. UNSPEC_FCVTXN ; Used in aarch64-simd.md. 
+ + ;; All used in aarch64-sme.md + UNSPEC_SME_ADDHA + UNSPEC_SME_ADDVA + UNSPEC_SME_FMOPA + UNSPEC_SME_FMOPS + UNSPEC_SME_LD1_HOR + UNSPEC_SME_LD1_VER + UNSPEC_SME_READ_HOR + UNSPEC_SME_READ_VER + UNSPEC_SME_SMOPA + UNSPEC_SME_SMOPS + UNSPEC_SME_ST1_HOR + UNSPEC_SME_ST1_VER + UNSPEC_SME_SUMOPA + UNSPEC_SME_SUMOPS + UNSPEC_SME_UMOPA + UNSPEC_SME_UMOPS + UNSPEC_SME_USMOPA + UNSPEC_SME_USMOPS + UNSPEC_SME_WRITE_HOR + UNSPEC_SME_WRITE_VER ]) ;; ------------------------------------------------------------------ @@ -1084,9 +1118,15 @@ (define_mode_attr sizem1 [(QI "#7") (HI "#15") (SI "#31") (DI "#63") ;; element. (define_mode_attr elem_bits [(VNx16BI "8") (VNx8BI "16") (VNx4BI "32") (VNx2BI "64") - (VNx16QI "8") (VNx8HI "16") - (VNx4SI "32") (VNx2DI "64") - (VNx8HF "16") (VNx4SF "32") (VNx2DF "64")]) + (VNx16QI "8") (VNx32QI "8") (VNx64QI "8") + (VNx8HI "16") (VNx16HI "16") (VNx32HI "16") + (VNx8HF "16") (VNx16HF "16") (VNx32HF "16") + (VNx8BF "16") (VNx16BF "16") (VNx32BF "16") + (VNx4SI "32") (VNx8SI "32") (VNx16SI "32") + (VNx4SF "32") (VNx8SF "32") (VNx16SF "32") + (VNx2DI "64") (VNx4DI "64") (VNx8DI "64") + (VNx2DF "64") (VNx4DF "64") (VNx8DF "64") + (VNx1TI "128")]) ;; The number of bits in a vector container. 
(define_mode_attr container_bits [(VNx16QI "8") @@ -1212,6 +1252,7 @@ (define_mode_attr Vetype [(V8QI "b") (V16QI "b") (VNx4SF "s") (VNx2SF "s") (VNx2DI "d") (VNx2DF "d") + (VNx1TI "q") (BF "h") (V4BF "h") (V8BF "h") (HF "h") (SF "s") (DF "d") @@ -1230,6 +1271,7 @@ (define_mode_attr Vesize [(VNx16QI "b") (VNx8QI "b") (VNx4QI "b") (VNx2QI "b") (VNx4SF "w") (VNx2SF "w") (VNx2DI "d") (VNx2DF "d") + (VNx1TI "q") (VNx32QI "b") (VNx48QI "b") (VNx64QI "b") (VNx16HI "h") (VNx24HI "h") (VNx32HI "h") (VNx16HF "h") (VNx24HF "h") (VNx32HF "h") @@ -2046,6 +2088,7 @@ (define_mode_attr VPRED [(VNx16QI "VNx16BI") (VNx8QI "VNx8BI") (VNx4SF "VNx4BI") (VNx2SF "VNx2BI") (VNx2DI "VNx2BI") (VNx2DF "VNx2BI") + (VNx1TI "VNx2BI") (VNx32QI "VNx16BI") (VNx16HI "VNx8BI") (VNx16HF "VNx8BI") (VNx16BF "VNx8BI") @@ -2130,6 +2173,8 @@ (define_mode_attr vec_or_offset [(V8QI "vec") (V16QI "vec") (V4HI "vec") (V8HI "vec") (V2SI "vec") (V4SI "vec") (V2DI "vec") (DI "offset")]) +(define_mode_attr b [(VNx8BF "b") (VNx8HF "") (VNx4SF "") (VNx2DF "")]) + ;; ------------------------------------------------------------------- ;; Code Iterators ;; ------------------------------------------------------------------- @@ -3158,6 +3203,20 @@ (define_int_iterator FCMLA_OP [UNSPEC_FCMLA (define_int_iterator FCMUL_OP [UNSPEC_FCMUL UNSPEC_FCMUL_CONJ]) +(define_int_iterator SME_LD1 [UNSPEC_SME_LD1_HOR UNSPEC_SME_LD1_VER]) +(define_int_iterator SME_READ [UNSPEC_SME_READ_HOR UNSPEC_SME_READ_VER]) +(define_int_iterator SME_ST1 [UNSPEC_SME_ST1_HOR UNSPEC_SME_ST1_VER]) +(define_int_iterator SME_WRITE [UNSPEC_SME_WRITE_HOR UNSPEC_SME_WRITE_VER]) + +(define_int_iterator SME_BINARY_SDI [UNSPEC_SME_ADDHA UNSPEC_SME_ADDVA]) + +(define_int_iterator SME_INT_MOP [UNSPEC_SME_SMOPA UNSPEC_SME_SMOPS + UNSPEC_SME_SUMOPA UNSPEC_SME_SUMOPS + UNSPEC_SME_UMOPA UNSPEC_SME_UMOPS + UNSPEC_SME_USMOPA UNSPEC_SME_USMOPS]) + +(define_int_iterator SME_FP_MOP [UNSPEC_SME_FMOPA UNSPEC_SME_FMOPS]) + ;; Iterators for atomic operations. 
(define_int_iterator ATOMIC_LDOP @@ -3232,6 +3291,26 @@ (define_int_attr optab [(UNSPEC_ANDF "and") (UNSPEC_PMULLT "pmullt") (UNSPEC_PMULLT_PAIR "pmullt_pair") (UNSPEC_SMATMUL "smatmul") + (UNSPEC_SME_ADDHA "addha") + (UNSPEC_SME_ADDVA "addva") + (UNSPEC_SME_FMOPA "fmopa") + (UNSPEC_SME_FMOPS "fmops") + (UNSPEC_SME_LD1_HOR "ld1_hor") + (UNSPEC_SME_LD1_VER "ld1_ver") + (UNSPEC_SME_READ_HOR "read_hor") + (UNSPEC_SME_READ_VER "read_ver") + (UNSPEC_SME_SMOPA "smopa") + (UNSPEC_SME_SMOPS "smops") + (UNSPEC_SME_ST1_HOR "st1_hor") + (UNSPEC_SME_ST1_VER "st1_ver") + (UNSPEC_SME_SUMOPA "sumopa") + (UNSPEC_SME_SUMOPS "sumops") + (UNSPEC_SME_UMOPA "umopa") + (UNSPEC_SME_UMOPS "umops") + (UNSPEC_SME_USMOPA "usmopa") + (UNSPEC_SME_USMOPS "usmops") + (UNSPEC_SME_WRITE_HOR "write_hor") + (UNSPEC_SME_WRITE_VER "write_ver") (UNSPEC_SQCADD90 "sqcadd90") (UNSPEC_SQCADD270 "sqcadd270") (UNSPEC_SQRDCMLAH "sqrdcmlah") @@ -3977,6 +4056,15 @@ (define_int_attr min_elem_bits [(UNSPEC_RBIT "8") (define_int_attr unspec [(UNSPEC_WHILERW "UNSPEC_WHILERW") (UNSPEC_WHILEWR "UNSPEC_WHILEWR")]) +(define_int_attr hv [(UNSPEC_SME_LD1_HOR "h") + (UNSPEC_SME_LD1_VER "v") + (UNSPEC_SME_READ_HOR "h") + (UNSPEC_SME_READ_VER "v") + (UNSPEC_SME_ST1_HOR "h") + (UNSPEC_SME_ST1_VER "v") + (UNSPEC_SME_WRITE_HOR "h") + (UNSPEC_SME_WRITE_VER "v")]) + ;; Iterators and attributes for fpcr fpsr getter setters (define_int_iterator GET_FPSCR diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index a73724a7fc0..5f304898a8c 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -184,11 +184,17 @@ (define_predicate "aarch64_split_add_offset_immediate" (and (match_code "const_poly_int") (match_test "aarch64_add_offset_temporaries (op) == 1"))) +(define_predicate "aarch64_addsvl_addspl_immediate" + (and (match_code "const") + (match_test "aarch64_addsvl_addspl_immediate_p (op)"))) + (define_predicate "aarch64_pluslong_operand" (ior (match_operand 0 
"register_operand") (match_operand 0 "aarch64_pluslong_immediate") (and (match_test "TARGET_SVE") - (match_operand 0 "aarch64_sve_plus_immediate")))) + (match_operand 0 "aarch64_sve_plus_immediate")) + (and (match_test "TARGET_SME") + (match_operand 0 "aarch64_addsvl_addspl_immediate")))) (define_predicate "aarch64_pluslong_or_poly_operand" (ior (match_operand 0 "aarch64_pluslong_operand") diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64 index cff56dc9f55..0d96ae3d0b2 100644 --- a/gcc/config/aarch64/t-aarch64 +++ b/gcc/config/aarch64/t-aarch64 @@ -63,6 +63,7 @@ aarch64-sve-builtins.o: $(srcdir)/config/aarch64/aarch64-sve-builtins.cc \ $(srcdir)/config/aarch64/aarch64-sve-builtins.def \ $(srcdir)/config/aarch64/aarch64-sve-builtins-base.def \ $(srcdir)/config/aarch64/aarch64-sve-builtins-sve2.def \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.def \ $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \ $(TM_P_H) memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) $(DIAGNOSTIC_H) \ $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) fold-const.h $(GIMPLE_H) \ @@ -72,7 +73,8 @@ aarch64-sve-builtins.o: $(srcdir)/config/aarch64/aarch64-sve-builtins.cc \ $(srcdir)/config/aarch64/aarch64-sve-builtins.h \ $(srcdir)/config/aarch64/aarch64-sve-builtins-shapes.h \ $(srcdir)/config/aarch64/aarch64-sve-builtins-base.h \ - $(srcdir)/config/aarch64/aarch64-sve-builtins-sve2.h + $(srcdir)/config/aarch64/aarch64-sve-builtins-sve2.h \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.h $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ $(srcdir)/config/aarch64/aarch64-sve-builtins.cc @@ -113,6 +115,19 @@ aarch64-sve-builtins-sve2.o: \ $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ $(srcdir)/config/aarch64/aarch64-sve-builtins-sve2.cc +aarch64-sve-builtins-sme.o: \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.cc \ + $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \ + $(TM_P_H) memmodel.h insn-codes.h 
$(OPTABS_H) $(RECOG_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) fold-const.h $(GIMPLE_H) \ + gimple-iterator.h gimplify.h explow.h $(EMIT_RTL_H) \ + $(srcdir)/config/aarch64/aarch64-sve-builtins.h \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-shapes.h \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.h \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-functions.h + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.cc + aarch64-builtin-iterators.h: $(srcdir)/config/aarch64/geniterators.sh \ $(srcdir)/config/aarch64/iterators.md $(SHELL) $(srcdir)/config/aarch64/geniterators.sh \ diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index b138a74cc2b..806babc3dfa 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21275,6 +21275,10 @@ Enable the Pointer Authentication Extension. Enable the Common Short Sequence Compression instructions. @item sme Enable the Scalable Matrix Extension. +@item sme-i16i64 +Enable the FEAT_SME_I16I64 extension to SME. +@item sme-f64f64 +Enable the FEAT_SME_F64F64 extension to SME. 
@end table

From patchwork Tue Dec 5 10:13:20 2023
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1872033
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 22/25] aarch64: Add support for __arm_locally_streaming
Date: Tue, 5 Dec 2023 10:13:20 +0000
Message-Id: <20231205101323.1914247-23-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

This patch adds support for the __arm_locally_streaming attribute, which allows a function to use SME internally without changing the function's ABI.
The attribute is valid but redundant for __arm_streaming functions.

gcc/
	* config/aarch64/aarch64.cc (aarch64_arm_attribute_table): Add
	arm::locally_streaming.
	(aarch64_fndecl_is_locally_streaming): New function.
	(aarch64_fndecl_sm_state): Handle locally-streaming functions.
	(aarch64_cfun_enables_pstate_sm): New function.
	(aarch64_add_offset): Add an argument that specifies whether
	the streaming vector length should be used instead of the
	prevailing one.
	(aarch64_split_add_offset, aarch64_add_sp, aarch64_sub_sp): Likewise.
	(aarch64_allocate_and_probe_stack_space): Likewise.
	(aarch64_expand_mov_immediate): Update calls accordingly.
	(aarch64_need_old_pstate_sm): Return true for locally-streaming
	streaming-compatible functions.
	(aarch64_layout_frame): Force all call-preserved Z and P registers
	to be saved and restored if the function switches PSTATE.SM in the
	prologue.
	(aarch64_get_separate_components): Disable shrink-wrapping of
	such Z and P saves and restores.
	(aarch64_use_late_prologue_epilogue): New function.
	(aarch64_expand_prologue): Measure SVE lengths in the streaming
	vector length for locally-streaming functions, then emit code
	to enable streaming mode.
	(aarch64_expand_epilogue): Likewise in reverse.
	(TARGET_USE_LATE_PROLOGUE_EPILOGUE): Define.
	* config/aarch64/aarch64-c.cc (aarch64_define_unconditional_macros):
	Define __arm_locally_streaming.

gcc/testsuite/
	* gcc.target/aarch64/sme/locally_streaming_1.c: New test.
	* gcc.target/aarch64/sme/locally_streaming_2.c: Likewise.
	* gcc.target/aarch64/sme/locally_streaming_3.c: Likewise.
	* gcc.target/aarch64/sme/locally_streaming_4.c: Likewise.
	* gcc.target/aarch64/sme/keyword_macros_1.c: Add
	__arm_locally_streaming.
	* g++.target/aarch64/sme/keyword_macros_1.C: Likewise.
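
For illustration only (not part of the patch): a minimal sketch of how
the attribute described above might be used from C.  The function names
sum_bytes and checksum_from_normal_code are hypothetical.  The #ifndef
fallback is an assumption added so that the sketch also builds with
compilers that do not predefine the keyword macro; with a compiler that
does predefine it, the file would presumably need SME enabled (for
example through +sme) for the streaming body to be accepted.

```c
#include <stddef.h>

/* Assumption: on compilers that lack the keyword macro, fall back to a
   plain function so that the sketch still compiles.  */
#ifndef __arm_locally_streaming
#define __arm_locally_streaming
#endif

/* Hypothetical helper.  The attribute only affects the body: callers
   see an ordinary non-streaming function, and any switch into and out
   of streaming mode is left to the compiler-generated prologue and
   epilogue.  */
__arm_locally_streaming int
sum_bytes (const unsigned char *data, size_t n)
{
  int total = 0;
  for (size_t i = 0; i < n; ++i)
    total += data[i];
  return total;
}

/* Hypothetical non-streaming caller.  No attribute is needed here,
   because the ABI of sum_bytes is unchanged.  */
int
checksum_from_normal_code (void)
{
  static const unsigned char data[] = { 1, 2, 3, 4 };
  return sum_bytes (data, sizeof (data));
}
```

The point of the sketch is that the attribute is an implementation
choice of the definition, so existing callers do not need to be
recompiled or annotated.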
--- gcc/config/aarch64/aarch64-c.cc | 1 + gcc/config/aarch64/aarch64.cc | 233 +++++++-- .../g++.target/aarch64/sme/keyword_macros_1.C | 1 + .../gcc.target/aarch64/sme/keyword_macros_1.c | 1 + .../aarch64/sme/locally_streaming_1.c | 466 ++++++++++++++++++ .../aarch64/sme/locally_streaming_2.c | 177 +++++++ .../aarch64/sme/locally_streaming_3.c | 273 ++++++++++ .../aarch64/sme/locally_streaming_4.c | 145 ++++++ 8 files changed, 1259 insertions(+), 38 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_4.c diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc index f2fa5df1b82..2a8ca46987a 100644 --- a/gcc/config/aarch64/aarch64-c.cc +++ b/gcc/config/aarch64/aarch64-c.cc @@ -86,6 +86,7 @@ aarch64_define_unconditional_macros (cpp_reader *pfile) DEFINE_ARM_KEYWORD_MACRO ("streaming"); DEFINE_ARM_KEYWORD_MACRO ("streaming_compatible"); + DEFINE_ARM_KEYWORD_MACRO ("locally_streaming"); #undef DEFINE_ARM_KEYWORD_MACRO diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 26d575f68ca..c94016ccdcf 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -762,6 +762,7 @@ static const attribute_spec aarch64_arm_attributes[] = NULL, attr_streaming_exclusions }, { "streaming_compatible", 0, 0, false, true, true, true, NULL, attr_streaming_exclusions }, + { "locally_streaming", 0, 0, true, false, false, false, NULL, NULL }, { "new", 1, -1, true, false, false, false, handle_arm_new, NULL }, { "preserves", 1, -1, false, true, true, true, @@ -2071,6 +2072,16 @@ aarch64_fntype_isa_mode (const_tree fntype) | aarch64_fntype_pstate_za (fntype)); } +/* Return true if FNDECL uses streaming mode internally, as an + implementation choice. 
*/ + +static bool +aarch64_fndecl_is_locally_streaming (const_tree fndecl) +{ + return lookup_attribute ("arm", "locally_streaming", + DECL_ATTRIBUTES (fndecl)); +} + /* Return the state of PSTATE.SM when compiling the body of function FNDECL. This might be different from the state of PSTATE.SM on entry. */ @@ -2078,6 +2089,9 @@ aarch64_fntype_isa_mode (const_tree fntype) static aarch64_feature_flags aarch64_fndecl_pstate_sm (const_tree fndecl) { + if (aarch64_fndecl_is_locally_streaming (fndecl)) + return AARCH64_FL_SM_ON; + return aarch64_fntype_pstate_sm (TREE_TYPE (fndecl)); } @@ -2153,6 +2167,16 @@ aarch64_cfun_has_new_state (const char *state_name) return aarch64_fndecl_has_new_state (cfun->decl, state_name); } +/* Return true if PSTATE.SM is 1 in the body of the current function, + but is not guaranteed to be 1 on entry. */ + +static bool +aarch64_cfun_enables_pstate_sm () +{ + return (aarch64_fndecl_is_locally_streaming (cfun->decl) + && aarch64_cfun_incoming_pstate_sm () != AARCH64_FL_SM_ON); +} + /* Return true if the current function has state STATE_NAME, either by creating new state itself or by sharing state with callers. */ @@ -4394,6 +4418,10 @@ aarch64_add_offset_temporaries (rtx x) TEMP2, if nonnull, is a second temporary register that doesn't overlap either DEST or REG. + FORCE_ISA_MODE is AARCH64_FL_SM_ON if any variable component of OFFSET + is measured relative to the SME vector length instead of the current + prevailing vector length. It is 0 otherwise. 
+ Since this function may be used to adjust the stack pointer, we must ensure that it cannot cause transient stack deallocation (for example by first incrementing SP and then decrementing when adjusting by a @@ -4402,6 +4430,7 @@ aarch64_add_offset_temporaries (rtx x) static void aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, poly_int64 offset, rtx temp1, rtx temp2, + aarch64_feature_flags force_isa_mode, bool frame_related_p, bool emit_move_imm = true) { gcc_assert (emit_move_imm || temp1 != NULL_RTX); @@ -4414,9 +4443,18 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, /* Try using ADDVL or ADDPL to add the whole value. */ if (src != const0_rtx && aarch64_sve_addvl_addpl_immediate_p (offset)) { - rtx offset_rtx = gen_int_mode (offset, mode); + gcc_assert (offset.coeffs[0] == offset.coeffs[1]); + rtx offset_rtx; + if (force_isa_mode == 0) + offset_rtx = gen_int_mode (offset, mode); + else + offset_rtx = aarch64_sme_vq_immediate (mode, offset.coeffs[0], 0); rtx_insn *insn = emit_insn (gen_add3_insn (dest, src, offset_rtx)); RTX_FRAME_RELATED_P (insn) = frame_related_p; + if (frame_related_p && (force_isa_mode & AARCH64_FL_SM_ON)) + add_reg_note (insn, REG_CFA_ADJUST_CFA, + gen_rtx_SET (dest, plus_constant (Pmode, src, + offset))); return; } @@ -4432,11 +4470,19 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, if (src != const0_rtx && aarch64_sve_addvl_addpl_immediate_p (poly_offset)) { - rtx offset_rtx = gen_int_mode (poly_offset, mode); + rtx offset_rtx; + if (force_isa_mode == 0) + offset_rtx = gen_int_mode (poly_offset, mode); + else + offset_rtx = aarch64_sme_vq_immediate (mode, factor, 0); if (frame_related_p) { rtx_insn *insn = emit_insn (gen_add3_insn (dest, src, offset_rtx)); RTX_FRAME_RELATED_P (insn) = true; + if (force_isa_mode & AARCH64_FL_SM_ON) + add_reg_note (insn, REG_CFA_ADJUST_CFA, + gen_rtx_SET (dest, plus_constant (Pmode, src, + poly_offset))); src = dest; } else @@ -4467,9 +4513,19 @@ 
aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, rtx val; if (IN_RANGE (rel_factor, -32, 31)) { + if (force_isa_mode & AARCH64_FL_SM_ON) + { + /* Try to use an unshifted RDSVL, otherwise fall back on + a shifted RDSVL #1. */ + if (aarch64_sve_rdvl_addvl_factor_p (factor)) + shift = 0; + else + factor = rel_factor * 16; + val = aarch64_sme_vq_immediate (mode, factor, 0); + } /* Try to use an unshifted CNT[BHWD] or RDVL. */ - if (aarch64_sve_cnt_factor_p (factor) - || aarch64_sve_rdvl_addvl_factor_p (factor)) + else if (aarch64_sve_cnt_factor_p (factor) + || aarch64_sve_rdvl_addvl_factor_p (factor)) { val = gen_int_mode (poly_int64 (factor, factor), mode); shift = 0; @@ -4499,11 +4555,18 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, a shift and add sequence for the multiplication. If CNTB << SHIFT is out of range, stick with the current shift factor. */ - if (IN_RANGE (low_bit, 2, 16 * 16)) + if (force_isa_mode == 0 + && IN_RANGE (low_bit, 2, 16 * 16)) { val = gen_int_mode (poly_int64 (low_bit, low_bit), mode); shift = 0; } + else if ((force_isa_mode & AARCH64_FL_SM_ON) + && aarch64_sve_rdvl_addvl_factor_p (low_bit)) + { + val = aarch64_sme_vq_immediate (mode, low_bit, 0); + shift = 0; + } else val = gen_int_mode (BYTES_PER_SVE_VECTOR, mode); @@ -4591,30 +4654,34 @@ aarch64_split_add_offset (scalar_int_mode mode, rtx dest, rtx src, rtx offset_rtx, rtx temp1, rtx temp2) { aarch64_add_offset (mode, dest, src, rtx_to_poly_int64 (offset_rtx), - temp1, temp2, false); + temp1, temp2, 0, false); } /* Add DELTA to the stack pointer, marking the instructions frame-related. - TEMP1 is available as a temporary if nonnull. EMIT_MOVE_IMM is false - if TEMP1 already contains abs (DELTA). */ + TEMP1 is available as a temporary if nonnull. FORCE_ISA_MODE is as + for aarch64_add_offset. EMIT_MOVE_IMM is false if TEMP1 already + contains abs (DELTA). 
*/ static inline void -aarch64_add_sp (rtx temp1, rtx temp2, poly_int64 delta, bool emit_move_imm) +aarch64_add_sp (rtx temp1, rtx temp2, poly_int64 delta, + aarch64_feature_flags force_isa_mode, bool emit_move_imm) { aarch64_add_offset (Pmode, stack_pointer_rtx, stack_pointer_rtx, delta, - temp1, temp2, true, emit_move_imm); + temp1, temp2, force_isa_mode, true, emit_move_imm); } /* Subtract DELTA from the stack pointer, marking the instructions - frame-related if FRAME_RELATED_P. TEMP1 is available as a temporary - if nonnull. */ + frame-related if FRAME_RELATED_P. FORCE_ISA_MODE is as for + aarch64_add_offset. TEMP1 is available as a temporary if nonnull. */ static inline void -aarch64_sub_sp (rtx temp1, rtx temp2, poly_int64 delta, bool frame_related_p, - bool emit_move_imm = true) +aarch64_sub_sp (rtx temp1, rtx temp2, poly_int64 delta, + aarch64_feature_flags force_isa_mode, + bool frame_related_p, bool emit_move_imm = true) { aarch64_add_offset (Pmode, stack_pointer_rtx, stack_pointer_rtx, -delta, - temp1, temp2, frame_related_p, emit_move_imm); + temp1, temp2, force_isa_mode, frame_related_p, + emit_move_imm); } /* A streaming-compatible function needs to switch temporarily to the known @@ -5640,11 +5707,11 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) { base = aarch64_force_temporary (int_mode, dest, base); aarch64_add_offset (int_mode, dest, base, offset, - NULL_RTX, NULL_RTX, false); + NULL_RTX, NULL_RTX, 0, false); } else aarch64_add_offset (int_mode, dest, base, offset, - dest, NULL_RTX, false); + dest, NULL_RTX, 0, false); } return; } @@ -5671,7 +5738,7 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) gcc_assert (can_create_pseudo_p ()); base = aarch64_force_temporary (int_mode, dest, base); aarch64_add_offset (int_mode, dest, base, const_offset, - NULL_RTX, NULL_RTX, false); + NULL_RTX, NULL_RTX, 0, false); return; } @@ -5711,7 +5778,7 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) gcc_assert(can_create_pseudo_p ()); base = 
aarch64_force_temporary (int_mode, dest, base); aarch64_add_offset (int_mode, dest, base, const_offset, - NULL_RTX, NULL_RTX, false); + NULL_RTX, NULL_RTX, 0, false); return; } /* FALLTHRU */ @@ -7353,6 +7420,9 @@ aarch64_need_old_pstate_sm () if (aarch64_cfun_incoming_pstate_sm () != 0) return false; + if (aarch64_cfun_enables_pstate_sm ()) + return true; + if (cfun->machine->call_switches_pstate_sm) for (auto insn = get_insns (); insn; insn = NEXT_INSN (insn)) if (auto *call = dyn_cast<rtx_call_insn *> (insn)) @@ -7379,6 +7449,7 @@ aarch64_layout_frame (void) bool frame_related_fp_reg_p = false; aarch64_frame &frame = cfun->machine->frame; poly_int64 top_of_locals = -1; + bool enables_pstate_sm = aarch64_cfun_enables_pstate_sm (); vec_safe_truncate (frame.saved_gprs, 0); vec_safe_truncate (frame.saved_fprs, 0); @@ -7416,7 +7487,7 @@ aarch64_layout_frame (void) frame.reg_offset[regno] = SLOT_REQUIRED; for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++) - if (df_regs_ever_live_p (regno) + if ((enables_pstate_sm || df_regs_ever_live_p (regno)) && !fixed_regs[regno] && !crtl->abi->clobbers_full_reg_p (regno)) { @@ -7445,7 +7516,7 @@ aarch64_layout_frame (void) } for (regno = P0_REGNUM; regno <= P15_REGNUM; regno++) - if (df_regs_ever_live_p (regno) + if ((enables_pstate_sm || df_regs_ever_live_p (regno)) && !fixed_regs[regno] && !crtl->abi->clobbers_full_reg_p (regno)) frame.reg_offset[regno] = SLOT_REQUIRED; @@ -7562,7 +7633,8 @@ aarch64_layout_frame (void) /* If the current function changes the SVE vector length, ensure that the old value of the DWARF VG register is saved and available in the CFI, so that outer frames with VL-sized offsets can be processed correctly.
*/ - if (cfun->machine->call_switches_pstate_sm) + if (cfun->machine->call_switches_pstate_sm + || aarch64_cfun_enables_pstate_sm ()) { frame.reg_offset[VG_REGNUM] = offset; offset += UNITS_PER_WORD; @@ -8390,9 +8462,16 @@ aarch64_get_separate_components (void) bitmap_clear (components); /* The registers we need saved to the frame. */ + bool enables_pstate_sm = aarch64_cfun_enables_pstate_sm (); for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++) if (aarch64_register_saved_on_entry (regno)) { + /* Disallow shrink wrapping for registers that will be clobbered + by an SMSTART SM in the prologue. */ + if (enables_pstate_sm + && (FP_REGNUM_P (regno) || PR_REGNUM_P (regno))) + continue; + /* Punt on saves and restores that use ST1D and LD1D. We could try to be smarter, but it would involve making sure that the spare predicate register itself is safe to use at the save @@ -8711,11 +8790,16 @@ aarch64_emit_stack_tie (rtx reg) events, e.g. if we were to allow the stack to be dropped by more than a page and then have multiple probes up and we take a signal somewhere in between then the signal handler doesn't know the state of the stack and can make no - assumptions about which pages have been probed. */ + assumptions about which pages have been probed. + + FORCE_ISA_MODE is AARCH64_FL_SM_ON if any variable component of POLY_SIZE + is measured relative to the SME vector length instead of the current + prevailing vector length. It is 0 otherwise. 
*/ static void aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, poly_int64 poly_size, + aarch64_feature_flags force_isa_mode, bool frame_related_p, bool final_adjustment_p) { @@ -8757,7 +8841,8 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, if (known_lt (poly_size, min_probe_threshold) || !flag_stack_clash_protection) { - aarch64_sub_sp (temp1, temp2, poly_size, frame_related_p); + aarch64_sub_sp (temp1, temp2, poly_size, force_isa_mode, + frame_related_p); return; } @@ -8774,7 +8859,8 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, /* First calculate the amount of bytes we're actually spilling. */ aarch64_add_offset (Pmode, temp1, CONST0_RTX (Pmode), - poly_size, temp1, temp2, false, true); + poly_size, temp1, temp2, force_isa_mode, + false, true); rtx_insn *insn = get_last_insn (); @@ -8832,7 +8918,7 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, { for (HOST_WIDE_INT i = 0; i < rounded_size; i += guard_size) { - aarch64_sub_sp (NULL, temp2, guard_size, true); + aarch64_sub_sp (NULL, temp2, guard_size, force_isa_mode, true); emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx, guard_used_by_caller)); emit_insn (gen_blockage ()); @@ -8843,7 +8929,7 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, { /* Compute the ending address. 
*/ aarch64_add_offset (Pmode, temp1, stack_pointer_rtx, -rounded_size, - temp1, NULL, false, true); + temp1, NULL, force_isa_mode, false, true); rtx_insn *insn = get_last_insn (); /* For the initial allocation, we don't have a frame pointer @@ -8909,7 +8995,7 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, if (final_adjustment_p && rounded_size != 0) min_probe_threshold = 0; - aarch64_sub_sp (temp1, temp2, residual, frame_related_p); + aarch64_sub_sp (temp1, temp2, residual, force_isa_mode, frame_related_p); if (residual >= min_probe_threshold) { if (dump_file) @@ -8974,6 +9060,14 @@ aarch64_epilogue_uses (int regno) return 0; } +/* Implement TARGET_USE_LATE_PROLOGUE_EPILOGUE. */ + +static bool +aarch64_use_late_prologue_epilogue () +{ + return aarch64_cfun_enables_pstate_sm (); +} + /* The current function's frame has a save slot for the incoming state of SVCR. Return a legitimate memory for the slot, based on the hard frame pointer. */ @@ -9110,6 +9204,9 @@ aarch64_expand_prologue (void) unsigned reg2 = frame.wb_push_candidate2; bool emit_frame_chain = frame.emit_frame_chain; rtx_insn *insn; + aarch64_feature_flags force_isa_mode = 0; + if (aarch64_cfun_enables_pstate_sm ()) + force_isa_mode = AARCH64_FL_SM_ON; if (flag_stack_clash_protection && known_eq (callee_adjust, 0)) { @@ -9171,7 +9268,7 @@ aarch64_expand_prologue (void) less the amount of the guard reserved for use by the caller's outgoing args. 
*/ aarch64_allocate_and_probe_stack_space (tmp0_rtx, tmp1_rtx, initial_adjust, - true, false); + force_isa_mode, true, false); if (callee_adjust != 0) aarch64_push_regs (reg1, reg2, callee_adjust); @@ -9194,7 +9291,8 @@ aarch64_expand_prologue (void) gcc_assert (known_eq (chain_offset, 0)); aarch64_add_offset (Pmode, hard_frame_pointer_rtx, stack_pointer_rtx, chain_offset, - tmp1_rtx, tmp0_rtx, frame_pointer_needed); + tmp1_rtx, tmp0_rtx, force_isa_mode, + frame_pointer_needed); if (frame_pointer_needed && !frame_size.is_constant ()) { /* Variable-sized frames need to describe the save slot @@ -9241,6 +9339,7 @@ aarch64_expand_prologue (void) || known_eq (initial_adjust, 0)); aarch64_allocate_and_probe_stack_space (tmp1_rtx, tmp0_rtx, sve_callee_adjust, + force_isa_mode, !frame_pointer_needed, false); bytes_below_sp -= sve_callee_adjust; } @@ -9253,12 +9352,15 @@ aarch64_expand_prologue (void) that is assumed by the called. */ gcc_assert (known_eq (bytes_below_sp, final_adjust)); aarch64_allocate_and_probe_stack_space (tmp1_rtx, tmp0_rtx, final_adjust, + force_isa_mode, !frame_pointer_needed, true); if (emit_frame_chain && maybe_ne (final_adjust, 0)) aarch64_emit_stack_tie (hard_frame_pointer_rtx); - /* Save the incoming value of PSTATE.SM, if required. */ - if (known_ge (frame.old_svcr_offset, 0)) + /* Save the incoming value of PSTATE.SM, if required. Code further + down does this for locally-streaming functions. */ + if (known_ge (frame.old_svcr_offset, 0) + && !aarch64_cfun_enables_pstate_sm ()) { rtx mem = aarch64_old_svcr_mem (); MEM_VOLATILE_P (mem) = 1; @@ -9290,6 +9392,34 @@ aarch64_expand_prologue (void) emit_move_insn (gen_rtx_REG (DImode, R1_REGNUM), old_r1); } } + + /* Enable PSTATE.SM, if required. */ + if (aarch64_cfun_enables_pstate_sm ()) + { + rtx_insn *guard_label = nullptr; + if (known_ge (cfun->machine->frame.old_svcr_offset, 0)) + { + /* The current function is streaming-compatible. Save the + original state of PSTATE.SM. 
*/ + rtx svcr = gen_rtx_REG (DImode, IP0_REGNUM); + emit_insn (gen_aarch64_read_svcr (svcr)); + emit_move_insn (aarch64_old_svcr_mem (), svcr); + guard_label = aarch64_guard_switch_pstate_sm (svcr, + aarch64_isa_flags); + } + aarch64_sme_mode_switch_regs args_switch; + auto &args = crtl->args.info; + for (unsigned int i = 0; i < args.num_sme_mode_switch_args; ++i) + { + rtx x = args.sme_mode_switch_args[i]; + args_switch.add_reg (GET_MODE (x), REGNO (x)); + } + args_switch.emit_prologue (); + emit_insn (gen_aarch64_smstart_sm ()); + args_switch.emit_epilogue (); + if (guard_label) + emit_label (guard_label); + } } /* Return TRUE if we can use a simple_return insn. @@ -9336,6 +9466,9 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) HOST_WIDE_INT guard_size = 1 << param_stack_clash_protection_guard_size; HOST_WIDE_INT guard_used_by_caller = STACK_CLASH_CALLER_GUARD; + aarch64_feature_flags force_isa_mode = 0; + if (aarch64_cfun_enables_pstate_sm ()) + force_isa_mode = AARCH64_FL_SM_ON; /* We can re-use the registers when: @@ -9360,6 +9493,24 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) = maybe_ne (get_frame_size () + frame.saved_varargs_size, 0); + /* Reset PSTATE.SM, if required. */ + if (aarch64_cfun_enables_pstate_sm ()) + { + rtx_insn *guard_label = nullptr; + if (known_ge (cfun->machine->frame.old_svcr_offset, 0)) + guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, + aarch64_isa_flags); + aarch64_sme_mode_switch_regs return_switch; + if (crtl->return_rtx && REG_P (crtl->return_rtx)) + return_switch.add_reg (GET_MODE (crtl->return_rtx), + REGNO (crtl->return_rtx)); + return_switch.emit_prologue (); + emit_insn (gen_aarch64_smstop_sm ()); + return_switch.emit_epilogue (); + if (guard_label) + emit_label (guard_label); + } + /* Emit a barrier to prevent loads from a deallocated stack. 
*/ if (maybe_gt (final_adjust, crtl->outgoing_args_size) || cfun->calls_alloca @@ -9380,19 +9531,21 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) aarch64_add_offset (Pmode, stack_pointer_rtx, hard_frame_pointer_rtx, -bytes_below_hard_fp + final_adjust, - tmp1_rtx, tmp0_rtx, callee_adjust == 0); + tmp1_rtx, tmp0_rtx, force_isa_mode, + callee_adjust == 0); else /* The case where we need to re-use the register here is very rare, so avoid the complicated condition and just always emit a move if the immediate doesn't fit. */ - aarch64_add_sp (tmp1_rtx, tmp0_rtx, final_adjust, true); + aarch64_add_sp (tmp1_rtx, tmp0_rtx, final_adjust, force_isa_mode, true); /* Restore the vector registers before the predicate registers, so that we can use P4 as a temporary for big-endian SVE frames. */ aarch64_restore_callee_saves (final_adjust, frame.saved_fprs, &cfi_ops); aarch64_restore_callee_saves (final_adjust, frame.saved_prs, &cfi_ops); if (maybe_ne (sve_callee_adjust, 0)) - aarch64_add_sp (NULL_RTX, NULL_RTX, sve_callee_adjust, true); + aarch64_add_sp (NULL_RTX, NULL_RTX, sve_callee_adjust, + force_isa_mode, true); /* When shadow call stack is enabled, the scs_pop in the epilogue will restore x30, we don't need to restore x30 again in the traditional @@ -9422,7 +9575,7 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) /* Liveness of EP0_REGNUM can not be trusted across function calls either, so add restriction on emit_move optimization to leaf functions. 
*/ - aarch64_add_sp (tmp0_rtx, tmp1_rtx, initial_adjust, + aarch64_add_sp (tmp0_rtx, tmp1_rtx, initial_adjust, force_isa_mode, (!can_inherit_p || !crtl->is_leaf || df_regs_ever_live_p (EP0_REGNUM))); @@ -9532,7 +9685,8 @@ aarch64_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, temp1 = gen_rtx_REG (Pmode, EP1_REGNUM); if (vcall_offset == 0) - aarch64_add_offset (Pmode, this_rtx, this_rtx, delta, temp1, temp0, false); + aarch64_add_offset (Pmode, this_rtx, this_rtx, delta, temp1, temp0, + 0, false); else { gcc_assert ((vcall_offset & (POINTER_BYTES - 1)) == 0); @@ -9545,7 +9699,7 @@ aarch64_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, plus_constant (Pmode, this_rtx, delta)); else aarch64_add_offset (Pmode, this_rtx, this_rtx, delta, - temp1, temp0, false); + temp1, temp0, 0, false); } if (Pmode == ptr_mode) @@ -28996,6 +29150,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_EXTRA_LIVE_ON_ENTRY #define TARGET_EXTRA_LIVE_ON_ENTRY aarch64_extra_live_on_entry +#undef TARGET_USE_LATE_PROLOGUE_EPILOGUE +#define TARGET_USE_LATE_PROLOGUE_EPILOGUE aarch64_use_late_prologue_epilogue + #undef TARGET_EMIT_EPILOGUE_FOR_SIBCALL #define TARGET_EMIT_EPILOGUE_FOR_SIBCALL aarch64_expand_epilogue diff --git a/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C b/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C index 8b0755014cc..dc5c097bd52 100644 --- a/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C +++ b/gcc/testsuite/g++.target/aarch64/sme/keyword_macros_1.C @@ -7,3 +7,4 @@ void f4 () __arm_out("za"); void f5 () __arm_inout("za"); void f6 () __arm_preserves("za"); __arm_new("za") void f7 () {} +__arm_locally_streaming void f8 () {} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c b/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c index fcabe3edc55..22f5facfdf9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sme/keyword_macros_1.c @@ -7,3 +7,4 @@ 
void f4 () __arm_out("za"); void f5 () __arm_inout("za"); void f6 () __arm_preserves("za"); __arm_new("za") void f7 () {} +__arm_locally_streaming void f8 () {} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_1.c b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_1.c new file mode 100644 index 00000000000..20ff4b87d94 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_1.c @@ -0,0 +1,466 @@ +// { dg-options "-O -fomit-frame-pointer" } +// { dg-final { check-function-bodies "**" "" } } + +void consume_za () [[arm::streaming, arm::inout("za")]]; + +/* +** n_ls: +** sub sp, sp, #?80 +** cntd x16 +** str x16, \[sp\] +** stp d8, d9, \[sp, #?16\] +** stp d10, d11, \[sp, #?32\] +** stp d12, d13, \[sp, #?48\] +** stp d14, d15, \[sp, #?64\] +** smstart sm +** smstop sm +** ldp d8, d9, \[sp, #?16\] +** ldp d10, d11, \[sp, #?32\] +** ldp d12, d13, \[sp, #?48\] +** ldp d14, d15, \[sp, #?64\] +** add sp, sp, #?80 +** ret +*/ +[[arm::locally_streaming]] void +n_ls () +{ + asm (""); +} + +/* +** s_ls: +** ret +*/ +[[arm::locally_streaming]] void +s_ls () [[arm::streaming]] +{ + asm (""); +} + +/* +** sc_ls: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** cntd x16 +** str x16, \[sp, #?24\] +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** tbnz x16, 0, [^\n]+ +** smstart sm +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, [^\n]+ +** smstop sm +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +[[arm::locally_streaming]] void +sc_ls () [[arm::streaming_compatible]] +{ + asm (""); +} + +/* +** n_ls_new_za: +** str x30, \[sp, #?-80\]! 
+** cntd x16 +** str x16, \[sp, #?8\] +** stp d8, d9, \[sp, #?16\] +** stp d10, d11, \[sp, #?32\] +** stp d12, d13, \[sp, #?48\] +** stp d14, d15, \[sp, #?64\] +** smstart sm +** mrs (x[0-9]+), tpidr2_el0 +** cbz \1, [^\n]+ +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** zero { za } +** smstart za +** bl consume_za +** smstop za +** smstop sm +** ldp d8, d9, \[sp, #?16\] +** ldp d10, d11, \[sp, #?32\] +** ldp d12, d13, \[sp, #?48\] +** ldp d14, d15, \[sp, #?64\] +** ldr x30, \[sp\], #?80 +** ret +*/ +[[arm::locally_streaming, arm::new("za")]] void +n_ls_new_za () +{ + consume_za (); + asm (""); +} + +/* +** s_ls_new_za: +** str x30, \[sp, #?-16\]! +** mrs (x[0-9]+), tpidr2_el0 +** cbz \1, [^\n]+ +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** zero { za } +** smstart za +** bl consume_za +** smstop za +** ldr x30, \[sp\], #?16 +** ret +*/ +[[arm::locally_streaming, arm::new("za")]] void +s_ls_new_za () [[arm::streaming]] +{ + consume_za (); + asm (""); +} + +/* +** sc_ls_new_za: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** cntd x16 +** str x16, \[sp, #?24\] +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** tbnz x16, 0, [^\n]+ +** smstart sm +** mrs (x[0-9]+), tpidr2_el0 +** cbz \1, [^\n]+ +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** zero { za } +** smstart za +** bl consume_za +** smstop za +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, [^\n]+ +** smstop sm +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +[[arm::locally_streaming, arm::new("za")]] void +sc_ls_new_za () [[arm::streaming_compatible]] +{ + consume_za (); + asm (""); +} + +/* +** n_ls_shared_za: +** str x30, \[sp, #?-80\]! 
+** cntd x16 +** str x16, \[sp, #?8\] +** stp d8, d9, \[sp, #?16\] +** stp d10, d11, \[sp, #?32\] +** stp d12, d13, \[sp, #?48\] +** stp d14, d15, \[sp, #?64\] +** smstart sm +** bl consume_za +** smstop sm +** ldp d8, d9, \[sp, #?16\] +** ldp d10, d11, \[sp, #?32\] +** ldp d12, d13, \[sp, #?48\] +** ldp d14, d15, \[sp, #?64\] +** ldr x30, \[sp\], #?80 +** ret +*/ +[[arm::locally_streaming]] void +n_ls_shared_za () [[arm::inout("za")]] +{ + consume_za (); + asm (""); +} + +/* +** s_ls_shared_za: +** str x30, \[sp, #?-16\]! +** bl consume_za +** ldr x30, \[sp\], #?16 +** ret +*/ +[[arm::locally_streaming]] void +s_ls_shared_za () [[arm::streaming, arm::inout("za")]] +{ + consume_za (); + asm (""); +} + +/* +** sc_ls_shared_za: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** cntd x16 +** str x16, \[sp, #?24\] +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** tbnz x16, 0, [^\n]+ +** smstart sm +** bl consume_za +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, [^\n]+ +** smstop sm +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +[[arm::locally_streaming]] void +sc_ls_shared_za () [[arm::streaming_compatible, arm::inout("za")]] +{ + consume_za (); + asm (""); +} + +/* +** n_ls_vector_pcs: +** sub sp, sp, #?272 +** cntd x16 +** str x16, \[sp\] +** stp q8, q9, \[sp, #?16\] +** stp q10, q11, \[sp, #?48\] +** stp q12, q13, \[sp, #?80\] +** stp q14, q15, \[sp, #?112\] +** stp q16, q17, \[sp, #?144\] +** stp q18, q19, \[sp, #?176\] +** stp q20, q21, \[sp, #?208\] +** stp q22, q23, \[sp, #?240\] +** smstart sm +** smstop sm +** ldp q8, q9, \[sp, #?16\] +** ldp q10, q11, \[sp, #?48\] +** ldp q12, q13, \[sp, #?80\] +** ldp q14, q15, \[sp, #?112\] +** ldp q16, q17, \[sp, #?144\] +** ldp q18, q19, \[sp, #?176\] +** ldp q20, q21, \[sp, #?208\] +** 
ldp q22, q23, \[sp, #?240\] +** add sp, sp, #?272 +** ret +*/ +[[arm::locally_streaming]] void __attribute__((aarch64_vector_pcs)) +n_ls_vector_pcs () +{ + asm (""); +} + +/* +** n_ls_sve_pcs: +** sub sp, sp, #?16 +** cntd x16 +** str x16, \[sp\] +** addsvl sp, sp, #-18 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str p12, \[sp, #8, mul vl\] +** str p13, \[sp, #9, mul vl\] +** str p14, \[sp, #10, mul vl\] +** str p15, \[sp, #11, mul vl\] +** str z8, \[sp, #2, mul vl\] +** str z9, \[sp, #3, mul vl\] +** str z10, \[sp, #4, mul vl\] +** str z11, \[sp, #5, mul vl\] +** str z12, \[sp, #6, mul vl\] +** str z13, \[sp, #7, mul vl\] +** str z14, \[sp, #8, mul vl\] +** str z15, \[sp, #9, mul vl\] +** str z16, \[sp, #10, mul vl\] +** str z17, \[sp, #11, mul vl\] +** str z18, \[sp, #12, mul vl\] +** str z19, \[sp, #13, mul vl\] +** str z20, \[sp, #14, mul vl\] +** str z21, \[sp, #15, mul vl\] +** str z22, \[sp, #16, mul vl\] +** str z23, \[sp, #17, mul vl\] +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** smstop sm +** ldr z8, \[sp, #2, mul vl\] +** ldr z9, \[sp, #3, mul vl\] +** ldr z10, \[sp, #4, mul vl\] +** ldr z11, \[sp, #5, mul vl\] +** ldr z12, \[sp, #6, mul vl\] +** ldr z13, \[sp, #7, mul vl\] +** ldr z14, \[sp, #8, mul vl\] +** ldr z15, \[sp, #9, mul vl\] +** ldr z16, \[sp, #10, mul vl\] +** ldr z17, \[sp, #11, mul vl\] +** ldr z18, \[sp, #12, mul vl\] +** ldr z19, \[sp, #13, mul vl\] +** ldr z20, \[sp, #14, mul vl\] +** ldr z21, \[sp, #15, mul vl\] +** ldr z22, \[sp, #16, mul vl\] +** ldr z23, \[sp, #17, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr 
p11, \[sp, #7, mul vl\] +** ldr p12, \[sp, #8, mul vl\] +** ldr p13, \[sp, #9, mul vl\] +** ldr p14, \[sp, #10, mul vl\] +** ldr p15, \[sp, #11, mul vl\] +** addsvl sp, sp, #18 +** add sp, sp, #?16 +** ret +*/ +[[arm::locally_streaming]] void +n_ls_sve_pcs (__SVBool_t x) +{ + asm (""); +} + +/* +** n_ls_v0: +** addsvl sp, sp, #-1 +** ... +** smstart sm +** add x[0-9]+, [^\n]+ +** smstop sm +** ... +** addsvl sp, sp, #1 +** ... +*/ +#define TEST(VN) __SVInt32_t VN; asm ("" :: "r" (&VN)); +[[arm::locally_streaming]] void +n_ls_v0 () +{ + TEST (v0); +} + +/* +** n_ls_v32: +** addsvl sp, sp, #-32 +** ... +** smstart sm +** ... +** smstop sm +** ... +** rdsvl (x[0-9]+), #1 +** lsl (x[0-9]+), \1, #?5 +** add sp, sp, \2 +** ... +*/ +[[arm::locally_streaming]] void +n_ls_v32 () +{ + TEST (v0); + TEST (v1); + TEST (v2); + TEST (v3); + TEST (v4); + TEST (v5); + TEST (v6); + TEST (v7); + TEST (v8); + TEST (v9); + TEST (v10); + TEST (v11); + TEST (v12); + TEST (v13); + TEST (v14); + TEST (v15); + TEST (v16); + TEST (v17); + TEST (v18); + TEST (v19); + TEST (v20); + TEST (v21); + TEST (v22); + TEST (v23); + TEST (v24); + TEST (v25); + TEST (v26); + TEST (v27); + TEST (v28); + TEST (v29); + TEST (v30); + TEST (v31); +} + +/* +** n_ls_v33: +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), #?33 +** mul (x[0-9]+), (?:\1, \2|\2, \1) +** sub sp, sp, \3 +** ... +** smstart sm +** ... +** smstop sm +** ... +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), #?33 +** mul (x[0-9]+), (?:\4, \5|\5, \4) +** add sp, sp, \6 +** ... 
+*/ +[[arm::locally_streaming]] void +n_ls_v33 () +{ + TEST (v0); + TEST (v1); + TEST (v2); + TEST (v3); + TEST (v4); + TEST (v5); + TEST (v6); + TEST (v7); + TEST (v8); + TEST (v9); + TEST (v10); + TEST (v11); + TEST (v12); + TEST (v13); + TEST (v14); + TEST (v15); + TEST (v16); + TEST (v17); + TEST (v18); + TEST (v19); + TEST (v20); + TEST (v21); + TEST (v22); + TEST (v23); + TEST (v24); + TEST (v25); + TEST (v26); + TEST (v27); + TEST (v28); + TEST (v29); + TEST (v30); + TEST (v31); + TEST (v32); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_2.c b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_2.c new file mode 100644 index 00000000000..0eba993855f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_2.c @@ -0,0 +1,177 @@ +// { dg-options "-O -fomit-frame-pointer" } +// { dg-final { check-function-bodies "**" "" } } + +#include +#include + +/* +** test_d0: +** ... +** smstart sm +** ... +** fmov x10, d0 +** smstop sm +** fmov d0, x10 +** ... +*/ +[[arm::locally_streaming]] double +test_d0 () +{ + asm (""); + return 1.0f; +} + +/* +** test_d0_vec: +** ... +** smstart sm +** ... +** ( +** fmov x10, d0 +** | +** umov x10, v0.d\[0\] +** ) +** smstop sm +** fmov d0, x10 +** ... +*/ +[[arm::locally_streaming]] int8x8_t +test_d0_vec () +{ + asm (""); + return (int8x8_t) {}; +} + +/* +** test_q0: +** ... +** smstart sm +** ... +** str q0, \[sp, #?-16\]! +** smstop sm +** ldr q0, \[sp\], #?16 +** ... +*/ +[[arm::locally_streaming]] int8x16_t +test_q0 () +{ + asm (""); + return (int8x16_t) {}; +} + +/* +** test_q1: +** ... +** smstart sm +** ... +** stp q0, q1, \[sp, #?-32\]! +** smstop sm +** ldp q0, q1, \[sp\], #?32 +** ... +*/ +[[arm::locally_streaming]] int8x16x2_t +test_q1 () +{ + asm (""); + return (int8x16x2_t) {}; +} + +/* +** test_q2: +** ... +** smstart sm +** ... +** stp q0, q1, \[sp, #?-48\]! +** str q2, \[sp, #?32\] +** smstop sm +** ldr q2, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?48 +** ... 
+*/ +[[arm::locally_streaming]] int8x16x3_t +test_q2 () +{ + asm (""); + return (int8x16x3_t) {}; +} + +/* +** test_q3: +** ... +** smstart sm +** ... +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** ... +*/ +[[arm::locally_streaming]] int8x16x4_t +test_q3 () +{ + asm (""); + return (int8x16x4_t) {}; +} + +/* +** test_z0: +** ... +** smstart sm +** mov z0\.b, #0 +** addvl sp, sp, #-1 +** str z0, \[sp\] +** smstop sm +** ldr z0, \[sp\] +** addvl sp, sp, #1 +** ... +*/ +[[arm::locally_streaming]] svint8_t +test_z0 () +{ + asm (""); + return (svint8_t) {}; +} + +/* +** test_z3: +** ... +** smstart sm +** ... +** addvl sp, sp, #-4 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** ... +*/ +[[arm::locally_streaming]] svint8x4_t +test_z3 () +{ + asm (""); + return (svint8x4_t) {}; +} + +/* +** test_p0: +** ... +** smstart sm +** pfalse p0\.b +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** ... +*/ +[[arm::locally_streaming]] svbool_t +test_p0 () +{ + asm (""); + return (svbool_t) {}; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_3.c b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_3.c new file mode 100644 index 00000000000..2bdea6ac631 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_3.c @@ -0,0 +1,273 @@ +// { dg-options "-O -fomit-frame-pointer" } +// { dg-final { check-function-bodies "**" "" } } + +#include +#include + +/* +** test_d0: +** ... +** fmov x10, d0 +** smstart sm +** fmov d0, x10 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_d0 (double d0) +{ + asm (""); +} + +/* +** test_d7: +** ... 
+** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** smstart sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_d7 (double d0, double d1, double d2, double d3, + double d4, double d5, double d6, double d7) +{ + asm (""); +} + +/* +** test_d0_vec: +** ... +** ( +** fmov x10, d0 +** | +** umov x10, v0.d\[0\] +** ) +** smstart sm +** fmov d0, x10 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_d0_vec (int8x8_t d0) +{ + asm (""); +} + +/* +** test_d7_vec: +** ... +** ( +** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** | +** umov x10, v0.d\[0\] +** umov x11, v1.d\[0\] +** umov x12, v2.d\[0\] +** umov x13, v3.d\[0\] +** umov x14, v4.d\[0\] +** umov x15, v5.d\[0\] +** umov x16, v6.d\[0\] +** umov x17, v7.d\[0\] +** ) +** smstart sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_d7_vec (int8x8_t d0, int8x8_t d1, int8x8_t d2, int8x8_t d3, + int8x8_t d4, int8x8_t d5, int8x8_t d6, int8x8_t d7) +{ + asm (""); +} + +/* +** test_q0: +** ... +** str q0, \[sp, #?-16\]! +** smstart sm +** ldr q0, \[sp\], #?16 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_q0 (int8x16_t q0) +{ + asm (""); +} + +/* +** test_q7: +** ... +** stp q0, q1, \[sp, #?-128\]! +** stp q2, q3, \[sp, #?32\] +** stp q4, q5, \[sp, #?64\] +** stp q6, q7, \[sp, #?96\] +** smstart sm +** ldp q2, q3, \[sp, #?32\] +** ldp q4, q5, \[sp, #?64\] +** ldp q6, q7, \[sp, #?96\] +** ldp q0, q1, \[sp\], #?128 +** smstop sm +** ... 
+*/ +[[arm::locally_streaming]] void +test_q7 (int8x16x4_t q0, int8x16x4_t q4) +{ + asm (""); +} + +/* +** test_z0: +** ... +** addvl sp, sp, #-1 +** str z0, \[sp\] +** smstart sm +** ldr z0, \[sp\] +** addvl sp, sp, #1 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_z0 (svint8_t z0) +{ + asm (""); +} + +/* +** test_z7: +** ... +** addvl sp, sp, #-8 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** str z4, \[sp, #4, mul vl\] +** str z5, \[sp, #5, mul vl\] +** str z6, \[sp, #6, mul vl\] +** str z7, \[sp, #7, mul vl\] +** smstart sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** ldr z4, \[sp, #4, mul vl\] +** ldr z5, \[sp, #5, mul vl\] +** ldr z6, \[sp, #6, mul vl\] +** ldr z7, \[sp, #7, mul vl\] +** addvl sp, sp, #8 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_z7 (svint8x4_t z0, svint8x4_t z4) +{ + asm (""); +} + +/* +** test_p0: +** ... +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_p0 (svbool_t p0) +{ + asm (""); +} + +/* +** test_p3: +** ... +** addvl sp, sp, #-1 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** smstart sm +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** addvl sp, sp, #1 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_p3 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm (""); +} + +/* +** test_mixed: +** ... +** addvl sp, sp, #-3 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** str z3, \[sp, #1, mul vl\] +** str z7, \[sp, #2, mul vl\] +** stp q2, q6, \[sp, #?-32\]! 
+** fmov w10, s0 +** fmov x11, d1 +** fmov w12, s4 +** fmov x13, d5 +** smstart sm +** fmov s0, w10 +** fmov d1, x11 +** fmov s4, w12 +** fmov d5, x13 +** ldp q2, q6, \[sp\], #?32 +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** ldr z3, \[sp, #1, mul vl\] +** ldr z7, \[sp, #2, mul vl\] +** addvl sp, sp, #3 +** smstop sm +** ... +*/ +[[arm::locally_streaming]] void +test_mixed (float s0, double d1, float32x4_t q2, svfloat32_t z3, + float s4, double d5, float64x2_t q6, svfloat64_t z7, + svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm (""); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_4.c b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_4.c new file mode 100644 index 00000000000..42adeb152e9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_4.c @@ -0,0 +1,145 @@ +// { dg-options "-O -fomit-frame-pointer" } +/* { dg-final { check-function-bodies "**" "" } } */ + +#include +#include + +/* +** test_d0: +** ... +** smstart sm +** ... +** fmov x10, d0 +** smstop sm +** fmov d0, x10 +** ... +** smstart sm +** ... +** smstop sm +** ... +*/ +void consume_d0 (double d0); + +__arm_locally_streaming void +test_d0 () +{ + asm (""); + consume_d0 (1.0); + asm (""); +} + +/* +** test_d7: +** ... +** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** smstop sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** ... +*/ +void consume_d7 (double d0, double d1, double d2, double d3, + double d4, double d5, double d6, double d7); +__arm_locally_streaming void +test_d7 () +{ + asm (""); + consume_d7 (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0); + asm (""); +} + +/* +** test_q7: +** ... +** stp q0, q1, \[sp, #?-128\]! 
+** stp q2, q3, \[sp, #?32\] +** stp q4, q5, \[sp, #?64\] +** stp q6, q7, \[sp, #?96\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q4, q5, \[sp, #?64\] +** ldp q6, q7, \[sp, #?96\] +** ldp q0, q1, \[sp\], #?128 +** ... +*/ +void consume_q7 (int8x16x4_t q0, int8x16x4_t q4); + +__arm_locally_streaming void +test_q7 (int8x16x4_t *ptr) +{ + asm (""); + consume_q7 (ptr[0], ptr[1]); + asm (""); +} + +/* +** test_z7: +** ... +** addvl sp, sp, #-8 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** str z4, \[sp, #4, mul vl\] +** str z5, \[sp, #5, mul vl\] +** str z6, \[sp, #6, mul vl\] +** str z7, \[sp, #7, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** ldr z4, \[sp, #4, mul vl\] +** ldr z5, \[sp, #5, mul vl\] +** ldr z6, \[sp, #6, mul vl\] +** ldr z7, \[sp, #7, mul vl\] +** addvl sp, sp, #8 +** ... +*/ +void consume_z7 (svint8x4_t z0, svint8x4_t z4); + +__arm_locally_streaming void +test_z7 (svint8x4_t *ptr1, svint8x4_t *ptr2) +{ + asm (""); + consume_z7 (*ptr1, *ptr2); + asm (""); +} + +/* +** test_p3: +** ... +** addvl sp, sp, #-1 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** smstop sm +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** addvl sp, sp, #1 +** ... 
+*/ +void consume_p3 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3); + +__arm_locally_streaming void +test_p3 (svbool_t *ptr1, svbool_t *ptr2, svbool_t *ptr3, svbool_t *ptr4) +{ + asm (""); + consume_p3 (*ptr1, *ptr2, *ptr3, *ptr4); + asm (""); +} From patchwork Tue Dec 5 10:13:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872046 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxLw5g21z1ySd for ; Tue, 5 Dec 2023 21:18:16 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CA14338312F8 for ; Tue, 5 Dec 2023 10:17:49 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 98DF63861871 for ; Tue, 5 Dec 2023 10:13:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 98DF63861871 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 98DF63861871 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 23/25] aarch64: Handle PSTATE.SM across abnormal edges
Date: Tue, 5 Dec 2023 10:13:21 +0000
Message-Id: <20231205101323.1914247-24-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>
MIME-Version: 1.0

PSTATE.SM is always off on entry to an exception handler, and on entry
to a nonlocal goto receiver. Those entry points need to switch
PSTATE.SM back to the appropriate state for the current function.
In the case of streaming-compatible functions, they need to restore
the mode that the caller was originally using.

The requirement on nonlocal goto receivers means that nonlocal jumps
need to ensure that PSTATE.SM is zero.

gcc/
	* config/aarch64/aarch64.cc: Include except.h
	(aarch64_sme_mode_switch_regs::add_call_preserved_reg): New function.
	(aarch64_sme_mode_switch_regs::add_call_preserved_regs): Likewise.
	(aarch64_need_old_pstate_sm): Return true if the function has
	a nonlocal-goto or exception receiver.
	(aarch64_switch_pstate_sm_for_landing_pad): New function.
	(aarch64_switch_pstate_sm_for_jump): Likewise.
	(pass_switch_pstate_sm::gate): Enable the pass for all
	streaming and streaming-compatible functions.
	(pass_switch_pstate_sm::execute): Handle non-local gotos and their
	receivers. Handle exception handler entry points.

gcc/testsuite/
	* g++.target/aarch64/sme/exceptions_2.C: New test.
	* gcc.target/aarch64/sme/nonlocal_goto_1.c: Likewise.
	* gcc.target/aarch64/sme/nonlocal_goto_2.c: Likewise.
	* gcc.target/aarch64/sme/nonlocal_goto_3.c: Likewise.
	* gcc.target/aarch64/sme/nonlocal_goto_4.c: Likewise.
	* gcc.target/aarch64/sme/nonlocal_goto_5.c: Likewise.
	* gcc.target/aarch64/sme/nonlocal_goto_6.c: Likewise.
	* gcc.target/aarch64/sme/nonlocal_goto_7.c: Likewise.
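
The rule described above can be sketched as a small decision table. This is only an illustrative model in plain C, not code from the patch: the names body_mode, sm_action, model_landing_pad_action and model_nonlocal_jump_action are hypothetical, and entry_sm stands for the PSTATE.SM value that a streaming-compatible function is assumed to have saved on entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout: this models the rule, not GCC's code.  */
enum body_mode
{
  BODY_NON_STREAMING,        /* body runs with PSTATE.SM == 0 */
  BODY_STREAMING,            /* body runs with PSTATE.SM == 1 */
  BODY_STREAMING_COMPATIBLE  /* body runs in whatever mode the caller used */
};

enum sm_action { SM_NO_CHANGE, SM_START, SM_STOP };

/* Action needed at an exception handler or nonlocal goto receiver,
   both of which are entered with PSTATE.SM == 0.  ENTRY_SM is the
   PSTATE.SM value saved on function entry; it only matters for
   streaming-compatible bodies.  */
static enum sm_action
model_landing_pad_action (enum body_mode mode, bool entry_sm)
{
  switch (mode)
    {
    case BODY_NON_STREAMING:
      return SM_NO_CHANGE;
    case BODY_STREAMING:
      return SM_START;
    case BODY_STREAMING_COMPATIBLE:
      /* Restore the caller's original mode.  */
      return entry_sm ? SM_START : SM_NO_CHANGE;
    }
  assert (!"unknown body mode");
  return SM_NO_CHANGE;
}

/* Action needed just before a nonlocal goto, whose receiver expects
   PSTATE.SM == 0.  */
static enum sm_action
model_nonlocal_jump_action (enum body_mode mode, bool entry_sm)
{
  switch (mode)
    {
    case BODY_NON_STREAMING:
      return SM_NO_CHANGE;
    case BODY_STREAMING:
      return SM_STOP;
    case BODY_STREAMING_COMPATIBLE:
      /* Only leave streaming mode if it is currently enabled.  */
      return entry_sm ? SM_STOP : SM_NO_CHANGE;
    }
  assert (!"unknown body mode");
  return SM_NO_CHANGE;
}
```

In this model a streaming body always needs an smstart at a receiver and an smstop before a nonlocal jump, while a streaming-compatible body performs the switch only when the saved entry state says streaming mode was on, which corresponds to the guarded tbz sequences expected by the nonlocal_goto_1.c and nonlocal_goto_6.c tests.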
--- gcc/config/aarch64/aarch64.cc | 141 ++++++++++++++++- .../g++.target/aarch64/sme/exceptions_2.C | 148 ++++++++++++++++++ .../gcc.target/aarch64/sme/nonlocal_goto_1.c | 58 +++++++ .../gcc.target/aarch64/sme/nonlocal_goto_2.c | 44 ++++++ .../gcc.target/aarch64/sme/nonlocal_goto_3.c | 46 ++++++ .../gcc.target/aarch64/sme/nonlocal_goto_4.c | 25 +++ .../gcc.target/aarch64/sme/nonlocal_goto_5.c | 26 +++ .../gcc.target/aarch64/sme/nonlocal_goto_6.c | 31 ++++ .../gcc.target/aarch64/sme/nonlocal_goto_7.c | 25 +++ 9 files changed, 537 insertions(+), 7 deletions(-) create mode 100644 gcc/testsuite/g++.target/aarch64/sme/exceptions_2.C create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_7.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index c94016ccdcf..be44e67979f 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -85,6 +85,7 @@ #include "config/arm/aarch-common.h" #include "config/arm/aarch-common-protos.h" #include "ssa.h" +#include "except.h" #include "tree-pass.h" #include "cfgbuild.h" @@ -4758,6 +4759,8 @@ public: void add_reg (machine_mode, unsigned int); void add_call_args (rtx_call_insn *); void add_call_result (rtx_call_insn *); + void add_call_preserved_reg (unsigned int); + void add_call_preserved_regs (bitmap); void emit_prologue (); void emit_epilogue (); @@ -4890,6 +4893,46 @@ aarch64_sme_mode_switch_regs::add_call_result (rtx_call_insn *call_insn) add_reg (GET_MODE (dest), REGNO (dest)); } +/* REGNO is a register that is call-preserved under 
the current function's ABI. + Record that it must be preserved around the mode switch. */ + +void +aarch64_sme_mode_switch_regs::add_call_preserved_reg (unsigned int regno) +{ + if (FP_REGNUM_P (regno)) + switch (crtl->abi->id ()) + { + case ARM_PCS_SVE: + add_reg (VNx16QImode, regno); + break; + case ARM_PCS_SIMD: + add_reg (V16QImode, regno); + break; + case ARM_PCS_AAPCS64: + add_reg (DImode, regno); + break; + default: + gcc_unreachable (); + } + else if (PR_REGNUM_P (regno)) + add_reg (VNx16BImode, regno); +} + +/* The hard registers in REGS are call-preserved under the current function's + ABI. Record that they must be preserved around the mode switch. */ + +void +aarch64_sme_mode_switch_regs::add_call_preserved_regs (bitmap regs) +{ + bitmap_iterator bi; + unsigned int regno; + EXECUTE_IF_SET_IN_BITMAP (regs, 0, regno, bi) + if (HARD_REGISTER_NUM_P (regno)) + add_call_preserved_reg (regno); + else + break; +} + /* Emit code to save registers before the mode switch. */ void @@ -7423,6 +7466,23 @@ aarch64_need_old_pstate_sm () if (aarch64_cfun_enables_pstate_sm ()) return true; + /* Non-local goto receivers are entered with PSTATE.SM equal to 0, + but the function needs to return with PSTATE.SM unchanged. */ + if (nonlocal_goto_handler_labels) + return true; + + /* Likewise for exception handlers. */ + eh_landing_pad lp; + for (unsigned int i = 1; vec_safe_iterate (cfun->eh->lp_array, i, &lp); ++i) + if (lp && lp->post_landing_pad) + return true; + + /* Non-local gotos need to set PSTATE.SM to zero. It's possible to call + streaming-compatible functions without SME being available, so PSTATE.SM + should only be changed if it is currently set to one. 
*/ + if (crtl->has_nonlocal_goto) + return true; + if (cfun->machine->call_switches_pstate_sm) for (auto insn = get_insns (); insn; insn = NEXT_INSN (insn)) if (auto *call = dyn_cast (insn)) @@ -28323,6 +28383,59 @@ aarch64_md_asm_adjust (vec &outputs, vec &inputs, return seq; } +/* BB is the target of an exception or nonlocal goto edge, which means + that PSTATE.SM is known to be 0 on entry. Put it into the state that + the current function requires. */ + +static bool +aarch64_switch_pstate_sm_for_landing_pad (basic_block bb) +{ + if (TARGET_NON_STREAMING) + return false; + + start_sequence (); + rtx_insn *guard_label = nullptr; + if (TARGET_STREAMING_COMPATIBLE) + guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, + AARCH64_FL_SM_OFF); + aarch64_sme_mode_switch_regs args_switch; + args_switch.add_call_preserved_regs (df_get_live_in (bb)); + args_switch.emit_prologue (); + aarch64_switch_pstate_sm (AARCH64_FL_SM_OFF, AARCH64_FL_SM_ON); + args_switch.emit_epilogue (); + if (guard_label) + emit_label (guard_label); + auto seq = get_insns (); + end_sequence (); + + emit_insn_after (seq, bb_note (bb)); + return true; +} + +/* JUMP is a nonlocal goto. Its target requires PSTATE.SM to be 0 on entry, + so arrange to make it so. */ + +static bool +aarch64_switch_pstate_sm_for_jump (rtx_insn *jump) +{ + if (TARGET_NON_STREAMING) + return false; + + start_sequence (); + rtx_insn *guard_label = nullptr; + if (TARGET_STREAMING_COMPATIBLE) + guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, + AARCH64_FL_SM_OFF); + aarch64_switch_pstate_sm (AARCH64_FL_SM_ON, AARCH64_FL_SM_OFF); + if (guard_label) + emit_label (guard_label); + auto seq = get_insns (); + end_sequence (); + + emit_insn_before (seq, jump); + return true; +} + /* If CALL involves a change in PSTATE.SM, emit the instructions needed to switch to the new mode and the instructions needed to restore the original mode. Return true if something changed. 
*/ @@ -28406,9 +28519,10 @@ public: }; bool -pass_switch_pstate_sm::gate (function *) +pass_switch_pstate_sm::gate (function *fn) { - return cfun->machine->call_switches_pstate_sm; + return (aarch64_fndecl_pstate_sm (fn->decl) != AARCH64_FL_SM_OFF + || cfun->machine->call_switches_pstate_sm); } /* Emit any instructions needed to switch PSTATE.SM. */ @@ -28421,11 +28535,24 @@ pass_switch_pstate_sm::execute (function *fn) bitmap_clear (blocks); FOR_EACH_BB_FN (bb, fn) { - rtx_insn *insn; - FOR_BB_INSNS (bb, insn) - if (auto *call = dyn_cast (insn)) - if (aarch64_switch_pstate_sm_for_call (call)) - bitmap_set_bit (blocks, bb->index); + if (has_abnormal_call_or_eh_pred_edge_p (bb) + && aarch64_switch_pstate_sm_for_landing_pad (bb)) + bitmap_set_bit (blocks, bb->index); + + if (cfun->machine->call_switches_pstate_sm) + { + rtx_insn *insn; + FOR_BB_INSNS (bb, insn) + if (auto *call = dyn_cast (insn)) + if (aarch64_switch_pstate_sm_for_call (call)) + bitmap_set_bit (blocks, bb->index); + } + + auto end = BB_END (bb); + if (JUMP_P (end) + && find_reg_note (end, REG_NON_LOCAL_GOTO, NULL_RTX) + && aarch64_switch_pstate_sm_for_jump (end)) + bitmap_set_bit (blocks, bb->index); } find_many_sub_basic_blocks (blocks); clear_aux_for_blocks (); diff --git a/gcc/testsuite/g++.target/aarch64/sme/exceptions_2.C b/gcc/testsuite/g++.target/aarch64/sme/exceptions_2.C new file mode 100644 index 00000000000..f791b6ecc54 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sme/exceptions_2.C @@ -0,0 +1,148 @@ +// { dg-options "-O -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +void n_callee(); +void s_callee() __arm_streaming; +void sc_callee() __arm_streaming_compatible; + +void n_callee_ne() noexcept; +void s_callee_ne() noexcept __arm_streaming; +void sc_callee_ne() noexcept __arm_streaming_compatible; + +void n_caller1() +{ + try + { + n_callee(); + sc_callee(); + } + catch (...) 
+ { + n_callee_ne(); + sc_callee_ne(); + } +} +// { dg-final { scan-assembler {_Z9n_caller1v:(?:(?!smstart|smstop).)*\tret} } } + +/* +** _Z9n_caller2v: +** ... +** cntd (x[0-9]+) +** str \1, [^\n]+ +** ... +** bl __cxa_begin_catch +** smstart sm +** bl _Z11s_callee_nev +** smstop sm +** bl __cxa_end_catch +** ... +*/ +void n_caller2() +{ + try + { + n_callee(); + sc_callee(); + } + catch (...) + { + s_callee_ne(); + } +} + +/* +** _Z9s_caller1v: +** ... +** bl __cxa_end_catch +** smstart sm +** ... +*/ +int s_caller1() __arm_streaming +{ + try + { + s_callee(); + return 1; + } + catch (...) + { + return 2; + } +} + +/* +** _Z9s_caller2v: +** ... +** bl __cxa_begin_catch +** smstart sm +** bl _Z11s_callee_nev +** smstop sm +** bl __cxa_end_catch +** smstart sm +** ... +*/ +int s_caller2() __arm_streaming +{ + try + { + n_callee(); + return 1; + } + catch (...) + { + s_callee_ne(); + return 2; + } +} + +/* +** _Z10sc_caller1v: +** ... +** cntd (x[0-9]+) +** str \1, [^\n]+ +** mrs (x[0-9]+), svcr +** str \2, ([^\n]+) +** ... +** bl __cxa_end_catch +** ldr (x[0-9]+), \3 +** tbz \4, 0, [^\n]+ +** smstart sm +** ... +*/ +int sc_caller1() __arm_streaming_compatible +{ + try + { + sc_callee(); + return 1; + } + catch (...) + { + return 2; + } +} + +/* +** _Z10ls_caller1v: +** ... +** cntd (x[0-9]+) +** str \1, [^\n]+ +** ... +** bl __cxa_begin_catch +** smstart sm +** bl _Z12sc_callee_nev +** smstop sm +** bl __cxa_end_catch +** ... +*/ +__arm_locally_streaming void ls_caller1() +{ + try + { + sc_callee(); + } + catch (...) 
+ { + sc_callee_ne(); + } +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_1.c b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_1.c new file mode 100644 index 00000000000..4e3869fcc9e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_1.c @@ -0,0 +1,58 @@ +/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void run(void (*)()); + +/* +** foo: +** ... +** mrs x16, svcr +** ... +** str x16, (.*) +** ... +** ldr x16, \1 +** tbz x16, 0, .* +** smstop sm +** bl __clear_cache +** ldr x16, \1 +** tbz x16, 0, .* +** smstart sm +** add x0, .* +** ldr x16, \1 +** tbz x16, 0, .* +** smstop sm +** bl run +** ldr x16, \1 +** tbz x16, 0, .* +** smstart sm +** mov w0, 1 +** ... +** ret +** ldr x16, \1 +** tbz x16, 0, .* +** smstart sm +** mov w0, 0 +** ... +*/ +int +foo (int *ptr) __arm_streaming_compatible +{ + __label__ failure; + + void bar () { *ptr += 1; goto failure; } + run (bar); + return 1; + +failure: + return 0; +} + +// { dg-final { scan-assembler {\tstp\tx19, x20,} } } +// { dg-final { scan-assembler {\tstp\tx21, x22,} } } +// { dg-final { scan-assembler {\tstp\tx23, x24,} } } +// { dg-final { scan-assembler {\tstp\tx25, x26,} } } +// { dg-final { scan-assembler {\tstp\tx27, x28,} } } +// { dg-final { scan-assembler {\tstp\td8, d9,} } } +// { dg-final { scan-assembler {\tstp\td10, d11,} } } +// { dg-final { scan-assembler {\tstp\td12, d13,} } } +// { dg-final { scan-assembler {\tstp\td14, d15,} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_2.c b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_2.c new file mode 100644 index 00000000000..2a2db72c3a0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_2.c @@ -0,0 +1,44 @@ +/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void run(void (*)()); + +/* +** foo: +** ... 
+** smstop sm +** bl __clear_cache +** smstart sm +** add x0, .* +** smstop sm +** bl run +** smstart sm +** mov w0, 1 +** ... +** ret +** smstart sm +** mov w0, 0 +** ... +*/ +int +foo (int *ptr) __arm_streaming +{ + __label__ failure; + + void bar () { *ptr += 1; goto failure; } + run (bar); + return 1; + +failure: + return 0; +} + +// { dg-final { scan-assembler {\tstp\tx19, x20,} } } +// { dg-final { scan-assembler {\tstp\tx21, x22,} } } +// { dg-final { scan-assembler {\tstp\tx23, x24,} } } +// { dg-final { scan-assembler {\tstp\tx25, x26,} } } +// { dg-final { scan-assembler {\tstp\tx27, x28,} } } +// { dg-final { scan-assembler {\tstp\td8, d9,} } } +// { dg-final { scan-assembler {\tstp\td10, d11,} } } +// { dg-final { scan-assembler {\tstp\td12, d13,} } } +// { dg-final { scan-assembler {\tstp\td14, d15,} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_3.c b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_3.c new file mode 100644 index 00000000000..022b04052c5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_3.c @@ -0,0 +1,46 @@ +/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void run(void (*)()); + +/* +** foo: +** ... +** smstart sm +** ... +** smstop sm +** bl __clear_cache +** smstart sm +** add x0, .* +** smstop sm +** bl run +** smstart sm +** mov w0, 1 +** ... +** smstart sm +** mov w0, 0 +** smstop sm +** ... 
+*/ +__arm_locally_streaming int +foo (int *ptr) +{ + __label__ failure; + + void bar () { *ptr += 1; goto failure; } + run (bar); + return 1; + +failure: + return 0; +} + +// { dg-final { scan-assembler {\tstp\tx19, x20,} } } +// { dg-final { scan-assembler {\tstp\tx21, x22,} } } +// { dg-final { scan-assembler {\tstp\tx23, x24,} } } +// { dg-final { scan-assembler {\tstp\tx25, x26,} } } +// { dg-final { scan-assembler {\tstp\tx27, x28,} } } +// { dg-final { scan-assembler {\tstp\td8, d9,} } } +// { dg-final { scan-assembler {\tstp\td10, d11,} } } +// { dg-final { scan-assembler {\tstp\td12, d13,} } } +// { dg-final { scan-assembler {\tstp\td14, d15,} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_4.c b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_4.c new file mode 100644 index 00000000000..0446076286b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_4.c @@ -0,0 +1,25 @@ +/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void run(void (*)()); + +/* +** bar.0: +** ... +** smstart sm +** ... +** smstop sm +** br x[0-9]+ +*/ +int +foo (int *ptr) +{ + __label__ failure; + + __arm_locally_streaming void bar () { *ptr += 1; goto failure; } + run (bar); + return 1; + +failure: + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_5.c b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_5.c new file mode 100644 index 00000000000..4246aec8b2f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_5.c @@ -0,0 +1,26 @@ +/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void run(void (*)() __arm_streaming); + +/* +** bar.0: +** ... 
+** smstop sm +** br x[0-9]+ +*/ +int +foo (int *ptr) +{ + __label__ failure; + + void bar () __arm_streaming { *ptr += 1; goto failure; } + run (bar); + return 1; + +failure: + return 0; +} + +// { dg-final { scan-assembler-not {smstart\t} } } +// { dg-final { scan-assembler-not {mrs\t} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_6.c b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_6.c new file mode 100644 index 00000000000..151e2f22dc7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_6.c @@ -0,0 +1,31 @@ +/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void run(void (*)() __arm_streaming_compatible); + +/* +** bar.0: +** ... +** mrs x16, svcr +** ... +** str x16, (.*) +** ... +** ldr x16, \1 +** tbz x16, 0, .* +** smstop sm +** br x[0-9]+ +*/ +int +foo (int *ptr) +{ + __label__ failure; + + void bar () __arm_streaming_compatible { *ptr += 1; goto failure; } + run (bar); + return 1; + +failure: + return 0; +} + +// { dg-final { scan-assembler-not {smstart\t} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_7.c b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_7.c new file mode 100644 index 00000000000..9cc3ad5d236 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/nonlocal_goto_7.c @@ -0,0 +1,25 @@ +/* { dg-options "-O2 -fno-schedule-insns -fno-schedule-insns2" } */ + +void run(void (*)() __arm_inout("za")); +void callee () __arm_inout("za"); + +int +foo (int *ptr) +{ + __label__ failure; + + void bar () __arm_inout("za") + { + callee (); + *ptr += 1; + goto failure; + } + run (bar); + return 1; + +failure: + return 0; +} + +// { dg-final { scan-assembler-not {\tsmstart\t} } } +// { dg-final { scan-assembler-not {\tsmstop\t} } } From patchwork Tue Dec 5 10:13:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford 
X-Patchwork-Id: 1872047
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 24/25] aarch64: Enforce inlining restrictions for SME
Date: Tue, 5 Dec 2023 10:13:22 +0000
Message-Id: <20231205101323.1914247-25-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

A function that has local ZA state cannot be inlined into its caller,
since we only support managing ZA switches at function scope.

A function whose body directly clobbers ZA state cannot be inlined into
a function with ZA state.

A function whose body requires a particular PSTATE.SM setting can only
be inlined into a function body that guarantees that PSTATE.SM setting.
The callee's function type doesn't matter here: one locally-streaming
function can be inlined into another.

gcc/
	* config/aarch64/aarch64.cc: Include symbol-summary.h, ipa-prop.h,
	and ipa-fnsummary.h
	(aarch64_function_attribute_inlinable_p): New function.
	(AARCH64_IPA_SM_FIXED, AARCH64_IPA_CLOBBERS_ZA): New constants.
	(aarch64_need_ipa_fn_target_info): New function.
	(aarch64_update_ipa_fn_target_info): Likewise.
	(aarch64_can_inline_p): Restrict the previous ISA flag checks
	to non-modal features. Prevent callees that require a particular
	PSTATE.SM state from being inlined into callers that can't guarantee
	that state. Also prevent callees that have ZA state from being
	inlined into callers that don't. Finally, prevent callees that
	clobber ZA from being inlined into callers that have ZA state.
	(TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P): Define.
	(TARGET_NEED_IPA_FN_TARGET_INFO): Likewise.
	(TARGET_UPDATE_IPA_FN_TARGET_INFO): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/sme/inlining_1.c: New test.
	* gcc.target/aarch64/sme/inlining_2.c: Likewise.
	* gcc.target/aarch64/sme/inlining_3.c: Likewise.
	* gcc.target/aarch64/sme/inlining_4.c: Likewise.
	* gcc.target/aarch64/sme/inlining_5.c: Likewise.
	* gcc.target/aarch64/sme/inlining_6.c: Likewise.
	* gcc.target/aarch64/sme/inlining_7.c: Likewise.
	* gcc.target/aarch64/sme/inlining_8.c: Likewise.
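
The three restrictions above can be summarised as a single predicate. The following is a rough standalone model in plain C, not the code in aarch64_can_inline_p: struct fn_model, its fields and model_can_inline are hypothetical names, and the sm_fixed and clobbers_za flags are assumed to play the role of the AARCH64_IPA_SM_FIXED and AARCH64_IPA_CLOBBERS_ZA summary bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these names exist in GCC.  */
enum sm_mode { SM_COMPATIBLE, SM_OFF, SM_ON };

struct fn_model
{
  enum sm_mode body_sm;  /* PSTATE.SM mode that the function body runs in */
  bool has_new_za;       /* function creates its own (local) ZA state */
  bool has_za;           /* function body has ZA state of some kind */
  bool sm_fixed;         /* body may rely on its PSTATE.SM setting */
  bool clobbers_za;      /* body may directly clobber ZA */
};

/* Return true if the model allows CALLEE to be inlined into CALLER.  */
static bool
model_can_inline (const struct fn_model *caller,
                  const struct fn_model *callee)
{
  assert (caller && callee);

  /* Local ZA state is only managed at function scope.  */
  if (callee->has_new_za)
    return false;

  /* A body that needs a particular PSTATE.SM setting can only go into
     a body that guarantees the same setting.  */
  if (callee->body_sm != SM_COMPATIBLE
      && caller->body_sm != callee->body_sm
      && callee->sm_fixed)
    return false;

  /* A callee with ZA state cannot go into a caller without it.  */
  if (!caller->has_za && callee->has_za)
    return false;

  /* A callee that clobbers ZA cannot go into a caller with ZA state.  */
  if (caller->has_za && !callee->has_za && callee->clobbers_za)
    return false;

  return true;
}
```

Because the model compares body modes rather than function types, two locally-streaming functions (both with body_sm set to SM_ON) pass the PSTATE.SM check, matching the remark that the callee's function type does not matter.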
--- gcc/config/aarch64/aarch64.cc | 132 +++++++++++++++++- .../gcc.target/aarch64/sme/inlining_1.c | 47 +++++++ .../gcc.target/aarch64/sme/inlining_10.c | 57 ++++++++ .../gcc.target/aarch64/sme/inlining_11.c | 57 ++++++++ .../gcc.target/aarch64/sme/inlining_12.c | 15 ++ .../gcc.target/aarch64/sme/inlining_13.c | 15 ++ .../gcc.target/aarch64/sme/inlining_14.c | 15 ++ .../gcc.target/aarch64/sme/inlining_15.c | 27 ++++ .../gcc.target/aarch64/sme/inlining_2.c | 47 +++++++ .../gcc.target/aarch64/sme/inlining_3.c | 47 +++++++ .../gcc.target/aarch64/sme/inlining_4.c | 47 +++++++ .../gcc.target/aarch64/sme/inlining_5.c | 47 +++++++ .../gcc.target/aarch64/sme/inlining_6.c | 31 ++++ .../gcc.target/aarch64/sme/inlining_7.c | 31 ++++ .../gcc.target/aarch64/sme/inlining_8.c | 31 ++++ .../gcc.target/aarch64/sme/inlining_9.c | 55 ++++++++ 16 files changed, 696 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_10.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_11.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_12.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_13.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_14.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_15.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_9.c diff --git a/gcc/config/aarch64/aarch64.cc 
b/gcc/config/aarch64/aarch64.cc index be44e67979f..4639310f108 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -88,6 +88,9 @@ #include "except.h" #include "tree-pass.h" #include "cfgbuild.h" +#include "symbol-summary.h" +#include "ipa-prop.h" +#include "ipa-fnsummary.h" /* This file should be included last. */ #include "target-def.h" @@ -19155,6 +19158,17 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int) return ret; } +/* Implement TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P. Use an opt-out + rather than an opt-in list. */ + +static bool +aarch64_function_attribute_inlinable_p (const_tree fndecl) +{ + /* A function that has local ZA state cannot be inlined into its caller, + since we only support managing ZA switches at function scope. */ + return !aarch64_fndecl_has_new_state (fndecl, "za"); +} + /* Helper for aarch64_can_inline_p. In the case where CALLER and CALLEE are tri-bool options (yes, no, don't care) and the default value is DEF, determine whether to reject inlining. */ @@ -19176,6 +19190,60 @@ aarch64_tribools_ok_for_inlining_p (int caller, int callee, return (callee == caller || callee == def); } +/* Bit allocations for ipa_fn_summary::target_info. */ + +/* Set if the function contains a stmt that relies on the function's + choice of PSTATE.SM setting (0 for non-streaming, 1 for streaming). + Not meaningful for streaming-compatible functions. */ +constexpr auto AARCH64_IPA_SM_FIXED = 1U << 0; + +/* Set if the function clobbers ZA. Not meaningful for functions that + have ZA state. */ +constexpr auto AARCH64_IPA_CLOBBERS_ZA = 1U << 1; + +/* Implement TARGET_NEED_IPA_FN_TARGET_INFO. */ + +static bool +aarch64_need_ipa_fn_target_info (const_tree, unsigned int &) +{ + /* We could in principle skip this for streaming-compatible functions + that have ZA state, but that's a rare combination. */ + return true; +} + +/* Implement TARGET_UPDATE_IPA_FN_TARGET_INFO. 
*/ + +static bool +aarch64_update_ipa_fn_target_info (unsigned int &info, const gimple *stmt) +{ + if (auto *ga = dyn_cast (stmt)) + { + /* We don't know what the asm does, so conservatively assume that + it requires the function's current SM mode. */ + info |= AARCH64_IPA_SM_FIXED; + for (unsigned int i = 0; i < gimple_asm_nclobbers (ga); ++i) + { + tree op = gimple_asm_clobber_op (ga, i); + const char *clobber = TREE_STRING_POINTER (TREE_VALUE (op)); + if (strcmp (clobber, "za") == 0) + info |= AARCH64_IPA_CLOBBERS_ZA; + } + } + if (auto *call = dyn_cast (stmt)) + { + if (gimple_call_builtin_p (call, BUILT_IN_MD)) + { + /* The attributes on AArch64 builtins are supposed to be accurate. + If the function isn't marked streaming-compatible then it + needs whichever SM mode it selects. */ + tree decl = gimple_call_fndecl (call); + if (aarch64_fndecl_pstate_sm (decl) != 0) + info |= AARCH64_IPA_SM_FIXED; + } + } + return true; +} + /* Implement TARGET_CAN_INLINE_P. Decide whether it is valid to inline CALLEE into CALLER based on target-specific info. Make sure that the caller and callee have compatible architectural @@ -19198,12 +19266,56 @@ aarch64_can_inline_p (tree caller, tree callee) : target_option_default_node); /* Callee's ISA flags should be a subset of the caller's. 
*/ - if ((caller_opts->x_aarch64_asm_isa_flags - & callee_opts->x_aarch64_asm_isa_flags) - != callee_opts->x_aarch64_asm_isa_flags) + auto caller_asm_isa = (caller_opts->x_aarch64_asm_isa_flags + & ~AARCH64_FL_ISA_MODES); + auto callee_asm_isa = (callee_opts->x_aarch64_asm_isa_flags + & ~AARCH64_FL_ISA_MODES); + if (callee_asm_isa & ~caller_asm_isa) return false; - if ((caller_opts->x_aarch64_isa_flags & callee_opts->x_aarch64_isa_flags) - != callee_opts->x_aarch64_isa_flags) + + auto caller_isa = (caller_opts->x_aarch64_isa_flags + & ~AARCH64_FL_ISA_MODES); + auto callee_isa = (callee_opts->x_aarch64_isa_flags + & ~AARCH64_FL_ISA_MODES); + if (callee_isa & ~caller_isa) + return false; + + /* Return true if the callee might have target_info property PROPERTY. + The answer must be true unless we have positive proof to the contrary. */ + auto callee_has_property = [&](unsigned int property) + { + if (ipa_fn_summaries) + if (auto *summary = ipa_fn_summaries->get (cgraph_node::get (callee))) + if (!(summary->target_info & property)) + return false; + return true; + }; + + /* Streaming-compatible code can be inlined into functions with any + PSTATE.SM mode. Otherwise the caller and callee must agree on + PSTATE.SM mode, unless we can prove that the callee is naturally + streaming-compatible. */ + auto caller_sm = (caller_opts->x_aarch64_isa_flags & AARCH64_FL_SM_STATE); + auto callee_sm = (callee_opts->x_aarch64_isa_flags & AARCH64_FL_SM_STATE); + if (callee_sm + && caller_sm != callee_sm + && callee_has_property (AARCH64_IPA_SM_FIXED)) + return false; + + /* aarch64_function_attribute_inlinable_p prevents new-ZA functions + from being inlined into others. We also need to prevent inlining + of shared-ZA functions into functions without ZA state, since this + is an error condition. + + The only other problematic case for ZA is inlining a function that + directly clobbers ZA into a function that has ZA state. 
*/ + auto caller_za = (caller_opts->x_aarch64_isa_flags & AARCH64_FL_ZA_ON); + auto callee_za = (callee_opts->x_aarch64_isa_flags & AARCH64_FL_ZA_ON); + if (!caller_za && callee_za) + return false; + if (caller_za + && !callee_za + && callee_has_property (AARCH64_IPA_CLOBBERS_ZA)) return false; /* Allow non-strict aligned functions inlining into strict @@ -28760,6 +28872,16 @@ aarch64_run_selftests (void) #undef TARGET_CAN_ELIMINATE #define TARGET_CAN_ELIMINATE aarch64_can_eliminate +#undef TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P +#define TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P \ + aarch64_function_attribute_inlinable_p + +#undef TARGET_NEED_IPA_FN_TARGET_INFO +#define TARGET_NEED_IPA_FN_TARGET_INFO aarch64_need_ipa_fn_target_info + +#undef TARGET_UPDATE_IPA_FN_TARGET_INFO +#define TARGET_UPDATE_IPA_FN_TARGET_INFO aarch64_update_ipa_fn_target_info + #undef TARGET_CAN_INLINE_P #define TARGET_CAN_INLINE_P aarch64_can_inline_p diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_1.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_1.c new file mode 100644 index 00000000000..24dc2b34187 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_1.c @@ -0,0 +1,47 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline)) +sc_callee () [[arm::streaming_compatible]] {} + +inline void __attribute__((always_inline)) +s_callee () [[arm::streaming]] {} + +inline void __attribute__((always_inline)) +n_callee () {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_callee () [[arm::streaming_compatible]] {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_callee () {} + +inline void __attribute__((always_inline)) +sc_asm_callee () [[arm::streaming_compatible]] { asm (""); } + +inline void __attribute__((always_inline)) +s_asm_callee () [[arm::streaming]] { asm (""); } // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +n_asm_callee () { asm (""); } // { dg-error 
"inlining failed" } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_asm_callee () [[arm::streaming_compatible]] { asm (""); } // { dg-error "inlining failed" } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_asm_callee () { asm (""); } // { dg-error "inlining failed" } + +void +sc_caller () [[arm::streaming_compatible]] +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); + + sc_asm_callee (); + s_asm_callee (); + n_asm_callee (); + sc_ls_asm_callee (); + n_ls_asm_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_10.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_10.c new file mode 100644 index 00000000000..adfd45a872f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_10.c @@ -0,0 +1,57 @@ +/* { dg-options "" } */ + +#include +#include + +uint8x16_t *neon; +svint64_t *sve; +int64_t *ptr; + +// Gets expanded to addition early, so no error. An error would be +// more correct though. 
+inline void __attribute__((always_inline)) +call_vadd () +{ + neon[4] = vaddq_u8 (neon[5], neon[6]); +} + +inline void __attribute__((always_inline)) +call_vbsl () // { dg-error "inlining failed" } +{ + neon[0] = vbslq_u8 (neon[1], neon[2], neon[3]); +} + +inline void __attribute__((always_inline)) +call_svadd () +{ + *sve = svadd_x (svptrue_b8 (), *sve, 1); +} + +inline void __attribute__((always_inline)) +call_svld1_gather () // { dg-error "inlining failed" } +{ + *sve = svld1_gather_offset (svptrue_b8 (), ptr, *sve); +} + +inline void __attribute__((always_inline)) +call_svzero () [[arm::inout("za")]] +{ + svzero_za (); +} + +inline void __attribute__((always_inline)) +call_svst1_za () [[arm::streaming, arm::inout("za")]] // { dg-error "inlining failed" } +{ + svst1_ver_za64 (0, 0, svptrue_b8 (), ptr); +} + +void +sc_caller () [[arm::inout("za"), arm::streaming_compatible]] +{ + call_vadd (); + call_vbsl (); + call_svadd (); + call_svld1_gather (); + call_svzero (); + call_svst1_za (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_11.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_11.c new file mode 100644 index 00000000000..d05a92c1c24 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_11.c @@ -0,0 +1,57 @@ +/* { dg-options "" } */ + +#include +#include + +uint8x16_t *neon; +svint64_t *sve; +int64_t *ptr; + +// Gets expanded to addition early, so no error. An error would be +// more correct though. 
+inline void __attribute__((always_inline)) +call_vadd () +{ + neon[4] = vaddq_u8 (neon[5], neon[6]); +} + +inline void __attribute__((always_inline)) +call_vbsl () // { dg-error "inlining failed" } +{ + neon[0] = vbslq_u8 (neon[1], neon[2], neon[3]); +} + +inline void __attribute__((always_inline)) +call_svadd () +{ + *sve = svadd_x (svptrue_b8 (), *sve, 1); +} + +inline void __attribute__((always_inline)) +call_svld1_gather () // { dg-error "inlining failed" } +{ + *sve = svld1_gather_offset (svptrue_b8 (), ptr, *sve); +} + +inline void __attribute__((always_inline)) +call_svzero () [[arm::inout("za")]] +{ + svzero_za (); +} + +inline void __attribute__((always_inline)) +call_svst1_za () [[arm::streaming, arm::inout("za")]] +{ + svst1_ver_za64 (0, 0, svptrue_b8 (), ptr); +} + +void +sc_caller () [[arm::inout("za"), arm::streaming]] +{ + call_vadd (); + call_vbsl (); + call_svadd (); + call_svld1_gather (); + call_svzero (); + call_svst1_za (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_12.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_12.c new file mode 100644 index 00000000000..366f8b24ac2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_12.c @@ -0,0 +1,15 @@ +/* { dg-options "" } */ + +#include + +inline void __attribute__((always_inline)) +call_svzero () [[arm::inout("za"), arm::streaming_compatible]] // { dg-error "inlining failed" } +{ + svzero_za (); +} + +void +n_caller () +{ + call_svzero (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_13.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_13.c new file mode 100644 index 00000000000..bdbd7408c33 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_13.c @@ -0,0 +1,15 @@ +/* { dg-options "" } */ + +#include + +inline void __attribute__((always_inline)) +call_svzero () [[arm::inout("za"), arm::streaming_compatible]] // { dg-error "inlining failed" } +{ + svzero_za (); +} + +void +s_caller () +{ + call_svzero (); +} diff --git 
a/gcc/testsuite/gcc.target/aarch64/sme/inlining_14.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_14.c new file mode 100644 index 00000000000..0ce4384f642 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_14.c @@ -0,0 +1,15 @@ +/* { dg-options "" } */ + +#include + +inline void __attribute__((always_inline)) +call_svzero () [[arm::inout("za"), arm::streaming_compatible]] // { dg-error "inlining failed" } +{ + svzero_za (); +} + +void +sc_caller () +{ + call_svzero (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_15.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_15.c new file mode 100644 index 00000000000..06fc5d7f5e3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_15.c @@ -0,0 +1,27 @@ +/* { dg-options "" } */ + +#include + +inline void +call_svzero () [[arm::inout("za"), arm::streaming_compatible]] +{ + svzero_za (); +} + +void +n_caller () +{ + call_svzero (); // { dg-error "call to a function that shares 'za' state from a function that has no 'za' state" } +} + +void +s_caller () +{ + call_svzero (); // { dg-error "call to a function that shares 'za' state from a function that has no 'za' state" } +} + +void +sc_caller () +{ + call_svzero (); // { dg-error "call to a function that shares 'za' state from a function that has no 'za' state" } +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_2.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_2.c new file mode 100644 index 00000000000..ea2a57049cd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_2.c @@ -0,0 +1,47 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline)) +sc_callee () [[arm::streaming_compatible]] {} + +inline void __attribute__((always_inline)) +s_callee () [[arm::streaming]] {} + +inline void __attribute__((always_inline)) +n_callee () {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_callee () [[arm::streaming_compatible]] {} + +[[arm::locally_streaming]] inline 
void __attribute__((always_inline)) +n_ls_callee () {} + +inline void __attribute__((always_inline)) +sc_asm_callee () [[arm::streaming_compatible]] { asm (""); } + +inline void __attribute__((always_inline)) +s_asm_callee () [[arm::streaming]] { asm (""); } + +inline void __attribute__((always_inline)) +n_asm_callee () { asm (""); } // { dg-error "inlining failed" } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_asm_callee () [[arm::streaming_compatible]] { asm (""); } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_asm_callee () { asm (""); } + +void +s_caller () [[arm::streaming]] +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); + + sc_asm_callee (); + s_asm_callee (); + n_asm_callee (); + sc_ls_asm_callee (); + n_ls_asm_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_3.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_3.c new file mode 100644 index 00000000000..d7ffb381985 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_3.c @@ -0,0 +1,47 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline)) +sc_callee () [[arm::streaming_compatible]] {} + +inline void __attribute__((always_inline)) +s_callee () [[arm::streaming]] {} + +inline void __attribute__((always_inline)) +n_callee () {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_callee () [[arm::streaming_compatible]] {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_callee () {} + +inline void __attribute__((always_inline)) +sc_asm_callee () [[arm::streaming_compatible]] { asm (""); } + +inline void __attribute__((always_inline)) +s_asm_callee () [[arm::streaming]] { asm (""); } // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +n_asm_callee () { asm (""); } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_asm_callee () 
[[arm::streaming_compatible]] { asm (""); } // { dg-error "inlining failed" } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_asm_callee () { asm (""); } // { dg-error "inlining failed" } + +void +n_caller () +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); + + sc_asm_callee (); + s_asm_callee (); + n_asm_callee (); + sc_ls_asm_callee (); + n_ls_asm_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_4.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_4.c new file mode 100644 index 00000000000..78920372500 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_4.c @@ -0,0 +1,47 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline)) +sc_callee () [[arm::streaming_compatible]] {} + +inline void __attribute__((always_inline)) +s_callee () [[arm::streaming]] {} + +inline void __attribute__((always_inline)) +n_callee () {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_callee () [[arm::streaming_compatible]] {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_callee () {} + +inline void __attribute__((always_inline)) +sc_asm_callee () [[arm::streaming_compatible]] { asm (""); } + +inline void __attribute__((always_inline)) +s_asm_callee () [[arm::streaming]] { asm (""); } + +inline void __attribute__((always_inline)) +n_asm_callee () { asm (""); } // { dg-error "inlining failed" } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_asm_callee () [[arm::streaming_compatible]] { asm (""); } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_asm_callee () { asm (""); } + +[[arm::locally_streaming]] void +sc_ls_caller () [[arm::streaming_compatible]] +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); + + sc_asm_callee (); + s_asm_callee (); + n_asm_callee (); + sc_ls_asm_callee (); + n_ls_asm_callee (); +} 
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_5.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_5.c new file mode 100644 index 00000000000..d19cdc450d3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_5.c @@ -0,0 +1,47 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline)) +sc_callee () [[arm::streaming_compatible]] {} + +inline void __attribute__((always_inline)) +s_callee () [[arm::streaming]] {} + +inline void __attribute__((always_inline)) +n_callee () {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_callee () [[arm::streaming_compatible]] {} + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_callee () {} + +inline void __attribute__((always_inline)) +sc_asm_callee () [[arm::streaming_compatible]] { asm (""); } + +inline void __attribute__((always_inline)) +s_asm_callee () [[arm::streaming]] { asm (""); } + +inline void __attribute__((always_inline)) +n_asm_callee () { asm (""); } // { dg-error "inlining failed" } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +sc_ls_asm_callee () [[arm::streaming_compatible]] { asm (""); } + +[[arm::locally_streaming]] inline void __attribute__((always_inline)) +n_ls_asm_callee () { asm (""); } + +[[arm::locally_streaming]] void +n_ls_caller () +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); + + sc_asm_callee (); + s_asm_callee (); + n_asm_callee (); + sc_ls_asm_callee (); + n_ls_asm_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_6.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_6.c new file mode 100644 index 00000000000..a5eb399f10a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_6.c @@ -0,0 +1,31 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline)) +shared_callee () [[arm::inout("za")]] {} + +[[arm::new("za")]] inline void __attribute__((always_inline)) +new_callee () {} // { dg-error 
"inlining failed" } + +inline void __attribute__((always_inline)) +normal_callee () {} + +inline void __attribute__((always_inline)) +shared_asm_callee () [[arm::inout("za")]] { asm volatile ("" ::: "za"); } + +[[arm::new("za")]] inline void __attribute__((always_inline)) +new_asm_callee () { asm volatile ("" ::: "za"); } // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +normal_asm_callee () { asm volatile ("" ::: "za"); } // { dg-error "inlining failed" } + +void +shared_caller () [[arm::inout("za")]] +{ + shared_callee (); + new_callee (); + normal_callee (); + + shared_asm_callee (); + new_asm_callee (); + normal_asm_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_7.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_7.c new file mode 100644 index 00000000000..0f046283f3d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_7.c @@ -0,0 +1,31 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline)) +shared_callee () [[arm::inout("za")]] {} + +[[arm::new("za")]] inline void __attribute__((always_inline)) +new_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +normal_callee () {} + +inline void __attribute__((always_inline)) +shared_asm_callee () [[arm::inout("za")]] { asm volatile ("" ::: "za"); } + +[[arm::new("za")]] inline void __attribute__((always_inline)) +new_asm_callee () { asm volatile ("" ::: "za"); } // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +normal_asm_callee () { asm volatile ("" ::: "za"); } // { dg-error "inlining failed" } + +[[arm::new("za")]] void +new_caller () +{ + shared_callee (); + new_callee (); + normal_callee (); + + shared_asm_callee (); + new_asm_callee (); + normal_asm_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_8.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_8.c new file mode 100644 index 00000000000..fd8a3a61e59 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/aarch64/sme/inlining_8.c @@ -0,0 +1,31 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline)) +shared_callee () [[arm::inout("za")]] {} // { dg-error "inlining failed" } + +[[arm::new("za")]] inline void __attribute__((always_inline)) +new_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +normal_callee () {} + +inline void __attribute__((always_inline)) +shared_asm_callee () [[arm::inout("za")]] { asm volatile ("" ::: "za"); } // { dg-error "inlining failed" } + +[[arm::new("za")]] inline void __attribute__((always_inline)) +new_asm_callee () { asm volatile ("" ::: "za"); } // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +normal_asm_callee () { asm volatile ("" ::: "za"); } + +void +normal_caller () +{ + shared_callee (); + new_callee (); + normal_callee (); + + shared_asm_callee (); + new_asm_callee (); + normal_asm_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_9.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_9.c new file mode 100644 index 00000000000..91520e3787b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_9.c @@ -0,0 +1,55 @@ +/* { dg-options "" } */ + +#include +#include + +uint8x16_t *neon; +svint64_t *sve; +int64_t *ptr; + +inline void __attribute__((always_inline)) +call_vadd () +{ + neon[4] = vaddq_u8 (neon[5], neon[6]); +} + +inline void __attribute__((always_inline)) +call_vbsl () +{ + neon[0] = vbslq_u8 (neon[1], neon[2], neon[3]); +} + +inline void __attribute__((always_inline)) +call_svadd () +{ + *sve = svadd_x (svptrue_b8 (), *sve, 1); +} + +inline void __attribute__((always_inline)) +call_svld1_gather () +{ + *sve = svld1_gather_offset (svptrue_b8 (), ptr, *sve); +} + +inline void __attribute__((always_inline)) +call_svzero () [[arm::inout("za")]] +{ + svzero_za (); +} + +inline void __attribute__((always_inline)) +call_svst1_za () [[arm::streaming, arm::inout("za")]] // { 
dg-error "inlining failed" } +{ + svst1_ver_za64 (0, 0, svptrue_b8 (), ptr); +} + +void +n_caller () [[arm::inout("za")]] +{ + call_vadd (); + call_vbsl (); + call_svadd (); + call_svld1_gather (); + call_svzero (); + call_svst1_za (); +} From patchwork Tue Dec 5 10:13:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1872049 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SkxMs292Qz1ySd for ; Tue, 5 Dec 2023 21:19:05 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 72BFC385C401 for ; Tue, 5 Dec 2023 10:18:59 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 26EA0384F00A for ; Tue, 5 Dec 2023 10:13:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 26EA0384F00A Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 26EA0384F00A Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; 
t=1701771233; cv=none; b=E5satoR32jiBPgTGC5uPSV636fbcsR8Ti6kELdDaodlBLIb74FY1Ie/4H139gu4dR1ap0NlSVsXst2kVAyZi9ZuN2h7QSmkuWSIR2unkJMaebzunmx3xw18Eua+ugUVx+jzuI1wYM0jgsXpSPmYfLlyEmkcf9HQniNVYlSCijzw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701771233; c=relaxed/simple; bh=8no+6jvoD/wjnrjrqb9BE2F0AdAdtdkc7kxxjFIceXE=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=WniXiqZGqU8MBGhI8TrZE9Hm2evqJ9CGcYe3VVzEIorKv//wPnrWUWP30dMpasqM9eySpkCx91fxk5LDDTMKJNFqHJZ/dC9hSHjFU8PBe/MASWCuQD69Wq+OfJdEncK8+s0BG4pgUgrilIVCbFwppFCKX7QYN08fIHHiILnnlK4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5E622FEC; Tue, 5 Dec 2023 02:14:37 -0800 (PST) Received: from e121540-lin.manchester.arm.com (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 502EE3F5A1; Tue, 5 Dec 2023 02:13:50 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford Subject: [pushed v2 25/25] aarch64: Update sibcall handling for SME Date: Tue, 5 Dec 2023 10:13:23 +0000 Message-Id: <20231205101323.1914247-26-richard.sandiford@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com> References: <20231205101323.1914247-1-richard.sandiford@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-22.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

We only support tail calls between functions with the same
PSTATE.ZA setting ("private-ZA" to "private-ZA" and "shared-ZA"
to "shared-ZA").

Only a normal non-streaming function can tail-call another
non-streaming function, and only a streaming function can
tail-call another streaming function.  Any function can tail-call
a streaming-compatible function.

gcc/
	* config/aarch64/aarch64.cc (aarch64_function_ok_for_sibcall):
	Enforce PSTATE.SM and PSTATE.ZA restrictions.
	(aarch64_expand_epilogue): Save and restore the arguments
	to a sibcall around any change to PSTATE.SM.

gcc/testsuite/
	* gcc.target/aarch64/sme/sibcall_1.c: New test.
	* gcc.target/aarch64/sme/sibcall_2.c: Likewise.
	* gcc.target/aarch64/sme/sibcall_3.c: Likewise.
	* gcc.target/aarch64/sme/sibcall_4.c: Likewise.
	* gcc.target/aarch64/sme/sibcall_5.c: Likewise.
	* gcc.target/aarch64/sme/sibcall_6.c: Likewise.
	* gcc.target/aarch64/sme/sibcall_7.c: Likewise.
	* gcc.target/aarch64/sme/sibcall_8.c: Likewise.
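
For readers who want the tail-call rules from the description in one place, here is a minimal standalone sketch.  It is not the code in aarch64.cc: FnState, ok_for_sibcall and the flag values are hypothetical names invented for illustration, and the model assumes (as the aarch64_function_ok_for_sibcall hunk in this patch suggests) that a streaming-compatible type sets neither SM bit and that only the state on entry to the caller matters.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the PSTATE.SM and PSTATE.ZA flag bits.  The
// values are illustrative and are not GCC's real AARCH64_FL_* constants.
constexpr std::uint32_t SM_ON = 1u << 0;   // function type is streaming
constexpr std::uint32_t SM_OFF = 1u << 1;  // function type is non-streaming
constexpr std::uint32_t ZA_ON = 1u << 2;   // function type shares ZA state

// Hypothetical summary of a function type.  A streaming-compatible type
// sets neither SM bit; a private-ZA type leaves ZA as zero.
struct FnState
{
  std::uint32_t sm;  // required PSTATE.SM state on entry
  std::uint32_t za;  // required PSTATE.ZA state on entry
};

// Return true if CALLER may tail-call CALLEE under the rules described
// above: the callee must not require an SM state that the caller lacks
// on entry, and both functions must agree on PSTATE.ZA.
constexpr bool
ok_for_sibcall (FnState caller, FnState callee)
{
  return (callee.sm & ~caller.sm) == 0 && callee.za == caller.za;
}

constexpr FnState streaming_compatible = { 0, 0 };
constexpr FnState streaming = { SM_ON, 0 };
constexpr FnState non_streaming = { SM_OFF, 0 };
constexpr FnState shared_za = { SM_OFF, ZA_ON };
```

Under this model a streaming-compatible callee always passes the SM check, which corresponds to the "b sc_callee" expectations in the sibcall tests, while a streaming-compatible caller cannot tail-call a streaming or non-streaming callee, which corresponds to the "bl" expectations in sibcall_1.c.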
--- gcc/config/aarch64/aarch64.cc | 9 +++- .../gcc.target/aarch64/sme/sibcall_1.c | 45 +++++++++++++++++++ .../gcc.target/aarch64/sme/sibcall_2.c | 45 +++++++++++++++++++ .../gcc.target/aarch64/sme/sibcall_3.c | 45 +++++++++++++++++++ .../gcc.target/aarch64/sme/sibcall_4.c | 45 +++++++++++++++++++ .../gcc.target/aarch64/sme/sibcall_5.c | 45 +++++++++++++++++++ .../gcc.target/aarch64/sme/sibcall_6.c | 26 +++++++++++ .../gcc.target/aarch64/sme/sibcall_7.c | 26 +++++++++++ .../gcc.target/aarch64/sme/sibcall_8.c | 19 ++++++++ 9 files changed, 304 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_8.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 4639310f108..48b7811c100 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -6124,6 +6124,11 @@ aarch64_function_ok_for_sibcall (tree, tree exp) if (crtl->abi->id () != expr_callee_abi (exp).id ()) return false; + tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp))); + if (aarch64_fntype_pstate_sm (fntype) & ~aarch64_cfun_incoming_pstate_sm ()) + return false; + if (aarch64_fntype_pstate_za (fntype) != aarch64_cfun_incoming_pstate_za ()) + return false; return true; } @@ -9564,7 +9569,9 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, aarch64_isa_flags); aarch64_sme_mode_switch_regs return_switch; - if (crtl->return_rtx && REG_P (crtl->return_rtx)) + if (sibcall) + 
return_switch.add_call_args (sibcall); + else if (crtl->return_rtx && REG_P (crtl->return_rtx)) return_switch.add_reg (GET_MODE (crtl->return_rtx), REGNO (crtl->return_rtx)); return_switch.emit_prologue (); diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_1.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_1.c new file mode 100644 index 00000000000..c7530de5c37 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_1.c @@ -0,0 +1,45 @@ +/* { dg-options "-O2" } */ + +void sc_callee () [[arm::streaming_compatible]]; +void s_callee () [[arm::streaming]]; +void n_callee (); + +[[arm::locally_streaming]] __attribute__((noipa)) void +sc_ls_callee () [[arm::streaming_compatible]] {} +[[arm::locally_streaming]] __attribute__((noipa)) void +n_ls_callee () {} + +void +sc_to_sc () [[arm::streaming_compatible]] +{ + sc_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_callee} } } */ + +void +sc_to_s () [[arm::streaming_compatible]] +{ + s_callee (); +} +/* { dg-final { scan-assembler {\tbl\ts_callee} } } */ + +void +sc_to_n () [[arm::streaming_compatible]] +{ + n_callee (); +} +/* { dg-final { scan-assembler {\tbl\tn_callee} } } */ + +void +sc_to_sc_ls () [[arm::streaming_compatible]] +{ + sc_ls_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */ + +void +sc_to_n_ls () [[arm::streaming_compatible]] +{ + n_ls_callee (); +} +/* { dg-final { scan-assembler {\tbl\tn_ls_callee} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_2.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_2.c new file mode 100644 index 00000000000..8d1c8a9f901 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_2.c @@ -0,0 +1,45 @@ +/* { dg-options "-O2" } */ + +void sc_callee () [[arm::streaming_compatible]]; +void s_callee () [[arm::streaming]]; +void n_callee (); + +[[arm::locally_streaming]] __attribute__((noipa)) void +sc_ls_callee () [[arm::streaming_compatible]] {} +[[arm::locally_streaming]] __attribute__((noipa)) void +n_ls_callee 
() {} + +void +s_to_sc () [[arm::streaming]] +{ + sc_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_callee} } } */ + +void +s_to_s () [[arm::streaming]] +{ + s_callee (); +} +/* { dg-final { scan-assembler {\tb\ts_callee} } } */ + +void +s_to_n () [[arm::streaming]] +{ + n_callee (); +} +/* { dg-final { scan-assembler {\tbl\tn_callee} } } */ + +void +s_to_sc_ls () [[arm::streaming]] +{ + sc_ls_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */ + +void +s_to_n_ls () [[arm::streaming]] +{ + n_ls_callee (); +} +/* { dg-final { scan-assembler {\tbl\tn_ls_callee} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_3.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_3.c new file mode 100644 index 00000000000..2ae937fc5dc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_3.c @@ -0,0 +1,45 @@ +/* { dg-options "-O2" } */ + +void sc_callee () [[arm::streaming_compatible]]; +void s_callee () [[arm::streaming]]; +void n_callee (); + +[[arm::locally_streaming]] __attribute__((noipa)) void +sc_ls_callee () [[arm::streaming_compatible]] {} +[[arm::locally_streaming]] __attribute__((noipa)) void +n_ls_callee () {} + +void +n_to_sc () +{ + sc_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_callee} } } */ + +void +n_to_s () +{ + s_callee (); +} +/* { dg-final { scan-assembler {\tbl\ts_callee} } } */ + +void +n_to_n () +{ + n_callee (); +} +/* { dg-final { scan-assembler {\tb\tn_callee} } } */ + +void +n_to_sc_ls () +{ + sc_ls_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */ + +void +n_to_n_ls () +{ + n_ls_callee (); +} +/* { dg-final { scan-assembler {\tb\tn_ls_callee} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_4.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_4.c new file mode 100644 index 00000000000..6935a1bd740 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_4.c @@ -0,0 +1,45 @@ +/* { dg-options "-O2" } */ + +void sc_callee () 
[[arm::streaming_compatible]]; +void s_callee () [[arm::streaming]]; +void n_callee (); + +[[arm::locally_streaming]] __attribute__((noipa)) void +sc_ls_callee () [[arm::streaming_compatible]] {} +[[arm::locally_streaming]] __attribute__((noipa)) void +n_ls_callee () {} + +[[arm::locally_streaming]] void +sc_to_sc () [[arm::streaming_compatible]] +{ + sc_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_callee} } } */ + +[[arm::locally_streaming]] void +sc_to_s () [[arm::streaming_compatible]] +{ + s_callee (); +} +/* { dg-final { scan-assembler {\tbl\ts_callee} } } */ + +[[arm::locally_streaming]] void +sc_to_n () [[arm::streaming_compatible]] +{ + n_callee (); +} +/* { dg-final { scan-assembler {\tbl\tn_callee} } } */ + +[[arm::locally_streaming]] void +sc_to_sc_ls () [[arm::streaming_compatible]] +{ + sc_ls_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */ + +[[arm::locally_streaming]] void +sc_to_n_ls () [[arm::streaming_compatible]] +{ + n_ls_callee (); +} +/* { dg-final { scan-assembler {\tbl\tn_ls_callee} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_5.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_5.c new file mode 100644 index 00000000000..7aaf58dfa22 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_5.c @@ -0,0 +1,45 @@ +/* { dg-options "-O2" } */ + +void sc_callee () [[arm::streaming_compatible]]; +void s_callee () [[arm::streaming]]; +void n_callee (); + +[[arm::locally_streaming]] __attribute__((noipa)) void +sc_ls_callee () [[arm::streaming_compatible]] {} +[[arm::locally_streaming]] __attribute__((noipa)) void +n_ls_callee () {} + +[[arm::locally_streaming]] void +n_to_sc () +{ + sc_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_callee} } } */ + +[[arm::locally_streaming]] void +n_to_s () +{ + s_callee (); +} +/* { dg-final { scan-assembler {\tbl\ts_callee} } } */ + +[[arm::locally_streaming]] void +n_to_n () +{ + n_callee (); +} +/* { dg-final { scan-assembler {\tb\tn_callee} } 
} */ + +[[arm::locally_streaming]] void +n_to_sc_ls () +{ + sc_ls_callee (); +} +/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */ + +[[arm::locally_streaming]] void +n_to_n_ls () +{ + n_ls_callee (); +} +/* { dg-final { scan-assembler {\tb\tn_ls_callee} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_6.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_6.c new file mode 100644 index 00000000000..e568edb17dd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_6.c @@ -0,0 +1,26 @@ +/* { dg-options "-O2" } */ + +void shared_callee () [[arm::inout("za")]]; +[[arm::new("za")]] __attribute__((noipa)) void new_callee () {} +void normal_callee (); + +void +shared_to_shared () [[arm::inout("za")]] +{ + shared_callee (); +} +/* { dg-final { scan-assembler {\tb\tshared_callee} } } */ + +void +shared_to_new () [[arm::inout("za")]] +{ + new_callee (); +} +/* { dg-final { scan-assembler {\tbl\tnew_callee} } } */ + +void +shared_to_normal () [[arm::inout("za")]] +{ + normal_callee (); +} +/* { dg-final { scan-assembler {\tbl\tnormal_callee} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_7.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_7.c new file mode 100644 index 00000000000..a5f576d2044 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_7.c @@ -0,0 +1,26 @@ +/* { dg-options "-O2" } */ + +void shared_callee () [[arm::inout("za")]]; +[[arm::new("za")]] __attribute__((noipa)) void new_callee () {} +void normal_callee (); + +[[arm::new("za")]] void +new_to_shared () +{ + shared_callee (); +} +/* { dg-final { scan-assembler {\tbl\tshared_callee} } } */ + +[[arm::new("za")]] void +new_to_new () +{ + new_callee (); +} +/* { dg-final { scan-assembler {\tb\tnew_callee} } } */ + +[[arm::new("za")]] void +new_to_normal () +{ + normal_callee (); +} +/* { dg-final { scan-assembler {\tb\tnormal_callee} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_8.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_8.c 
new file mode 100644 index 00000000000..33370f7a87f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_8.c @@ -0,0 +1,19 @@ +/* { dg-options "-O2" } */ + +void shared_callee () [[arm::inout("za")]]; +[[arm::new("za")]] __attribute__((noipa)) void new_callee () {} +void normal_callee (); + +void +normal_to_new () +{ + new_callee (); +} +/* { dg-final { scan-assembler {\tb\tnew_callee} } } */ + +void +normal_to_normal () +{ + normal_callee (); +} +/* { dg-final { scan-assembler {\tb\tnormal_callee} } } */