From patchwork Fri Apr 12 16:44:10 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1923211
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: iain@sandoe.co.uk
Subject: [pushed] aarch64: Avoid using mismatched ZERO ZA sizes
Date: Fri, 12 Apr 2024 17:44:10 +0100

The svzero_mask_za intrinsic tried to use the shortest combination
of .b, .h, .s and .d tiles, allowing mixtures of sizes where necessary.
However, Iain S pointed out that LLVM instead requires the tiles to
have the same suffix.
GAS supports both versions, so this patch generates the LLVM-friendly
form.

Tested on aarch64-linux-gnu & pushed. Please revert the patch
if it causes any problems.

Richard


gcc/
	* config/aarch64/aarch64.cc (aarch64_output_sme_zero_za): Require
	all tiles to have the same suffix.

gcc/testsuite/
	* gcc.target/aarch64/sme/acle-asm/zero_mask_za.c (zero_mask_za_ab)
	(zero_mask_za_d7, zero_mask_za_bf): Expect a list of .d tiles instead
	of a mixture.
---
 gcc/config/aarch64/aarch64.cc                 | 20 +++++++++++--------
 .../aarch64/sme/acle-asm/zero_mask_za.c       |  6 +++---
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index a2e3d208d76..1beec94629d 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -13210,29 +13210,33 @@ aarch64_output_sme_zero_za (rtx mask)
   /* The last entry in the list has the form "za7.d }", but that's the
      same length as "za7.d, ". */
   static char buffer[sizeof("zero\t{ ") + sizeof ("za7.d, ") * 8 + 1];
-  unsigned int i = 0;
-  i += snprintf (buffer + i, sizeof (buffer) - i, "zero\t");
-  const char *prefix = "{ ";
   for (auto &tile : tiles)
     {
       unsigned int tile_mask = tile.mask;
       unsigned int tile_index = 0;
+      unsigned int i = snprintf (buffer, sizeof (buffer), "zero\t");
+      const char *prefix = "{ ";
+      auto remaining_mask = mask_val;
       while (tile_mask < 0x100)
         {
-          if ((mask_val & tile_mask) == tile_mask)
+          if ((remaining_mask & tile_mask) == tile_mask)
             {
               i += snprintf (buffer + i, sizeof (buffer) - i, "%sza%d.%c",
                              prefix, tile_index, tile.letter);
               prefix = ", ";
-              mask_val &= ~tile_mask;
+              remaining_mask &= ~tile_mask;
             }
           tile_mask <<= 1;
           tile_index += 1;
         }
+      if (remaining_mask == 0)
+        {
+          gcc_assert (i + 3 <= sizeof (buffer));
+          snprintf (buffer + i, sizeof (buffer) - i, " }");
+          return buffer;
+        }
     }
-  gcc_assert (mask_val == 0 && i + 3 <= sizeof (buffer));
-  snprintf (buffer + i, sizeof (buffer) - i, " }");
-  return buffer;
+  gcc_unreachable ();
 }
 
 /* Return size in bits of an arithmetic operand which is shifted/scaled and
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c
index 9ce7331ebdd..2ba8f8cc332 100644
--- a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c
+++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c
@@ -103,21 +103,21 @@ PROTO (zero_mask_za_aa, void, ()) { svzero_mask_za (0xaa); }
 
 /*
 ** zero_mask_za_ab:
-** zero { za1\.h, za0\.d }
+** zero { za0\.d, za1\.d, za3\.d, za5\.d, za7\.d }
 ** ret
 */
 PROTO (zero_mask_za_ab, void, ()) { svzero_mask_za (0xab); }
 
 /*
 ** zero_mask_za_d7:
-** zero { za0\.h, za1\.d, za7\.d }
+** zero { za0\.d, za1\.d, za2\.d, za4\.d, za6\.d, za7\.d }
 ** ret
 */
 PROTO (zero_mask_za_d7, void, ()) { svzero_mask_za (0xd7); }
 
 /*
 ** zero_mask_za_bf:
-** zero { za1\.h, za0\.s, za2\.d }
+** zero { za0\.d, za1\.d, za2\.d, za3\.d, za4\.d, za5\.d, za7\.d }
 ** ret
 */
 PROTO (zero_mask_za_bf, void, ()) { svzero_mask_za (0xbf); }
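For readers who want to try the new selection rule outside the compiler, here is a minimal standalone sketch of it. It is not the GCC code. The name zero_za_same_suffix, the use of std::string and the handling of an empty mask are assumptions made for illustration. The tile table is also an assumption, since the hunk above does not show the `tiles` array. It is written from the SME tile aliasing, in which bit N of the mask stands for zaN.d. The function tries each suffix from widest to narrowest, restarts from the full mask each time, and returns the first list that covers the mask exactly.

```cpp
#include <cassert>
#include <string>

// Assumed tile table (not shown in the patch hunk): each mask lists the
// 64-bit tiles covered by tile 0 of that element size.
struct TileKind { unsigned mask; char letter; };
static constexpr TileKind kTiles[] = {
  { 0xff, 'b' },  // za0.b covers za0.d to za7.d
  { 0x55, 'h' },  // za0.h covers za0.d, za2.d, za4.d, za6.d
  { 0x11, 's' },  // za0.s covers za0.d, za4.d
  { 0x01, 'd' },  // za0.d covers only itself
};

// Hypothetical helper: build the ZERO operand list for an 8-bit mask
// using a single suffix, preferring the widest suffix that fits exactly.
std::string zero_za_same_suffix (unsigned mask)
{
  assert (mask <= 0xff);
  if (mask == 0)
    return "zero\t{}";  // sketch-only choice for the empty mask

  for (const TileKind &tile : kTiles)
    {
      std::string out = "zero\t";
      const char *prefix = "{ ";
      unsigned remaining = mask;   // restart from the full mask per suffix
      unsigned tile_mask = tile.mask;
      unsigned index = 0;
      // Shifting left by one bit moves from tile N to tile N + 1.
      while (tile_mask < 0x100)
        {
          if ((remaining & tile_mask) == tile_mask)
            {
              out += prefix;
              out += "za" + std::to_string (index) + "." + tile.letter;
              prefix = ", ";
              remaining &= ~tile_mask;
            }
          tile_mask <<= 1;
          ++index;
        }
      if (remaining == 0)
        return out + " }";
    }
  // The .d pass covers every nonzero 8-bit mask, so this is not reached.
  assert (false);
  return {};
}
```

With this sketch, a mask of 0xaa still yields the single tile za1.h, while 0xab falls through to the .d pass because no list of .h or .s tiles covers it exactly. That matches the updated expectations in zero_mask_za.c.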