[{"id":3678967,"web_url":"http://patchwork.ozlabs.org/comment/3678967/","msgid":"<87y0ikqvta.fsf@googlemail.com>","list_archive_url":null,"date":"2026-04-18T12:47:45","subject":"Re: [PATCH 3/7] aarch64: Fix ZA state transition [PR119210]","submitter":{"id":4363,"url":"http://patchwork.ozlabs.org/api/people/4363/","name":"Richard Sandiford","email":"rdsandiford@googlemail.com"},"content":"Alice Carlotti <alice.carlotti@arm.com> writes:\n> In the INACTIVE_CALLER -> INACTIVE LOCAL transition, ensure ZA is active\n> and zeroed before setting tpidr2_el0.\n>\n> gcc/ChangeLog:\n>\n> \tPR target/119210\n> \t* config/aarch64/aarch64.cc (aarch64_mode_emit_local_sme_state):\n> \tAdd PSTATE.ZA enablement, and zero it if already enabled.\n>\n> gcc/testsuite/ChangeLog:\n>\n> \tPR target/119210\n> \t* gcc.target/aarch64/sme/za_state_8.c: New test.\n>\n>\n> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc\n> index 7b239554daebfbfa145c3a3c7034fa2d8dd7b4e0..c91fff3ef6bab8a25391a8a0737fd8c7b0615bd5 100644\n> --- a/gcc/config/aarch64/aarch64.cc\n> +++ b/gcc/config/aarch64/aarch64.cc\n> @@ -31963,7 +31963,8 @@ aarch64_mode_emit_local_sme_state (aarch64_local_sme_state mode,\n>        emit_insn (gen_aarch64_tpidr2_save ());\n>        emit_insn (gen_aarch64_clear_tpidr2 ());\n>        if (mode == aarch64_local_sme_state::ACTIVE_LIVE\n> -\t  || mode == aarch64_local_sme_state::ACTIVE_DEAD)\n> +\t  || mode == aarch64_local_sme_state::ACTIVE_DEAD\n> +\t  || mode == aarch64_local_sme_state::INACTIVE_LOCAL)\n>  \t{\n>  \t  if (aarch64_cfun_has_state (\"za\"))\n>  \t    emit_insn (gen_aarch64_initial_zero_za ());\n> @@ -32045,6 +32046,11 @@ aarch64_mode_emit_local_sme_state (aarch64_local_sme_state mode,\n>  \n>    if (mode == aarch64_local_sme_state::INACTIVE_LOCAL)\n>      {\n> +      /* Enabling ZA is more efficient than forcing later code to restore from\n> +\t a zeroed lazy save buffer.  */\n\nI suppose this is contrasting with leaving TPIDR2_EL0 null and inserting\ncode to clear the buffer explicitly, is that right?  If so, then SMSTART ZA\nis not necessarily more efficient in cases where a caller does commit the\nlazy save, but that isn't the case that we optimise for.  So how about\na comment like:\n\n      /* Enable ZA, if it wasn't already enabled on entry.  Enabling ZA has\n         the side-effect of zeroing ZA.\n\n         A functionally correct alternative would be to leave TPIDR2_EL0\n         null and zero the save buffer.  However, zeroing the buffer would\n         require more code and would optimize for the case in which a caller\n         commits a lazy save (which is supposed to be a rare event).  */\n\n(Feel free to tweak the wording.)\n\nGuess this is personal preference, but I think it'd be clearer to put\nthe comment between the \"if\" and the \"emit_insn\", as for surrounding code,\nso that the precondition comes first and the consequence (described by\nthe comments) comes second.\n\nOK with those changes, thanks.\n\nRichard\n\n> +      if (prev_mode == aarch64_local_sme_state::INACTIVE_CALLER)\n> +\temit_insn (gen_aarch64_smstart_za ());\n> +\n>        if (prev_mode == aarch64_local_sme_state::ACTIVE_LIVE\n>  \t  || prev_mode == aarch64_local_sme_state::ACTIVE_DEAD\n>  \t  || prev_mode == aarch64_local_sme_state::INACTIVE_CALLER)\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_8.c\n> b/gcc/testsuite/gcc.target/aarch64/sme/za_state_8.c\n> new file mode 100644\n> index\n> 0000000000000000000000000000000000000000..9b7a6ffa69cbce1969d40f9d69a76522c0e439c5\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_8.c\n> @@ -0,0 +1,25 @@\n> +// { dg-options \"-O -fomit-frame-pointer -fno-optimize-sibling-calls\" }\n> +// { dg-final { check-function-bodies \"**\" \"\" } }\n> +\n> +#include <arm_sme.h>\n> +\n> +void callee_ns();\n> +__arm_streaming __arm_inout(\"za\") void callee_s();\n> +\n> +/*\n> +** foo:\n> +**\t...\n> +**\tsmstart\tza\n> +**\t...\n> +**\tmsr\ttpidr2_el0, x\\d+\n> +**\t...\n> +*/\n> +__arm_locally_streaming __arm_new(\"za\") const float * foo(const float* x) {\n> +    callee_ns ();\n> +    const float32_t *x_f_in = x;\n> +    svzero_za();\n> +    callee_s ();\n> +    return x_f_in;\n> +}\n> +\n> +","headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=googlemail.com header.i=@googlemail.com\n header.a=rsa-sha256 header.s=20251104 header.b=IkjwYwD0;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=38.145.34.32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (2048-bit key,\n unprotected) header.d=googlemail.com header.i=@googlemail.com\n header.a=rsa-sha256 header.s=20251104 header.b=IkjwYwD0","sourceware.org; dmarc=pass (p=quarantine dis=none)\n header.from=googlemail.com","sourceware.org; spf=pass smtp.mailfrom=googlemail.com","server2.sourceware.org;\n arc=none smtp.remote-ip=209.85.128.42"],"Received":["from vm01.sourceware.org (vm01.sourceware.org [38.145.34.32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fyWl55ZYmz1yDF\n\tfor <incoming@patchwork.ozlabs.org>; Sat, 18 Apr 2026 22:48:32 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id F0DB84CCCA1D\n\tfor <incoming@patchwork.ozlabs.org>; Sat, 18 Apr 2026 12:48:29 +0000 (GMT)","from mail-wm1-f42.google.com (mail-wm1-f42.google.com\n [209.85.128.42])\n by sourceware.org (Postfix) with ESMTPS id 56B394B920A7\n for <gcc-patches@gcc.gnu.org>; Sat, 18 Apr 2026 12:47:53 +0000 (GMT)","by mail-wm1-f42.google.com with SMTP id\n 5b1f17b1804b1-488b3f8fa2bso22625215e9.1\n for <gcc-patches@gcc.gnu.org>; Sat, 18 Apr 2026 05:47:53 -0700 (PDT)","from localhost ([103.214.45.63])\n by smtp.googlemail.com with ESMTPSA id\n ffacd0b85a97d-43fe4e4d6casm11834632f8f.32.2026.04.18.05.47.50\n (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n Sat, 18 Apr 2026 05:47:51 -0700 (PDT)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org F0DB84CCCA1D","OpenDKIM Filter v2.11.0 sourceware.org 56B394B920A7"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 56B394B920A7","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org 56B394B920A7","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1776516473; cv=none;\n b=bSHY5JWsRyldRtohQ/OP9bP0uxghBXEEgmcpQHYiW7IZV0IYlZrRLrMH+/pBoXdifQ7cplFWR1mxfbAMqMkWATPSGAA4AdRGgI/A/SJTl2LiU8vdHjCFYEj6vp9BgTVV/zMfnWY+rS5D3/uyXzp9Q/XlUgli60xlnokZgYt38Xw=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1776516473; c=relaxed/simple;\n bh=lmCMYQchmL/dkivvXowdErKeSm2uK5AbhZP1nNFLtcU=;\n h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version;\n b=wropcKPURWGP/Xj9rAcFGwEG/NCFikiqtVVeVGvbD65BrAtGVBSErTohPjAR/1LnicBlSr/4wujQRDTqkGa6v4iu0SHPYn4vZNJPyqe99lBGthMShfCJqbpMKytxZig2U168CZo0uxO3dAZ0kTjPiMQgKSAsf0BGDoqkNe8uJHA=","ARC-Authentication-Results":"i=1; server2.sourceware.org","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=googlemail.com; s=20251104; t=1776516472; x=1777121272; darn=gcc.gnu.org;\n h=mime-version:user-agent:message-id:date:references:in-reply-to\n :subject:cc:mail-followup-to:to:from:from:to:cc:subject:date\n :message-id:reply-to;\n bh=wEW1gZyAfpFQNf4/wJsDvcFvAjr2wpMz1iTwYMQgSm4=;\n b=IkjwYwD0ELlaphE5Bk5a9CNOaGCCAPiSsZALKD1W6vOrHjZxsrYe5ixp2DDHNfsoyO\n quViGFGQkJf8Zk0LIfbIr4ThwA734egXcz0beKD50uyExgL/99t+87odsehqaiH/MwMN\n X1N2LNkzY3YP3n0FAD/P+7JEELWVe7K8t8Rz18RotUo+mmsdkaLecLwIItrYovF1+dGS\n Qw0HVfqLQh8k34PyDRuGv34Qs1zmKa4qr7jkzd2P8oWs8gO/QFohzgO5F6fZq9/lGkOS\n 9AVtz1zbcZNx8UTYwg63FcLqHd/yd4tPjKyOdY/E1gXXzyF+1Pl7oIk9IP/41MOfCvlS\n Kfsw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20251104; t=1776516472; x=1777121272;\n h=mime-version:user-agent:message-id:date:references:in-reply-to\n :subject:cc:mail-followup-to:to:from:x-gm-gg:x-gm-message-state:from\n :to:cc:subject:date:message-id:reply-to;\n bh=wEW1gZyAfpFQNf4/wJsDvcFvAjr2wpMz1iTwYMQgSm4=;\n b=o8veWyCCJQC2YdrkB0OERqrDc7UZ1fenQPyorpB+aJhcqL8G0u/nJjqCXEF00pAmUt\n 2ZmWO2Lc5wAgKggRVNzI/UkTBJyP7wmFBjSbtWxNKM5rMWdfUMUBHftS4/nLiqQn/Ax4\n PQysmC6Jiet/wchHhNidXq/9XnFdlLaxeK5JT4IgOVd3xjOG0Q+D8wNlGccu5zXuXhVX\n Rhw7b0RAV7+wsibirkcft9VfXNbRpAdwtWVvR1GKzc106KJ1dEc0UNlQuGxdC+ZYjtnw\n NIqtqKDNdV16iMipN1bk6sVcvDpT5/0e+6KMqDuUUeSNh+SD5USAnHjf5Xumloi7aNyi\n KCXg==","X-Gm-Message-State":"AOJu0YxVln0opiq2mMRik/DkIsJEDiC0RiCOW+ESBmxrRHnj/4i/4U+0\n Bmp3BhmvDMPHSeucthi1jweuhIMJco+fJJLdBRNu2eIYK9jUBq1L4GS8","X-Gm-Gg":"AeBDiesLnLaCgCUfBU9GHZAmPgXJp6HJLqYDfZK0Dc0+4zNbP7lQVEZzX7MwtQ2dsSq\n n1bzmEZ1CCtZNuh+EHYCE7iycT4xyGPAs3Cp5GwSFvJ28Uk9+62bB2AtwsQP/INL1TZv52dmNfj\n kqJYcdE2CxdI2KvqifK/jVqH9MI6/ojl7uU7e5IRHeeZPS9S/1BomjT1nOVW+TzL6Ve5vDPKWaX\n h3PA0M/A0wF+AF5bDdNFMBp4+3RkcmX09Wt/eBe8jEw9pIDhPx4KUC2MoxZ6RR13TtsHmryUX7w\n 1f9pdBPMj8BsEypVD9PmMhHBIy9JtEiJDHdAeYREQjaqY9JYdXp5pT7MK4ajwJXY31QPBJTcA1M\n 9MWpsQP2S+RgpbBaxAjgzDMrm5jIneAjLSFwz7j80v3aqFBShdWshHpyldSO7Hl7M5oiB+PnCZt\n xZddmw/KzK+QmZAwXendPqy4twYEbBl+5WemtCN4w=","X-Received":"by 2002:a05:600c:32af:b0:488:c21a:4754 with SMTP id\n 5b1f17b1804b1-488fb8bced0mr66776675e9.18.1776516471828;\n Sat, 18 Apr 2026 05:47:51 -0700 (PDT)","From":"Richard Sandiford <rdsandiford@googlemail.com>","To":"Alice Carlotti <alice.carlotti@arm.com>","Mail-Followup-To":"Alice Carlotti\n <alice.carlotti@arm.com>,gcc-patches@gcc.gnu.org,  Richard Earnshaw\n <richard.earnshaw@arm.com>,  Tamar Christina <tamar.christina@arm.com>,\n Kyrylo Tkachov <ktkachov@nvidia.com>,  Alex Coplan <alex.coplan@arm.com>,\n Andrew Pinski <andrew.pinski@oss.qualcomm.com>,  Wilco Dijkstra\n <wilco.dijkstra@arm.com>, rdsandiford@googlemail.com","Cc":"gcc-patches@gcc.gnu.org,  Richard Earnshaw <richard.earnshaw@arm.com>,\n Tamar Christina <tamar.christina@arm.com>,  Kyrylo Tkachov\n <ktkachov@nvidia.com>,  Alex Coplan <alex.coplan@arm.com>,  Andrew Pinski\n <andrew.pinski@oss.qualcomm.com>,  Wilco Dijkstra <wilco.dijkstra@arm.com>","Subject":"Re: [PATCH 3/7] aarch64: Fix ZA state transition [PR119210]","In-Reply-To":"<4f556941-ed31-aab8-f455-b2e5239d0a64@e124511.cambridge.arm.com>\n (Alice Carlotti's message of \"Sat, 18 Apr 2026 01:29:38 +0100\")","References":"<500b3dee-1ffe-5d08-2308-5bf06d38650c@e124511.cambridge.arm.com>\n <4f556941-ed31-aab8-f455-b2e5239d0a64@e124511.cambridge.arm.com>","Date":"Sat, 18 Apr 2026 13:47:45 +0100","Message-ID":"<87y0ikqvta.fsf@googlemail.com>","User-Agent":"Gnus/5.13 (Gnus v5.13)","MIME-Version":"1.0","Content-Type":"text/plain","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"}}]