From patchwork Fri Mar 22 02:47:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Law X-Patchwork-Id: 1914700 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=ventanamicro.com header.i=@ventanamicro.com header.a=rsa-sha256 header.s=google header.b=jjGiquIN; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4V16Fy35Dsz1yXt for ; Fri, 22 Mar 2024 13:48:20 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1FE543858C98 for ; Fri, 22 Mar 2024 02:48:17 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oa1-x31.google.com (mail-oa1-x31.google.com [IPv6:2001:4860:4864:20::31]) by sourceware.org (Postfix) with ESMTPS id BEB583858C35 for ; Fri, 22 Mar 2024 02:47:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BEB583858C35 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=ventanamicro.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=ventanamicro.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BEB583858C35 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:4860:4864:20::31 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1711075674; cv=none; b=HMM1TX/fKHFxPKyhB3SRs8OBYRaqpXpKZ2At12RTPvHIKru69KXQ+eaIA8GMoEu/c83F8lrtu870NTkzmN7fLtJUwcOn8G5xxc/Wda+q5nJXjhChU0fmRZ7RcthqR7YNY7oL+aN+F/r7URhQKv3hK9EyZN32f8nVFdFhOefRsIA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1711075674; c=relaxed/simple; bh=16NMQNE/FvXB6A4b1UNmNbA9rx0iB4mxwAqhz0dzsA0=; h=DKIM-Signature:Message-ID:Date:MIME-Version:From:To:Subject; b=jGAzRdoNcL+365J4QfIXllmTZcjto3nzmjdluMLyv9LwBsj8tQ5dUom3uUA5ZfIv/LXjYkKj3aF0hfSjoAQcNpJaBvSe3O7dmBkx0wRDd7I/sIRfokSmr5BGCHvTJrIUdP9xioZgxGOceREBWpoR/BjXHG17BhCzZiEuRshy+S4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oa1-x31.google.com with SMTP id 586e51a60fabf-22988692021so702379fac.1 for ; Thu, 21 Mar 2024 19:47:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; t=1711075671; x=1711680471; darn=gcc.gnu.org; h=subject:cc:to:content-language:from:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=7koGd8XmjC45O52o6Sx7fFi1AmTBecC1AILLLPxdAXQ=; b=jjGiquINGJgco8+26mFyUXHPev6DYJd7QGZ9CFlPf3UEsellbk1kTMo/+ZRypm7gwG OyF6o4mHowsRT144c/3MdCkbpf8p3McQHssJq+FQWWuLf9LNv4kwgce8u1LEC4W9nxJE DPGPZ+eGkZGkLgu/HgqPAjq1l3zNjdi4EVsu8SaJk+gUxuhY9QqrcsmZBUAV6cHVTdPX zMFfMVm0eG01T1Hz6Xdlc354MhxYlxN5tBliGwVIFhY/3r9pkrGJFV80HRG9E9o/nnwg v26HpNkwvG69oHnLRzmuqtscpTq6WxxhXqK0WBYzxPI3T2JoaM2iUj3eLWB3Ai/gZFtY mDuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711075671; x=1711680471; h=subject:cc:to:content-language:from:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=7koGd8XmjC45O52o6Sx7fFi1AmTBecC1AILLLPxdAXQ=; b=nOx/j9XwLMml+WhkyXm/za0r4q4EjSy1J+32tuFXPDzraEr7TKTpkkG/frjOEQldUE o3c/VMa5kvlYGMX2X0SkgTPgWrnOz/9+BZHwWUUr+2aI8EpUh+VIGkAePP7VDgWJ6Ob3 Bt7X9Z8WbXXFOfySie/SViIs6xtBBzFtU9awAiuSPfqUpfDiyntry00WQBhJ9MDeUfO5 Dwo/RnbU7mWxZ5GJ/KOurj02unntrR51luMfbnXCuCV1oWgNhlZmyk0YPDxAc4c/Ll0y aLhanv0y3eUpZcX/W4lO3jVhETF2PAn9a+LFsSR5jZ0fSgT0zQ8vihaII0dqEha9nwBT kdGA== X-Gm-Message-State: AOJu0YwlhzlG7BQdE6zSEmJDyz9k1OBSlkPWvGBBWCeWFvttZWUXgOLz qSi927IfqBTTeVjQemv0dzDNCUK8tvF4MVg40xhI16bYsjpo3GuF0zeSxv+B6iAzqv7aLhEFOp+ 0 X-Google-Smtp-Source: AGHT+IEyaAMLUIjgrHZl75iKhCqSXhW+yvjvT9KdoyO8m7u065iKaxDO/On8RHN3qWDNRkRu0f+Djw== X-Received: by 2002:a05:6870:c0d1:b0:221:7fc2:94df with SMTP id e17-20020a056870c0d100b002217fc294dfmr362037oad.3.1711075670948; Thu, 21 Mar 2024 19:47:50 -0700 (PDT) Received: from [172.31.0.109] ([136.36.72.243]) by smtp.gmail.com with ESMTPSA id g34-20020a05683030a200b006e5398eb1f9sm204213ots.8.2024.03.21.19.47.50 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 21 Mar 2024 19:47:50 -0700 (PDT) Message-ID: Date: Thu, 21 Mar 2024 20:47:49 -0600 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Beta From: Jeff Law Content-Language: en-US To: "gcc-patches@gcc.gnu.org" Cc: Florian Weimer , Raphael Zinsly Subject: [committed] Fix RISC-V missing stack tie X-Spam-Status: No, score=-10.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org As some of you know, Raphael has been working on stack-clash support for the RISC-V port. A little while ago Florian reached out to us with an issue where glibc was failing its smoke test due to referencing an unallocated stack slot. Without diving into the code in detail I (incorrectly) concluded it was a problem with the fallback of using Ada's stack-check paths due to not having stack-clash support. Once enough stack-clash bits were ready I had Raphael review the code generated for Florian's test and we concluded the the original case from Florian was just wrong irrespective of stack clash/stack check. While Raphael's stack-clash work will indirectly fix Florian's case, it really should also work without stack-clash. In particular this code was called out by valgrind: > 000000000003cb5e : > __GI___realpath(): > 3cb5e: 81010113 addi sp,sp,-2032 > 3cb62: 7d313423 sd s3,1992(sp) > 3cb66: 79fd lui s3,0xfffff > 3cb68: 7e813023 sd s0,2016(sp) > 3cb6c: 7c913c23 sd s1,2008(sp) > 3cb70: 7f010413 addi s0,sp,2032 > 3cb74: 35098793 addi a5,s3,848 # fffffffffffff350 <__libc_initial+0xffffffffffe8946a> > 3cb78: 74fd lui s1,0xfffff > 3cb7a: 008789b3 add s3,a5,s0 > 3cb7e: f9048793 addi a5,s1,-112 # ffffffffffffef90 <__libc_initial+0xffffffffffe890aa> > 3cb82: 008784b3 add s1,a5,s0 > 3cb86: 77fd lui a5,0xfffff > 3cb88: 7d413023 sd s4,1984(sp) > 3cb8c: 7b513c23 sd s5,1976(sp) > 3cb90: 7e113423 sd ra,2024(sp) > 3cb94: 7d213823 sd s2,2000(sp) > 3cb98: 7b613823 sd s6,1968(sp) > 3cb9c: 7b713423 sd s7,1960(sp) > 3cba0: 7b813023 sd s8,1952(sp) > 3cba4: 79913c23 sd s9,1944(sp) > 3cba8: 79a13823 sd s10,1936(sp) > 3cbac: 79b13423 sd s11,1928(sp) > 3cbb0: 34878793 addi a5,a5,840 # fffffffffffff348 <__libc_initial+0xffffffffffe89462> > 3cbb4: 40000713 li a4,1024 > 3cbb8: 00132a17 auipc s4,0x132 > 3cbbc: ae0a3a03 ld s4,-1312(s4) # 16e698 <__stack_chk_guard> > 3cbc0: 01098893 addi a7,s3,16 > 3cbc4: 42098693 addi a3,s3,1056 > 3cbc8: b8040a93 addi s5,s0,-1152 > 3cbcc: 97a2 add a5,a5,s0 > 3cbce: 000a3603 ld a2,0(s4) > 3cbd2: f8c43423 sd a2,-120(s0) > 3cbd6: 4601 li a2,0 > 3cbd8: 3d14b023 sd a7,960(s1) > 3cbdc: 3ce4b423 sd a4,968(s1) > 3cbe0: 7cd4b823 sd a3,2000(s1) > 3cbe4: 7ce4bc23 sd a4,2008(s1) > 3cbe8: b7543823 sd s5,-1168(s0) > 3cbec: b6e43c23 sd a4,-1160(s0) > 3cbf0: e38c sd a1,0(a5) > 3cbf2: b0010113 addi sp,sp,-1280 In particular note the store at 0x3cbd8. That's hitting (s1 + 960). If you chase the values around, you'll find it's a bit more than 1k into unallocated stack space. It's also worth noting the final stack adjustment at 0x3cbf2. While I haven't reproduced Florian's code exactly, I was able to get reasonably close and verify my suspicion that everything was fine before sched2 and incorrect after sched2. It was also obvious at that point what had gone wrong -- we were missing a stack tie after the final stack pointer adjustment. This patch adds the missing stack tie. While not technically a regression, I shudder at the thought of chasing one of these issues down again in the wild. Been there, done that. Regression tested on rv64gc. Verified the scheduler no longer mucked up realpath by hand. Pushing to the trunk. Jeff commit c65046ff2ef0a9a46e59bc0b3369b2d226f6a239 Author: Jeff Law Date: Thu Mar 21 20:41:59 2024 -0600 [committed] Fix RISC-V missing stack tie As some of you know, Raphael has been working on stack-clash support for the RISC-V port. A little while ago Florian reached out to us with an issue where glibc was failing its smoke test due to referencing an unallocated stack slot. Without diving into the code in detail I (incorrectly) concluded it was a problem with the fallback of using Ada's stack-check paths due to not having stack-clash support. Once enough stack-clash bits were ready I had Raphael review the code generated for Florian's test and we concluded the the original case from Florian was just wrong irrespective of stack clash/stack check. While Raphael's stack-clash work will indirectly fix Florian's case, it really should also work without stack-clash. In particular this code was called out by valgrind: > 000000000003cb5e : > __GI___realpath(): > 3cb5e: 81010113 addi sp,sp,-2032 > 3cb62: 7d313423 sd s3,1992(sp) > 3cb66: 79fd lui s3,0xfffff > 3cb68: 7e813023 sd s0,2016(sp) > 3cb6c: 7c913c23 sd s1,2008(sp) > 3cb70: 7f010413 addi s0,sp,2032 > 3cb74: 35098793 addi a5,s3,848 # fffffffffffff350 <__libc_initial+0xffffffffffe8946a> > 3cb78: 74fd lui s1,0xfffff > 3cb7a: 008789b3 add s3,a5,s0 > 3cb7e: f9048793 addi a5,s1,-112 # ffffffffffffef90 <__libc_initial+0xffffffffffe890aa> > 3cb82: 008784b3 add s1,a5,s0 > 3cb86: 77fd lui a5,0xfffff > 3cb88: 7d413023 sd s4,1984(sp) > 3cb8c: 7b513c23 sd s5,1976(sp) > 3cb90: 7e113423 sd ra,2024(sp) > 3cb94: 7d213823 sd s2,2000(sp) > 3cb98: 7b613823 sd s6,1968(sp) > 3cb9c: 7b713423 sd s7,1960(sp) > 3cba0: 7b813023 sd s8,1952(sp) > 3cba4: 79913c23 sd s9,1944(sp) > 3cba8: 79a13823 sd s10,1936(sp) > 3cbac: 79b13423 sd s11,1928(sp) > 3cbb0: 34878793 addi a5,a5,840 # fffffffffffff348 <__libc_initial+0xffffffffffe89462> > 3cbb4: 40000713 li a4,1024 > 3cbb8: 00132a17 auipc s4,0x132 > 3cbbc: ae0a3a03 ld s4,-1312(s4) # 16e698 <__stack_chk_guard> > 3cbc0: 01098893 addi a7,s3,16 > 3cbc4: 42098693 addi a3,s3,1056 > 3cbc8: b8040a93 addi s5,s0,-1152 > 3cbcc: 97a2 add a5,a5,s0 > 3cbce: 000a3603 ld a2,0(s4) > 3cbd2: f8c43423 sd a2,-120(s0) > 3cbd6: 4601 li a2,0 > 3cbd8: 3d14b023 sd a7,960(s1) > 3cbdc: 3ce4b423 sd a4,968(s1) > 3cbe0: 7cd4b823 sd a3,2000(s1) > 3cbe4: 7ce4bc23 sd a4,2008(s1) > 3cbe8: b7543823 sd s5,-1168(s0) > 3cbec: b6e43c23 sd a4,-1160(s0) > 3cbf0: e38c sd a1,0(a5) > 3cbf2: b0010113 addi sp,sp,-1280 In particular note the store at 0x3cbd8. That's hitting (s1 + 960). If you chase the values around, you'll find it's a bit more than 1k into unallocated stack space. It's also worth noting the final stack adjustment at 0x3cbf2. While I haven't reproduced Florian's code exactly, I was able to get reasonably close and verify my suspicion that everything was fine before sched2 and incorrect after sched2. It was also obvious at that point what had gone wrong -- we were missing a stack tie after the final stack pointer adjustment. This patch adds the missing stack tie. While not technically a regression, I shudder at the thought of chasing one of these issues down again in the wild. Been there, done that. Regression tested on rv64gc. Verified the scheduler no longer mucked up realpath by hand. Pushing to the trunk. gcc/ * config/riscv/riscv.cc (riscv_expand_prologue): Add missing stack tie for scalable and final stack adjustment if needed. Co-authored-by: Raphael Zinsly diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 97350b8305e..1358c243898 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -7365,7 +7365,15 @@ riscv_expand_prologue (void) /* Second step for constant frame. */ HOST_WIDE_INT constant_frame = remaining_size.to_constant (); if (constant_frame == 0) - return; + { + /* We must have allocated stack space for the scalable frame. + Emit a stack tie if we have a frame pointer so that the + allocation is ordered WRT fp setup and subsequent writes + into the frame. */ + if (frame_pointer_needed) + riscv_emit_stack_tie (); + return; + } if (SMALL_OPERAND (-constant_frame)) { @@ -7385,6 +7393,13 @@ riscv_expand_prologue (void) insn = gen_rtx_SET (stack_pointer_rtx, insn); riscv_set_frame_expr (insn); } + + /* We must have allocated the remainder of the stack frame. + Emit a stack tie if we have a frame pointer so that the + allocation is ordered WRT fp setup and subsequent writes + into the frame. */ + if (frame_pointer_needed) + riscv_emit_stack_tie (); } }