From patchwork Mon Apr 26 12:45:52 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?=
X-Patchwork-Id: 1470286
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=qg7yILJ/; dkim-atps=neutral
Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4FTPm63Qs8z9t1C for ; Mon, 26 Apr 2021 22:46:46 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 61401395042E; Mon, 26 Apr 2021 12:46:12 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 61401395042E
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1619441172; bh=KgLa9akMy9j+F2Ib6pKBnnoQQ02yIVma+snIbqemRO0=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=qg7yILJ/KVD57aD0ZRG7V0fL+H3P2dPiFlRWdgKWhZLvBnj4VCBXhelYtuRpZaDfj bsi3yNgGV3am6elaFeoeZG4wU5/SdcJSoCvVPaDfPpAidPgl2x2VdH70i1m/Sz8+Wf v8WC+uDM03jLYjtaSCavhaLCEZanRUC9hvo2Wrc8=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail-ej1-f51.google.com (mail-ej1-f51.google.com [209.85.218.51]) by sourceware.org (Postfix) with ESMTPS id E9F8E3950C71; Mon, 26 Apr 2021 12:46:06 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org E9F8E3950C71
Received: by mail-ej1-f51.google.com with SMTP id u21so84258694ejo.13; Mon, 26 Apr 2021 05:46:06 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KgLa9akMy9j+F2Ib6pKBnnoQQ02yIVma+snIbqemRO0=; b=WgD3RaZIq1Hyp9AKfDmG/OnwRIYSQgaQcC/+ic+2p7y13iXN7Xjucdp1Mpy0qqpUx1 plmkS8Osn51FQvLJdQDshbhUNQjzot32tSlSoJo7wb55+b3mYMo383QEBjfg/0IP/ZV+ JCtoJ5UpjumDAyhW1qjqxRDmYAGs3xRtJ2XlcUhBI+I7u2RWDTEzJKy6VVilBDPMfNX7 QQXVOjjQK4ldcsr3GB4lEaRpOlecc6AAyZb0+zJX64UUPpK4OG35yPPJ/ZeNMtpAl4vc wwcTUPAqgNoAUUy0kAzzQ1ysyAlx5+aULzj5DLlNcLEU7mbyerJTeQabuHcj/wpiAhHj wlEA==
X-Gm-Message-State: AOAM532AdF5KWgm6Y5L4g291V7iCMYHwPE3Nsc0hxpQoawRmETgVxE97 eSCBbw53MiC3JBM8NBJU/K1ovs6fbYX28w==
X-Google-Smtp-Source: ABdhPJzi3qOgna2CgWF3SP+ohKR/R0YOHdKpOTrmGqkBZ3y48VxU5Ony504p9nwun9jcM8efWg0+eA==
X-Received: by 2002:a17:907:76cb:: with SMTP id kf11mr18632298ejc.472.1619441165731; Mon, 26 Apr 2021 05:46:05 -0700 (PDT)
Received: from beast.fritz.box (62-178-178-158.cable.dynamic.surfer.at.
 [62.178.178.158]) by smtp.gmail.com with ESMTPSA id o20sm14126755eds.65.2021.04.26.05.46.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 26 Apr 2021 05:46:05 -0700 (PDT)
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 10/10] RISC-V: Provide programmatic implementation of CAS [PR 100266]
Date: Mon, 26 Apr 2021 14:45:52 +0200
Message-Id: <20210426124552.3316789-11-cmuellner@gcc.gnu.org>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210426124552.3316789-1-cmuellner@gcc.gnu.org>
References: <20210426124552.3316789-1-cmuellner@gcc.gnu.org>
MIME-Version: 1.0
X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Christoph Muellner via Gcc-patches
From: =?utf-8?q?Christoph_M=C3=BCllner?=
Reply-To: Christoph Muellner
Cc: Kito Cheng
Errors-To: gcc-patches-bounces@gcc.gnu.org
Sender: "Gcc-patches"

The existing CAS implementation uses an INSN definition that provides
the core LR/SC sequence. In addition, there is follow-up code that
evaluates the result and calculates the return values.
This has two drawbacks: a) extending it to sub-word CAS is not
possible (and even if it were, the result would be unmaintainable),
and b) the implementation is hard to maintain and improve.

This patch provides a programmatic implementation of CAS, as many
other architectures already have.

The implementation supports both RV32 and RV64.
Additionally, the implementation does not introduce data dependencies
for the computation of the return value. Instead, we set the return
value (the success state of the CAS operation) based on structural
information, i.e. on which path through the sequence was taken.
This approach is also shown in the RISC-V unpriv spec (as part of the
sample code for a compare-and-swap function using LR/SC).
The cost of this implementation is a single additional LI instruction,
which is not actually required in the success case (its result is
overwritten later on the success path).

The resulting sequence requires 9 instructions in the success case.
The previous implementation required 11 instructions in the success
case (including a taken branch) and had a "subw;seqz;beqz" sequence
with direct dependencies.

Below is the generated code of a 32-bit CAS sequence with the old
implementation and the new implementation (ignore the ANDIs below).

Old:
     f00:  419c      lw    a5,0(a1)
     f02:  1005272f  lr.w  a4,(a0)
     f06:  00f71563  bne   a4,a5,f10
     f0a:  18c526af  sc.w  a3,a2,(a0)
     f0e:  faf5      bnez  a3,f02
     f10:  40f707bb  subw  a5,a4,a5
     f14:  0017b513  seqz  a0,a5
     f18:  c391      beqz  a5,f1c
     f1a:  c198      sw    a4,0(a1)
     f1c:  8905      andi  a0,a0,1
     f1e:  8082      ret

New:
     e28:  4194      lw    a3,0(a1)
     e2a:  4701      li    a4,0
     e2c:  1005282f  lr.w  a6,(a0)
     e30:  00d81963  bne   a6,a3,e42
     e34:  18c527af  sc.w  a5,a2,(a0)
     e38:  fbf5      bnez  a5,e2c
     e3a:  4705      li    a4,1
     e3c:  00177513  andi  a0,a4,1
     e40:  8082      ret
     e42:  0105a023  sw    a6,0(a1)
     e46:  00177513  andi  a0,a4,1
     e4a:  8082      ret

    gcc/
        PR 100266
        * config/riscv/riscv-protos.h (riscv_expand_compare_and_swap): New.
        * config/riscv/riscv.c (riscv_emit_unlikely_jump): New.
        * config/riscv/riscv.c (riscv_expand_compare_and_swap): New.
        * config/riscv/sync.md (atomic_cas_value_strong): Removed.
        * config/riscv/sync.md (atomic_compare_and_swap): Call
        riscv_expand_compare_and_swap.
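
For readers less familiar with LR/SC, the control flow of the new
sequence can be modelled in a few lines of C++. This is only a sketch
and not code from the patch: ReservationModel, load_reserved,
store_conditional, cas_model and check_cas_model are hypothetical names
made up for illustration, and the reservation is simulated
single-threaded, with no real atomicity. It is meant to show how the
success flag is decided by the path taken through the loop instead of
being computed from the loaded value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-threaded stand-in for LR.W / SC.W (illustration only).
struct ReservationModel {
    const int32_t *reserved = nullptr;
    int forced_failures = 0;  // number of SC attempts that are made to fail

    int32_t load_reserved(const int32_t *addr) {
        reserved = addr;
        return *addr;
    }

    // Like SC.W: returns 0 on success and non-zero on failure.
    int store_conditional(int32_t *addr, int32_t value) {
        const bool ok = (reserved == addr) && (forced_failures == 0);
        reserved = nullptr;
        if (!ok) {
            if (forced_failures > 0)
                --forced_failures;
            return 1;
        }
        *addr = value;
        return 0;
    }
};

// Mirrors the shape of the "New" listing: the flag starts as false and is
// only set to true on the fall-through path after a successful SC.
bool cas_model(ReservationModel &m, int32_t *mem, int32_t *expected,
               int32_t desired) {
    const int32_t exp = *expected;            // lw   a3,0(a1)
    bool success = false;                     // li   a4,0
    for (;;) {
        const int32_t old = m.load_reserved(mem);   // lr.w
        if (old != exp) {                     // bne  -> failure path
            *expected = old;                  // sw   a6,0(a1)
            return success;                   // flag is still false
        }
        if (m.store_conditional(mem, desired) == 0) // sc.w
            break;                            // bnez not taken
    }
    success = true;                           // li   a4,1
    return success;
}

// Small self-check covering success, mismatch, and retry after a failed SC.
void check_cas_model() {
    ReservationModel m;
    int32_t value = 5, expected = 5;
    assert(cas_model(m, &value, &expected, 7) && value == 7);

    expected = 5;  // stale expectation: CAS must fail and report the old value
    assert(!cas_model(m, &value, &expected, 9));
    assert(value == 7 && expected == 7);

    m.forced_failures = 2;  // SC fails twice, the loop retries
    assert(cas_model(m, &value, &expected, 9) && value == 9);
}
```

In this model no comparison result feeds into the returned flag, which
corresponds to the absence of the "subw;seqz;beqz" dependency chain in
the new listing.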
---
 gcc/config/riscv/riscv-protos.h |  1 +
 gcc/config/riscv/riscv.c        | 68 +++++++++++++++++++++++++++++++++
 gcc/config/riscv/sync.md        | 35 +----------------
 3 files changed, 70 insertions(+), 34 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 43d7224d6941..eb7e67d3b95a 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -59,6 +59,7 @@ extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx);
 extern void riscv_expand_float_scc (rtx, enum rtx_code, rtx, rtx);
 extern void riscv_expand_conditional_branch (rtx, enum rtx_code, rtx, rtx);
 extern void riscv_expand_conditional_move (rtx, rtx, rtx, rtx_code, rtx, rtx);
+extern void riscv_expand_compare_and_swap (rtx[]);
 #endif
 extern rtx riscv_legitimize_call_address (rtx);
 extern void riscv_set_return_address (rtx, rtx);
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 6e97b38db6db..c81a9bd6a29e 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -2488,6 +2488,74 @@ riscv_expand_conditional_move (rtx dest, rtx cons, rtx alt, rtx_code code,
                                                       cons, alt)));
 }
 
+/* Mark the previous jump instruction as unlikely. */
+
+static void
+riscv_emit_unlikely_jump (rtx insn)
+{
+  rtx_insn *jump = emit_jump_insn (insn);
+  add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
+}
+
+/* Expand code to perform a compare-and-swap. */
+
+extern void riscv_expand_compare_and_swap (rtx operands[])
+{
+  rtx bval, oldval, mem, expval, newval, mod_s, mod_f, scratch, cond1, cond2;
+  machine_mode mode;
+  rtx_code_label *begin_label, *end_label;
+
+  bval = operands[0];
+  oldval = operands[1];
+  mem = operands[2];
+  expval = operands[3];
+  newval = operands[4];
+  mod_s = operands[6];
+  mod_f = operands[7];
+  mode = GET_MODE (mem);
+  scratch = gen_reg_rtx (mode);
+  begin_label = gen_label_rtx ();
+  end_label = gen_label_rtx ();
+
+  /* No support for sub-word CAS. */
+  if (mode == QImode || mode == HImode)
+    gcc_unreachable ();
+
+  /* We use mod_f for LR and mod_s for SC below, but
+     RV does not have any guarantees for LR.rl and SC.aq. */
+  if (is_mm_acquire (memmodel_base (INTVAL (mod_s)))
+      && is_mm_relaxed (memmodel_base (INTVAL (mod_f))))
+    {
+      mod_f = GEN_INT (MEMMODEL_ACQUIRE);
+      mod_s = GEN_INT (MEMMODEL_RELAXED);
+    }
+
+  /* Since we want to maintain a branch-free good-case, but also want
+     to not have two branches in the bad-case, we set bval to FALSE
+     on top of the sequence. In the bad case, we simply jump over
+     the assignment of bval to TRUE at the end of the sequence. */
+
+  emit_insn (gen_rtx_SET (bval, gen_rtx_CONST_INT (SImode, FALSE)));
+
+  emit_label (begin_label);
+
+  emit_insn (gen_riscv_load_reserved (mode, oldval, mem, mod_f));
+
+  cond1 = gen_rtx_NE (mode, oldval, expval);
+  riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond1, oldval, expval,
+                            end_label));
+
+  emit_insn (gen_riscv_store_conditional (mode, scratch, mem, newval, mod_s));
+
+  cond2 = gen_rtx_NE (mode, scratch, const0_rtx);
+  riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond2, scratch, const0_rtx,
+                            begin_label));
+
+  emit_insn (gen_rtx_SET (bval, gen_rtx_CONST_INT (SImode, TRUE)));
+
+  emit_label (end_label);
+}
+
 /* Implement TARGET_FUNCTION_ARG_BOUNDARY. Every parameter gets at
    least PARM_BOUNDARY bits of alignment, but will be given anything up
    to PREFERRED_STACK_BOUNDARY bits if the type requires it. */
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 49b860da8ef0..da8dbf698163 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -207,20 +207,6 @@
   "amoswap.%A3 %0,%z2,%1"
   )
 
-(define_insn "atomic_cas_value_strong"
-  [(set (match_operand:GPR 0 "register_operand" "=&r")
-        (match_operand:GPR 1 "memory_operand" "+A"))
-   (set (match_dup 1)
-        (unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")
-                              (match_operand:GPR 3 "reg_or_0_operand" "rJ")
-                              (match_operand:SI 4 "const_int_operand") ;; mod_s
-                              (match_operand:SI 5 "const_int_operand")] ;; mod_f
-         UNSPEC_COMPARE_AND_SWAP))
-   (clobber (match_scratch:GPR 6 "=&r"))]
-  "TARGET_ATOMIC"
-  "1: lr.%A5 %0,%1; bne %0,%z2,1f; sc.%A4 %6,%z3,%1; bnez %6,1b; 1:"
-  [(set (attr "length") (const_int 20))])
-
 (define_expand "atomic_compare_and_swap"
   [(match_operand:SI 0 "register_operand" "") ;; bool output
    (match_operand:GPR 1 "register_operand" "") ;; val output
@@ -232,26 +218,7 @@
    (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
   "TARGET_ATOMIC"
 {
-  emit_insn (gen_atomic_cas_value_strong (operands[1], operands[2],
-                                          operands[3], operands[4],
-                                          operands[6], operands[7]));
-
-  rtx compare = operands[1];
-  if (operands[3] != const0_rtx)
-    {
-      rtx difference = gen_rtx_MINUS (mode, operands[1], operands[3]);
-      compare = gen_reg_rtx (mode);
-      emit_insn (gen_rtx_SET (compare, difference));
-    }
-
-  if (word_mode != mode)
-    {
-      rtx reg = gen_reg_rtx (word_mode);
-      emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));
-      compare = reg;
-    }
-
-  emit_insn (gen_rtx_SET (operands[0], gen_rtx_EQ (SImode, compare, const0_rtx)));
+  riscv_expand_compare_and_swap (operands);
   DONE;
 })