From patchwork Mon Jan 8 00:06:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej W. Rozycki" X-Patchwork-Id: 1883436 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=embecosm.com header.i=@embecosm.com header.a=rsa-sha256 header.s=google header.b=JYIQH0Gr; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4T7Z9p6V0Hz1xqk for ; Mon, 8 Jan 2024 11:06:53 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F37D63858C78 for ; Mon, 8 Jan 2024 00:06:50 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ej1-x62c.google.com (mail-ej1-x62c.google.com [IPv6:2a00:1450:4864:20::62c]) by sourceware.org (Postfix) with ESMTPS id 65CB33858CD1 for ; Mon, 8 Jan 2024 00:06:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 65CB33858CD1 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=embecosm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=embecosm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 65CB33858CD1 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::62c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1704672387; cv=none; b=Vt1cVfaRmGEg/1mfgWQPe0Xr3hpVVzcBm1UUvpNU6VheNyAzZeOs56vg24v0AP4QcN/0k6auzOHukUBisIt50s/55ajy68eCLXZfa1gYnZTJGvx7WSwpZdEpSN+6jsO6X1RJZzvd+oZqQORDb4SJK2UfUIBYNEfIkMWFFhyjZP8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1704672387; c=relaxed/simple; bh=mLBG+OtMoqaONtZMzesp1Af3BUALb61ljF8hdXqmbnE=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=Lb9VKaYfYbrKPEsag5Q/VpkkQ0fXncSeC3TGnkogVKpGwe870KpLVoAhnNCgYfeFYffymWUXs73UCIoiqJai60IB6IqNHzg9VzSz+SOXPEVVslUtnqeZw/aDqrPjPBl/G62WEZm5YHVTBmb055RkfadaDJ1buefbwW2RqIOM/6M= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-ej1-x62c.google.com with SMTP id a640c23a62f3a-a29058bb2ceso127699966b.0 for ; Sun, 07 Jan 2024 16:06:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=embecosm.com; s=google; t=1704672384; x=1705277184; darn=gcc.gnu.org; h=mime-version:user-agent:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=vdSyh0nJ/AIIpz+wprGas64scpoRMXh1x7Ba3B4UHOI=; b=JYIQH0Gr0s63yz/qU7zA69hBItxJrw7EErWTxoOcHSwDaEIv4r+OYfaBr5Tz7KyMdW /rjfCQZcn5km1ZU/3JJ7hQ7F4cmsWa6wdvfFQnRHDYJ8PaYJ2ecBFyI9MNsTTG+2KDVS 8qL5glpF4Vku0WtDC4hGb/aH4SU006Mpy7IT2jqiZD5ZgtK2UcqPG7TK+4+WvumzRXGP aTZ/nZl6WlB3Vq2cxMKVHDeZ3cEVOrcnutT7FE9D70Cc7mLOuvJsXx7Qo8RdXzP0HQED C3yPV9wRHf588e9ADL+UuvBvEz5QwcyWoPVVeXJUCyvGnJo9S4SvbKu8uZjJYabNAFjB 42Yw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1704672384; x=1705277184; h=mime-version:user-agent:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=vdSyh0nJ/AIIpz+wprGas64scpoRMXh1x7Ba3B4UHOI=; b=pMnjnqQAXUh95avG5SDiJjBRmqZzu0e8FtzEo2jRTqPXO/w1oiGu3pjFPN+Ve4QBaX RMwFsl1yxeMPkc+O9vvfde0M3MLY45TZz0NpKA0CnTJkHk8PFm9MZgTzoqtnLkhl5flT 3rXPmxtA4imEggMs7rsWlQ2C/03gjFV/8dQnyZlvcCaFzGm5NZB/wopUN9zmwRHKNGjF yokqrgaxR7lZi//Kyn2PBVxMVyxyhZ8KZrfTTKbXxFw0AtVumFALGiWOGGk73Ne7OgV/ ZTXnn0Ha3uK+1W+1SgohPWjE/zjbDVzdnjZojSZFyjf8xyF/hVG1iBGSuGMMB0wyPBx7 bFyA== X-Gm-Message-State: AOJu0Yz4P7kgZyU6qPj/tuzJaOiD4JRZCj/NnKgenTOaeKvqo87A2Gxo zvd01E+WcLJ+ZtIzUfclEv6xFOzDhD+7WxYAsFJZlJAk9ZE= X-Google-Smtp-Source: AGHT+IETyQTCGOk2ALo62E07FiWCR3ugiFSxyYnFA5Q4yQOn9zPMd3O0X8PHLhPvN9qQeF+i5OrW5Q== X-Received: by 2002:a17:906:30cb:b0:a27:efb8:105b with SMTP id b11-20020a17090630cb00b00a27efb8105bmr679026ejb.198.1704672383971; Sun, 07 Jan 2024 16:06:23 -0800 (PST) Received: from [192.168.219.3] ([78.8.192.131]) by smtp.gmail.com with ESMTPSA id s14-20020a170906c30e00b00a26a80ac5fasm3413149ejz.120.2024.01.07.16.06.22 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 07 Jan 2024 16:06:23 -0800 (PST) Date: Mon, 8 Jan 2024 00:06:20 +0000 (GMT) From: "Maciej W. Rozycki" To: gcc-patches@gcc.gnu.org cc: Andrew Waterman , Jim Wilson , Kito Cheng , Palmer Dabbelt Subject: [PATCH] RISC-V: Also handle sign extension in branch costing Message-ID: User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 X-Spam-Status: No, score=0.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_ASCII_DIVIDERS, KAM_SHORT, LIKELY_SPAM_BODY, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Complement commit c1e8cb3d9f94 ("RISC-V: Rework branch costing model for if-conversion") and also handle extraneous sign extend operations that are sometimes produced by `noce_try_cmove_arith' instead of zero extend operations, making branch costing consistent. It is unclear what the condition is for the middle end to choose between the zero extend and sign extend operation, but the test case included uses sign extension with 64-bit targets, preventing if-conversion from triggering across all the architectural variants. There are further anomalies revealed by the test case, specifically the exceedingly high branch cost of 6 required for the `-mmovcc' variant despite that the final branchless sequence only uses 4 instructions, the missed conversion at -O1 for 32-bit targets even though code is machine word size agnostic, and the missed conversion at -Os and -Oz for 32-bit Zicond targets even though the branchless sequence would be shorter than the branched one. These will have to be handled separately. gcc/ * config/riscv/riscv.cc (riscv_noce_conversion_profitable_p): Also handle sign extension. gcc/testsuite/ * gcc.target/riscv/cset-sext-sfb.c: New test. * gcc.target/riscv/cset-sext-thead.c: New test. * gcc.target/riscv/cset-sext-ventana.c: New test. * gcc.target/riscv/cset-sext-zicond.c: New test. * gcc.target/riscv/cset-sext.c: New test. --- Hi, This is still in regression-testing, but as a branch costing adjustment only I don't expect any code correctness issues, and the performance advantage seems very obvious as the sign extend operation applied to the result of a conditional set instruction is always a no-op, just as with the zero extension. Depending on how you look at it you may qualify this as a bug fix (for the commit referred; it's surely rare enough a case I missed in original testing) or a missed optimisation. Either way it's a narrow-scoped very small change, almost an obviously correct one. I'll be very happy to get it off my plate now, but if it has to wait for GCC 15, I'll accept the decision. OK to apply then or shall I wait? Maciej --- gcc/config/riscv/riscv.cc | 5 ++- gcc/testsuite/gcc.target/riscv/cset-sext-sfb.c | 28 +++++++++++++++++++++ gcc/testsuite/gcc.target/riscv/cset-sext-thead.c | 26 +++++++++++++++++++ gcc/testsuite/gcc.target/riscv/cset-sext-ventana.c | 26 +++++++++++++++++++ gcc/testsuite/gcc.target/riscv/cset-sext-zicond.c | 26 +++++++++++++++++++ gcc/testsuite/gcc.target/riscv/cset-sext.c | 27 ++++++++++++++++++++ 6 files changed, 136 insertions(+), 2 deletions(-) gcc-riscv-noce-conversion-profitable-p-sign-extend.diff Index: gcc/gcc/config/riscv/riscv.cc =================================================================== --- gcc.orig/gcc/config/riscv/riscv.cc +++ gcc/gcc/config/riscv/riscv.cc @@ -3555,7 +3555,7 @@ riscv_noce_conversion_profitable_p (rtx_ this redundant zero extend operation counts towards the cost of the replacement sequence. Compensate for that by incrementing the cost of the original sequence as well as the maximum sequence cost - accordingly. */ + accordingly. Likewise for sign extension. */ rtx last_dest = NULL_RTX; for (rtx_insn *insn = seq; insn; insn = NEXT_INSN (insn)) { @@ -3567,8 +3567,9 @@ riscv_noce_conversion_profitable_p (rtx_ && GET_CODE (x) == SET) { rtx src = SET_SRC (x); + enum rtx_code code = GET_CODE (src); if (last_dest != NULL_RTX - && GET_CODE (src) == ZERO_EXTEND + && (code == SIGN_EXTEND || code == ZERO_EXTEND) && REG_P (XEXP (src, 0)) && REGNO (XEXP (src, 0)) == REGNO (last_dest)) { Index: gcc/gcc/testsuite/gcc.target/riscv/cset-sext-sfb.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.target/riscv/cset-sext-sfb.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */ +/* { dg-options "-march=rv32gc -mtune=sifive-7-series -mbranch-cost=1 -fdump-rtl-ce1" { target { rv32 } } } */ +/* { dg-options "-march=rv64gc -mtune=sifive-7-series -mbranch-cost=1 -fdump-rtl-ce1" { target { rv64 } } } */ + +int +foo (long a, long b) +{ + if (!b) + return 0; + else if (a) + return 1; + else + return 0; +} + +/* Expect short forward branch assembly like: + + snez a0,a0 + bne a1,zero,1f # movcc + mv a0,zero +1: + */ + +/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_try_cmove_arith" 1 "ce1" { xfail { rv32 && { any-opts "-O1" } } } } } */ +/* { dg-final { scan-assembler-times "\\ssnez\\s" 1 } } */ +/* { dg-final { scan-assembler-times "\\sbne\\s\[^\\s\]+\\s# movcc\\s" 1 { xfail { rv32 && { any-opts "-O1" } } } } } */ +/* { dg-final { scan-assembler-not "\\sbeq\\s" { xfail { rv32 && { any-opts "-O1" } } } } } */ Index: gcc/gcc/testsuite/gcc.target/riscv/cset-sext-thead.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.target/riscv/cset-sext-thead.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target rv64 } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */ +/* { dg-options "-march=rv64gc_xtheadcondmov -mtune=thead-c906 -mbranch-cost=1 -fdump-rtl-ce1" } */ + +int +foo (long a, long b) +{ + if (!b) + return 0; + else if (a) + return 1; + else + return 0; +} + +/* Expect branchless assembly like: + + snez a0,a0 + th.mveqz a0,zero,a1 + */ + +/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_try_cmove_arith" 1 "ce1" } } */ +/* { dg-final { scan-assembler-times "\\ssnez\\s" 1 } } */ +/* { dg-final { scan-assembler-times "\\s(?:th\\.mveqz|th\\.mvnez)\\s" 1 } } */ +/* { dg-final { scan-assembler-not "\\s(?:beq|bne)\\s" } } */ Index: gcc/gcc/testsuite/gcc.target/riscv/cset-sext-ventana.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.target/riscv/cset-sext-ventana.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target rv64 } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */ +/* { dg-options "-march=rv64gc_xventanacondops -mtune=rocket -mbranch-cost=3 -fdump-rtl-ce1" } */ + +int +foo (long a, long b) +{ + if (!b) + return 0; + else if (a) + return 1; + else + return 0; +} + +/* Expect branchless assembly like: + + snez a0,a0 + vt.maskc a0,a0,a1 + */ + +/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_try_cmove_arith" 1 "ce1" } } */ +/* { dg-final { scan-assembler-times "\\ssnez\\s" 1 } } */ +/* { dg-final { scan-assembler-times "\\svt\\.maskc\\s" 1 } } */ +/* { dg-final { scan-assembler-not "\\s(?:beq|bne)\\s" } } */ Index: gcc/gcc/testsuite/gcc.target/riscv/cset-sext-zicond.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.target/riscv/cset-sext-zicond.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */ +/* { dg-options "-march=rv64gc_zicond -mtune=rocket -mbranch-cost=3 -fdump-rtl-ce1" { target { rv64 } } } */ +/* { dg-options "-march=rv32gc_zicond -mtune=rocket -mbranch-cost=3 -fdump-rtl-ce1" { target { rv32 } } } */ + +int +foo (long a, long b) +{ + if (!b) + return 0; + else if (a) + return 1; + else + return 0; +} + +/* Expect branchless assembly like: + + snez a0,a0 + czero.eqz a0,a0,a1 + */ + +/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_try_cmove_arith" 1 "ce1" { xfail { rv32 && { any-opts "-O1" "-Os" "-Oz" } } } } } */ +/* { dg-final { scan-assembler-times "\\ssnez\\s" 1 } } */ +/* { dg-final { scan-assembler-times "\\sczero\\.eqz\\s" 1 { xfail { rv32 && { any-opts "-O1" "-Os" "-Oz" } } } } } */ +/* { dg-final { scan-assembler-not "\\s(?:beq|bne)\\s" { xfail { rv32 && { any-opts "-O1" "-Os" "-Oz" } } } } } */ Index: gcc/gcc/testsuite/gcc.target/riscv/cset-sext.c =================================================================== --- /dev/null +++ gcc/gcc/testsuite/gcc.target/riscv/cset-sext.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */ +/* { dg-options "-march=rv32gc -mtune=sifive-5-series -mbranch-cost=6 -mmovcc -fdump-rtl-ce1" { target { rv32 } } } */ +/* { dg-options "-march=rv64gc -mtune=sifive-5-series -mbranch-cost=6 -mmovcc -fdump-rtl-ce1" { target { rv64 } } } */ + +int +foo (long a, long b) +{ + if (!b) + return 0; + else if (a) + return 1; + else + return 0; +} + +/* Expect branchless assembly like: + + snez a1,a1 + neg a1,a1 + snez a0,a0 + and a0,a1,a0 + */ + +/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_try_cmove_arith" 1 "ce1" { xfail { rv32 && { any-opts "-O1" } } } } } */ +/* { dg-final { scan-assembler-times "\\ssnez\\s" 2 { xfail { rv32 && { any-opts "-O1" } } } } } */ +/* { dg-final { scan-assembler-not "\\s(?:beq|bne)\\s" { xfail { rv32 && { any-opts "-O1" } } } } } */