From patchwork Fri Sep 8 08:48:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Takayuki 'January June' Suwa
X-Patchwork-Id: 1831390
Message-ID: <010fff65-5d8b-774c-fce5-81136424e131@yahoo.co.jp>
Date: Fri, 8 Sep 2023 17:48:56 +0900
To: GCC Patches
Subject: [PATCH] xtensa: Optimize several boolean evaluations of EQ/NE
 against constant zero
List-Id: Gcc-patches mailing list
From: Takayuki 'January June' Suwa
Reply-To: Takayuki 'January June' Suwa

An idiomatic way on Xtensa to evaluate, as a boolean, whether a register
is zero or not is to assign 1 to a temporary and 0 to the destination,
and then issue the MOV[EQ/NE]Z machine instruction
(see 8.3.2 Instruction Idioms, Xtensa ISA refman., p.599):

	;; A2 = (A3 != 0) ? 1 : 0;
	movi.n	a9, 1
	movi.n	a2, 0
	movnez	a2, a9, a3  ;; if (A3 != 0) A2 = A9;

With this idiom, if the source and destination are the same register,
the source must first be copied to another temporary register:

	;; A2 = (A2 == 0) ? 1 : 0;
	mov.n	a10, a2
	movi.n	a9, 1
	movi.n	a2, 0
	moveqz	a2, a9, a10  ;; if (A10 == 0) A2 = A9;

Fortunately, we can reduce the number of instructions and temporary
registers with a few tweaks:

	;; A2 = (A3 != 0) ? 1 : 0;
	movi.n	a2, 1
	moveqz	a2, a3, a3  ;; if (A3 == 0) A2 = A3;

	;; A2 = (A2 != 0) ? 1 : 0;
	movi.n	a9, 1
	movnez	a2, a9, a2  ;; if (A2 != 0) A2 = A9;

	;; A2 = (A3 == 0) ? 1 : 0;
	movi.n	a2, -1
	moveqz	a2, a3, a3  ;; if (A3 == 0) A2 = A3;
	addi.n	a2, a2, 1

	;; A2 = (A2 == 0) ? 1 : 0;
	movi.n	a9, -1
	movnez	a2, a9, a2  ;; if (A2 != 0) A2 = A9;
	addi.n	a2, a2, 1

Additionally, if TARGET_NSA is configured, the NSAU machine instruction
returns 32 if and only if its source is 0, and a value less than 32
otherwise.  This can be used for the boolean evaluation of an EQ
comparison:

	;; A2 = (A3 == 0) ? 1 : 0;
	nsau	a2, a3  ;; Source and destination can be the same register
	srli	a2, a2, 5

Furthermore, this patch also saves one instruction when testing whether
the result of ANDing with a mask whose 1 bits are contiguous from the
upper or lower bit end (for example, 0xFFE00000 or 0x003FFFFF) is zero
or not.

gcc/ChangeLog:

	* config/xtensa/xtensa.cc (xtensa_expand_scc):
	Revert the changes from the last patch, since the RTL expansion
	pass runs too early to know the physical registers.
	* config/xtensa/xtensa.md (*eqne_INT_MIN): Ditto.
	(eq_zero_NSA, eqne_zero, *eqne_zero_masked_bits): New patterns.
---
 gcc/config/xtensa/xtensa.cc |  35 +----------
 gcc/config/xtensa/xtensa.md | 112 ++++++++++++++++++++++++++++++++++++
 2 files changed, 113 insertions(+), 34 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index 1afaa1cc94e..2481b028ca1 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -994,41 +994,8 @@ xtensa_expand_scc (rtx operands[4], machine_mode cmp_mode)
   rtx cmp;
   rtx one_tmp, zero_tmp;
   rtx (*gen_fn) (rtx, rtx, rtx, rtx, rtx);
-  enum rtx_code code = GET_CODE (operands[1]);
 
-  if (cmp_mode == SImode && CONST_INT_P (operands[3])
-      && (code == EQ || code == NE))
-    switch (INTVAL (operands[3]))
-      {
-      case 0:
-        if (TARGET_MINMAX)
-          {
-            one_tmp = force_reg (SImode, const1_rtx);
-            emit_insn (gen_uminsi3 (dest, operands[2], one_tmp));
-            if (code == EQ)
-              emit_insn (gen_xorsi3 (dest, dest, one_tmp));
-            return 1;
-          }
-        break;
-      case -2147483648:
-        if (TARGET_ABS)
-          {
-            emit_insn (gen_abssi2 (dest, operands[2]));
-            if (code == EQ)
-              emit_insn (gen_lshrsi3 (dest, dest, GEN_INT (31)));
-            else
-              {
-                emit_insn (gen_ashrsi3 (dest, dest, GEN_INT (31)));
-                emit_insn (gen_addsi3 (dest, dest, const1_rtx));
-              }
-            return 1;
-          }
-        break;
-      default:
-        break;
-      }
-
-  if (! (cmp = gen_conditional_move (code, cmp_mode,
+  if (! (cmp = gen_conditional_move (GET_CODE (operands[1]), cmp_mode,
                                      operands[2], operands[3])))
     return 0;
 
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index d6505e7eb70..6476fdc395a 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -3188,6 +3188,118 @@
                       (const_int 5)
                       (const_int 6)))])
 
+(define_insn_and_split "eq_zero_NSA"
+  [(set (match_operand:SI 0 "register_operand" "=a")
+        (eq:SI (match_operand:SI 1 "register_operand" "r")
+               (const_int 0)))]
+  "TARGET_NSA"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (clz:SI (match_dup 1)))
+   (set (match_dup 0)
+        (lshiftrt:SI (match_dup 0)
+                     (const_int 5)))]
+  ""
+  [(set_attr "type" "move")
+   (set_attr "mode" "SI")
+   (set_attr "length" "6")])
+
+(define_insn_and_split "eqne_zero"
+  [(set (match_operand:SI 0 "register_operand" "=a,&a")
+        (match_operator:SI 2 "boolean_operator"
+          [(match_operand:SI 1 "register_operand" "0,r")
+           (const_int 0)]))
+   (clobber (match_scratch:SI 3 "=&a,X"))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  enum rtx_code code = GET_CODE (operands[2]);
+  int same_p = REGNO (operands[0]) == REGNO (operands[1]);
+  emit_move_insn (same_p ? operands[3] : operands[0],
+                  code == EQ ? constm1_rtx : const1_rtx);
+  emit_insn (gen_movsicc_internal0 (operands[0], operands[1],
+                                    same_p ? operands[3] : operands[1],
+                                    operands[0],
+                                    gen_rtx_fmt_ee (same_p ? NE : EQ,
+                                                    VOIDmode,
+                                                    operands[1],
+                                                    const0_rtx)));
+  if (code == EQ)
+    emit_insn (gen_addsi3 (operands[0], operands[0], const1_rtx));
+  DONE;
+}
+  [(set_attr "type" "move")
+   (set_attr "mode" "SI")
+   (set (attr "length")
+        (if_then_else (match_test "GET_CODE (operands[2]) == EQ")
+                      (if_then_else (match_test "TARGET_DENSITY")
+                                    (const_int 7)
+                                    (const_int 9))
+                      (if_then_else (match_test "TARGET_DENSITY")
+                                    (const_int 5)
+                                    (const_int 6))))])
+
+(define_insn_and_split "*eqne_zero_masked_bits"
+  [(set (match_operand:SI 0 "register_operand" "=a")
+        (match_operator 3 "boolean_operator"
+          [(and:SI (match_operand:SI 1 "register_operand" "r")
+                   (match_operand:SI 2 "const_int_operand" "i"))
+           (const_int 0)]))]
+  "IN_RANGE (exact_log2 (INTVAL (operands[2]) + 1), 17, 31)
+   || IN_RANGE (exact_log2 (-INTVAL (operands[2])), 1, 30)"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  HOST_WIDE_INT mask = INTVAL (operands[2]);
+  int n;
+  enum rtx_code code = GET_CODE (operands[3]);
+  if (IN_RANGE (n = exact_log2 (mask + 1), 17, 31))
+    emit_insn (gen_ashlsi3 (operands[0], operands[1], GEN_INT (32 - n)));
+  else
+    emit_insn (gen_lshrsi3 (operands[0], operands[1],
+                            GEN_INT (floor_log2 (-mask))));
+  if (TARGET_NSA && code == EQ)
+    emit_insn (gen_eq_zero_NSA (operands[0], operands[0]));
+  else
+    emit_insn (gen_eqne_zero (operands[0], operands[0],
+                              gen_rtx_fmt_ee (code, VOIDmode,
+                                              operands[0], const0_rtx)));
+  DONE;
+})
+
+(define_insn_and_split "*eqne_INT_MIN"
+  [(set (match_operand:SI 0 "register_operand" "=a")
+        (match_operator:SI 2 "boolean_operator"
+          [(match_operand:SI 1 "register_operand" "r")
+           (const_int -2147483648)]))]
+  "TARGET_ABS"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  emit_insn (gen_abssi2 (operands[0], operands[1]));
+  if (GET_CODE (operands[2]) == EQ)
+    emit_insn (gen_lshrsi3 (operands[0], operands[0], GEN_INT (31)));
+  else
+    {
+      emit_insn (gen_ashrsi3 (operands[0], operands[0], GEN_INT (31)));
+      emit_insn (gen_addsi3 (operands[0], operands[0], const1_rtx));
+    }
+  DONE;
+}
+  [(set_attr "type" "move")
+   (set_attr "mode" "SI")
+   (set (attr "length")
+        (if_then_else (match_test "GET_CODE (operands[2]) == EQ")
+                      (const_int 6)
+                      (if_then_else (match_test "TARGET_DENSITY")
+                                    (const_int 8)
+                                    (const_int 9))))])
+
 (define_peephole2
   [(set (match_operand:SI 0 "register_operand")
         (match_operand:SI 6 "reload_operand"))
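
As a sanity check of the shortened idioms described in the commit message, the following is a minimal C sketch that models MOVEQZ, MOVNEZ and NSAU in software and compares each instruction sequence with the plain C comparison. Every helper name here (moveqz, movnez, nsau, ne_zero_diff, count_mismatches, and so on) is hypothetical and exists only for this illustration; none of it is GCC code or part of the patch. The NSAU model assumes the behaviour stated in the commit message (32 for a zero source, otherwise the number of leading zero bits), and the shift amounts for the masked-bits case follow the ranges used in the *eqne_zero_masked_bits condition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software models of the Xtensa instructions used in the
   idioms; illustration only, not GCC code.  */

/* MOVEQZ d, s, t: if (t == 0) d = s.  */
static uint32_t moveqz (uint32_t d, uint32_t s, uint32_t t)
{
  return t == 0 ? s : d;
}

/* MOVNEZ d, s, t: if (t != 0) d = s.  */
static uint32_t movnez (uint32_t d, uint32_t s, uint32_t t)
{
  return t != 0 ? s : d;
}

/* NSAU (assumed model): leading zero count, 32 for a zero source.  */
static uint32_t nsau (uint32_t s)
{
  uint32_t n = 0;
  if (s == 0)
    return 32;
  while ((s & 0x80000000u) == 0)
    {
      s <<= 1;
      n++;
    }
  return n;
}

/* A2 = (A3 != 0): movi.n a2, 1; moveqz a2, a3, a3.  */
static uint32_t ne_zero_diff (uint32_t a3)
{
  return moveqz (1, a3, a3);
}

/* A2 = (A2 != 0): movi.n a9, 1; movnez a2, a9, a2.  */
static uint32_t ne_zero_same (uint32_t a2)
{
  return movnez (a2, 1, a2);
}

/* A2 = (A3 == 0): movi.n a2, -1; moveqz a2, a3, a3; addi.n a2, a2, 1.  */
static uint32_t eq_zero_diff (uint32_t a3)
{
  return moveqz (UINT32_MAX, a3, a3) + 1;
}

/* A2 = (A2 == 0): movi.n a9, -1; movnez a2, a9, a2; addi.n a2, a2, 1.  */
static uint32_t eq_zero_same (uint32_t a2)
{
  return movnez (a2, UINT32_MAX, a2) + 1;
}

/* A2 = (A3 == 0) with TARGET_NSA: nsau a2, a3; srli a2, a2, 5.  */
static uint32_t eq_zero_nsa (uint32_t a3)
{
  return nsau (a3) >> 5;
}

/* Count disagreements between the idioms and the plain comparisons.  */
int count_mismatches (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 2u, 0x001FFFFFu, 0x00200000u, 0x003FFFFFu, 0x00400000u,
    0x7FFFFFFFu, 0x80000000u, 0xFFE00000u, 0xFFFFFFFFu
  };
  int bad = 0;
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      uint32_t x = samples[i];
      bad += ne_zero_diff (x) != (uint32_t) (x != 0);
      bad += ne_zero_same (x) != (uint32_t) (x != 0);
      bad += eq_zero_diff (x) != (uint32_t) (x == 0);
      bad += eq_zero_same (x) != (uint32_t) (x == 0);
      bad += eq_zero_nsa (x) != (uint32_t) (x == 0);
      /* Mask with n low-order 1 bits: shift left by 32 - n, then test.  */
      for (int n = 17; n <= 31; n++)
        {
          uint32_t mask = (1u << n) - 1;
          bad += eq_zero_diff (x << (32 - n)) != (uint32_t) ((x & mask) == 0);
        }
      /* Mask with 1 bits from bit k upward: shift right by k, then test.  */
      for (int k = 1; k <= 30; k++)
        {
          uint32_t mask = 0xFFFFFFFFu << k;
          bad += ne_zero_diff (x >> k) != (uint32_t) ((x & mask) != 0);
        }
    }
  return bad;
}
```

Calling count_mismatches () from a small test driver is expected to return 0 under these assumptions. The sketch only checks the arithmetic of the idioms; it says nothing about register allocation, which is why the real patterns split after reload.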