From patchwork Wed Nov 20 23:17:11 2019
X-Patchwork-Submitter: Segher Boessenkool
X-Patchwork-Id: 1198575
From: Segher Boessenkool
To: gcc-patches@gcc.gnu.org
Cc: dje.gcc@gmail.com, Segher Boessenkool
Subject: [PATCH] rs6000: Don't split FP comparisons at expand time
Date: Wed, 20 Nov 2019 23:17:11 +0000
Message-Id: <027ab93388689c1821a37aff87b31e34721ad3c9.1574290010.git.segher@kernel.crashing.org>

We currently expand various floating point comparisons early, to some
sequences with cror insns and the like.  This doesn't optimize well.

Change that to allow any of the 14 floating point comparisons in the
instruction stream, and split them after combine (at split1).

Tested on powerpc64-linux {-m32,-m64}; also tested with -ffast-math
(it doesn't change anything for that).

All 14 "cstore" codes give optimal code now.  Not all "cbranch" codes
do yet.
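To make the "14 floating point comparisons" concrete: an FP compare sets exactly one of four CR bits (LT, GT, EQ, UN), and each comparison code holds for some subset of those bits. The following is a standalone C sketch of that model under those assumptions. It is not GCC code, and the names cr_bit, fp_code, fp_mask, fcmpu_model, fp_holds and fp_mask_bits are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of the four mutually exclusive CR bits set by an
   FP compare.  This is an illustration, not GCC's representation.  */
enum cr_bit { CR_LT = 1, CR_GT = 2, CR_EQ = 4, CR_UN = 8 };

/* The 14 floating point comparison codes (illustrative names).  */
enum fp_code {
  FP_LT, FP_GT, FP_EQ, FP_UNORDERED,                 /* one CR bit */
  FP_LTGT, FP_LE, FP_GE, FP_UNLT, FP_UNGT, FP_UNEQ,  /* OR of two CR bits */
  FP_ORDERED, FP_NE, FP_UNLE, FP_UNGE,               /* reverse of one bit */
  FP_NUM_CODES
};

/* For each code, the set of CR bits for which the comparison holds.  */
const unsigned fp_mask[FP_NUM_CODES] = {
  [FP_LT] = CR_LT,
  [FP_GT] = CR_GT,
  [FP_EQ] = CR_EQ,
  [FP_UNORDERED] = CR_UN,
  [FP_LTGT] = CR_LT | CR_GT,
  [FP_LE] = CR_LT | CR_EQ,
  [FP_GE] = CR_GT | CR_EQ,
  [FP_UNLT] = CR_UN | CR_LT,
  [FP_UNGT] = CR_UN | CR_GT,
  [FP_UNEQ] = CR_UN | CR_EQ,
  [FP_ORDERED] = CR_LT | CR_GT | CR_EQ,  /* everything except UN */
  [FP_NE] = CR_LT | CR_GT | CR_UN,       /* everything except EQ */
  [FP_UNLE] = CR_LT | CR_EQ | CR_UN,     /* everything except GT */
  [FP_UNGE] = CR_GT | CR_EQ | CR_UN,     /* everything except LT */
};

/* Model of the compare itself: returns exactly one CR bit.  */
unsigned
fcmpu_model (double a, double b)
{
  if (isnan (a) || isnan (b))
    return CR_UN;
  if (a < b)
    return CR_LT;
  if (a > b)
    return CR_GT;
  return CR_EQ;
}

/* Return true if comparison CODE holds for A and B in this model.  */
bool
fp_holds (enum fp_code code, double a, double b)
{
  assert (code < FP_NUM_CODES);
  return (fp_mask[code] & fcmpu_model (a, b)) != 0;
}

/* Return the number of CR bits in the mask for CODE (1, 2 or 3).  */
int
fp_mask_bits (enum fp_code code)
{
  assert (code < FP_NUM_CODES);
  int n = 0;
  for (unsigned m = fp_mask[code]; m != 0; m >>= 1)
    n += m & 1;
  return n;
}
```

In this model the six codes with a two-bit mask (ltgt, le, ge, unlt, ungt, uneq) are the ones that need two CR bits ORed together, and the four with a three-bit mask (ordered, ne, unle, unge) are each the reverse of a single-bit code. For example, fp_holds (FP_UNGT, NAN, 1.0) is true while fp_holds (FP_LE, NAN, 1.0) is false.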

An example that does work fine is UNEQ:

        fcmpu 0,1,2      # 7    [c=4 l=4]  *cmpdf_fpr/0
        cror 2,0,1       # 23   [c=4 l=4]  cceq_ior_compare_si/0
        beqlr 0          # 24   [c=4 l=4]  *creturn

but one that does not is UNGT:

        fcmpu 0,1,2      # 7    [c=4 l=4]  *cmpdf_fpr/0
        bgt 0,.L16       # 15   [c=4 l=4]  *cbranch
        bnulr 0          # 22   [c=4 l=4]  *creturn
.L16:

Coming from gimple this is

  _5 = a_3(D) u<= b_4(D);
  _7 = ~_5;
  _8 = a_3(D) unord b_4(D);
  _9 = _7 | _8;
  if (_9 != 0)
    goto ; [50.00%]
  else
    goto ; [50.00%]

which could use some improvement.

Anyway, testing on p8le and p9le as well; will commit if that works fine.


Segher


2019-11-20  Segher Boessenkool

        * config/rs6000/predicates.md (extra_insn_branch_comparison_operator):
        New predicate.
        * config/rs6000/rs6000-protos.h (rs6000_emit_fp_cror): New declaration.
        * config/rs6000/rs6000.c (rs6000_generate_compare): Don't do anything
        special for FP comparisons that need a cror instruction eventually.
        (rs6000_emit_fp_cror): New function.
        (rs6000_emit_sCOND): Expand all floating point comparisons to one
        instruction, for normal FP modes, with HONOR_NANS.
        (rs6000_emit_cbranch): Reformat.
        * config/rs6000/rs6000.md (fp_rev): New iterator.
        (fp_two): New iterator.
        *_cc for fp_rev and GPR: New define_insn_and_split.
        *_cc for fp_two and GPR: New define_insn_and_split.
        *cbranch_2insn: New define_insn_and_split.

---
 gcc/config/rs6000/predicates.md   | 10 ++++
 gcc/config/rs6000/rs6000-protos.h |  1 +
 gcc/config/rs6000/rs6000.c        | 98 ++++++++++++++++-----------------------
 gcc/config/rs6000/rs6000.md       | 78 +++++++++++++++++++++++++++++++
 4 files changed, 130 insertions(+), 57 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index f4ecc41..42c41b3 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1143,6 +1143,16 @@ (define_predicate "branch_comparison_operator"
                                              GET_MODE (XEXP (op, 0))),
                     1")))
 
+;; Return 1 if OP is a comparison that needs an extra instruction to do (a
+;; crlogical or an extra branch).
+(define_predicate "extra_insn_branch_comparison_operator"
+  (and (match_operand 0 "comparison_operator")
+       (match_test "GET_MODE (XEXP (op, 0)) == CCFPmode")
+       (match_code "ltgt,le,ge,unlt,ungt,uneq")
+       (match_test "validate_condition_mode (GET_CODE (op),
+                                             GET_MODE (XEXP (op, 0))),
+                    1")))
+
 ;; Return 1 if OP is an unsigned comparison operator.
 (define_predicate "unsigned_comparison_operator"
   (match_code "ltu,gtu,leu,geu"))
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 0dddb40..69e67ac 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -112,6 +112,7 @@ extern const char *rs6000_pltseq_template (rtx *, int);
 extern enum rtx_code rs6000_reverse_condition (machine_mode,
                                                enum rtx_code);
 extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx);
+extern rtx rs6000_emit_fp_cror (rtx_code, machine_mode, rtx);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
 extern void rs6000_emit_cbranch (machine_mode, rtx[]);
 extern char * output_cbranch (rtx, const char *, int, rtx_insn *);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0282ebd..2995348 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -13954,42 +13954,6 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
                                 gen_rtx_COMPARE (comp_mode, op0, op1)));
     }
 
-  /* Some kinds of FP comparisons need an OR operation;
-     under flag_finite_math_only we don't bother.  */
-  if (FLOAT_MODE_P (mode)
-      && (!FLOAT128_IEEE_P (mode) || TARGET_FLOAT128_HW)
-      && !flag_finite_math_only
-      && (code == LE || code == GE
-          || code == UNEQ || code == LTGT
-          || code == UNGT || code == UNLT))
-    {
-      enum rtx_code or1, or2;
-      rtx or1_rtx, or2_rtx, compare2_rtx;
-      rtx or_result = gen_reg_rtx (CCEQmode);
-
-      switch (code)
-        {
-        case LE: or1 = LT;  or2 = EQ;  break;
-        case GE: or1 = GT;  or2 = EQ;  break;
-        case UNEQ: or1 = UNORDERED;  or2 = EQ;  break;
-        case LTGT: or1 = LT;  or2 = GT;  break;
-        case UNGT: or1 = UNORDERED;  or2 = GT;  break;
-        case UNLT: or1 = UNORDERED;  or2 = LT;  break;
-        default:  gcc_unreachable ();
-        }
-      validate_condition_mode (or1, comp_mode);
-      validate_condition_mode (or2, comp_mode);
-      or1_rtx = gen_rtx_fmt_ee (or1, SImode, compare_result, const0_rtx);
-      or2_rtx = gen_rtx_fmt_ee (or2, SImode, compare_result, const0_rtx);
-      compare2_rtx = gen_rtx_COMPARE (CCEQmode,
-                                      gen_rtx_IOR (SImode, or1_rtx, or2_rtx),
-                                      const_true_rtx);
-      emit_insn (gen_rtx_SET (or_result, compare2_rtx));
-
-      compare_result = or_result;
-      code = EQ;
-    }
-
   validate_condition_mode (code, GET_MODE (compare_result));
 
   return gen_rtx_fmt_ee (code, VOIDmode, compare_result, const0_rtx);
@@ -14301,21 +14265,44 @@ rs6000_emit_eqne (machine_mode mode, rtx op1, rtx op2, rtx scratch)
   return scratch;
 }
 
+/* Emit code doing a cror of two CR bits, for FP comparisons with a CODE that
+   requires this.  The result is mode MODE.  */
+rtx
+rs6000_emit_fp_cror (rtx_code code, machine_mode mode, rtx x)
+{
+  rtx cond[2];
+  int n = 0;
+  if (code == LTGT || code == LE || code == UNLT)
+    cond[n++] = gen_rtx_fmt_ee (LT, mode, x, const0_rtx);
+  if (code == LTGT || code == GE || code == UNGT)
+    cond[n++] = gen_rtx_fmt_ee (GT, mode, x, const0_rtx);
+  if (code == LE || code == GE || code == UNEQ)
+    cond[n++] = gen_rtx_fmt_ee (EQ, mode, x, const0_rtx);
+  if (code == UNLT || code == UNGT || code == UNEQ)
+    cond[n++] = gen_rtx_fmt_ee (UNORDERED, mode, x, const0_rtx);
+
+  gcc_assert (n == 2);
+
+  rtx cc = gen_reg_rtx (CCEQmode);
+  rtx logical = gen_rtx_IOR (mode, cond[0], cond[1]);
+  emit_insn (gen_cceq_ior_compare (mode, cc, logical, cond[0], x, cond[1], x));
+
+  return cc;
+}
+
 void
 rs6000_emit_sCOND (machine_mode mode, rtx operands[])
 {
-  rtx condition_rtx;
-  machine_mode op_mode;
-  enum rtx_code cond_code;
-  rtx result = operands[0];
+  rtx condition_rtx = rs6000_generate_compare (operands[1], mode);
+  rtx_code cond_code = GET_CODE (condition_rtx);
 
-  condition_rtx = rs6000_generate_compare (operands[1], mode);
-  cond_code = GET_CODE (condition_rtx);
-
-  if (cond_code == NE
-      || cond_code == GE || cond_code == LE
-      || cond_code == GEU || cond_code == LEU
-      || cond_code == ORDERED || cond_code == UNGE || cond_code == UNLE)
+  if (FLOAT_MODE_P (mode) && HONOR_NANS (mode)
+      && !(FLOAT128_VECTOR_P (mode) && !TARGET_FLOAT128_HW))
+    ;
+  else if (cond_code == NE
+           || cond_code == GE || cond_code == LE
+           || cond_code == GEU || cond_code == LEU
+           || cond_code == ORDERED || cond_code == UNGE || cond_code == UNLE)
     {
       rtx not_result = gen_reg_rtx (CCEQmode);
       rtx not_op, rev_cond_rtx;
@@ -14330,19 +14317,19 @@ rs6000_emit_sCOND (machine_mode mode, rtx operands[])
       condition_rtx = gen_rtx_EQ (VOIDmode, not_result, const0_rtx);
     }
 
-  op_mode = GET_MODE (XEXP (operands[1], 0));
+  machine_mode op_mode = GET_MODE (XEXP (operands[1], 0));
   if (op_mode == VOIDmode)
     op_mode = GET_MODE (XEXP (operands[1], 1));
 
   if (TARGET_POWERPC64 && (op_mode == DImode || FLOAT_MODE_P (mode)))
     {
       PUT_MODE (condition_rtx, DImode);
-      convert_move (result, condition_rtx, 0);
+      convert_move (operands[0], condition_rtx, 0);
     }
   else
     {
       PUT_MODE (condition_rtx, SImode);
-      emit_insn (gen_rtx_SET (result, condition_rtx));
+      emit_insn (gen_rtx_SET (operands[0], condition_rtx));
     }
 }
 
@@ -14351,13 +14338,10 @@ rs6000_emit_sCOND (machine_mode mode, rtx operands[])
 void
 rs6000_emit_cbranch (machine_mode mode, rtx operands[])
 {
-  rtx condition_rtx, loc_ref;
-
-  condition_rtx = rs6000_generate_compare (operands[0], mode);
-  loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]);
-  emit_jump_insn (gen_rtx_SET (pc_rtx,
-                               gen_rtx_IF_THEN_ELSE (VOIDmode, condition_rtx,
-                                                     loc_ref, pc_rtx)));
+  rtx condition_rtx = rs6000_generate_compare (operands[0], mode);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]);
+  rtx ite = gen_rtx_IF_THEN_ELSE (VOIDmode, condition_rtx, loc_ref, pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, ite));
 }
 
 /* Return the string to output a conditional branch to LABEL, which is
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 8dc0a29..dff5680 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -12373,6 +12373,44 @@ (define_insn_and_split "*nesi3_ext"
         (if_then_else (match_test "operands[2] == const0_rtx")
                       (const_string "12")
                       (const_string "16")))])
+
+
+(define_code_iterator fp_rev [ordered ne unle unge])
+(define_code_iterator fp_two [ltgt le ge unlt ungt uneq])
+
+(define_insn_and_split "*_cc"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+        (fp_rev:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
+                    (const_int 0)))]
+  "!flag_finite_math_only"
+  "#"
+  "&& 1"
+  [(pc)]
+{
+  rtx_code revcode = reverse_condition_maybe_unordered ();
+  rtx eq = gen_rtx_fmt_ee (revcode, mode, operands[1], const0_rtx);
+  rtx tmp = gen_reg_rtx (mode);
+  emit_move_insn (tmp, eq);
+  emit_insn (gen_xor3 (operands[0], tmp, const1_rtx));
+  DONE;
+}
+  [(set_attr "length" "12")])
+
+(define_insn_and_split "*_cc"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+        (fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
+                    (const_int 0)))]
+  "!flag_finite_math_only"
+  "#"
+  "&& 1"
+  [(pc)]
+{
+  rtx cc = rs6000_emit_fp_cror (, mode, operands[1]);
+
+  emit_move_insn (operands[0], gen_rtx_EQ (mode, cc, const0_rtx));
+  DONE;
+}
+  [(set_attr "length" "12")])
 
 ;; Conditional branches.
 ;; These either are a single bc insn, or a bc around a b.
@@ -12397,6 +12435,46 @@ (define_insn "*cbranch"
                       (const_int 4)
                       (const_int 8)))])
 
+(define_insn_and_split "*cbranch_2insn"
+  [(set (pc)
+        (if_then_else (match_operator 1 "extra_insn_branch_comparison_operator"
+                                      [(match_operand 2 "cc_reg_operand" "y")
+                                       (const_int 0)])
+                      (label_ref (match_operand 0))
+                      (pc)))]
+  "!flag_finite_math_only"
+  "#"
+  "&& 1"
+  [(pc)]
+{
+  rtx cc = rs6000_emit_fp_cror (GET_CODE (operands[1]), SImode, operands[2]);
+
+  rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0);
+
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[0]);
+  rtx cond = gen_rtx_EQ (CCEQmode, cc, const0_rtx);
+  rtx ite = gen_rtx_IF_THEN_ELSE (VOIDmode, cond, loc_ref, pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, ite));
+
+  if (note)
+    {
+      profile_probability prob
+        = profile_probability::from_reg_br_prob_note (XINT (note, 0));
+
+      add_reg_br_prob_note (get_last_insn (), prob);
+    }
+
+  DONE;
+}
+  [(set_attr "type" "branch")
+   (set (attr "length")
+        (if_then_else (and (ge (minus (match_dup 0) (pc))
+                               (const_int -32764))
+                           (lt (minus (match_dup 0) (pc))
+                               (const_int 32760)))
+                      (const_int 8)
+                      (const_int 16)))])
+
 ;; Conditional return.
 (define_insn "*creturn"
   [(set (pc)