From patchwork Fri Oct 27 18:22:29 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 831442
From: Uros Bizjak
Date: Fri, 27 Oct 2017 20:22:29 +0200
Subject: [PATCH, i386]: Fix PR 82692, Ordered comparisons used for unordered built-ins
To: "gcc-patches@gcc.gnu.org"
Cc: Segher Boessenkool, "Joseph S.
Myers", "H. J. Lu"

Hello!

As discussed in the PR, different modes of a FP compare RTX are not
strong enough to survive through RTL optimization passes. The attached
testcase was miscompiled because combine changed the mode of the FP
compare through SELECT_CC_MODE.

The solution, implemented in the attached patch, is to drop CCFPUmode
(which was used to distinguish unordered and ordered compares) and use
UNSPEC_NOTRAP unspec wrappers around compare RTXes for unordered
comparisons.

2017-10-27  Uros Bizjak

    PR target/82692
    * config/i386/i386-modes.def (CCFPU): Remove definition.
    * config/i386/i386.c (put_condition_code): Remove CCFPU mode handling.
    (ix86_cc_modes_compatible): Ditto.
    (ix86_expand_carry_flag_compare): Ditto.
    (ix86_expand_int_movcc): Ditto.
    (ix86_expand_int_addcc): Ditto.
    (ix86_reverse_condition): Ditto.
    (ix86_unordered_fp_compare): Rename from ix86_fp_compare_mode.
    Return true/false for unordered/ordered fp comparisons.
    (ix86_cc_mode): Always return CCFPmode for float mode comparisons.
    (ix86_prepare_fp_compare_args): Update for rename.
    (ix86_expand_fp_compare): Update for rename.  Generate unordered
    compare RTXes wrapped with UNSPEC_NOTRAP unspec.
    (ix86_expand_sse_compare_and_jump): Ditto.
    * config/i386/predicates.md (fcmov_comparison_operator):
    Remove CCFPU mode handling.
    (ix86_comparison_operator): Ditto.
    (ix86_carry_flag_operator): Ditto.
    * config/i386/i386.md (UNSPEC_NOTRAP): New unspec.
    (*cmpu_i387): Wrap compare RTX with UNSPEC_NOTRAP unspec.
    (*cmpu_cc_i387): Ditto.
    (FPCMP): Remove mode iterator.
    (unord): Remove mode attribute.
    (unord_subst): New define_subst transformation.
    (unord): New define_subst attribute.
    (unordered): Ditto.
    (*cmpi): Rewrite using unord_subst transformation.
    (*cmpixf_i387): Ditto.
    * config/i386/sse.md (_comi): Merge from _comi and _ucomi
    using unord_subst transformation.
    * config/i386/subst.md (SUBST_A): Remove CCFP and CCFPU modes.
    (round_saeonly): Also handle CCFP mode.
    * reg-stack.c (subst_stack_regs_pat): Handle UNSPEC_NOTRAP unspec.
    Remove UNSPEC_SAHF unspec handling.

testsuite/ChangeLog:

2017-10-27  Uros Bizjak

    PR target/82692
    * gcc.dg/torture/pr82692.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

Index: config/i386/i386-modes.def
===================================================================
--- config/i386/i386-modes.def (revision 254111)
+++ config/i386/i386-modes.def (working copy)
@@ -72,8 +72,8 @@ CC_MODE (CCO);
 CC_MODE (CCP);
 CC_MODE (CCS);
 CC_MODE (CCZ);
+
 CC_MODE (CCFP);
-CC_MODE (CCFPU);

 /* Vector modes.  Note that VEC_CONCAT patterns require vector
    sizes twice as big as implemented in hardware.  */
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 254111)
+++ config/i386/i386.c (working copy)
@@ -16930,7 +16930,7 @@ put_condition_code (enum rtx_code code, machine_mo
 {
   const char *suffix;

-  if (mode == CCFPmode || mode == CCFPUmode)
+  if (mode == CCFPmode)
     {
       code = ix86_fp_compare_code_to_integer (code);
       mode = CCmode;
@@ -21709,14 +21709,13 @@ ix86_expand_int_compare (enum rtx_code code, rtx o
   return gen_rtx_fmt_ee (code, VOIDmode, flags, const0_rtx);
 }

-/* Figure out whether to use ordered or unordered fp comparisons.
-   Return the appropriate mode to use.  */
+/* Figure out whether to use unordered fp comparisons.  */

-machine_mode
-ix86_fp_compare_mode (enum rtx_code code)
+static bool
+ix86_unordered_fp_compare (enum rtx_code code)
 {
   if (!TARGET_IEEE_FP)
-    return CCFPmode;
+    return false;

   switch (code)
     {
@@ -21724,7 +21723,7 @@ ix86_expand_int_compare (enum rtx_code code, rtx o
     case GE:
     case LT:
     case LE:
-      return CCFPmode;
+      return false;

     case EQ:
     case NE:
@@ -21737,7 +21736,7 @@ ix86_expand_int_compare (enum rtx_code code, rtx o
     case UNGT:
     case UNGE:
     case UNEQ:
-      return CCFPUmode;
+      return true;

     default:
       gcc_unreachable ();
@@ -21752,7 +21751,7 @@ ix86_cc_mode (enum rtx_code code, rtx op0, rtx op1
   if (SCALAR_FLOAT_MODE_P (mode))
     {
       gcc_assert (!DECIMAL_FLOAT_MODE_P (mode));
-      return ix86_fp_compare_mode (code);
+      return CCFPmode;
     }

   switch (code)
@@ -21874,7 +21873,6 @@ ix86_cc_modes_compatible (machine_mode m1, machine
       }

     case E_CCFPmode:
-    case E_CCFPUmode:
       /* These are only compatible with themselves, which we already
          checked above.  */
       return VOIDmode;
@@ -21978,7 +21976,7 @@ ix86_fp_comparison_strategy (enum rtx_code)
 static enum rtx_code
 ix86_prepare_fp_compare_args (enum rtx_code code, rtx *pop0, rtx *pop1)
 {
-  machine_mode fpcmp_mode = ix86_fp_compare_mode (code);
+  bool unordered_compare = ix86_unordered_fp_compare (code);
   rtx op0 = *pop0, op1 = *pop1;
   machine_mode op_mode = GET_MODE (op0);
   bool is_sse = TARGET_SSE_MATH && SSE_FLOAT_MODE_P (op_mode);
@@ -21990,7 +21988,7 @@ ix86_prepare_fp_compare_args (enum rtx_code code,
      floating point.  */

   if (!is_sse
-      && (fpcmp_mode == CCFPUmode
+      && (unordered_compare
           || (op_mode == XFmode
               && ! (standard_80387_constant_p (op0) == 1
                     || standard_80387_constant_p (op1) == 1)
@@ -22087,10 +22085,10 @@ ix86_fp_compare_code_to_integer (enum rtx_code cod
 static rtx
 ix86_expand_fp_compare (enum rtx_code code, rtx op0, rtx op1, rtx scratch)
 {
-  machine_mode fpcmp_mode, intcmp_mode;
+  bool unordered_compare = ix86_unordered_fp_compare (code);
+  machine_mode intcmp_mode;
   rtx tmp, tmp2;

-  fpcmp_mode = ix86_fp_compare_mode (code);
   code = ix86_prepare_fp_compare_args (code, &op0, &op1);

   /* Do fcomi/sahf based test when profitable.  */
@@ -22097,17 +22095,19 @@ ix86_expand_fp_compare (enum rtx_code code, rtx op
   switch (ix86_fp_comparison_strategy (code))
     {
     case IX86_FPCMP_COMI:
-      intcmp_mode = fpcmp_mode;
-      tmp = gen_rtx_COMPARE (fpcmp_mode, op0, op1);
-      tmp = gen_rtx_SET (gen_rtx_REG (fpcmp_mode, FLAGS_REG), tmp);
-      emit_insn (tmp);
+      intcmp_mode = CCFPmode;
+      tmp = gen_rtx_COMPARE (CCFPmode, op0, op1);
+      if (unordered_compare)
+        tmp = gen_rtx_UNSPEC (CCFPmode, gen_rtvec (1, tmp), UNSPEC_NOTRAP);
+      emit_insn (gen_rtx_SET (gen_rtx_REG (CCFPmode, FLAGS_REG), tmp));
       break;

     case IX86_FPCMP_SAHF:
-      intcmp_mode = fpcmp_mode;
-      tmp = gen_rtx_COMPARE (fpcmp_mode, op0, op1);
-      tmp = gen_rtx_SET (gen_rtx_REG (fpcmp_mode, FLAGS_REG), tmp);
-
+      intcmp_mode = CCFPmode;
+      tmp = gen_rtx_COMPARE (CCFPmode, op0, op1);
+      if (unordered_compare)
+        tmp = gen_rtx_UNSPEC (CCFPmode, gen_rtvec (1, tmp), UNSPEC_NOTRAP);
+      tmp = gen_rtx_SET (gen_rtx_REG (CCFPmode, FLAGS_REG), tmp);
       if (!scratch)
         scratch = gen_reg_rtx (HImode);
       tmp2 = gen_rtx_CLOBBER (VOIDmode, scratch);
@@ -22116,11 +22116,13 @@ ix86_expand_fp_compare (enum rtx_code code, rtx op

     case IX86_FPCMP_ARITH:
       /* Sadness wrt reg-stack pops killing fpsr -- gotta get fnstsw first.  */
-      tmp = gen_rtx_COMPARE (fpcmp_mode, op0, op1);
-      tmp2 = gen_rtx_UNSPEC (HImode, gen_rtvec (1, tmp), UNSPEC_FNSTSW);
+      tmp = gen_rtx_COMPARE (CCFPmode, op0, op1);
+      if (unordered_compare)
+        tmp = gen_rtx_UNSPEC (CCFPmode, gen_rtvec (1, tmp), UNSPEC_NOTRAP);
+      tmp = gen_rtx_UNSPEC (HImode, gen_rtvec (1, tmp), UNSPEC_FNSTSW);
       if (!scratch)
         scratch = gen_reg_rtx (HImode);
-      emit_insn (gen_rtx_SET (scratch, tmp2));
+      emit_insn (gen_rtx_SET (scratch, tmp));

       /* In the unordered case, we have to check C2 for NaN's, which
          doesn't happen to work out to anything nice combination-wise.
@@ -22562,8 +22564,7 @@ ix86_expand_carry_flag_compare (enum rtx_code code
   compare_seq = get_insns ();
   end_sequence ();

-  if (GET_MODE (XEXP (compare_op, 0)) == CCFPmode
-      || GET_MODE (XEXP (compare_op, 0)) == CCFPUmode)
+  if (GET_MODE (XEXP (compare_op, 0)) == CCFPmode)
     code = ix86_fp_compare_code_to_integer (GET_CODE (compare_op));
   else
     code = GET_CODE (compare_op);
@@ -22703,8 +22704,7 @@ ix86_expand_int_movcc (rtx operands[])

           flags = XEXP (compare_op, 0);

-          if (GET_MODE (flags) == CCFPmode
-              || GET_MODE (flags) == CCFPUmode)
+          if (GET_MODE (flags) == CCFPmode)
             {
               fpcmp = true;
               compare_code
@@ -24744,8 +24744,7 @@ ix86_expand_int_addcc (rtx operands[])

   flags = XEXP (compare_op, 0);

-  if (GET_MODE (flags) == CCFPmode
-      || GET_MODE (flags) == CCFPUmode)
+  if (GET_MODE (flags) == CCFPmode)
     {
       fpcmp = true;
       code = ix86_fp_compare_code_to_integer (code);
@@ -43208,7 +43207,7 @@ ix86_encode_section_info (tree decl, rtx rtl, int
 enum rtx_code
 ix86_reverse_condition (enum rtx_code code, machine_mode mode)
 {
-  return (mode != CCFPmode && mode != CCFPUmode
+  return (mode != CCFPmode
           ? reverse_condition (code)
           : reverse_condition_maybe_unordered (code));
 }
@@ -43823,17 +43822,20 @@ static rtx_code_label *
 ix86_expand_sse_compare_and_jump (enum rtx_code code, rtx op0, rtx op1,
                                   bool swap_operands)
 {
-  machine_mode fpcmp_mode = ix86_fp_compare_mode (code);
+  bool unordered_compare = ix86_unordered_fp_compare (code);
   rtx_code_label *label;
-  rtx tmp;
+  rtx tmp, reg;

   if (swap_operands)
     std::swap (op0, op1);

   label = gen_label_rtx ();
-  tmp = gen_rtx_REG (fpcmp_mode, FLAGS_REG);
-  emit_insn (gen_rtx_SET (tmp, gen_rtx_COMPARE (fpcmp_mode, op0, op1)));
-  tmp = gen_rtx_fmt_ee (code, VOIDmode, tmp, const0_rtx);
+  tmp = gen_rtx_COMPARE (CCFPmode, op0, op1);
+  if (unordered_compare)
+    tmp = gen_rtx_UNSPEC (CCFPmode, gen_rtvec (1, tmp), UNSPEC_NOTRAP);
+  reg = gen_rtx_REG (CCFPmode, FLAGS_REG);
+  emit_insn (gen_rtx_SET (reg, tmp));
+  tmp = gen_rtx_fmt_ee (code, VOIDmode, reg, const0_rtx);
   tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, tmp,
                               gen_rtx_LABEL_REF (VOIDmode, label), pc_rtx);
   tmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 254111)
+++ config/i386/i386.md (working copy)
@@ -99,6 +99,7 @@
   UNSPEC_SCAS
   UNSPEC_FNSTSW
   UNSPEC_SAHF
+  UNSPEC_NOTRAP
   UNSPEC_PARITY
   UNSPEC_FSTCW
   UNSPEC_FLDCW
@@ -1478,9 +1479,6 @@

 ;; FP compares, step 1:
 ;; Set the FP condition codes.
-;;
-;; CCFPmode     compare with exceptions
-;; CCFPUmode    compare with no exceptions

 ;; We may not use "#" to split and emit these, since the REG_DEAD notes
 ;; used to manage the reg stack popping would not be preserved.
@@ -1587,9 +1585,11 @@
 (define_insn "*cmpu_i387"
   [(set (match_operand:HI 0 "register_operand" "=a")
         (unspec:HI
-          [(compare:CCFPU
-             (match_operand:X87MODEF 1 "register_operand" "f")
-             (match_operand:X87MODEF 2 "register_operand" "f"))]
+          [(unspec:CCFP
+             [(compare:CCFP
+                (match_operand:X87MODEF 1 "register_operand" "f")
+                (match_operand:X87MODEF 2 "register_operand" "f"))]
+             UNSPEC_NOTRAP)]
           UNSPEC_FNSTSW))]
   "TARGET_80387"
   "* return output_fp_compare (insn, operands, false, true);"
@@ -1598,10 +1598,12 @@
    (set_attr "mode" "")])

 (define_insn_and_split "*cmpu_cc_i387"
-  [(set (reg:CCFPU FLAGS_REG)
-        (compare:CCFPU
-          (match_operand:X87MODEF 1 "register_operand" "f")
-          (match_operand:X87MODEF 2 "register_operand" "f")))
+  [(set (reg:CCFP FLAGS_REG)
+        (unspec:CCFP
+          [(compare:CCFP
+             (match_operand:X87MODEF 1 "register_operand" "f")
+             (match_operand:X87MODEF 2 "register_operand" "f"))]
+          UNSPEC_NOTRAP))
    (clobber (match_operand:HI 0 "register_operand" "=a"))]
   "TARGET_80387 && TARGET_SAHF && !TARGET_CMOVE"
   "#"
@@ -1608,8 +1610,10 @@
   "&& reload_completed"
   [(set (match_dup 0)
         (unspec:HI
-          [(compare:CCFPU (match_dup 1)(match_dup 2))]
-        UNSPEC_FNSTSW))
+          [(unspec:CCFP
+             [(compare:CCFP (match_dup 1)(match_dup 2))]
+             UNSPEC_NOTRAP)]
+          UNSPEC_FNSTSW))
    (set (reg:CC FLAGS_REG)
         (unspec:CC [(match_dup 0)] UNSPEC_SAHF))]
   ""
@@ -1697,20 +1701,28 @@
 ;; Pentium Pro can do steps 1 through 3 in one go.
 ;; (these instructions set flags directly)

-(define_mode_iterator FPCMP [CCFP CCFPU])
-(define_mode_attr unord [(CCFP "") (CCFPU "u")])
+(define_subst_attr "unord" "unord_subst" "" "u")
+(define_subst_attr "unordered" "unord_subst" "false" "true")

-(define_insn "*cmpi"
-  [(set (reg:FPCMP FLAGS_REG)
-        (compare:FPCMP
+(define_subst "unord_subst"
+  [(set (match_operand:CCFP 0)
+        (match_operand:CCFP 1))]
+  ""
+  [(set (match_dup 0)
+        (unspec:CCFP
+          [(match_dup 1)]
+          UNSPEC_NOTRAP))])
+
+(define_insn "*cmpi"
+  [(set (reg:CCFP FLAGS_REG)
+        (compare:CCFP
           (match_operand:MODEF 0 "register_operand" "f,v")
           (match_operand:MODEF 1 "register_ssemem_operand" "f,vm")))]
   "(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
    || (TARGET_80387 && TARGET_CMOVE)"
   "@
-   * return output_fp_compare (insn, operands, true, \
-                               mode == CCFPUmode);
-   %vcomi\t{%1, %0|%0, %1}"
+   * return output_fp_compare (insn, operands, true, );
+   %vcomi\t{%1, %0|%0, %1}"
   [(set_attr "type" "fcmp,ssecomi")
    (set_attr "prefix" "orig,maybe_vex")
    (set_attr "mode" "")
@@ -1739,13 +1751,12 @@
               (symbol_ref "false"))))])

 (define_insn "*cmpixf_i387"
-  [(set (reg:FPCMP FLAGS_REG)
-        (compare:FPCMP
+  [(set (reg:CCFP FLAGS_REG)
+        (compare:CCFP
           (match_operand:XF 0 "register_operand" "f")
           (match_operand:XF 1 "register_operand" "f")))]
   "TARGET_80387 && TARGET_CMOVE"
-  "* return output_fp_compare (insn, operands, true,
-                               mode == CCFPUmode);"
+  "* return output_fp_compare (insn, operands, true, );"
   [(set_attr "type" "fcmp")
    (set_attr "mode" "XF")
    (set_attr "athlon_decode" "vector")
Index: config/i386/predicates.md
===================================================================
--- config/i386/predicates.md (revision 254111)
+++ config/i386/predicates.md (working copy)
@@ -1301,7 +1301,7 @@
   machine_mode inmode = GET_MODE (XEXP (op, 0));
   enum rtx_code code = GET_CODE (op);

-  if (inmode == CCFPmode || inmode == CCFPUmode)
+  if (inmode == CCFPmode)
     {
       if (!ix86_trivial_fp_comparison_operator (op, mode))
         return false;
@@ -1311,7 +1311,7 @@
   switch (code)
     {
     case LTU: case GTU: case LEU: case GEU:
-      if (inmode == CCmode || inmode == CCFPmode || inmode == CCFPUmode
+      if (inmode == CCmode || inmode == CCFPmode
           || inmode == CCCmode)
         return true;
       return false;
@@ -1348,7 +1348,7 @@
   machine_mode inmode = GET_MODE (XEXP (op, 0));
   enum rtx_code code = GET_CODE (op);

-  if (inmode == CCFPmode || inmode == CCFPUmode)
+  if (inmode == CCFPmode)
     return ix86_trivial_fp_comparison_operator (op, mode);

   switch (code)
@@ -1391,7 +1391,7 @@
   machine_mode inmode = GET_MODE (XEXP (op, 0));
   enum rtx_code code = GET_CODE (op);

-  if (inmode == CCFPmode || inmode == CCFPUmode)
+  if (inmode == CCFPmode)
     {
       if (!ix86_trivial_fp_comparison_operator (op, mode))
         return false;
Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md (revision 254111)
+++ config/i386/sse.md (working copy)
@@ -2755,7 +2755,7 @@
    (set_attr "prefix" "evex")
    (set_attr "mode" "")])

-(define_insn "_comi"
+(define_insn "_comi"
   [(set (reg:CCFP FLAGS_REG)
         (compare:CCFP
           (vec_select:MODEF
@@ -2765,7 +2765,7 @@
             (match_operand: 1 "" "")
             (parallel [(const_int 0)]))))]
   "SSE_FLOAT_MODE_P (mode)"
-  "%vcomi\t{%1, %0|%0, %1}"
+  "%vcomi\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecomi")
    (set_attr "prefix" "maybe_vex")
    (set_attr "prefix_rep" "0")
@@ -2775,26 +2775,6 @@
                       (const_string "0")))
    (set_attr "mode" "")])

-(define_insn "_ucomi"
-  [(set (reg:CCFPU FLAGS_REG)
-        (compare:CCFPU
-          (vec_select:MODEF
-            (match_operand: 0 "register_operand" "v")
-            (parallel [(const_int 0)]))
-          (vec_select:MODEF
-            (match_operand: 1 "" "")
-            (parallel [(const_int 0)]))))]
-  "SSE_FLOAT_MODE_P (mode)"
-  "%vucomi\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssecomi")
-   (set_attr "prefix" "maybe_vex")
-   (set_attr "prefix_rep" "0")
-   (set (attr "prefix_data16")
-        (if_then_else (eq_attr "mode" "DF")
-                      (const_string "1")
-                      (const_string "0")))
-   (set_attr "mode" "")])
-
 (define_expand "vec_cmp"
   [(set (match_operand: 0 "register_operand")
         (match_operator: 1 ""
Index: config/i386/subst.md
===================================================================
--- config/i386/subst.md (revision 254111)
+++ config/i386/subst.md (working copy)
@@ -37,8 +37,7 @@
    V8DI  V4DI  V2DI
    V16SF V8SF  V4SF
    V8DF  V4DF  V2DF
-   QI HI SI DI SF DF
-   CCFP CCFPU])
+   QI HI SI DI SF DF])

 (define_subst_attr "mask_name" "mask" "" "_mask")
 (define_subst_attr "mask_applied" "mask" "false" "true")
@@ -183,6 +182,16 @@
                   UNSPEC_EMBEDDED_ROUNDING))
 ])

+(define_subst "round_saeonly"
+  [(set (match_operand:CCFP 0)
+        (match_operand:CCFP 1))]
+  "TARGET_AVX512F"
+  [(set (match_dup 0)
+        (unspec:CCFP [(match_dup 1)
+                      (match_operand:SI 2 "const48_operand")]
+                     UNSPEC_EMBEDDED_ROUNDING))
+])
+
 (define_subst_attr "round_expand_name" "round_expand" "" "_round")
 (define_subst_attr "round_expand_nimm_predicate" "round_expand" "nonimmediate_operand" "register_operand")
 (define_subst_attr "round_expand_operand" "round_expand" "" ", operands[5]")
Index: reg-stack.c
===================================================================
--- reg-stack.c (revision 254111)
+++ reg-stack.c (working copy)
@@ -1560,12 +1560,6 @@ subst_stack_regs_pat (rtx_insn *insn, stack_ptr re

         switch (GET_CODE (pat_src))
           {
-          case COMPARE:
-            /* `fcomi' insn can't pop two regs.  */
-            compare_for_stack_reg (insn, regstack, pat_src,
-                                   REGNO (*dest) != FLAGS_REG);
-            break;
-
           case CALL:
             {
               int count;
@@ -1966,15 +1960,6 @@ subst_stack_regs_pat (rtx_insn *insn, stack_ptr re
                 replace_reg (src2, FIRST_STACK_REG + 1);
                 break;

-              case UNSPEC_SAHF:
-                /* (unspec [(unspec [(compare)] UNSPEC_FNSTSW)] UNSPEC_SAHF)
-                   The combination matches the PPRO fcomi instruction.  */
-
-                pat_src = XVECEXP (pat_src, 0, 0);
-                gcc_assert (GET_CODE (pat_src) == UNSPEC);
-                gcc_assert (XINT (pat_src, 1) == UNSPEC_FNSTSW);
-                /* Fall through.  */
-
               case UNSPEC_FNSTSW:
                 /* Combined fcomp+fnstsw generated for doing well with CSE.
                    When optimizing this would have been broken
@@ -1981,16 +1966,29 @@ subst_stack_regs_pat (rtx_insn *insn, stack_ptr re
                    up before now.  */

                 pat_src = XVECEXP (pat_src, 0, 0);
+                if (GET_CODE (pat_src) == COMPARE)
+                  goto do_compare;
+
+                /* Fall through.  */
+
+              case UNSPEC_NOTRAP:
+
+                pat_src = XVECEXP (pat_src, 0, 0);
                 gcc_assert (GET_CODE (pat_src) == COMPARE);
+                goto do_compare;

-                compare_for_stack_reg (insn, regstack, pat_src, true);
-                break;
-
               default:
                 gcc_unreachable ();
               }
             break;

+          case COMPARE:
+          do_compare:
+            /* `fcomi' insn can't pop two regs.  */
+            compare_for_stack_reg (insn, regstack, pat_src,
+                                   REGNO (*dest) != FLAGS_REG);
+            break;
+
           case IF_THEN_ELSE:
             /* This insn requires the top of stack to be the
                destination.  */
Index: testsuite/gcc.target/i386/pr82692.c
===================================================================
--- testsuite/gcc.target/i386/pr82692.c (nonexistent)
+++ testsuite/gcc.target/i386/pr82692.c (working copy)
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target fenv_exceptions } */
+
+#include <fenv.h>
+
+extern void abort (void);
+extern void exit (int);
+
+double __attribute__ ((noinline, noclone))
+foo (double x)
+{
+  if (__builtin_islessequal (x, 0.0) || __builtin_isgreater (x, 1.0))
+    return x + x;
+  return x * x;
+}
+
+int
+main (void)
+{
+  volatile double x = foo (__builtin_nan (""));
+  if (fetestexcept (FE_INVALID))
+    abort ();
+  exit (0);
+}