From patchwork Tue Nov 2 22:06:02 2010
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 69926
Date: Tue, 2 Nov 2010 15:06:02 -0700
From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org, Uros Bizjak
Subject: Re: PATCH: Emit vzerouppers after reload
Message-ID: <20101102220602.GA9756@intel.com>
Reply-To: "H.J. Lu"
References: <20101102180606.GA4551@intel.com> <20101102213517.GA9519@intel.com>
In-Reply-To: <20101102213517.GA9519@intel.com>

On Tue, Nov 02, 2010 at 02:35:17PM -0700, H.J. Lu wrote:
> On Tue, Nov 02, 2010 at 11:06:06AM -0700, H.J. Lu wrote:
> > Hi,
> >
> > This patch changes vzeroupper optimization to emit vzerouppers after
> > reload.  I checked it in as approved by Uros offline.
> >
> > Thanks.
> >
> >
> > H.J.
> > ---
> > gcc/
> >
> > 2010-11-02  Uros Bizjak
> > 	    H.J. Lu
> >
> > 	* config/i386/i386-protos.h (ix86_split_call_vzeroupper): New.
> > 	(ix86_split_call_pop_vzeroupper): Likewise.
> >
> > 	* config/i386/i386.c (move_or_delete_vzeroupper_2): Rewrite
> > 	the loop.
> > 	(ix86_expand_call): Use UNSPEC_CALL_NEEDS_VZEROUPPER.
> > 	(ix86_split_call_vzeroupper): New.
> > 	(ix86_split_call_pop_vzeroupper): Likewise.
> >
> > 	* config/i386/i386.md (UNSPEC_CALL_NEEDS_VZEROUPPER): New.
> > 	(*call_pop_0_vzeroupper): Likewise.
> > 	(*call_pop_1_vzeroupper): Likewise.
> > 	(*sibcall_pop_1_vzeroupper): Likewise.
> > 	(*call_0_vzeroupper): Likewise.
> > 	(*call_1_vzeroupper): Likewise.
> > 	(*sibcall_1_vzeroupper): Likewise.
> > 	(*call_1_rex64_vzeroupper): Likewise.
> > 	(*call_1_rex64_ms_sysv_vzeroupper): New.
> > 	(*call_1_rex64_large_vzeroupper): Likewise.
> > 	(*sibcall_1_rex64_vzeroupper): Likewise.
> > 	(*call_value_pop_0_vzeroupper): New.
> > 	(*call_value_pop_1_vzeroupper): Likewise.
> > 	(*sibcall_value_pop_1_vzeroupper): Likewise.
> > 	(*call_value_0_vzeroupper): New.
> > 	(*call_value_0_rex64_vzeroupper): Use
> > 	(*call_value_0_rex64_ms_sysv_vzeroupper): Likewise.
> > 	(*call_value_1_vzeroupper): Likewise.
> > 	(*sibcall_value_1_vzeroupper): Likewise.
> > 	(*call_value_1_rex64_vzeroupper): Likewise.
> > 	(*call_value_1_rex64_ms_sysv_vzeroupper): Likewise.
> > 	(*call_value_1_rex64_large_vzeroupper): Likewise.
> > 	(*sibcall_value_1_rex64_vzeroupper): Likewise.
> >
>
> I checked in this patch as an obvious fix to correct a typo.
>

ix86_split_call_pop_vzeroupper isn't needed.  The call-pop vzeroupper
patterns should wrap the call and the stack-pointer adjustment in a
parallel and just call ix86_split_call_vzeroupper.  Checked in as an
obvious fix.


H.J.
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 166215)
+++ ChangeLog	(working copy)
@@ -1,5 +1,20 @@
 2010-11-02  H.J. Lu
 
+	* config/i386/i386-protos.h (ix86_split_call_pop_vzeroupper):
+	Removed.
+	* config/i386/i386.c (ix86_split_call_pop_vzeroupper): Likewise.
+
+	* config/i386/i386.md (*call_pop_0_vzeroupper): Use parallel
+	and call ix86_split_call_vzeroupper instead of
+	ix86_split_call_pop_vzeroupper.
+	(*call_pop_1_vzeroupper): Likewise.
+	(*sibcall_pop_1_vzeroupper): Likewise.
+	(*call_value_pop_0_vzeroupper): Likewise.
+	(*call_value_pop_1_vzeroupper): Likewise.
+	(*sibcall_value_pop_1_vzeroupper): Likewise.
+
+2010-11-02  H.J. Lu
+
 	* config/i386/i386.md (*sibcall_1_rex64_vzeroupper): Fix a typo.
 
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 166215)
+++ config/i386/i386.md	(working copy)
@@ -11262,18 +11262,19 @@
 })
 
 (define_insn_and_split "*call_pop_0_vzeroupper"
-  [(call (mem:QI (match_operand:SI 0 "constant_call_address_operand" ""))
-         (match_operand:SI 1 "" ""))
-   (set (reg:SI SP_REG)
-        (plus:SI (reg:SI SP_REG)
-                 (match_operand:SI 2 "immediate_operand" "")))
+  [(parallel
+    [(call (mem:QI (match_operand:SI 0 "constant_call_address_operand" ""))
+           (match_operand:SI 1 "" ""))
+     (set (reg:SI SP_REG)
+          (plus:SI (reg:SI SP_REG)
+                   (match_operand:SI 2 "immediate_operand" "")))])
    (unspec [(match_operand 3 "const_int_operand" "")]
            UNSPEC_CALL_NEEDS_VZEROUPPER)]
   "TARGET_VZEROUPPER && !TARGET_64BIT"
   "#"
   "&& reload_completed"
   [(const_int 0)]
-  "ix86_split_call_pop_vzeroupper (curr_insn, operands[3]); DONE;"
+  "ix86_split_call_vzeroupper (curr_insn, operands[3]); DONE;"
   [(set_attr "type" "call")])
 
 (define_insn "*call_pop_0"
@@ -11292,18 +11293,19 @@
   [(set_attr "type" "call")])
 
 (define_insn_and_split "*call_pop_1_vzeroupper"
-  [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "lsm"))
-         (match_operand:SI 1 "" ""))
-   (set (reg:SI SP_REG)
-        (plus:SI (reg:SI SP_REG)
-                 (match_operand:SI 2 "immediate_operand" "i")))
+  [(parallel
+    [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "lsm"))
+           (match_operand:SI 1 "" ""))
+     (set (reg:SI SP_REG)
+          (plus:SI (reg:SI SP_REG)
+                   (match_operand:SI 2 "immediate_operand" "i")))])
    (unspec [(match_operand 3 "const_int_operand" "")]
            UNSPEC_CALL_NEEDS_VZEROUPPER)]
   "TARGET_VZEROUPPER && !TARGET_64BIT && !SIBLING_CALL_P (insn)"
   "#"
   "&& reload_completed"
   [(const_int 0)]
-  "ix86_split_call_pop_vzeroupper (curr_insn, operands[3]); DONE;"
+  "ix86_split_call_vzeroupper (curr_insn, operands[3]); DONE;"
   [(set_attr "type" "call")])
 
 (define_insn "*call_pop_1"
@@ -11321,18 +11323,19 @@
   [(set_attr "type" "call")])
 
 (define_insn_and_split "*sibcall_pop_1_vzeroupper"
-  [(call (mem:QI (match_operand:SI 0 "sibcall_insn_operand" "s,U"))
-         (match_operand:SI 1 "" ""))
-   (set (reg:SI SP_REG)
-        (plus:SI (reg:SI SP_REG)
-                 (match_operand:SI 2 "immediate_operand" "i,i")))
+  [(parallel
+    [(call (mem:QI (match_operand:SI 0 "sibcall_insn_operand" "s,U"))
+           (match_operand:SI 1 "" ""))
+     (set (reg:SI SP_REG)
+          (plus:SI (reg:SI SP_REG)
+                   (match_operand:SI 2 "immediate_operand" "i,i")))])
    (unspec [(match_operand 3 "const_int_operand" "")]
            UNSPEC_CALL_NEEDS_VZEROUPPER)]
   "TARGET_VZEROUPPER && !TARGET_64BIT && SIBLING_CALL_P (insn)"
   "#"
   "&& reload_completed"
   [(const_int 0)]
-  "ix86_split_call_pop_vzeroupper (curr_insn, operands[3]); DONE;"
+  "ix86_split_call_vzeroupper (curr_insn, operands[3]); DONE;"
   [(set_attr "type" "call")])
 
 (define_insn "*sibcall_pop_1"
@@ -17269,19 +17272,20 @@
 ;; disrupt insn-recog's switch tables.
 
 (define_insn_and_split "*call_value_pop_0_vzeroupper"
-  [(set (match_operand 0 "" "")
-        (call (mem:QI (match_operand:SI 1 "constant_call_address_operand" ""))
-              (match_operand:SI 2 "" "")))
-   (set (reg:SI SP_REG)
-        (plus:SI (reg:SI SP_REG)
-                 (match_operand:SI 3 "immediate_operand" "")))
+  [(parallel
+    [(set (match_operand 0 "" "")
+          (call (mem:QI (match_operand:SI 1 "constant_call_address_operand" ""))
+                (match_operand:SI 2 "" "")))
+     (set (reg:SI SP_REG)
+          (plus:SI (reg:SI SP_REG)
+                   (match_operand:SI 3 "immediate_operand" "")))])
    (unspec [(match_operand 4 "const_int_operand" "")]
            UNSPEC_CALL_NEEDS_VZEROUPPER)]
   "TARGET_VZEROUPPER && !TARGET_64BIT"
   "#"
   "&& reload_completed"
   [(const_int 0)]
-  "ix86_split_call_pop_vzeroupper (curr_insn, operands[4]); DONE;"
+  "ix86_split_call_vzeroupper (curr_insn, operands[4]); DONE;"
   [(set_attr "type" "callv")])
 
 (define_insn "*call_value_pop_0"
@@ -17296,19 +17300,20 @@
   [(set_attr "type" "callv")])
 
 (define_insn_and_split "*call_value_pop_1_vzeroupper"
-  [(set (match_operand 0 "" "")
-        (call (mem:QI (match_operand:SI 1 "call_insn_operand" "lsm"))
-              (match_operand:SI 2 "" "")))
-   (set (reg:SI SP_REG)
-        (plus:SI (reg:SI SP_REG)
-                 (match_operand:SI 3 "immediate_operand" "i")))
+  [(parallel
+    [(set (match_operand 0 "" "")
+          (call (mem:QI (match_operand:SI 1 "call_insn_operand" "lsm"))
+                (match_operand:SI 2 "" "")))
+     (set (reg:SI SP_REG)
+          (plus:SI (reg:SI SP_REG)
+                   (match_operand:SI 3 "immediate_operand" "i")))])
    (unspec [(match_operand 4 "const_int_operand" "")]
            UNSPEC_CALL_NEEDS_VZEROUPPER)]
   "TARGET_VZEROUPPER && !TARGET_64BIT && !SIBLING_CALL_P (insn)"
   "#"
   "&& reload_completed"
   [(const_int 0)]
-  "ix86_split_call_pop_vzeroupper (curr_insn, operands[4]); DONE;"
+  "ix86_split_call_vzeroupper (curr_insn, operands[4]); DONE;"
   [(set_attr "type" "callv")])
 
 (define_insn "*call_value_pop_1"
@@ -17323,19 +17328,20 @@
   [(set_attr "type" "callv")])
 
 (define_insn_and_split "*sibcall_value_pop_1_vzeroupper"
-  [(set (match_operand 0 "" "")
-        (call (mem:QI (match_operand:SI 1 "sibcall_insn_operand" "s,U"))
-              (match_operand:SI 2 "" "")))
-   (set (reg:SI SP_REG)
-        (plus:SI (reg:SI SP_REG)
-                 (match_operand:SI 3 "immediate_operand" "i,i")))
+  [(parallel
+    [(set (match_operand 0 "" "")
+          (call (mem:QI (match_operand:SI 1 "sibcall_insn_operand" "s,U"))
+                (match_operand:SI 2 "" "")))
+     (set (reg:SI SP_REG)
+          (plus:SI (reg:SI SP_REG)
+                   (match_operand:SI 3 "immediate_operand" "i,i")))])
    (unspec [(match_operand 4 "const_int_operand" "")]
            UNSPEC_CALL_NEEDS_VZEROUPPER)]
   "TARGET_VZEROUPPER && !TARGET_64BIT && SIBLING_CALL_P (insn)"
   "#"
   "&& reload_completed"
   [(const_int 0)]
-  "ix86_split_call_pop_vzeroupper (curr_insn, operands[4]); DONE;"
+  "ix86_split_call_vzeroupper (curr_insn, operands[4]); DONE;"
   [(set_attr "type" "callv")])
 
 (define_insn "*sibcall_value_pop_1"
Index: config/i386/i386-protos.h
===================================================================
--- config/i386/i386-protos.h	(revision 166214)
+++ config/i386/i386-protos.h	(working copy)
@@ -120,7 +120,6 @@ extern void ix86_expand_sse4_unpack (rtx
 extern bool ix86_expand_int_addcc (rtx[]);
 extern rtx ix86_expand_call (rtx, rtx, rtx, rtx, rtx, int);
 extern void ix86_split_call_vzeroupper (rtx, rtx);
-extern void ix86_split_call_pop_vzeroupper (rtx, rtx);
 extern void x86_initialize_trampoline (rtx, rtx, rtx);
 extern rtx ix86_zero_extend_to_Pmode (rtx);
 extern void ix86_split_long_move (rtx[]);
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 166214)
+++ config/i386/i386.c	(working copy)
@@ -21561,16 +21561,6 @@ ix86_split_call_vzeroupper (rtx insn, rt
   emit_call_insn (call);
 }
 
-void
-ix86_split_call_pop_vzeroupper (rtx insn, rtx vzeroupper)
-{
-  rtx call = XVECEXP (PATTERN (insn), 0, 0);
-  rtx pop = XVECEXP (PATTERN (insn), 0, 1);
-  emit_insn (gen_avx_vzeroupper (vzeroupper));
-  emit_call_insn (gen_rtx_PARALLEL (VOIDmode,
-                                    gen_rtvec (2, call, pop)));
-}
-
 /* Output the assembly for a call instruction.  */
 
 const char *