From patchwork Tue Jun 22 13:28:47 2010
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 56495
Date: Tue, 22 Jun 2010 15:28:47 +0200
Subject: [PATCH, i386]: Macroize integer move insns
From: Uros Bizjak
To: gcc-patches@gcc.gnu.org

Hello!

The attached patch macroizes and reorganizes the move instructions.
Although it looks scary, it is mostly just code moving around to
better places.
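For readers less familiar with machine description iterators, here is a
minimal sketch of the idea, using code taken from the patch itself:

    ;; One iterator names a set of modes...
    (define_mode_iterator SWI1248x [QI HI SI DI])

    ;; ...and a single template then stands for four expanders
    ;; (movqi, movhi, movsi and movdi).  <MODE> substitutes the mode
    ;; name of the current instantiation:
    (define_expand "mov<mode>"
      [(set (match_operand:SWI1248x 0 "nonimmediate_operand" "")
            (match_operand:SWI1248x 1 "general_operand" ""))]
      ""
      "ix86_expand_move (<MODE>mode, operands); DONE;")

Mode attributes work the same way inside output templates, e.g.
"or{<imodesuffix>}" expands to or{b}, or{w}, or{l} or or{q}, depending
on the mode being instantiated.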
2010-06-22  Uros Bizjak  <ubizjak@gmail.com>

	* config/i386/i386.md (SWI1248x): New mode iterator.
	(SWI48x): Ditto.
	(SWI12): Ditto.
	(SWI24): Ditto.
	(mov<mode>): Macroize expander from mov{qi,hi,si,di} using
	SWI1248x mode iterator.
	(*push<mode>2_rex64): Macroize insn from *push{qi,hi,si}_rex64
	using SWI124 mode iterator.
	(*push<mode>2): Macroize insn from *push{qi,hi} using SWI12
	mode iterator.
	(*push<mode>2_prologue): Macroize insn from *pushsi2_prologue and
	*pushdi2_prologue_rex64 using P mode iterator.
	(*mov<mode>_xor): Macroize insn from *movsi_xor and
	*movdi_xor_rex64 using SWI48 mode iterator.
	(*mov<mode>_or): Ditto from *movsi_or and *movdi_or_rex64.
	(*movabs<mode>_1): Macroize insn from *movabs{qi,hi,si,di}_1_rex64
	using SWI1248x mode iterator.
	(*movabs<mode>_2): Ditto from *movabs{qi,hi,si,di}_2_rex64.
	(*swap<mode>): Macroize insn from *swapsi and *swapdi_rex64 using
	SWI48 mode iterator.
	(*swap<mode>_1): Macroize insn from *swap{qi,hi}_1 using SWI12
	mode iterator.
	(*swap<mode>_2): Ditto from *swap{qi,hi}_2.
	(movstrict<mode>): Macroize expander from movstrict{qi,hi} using
	SWI12 mode iterator.
	(*movstrict<mode>_1): Macroize insn from *movstrict{qi,hi}_1
	using SWI12 mode iterator.
	(*movstrict<mode>_xor): Ditto from *movstrict{qi,hi}_xor.
	(*mov<mode>_extv_1): Macroize insn from *mov{hi,si}_extv_1 using
	SWI24 mode iterator.
	(*mov<mode>_extzv_1): Macroize insn from *mov{si,di}_extzv_1
	using SWI48 mode iterator.
	(mov<mode>_insv_1): New expander.
	(*mov<mode>_insv_1_rex64): Macroize insn from
	*mov{si,di}_insv_1_rex64 using SWI48x mode iterator.
	(*movoi_internal_avx): Rename from *movoi_internal.
	(*movti_internal_rex64): Rename from *movti_rex64.
	(*movti_internal_sse): Rename from *movti_sse.
	(*movdi_internal_rex64): Rename from *movdi_1_rex64.
	(*movdi_internal): Rename from *movdi_2.
	(*movsi_internal): Rename from *movsi_1.
	(*movhi_internal): Rename from *movhi_1.
	(*movqi_internal): Rename from *movqi_1.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}. Committed to mainline SVN.

Uros.

Index: i386.md
===================================================================
--- i386.md	(revision 161130)
+++ i386.md	(working copy)
@@ -765,12 +765,24 @@ (define_code_attr sgnprefix [(sign_extend "i") (zero_extend "") (div "i") (udiv "")]) -;; All single word integer modes. +;; 64bit single word integer modes. +(define_mode_iterator SWI1248x [QI HI SI DI]) + +;; 64bit single word integer modes without QImode and HImode. +(define_mode_iterator SWI48x [SI DI]) + +;; Single word integer modes. (define_mode_iterator SWI [QI HI SI (DI "TARGET_64BIT")]) +;; Single word integer modes without SImode and DImode. +(define_mode_iterator SWI12 [QI HI]) + ;; Single word integer modes without DImode. (define_mode_iterator SWI124 [QI HI SI]) +;; Single word integer modes without QImode and DImode. +(define_mode_iterator SWI24 [HI SI]) + ;; Single word integer modes without QImode. (define_mode_iterator SWI248 [HI SI (DI "TARGET_64BIT")]) @@ -1585,13 +1597,47 @@ ;; Move instructions. -;; General case of fullword move. +(define_expand "movoi" + [(set (match_operand:OI 0 "nonimmediate_operand" "") + (match_operand:OI 1 "general_operand" ""))] + "TARGET_AVX" + "ix86_expand_move (OImode, operands); DONE;") -(define_expand "movsi" - [(set (match_operand:SI 0 "nonimmediate_operand" "") - (match_operand:SI 1 "general_operand" ""))] +(define_expand "movti" + [(set (match_operand:TI 0 "nonimmediate_operand" "") + (match_operand:TI 1 "nonimmediate_operand" ""))] + "TARGET_64BIT || TARGET_SSE" +{ + if (TARGET_64BIT) + ix86_expand_move (TImode, operands); + else if (push_operand (operands[0], TImode)) + ix86_expand_push (TImode, operands[1]); + else + ix86_expand_vector_move (TImode, operands); + DONE; +}) + +;; This expands to what emit_move_complex would generate if we didn't +;; have a movti pattern. Having this avoids problems with reload on +;; 32-bit targets when SSE is present, but doesn't seem to be harmful +;; to have around all the time. +(define_expand "movcdi" + [(set (match_operand:CDI 0 "nonimmediate_operand" "") + (match_operand:CDI 1 "general_operand" ""))] + "" +{ + if (push_operand (operands[0], CDImode)) + emit_move_complex_push (CDImode, operands[0], operands[1]); + else + emit_move_complex_parts (operands[0], operands[1]); + DONE; +}) + +(define_expand "mov<mode>" + [(set (match_operand:SWI1248x 0 "nonimmediate_operand" "") + (match_operand:SWI1248x 1 "general_operand" ""))] "" - "ix86_expand_move (SImode, operands); DONE;") + "ix86_expand_move (<MODE>mode, operands); DONE;") ;; Push/pop instructions. They are separate since autoinc/dec is not a ;; general_operand. @@ -1602,6 +1648,79 @@ ;; targets without our curiosities, and it is just as easy to represent ;; this differently. +(define_insn "*pushdi2_rex64" + [(set (match_operand:DI 0 "push_operand" "=<,!<") + (match_operand:DI 1 "general_no_elim_operand" "re*m,n"))] + "TARGET_64BIT" + "@ + push{q}\t%1 + #" + [(set_attr "type" "push,multi") + (set_attr "mode" "DI")]) + +;; Convert impossible pushes of immediate to existing instructions. +;; First try to get scratch register and go through it. In case this +;; fails, push sign extended lower part first and then overwrite +;; upper part by 32bit move.
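;; (Illustration, not part of the patch: "pushq $0x123456789" cannot be
;; encoded, since pushq only takes a 32-bit sign-extended immediate.
;; With a scratch register available (here %rax, for illustration) the
;; peephole below emits
;;   movabsq $0x123456789, %rax
;;   pushq   %rax
;; and the fallback emits
;;   pushq   $0x23456789
;;   movl    $0x1, 4(%rsp)
;; where the movl fixes up the upper half of the pushed slot.)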
+(define_peephole2 + [(match_scratch:DI 2 "r") + (set (match_operand:DI 0 "push_operand" "") + (match_operand:DI 1 "immediate_operand" ""))] + "TARGET_64BIT && !symbolic_operand (operands[1], DImode) + && !x86_64_immediate_operand (operands[1], DImode)" + [(set (match_dup 2) (match_dup 1)) + (set (match_dup 0) (match_dup 2))] + "") + +;; We need to define this as both peepholer and splitter for case +;; peephole2 pass is not run. +;; "&& 1" is needed to keep it from matching the previous pattern. +(define_peephole2 + [(set (match_operand:DI 0 "push_operand" "") + (match_operand:DI 1 "immediate_operand" ""))] + "TARGET_64BIT && !symbolic_operand (operands[1], DImode) + && !x86_64_immediate_operand (operands[1], DImode) && 1" + [(set (match_dup 0) (match_dup 1)) + (set (match_dup 2) (match_dup 3))] +{ + split_di (&operands[1], 1, &operands[2], &operands[3]); + + operands[1] = gen_lowpart (DImode, operands[2]); + operands[2] = gen_rtx_MEM (SImode, gen_rtx_PLUS (DImode, stack_pointer_rtx, + GEN_INT (4))); +}) + +(define_split + [(set (match_operand:DI 0 "push_operand" "") + (match_operand:DI 1 "immediate_operand" ""))] + "TARGET_64BIT && ((optimize > 0 && flag_peephole2) + ? epilogue_completed : reload_completed) + && !symbolic_operand (operands[1], DImode) + && !x86_64_immediate_operand (operands[1], DImode)" + [(set (match_dup 0) (match_dup 1)) + (set (match_dup 2) (match_dup 3))] +{ + split_di (&operands[1], 1, &operands[2], &operands[3]); + + operands[1] = gen_lowpart (DImode, operands[2]); + operands[2] = gen_rtx_MEM (SImode, gen_rtx_PLUS (DImode, stack_pointer_rtx, + GEN_INT (4))); +}) + +(define_insn "*pushdi2" + [(set (match_operand:DI 0 "push_operand" "=<") + (match_operand:DI 1 "general_no_elim_operand" "riF*m"))] + "!TARGET_64BIT" + "#") + +(define_split + [(set (match_operand:DI 0 "push_operand" "") + (match_operand:DI 1 "general_operand" ""))] + "!TARGET_64BIT && reload_completed + && !(MMX_REG_P (operands[1]) || SSE_REG_P (operands[1]))" + [(const_int 0)] + "ix86_split_long_move (operands); DONE;") + (define_insn "*pushsi2" [(set (match_operand:SI 0 "push_operand" "=<") (match_operand:SI 1 "general_no_elim_operand" "ri*m"))] @@ -1610,761 +1729,335 @@ [(set_attr "type" "push") (set_attr "mode" "SI")]) +;; emit_push_insn when it calls move_by_pieces requires an insn to +;; "push a byte/word". But actually we use pushl, which has the effect +;; of rounding the amount pushed up to a word. + ;; For 64BIT abi we always round up to 8 bytes. 
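;; (Illustration, not part of the patch: pushing a QImode, HImode or
;; SImode operand through the pattern below still executes a full
;; 64-bit push, so %rsp always moves by 8; the %q1 modifier prints the
;; DImode name of the operand register, e.g. %al is pushed as
;; "pushq %rax".)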
-(define_insn "*pushsi2_rex64" - [(set (match_operand:SI 0 "push_operand" "=X") - (match_operand:SI 1 "nonmemory_no_elim_operand" "ri"))] +(define_insn "*push2_rex64" + [(set (match_operand:SWI124 0 "push_operand" "=X") + (match_operand:SWI124 1 "nonmemory_no_elim_operand" "r"))] "TARGET_64BIT" "push{q}\t%q1" [(set_attr "type" "push") - (set_attr "mode" "SI")]) + (set_attr "mode" "DI")]) -(define_insn "*pushsi2_prologue" - [(set (match_operand:SI 0 "push_operand" "=<") - (match_operand:SI 1 "general_no_elim_operand" "ri*m")) - (clobber (mem:BLK (scratch)))] +(define_insn "*push2" + [(set (match_operand:SWI12 0 "push_operand" "=X") + (match_operand:SWI12 1 "nonmemory_no_elim_operand" "rn"))] "!TARGET_64BIT" - "push{l}\t%1" + "push{l}\t%k1" [(set_attr "type" "push") (set_attr "mode" "SI")]) -(define_insn "*popsi1_epilogue" +(define_insn "*push2_prologue" + [(set (match_operand:P 0 "push_operand" "=<") + (match_operand:P 1 "general_no_elim_operand" "r*m")) + (clobber (mem:BLK (scratch)))] + "" + "push{}\t%1" + [(set_attr "type" "push") + (set_attr "mode" "")]) + +(define_insn "popdi1" + [(set (match_operand:DI 0 "nonimmediate_operand" "=r*m") + (mem:DI (reg:DI SP_REG))) + (set (reg:DI SP_REG) + (plus:DI (reg:DI SP_REG) (const_int 8)))] + "TARGET_64BIT" + "pop{q}\t%0" + [(set_attr "type" "pop") + (set_attr "mode" "DI")]) + +(define_insn "popsi1" [(set (match_operand:SI 0 "nonimmediate_operand" "=r*m") (mem:SI (reg:SI SP_REG))) (set (reg:SI SP_REG) - (plus:SI (reg:SI SP_REG) (const_int 4))) - (clobber (mem:BLK (scratch)))] + (plus:SI (reg:SI SP_REG) (const_int 4)))] "!TARGET_64BIT" "pop{l}\t%0" [(set_attr "type" "pop") (set_attr "mode" "SI")]) -(define_insn "popsi1" +(define_insn "*popdi1_epilogue" + [(set (match_operand:DI 0 "nonimmediate_operand" "=r*m") + (mem:DI (reg:DI SP_REG))) + (set (reg:DI SP_REG) + (plus:DI (reg:DI SP_REG) (const_int 8))) + (clobber (mem:BLK (scratch)))] + "TARGET_64BIT" + "pop{q}\t%0" + [(set_attr "type" "pop") + (set_attr "mode" "DI")]) + +(define_insn "*popsi1_epilogue" [(set (match_operand:SI 0 "nonimmediate_operand" "=r*m") (mem:SI (reg:SI SP_REG))) (set (reg:SI SP_REG) - (plus:SI (reg:SI SP_REG) (const_int 4)))] + (plus:SI (reg:SI SP_REG) (const_int 4))) + (clobber (mem:BLK (scratch)))] "!TARGET_64BIT" "pop{l}\t%0" [(set_attr "type" "pop") (set_attr "mode" "SI")]) -(define_insn "*movsi_xor" - [(set (match_operand:SI 0 "register_operand" "=r") - (match_operand:SI 1 "const0_operand" "")) +(define_insn "*mov_xor" + [(set (match_operand:SWI48 0 "register_operand" "=r") + (match_operand:SWI48 1 "const0_operand" "")) (clobber (reg:CC FLAGS_REG))] "reload_completed" - "xor{l}\t%0, %0" + "xor{l}\t%k0, %k0" [(set_attr "type" "alu1") (set_attr "mode" "SI") (set_attr "length_immediate" "0")]) -(define_insn "*movsi_or" - [(set (match_operand:SI 0 "register_operand" "=r") - (match_operand:SI 1 "immediate_operand" "i")) +(define_insn "*mov_or" + [(set (match_operand:SWI48 0 "register_operand" "=r") + (match_operand:SWI48 1 "const_int_operand" "")) (clobber (reg:CC FLAGS_REG))] "reload_completed && operands[1] == constm1_rtx" -{ - operands[1] = constm1_rtx; - return "or{l}\t{%1, %0|%0, %1}"; -} + "or{}\t{%1, %0|%0, %1}" [(set_attr "type" "alu1") - (set_attr "mode" "SI") + (set_attr "mode" "") (set_attr "length_immediate" "1")]) -(define_insn "*movsi_1" - [(set (match_operand:SI 0 "nonimmediate_operand" - "=r,m ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x") - (match_operand:SI 1 "general_operand" - "g ,ri,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r ,m "))] - "!(MEM_P (operands[0]) && MEM_P 
(operands[1]))" +(define_insn "*movoi_internal_avx" + [(set (match_operand:OI 0 "nonimmediate_operand" "=x,x,m") + (match_operand:OI 1 "vector_move_operand" "C,xm,x"))] + "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" { - switch (get_attr_type (insn)) + switch (which_alternative) { - case TYPE_SSELOG1: - if (get_attr_mode (insn) == MODE_TI) - return "%vpxor\t%0, %d0"; - return "%vxorps\t%0, %d0"; + case 0: + return "vxorps\t%0, %0, %0"; + case 1: + case 2: + if (misaligned_operand (operands[0], OImode) + || misaligned_operand (operands[1], OImode)) + return "vmovdqu\t{%1, %0|%0, %1}"; + else + return "vmovdqa\t{%1, %0|%0, %1}"; + default: + gcc_unreachable (); + } +} + [(set_attr "type" "sselog1,ssemov,ssemov") + (set_attr "prefix" "vex") + (set_attr "mode" "OI")]) - case TYPE_SSEMOV: - switch (get_attr_mode (insn)) +(define_insn "*movti_internal_rex64" + [(set (match_operand:TI 0 "nonimmediate_operand" "=!r,o,x,x,xm") + (match_operand:TI 1 "general_operand" "riFo,riF,C,xm,x"))] + "TARGET_64BIT && !(MEM_P (operands[0]) && MEM_P (operands[1]))" +{ + switch (which_alternative) + { + case 0: + case 1: + return "#"; + case 2: + if (get_attr_mode (insn) == MODE_V4SF) + return "%vxorps\t%0, %d0"; + else + return "%vpxor\t%0, %d0"; + case 3: + case 4: + /* TDmode values are passed as TImode on the stack. Moving them + to stack may result in unaligned memory access. */ + if (misaligned_operand (operands[0], TImode) + || misaligned_operand (operands[1], TImode)) { - case MODE_TI: - return "%vmovdqa\t{%1, %0|%0, %1}"; - case MODE_V4SF: - return "%vmovaps\t{%1, %0|%0, %1}"; - case MODE_SI: - return "%vmovd\t{%1, %0|%0, %1}"; - case MODE_SF: - return "%vmovss\t{%1, %0|%0, %1}"; - default: - gcc_unreachable (); + if (get_attr_mode (insn) == MODE_V4SF) + return "%vmovups\t{%1, %0|%0, %1}"; + else + return "%vmovdqu\t{%1, %0|%0, %1}"; + } + else + { + if (get_attr_mode (insn) == MODE_V4SF) + return "%vmovaps\t{%1, %0|%0, %1}"; + else + return "%vmovdqa\t{%1, %0|%0, %1}"; } - - case TYPE_MMX: - return "pxor\t%0, %0"; - - case TYPE_MMXMOV: - if (get_attr_mode (insn) == MODE_DI) - return "movq\t{%1, %0|%0, %1}"; - return "movd\t{%1, %0|%0, %1}"; - - case TYPE_LEA: - return "lea{l}\t{%a1, %0|%0, %a1}"; - default: - gcc_assert (!flag_pic || LEGITIMATE_PIC_OPERAND_P (operands[1])); - return "mov{l}\t{%1, %0|%0, %1}"; + gcc_unreachable (); } } - [(set (attr "type") - (cond [(eq_attr "alternative" "2") - (const_string "mmx") - (eq_attr "alternative" "3,4,5") - (const_string "mmxmov") - (eq_attr "alternative" "6") - (const_string "sselog1") - (eq_attr "alternative" "7,8,9,10,11") - (const_string "ssemov") - (match_operand:DI 1 "pic_32bit_operand" "") - (const_string "lea") - ] - (const_string "imov"))) - (set (attr "prefix") - (if_then_else (eq_attr "alternative" "0,1,2,3,4,5") - (const_string "orig") - (const_string "maybe_vex"))) - (set (attr "prefix_data16") - (if_then_else (and (eq_attr "type" "ssemov") (eq_attr "mode" "SI")) - (const_string "1") - (const_string "*"))) + [(set_attr "type" "*,*,sselog1,ssemov,ssemov") + (set_attr "prefix" "*,*,maybe_vex,maybe_vex,maybe_vex") (set (attr "mode") - (cond [(eq_attr "alternative" "2,3") - (const_string "DI") - (eq_attr "alternative" "6,7") - (if_then_else - (eq (symbol_ref "TARGET_SSE2") (const_int 0)) - (const_string "V4SF") - (const_string "TI")) - (and (eq_attr "alternative" "8,9,10,11") - (eq (symbol_ref "TARGET_SSE2") (const_int 0))) - (const_string "SF") - ] - (const_string "SI")))]) - -;; Stores and loads of ax to arbitrary constant address. 
-;; We fake an second form of instruction to force reload to load address -;; into register when rax is not available -(define_insn "*movabssi_1_rex64" - [(set (mem:SI (match_operand:DI 0 "x86_64_movabs_operand" "i,r")) - (match_operand:SI 1 "nonmemory_operand" "a,er"))] - "TARGET_64BIT && ix86_check_movabs (insn, 0)" - "@ - movabs{l}\t{%1, %P0|%P0, %1} - mov{l}\t{%1, %a0|%a0, %1}" - [(set_attr "type" "imov") - (set_attr "modrm" "0,*") - (set_attr "length_address" "8,0") - (set_attr "length_immediate" "0,*") - (set_attr "memory" "store") - (set_attr "mode" "SI")]) - -(define_insn "*movabssi_2_rex64" - [(set (match_operand:SI 0 "register_operand" "=a,r") - (mem:SI (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))] - "TARGET_64BIT && ix86_check_movabs (insn, 1)" - "@ - movabs{l}\t{%P1, %0|%0, %P1} - mov{l}\t{%a1, %0|%0, %a1}" - [(set_attr "type" "imov") - (set_attr "modrm" "0,*") - (set_attr "length_address" "8,0") - (set_attr "length_immediate" "0") - (set_attr "memory" "load") - (set_attr "mode" "SI")]) - -(define_insn "*swapsi" - [(set (match_operand:SI 0 "register_operand" "+r") - (match_operand:SI 1 "register_operand" "+r")) - (set (match_dup 1) - (match_dup 0))] - "" - "xchg{l}\t%1, %0" - [(set_attr "type" "imov") - (set_attr "mode" "SI") - (set_attr "pent_pair" "np") - (set_attr "athlon_decode" "vector") - (set_attr "amdfam10_decode" "double")]) - -(define_expand "movhi" - [(set (match_operand:HI 0 "nonimmediate_operand" "") - (match_operand:HI 1 "general_operand" ""))] - "" - "ix86_expand_move (HImode, operands); DONE;") - -(define_insn "*pushhi2" - [(set (match_operand:HI 0 "push_operand" "=X") - (match_operand:HI 1 "nonmemory_no_elim_operand" "rn"))] - "!TARGET_64BIT" - "push{l}\t%k1" - [(set_attr "type" "push") - (set_attr "mode" "SI")]) - -;; For 64BIT abi we always round up to 8 bytes. -(define_insn "*pushhi2_rex64" - [(set (match_operand:HI 0 "push_operand" "=X") - (match_operand:HI 1 "nonmemory_no_elim_operand" "rn"))] - "TARGET_64BIT" - "push{q}\t%q1" - [(set_attr "type" "push") - (set_attr "mode" "DI")]) - -(define_insn "*movhi_1" - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m") - (match_operand:HI 1 "general_operand" "r,rn,rm,rn"))] - "!(MEM_P (operands[0]) && MEM_P (operands[1]))" -{ - switch (get_attr_type (insn)) - { - case TYPE_IMOVX: - /* movzwl is faster than movw on p2 due to partial word stalls, - though not as fast as an aligned movl. 
*/ - return "movz{wl|x}\t{%1, %k0|%k0, %1}"; - default: - if (get_attr_mode (insn) == MODE_SI) - return "mov{l}\t{%k1, %k0|%k0, %k1}"; - else - return "mov{w}\t{%1, %0|%0, %1}"; - } -} - [(set (attr "type") - (cond [(ne (symbol_ref "optimize_function_for_size_p (cfun)") (const_int 0)) - (const_string "imov") - (and (eq_attr "alternative" "0") - (ior (eq (symbol_ref "TARGET_PARTIAL_REG_STALL") - (const_int 0)) - (eq (symbol_ref "TARGET_HIMODE_MATH") - (const_int 0)))) - (const_string "imov") - (and (eq_attr "alternative" "1,2") - (match_operand:HI 1 "aligned_operand" "")) - (const_string "imov") - (and (ne (symbol_ref "TARGET_MOVX") - (const_int 0)) - (eq_attr "alternative" "0,2")) - (const_string "imovx") - ] - (const_string "imov"))) - (set (attr "mode") - (cond [(eq_attr "type" "imovx") - (const_string "SI") - (and (eq_attr "alternative" "1,2") - (match_operand:HI 1 "aligned_operand" "")) - (const_string "SI") - (and (eq_attr "alternative" "0") - (ior (eq (symbol_ref "TARGET_PARTIAL_REG_STALL") - (const_int 0)) - (eq (symbol_ref "TARGET_HIMODE_MATH") - (const_int 0)))) - (const_string "SI") - ] - (const_string "HI")))]) - -;; Stores and loads of ax to arbitrary constant address. -;; We fake an second form of instruction to force reload to load address -;; into register when rax is not available -(define_insn "*movabshi_1_rex64" - [(set (mem:HI (match_operand:DI 0 "x86_64_movabs_operand" "i,r")) - (match_operand:HI 1 "nonmemory_operand" "a,er"))] - "TARGET_64BIT && ix86_check_movabs (insn, 0)" - "@ - movabs{w}\t{%1, %P0|%P0, %1} - mov{w}\t{%1, %a0|%a0, %1}" - [(set_attr "type" "imov") - (set_attr "modrm" "0,*") - (set_attr "length_address" "8,0") - (set_attr "length_immediate" "0,*") - (set_attr "memory" "store") - (set_attr "mode" "HI")]) - -(define_insn "*movabshi_2_rex64" - [(set (match_operand:HI 0 "register_operand" "=a,r") - (mem:HI (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))] - "TARGET_64BIT && ix86_check_movabs (insn, 1)" - "@ - movabs{w}\t{%P1, %0|%0, %P1} - mov{w}\t{%a1, %0|%0, %a1}" - [(set_attr "type" "imov") - (set_attr "modrm" "0,*") - (set_attr "length_address" "8,0") - (set_attr "length_immediate" "0") - (set_attr "memory" "load") - (set_attr "mode" "HI")]) - -(define_insn "*swaphi_1" - [(set (match_operand:HI 0 "register_operand" "+r") - (match_operand:HI 1 "register_operand" "+r")) - (set (match_dup 1) - (match_dup 0))] - "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)" - "xchg{l}\t%k1, %k0" - [(set_attr "type" "imov") - (set_attr "mode" "SI") - (set_attr "pent_pair" "np") - (set_attr "athlon_decode" "vector") - (set_attr "amdfam10_decode" "double")]) - -;; Not added amdfam10_decode since TARGET_PARTIAL_REG_STALL is disabled for AMDFAM10 -(define_insn "*swaphi_2" - [(set (match_operand:HI 0 "register_operand" "+r") - (match_operand:HI 1 "register_operand" "+r")) - (set (match_dup 1) - (match_dup 0))] - "TARGET_PARTIAL_REG_STALL" - "xchg{w}\t%1, %0" - [(set_attr "type" "imov") - (set_attr "mode" "HI") - (set_attr "pent_pair" "np") - (set_attr "athlon_decode" "vector")]) + (cond [(eq_attr "alternative" "2,3") + (if_then_else + (ne (symbol_ref "optimize_function_for_size_p (cfun)") + (const_int 0)) + (const_string "V4SF") + (const_string "TI")) + (eq_attr "alternative" "4") + (if_then_else + (ior (ne (symbol_ref "TARGET_SSE_TYPELESS_STORES") + (const_int 0)) + (ne (symbol_ref "optimize_function_for_size_p (cfun)") + (const_int 0))) + (const_string "V4SF") + (const_string "TI"))] + (const_string "DI")))]) -(define_expand "movstricthi" - [(set 
(strict_low_part (match_operand:HI 0 "nonimmediate_operand" "")) - (match_operand:HI 1 "general_operand" ""))] - "" -{ - if (TARGET_PARTIAL_REG_STALL && optimize_function_for_speed_p (cfun)) - FAIL; - /* Don't generate memory->memory moves, go through a register */ - if (MEM_P (operands[0]) && MEM_P (operands[1])) - operands[1] = force_reg (HImode, operands[1]); -}) +(define_split + [(set (match_operand:TI 0 "nonimmediate_operand" "") + (match_operand:TI 1 "general_operand" ""))] + "reload_completed + && !SSE_REG_P (operands[0]) && !SSE_REG_P (operands[1])" + [(const_int 0)] + "ix86_split_long_move (operands); DONE;") -(define_insn "*movstricthi_1" - [(set (strict_low_part (match_operand:HI 0 "nonimmediate_operand" "+rm,r")) - (match_operand:HI 1 "general_operand" "rn,m"))] - "(! TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) +(define_insn "*movti_internal_sse" + [(set (match_operand:TI 0 "nonimmediate_operand" "=x,x,m") + (match_operand:TI 1 "vector_move_operand" "C,xm,x"))] + "TARGET_SSE && !TARGET_64BIT && !(MEM_P (operands[0]) && MEM_P (operands[1]))" - "mov{w}\t{%1, %0|%0, %1}" - [(set_attr "type" "imov") - (set_attr "mode" "HI")]) - -(define_insn "*movstricthi_xor" - [(set (strict_low_part (match_operand:HI 0 "register_operand" "+r")) - (match_operand:HI 1 "const0_operand" "")) - (clobber (reg:CC FLAGS_REG))] - "reload_completed" - "xor{w}\t%0, %0" - [(set_attr "type" "alu1") - (set_attr "mode" "HI") - (set_attr "length_immediate" "0")]) - -(define_expand "movqi" - [(set (match_operand:QI 0 "nonimmediate_operand" "") - (match_operand:QI 1 "general_operand" ""))] - "" - "ix86_expand_move (QImode, operands); DONE;") - -;; emit_push_insn when it calls move_by_pieces requires an insn to -;; "push a byte". But actually we use pushl, which has the effect -;; of rounding the amount pushed up to a word. - -(define_insn "*pushqi2" - [(set (match_operand:QI 0 "push_operand" "=X") - (match_operand:QI 1 "nonmemory_no_elim_operand" "rn"))] - "!TARGET_64BIT" - "push{l}\t%k1" - [(set_attr "type" "push") - (set_attr "mode" "SI")]) - -;; For 64BIT abi we always round up to 8 bytes. -(define_insn "*pushqi2_rex64" - [(set (match_operand:QI 0 "push_operand" "=X") - (match_operand:QI 1 "nonmemory_no_elim_operand" "qn"))] - "TARGET_64BIT" - "push{q}\t%q1" - [(set_attr "type" "push") - (set_attr "mode" "DI")]) - -;; Situation is quite tricky about when to choose full sized (SImode) move -;; over QImode moves. For Q_REG -> Q_REG move we use full size only for -;; partial register dependency machines (such as AMD Athlon), where QImode -;; moves issue extra dependency and for partial register stalls machines -;; that don't use QImode patterns (and QImode move cause stall on the next -;; instruction). -;; -;; For loads of Q_REG to NONQ_REG we use full sized moves except for partial -;; register stall machines with, where we use QImode instructions, since -;; partial register stall can be caused there. Then we use movzx. 
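;; (Illustration, not part of the patch: on a partial-register-
;; dependency target such as AMD Athlon, a %al -> %dl register copy is
;; emitted as "movl %eax, %edx", while a byte load into a non-Q
;; destination prefers "movzbl" over "movb".)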
-(define_insn "*movqi_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=q,q ,q ,r,r ,?r,m") - (match_operand:QI 1 "general_operand" " q,qn,qm,q,rn,qm,qn"))] - "!(MEM_P (operands[0]) && MEM_P (operands[1]))" { - switch (get_attr_type (insn)) + switch (which_alternative) { - case TYPE_IMOVX: - gcc_assert (ANY_QI_REG_P (operands[1]) || MEM_P (operands[1])); - return "movz{bl|x}\t{%1, %k0|%k0, %1}"; - default: - if (get_attr_mode (insn) == MODE_SI) - return "mov{l}\t{%k1, %k0|%k0, %k1}"; + case 0: + if (get_attr_mode (insn) == MODE_V4SF) + return "%vxorps\t%0, %d0"; else - return "mov{b}\t{%1, %0|%0, %1}"; - } -} - [(set (attr "type") - (cond [(and (eq_attr "alternative" "5") - (not (match_operand:QI 1 "aligned_operand" ""))) - (const_string "imovx") - (ne (symbol_ref "optimize_function_for_size_p (cfun)") (const_int 0)) - (const_string "imov") - (and (eq_attr "alternative" "3") - (ior (eq (symbol_ref "TARGET_PARTIAL_REG_STALL") - (const_int 0)) - (eq (symbol_ref "TARGET_QIMODE_MATH") - (const_int 0)))) - (const_string "imov") - (eq_attr "alternative" "3,5") - (const_string "imovx") - (and (ne (symbol_ref "TARGET_MOVX") - (const_int 0)) - (eq_attr "alternative" "2")) - (const_string "imovx") - ] - (const_string "imov"))) - (set (attr "mode") - (cond [(eq_attr "alternative" "3,4,5") - (const_string "SI") - (eq_attr "alternative" "6") - (const_string "QI") - (eq_attr "type" "imovx") - (const_string "SI") - (and (eq_attr "type" "imov") - (and (eq_attr "alternative" "0,1") - (and (ne (symbol_ref "TARGET_PARTIAL_REG_DEPENDENCY") - (const_int 0)) - (and (eq (symbol_ref "optimize_function_for_size_p (cfun)") - (const_int 0)) - (eq (symbol_ref "TARGET_PARTIAL_REG_STALL") - (const_int 0)))))) - (const_string "SI") - ;; Avoid partial register stalls when not using QImode arithmetic - (and (eq_attr "type" "imov") - (and (eq_attr "alternative" "0,1") - (and (ne (symbol_ref "TARGET_PARTIAL_REG_STALL") - (const_int 0)) - (eq (symbol_ref "TARGET_QIMODE_MATH") - (const_int 0))))) - (const_string "SI") - ] - (const_string "QI")))]) - -(define_insn "*swapqi_1" - [(set (match_operand:QI 0 "register_operand" "+r") - (match_operand:QI 1 "register_operand" "+r")) - (set (match_dup 1) - (match_dup 0))] - "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)" - "xchg{l}\t%k1, %k0" - [(set_attr "type" "imov") - (set_attr "mode" "SI") - (set_attr "pent_pair" "np") - (set_attr "athlon_decode" "vector") - (set_attr "amdfam10_decode" "vector")]) - -;; Not added amdfam10_decode since TARGET_PARTIAL_REG_STALL is disabled for AMDFAM10 -(define_insn "*swapqi_2" - [(set (match_operand:QI 0 "register_operand" "+q") - (match_operand:QI 1 "register_operand" "+q")) - (set (match_dup 1) - (match_dup 0))] - "TARGET_PARTIAL_REG_STALL" - "xchg{b}\t%1, %0" - [(set_attr "type" "imov") - (set_attr "mode" "QI") - (set_attr "pent_pair" "np") - (set_attr "athlon_decode" "vector")]) - -(define_expand "movstrictqi" - [(set (strict_low_part (match_operand:QI 0 "nonimmediate_operand" "")) - (match_operand:QI 1 "general_operand" ""))] - "" -{ - if (TARGET_PARTIAL_REG_STALL && optimize_function_for_speed_p (cfun)) - FAIL; - /* Don't generate memory->memory moves, go through a register. */ - if (MEM_P (operands[0]) && MEM_P (operands[1])) - operands[1] = force_reg (QImode, operands[1]); -}) - -(define_insn "*movstrictqi_1" - [(set (strict_low_part (match_operand:QI 0 "nonimmediate_operand" "+qm,q")) - (match_operand:QI 1 "general_operand" "*qn,m"))] - "(! 
TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) - && !(MEM_P (operands[0]) && MEM_P (operands[1]))" - "mov{b}\t{%1, %0|%0, %1}" - [(set_attr "type" "imov") - (set_attr "mode" "QI")]) - -(define_insn "*movstrictqi_xor" - [(set (strict_low_part (match_operand:QI 0 "q_regs_operand" "+q")) - (match_operand:QI 1 "const0_operand" "")) - (clobber (reg:CC FLAGS_REG))] - "reload_completed" - "xor{b}\t%0, %0" - [(set_attr "type" "alu1") - (set_attr "mode" "QI") - (set_attr "length_immediate" "0")]) - -(define_insn "*movsi_extv_1" - [(set (match_operand:SI 0 "register_operand" "=R") - (sign_extract:SI (match_operand 1 "ext_register_operand" "Q") - (const_int 8) - (const_int 8)))] - "" - "movs{bl|x}\t{%h1, %0|%0, %h1}" - [(set_attr "type" "imovx") - (set_attr "mode" "SI")]) - -(define_insn "*movhi_extv_1" - [(set (match_operand:HI 0 "register_operand" "=R") - (sign_extract:HI (match_operand 1 "ext_register_operand" "Q") - (const_int 8) - (const_int 8)))] - "" - "movs{bl|x}\t{%h1, %k0|%k0, %h1}" - [(set_attr "type" "imovx") - (set_attr "mode" "SI")]) - -(define_insn "*movqi_extv_1" - [(set (match_operand:QI 0 "nonimmediate_operand" "=Qm,?r") - (sign_extract:QI (match_operand 1 "ext_register_operand" "Q,Q") - (const_int 8) - (const_int 8)))] - "!TARGET_64BIT" -{ - switch (get_attr_type (insn)) - { - case TYPE_IMOVX: - return "movs{bl|x}\t{%h1, %k0|%k0, %h1}"; - default: - return "mov{b}\t{%h1, %0|%0, %h1}"; - } -} - [(set (attr "type") - (if_then_else (and (match_operand:QI 0 "register_operand" "") - (ior (not (match_operand:QI 0 "q_regs_operand" "")) - (ne (symbol_ref "TARGET_MOVX") - (const_int 0)))) - (const_string "imovx") - (const_string "imov"))) - (set (attr "mode") - (if_then_else (eq_attr "type" "imovx") - (const_string "SI") - (const_string "QI")))]) - -(define_insn "*movqi_extv_1_rex64" - [(set (match_operand:QI 0 "register_operand" "=Q,?R") - (sign_extract:QI (match_operand 1 "ext_register_operand" "Q,Q") - (const_int 8) - (const_int 8)))] - "TARGET_64BIT" -{ - switch (get_attr_type (insn)) - { - case TYPE_IMOVX: - return "movs{bl|x}\t{%h1, %k0|%k0, %h1}"; + return "%vpxor\t%0, %d0"; + case 1: + case 2: + /* TDmode values are passed as TImode on the stack. Moving them + to stack may result in unaligned memory access. */ + if (misaligned_operand (operands[0], TImode) + || misaligned_operand (operands[1], TImode)) + { + if (get_attr_mode (insn) == MODE_V4SF) + return "%vmovups\t{%1, %0|%0, %1}"; + else + return "%vmovdqu\t{%1, %0|%0, %1}"; + } + else + { + if (get_attr_mode (insn) == MODE_V4SF) + return "%vmovaps\t{%1, %0|%0, %1}"; + else + return "%vmovdqa\t{%1, %0|%0, %1}"; + } default: - return "mov{b}\t{%h1, %0|%0, %h1}"; + gcc_unreachable (); } } - [(set (attr "type") - (if_then_else (and (match_operand:QI 0 "register_operand" "") - (ior (not (match_operand:QI 0 "q_regs_operand" "")) - (ne (symbol_ref "TARGET_MOVX") - (const_int 0)))) - (const_string "imovx") - (const_string "imov"))) + [(set_attr "type" "sselog1,ssemov,ssemov") + (set_attr "prefix" "maybe_vex") (set (attr "mode") - (if_then_else (eq_attr "type" "imovx") - (const_string "SI") - (const_string "QI")))]) - -;; Stores and loads of ax to arbitrary constant address. 
-;; We fake an second form of instruction to force reload to load address -;; into register when rax is not available -(define_insn "*movabsqi_1_rex64" - [(set (mem:QI (match_operand:DI 0 "x86_64_movabs_operand" "i,r")) - (match_operand:QI 1 "nonmemory_operand" "a,er"))] - "TARGET_64BIT && ix86_check_movabs (insn, 0)" - "@ - movabs{b}\t{%1, %P0|%P0, %1} - mov{b}\t{%1, %a0|%a0, %1}" - [(set_attr "type" "imov") - (set_attr "modrm" "0,*") - (set_attr "length_address" "8,0") - (set_attr "length_immediate" "0,*") - (set_attr "memory" "store") - (set_attr "mode" "QI")]) - -(define_insn "*movabsqi_2_rex64" - [(set (match_operand:QI 0 "register_operand" "=a,r") - (mem:QI (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))] - "TARGET_64BIT && ix86_check_movabs (insn, 1)" - "@ - movabs{b}\t{%P1, %0|%0, %P1} - mov{b}\t{%a1, %0|%0, %a1}" - [(set_attr "type" "imov") - (set_attr "modrm" "0,*") - (set_attr "length_address" "8,0") - (set_attr "length_immediate" "0") - (set_attr "memory" "load") - (set_attr "mode" "QI")]) - -(define_insn "*movdi_extzv_1" - [(set (match_operand:DI 0 "register_operand" "=R") - (zero_extract:DI (match_operand 1 "ext_register_operand" "Q") - (const_int 8) - (const_int 8)))] - "TARGET_64BIT" - "movz{bl|x}\t{%h1, %k0|%k0, %h1}" - [(set_attr "type" "imovx") - (set_attr "mode" "SI")]) - -(define_insn "*movsi_extzv_1" - [(set (match_operand:SI 0 "register_operand" "=R") - (zero_extract:SI (match_operand 1 "ext_register_operand" "Q") - (const_int 8) - (const_int 8)))] - "" - "movz{bl|x}\t{%h1, %0|%0, %h1}" - [(set_attr "type" "imovx") - (set_attr "mode" "SI")]) + (cond [(ior (eq (symbol_ref "TARGET_SSE2") (const_int 0)) + (ne (symbol_ref "optimize_function_for_size_p (cfun)") + (const_int 0))) + (const_string "V4SF") + (and (eq_attr "alternative" "2") + (ne (symbol_ref "TARGET_SSE_TYPELESS_STORES") + (const_int 0))) + (const_string "V4SF")] + (const_string "TI")))]) -(define_insn "*movqi_extzv_2" - [(set (match_operand:QI 0 "nonimmediate_operand" "=Qm,?R") - (subreg:QI (zero_extract:SI (match_operand 1 "ext_register_operand" "Q,Q") - (const_int 8) - (const_int 8)) 0))] - "!TARGET_64BIT" +(define_insn "*movdi_internal_rex64" + [(set (match_operand:DI 0 "nonimmediate_operand" + "=r,r ,r,m ,!m,*y,*y,?r ,m ,?*Ym,?*y,*x,*x,?r ,m,?*Yi,*x,?*x,?*Ym") + (match_operand:DI 1 "general_operand" + "Z ,rem,i,re,n ,C ,*y,*Ym,*y,r ,m ,C ,*x,*Yi,*x,r ,m ,*Ym,*x"))] + "TARGET_64BIT && !(MEM_P (operands[0]) && MEM_P (operands[1]))" { switch (get_attr_type (insn)) { - case TYPE_IMOVX: - return "movz{bl|x}\t{%h1, %k0|%k0, %h1}"; - default: - return "mov{b}\t{%h1, %0|%0, %h1}"; - } -} - [(set (attr "type") - (if_then_else (and (match_operand:QI 0 "register_operand" "") - (ior (not (match_operand:QI 0 "q_regs_operand" "")) - (ne (symbol_ref "TARGET_MOVX") - (const_int 0)))) - (const_string "imovx") - (const_string "imov"))) - (set (attr "mode") - (if_then_else (eq_attr "type" "imovx") - (const_string "SI") - (const_string "QI")))]) + case TYPE_SSECVT: + if (SSE_REG_P (operands[0])) + return "movq2dq\t{%1, %0|%0, %1}"; + else + return "movdq2q\t{%1, %0|%0, %1}"; -(define_insn "*movqi_extzv_2_rex64" - [(set (match_operand:QI 0 "register_operand" "=Q,?R") - (subreg:QI (zero_extract:SI (match_operand 1 "ext_register_operand" "Q,Q") - (const_int 8) - (const_int 8)) 0))] - "TARGET_64BIT" -{ - switch (get_attr_type (insn)) - { - case TYPE_IMOVX: - return "movz{bl|x}\t{%h1, %k0|%k0, %h1}"; - default: - return "mov{b}\t{%h1, %0|%0, %h1}"; - } -} - [(set (attr "type") - (if_then_else (ior (not (match_operand:QI 0 
"q_regs_operand" "")) - (ne (symbol_ref "TARGET_MOVX") - (const_int 0))) - (const_string "imovx") - (const_string "imov"))) - (set (attr "mode") - (if_then_else (eq_attr "type" "imovx") - (const_string "SI") - (const_string "QI")))]) + case TYPE_SSEMOV: + if (TARGET_AVX) + { + if (get_attr_mode (insn) == MODE_TI) + return "vmovdqa\t{%1, %0|%0, %1}"; + else + return "vmovq\t{%1, %0|%0, %1}"; + } -(define_insn "movsi_insv_1" - [(set (zero_extract:SI (match_operand 0 "ext_register_operand" "+Q") - (const_int 8) - (const_int 8)) - (match_operand:SI 1 "general_operand" "Qmn"))] - "!TARGET_64BIT" - "mov{b}\t{%b1, %h0|%h0, %b1}" - [(set_attr "type" "imov") - (set_attr "mode" "QI")]) + if (get_attr_mode (insn) == MODE_TI) + return "movdqa\t{%1, %0|%0, %1}"; + /* FALLTHRU */ -(define_insn "*movsi_insv_1_rex64" - [(set (zero_extract:SI (match_operand 0 "ext_register_operand" "+Q") - (const_int 8) - (const_int 8)) - (match_operand:SI 1 "nonmemory_operand" "Qn"))] - "TARGET_64BIT" - "mov{b}\t{%b1, %h0|%h0, %b1}" - [(set_attr "type" "imov") - (set_attr "mode" "QI")]) + case TYPE_MMXMOV: + /* Moves from and into integer register is done using movd + opcode with REX prefix. */ + if (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])) + return "movd\t{%1, %0|%0, %1}"; + return "movq\t{%1, %0|%0, %1}"; -(define_insn "movdi_insv_1_rex64" - [(set (zero_extract:DI (match_operand 0 "ext_register_operand" "+Q") - (const_int 8) - (const_int 8)) - (match_operand:DI 1 "nonmemory_operand" "Qn"))] - "TARGET_64BIT" - "mov{b}\t{%b1, %h0|%h0, %b1}" - [(set_attr "type" "imov") - (set_attr "mode" "QI")]) + case TYPE_SSELOG1: + return "%vpxor\t%0, %d0"; -(define_insn "*movqi_insv_2" - [(set (zero_extract:SI (match_operand 0 "ext_register_operand" "+Q") - (const_int 8) - (const_int 8)) - (lshiftrt:SI (match_operand:SI 1 "register_operand" "Q") - (const_int 8)))] - "" - "mov{b}\t{%h1, %h0|%h0, %h1}" - [(set_attr "type" "imov") - (set_attr "mode" "QI")]) + case TYPE_MMX: + return "pxor\t%0, %0"; -(define_expand "movdi" - [(set (match_operand:DI 0 "nonimmediate_operand" "") - (match_operand:DI 1 "general_operand" ""))] - "" - "ix86_expand_move (DImode, operands); DONE;") + case TYPE_MULTI: + return "#"; -(define_insn "*pushdi" - [(set (match_operand:DI 0 "push_operand" "=<") - (match_operand:DI 1 "general_no_elim_operand" "riF*m"))] - "!TARGET_64BIT" - "#") + case TYPE_LEA: + return "lea{q}\t{%a1, %0|%0, %a1}"; -(define_insn "*pushdi2_rex64" - [(set (match_operand:DI 0 "push_operand" "=<,!<") - (match_operand:DI 1 "general_no_elim_operand" "re*m,n"))] - "TARGET_64BIT" - "@ - push{q}\t%1 - #" - [(set_attr "type" "push,multi") - (set_attr "mode" "DI")]) + default: + gcc_assert (!flag_pic || LEGITIMATE_PIC_OPERAND_P (operands[1])); + if (get_attr_mode (insn) == MODE_SI) + return "mov{l}\t{%k1, %k0|%k0, %k1}"; + else if (which_alternative == 2) + return "movabs{q}\t{%1, %0|%0, %1}"; + else + return "mov{q}\t{%1, %0|%0, %1}"; + } +} + [(set (attr "type") + (cond [(eq_attr "alternative" "5") + (const_string "mmx") + (eq_attr "alternative" "6,7,8,9,10") + (const_string "mmxmov") + (eq_attr "alternative" "11") + (const_string "sselog1") + (eq_attr "alternative" "12,13,14,15,16") + (const_string "ssemov") + (eq_attr "alternative" "17,18") + (const_string "ssecvt") + (eq_attr "alternative" "4") + (const_string "multi") + (match_operand:DI 1 "pic_32bit_operand" "") + (const_string "lea") + ] + (const_string "imov"))) + (set (attr "modrm") + (if_then_else + (and (eq_attr "alternative" "2") (eq_attr "type" "imov")) + (const_string 
"0") + (const_string "*"))) + (set (attr "length_immediate") + (if_then_else + (and (eq_attr "alternative" "2") (eq_attr "type" "imov")) + (const_string "8") + (const_string "*"))) + (set_attr "prefix_rex" "*,*,*,*,*,*,*,1,*,1,*,*,*,*,*,*,*,*,*") + (set_attr "prefix_data16" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,1,*,*,*") + (set (attr "prefix") + (if_then_else (eq_attr "alternative" "11,12,13,14,15,16") + (const_string "maybe_vex") + (const_string "orig"))) + (set_attr "mode" "SI,DI,DI,DI,SI,DI,DI,DI,DI,DI,DI,TI,TI,DI,DI,DI,DI,DI,DI")]) -;; Convert impossible pushes of immediate to existing instructions. +;; Convert impossible stores of immediate to existing instructions. ;; First try to get scratch register and go through it. In case this -;; fails, push sign extended lower part first and then overwrite -;; upper part by 32bit move. +;; fails, move by 32bit parts. (define_peephole2 [(match_scratch:DI 2 "r") - (set (match_operand:DI 0 "push_operand" "") + (set (match_operand:DI 0 "memory_operand" "") (match_operand:DI 1 "immediate_operand" ""))] "TARGET_64BIT && !symbolic_operand (operands[1], DImode) && !x86_64_immediate_operand (operands[1], DImode)" @@ -2376,94 +2069,26 @@ ;; peephole2 pass is not run. ;; "&& 1" is needed to keep it from matching the previous pattern. (define_peephole2 - [(set (match_operand:DI 0 "push_operand" "") + [(set (match_operand:DI 0 "memory_operand" "") (match_operand:DI 1 "immediate_operand" ""))] "TARGET_64BIT && !symbolic_operand (operands[1], DImode) && !x86_64_immediate_operand (operands[1], DImode) && 1" - [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))] -{ - split_di (&operands[1], 1, &operands[2], &operands[3]); - - operands[1] = gen_lowpart (DImode, operands[2]); - operands[2] = gen_rtx_MEM (SImode, gen_rtx_PLUS (DImode, stack_pointer_rtx, - GEN_INT (4))); -}) + [(set (match_dup 2) (match_dup 3)) + (set (match_dup 4) (match_dup 5))] + "split_di (&operands[0], 2, &operands[2], &operands[4]);") (define_split - [(set (match_operand:DI 0 "push_operand" "") + [(set (match_operand:DI 0 "memory_operand" "") (match_operand:DI 1 "immediate_operand" ""))] "TARGET_64BIT && ((optimize > 0 && flag_peephole2) ? 
epilogue_completed : reload_completed) && !symbolic_operand (operands[1], DImode) && !x86_64_immediate_operand (operands[1], DImode)" - [(set (match_dup 0) (match_dup 1)) - (set (match_dup 2) (match_dup 3))] -{ - split_di (&operands[1], 1, &operands[2], &operands[3]); - - operands[1] = gen_lowpart (DImode, operands[2]); - operands[2] = gen_rtx_MEM (SImode, gen_rtx_PLUS (DImode, stack_pointer_rtx, - GEN_INT (4))); -}) - -(define_insn "*pushdi2_prologue_rex64" - [(set (match_operand:DI 0 "push_operand" "=<") - (match_operand:DI 1 "general_no_elim_operand" "re*m")) - (clobber (mem:BLK (scratch)))] - "TARGET_64BIT" - "push{q}\t%1" - [(set_attr "type" "push") - (set_attr "mode" "DI")]) - -(define_insn "*popdi1_epilogue_rex64" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r*m") - (mem:DI (reg:DI SP_REG))) - (set (reg:DI SP_REG) - (plus:DI (reg:DI SP_REG) (const_int 8))) - (clobber (mem:BLK (scratch)))] - "TARGET_64BIT" - "pop{q}\t%0" - [(set_attr "type" "pop") - (set_attr "mode" "DI")]) - -(define_insn "popdi1" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r*m") - (mem:DI (reg:DI SP_REG))) - (set (reg:DI SP_REG) - (plus:DI (reg:DI SP_REG) (const_int 8)))] - "TARGET_64BIT" - "pop{q}\t%0" - [(set_attr "type" "pop") - (set_attr "mode" "DI")]) - -(define_insn "*movdi_xor_rex64" - [(set (match_operand:DI 0 "register_operand" "=r") - (match_operand:DI 1 "const0_operand" "")) - (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT - && reload_completed" - "xor{l}\t%k0, %k0"; - [(set_attr "type" "alu1") - (set_attr "mode" "SI") - (set_attr "length_immediate" "0")]) - -(define_insn "*movdi_or_rex64" - [(set (match_operand:DI 0 "register_operand" "=r") - (match_operand:DI 1 "const_int_operand" "i")) - (clobber (reg:CC FLAGS_REG))] - "TARGET_64BIT - && reload_completed - && operands[1] == constm1_rtx" -{ - operands[1] = constm1_rtx; - return "or{q}\t{%1, %0|%0, %1}"; -} - [(set_attr "type" "alu1") - (set_attr "mode" "DI") - (set_attr "length_immediate" "1")]) + [(set (match_dup 2) (match_dup 3)) + (set (match_dup 4) (match_dup 5))] + "split_di (&operands[0], 2, &operands[2], &operands[4]);") -(define_insn "*movdi_2" +(define_insn "*movdi_internal" [(set (match_operand:DI 0 "nonimmediate_operand" "=r ,o ,*y,m*y,*y,*Y2,m ,*Y2,*Y2,*x,m ,*x,*x") (match_operand:DI 1 "general_operand" @@ -2491,369 +2116,493 @@ (set_attr "mode" "DI,DI,DI,DI,DI,TI,DI,TI,DI,V4SF,V2SF,V4SF,V2SF")]) (define_split - [(set (match_operand:DI 0 "push_operand" "") - (match_operand:DI 1 "general_operand" ""))] - "!TARGET_64BIT && reload_completed - && (! MMX_REG_P (operands[1]) && !SSE_REG_P (operands[1]))" - [(const_int 0)] - "ix86_split_long_move (operands); DONE;") - -;; %%% This multiword shite has got to go. 
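;; (Illustration, not part of the patch: after reload on !TARGET_64BIT,
;; the split below decomposes a DImode move into two SImode moves, so a
;; 64-bit register copy becomes two movl instructions;
;; ix86_split_long_move picks the order of the halves so that an
;; overlapping source/destination pair is not clobbered halfway
;; through.)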
-(define_split [(set (match_operand:DI 0 "nonimmediate_operand" "") (match_operand:DI 1 "general_operand" ""))] "!TARGET_64BIT && reload_completed - && (!MMX_REG_P (operands[0]) && !SSE_REG_P (operands[0])) - && (!MMX_REG_P (operands[1]) && !SSE_REG_P (operands[1]))" + && !(MMX_REG_P (operands[0]) || SSE_REG_P (operands[0])) + && !(MMX_REG_P (operands[1]) || SSE_REG_P (operands[1]))" [(const_int 0)] "ix86_split_long_move (operands); DONE;") -(define_insn "*movdi_1_rex64" - [(set (match_operand:DI 0 "nonimmediate_operand" - "=r,r ,r,m ,!m,*y,*y,?r ,m ,?*Ym,?*y,*x,*x,?r ,m,?*Yi,*x,?*x,?*Ym") - (match_operand:DI 1 "general_operand" - "Z ,rem,i,re,n ,C ,*y,*Ym,*y,r ,m ,C ,*x,*Yi,*x,r ,m ,*Ym,*x"))] - "TARGET_64BIT && !(MEM_P (operands[0]) && MEM_P (operands[1]))" +(define_insn "*movsi_internal" + [(set (match_operand:SI 0 "nonimmediate_operand" + "=r,m ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x") + (match_operand:SI 1 "general_operand" + "g ,ri,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r ,m "))] + "!(MEM_P (operands[0]) && MEM_P (operands[1]))" { switch (get_attr_type (insn)) { - case TYPE_SSECVT: - if (SSE_REG_P (operands[0])) - return "movq2dq\t{%1, %0|%0, %1}"; - else - return "movdq2q\t{%1, %0|%0, %1}"; + case TYPE_SSELOG1: + if (get_attr_mode (insn) == MODE_TI) + return "%vpxor\t%0, %d0"; + return "%vxorps\t%0, %d0"; case TYPE_SSEMOV: - if (TARGET_AVX) + switch (get_attr_mode (insn)) { - if (get_attr_mode (insn) == MODE_TI) - return "vmovdqa\t{%1, %0|%0, %1}"; - else - return "vmovq\t{%1, %0|%0, %1}"; + case MODE_TI: + return "%vmovdqa\t{%1, %0|%0, %1}"; + case MODE_V4SF: + return "%vmovaps\t{%1, %0|%0, %1}"; + case MODE_SI: + return "%vmovd\t{%1, %0|%0, %1}"; + case MODE_SF: + return "%vmovss\t{%1, %0|%0, %1}"; + default: + gcc_unreachable (); } - if (get_attr_mode (insn) == MODE_TI) - return "movdqa\t{%1, %0|%0, %1}"; - /* FALLTHRU */ - - case TYPE_MMXMOV: - /* Moves from and into integer register is done using movd - opcode with REX prefix. 
*/ - if (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])) - return "movd\t{%1, %0|%0, %1}"; - return "movq\t{%1, %0|%0, %1}"; - - case TYPE_SSELOG1: - return "%vpxor\t%0, %d0"; - case TYPE_MMX: return "pxor\t%0, %0"; - case TYPE_MULTI: - return "#"; + case TYPE_MMXMOV: + if (get_attr_mode (insn) == MODE_DI) + return "movq\t{%1, %0|%0, %1}"; + return "movd\t{%1, %0|%0, %1}"; case TYPE_LEA: - return "lea{q}\t{%a1, %0|%0, %a1}"; + return "lea{l}\t{%a1, %0|%0, %a1}"; default: gcc_assert (!flag_pic || LEGITIMATE_PIC_OPERAND_P (operands[1])); - if (get_attr_mode (insn) == MODE_SI) - return "mov{l}\t{%k1, %k0|%k0, %k1}"; - else if (which_alternative == 2) - return "movabs{q}\t{%1, %0|%0, %1}"; - else - return "mov{q}\t{%1, %0|%0, %1}"; + return "mov{l}\t{%1, %0|%0, %1}"; } } [(set (attr "type") - (cond [(eq_attr "alternative" "5") + (cond [(eq_attr "alternative" "2") (const_string "mmx") - (eq_attr "alternative" "6,7,8,9,10") + (eq_attr "alternative" "3,4,5") (const_string "mmxmov") - (eq_attr "alternative" "11") + (eq_attr "alternative" "6") (const_string "sselog1") - (eq_attr "alternative" "12,13,14,15,16") + (eq_attr "alternative" "7,8,9,10,11") (const_string "ssemov") - (eq_attr "alternative" "17,18") - (const_string "ssecvt") - (eq_attr "alternative" "4") - (const_string "multi") (match_operand:DI 1 "pic_32bit_operand" "") (const_string "lea") ] (const_string "imov"))) - (set (attr "modrm") - (if_then_else - (and (eq_attr "alternative" "2") (eq_attr "type" "imov")) - (const_string "0") - (const_string "*"))) - (set (attr "length_immediate") - (if_then_else - (and (eq_attr "alternative" "2") (eq_attr "type" "imov")) - (const_string "8") - (const_string "*"))) - (set_attr "prefix_rex" "*,*,*,*,*,*,*,1,*,1,*,*,*,*,*,*,*,*,*") - (set_attr "prefix_data16" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,1,*,*,*") (set (attr "prefix") - (if_then_else (eq_attr "alternative" "11,12,13,14,15,16") - (const_string "maybe_vex") - (const_string "orig"))) - (set_attr "mode" "SI,DI,DI,DI,SI,DI,DI,DI,DI,DI,DI,TI,TI,DI,DI,DI,DI,DI,DI")]) + (if_then_else (eq_attr "alternative" "0,1,2,3,4,5") + (const_string "orig") + (const_string "maybe_vex"))) + (set (attr "prefix_data16") + (if_then_else (and (eq_attr "type" "ssemov") (eq_attr "mode" "SI")) + (const_string "1") + (const_string "*"))) + (set (attr "mode") + (cond [(eq_attr "alternative" "2,3") + (const_string "DI") + (eq_attr "alternative" "6,7") + (if_then_else + (eq (symbol_ref "TARGET_SSE2") (const_int 0)) + (const_string "V4SF") + (const_string "TI")) + (and (eq_attr "alternative" "8,9,10,11") + (eq (symbol_ref "TARGET_SSE2") (const_int 0))) + (const_string "SF") + ] + (const_string "SI")))]) + +(define_insn "*movhi_internal" + [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m") + (match_operand:HI 1 "general_operand" "r,rn,rm,rn"))] + "!(MEM_P (operands[0]) && MEM_P (operands[1]))" +{ + switch (get_attr_type (insn)) + { + case TYPE_IMOVX: + /* movzwl is faster than movw on p2 due to partial word stalls, + though not as fast as an aligned movl. 
*/ + return "movz{wl|x}\t{%1, %k0|%k0, %1}"; + default: + if (get_attr_mode (insn) == MODE_SI) + return "mov{l}\t{%k1, %k0|%k0, %k1}"; + else + return "mov{w}\t{%1, %0|%0, %1}"; + } +} + [(set (attr "type") + (cond [(ne (symbol_ref "optimize_function_for_size_p (cfun)") + (const_int 0)) + (const_string "imov") + (and (eq_attr "alternative" "0") + (ior (eq (symbol_ref "TARGET_PARTIAL_REG_STALL") + (const_int 0)) + (eq (symbol_ref "TARGET_HIMODE_MATH") + (const_int 0)))) + (const_string "imov") + (and (eq_attr "alternative" "1,2") + (match_operand:HI 1 "aligned_operand" "")) + (const_string "imov") + (and (ne (symbol_ref "TARGET_MOVX") + (const_int 0)) + (eq_attr "alternative" "0,2")) + (const_string "imovx") + ] + (const_string "imov"))) + (set (attr "mode") + (cond [(eq_attr "type" "imovx") + (const_string "SI") + (and (eq_attr "alternative" "1,2") + (match_operand:HI 1 "aligned_operand" "")) + (const_string "SI") + (and (eq_attr "alternative" "0") + (ior (eq (symbol_ref "TARGET_PARTIAL_REG_STALL") + (const_int 0)) + (eq (symbol_ref "TARGET_HIMODE_MATH") + (const_int 0)))) + (const_string "SI") + ] + (const_string "HI")))]) + +;; Situation is quite tricky about when to choose full sized (SImode) move +;; over QImode moves. For Q_REG -> Q_REG move we use full size only for +;; partial register dependency machines (such as AMD Athlon), where QImode +;; moves issue extra dependency and for partial register stalls machines +;; that don't use QImode patterns (and QImode move cause stall on the next +;; instruction). +;; +;; For loads of Q_REG to NONQ_REG we use full sized moves except for partial +;; register stall machines with, where we use QImode instructions, since +;; partial register stall can be caused there. Then we use movzx. +(define_insn "*movqi_internal" + [(set (match_operand:QI 0 "nonimmediate_operand" "=q,q ,q ,r,r ,?r,m") + (match_operand:QI 1 "general_operand" " q,qn,qm,q,rn,qm,qn"))] + "!(MEM_P (operands[0]) && MEM_P (operands[1]))" +{ + switch (get_attr_type (insn)) + { + case TYPE_IMOVX: + gcc_assert (ANY_QI_REG_P (operands[1]) || MEM_P (operands[1])); + return "movz{bl|x}\t{%1, %k0|%k0, %1}"; + default: + if (get_attr_mode (insn) == MODE_SI) + return "mov{l}\t{%k1, %k0|%k0, %k1}"; + else + return "mov{b}\t{%1, %0|%0, %1}"; + } +} + [(set (attr "type") + (cond [(and (eq_attr "alternative" "5") + (not (match_operand:QI 1 "aligned_operand" ""))) + (const_string "imovx") + (ne (symbol_ref "optimize_function_for_size_p (cfun)") + (const_int 0)) + (const_string "imov") + (and (eq_attr "alternative" "3") + (ior (eq (symbol_ref "TARGET_PARTIAL_REG_STALL") + (const_int 0)) + (eq (symbol_ref "TARGET_QIMODE_MATH") + (const_int 0)))) + (const_string "imov") + (eq_attr "alternative" "3,5") + (const_string "imovx") + (and (ne (symbol_ref "TARGET_MOVX") + (const_int 0)) + (eq_attr "alternative" "2")) + (const_string "imovx") + ] + (const_string "imov"))) + (set (attr "mode") + (cond [(eq_attr "alternative" "3,4,5") + (const_string "SI") + (eq_attr "alternative" "6") + (const_string "QI") + (eq_attr "type" "imovx") + (const_string "SI") + (and (eq_attr "type" "imov") + (and (eq_attr "alternative" "0,1") + (and (ne (symbol_ref "TARGET_PARTIAL_REG_DEPENDENCY") + (const_int 0)) + (and (eq (symbol_ref "optimize_function_for_size_p (cfun)") + (const_int 0)) + (eq (symbol_ref "TARGET_PARTIAL_REG_STALL") + (const_int 0)))))) + (const_string "SI") + ;; Avoid partial register stalls when not using QImode arithmetic + (and (eq_attr "type" "imov") + (and (eq_attr "alternative" "0,1") + (and (ne 
(symbol_ref "TARGET_PARTIAL_REG_STALL") + (const_int 0)) + (eq (symbol_ref "TARGET_QIMODE_MATH") + (const_int 0))))) + (const_string "SI") + ] + (const_string "QI")))]) ;; Stores and loads of ax to arbitrary constant address. ;; We fake an second form of instruction to force reload to load address ;; into register when rax is not available -(define_insn "*movabsdi_1_rex64" - [(set (mem:DI (match_operand:DI 0 "x86_64_movabs_operand" "i,r")) - (match_operand:DI 1 "nonmemory_operand" "a,er"))] +(define_insn "*movabs_1" + [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i,r")) + (match_operand:SWI1248x 1 "nonmemory_operand" "a,er"))] "TARGET_64BIT && ix86_check_movabs (insn, 0)" "@ - movabs{q}\t{%1, %P0|%P0, %1} - mov{q}\t{%1, %a0|%a0, %1}" + movabs{}\t{%1, %P0|%P0, %1} + mov{}\t{%1, %a0|%a0, %1}" [(set_attr "type" "imov") (set_attr "modrm" "0,*") (set_attr "length_address" "8,0") (set_attr "length_immediate" "0,*") (set_attr "memory" "store") - (set_attr "mode" "DI")]) + (set_attr "mode" "")]) -(define_insn "*movabsdi_2_rex64" - [(set (match_operand:DI 0 "register_operand" "=a,r") - (mem:DI (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))] +(define_insn "*movabs_2" + [(set (match_operand:SWI1248x 0 "register_operand" "=a,r") + (mem:SWI1248x (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))] "TARGET_64BIT && ix86_check_movabs (insn, 1)" "@ - movabs{q}\t{%P1, %0|%0, %P1} - mov{q}\t{%a1, %0|%0, %a1}" + movabs{}\t{%P1, %0|%0, %P1} + mov{}\t{%a1, %0|%0, %a1}" [(set_attr "type" "imov") (set_attr "modrm" "0,*") (set_attr "length_address" "8,0") (set_attr "length_immediate" "0") (set_attr "memory" "load") - (set_attr "mode" "DI")]) - -;; Convert impossible stores of immediate to existing instructions. -;; First try to get scratch register and go through it. In case this -;; fails, move by 32bit parts. -(define_peephole2 - [(match_scratch:DI 2 "r") - (set (match_operand:DI 0 "memory_operand" "") - (match_operand:DI 1 "immediate_operand" ""))] - "TARGET_64BIT && !symbolic_operand (operands[1], DImode) - && !x86_64_immediate_operand (operands[1], DImode)" - [(set (match_dup 2) (match_dup 1)) - (set (match_dup 0) (match_dup 2))] - "") - -;; We need to define this as both peepholer and splitter for case -;; peephole2 pass is not run. -;; "&& 1" is needed to keep it from matching the previous pattern. -(define_peephole2 - [(set (match_operand:DI 0 "memory_operand" "") - (match_operand:DI 1 "immediate_operand" ""))] - "TARGET_64BIT && !symbolic_operand (operands[1], DImode) - && !x86_64_immediate_operand (operands[1], DImode) && 1" - [(set (match_dup 2) (match_dup 3)) - (set (match_dup 4) (match_dup 5))] - "split_di (&operands[0], 2, &operands[2], &operands[4]);") + (set_attr "mode" "")]) -(define_split - [(set (match_operand:DI 0 "memory_operand" "") - (match_operand:DI 1 "immediate_operand" ""))] - "TARGET_64BIT && ((optimize > 0 && flag_peephole2) - ? 
epilogue_completed : reload_completed) - && !symbolic_operand (operands[1], DImode) - && !x86_64_immediate_operand (operands[1], DImode)" - [(set (match_dup 2) (match_dup 3)) - (set (match_dup 4) (match_dup 5))] - "split_di (&operands[0], 2, &operands[2], &operands[4]);") +(define_insn "*swap<mode>" + [(set (match_operand:SWI48 0 "register_operand" "+r") + (match_operand:SWI48 1 "register_operand" "+r")) + (set (match_dup 1) + (match_dup 0))] + "" + "xchg{<imodesuffix>}\t%1, %0" + [(set_attr "type" "imov") + (set_attr "mode" "<MODE>") + (set_attr "pent_pair" "np") + (set_attr "athlon_decode" "vector") + (set_attr "amdfam10_decode" "double")]) -(define_insn "*swapdi_rex64" - [(set (match_operand:DI 0 "register_operand" "+r") - (match_operand:DI 1 "register_operand" "+r")) +(define_insn "*swap<mode>_1" + [(set (match_operand:SWI12 0 "register_operand" "+r") + (match_operand:SWI12 1 "register_operand" "+r")) (set (match_dup 1) (match_dup 0))] - "TARGET_64BIT" - "xchg{q}\t%1, %0" + "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)" + "xchg{l}\t%k1, %k0" [(set_attr "type" "imov") - (set_attr "mode" "DI") + (set_attr "mode" "SI") (set_attr "pent_pair" "np") (set_attr "athlon_decode" "vector") (set_attr "amdfam10_decode" "double")]) -(define_expand "movoi" - [(set (match_operand:OI 0 "nonimmediate_operand" "") - (match_operand:OI 1 "general_operand" ""))] - "TARGET_AVX" - "ix86_expand_move (OImode, operands); DONE;") +;; Not added amdfam10_decode since TARGET_PARTIAL_REG_STALL +;; is disabled for AMDFAM10 +(define_insn "*swap<mode>_2" + [(set (match_operand:SWI12 0 "register_operand" "+<r>") + (match_operand:SWI12 1 "register_operand" "+<r>")) + (set (match_dup 1) + (match_dup 0))] + "TARGET_PARTIAL_REG_STALL" + "xchg{<imodesuffix>}\t%1, %0" + [(set_attr "type" "imov") + (set_attr "mode" "<MODE>") + (set_attr "pent_pair" "np") + (set_attr "athlon_decode" "vector")]) + +(define_expand "movstrict<mode>" + [(set (strict_low_part (match_operand:SWI12 0 "nonimmediate_operand" "")) + (match_operand:SWI12 1 "general_operand" ""))] + "" +{ + if (TARGET_PARTIAL_REG_STALL && optimize_function_for_speed_p (cfun)) + FAIL; + /* Don't generate memory->memory moves, go through a register */ + if (MEM_P (operands[0]) && MEM_P (operands[1])) + operands[1] = force_reg (<MODE>mode, operands[1]); +}) + +(define_insn "*movstrict<mode>_1" + [(set (strict_low_part + (match_operand:SWI12 0 "nonimmediate_operand" "+<r>m,<r>")) + (match_operand:SWI12 1 "general_operand" "<r>n,m"))] + "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)) + && !(MEM_P (operands[0]) && MEM_P (operands[1]))" + "mov{<imodesuffix>}\t{%1, %0|%0, %1}" + [(set_attr "type" "imov") + (set_attr "mode" "<MODE>")]) + +(define_insn "*movstrict<mode>_xor" + [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+<r>")) + (match_operand:SWI12 1 "const0_operand" "")) + (clobber (reg:CC FLAGS_REG))] + "reload_completed" + "xor{<imodesuffix>}\t%0, %0" + [(set_attr "type" "alu1") + (set_attr "mode" "<MODE>") + (set_attr "length_immediate" "0")]) + +(define_insn "*mov<mode>_extv_1" + [(set (match_operand:SWI24 0 "register_operand" "=R") + (sign_extract:SWI24 (match_operand 1 "ext_register_operand" "Q") + (const_int 8) + (const_int 8)))] + "" + "movs{bl|x}\t{%h1, %k0|%k0, %h1}" + [(set_attr "type" "imovx") + (set_attr "mode" "SI")]) -(define_insn "*movoi_internal" - [(set (match_operand:OI 0 "nonimmediate_operand" "=x,x,m") - (match_operand:OI 1 "vector_move_operand" "C,xm,x"))] - "TARGET_AVX - && !(MEM_P (operands[0]) && MEM_P (operands[1]))" +(define_insn "*movqi_extv_1_rex64" + [(set (match_operand:QI 0 "register_operand" "=Q,?R") + (sign_extract:QI 
(match_operand 1 "ext_register_operand" "Q,Q") + (const_int 8) + (const_int 8)))] + "TARGET_64BIT" { - switch (which_alternative) + switch (get_attr_type (insn)) { - case 0: - return "vxorps\t%0, %0, %0"; - case 1: - case 2: - if (misaligned_operand (operands[0], OImode) - || misaligned_operand (operands[1], OImode)) - return "vmovdqu\t{%1, %0|%0, %1}"; - else - return "vmovdqa\t{%1, %0|%0, %1}"; + case TYPE_IMOVX: + return "movs{bl|x}\t{%h1, %k0|%k0, %h1}"; default: - gcc_unreachable (); + return "mov{b}\t{%h1, %0|%0, %h1}"; } } - [(set_attr "type" "sselog1,ssemov,ssemov") - (set_attr "prefix" "vex") - (set_attr "mode" "OI")]) + [(set (attr "type") + (if_then_else (and (match_operand:QI 0 "register_operand" "") + (ior (not (match_operand:QI 0 "q_regs_operand" "")) + (ne (symbol_ref "TARGET_MOVX") + (const_int 0)))) + (const_string "imovx") + (const_string "imov"))) + (set (attr "mode") + (if_then_else (eq_attr "type" "imovx") + (const_string "SI") + (const_string "QI")))]) -(define_expand "movti" - [(set (match_operand:TI 0 "nonimmediate_operand" "") - (match_operand:TI 1 "nonimmediate_operand" ""))] - "TARGET_SSE || TARGET_64BIT" +(define_insn "*movqi_extv_1" + [(set (match_operand:QI 0 "nonimmediate_operand" "=Qm,?r") + (sign_extract:QI (match_operand 1 "ext_register_operand" "Q,Q") + (const_int 8) + (const_int 8)))] + "!TARGET_64BIT" { - if (TARGET_64BIT) - ix86_expand_move (TImode, operands); - else if (push_operand (operands[0], TImode)) - ix86_expand_push (TImode, operands[1]); - else - ix86_expand_vector_move (TImode, operands); - DONE; -}) + switch (get_attr_type (insn)) + { + case TYPE_IMOVX: + return "movs{bl|x}\t{%h1, %k0|%k0, %h1}"; + default: + return "mov{b}\t{%h1, %0|%0, %h1}"; + } +} + [(set (attr "type") + (if_then_else (and (match_operand:QI 0 "register_operand" "") + (ior (not (match_operand:QI 0 "q_regs_operand" "")) + (ne (symbol_ref "TARGET_MOVX") + (const_int 0)))) + (const_string "imovx") + (const_string "imov"))) + (set (attr "mode") + (if_then_else (eq_attr "type" "imovx") + (const_string "SI") + (const_string "QI")))]) -(define_insn "*movti_internal" - [(set (match_operand:TI 0 "nonimmediate_operand" "=x,x,m") - (match_operand:TI 1 "vector_move_operand" "C,xm,x"))] - "TARGET_SSE && !TARGET_64BIT - && !(MEM_P (operands[0]) && MEM_P (operands[1]))" +(define_insn "*mov_extzv_1" + [(set (match_operand:SWI48 0 "register_operand" "=R") + (zero_extract:SWI48 (match_operand 1 "ext_register_operand" "Q") + (const_int 8) + (const_int 8)))] + "" + "movz{bl|x}\t{%h1, %k0|%k0, %h1}" + [(set_attr "type" "imovx") + (set_attr "mode" "SI")]) + +(define_insn "*movqi_extzv_2_rex64" + [(set (match_operand:QI 0 "register_operand" "=Q,?R") + (subreg:QI + (zero_extract:SI (match_operand 1 "ext_register_operand" "Q,Q") + (const_int 8) + (const_int 8)) 0))] + "TARGET_64BIT" { - switch (which_alternative) + switch (get_attr_type (insn)) { - case 0: - if (get_attr_mode (insn) == MODE_V4SF) - return "%vxorps\t%0, %d0"; - else - return "%vpxor\t%0, %d0"; - case 1: - case 2: - /* TDmode values are passed as TImode on the stack. Moving them - to stack may result in unaligned memory access. 
+(define_insn "*movqi_extzv_2_rex64"
+  [(set (match_operand:QI 0 "register_operand" "=Q,?R")
+	(subreg:QI
+	  (zero_extract:SI (match_operand 1 "ext_register_operand" "Q,Q")
+			   (const_int 8)
+			   (const_int 8)) 0))]
+  "TARGET_64BIT"
 {
-  switch (which_alternative)
+  switch (get_attr_type (insn))
     {
-    case 0:
-      if (get_attr_mode (insn) == MODE_V4SF)
-	return "%vxorps\t%0, %d0";
-      else
-	return "%vpxor\t%0, %d0";
-    case 1:
-    case 2:
-      /* TDmode values are passed as TImode on the stack.  Moving them
-	 to stack may result in unaligned memory access.  */
-      if (misaligned_operand (operands[0], TImode)
-	  || misaligned_operand (operands[1], TImode))
-	{
-	  if (get_attr_mode (insn) == MODE_V4SF)
-	    return "%vmovups\t{%1, %0|%0, %1}";
-	  else
-	    return "%vmovdqu\t{%1, %0|%0, %1}";
-	}
-      else
-	{
-	  if (get_attr_mode (insn) == MODE_V4SF)
-	    return "%vmovaps\t{%1, %0|%0, %1}";
-	  else
-	    return "%vmovdqa\t{%1, %0|%0, %1}";
-	}
+    case TYPE_IMOVX:
+      return "movz{bl|x}\t{%h1, %k0|%k0, %h1}";
     default:
-      gcc_unreachable ();
+      return "mov{b}\t{%h1, %0|%0, %h1}";
     }
 }
-  [(set_attr "type" "sselog1,ssemov,ssemov")
-   (set_attr "prefix" "maybe_vex")
+  [(set (attr "type")
+     (if_then_else (ior (not (match_operand:QI 0 "q_regs_operand" ""))
+			(ne (symbol_ref "TARGET_MOVX")
+			    (const_int 0)))
+	(const_string "imovx")
+	(const_string "imov")))
    (set (attr "mode")
-    (cond [(ior (eq (symbol_ref "TARGET_SSE2") (const_int 0))
-		(ne (symbol_ref "optimize_function_for_size_p (cfun)") (const_int 0)))
-	     (const_string "V4SF")
-	   (and (eq_attr "alternative" "2")
-		(ne (symbol_ref "TARGET_SSE_TYPELESS_STORES")
-		    (const_int 0)))
-	     (const_string "V4SF")]
-	  (const_string "TI")))])
+     (if_then_else (eq_attr "type" "imovx")
+	(const_string "SI")
+	(const_string "QI")))])
 
-(define_insn "*movti_rex64"
-  [(set (match_operand:TI 0 "nonimmediate_operand" "=!r,o,x,x,xm")
-	(match_operand:TI 1 "general_operand" "riFo,riF,C,xm,x"))]
-  "TARGET_64BIT
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+(define_insn "*movqi_extzv_2"
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=Qm,?R")
+	(subreg:QI
+	  (zero_extract:SI (match_operand 1 "ext_register_operand" "Q,Q")
+			   (const_int 8)
+			   (const_int 8)) 0))]
+  "!TARGET_64BIT"
 {
-  switch (which_alternative)
+  switch (get_attr_type (insn))
     {
-    case 0:
-    case 1:
-      return "#";
-    case 2:
-      if (get_attr_mode (insn) == MODE_V4SF)
-	return "%vxorps\t%0, %d0";
-      else
-	return "%vpxor\t%0, %d0";
-    case 3:
-    case 4:
-      /* TDmode values are passed as TImode on the stack.  Moving them
-	 to stack may result in unaligned memory access.  */
-      if (misaligned_operand (operands[0], TImode)
-	  || misaligned_operand (operands[1], TImode))
-	{
-	  if (get_attr_mode (insn) == MODE_V4SF)
-	    return "%vmovups\t{%1, %0|%0, %1}";
-	  else
-	    return "%vmovdqu\t{%1, %0|%0, %1}";
-	}
-      else
-	{
-	  if (get_attr_mode (insn) == MODE_V4SF)
-	    return "%vmovaps\t{%1, %0|%0, %1}";
-	  else
-	    return "%vmovdqa\t{%1, %0|%0, %1}";
-	}
+    case TYPE_IMOVX:
+      return "movz{bl|x}\t{%h1, %k0|%k0, %h1}";
     default:
-      gcc_unreachable ();
+      return "mov{b}\t{%h1, %0|%0, %h1}";
     }
 }
-  [(set_attr "type" "*,*,sselog1,ssemov,ssemov")
-   (set_attr "prefix" "*,*,maybe_vex,maybe_vex,maybe_vex")
+  [(set (attr "type")
+     (if_then_else (and (match_operand:QI 0 "register_operand" "")
+			(ior (not (match_operand:QI 0 "q_regs_operand" ""))
+			     (ne (symbol_ref "TARGET_MOVX")
+				 (const_int 0))))
+	(const_string "imovx")
+	(const_string "imov")))
    (set (attr "mode")
-    (cond [(eq_attr "alternative" "2,3")
-	     (if_then_else
-	       (ne (symbol_ref "optimize_function_for_size_p (cfun)")
-		   (const_int 0))
-	       (const_string "V4SF")
-	       (const_string "TI"))
-	   (eq_attr "alternative" "4")
-	     (if_then_else
-	       (ior (ne (symbol_ref "TARGET_SSE_TYPELESS_STORES")
-			(const_int 0))
-		    (ne (symbol_ref "optimize_function_for_size_p (cfun)")
-			(const_int 0)))
-	       (const_string "V4SF")
-	       (const_string "TI"))]
-	  (const_string "DI")))])
+     (if_then_else (eq_attr "type" "imovx")
+	(const_string "SI")
+	(const_string "QI")))])
 
-(define_split
-  [(set (match_operand:TI 0 "nonimmediate_operand" "")
-	(match_operand:TI 1 "general_operand" ""))]
-  "reload_completed && !SSE_REG_P (operands[0])
-   && !SSE_REG_P (operands[1])"
-  [(const_int 0)]
-  "ix86_split_long_move (operands); DONE;")
+(define_expand "mov<mode>_insv_1"
+  [(set (zero_extract:SWI48 (match_operand 0 "ext_register_operand" "")
+			    (const_int 8)
+			    (const_int 8))
+	(match_operand:SWI48 1 "nonmemory_operand" ""))]
+  ""
+  "")
 
-;; This expands to what emit_move_complex would generate if we didn't
-;; have a movti pattern.  Having this avoids problems with reload on
-;; 32-bit targets when SSE is present, but doesn't seem to be harmful
-;; to have around all the time.
-(define_expand "movcdi"
-  [(set (match_operand:CDI 0 "nonimmediate_operand" "")
-	(match_operand:CDI 1 "general_operand" ""))]
+(define_insn "*mov<mode>_insv_1_rex64"
+  [(set (zero_extract:SWI48x (match_operand 0 "ext_register_operand" "+Q")
+			     (const_int 8)
+			     (const_int 8))
+	(match_operand:SWI48x 1 "nonmemory_operand" "Qn"))]
+  "TARGET_64BIT"
+  "mov{b}\t{%b1, %h0|%h0, %b1}"
+  [(set_attr "type" "imov")
+   (set_attr "mode" "QI")])
+
+(define_insn "*movsi_insv_1"
+  [(set (zero_extract:SI (match_operand 0 "ext_register_operand" "+Q")
+			 (const_int 8)
+			 (const_int 8))
+	(match_operand:SI 1 "general_operand" "Qmn"))]
+  "!TARGET_64BIT"
+  "mov{b}\t{%b1, %h0|%h0, %b1}"
+  [(set_attr "type" "imov")
+   (set_attr "mode" "QI")])
+
+(define_insn "*movqi_insv_2"
+  [(set (zero_extract:SI (match_operand 0 "ext_register_operand" "+Q")
+			 (const_int 8)
+			 (const_int 8))
+	(lshiftrt:SI (match_operand:SI 1 "register_operand" "Q")
+		     (const_int 8)))]
   ""
-{
-  if (push_operand (operands[0], CDImode))
-    emit_move_complex_push (CDImode, operands[0], operands[1]);
-  else
-    emit_move_complex_parts (operands[0], operands[1]);
-  DONE;
-})
+  "mov{b}\t{%h1, %h0|%h0, %h1}"
+  [(set_attr "type" "imov")
+   (set_attr "mode" "QI")])
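
A usage note, again illustrative rather than part of the patch: these insv
patterns store a byte into bits 8..15 of a QImode-addressable register, i.e.
into %ah/%bh/%ch/%dh.  The promote_duplicated_reg change in i386.c below
relies on exactly this; for a value held in %eax,

  emit_insn (gen_movsi_insv_1 (reg, reg));

duplicates the low byte into byte 1 by emitting "movb %al, %ah".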
(define_expand "movsf" [(set (match_operand:SF 0 "nonimmediate_operand" "") @@ -4137,7 +3886,7 @@ (zero_extend:DI (match_operand:SI 1 "general_operand" ""))) (clobber (reg:CC FLAGS_REG))] "!TARGET_64BIT && reload_completed - && !SSE_REG_P (operands[0]) && !MMX_REG_P (operands[0])" + && !(MMX_REG_P (operands[0]) || SSE_REG_P (operands[0]))" [(set (match_dup 3) (match_dup 1)) (set (match_dup 4) (const_int 0))] "split_di (&operands[0], 1, &operands[3], &operands[4]);") @@ -10520,7 +10269,7 @@ FAIL; if (TARGET_64BIT) - emit_insn (gen_movdi_insv_1_rex64 (operands[0], operands[3])); + emit_insn (gen_movdi_insv_1 (operands[0], operands[3])); else emit_insn (gen_movsi_insv_1 (operands[0], operands[3])); Index: i386.c =================================================================== --- i386.c (revision 161130) +++ i386.c (working copy) @@ -18838,7 +18838,7 @@ promote_duplicated_reg (enum machine_mod if (mode == SImode) emit_insn (gen_movsi_insv_1 (reg, reg)); else - emit_insn (gen_movdi_insv_1_rex64 (reg, reg)); + emit_insn (gen_movdi_insv_1 (reg, reg)); else { tmp = expand_simple_binop (mode, ASHIFT, reg, GEN_INT (8),