From patchwork Sat Jul 6 11:51:08 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 257230
Date: Sat, 6 Jul 2013 13:51:08 +0200
Subject: [PATCH, i386]: Fix PR57807, -masm=intel pointer size fixes
From: Uros Bizjak
To: "gcc-patches@gcc.gnu.org"
Cc: jleahy+gcc@gmail.com

Hello!

While the AT&T dialect does not care about pointer sizes, the Intel dialect requires correct pointer size decorations. The attached patch introduces correct pointer size overrides and fixes all affected instructions, so that the gcc.target/i386/sse-13.c mega-testcase compiles without problems with -masm=intel. Unfortunately, this also uncovers a gas bug [1] with cvttps2pi.

[1] http://sourceware.org/bugzilla/show_bug.cgi?id=13572

2013-07-06  Uros Bizjak

	PR target/57807
	* config/i386/sse.md (iptr): New mode attribute.
	(sse2_movq128): Add pointer size overrides for Intel asm dialect.
	(_vm3): Ditto.
	(_vmmul3): Ditto.
	(_vmdiv3): Ditto.
	(sse_vmrcpv4sf2): Ditto.
	(_vmsqrt2): Ditto.
	(sse_vmrsqrtv4sf2): Ditto.
	(_vm3): Ditto.
	(avx_vmcmp3): Ditto.
	(_vmmaskcmp3): Ditto.
	(_comi): Ditto.
	(_ucomi): Ditto.
	(*xop_vmfrcz_): Ditto.
	(*fmai_fmadd_): Ditto.
	(*fmai_fmsub_): Ditto.
	(*fmai_fnmadd_): Ditto.
	(*fmai_fnmsub_): Ditto.
	(*fma4i_vmfmadd_): Ditto.
	(*fma4i_vmfmsub_): Ditto.
	(*fma4i_vmfnmadd_): Ditto.
	(*fma4i_vmfnmsub_): Ditto.
	(*xop_vmfrcz_): Ditto.
	(sse_cvtps2pi): Ditto.
	(sse_cvttps2pi): Ditto.
	(sse_cvtss2si): Ditto.
	(sse_cvtss2si_2): Ditto.
	(sse_cvtss2siq_2): Ditto.
	(sse_cvttss2si): Ditto.
	(sse_cvttss2siq): Ditto.
	(sse_cvtsd2si): Ditto.
	(sse_cvtsd2si_2): Ditto.
	(sse_cvtsd2siq_2): Ditto.
	(sse_cvttsd2si): Ditto.
	(sse_cvttsd2siq): Ditto.
	(sse_cvtsd2ss): Ditto.
	(sse_cvtss2sd): Ditto.
	(avx2_pbroadcast): Ditto.
	(avx2_pbroadcast_1): Ditto.
	(*avx_vperm_broadcast_v4sf): Ditto.
	(sse_movhlps): Ditto for movlp[sd]/movhp[sd] alternatives.
	(sse_movlhps): Ditto.
	(sse_storehps): Ditto.
	(sse_loadhps): Ditto.
	(sse_storelps): Ditto.
	(sse_loadlps): Ditto.
	(*vec_concatv4sf): Ditto.
	(*vec_interleave_highv2df): Ditto.
	(*vec_interleave_lowv2df): Ditto.
	(*vec_extractv2df_1_sse): Ditto.
	(*vec_extractv2df_0_sse): Ditto.
	(sse2_storelpd): Ditto.
	(sse2_loadlpd): Ditto.
	(sse2_movsd): Ditto.
	(*vec_concatv4si): Ditto.
	(vec_concatv2di): Ditto.
	* config/i386/mmx.md (mmx_punpcklbw): Add pointer size overrides
	for Intel asm dialect.
	(mmx_punpcklwd): Ditto.
	(mmx_punpckldq): Ditto.
	* config/i386/i386.c (ix86_print_operand) ['H']: Output 'qword ptr'
	for intel assembler dialect.

testsuite/ChangeLog:

2013-07-06  Uros Bizjak

	PR target/57807
	* gcc.target/i386/pr57807.c: New test.

The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}.

The patch was committed to mainline SVN, but due to its size I think it is not appropriate for release branches.

Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 200728) +++ config/i386/i386.c (working copy) @@ -14670,6 +14670,9 @@ ix86_print_operand (FILE *file, rtx x, int code) /* It doesn't actually matter what mode we use here, as we're only going to use this for printing. */ x = adjust_address_nv (x, DImode, 8); + /* Output 'qword ptr' for intel assembler dialect. 
*/ + if (ASSEMBLER_DIALECT == ASM_INTEL) + code = 'q'; break; case 'K': Index: config/i386/mmx.md =================================================================== --- config/i386/mmx.md (revision 200728) +++ config/i386/mmx.md (working copy) @@ -1078,7 +1078,7 @@ (const_int 2) (const_int 10) (const_int 3) (const_int 11)])))] "TARGET_MMX" - "punpcklbw\t{%2, %0|%0, %2}" + "punpcklbw\t{%2, %0|%0, %k2}" [(set_attr "type" "mmxcvt") (set_attr "mode" "DI")]) @@ -1104,7 +1104,7 @@ (parallel [(const_int 0) (const_int 4) (const_int 1) (const_int 5)])))] "TARGET_MMX" - "punpcklwd\t{%2, %0|%0, %2}" + "punpcklwd\t{%2, %0|%0, %k2}" [(set_attr "type" "mmxcvt") (set_attr "mode" "DI")]) @@ -1130,7 +1130,7 @@ (parallel [(const_int 0) (const_int 2)])))] "TARGET_MMX" - "punpckldq\t{%2, %0|%0, %2}" + "punpckldq\t{%2, %0|%0, %k2}" [(set_attr "type" "mmxcvt") (set_attr "mode" "DI")]) Index: config/i386/sse.md =================================================================== --- config/i386/sse.md (revision 200728) +++ config/i386/sse.md (working copy) @@ -355,6 +355,14 @@ (V8SF "SF") (V4DF "DF") (V4SF "SF") (V2DF "DF")]) +;; Pointer size override for scalar modes (Intel asm dialect) +(define_mode_attr iptr + [(V32QI "b") (V16HI "w") (V8SI "k") (V4DI "q") + (V16QI "b") (V8HI "w") (V4SI "k") (V2DI "q") + (V8SF "k") (V4DF "q") + (V4SF "k") (V2DF "q") + (SF "k") (DF "q")]) + ;; Number of scalar elements in each vector type (define_mode_attr ssescalarnum [(V32QI "32") (V16HI "16") (V8SI "8") (V4DI "4") @@ -511,7 +519,7 @@ (parallel [(const_int 0)])) (const_int 0)))] "TARGET_SSE2" - "%vmovq\t{%1, %0|%0, %1}" + "%vmovq\t{%1, %0|%0, %q1}" [(set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "TI")]) @@ -878,8 +886,8 @@ (const_int 1)))] "TARGET_SSE" "@ - \t{%2, %0|%0, %2} - v\t{%2, %1, %0|%0, %1, %2}" + \t{%2, %0|%0, %2} + v\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") (set_attr "prefix" "orig,vex") @@ -918,8 +926,8 @@ (const_int 
1)))] "TARGET_SSE" "@ - mul\t{%2, %0|%0, %2} - vmul\t{%2, %1, %0|%0, %1, %2}" + mul\t{%2, %0|%0, %2} + vmul\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "ssemul") (set_attr "prefix" "orig,vex") @@ -975,8 +983,8 @@ (const_int 1)))] "TARGET_SSE" "@ - div\t{%2, %0|%0, %2} - vdiv\t{%2, %1, %0|%0, %1, %2}" + div\t{%2, %0|%0, %2} + vdiv\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "ssediv") (set_attr "prefix" "orig,vex") @@ -1004,8 +1012,8 @@ (const_int 1)))] "TARGET_SSE" "@ - rcpss\t{%1, %0|%0, %1} - vrcpss\t{%1, %2, %0|%0, %2, %1}" + rcpss\t{%1, %0|%0, %k1} + vrcpss\t{%1, %2, %0|%0, %2, %k1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") (set_attr "atom_sse_attr" "rcp") @@ -1054,8 +1062,8 @@ (const_int 1)))] "TARGET_SSE" "@ - sqrt\t{%1, %0|%0, %1} - vsqrt\t{%1, %2, %0|%0, %2, %1}" + sqrt\t{%1, %0|%0, %1} + vsqrt\t{%1, %2, %0|%0, %2, %1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") (set_attr "atom_sse_attr" "sqrt") @@ -1092,8 +1100,8 @@ (const_int 1)))] "TARGET_SSE" "@ - rsqrtss\t{%1, %0|%0, %1} - vrsqrtss\t{%1, %2, %0|%0, %2, %1}" + rsqrtss\t{%1, %0|%0, %k1} + vrsqrtss\t{%1, %2, %0|%0, %2, %k1}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") (set_attr "prefix" "orig,vex") @@ -1156,8 +1164,8 @@ (const_int 1)))] "TARGET_SSE" "@ - \t{%2, %0|%0, %2} - v\t{%2, %1, %0|%0, %1, %2}" + \t{%2, %0|%0, %2} + v\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sse") (set_attr "btver2_sse_attr" "maxmin") @@ -1588,7 +1596,7 @@ (match_dup 1) (const_int 1)))] "TARGET_AVX" - "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}" + "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssecmp") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -1635,8 +1643,8 @@ (const_int 1)))] "TARGET_SSE" "@ - cmp%D3\t{%2, %0|%0, %2} - vcmp%D3\t{%2, %1, %0|%0, %1, %2}" + cmp%D3\t{%2, %0|%0, %2} + vcmp%D3\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "ssecmp") (set_attr 
"length_immediate" "1,*") @@ -1653,7 +1661,7 @@ (match_operand: 1 "nonimmediate_operand" "xm") (parallel [(const_int 0)]))))] "SSE_FLOAT_MODE_P (mode)" - "%vcomi\t{%1, %0|%0, %1}" + "%vcomi\t{%1, %0|%0, %1}" [(set_attr "type" "ssecomi") (set_attr "prefix" "maybe_vex") (set_attr "prefix_rep" "0") @@ -1673,7 +1681,7 @@ (match_operand: 1 "nonimmediate_operand" "xm") (parallel [(const_int 0)]))))] "SSE_FLOAT_MODE_P (mode)" - "%vucomi\t{%1, %0|%0, %1}" + "%vucomi\t{%1, %0|%0, %1}" [(set_attr "type" "ssecomi") (set_attr "prefix" "maybe_vex") (set_attr "prefix_rep" "0") @@ -2246,8 +2254,8 @@ (const_int 1)))] "TARGET_FMA" "@ - vfmadd132\t{%2, %3, %0|%0, %3, %2} - vfmadd213\t{%3, %2, %0|%0, %2, %3}" + vfmadd132\t{%2, %3, %0|%0, %3, %2} + vfmadd213\t{%3, %2, %0|%0, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "")]) @@ -2263,8 +2271,8 @@ (const_int 1)))] "TARGET_FMA" "@ - vfmsub132\t{%2, %3, %0|%0, %3, %2} - vfmsub213\t{%3, %2, %0|%0, %2, %3}" + vfmsub132\t{%2, %3, %0|%0, %3, %2} + vfmsub213\t{%3, %2, %0|%0, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "")]) @@ -2280,8 +2288,8 @@ (const_int 1)))] "TARGET_FMA" "@ - vfnmadd132\t{%2, %3, %0|%0, %3, %2} - vfnmadd213\t{%3, %2, %0|%0, %2, %3}" + vfnmadd132\t{%2, %3, %0|%0, %3, %2} + vfnmadd213\t{%3, %2, %0|%0, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "")]) @@ -2298,8 +2306,8 @@ (const_int 1)))] "TARGET_FMA" "@ - vfnmsub132\t{%2, %3, %0|%0, %3, %2} - vfnmsub213\t{%3, %2, %0|%0, %2, %3}" + vfnmsub132\t{%2, %3, %0|%0, %3, %2} + vfnmsub213\t{%3, %2, %0|%0, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "")]) @@ -2328,7 +2336,7 @@ (match_operand:VF_128 4 "const0_operand") (const_int 1)))] "TARGET_FMA4" - "vfmadd\t{%3, %2, %1, %0|%0, %1, %2, %3}" + "vfmadd\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "")]) @@ -2343,7 +2351,7 @@ (match_operand:VF_128 4 "const0_operand") (const_int 1)))] "TARGET_FMA4" - "vfmsub\t{%3, %2, %1, %0|%0, %1, %2, %3}" + 
"vfmsub\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "")]) @@ -2358,7 +2366,7 @@ (match_operand:VF_128 4 "const0_operand") (const_int 1)))] "TARGET_FMA4" - "vfnmadd\t{%3, %2, %1, %0|%0, %1, %2, %3}" + "vfnmadd\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "")]) @@ -2374,7 +2382,7 @@ (match_operand:VF_128 4 "const0_operand") (const_int 1)))] "TARGET_FMA4" - "vfnmsub\t{%3, %2, %1, %0|%0, %1, %2, %3}" + "vfnmsub\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "")]) @@ -2403,7 +2411,7 @@ UNSPEC_FIX_NOTRUNC) (parallel [(const_int 0) (const_int 1)])))] "TARGET_SSE" - "cvtps2pi\t{%1, %0|%0, %1}" + "cvtps2pi\t{%1, %0|%0, %q1}" [(set_attr "type" "ssecvt") (set_attr "unit" "mmx") (set_attr "mode" "DI")]) @@ -2414,7 +2422,7 @@ (fix:V4SI (match_operand:V4SF 1 "nonimmediate_operand" "xm")) (parallel [(const_int 0) (const_int 1)])))] "TARGET_SSE" - "cvttps2pi\t{%1, %0|%0, %1}" + "cvttps2pi\t{%1, %0|%0, %q1}" [(set_attr "type" "ssecvt") (set_attr "unit" "mmx") (set_attr "prefix_rep" "0") @@ -2472,7 +2480,7 @@ (parallel [(const_int 0)]))] UNSPEC_FIX_NOTRUNC))] "TARGET_SSE" - "%vcvtss2si\t{%1, %0|%0, %1}" + "%vcvtss2si\t{%1, %0|%0, %k1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "bdver1_decode" "double,double") @@ -2485,7 +2493,7 @@ (unspec:SI [(match_operand:SF 1 "nonimmediate_operand" "x,m")] UNSPEC_FIX_NOTRUNC))] "TARGET_SSE" - "%vcvtss2si\t{%1, %0|%0, %1}" + "%vcvtss2si\t{%1, %0|%0, %k1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "amdfam10_decode" "double,double") @@ -2502,7 +2510,7 @@ (parallel [(const_int 0)]))] UNSPEC_FIX_NOTRUNC))] "TARGET_SSE && TARGET_64BIT" - "%vcvtss2si{q}\t{%1, %0|%0, %1}" + "%vcvtss2si{q}\t{%1, %0|%0, %k1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "bdver1_decode" "double,double") @@ -2515,7 +2523,7 @@ (unspec:DI 
[(match_operand:SF 1 "nonimmediate_operand" "x,m")] UNSPEC_FIX_NOTRUNC))] "TARGET_SSE && TARGET_64BIT" - "%vcvtss2si{q}\t{%1, %0|%0, %1}" + "%vcvtss2si{q}\t{%1, %0|%0, %k1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "amdfam10_decode" "double,double") @@ -2531,7 +2539,7 @@ (match_operand:V4SF 1 "nonimmediate_operand" "x,m") (parallel [(const_int 0)]))))] "TARGET_SSE" - "%vcvttss2si\t{%1, %0|%0, %1}" + "%vcvttss2si\t{%1, %0|%0, %k1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "amdfam10_decode" "double,double") @@ -2547,7 +2555,7 @@ (match_operand:V4SF 1 "nonimmediate_operand" "x,m") (parallel [(const_int 0)]))))] "TARGET_SSE && TARGET_64BIT" - "%vcvttss2si{q}\t{%1, %0|%0, %1}" + "%vcvttss2si{q}\t{%1, %0|%0, %k1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "amdfam10_decode" "double,double") @@ -2733,7 +2741,7 @@ (parallel [(const_int 0)]))] UNSPEC_FIX_NOTRUNC))] "TARGET_SSE2" - "%vcvtsd2si\t{%1, %0|%0, %1}" + "%vcvtsd2si\t{%1, %0|%0, %q1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "bdver1_decode" "double,double") @@ -2747,7 +2755,7 @@ (unspec:SI [(match_operand:DF 1 "nonimmediate_operand" "x,m")] UNSPEC_FIX_NOTRUNC))] "TARGET_SSE2" - "%vcvtsd2si\t{%1, %0|%0, %1}" + "%vcvtsd2si\t{%1, %0|%0, %q1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "amdfam10_decode" "double,double") @@ -2764,7 +2772,7 @@ (parallel [(const_int 0)]))] UNSPEC_FIX_NOTRUNC))] "TARGET_SSE2 && TARGET_64BIT" - "%vcvtsd2si{q}\t{%1, %0|%0, %1}" + "%vcvtsd2si{q}\t{%1, %0|%0, %q1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "bdver1_decode" "double,double") @@ -2777,7 +2785,7 @@ (unspec:DI [(match_operand:DF 1 "nonimmediate_operand" "x,m")] UNSPEC_FIX_NOTRUNC))] "TARGET_SSE2 && TARGET_64BIT" - "%vcvtsd2si{q}\t{%1, %0|%0, %1}" + "%vcvtsd2si{q}\t{%1, %0|%0, %q1}" [(set_attr 
"type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "amdfam10_decode" "double,double") @@ -2793,7 +2801,7 @@ (match_operand:V2DF 1 "nonimmediate_operand" "x,m") (parallel [(const_int 0)]))))] "TARGET_SSE2" - "%vcvttsd2si\t{%1, %0|%0, %1}" + "%vcvttsd2si\t{%1, %0|%0, %q1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "amdfam10_decode" "double,double") @@ -2810,7 +2818,7 @@ (match_operand:V2DF 1 "nonimmediate_operand" "x,m") (parallel [(const_int 0)]))))] "TARGET_SSE2 && TARGET_64BIT" - "%vcvttsd2si{q}\t{%1, %0|%0, %1}" + "%vcvttsd2si{q}\t{%1, %0|%0, %q1}" [(set_attr "type" "sseicvt") (set_attr "athlon_decode" "double,vector") (set_attr "amdfam10_decode" "double,double") @@ -2983,8 +2991,8 @@ "TARGET_SSE2" "@ cvtsd2ss\t{%2, %0|%0, %2} - cvtsd2ss\t{%2, %0|%0, %2} - vcvtsd2ss\t{%2, %1, %0|%0, %1, %2}" + cvtsd2ss\t{%2, %0|%0, %q2} + vcvtsd2ss\t{%2, %1, %0|%0, %1, %q2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecvt") (set_attr "athlon_decode" "vector,double,*") @@ -3006,8 +3014,8 @@ "TARGET_SSE2" "@ cvtss2sd\t{%2, %0|%0, %2} - cvtss2sd\t{%2, %0|%0, %2} - vcvtss2sd\t{%2, %1, %0|%0, %1, %2}" + cvtss2sd\t{%2, %0|%0, %k2} + vcvtss2sd\t{%2, %1, %0|%0, %1, %k2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecvt") (set_attr "amdfam10_decode" "vector,double,*") @@ -3576,7 +3584,7 @@ vmovhlps\t{%2, %1, %0|%0, %1, %2} movlps\t{%H2, %0|%0, %H2} vmovlps\t{%H2, %1, %0|%0, %1, %H2} - %vmovhps\t{%2, %0|%0, %2}" + %vmovhps\t{%2, %0|%q0, %2}" [(set_attr "isa" "noavx,avx,noavx,avx,*") (set_attr "type" "ssemov") (set_attr "prefix" "orig,vex,orig,vex,maybe_vex") @@ -3610,7 +3618,7 @@ (vec_select:V4SF (vec_concat:V8SF (match_operand:V4SF 1 "nonimmediate_operand" " 0,x,0,x,0") - (match_operand:V4SF 2 "nonimmediate_operand" " x,x,m,x,x")) + (match_operand:V4SF 2 "nonimmediate_operand" " x,x,m,m,x")) (parallel [(const_int 0) (const_int 1) (const_int 4) @@ -3619,8 +3627,8 @@ "@ movlhps\t{%2, %0|%0, %2} 
vmovlhps\t{%2, %1, %0|%0, %1, %2} - movhps\t{%2, %0|%0, %2} - vmovhps\t{%2, %1, %0|%0, %1, %2} + movhps\t{%2, %0|%0, %q2} + vmovhps\t{%2, %1, %0|%0, %1, %q2} %vmovlps\t{%2, %H0|%H0, %2}" [(set_attr "isa" "noavx,avx,noavx,avx,*") (set_attr "type" "ssemov") @@ -3944,7 +3952,7 @@ (parallel [(const_int 2) (const_int 3)])))] "TARGET_SSE" "@ - %vmovhps\t{%1, %0|%0, %1} + %vmovhps\t{%1, %0|%q0, %1} %vmovhlps\t{%1, %d0|%d0, %1} %vmovlps\t{%H1, %d0|%d0, %H1}" [(set_attr "type" "ssemov") @@ -3980,8 +3988,8 @@ (match_operand:V2SF 2 "nonimmediate_operand" " m,m,x,x,x")))] "TARGET_SSE" "@ - movhps\t{%2, %0|%0, %2} - vmovhps\t{%2, %1, %0|%0, %1, %2} + movhps\t{%2, %0|%0, %q2} + vmovhps\t{%2, %1, %0|%0, %1, %q2} movlhps\t{%2, %0|%0, %2} vmovlhps\t{%2, %1, %0|%0, %1, %2} %vmovlps\t{%2, %H0|%H0, %2}" @@ -3997,9 +4005,9 @@ (parallel [(const_int 0) (const_int 1)])))] "TARGET_SSE" "@ - %vmovlps\t{%1, %0|%0, %1} + %vmovlps\t{%1, %0|%q0, %1} %vmovaps\t{%1, %0|%0, %1} - %vmovlps\t{%1, %d0|%d0, %1}" + %vmovlps\t{%1, %d0|%d0, %q1}" [(set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "V2SF,V4SF,V2SF")]) @@ -4035,9 +4043,9 @@ "@ shufps\t{$0xe4, %1, %0|%0, %1, 0xe4} vshufps\t{$0xe4, %1, %2, %0|%0, %2, %1, 0xe4} - movlps\t{%2, %0|%0, %2} - vmovlps\t{%2, %1, %0|%0, %1, %2} - %vmovlps\t{%2, %0|%0, %2}" + movlps\t{%2, %0|%0, %q2} + vmovlps\t{%2, %1, %0|%0, %1, %q2} + %vmovlps\t{%2, %0|%q0, %2}" [(set_attr "isa" "noavx,avx,noavx,avx,*") (set_attr "type" "sseshuf,sseshuf,ssemov,ssemov,ssemov") (set_attr "length_immediate" "1,1,*,*,*") @@ -4149,8 +4157,8 @@ "@ movlhps\t{%2, %0|%0, %2} vmovlhps\t{%2, %1, %0|%0, %1, %2} - movhps\t{%2, %0|%0, %2} - vmovhps\t{%2, %1, %0|%0, %1, %2}" + movhps\t{%2, %0|%0, %q2} + vmovhps\t{%2, %1, %0|%0, %1, %q2}" [(set_attr "isa" "noavx,avx,noavx,avx") (set_attr "type" "ssemov") (set_attr "prefix" "orig,vex,orig,vex") @@ -4625,7 +4633,7 @@ %vmovddup\t{%H1, %0|%0, %H1} movlpd\t{%H1, %0|%0, %H1} vmovlpd\t{%H1, %2, %0|%0, %2, %H1} - %vmovhpd\t{%1, 
%0|%0, %1}" + %vmovhpd\t{%1, %0|%q0, %1}" [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*") (set_attr "type" "sselog,sselog,sselog,ssemov,ssemov,ssemov") (set_attr "prefix_data16" "*,*,*,1,*,1") @@ -4723,9 +4731,9 @@ "@ unpcklpd\t{%2, %0|%0, %2} vunpcklpd\t{%2, %1, %0|%0, %1, %2} - %vmovddup\t{%1, %0|%0, %1} - movhpd\t{%2, %0|%0, %2} - vmovhpd\t{%2, %1, %0|%0, %1, %2} + %vmovddup\t{%1, %0|%0, %q1} + movhpd\t{%2, %0|%0, %q2} + vmovhpd\t{%2, %1, %0|%0, %1, %q2} %vmovlpd\t{%2, %H0|%H0, %2}" [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*") (set_attr "type" "sselog,sselog,sselog,ssemov,ssemov,ssemov") @@ -4963,7 +4971,7 @@ "!TARGET_SSE2 && TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "@ - movhps\t{%1, %0|%0, %1} + movhps\t{%1, %0|%q0, %1} movhlps\t{%1, %0|%0, %1} movlps\t{%H1, %0|%0, %H1}" [(set_attr "type" "ssemov") @@ -5012,7 +5020,7 @@ "@ movlps\t{%1, %0|%0, %1} movaps\t{%1, %0|%0, %1} - movlps\t{%1, %0|%0, %1}" + movlps\t{%1, %0|%0, %q1}" [(set_attr "type" "ssemov") (set_attr "mode" "V2SF,V4SF,V2SF")]) @@ -5151,9 +5159,9 @@ "@ movsd\t{%2, %0|%0, %2} vmovsd\t{%2, %1, %0|%0, %1, %2} - movlpd\t{%2, %0|%0, %2} - vmovlpd\t{%2, %1, %0|%0, %1, %2} - %vmovlpd\t{%2, %0|%0, %2} + movlpd\t{%2, %0|%0, %q2} + vmovlpd\t{%2, %1, %0|%0, %1, %q2} + %vmovlpd\t{%2, %0|%q0, %2} shufpd\t{$2, %1, %0|%0, %1, 2} movhps\t{%H1, %0|%0, %H1} vmovhps\t{%H1, %2, %0|%0, %2, %H1} @@ -7547,8 +7555,8 @@ punpcklqdq\t{%2, %0|%0, %2} vpunpcklqdq\t{%2, %1, %0|%0, %1, %2} movlhps\t{%2, %0|%0, %2} - movhps\t{%2, %0|%0, %2} - vmovhps\t{%2, %1, %0|%0, %1, %2}" + movhps\t{%2, %0|%0, %q2} + vmovhps\t{%2, %1, %0|%0, %1, %q2}" [(set_attr "isa" "sse2_noavx,avx,noavx,noavx,avx") (set_attr "type" "sselog,sselog,ssemov,ssemov,ssemov") (set_attr "prefix" "orig,vex,orig,orig,vex") @@ -10201,7 +10209,7 @@ (match_operand:VF_128 2 "const0_operand") (const_int 1)))] "TARGET_XOP" - "vfrcz\t{%1, %0|%0, %1}" + "vfrcz\t{%1, %0|%0, %1}" [(set_attr "type" "ssecvt1") (set_attr "mode" "")]) @@ -10451,20 +10459,22 @@ 
(match_operand: 1 "nonimmediate_operand" "xm") (parallel [(const_int 0)]))))] "TARGET_AVX2" - "vpbroadcast\t{%1, %0|%0, %1}" + "vpbroadcast\t{%1, %0|%0, %1}" [(set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "mode" "")]) (define_insn "avx2_pbroadcast_1" - [(set (match_operand:VI_256 0 "register_operand" "=x") + [(set (match_operand:VI_256 0 "register_operand" "=x,x") (vec_duplicate:VI_256 (vec_select: - (match_operand:VI_256 1 "nonimmediate_operand" "xm") + (match_operand:VI_256 1 "nonimmediate_operand" "m,x") (parallel [(const_int 0)]))))] "TARGET_AVX2" - "vpbroadcast\t{%x1, %0|%0, %x1}" + "@ + vpbroadcast\t{%1, %0|%0, %1} + vpbroadcast\t{%x1, %0|%0, %x1}" [(set_attr "type" "ssemov") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") @@ -10619,7 +10629,7 @@ case 0: case 1: operands[1] = adjust_address_nv (operands[1], SFmode, elt * 4); - return "vbroadcastss\t{%1, %0|%0, %1}"; + return "vbroadcastss\t{%1, %0|%0, %k1}"; case 2: operands[2] = GEN_INT (elt * 0x55); return "vpermilps\t{%2, %1, %0|%0, %1, %2}"; Index: testsuite/gcc.target/i386/pr57807.c =================================================================== --- testsuite/gcc.target/i386/pr57807.c (revision 0) +++ testsuite/gcc.target/i386/pr57807.c (working copy) @@ -0,0 +1,11 @@ +/* { dg-do assemble } */ +/* { dg-options "-msse2 -masm=intel" } */ +/* { dg-require-effective-target sse2 } */ +/* { dg-require-effective-target masm_intel } */ + +typedef double __v2df __attribute__((__vector_size__(16))); +typedef double __m128d __attribute__((__vector_size__(16), __may_alias__)); + +__m128d _mm_unpacklo_pd(__m128d __A, __m128d __B) { + return (__m128d)__builtin_ia32_unpcklpd((__v2df)__A, (__v2df)__B); +}
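
A note for readers of the sse.md hunk that adds the iptr mode attribute: the attribute is essentially a lookup from a machine mode to an operand modifier letter (b, w, k or q), and under -masm=intel that letter selects the size keyword printed in front of a memory operand. The following is a minimal C sketch of that lookup, not GCC code. The table contents are copied from the patch; the struct, the function names iptr_modifier and intel_size_keyword, and the upper-case spelling of the keywords are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration type, not part of GCC: one row per entry
   of the iptr mode attribute added to sse.md by this patch.  */
struct iptr_entry
{
  const char *mode;  /* Machine mode name as written in sse.md.  */
  char modifier;     /* Operand modifier letter: b, w, k or q.  */
};

/* Table contents copied from the (define_mode_attr iptr ...) hunk.  */
static const struct iptr_entry iptr_table[] = {
  { "V32QI", 'b' }, { "V16HI", 'w' }, { "V8SI", 'k' }, { "V4DI", 'q' },
  { "V16QI", 'b' }, { "V8HI", 'w' }, { "V4SI", 'k' }, { "V2DI", 'q' },
  { "V8SF", 'k' }, { "V4DF", 'q' },
  { "V4SF", 'k' }, { "V2DF", 'q' },
  { "SF", 'k' }, { "DF", 'q' },
};

/* Return the modifier letter for MODE, or 0 if MODE has no entry.  */
char
iptr_modifier (const char *mode)
{
  size_t i;

  assert (mode != NULL);
  for (i = 0; i < sizeof iptr_table / sizeof iptr_table[0]; i++)
    if (strcmp (iptr_table[i].mode, mode) == 0)
      return iptr_table[i].modifier;
  return 0;
}

/* Return the Intel-syntax size keyword that corresponds to MODIFIER,
   or NULL for an unknown letter.  The spelling is an assumption for
   illustration; only the sizes (1, 2, 4 and 8 bytes) matter here.  */
const char *
intel_size_keyword (char modifier)
{
  switch (modifier)
    {
    case 'b': return "BYTE PTR";   /* 1-byte scalar element.  */
    case 'w': return "WORD PTR";   /* 2-byte scalar element.  */
    case 'k': return "DWORD PTR";  /* 4-byte scalar element.  */
    case 'q': return "QWORD PTR";  /* 8-byte scalar element.  */
    default:  return NULL;
    }
}
```

Read this way, a V2DF or DF pattern picks q, so its memory operand is decorated as an 8-byte access, while V4SF and SF pick k for a 4-byte access. This is why the scalar patterns in the patch use the attribute where the mode varies and a literal %k or %q where the mode is fixed.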