From patchwork Mon Apr 18 16:07:50 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 611796
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Mon, 18 Apr 2016 09:07:50 -0700
From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Cc: Uros Bizjak
Subject: [PATCH] PR target/70708: Suboptimal code generated when using _mm_set_sd (X64)
Message-ID: <20160418160750.GA31117@intel.com>

"movq" should be used to load a double into an xmm register with
zero_extend:

(set (reg:V2DF 90)
     (vec_concat:V2DF (reg/v:DF 88 [ d ])
                      (const_double:DF 0.0 [0x0.0p+0])))

Unlike "movsd", which zeroes the upper half only when loading from
memory, "movq" zeroes it for both memory and xmm register sources.

OK for trunk if there is no regression?

H.J.
---
gcc/

	PR target/70708
	* config/i386/sse.md (sse2_loadlpd): Accept load from "xm"
	and replace %vmovsd with "%vmovq".
	(vec_concatv2df): Likewise.

gcc/testsuite/

	PR target/70708
	* gcc.target/i386/pr70708.c: New test.
---
 gcc/config/i386/sse.md                  | 12 ++++++------
 gcc/testsuite/gcc.target/i386/pr70708.c | 14 ++++++++++++++
 2 files changed, 20 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr70708.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 1ffb3b9..845ef56 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8863,14 +8863,14 @@
 	  "=x,x,x,x,x,x,x,x,m,m ,m")
 	(vec_concat:V2DF
 	  (match_operand:DF 2 "nonimmediate_operand"
-	  " m,m,m,x,x,0,0,x,x,*f,r")
+	  "xm,m,m,x,x,0,0,x,x,*f,r")
 	  (vec_select:DF
 	    (match_operand:V2DF 1 "vector_move_operand"
 	  " C,0,x,0,x,x,o,o,0,0 ,0")
 	    (parallel [(const_int 1)]))))]
   "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "@
-   %vmovsd\t{%2, %0|%0, %2}
+   %vmovq\t{%2, %0|%0, %2}
    movlpd\t{%2, %0|%0, %2}
    vmovlpd\t{%2, %1, %0|%0, %1, %2}
    movsd\t{%2, %0|%0, %2}
@@ -8955,10 +8955,10 @@
    (set_attr "mode" "V2DF,DF,DF")])
 
 (define_insn "vec_concatv2df"
-  [(set (match_operand:V2DF 0 "register_operand" "=x,x,v,x,v,x,x,v,x,x")
+  [(set (match_operand:V2DF 0 "register_operand" "=x,x,v,x,v,x,x, v,x,x")
 	(vec_concat:V2DF
-	  (match_operand:DF 1 "nonimmediate_operand" " 0,x,v,m,m,0,x,m,0,0")
-	  (match_operand:DF 2 "vector_move_operand" " x,x,v,1,1,m,m,C,x,m")))]
+	  (match_operand:DF 1 "nonimmediate_operand" " 0,x,v,m,m,0,x,xm,0,0")
+	  (match_operand:DF 2 "vector_move_operand" " x,x,v,1,1,m,m, C,x,m")))]
   "TARGET_SSE
    && (!(MEM_P (operands[1]) && MEM_P (operands[2]))
        || (TARGET_SSE3 && rtx_equal_p (operands[1], operands[2])))"
@@ -8970,7 +8970,7 @@
    vmovddup\t{%1, %0|%0, %1}
    movhpd\t{%2, %0|%0, %2}
    vmovhpd\t{%2, %1, %0|%0, %1, %2}
-   %vmovsd\t{%1, %0|%0, %1}
+   %vmovq\t{%1, %0|%0, %1}
    movlhps\t{%2, %0|%0, %2}
    movhps\t{%2, %0|%0, %2}"
   [(set_attr "isa" "sse2_noavx,avx,avx512vl,sse3,avx512vl,sse2_noavx,avx,sse2,noavx,noavx")
diff --git a/gcc/testsuite/gcc.target/i386/pr70708.c b/gcc/testsuite/gcc.target/i386/pr70708.c
new file mode 100644
index 0000000..2219e61
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70708.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2" } */
+
+typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
+
+__m128d
+foo (double d)
+{
+  return __extension__ (__m128d){ d, 0.0 };
+}
+
+/* { dg-final { scan-assembler-times "movq\[ \\t\]+\[^\n\]*%xmm" 1 } } */
+/* { dg-final { scan-assembler-not "movsd\[ \\t\]+\[^\n\]*%xmm" } } */
+/* { dg-final { scan-assembler-not "\\(%\[er\]sp\\)" { target { ! ia32 } }} } */