From patchwork Tue Mar 19 15:47:36 2013
X-Patchwork-Submitter: Marc Glisse
X-Patchwork-Id: 229099
Date: Tue, 19 Mar 2013 16:47:36 +0100 (CET)
From: Marc Glisse
To: gcc-patches@gcc.gnu.org
Subject: [RTL, i386] Use subreg instead of UNSPEC_CAST

Hello,

the following patch passes bootstrap+testsuite on x86_64-linux-gnu. I don't
see any particular reason to forbid vector subregs of vectors, since we can
already do it through a scalar. And not using unspecs helps avoid
unnecessary copies.

2013-01-03  Marc Glisse

        PR target/50829
gcc/
        * config/i386/sse.md (enum unspec): Remove UNSPEC_CAST.
        (avx__): Use subreg.
        * emit-rtl.c (validate_subreg): Allow vector-vector subregs.

gcc/testsuite/
        * gcc.target/i386/pr50829.c: New file.

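For readers who want the emit-rtl.c change in isolation, here is a minimal toy model in plain C. It is only a sketch and not GCC code: the `toy_mode` enum and every `toy_*` function are hypothetical stand-ins for GCC's machine modes, `VECTOR_MODE_P` and `GET_MODE_INNER`. It mirrors only the three vector-related branches visible in the validate_subreg hunk of this patch. In that hunk a matching branch is an empty statement, so the mode pair skips the floating-point size-change test that follows in the same else-if chain; whatever validate_subreg checks after that chain is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a handful of machine modes (not GCC's enum).  */
enum toy_mode { TOY_NONE, TOY_SF, TOY_DF, TOY_V4SF, TOY_V8SF, TOY_V2DF, TOY_V4DF };

/* Toy analogue of VECTOR_MODE_P.  */
static bool toy_vector_mode_p(enum toy_mode m)
{
  return m >= TOY_V4SF;
}

/* Toy analogue of GET_MODE_INNER: element mode of a vector mode.
   Scalars get the TOY_NONE sentinel (an assumption of this sketch).  */
static enum toy_mode toy_mode_inner(enum toy_mode m)
{
  switch (m) {
  case TOY_V4SF: case TOY_V8SF: return TOY_SF;
  case TOY_V2DF: case TOY_V4DF: return TOY_DF;
  default: return TOY_NONE;
  }
}

/* True when (subreg:OMODE (reg:IMODE)) would hit one of the empty
   branches and so skip the float size-change restriction.  */
static bool toy_skips_float_size_check(enum toy_mode omode, enum toy_mode imode)
{
  /* Existing branch: take the element mode out of a vector.  */
  if (toy_vector_mode_p(imode) && toy_mode_inner(imode) == omode)
    return true;
  /* Existing branch: paradoxical vector subreg of a scalar.  */
  if (toy_vector_mode_p(omode) && toy_mode_inner(omode) == imode)
    return true;
  /* Branch added by the patch: vector of vector, same element mode.  */
  if (toy_vector_mode_p(omode) && toy_vector_mode_p(imode)
      && toy_mode_inner(omode) == toy_mode_inner(imode))
    return true;
  return false;
}

/* A few spot checks of the toy rule.  */
static void toy_self_check(void)
{
  assert(toy_skips_float_size_check(TOY_V4DF, TOY_V2DF));  /* new case */
  assert(toy_skips_float_size_check(TOY_V4SF, TOY_SF));    /* paradoxical */
  assert(!toy_skips_float_size_check(TOY_V8SF, TOY_V2DF)); /* SF vs DF */
}
```

Calling `toy_self_check()` from a small `main` exercises the three cases. The V4DF-of-V2DF pair corresponds to the 128-bit to 256-bit cast used in the new testcase, and it is accepted only by the added branch; a pair with different element modes, such as V8SF of V2DF, is still left to the floating-point size test.
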
Index: gcc/testsuite/gcc.target/i386/pr50829.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr50829.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr50829.c (revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -mavx" } */
+
+#include
+
+__m256d
+concat (__m128d x)
+{
+  __m256d z = _mm256_castpd128_pd256 (x);
+  return _mm256_insertf128_pd (z, x, 1);
+}
+
+/* { dg-final { scan-assembler-not "vmov" } } */

Property changes on: gcc/testsuite/gcc.target/i386/pr50829.c
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: gcc/config/i386/sse.md
===================================================================
--- gcc/config/i386/sse.md (revision 196633)
+++ gcc/config/i386/sse.md (working copy)
@@ -66,21 +66,20 @@
   UNSPEC_AESKEYGENASSIST
 
   ;; For PCLMUL support
   UNSPEC_PCLMUL
 
   ;; For AVX support
   UNSPEC_PCMP
   UNSPEC_VPERMIL
   UNSPEC_VPERMIL2
   UNSPEC_VPERMIL2F128
-  UNSPEC_CAST
   UNSPEC_VTESTP
   UNSPEC_VCVTPH2PS
   UNSPEC_VCVTPS2PH
 
   ;; For AVX2 support
   UNSPEC_VPERMVAR
   UNSPEC_VPERMTI
   UNSPEC_GATHER
   UNSPEC_VSIBADDR
 ])
@@ -11089,23 +11088,22 @@
   "TARGET_AVX"
   "vmaskmov\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "btver2_decode" "vector")
    (set_attr "mode" "")])
 
 (define_insn_and_split "avx__"
   [(set (match_operand:AVX256MODE2P 0 "nonimmediate_operand" "=x,m")
-        (unspec:AVX256MODE2P
-          [(match_operand: 1 "nonimmediate_operand" "xm,x")]
-          UNSPEC_CAST))]
+        (subreg:AVX256MODE2P
+          (match_operand: 1 "nonimmediate_operand" "xm,x") 0))]
   "TARGET_AVX"
   "#"
   "&& reload_completed"
   [(const_int 0)]
 {
   rtx op0 = operands[0];
   rtx op1 = operands[1];
   if (REG_P (op0))
     op0 = gen_rtx_REG (mode, REGNO (op0));
   else
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c (revision 196633)
+++ gcc/emit-rtl.c (working copy)
@@ -707,20 +707,23 @@ validate_subreg (enum machine_mode omode
   else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
            && GET_MODE_INNER (imode) == omode)
     ;
   /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs,
      i.e. (subreg:V4SF (reg:SF) 0).  This surely isn't the cleanest way to
      represent this.  It's questionable if this ought to be represented at
      all -- why can't this all be hidden in post-reload splitters that make
      arbitrarily mode changes to the registers themselves.  */
   else if (VECTOR_MODE_P (omode) && GET_MODE_INNER (omode) == imode)
     ;
+  else if (VECTOR_MODE_P (omode) && VECTOR_MODE_P (imode)
+           && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
+    ;
   /* Subregs involving floating point modes are not allowed to
      change size.  Therefore (subreg:DI (reg:DF) 0) is fine, but
      (subreg:SI (reg:DF) 0) isn't.  */
   else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))
     {
       if (! (isize == osize
              /* LRA can use subreg to store a floating point value in an
                 integer mode.  Although the floating point and the
                 integer modes need the same number of hard registers, the
                 size of floating point mode can be less than the