From patchwork Sun Sep 30 16:29:29 2012
X-Patchwork-Submitter: Marc Glisse
X-Patchwork-Id: 188171
Date: Sun, 30 Sep 2012 18:29:29 +0200 (CEST)
From: Marc Glisse
To: Eric Botcazou
cc: gcc-patches@gcc.gnu.org
Subject: Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector
In-Reply-To: <1494787.hVs6T5Xn6m@polaris>
References: <1494787.hVs6T5Xn6m@polaris>

On Sat, 29 
Sep 2012, Eric Botcazou wrote:

>> this patch lets the compiler try to rewrite:
>>
>> (vec_concat (vec_select x [a]) (vec_select x [b]))
>>
>> as:
>>
>> vec_select x [a b]
>>
>> or even just "x" if appropriate.
[...]
> OK, but:
>
> 1) Always add a comment describing the simplification when you add one,
>
> 2) The condition:
>
>> +  if (GET_MODE (XEXP (trueop0, 0)) == mode
>> +      && INTVAL (XVECEXP (XEXP (trueop1, 1), 0, 0))
>> +         - INTVAL (XVECEXP (XEXP (trueop0, 1), 0, 0)) == 1)
>> +    return XEXP (trueop0, 0);
>
> can be simplified: if GET_MODE (XEXP (trueop0, 0)) == mode, then XEXP
> (trueop0, 0) is a 2-element vector so the only possible case is (0,1).
> That would probably even be more correct since you don't test CONST_INT_P for
> the indices, while the test is done in the VEC_SELECT case.

It looks like I was trying to be clever by replacing 2 understandable
tests with a single more obscure one, bad idea.

> Why not generalizing to all kinds of VEC_SELECTs instead of just scalar ones?

Ok, I changed the patch a bit to handle arbitrary VEC_SELECTs, and moved
the identity recognition to VEC_SELECT handling (where it belonged).

Testing with non-scalar VEC_SELECTs was limited though, because they are
not that easy to generate. Also, the identity case is the only one where
it actually optimized. To handle more cases, I'd have to look through
several layers of VEC_SELECTs, which gets a bit complicated (for instance,
the permutation 0,1,3,2 will appear as a vec_concat of a
vec_select(v,[0,1]) and a vec_select(vec_select(v,[2,3]),[1,0]), or worse
with a vec_concat in the middle). It also didn't optimize 3,2,3,2,
possibly because that meant substituting the same rtx twice (I didn't go
that far in gdb). Then there is also the vec_duplicate case (I should try
to replace vec_duplicate with vec_concat in simplify-rtx to see what
happens...). Still, the identity case is nice to have.

Thanks for your comments.

bootstrap+testsuite on x86_64-linux-gnu with default languages.

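The index bookkeeping discussed above can be illustrated with a minimal sketch in plain C. This is a toy model and not GCC code: a vec_select is represented only by its list of integer indices, and the function names (selection_is_identity, merge_selections, compose_selections) are hypothetical, chosen for this illustration. It assumes all indices are known constants, which is what the CONST_INT_P tests guarantee in the real simplifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC code: (vec_select x [s0 s1 ...]) is represented
   only by the index array {s0, s1, ...}.  */

/* Return true if selecting SEL from a vector of SRC_LEN elements gives
   back the vector itself, i.e. SEL is exactly 0, 1, ..., SRC_LEN - 1.  */
static bool
selection_is_identity (const int *sel, size_t len, size_t src_len)
{
  if (len != src_len)
    return false;
  for (size_t i = 0; i < len; i++)
    if (sel[i] != (int) i)
      return false;
  return true;
}

/* Model of
     (vec_concat (vec_select x A) (vec_select x B)) -> (vec_select x [A B]).
   Writes the concatenated index list to OUT and returns its length.  */
static size_t
merge_selections (const int *a, size_t len_a,
                  const int *b, size_t len_b,
                  int *out, size_t out_cap)
{
  assert (len_a + len_b <= out_cap);
  for (size_t i = 0; i < len_a; i++)
    out[i] = a[i];
  for (size_t i = 0; i < len_b; i++)
    out[len_a + i] = b[i];
  return len_a + len_b;
}

/* Model of looking through one nested layer:
     (vec_select (vec_select x INNER) OUTER)
       -> (vec_select x [INNER[OUTER[0]] INNER[OUTER[1]] ...]).
   OUT must have room for OUTER_LEN indices.  */
static void
compose_selections (const int *inner, size_t inner_len,
                    const int *outer, size_t outer_len,
                    int *out)
{
  for (size_t i = 0; i < outer_len; i++)
    {
      assert (outer[i] >= 0 && (size_t) outer[i] < inner_len);
      out[i] = inner[outer[i]];
    }
}
```

In this model, the permutation 0,1,3,2 mentioned above is obtained by first composing [2,3] with [1,0] to get [3,2], then merging [0,1] with [3,2]. Merging [0,1] with [2,3] instead gives an index list that selection_is_identity accepts for a 4-element source, which corresponds to the identity case.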
2012-09-09  Marc Glisse

gcc/
	* simplify-rtx.c (simplify_binary_operation_1) <VEC_SELECT>: Detect the
	identity.
	<VEC_CONCAT>: Handle VEC_SELECTs from the same vector.

gcc/testsuite/
	* gcc.target/i386/vect-rebuild.c: New testcase.

Index: gcc/testsuite/gcc.target/i386/vect-rebuild.c
===================================================================
--- gcc/testsuite/gcc.target/i386/vect-rebuild.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/vect-rebuild.c	(revision 0)
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mavx -fno-tree-forwprop" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+typedef double v4df __attribute__ ((__vector_size__ (32)));
+
+v2df f1 (v2df x)
+{
+  v2df xx = { x[0], x[1] };
+  return xx;
+}
+
+v4df f2 (v4df x)
+{
+  v4df xx = { x[0], x[1], x[2], x[3] };
+  return xx;
+}
+
+v2df g (v2df x)
+{
+  v2df xx = { x[1], x[0] };
+  return xx;
+}
+
+v2df h (v4df x)
+{
+  v2df xx = { x[2], x[3] };
+  return xx;
+}
+
+/* { dg-final { scan-assembler-not "unpck" } } */
+/* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
+/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */

Property changes on: gcc/testsuite/gcc.target/i386/vect-rebuild.c
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c	(revision 191865)
+++ gcc/simplify-rtx.c	(working copy)
@@ -3239,20 +3239,37 @@ simplify_binary_operation_1 (enum rtx_co
 		      rtx x = XVECEXP (trueop1, 0, i);
 
 		      gcc_assert (CONST_INT_P (x));
 		      RTVEC_ELT (v, i) = CONST_VECTOR_ELT (trueop0,
 							   INTVAL (x));
 		    }
 
 		  return gen_rtx_CONST_VECTOR (mode, v);
 		}
 
+	      /* Recognize the identity.  */
+	      if (GET_MODE (trueop0) == mode)
+		{
+		  bool maybe_ident = true;
+		  for (int i = 0; i < XVECLEN (trueop1, 0); i++)
+		    {
+		      rtx j = XVECEXP (trueop1, 0, i);
+		      if (!CONST_INT_P (j) || INTVAL (j) != i)
+			{
+			  maybe_ident = false;
+			  break;
+			}
+		    }
+		  if (maybe_ident)
+		    return trueop0;
+		}
+
 	      /* If we build {a,b} then permute it, build the result directly.  */
 	      if (XVECLEN (trueop1, 0) == 2
 		  && CONST_INT_P (XVECEXP (trueop1, 0, 0))
 		  && CONST_INT_P (XVECEXP (trueop1, 0, 1))
 		  && GET_CODE (trueop0) == VEC_CONCAT
 		  && GET_CODE (XEXP (trueop0, 0)) == VEC_CONCAT
 		  && GET_MODE (XEXP (trueop0, 0)) == mode
 		  && GET_CODE (XEXP (trueop0, 1)) == VEC_CONCAT
 		  && GET_MODE (XEXP (trueop0, 1)) == mode)
 		{
@@ -3364,20 +3381,38 @@ simplify_binary_operation_1 (enum rtx_co
 		    if (!VECTOR_MODE_P (op1_mode))
 		      RTVEC_ELT (v, i) = trueop1;
 		    else
 		      RTVEC_ELT (v, i) = CONST_VECTOR_ELT (trueop1,
 							   i - in_n_elts);
 		  }
 	      }
 
 	    return gen_rtx_CONST_VECTOR (mode, v);
 	  }
+
+	/* Try to merge 2 VEC_SELECTs from the same vector into a single one.  */
+	if (GET_CODE (trueop0) == VEC_SELECT
+	    && GET_CODE (trueop1) == VEC_SELECT
+	    && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0)))
+	  {
+	    rtx par0 = XEXP (trueop0, 1);
+	    rtx par1 = XEXP (trueop1, 1);
+	    int len0 = XVECLEN (par0, 0);
+	    int len1 = XVECLEN (par1, 0);
+	    rtvec vec = rtvec_alloc (len0 + len1);
+	    for (int i = 0; i < len0; i++)
+	      RTVEC_ELT (vec, i) = XVECEXP (par0, 0, i);
+	    for (int i = 0; i < len1; i++)
+	      RTVEC_ELT (vec, len0 + i) = XVECEXP (par1, 0, i);
+	    return simplify_gen_binary (VEC_SELECT, mode, XEXP (trueop0, 0),
+					gen_rtx_PARALLEL (VOIDmode, vec));
+	  }
       }
       return 0;
 
     default:
       gcc_unreachable ();
     }
 
   return 0;
 }