From patchwork Sat Aug 18 09:59:50 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Glisse X-Patchwork-Id: 178441 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id B058E2C00A0 for ; Sat, 18 Aug 2012 20:00:14 +1000 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1345888815; h=Comment: DomainKey-Signature:Received:Received:Received:Received:Received: Date:From:To:cc:Subject:In-Reply-To:Message-ID:References: User-Agent:MIME-Version:Content-Type:Mailing-List:Precedence: List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender: Delivered-To; bh=pvuH8aEfW1S4uS7PVn7Gw0+EMUU=; b=IHlF0T8IvavkFbJ IzNTw0EYANqFvNCwhYDxU17mamYQ4zUCh1//2ix9/d4/Yq1AgHSAedMXc31s2lMH la18WU/PWIGCgrg/EpWxZt8vxqLHYb9KSlQN4aP3WssW/3yBWqt+2iIWQbl9VGat tvIQZlfI0MuD+88J8Mj4YnjOycMI= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:Date:From:To:cc:Subject:In-Reply-To:Message-ID:References:User-Agent:MIME-Version:Content-Type:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=sUS1UODQ4dlBpR/WVtVX68m0zwRx/5pNDT4PHd6QtUHWzfwAUN9GI2I7JQdhR1 Ef51CqcxQiIxtMX+fNO8SNWRY/f/7jUePHjLOm3zq8NvXD2f3Ojra5xX9//g6WMl rrsYAyD6Y+USD6aHyXxN4qg+RhFMkiIDUtND3kB5+nxHY=; Received: (qmail 6682 invoked by alias); 18 Aug 2012 10:00:10 -0000 Received: (qmail 6661 invoked by uid 22791); 18 Aug 2012 10:00:08 -0000 X-SWARE-Spam-Status: No, hits=-7.4 required=5.0 tests=AWL, BAYES_00, KHOP_RCVD_UNTRUST, KHOP_THREADED, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, TW_CF, TW_TM X-Spam-Check-By: sourceware.org Received: from mail4-relais-sop.national.inria.fr (HELO mail4-relais-sop.national.inria.fr) (192.134.164.105) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sat, 18 Aug 2012 09:59:53 +0000 Received: from stedding.saclay.inria.fr ([193.55.250.194]) by mail4-relais-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES128-SHA; 18 Aug 2012 11:59:51 +0200 Received: from glisse (helo=localhost) by stedding.saclay.inria.fr with local-esmtp (Exim 4.80) (envelope-from ) id 1T2fos-0001Wr-GJ; Sat, 18 Aug 2012 11:59:50 +0200 Date: Sat, 18 Aug 2012 11:59:50 +0200 (CEST) From: Marc Glisse To: Richard Guenther , Ramana Radhakrishnan , Jakub Jelinek cc: gcc-patches@gcc.gnu.org Subject: Re: combine permutations in gimple In-Reply-To: Message-ID: References: User-Agent: Alpine 2.02 (DEB 1266 2009-07-14) MIME-Version: 1.0 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hello, here is a new patch (passes bootstrap+regtest), which only combines permutations if the result is the identity. I'll file a PR about more general combinations to link to this conversation and the cost hook proposition. Ok? 2012-08-18 Marc Glisse gcc/ * fold-const.c (fold_ternary_loc): Detect identity permutations. Canonicalize permutations more. * tree-ssa-forwprop.c (is_combined_permutation_identity): New function. (simplify_permutation): Likewise. (ssa_forward_propagate_and_combine): Call it. gcc/testsuite/ * gcc.dg/tree-ssa/forwprop-19.c: New testcase. * gcc.dg/fold-perm.c: Likewise. Index: gcc/fold-const.c =================================================================== --- gcc/fold-const.c (revision 190500) +++ gcc/fold-const.c (working copy) @@ -14148,58 +14148,96 @@ fold_ternary_loc (location_t loc, enum t return fold_build2_loc (loc, PLUS_EXPR, type, const_binop (MULT_EXPR, arg0, arg1), arg2); if (integer_zerop (arg2)) return fold_build2_loc (loc, MULT_EXPR, type, arg0, arg1); return fold_fma (loc, type, arg0, arg1, arg2); case VEC_PERM_EXPR: if (TREE_CODE (arg2) == VECTOR_CST) { - unsigned int nelts = TYPE_VECTOR_SUBPARTS (type), i; + unsigned int nelts = TYPE_VECTOR_SUBPARTS (type), i, mask; unsigned char *sel = XALLOCAVEC (unsigned char, nelts); tree t; bool need_mask_canon = false; + bool all_in_vec0 = true; + bool all_in_vec1 = true; + bool maybe_identity = true; + bool single_arg = (op0 == op1); + bool changed = false; + mask = single_arg ? (nelts - 1) : (2 * nelts - 1); gcc_assert (nelts == VECTOR_CST_NELTS (arg2)); for (i = 0; i < nelts; i++) { tree val = VECTOR_CST_ELT (arg2, i); if (TREE_CODE (val) != INTEGER_CST) return NULL_TREE; - sel[i] = TREE_INT_CST_LOW (val) & (2 * nelts - 1); + sel[i] = TREE_INT_CST_LOW (val) & mask; if (TREE_INT_CST_HIGH (val) || ((unsigned HOST_WIDE_INT) TREE_INT_CST_LOW (val) != sel[i])) need_mask_canon = true; + + if (sel[i] < nelts) + all_in_vec1 = false; + else + all_in_vec0 = false; + + if ((sel[i] & (nelts-1)) != i) + maybe_identity = false; + } + + if (maybe_identity) + { + if (all_in_vec0) + return op0; + if (all_in_vec1) + return op1; } if ((TREE_CODE (arg0) == VECTOR_CST || TREE_CODE (arg0) == CONSTRUCTOR) && (TREE_CODE (arg1) == VECTOR_CST || TREE_CODE (arg1) == CONSTRUCTOR)) { t = fold_vec_perm (type, arg0, arg1, sel); if (t != NULL_TREE) return t; } + if (all_in_vec0) + op1 = op0; + else if (all_in_vec1) + { + op0 = op1; + for (i = 0; i < nelts; i++) + sel[i] -= nelts; + need_mask_canon = true; + } + + if (op0 == op1 && !single_arg) + changed = true; + if (need_mask_canon && arg2 == op2) { tree *tsel = XALLOCAVEC (tree, nelts); tree eltype = TREE_TYPE (TREE_TYPE (arg2)); for (i = 0; i < nelts; i++) tsel[i] = build_int_cst (eltype, sel[i]); - t = build_vector (TREE_TYPE (arg2), tsel); - return build3_loc (loc, VEC_PERM_EXPR, type, op0, op1, t); + op2 = build_vector (TREE_TYPE (arg2), tsel); + changed = true; } + + if (changed) + return build3_loc (loc, VEC_PERM_EXPR, type, op0, op1, op2); } return NULL_TREE; default: return NULL_TREE; } /* switch (code) */ } /* Perform constant folding and related simplification of EXPR. The related simplifications include x*1 => x, x*0 => 0, etc., Index: gcc/tree-ssa-forwprop.c =================================================================== --- gcc/tree-ssa-forwprop.c (revision 190500) +++ gcc/tree-ssa-forwprop.c (working copy) @@ -2570,20 +2570,109 @@ combine_conversions (gimple_stmt_iterato gimple_assign_set_rhs_code (stmt, CONVERT_EXPR); update_stmt (stmt); return remove_prop_source_from_use (op0) ? 2 : 1; } } } return 0; } +/* Determine whether applying the 2 permutations (mask1 then mask2) + gives back one of the input. */ + +static int +is_combined_permutation_identity (tree mask1, tree mask2) +{ + tree mask; + unsigned int nelts, i, j; + bool maybe_identity1 = true; + bool maybe_identity2 = true; + + gcc_checking_assert (TREE_CODE (mask1) == VECTOR_CST + && TREE_CODE (mask2) == VECTOR_CST); + mask = fold_ternary (VEC_PERM_EXPR, TREE_TYPE (mask1), mask1, mask1, mask2); + gcc_assert (TREE_CODE (mask) == VECTOR_CST); + + nelts = VECTOR_CST_NELTS (mask); + for (i = 0; i < nelts; i++) + { + tree val = VECTOR_CST_ELT (mask, i); + gcc_assert (TREE_CODE (val) == INTEGER_CST); + j = TREE_INT_CST_LOW (val) & (2 * nelts - 1); + if (j == i) + maybe_identity2 = false; + else if (j == i + nelts) + maybe_identity1 = false; + else + return 0; + } + return maybe_identity1 ? 1 : maybe_identity2 ? 2 : 0; +} + +/* Combine two shuffles in a row. Returns 1 if there were any changes + made, 2 if cfg-cleanup needs to run. Else it returns 0. */ + +static int +simplify_permutation (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + gimple def_stmt; + tree op0, op1, op2, op3; + enum tree_code code = gimple_assign_rhs_code (stmt); + enum tree_code code2; + + gcc_checking_assert (code == VEC_PERM_EXPR); + + op0 = gimple_assign_rhs1 (stmt); + op1 = gimple_assign_rhs2 (stmt); + op2 = gimple_assign_rhs3 (stmt); + + if (TREE_CODE (op0) != SSA_NAME) + return 0; + + if (TREE_CODE (op2) != VECTOR_CST) + return 0; + + if (op0 != op1) + return 0; + + def_stmt = SSA_NAME_DEF_STMT (op0); + if (!def_stmt || !is_gimple_assign (def_stmt) + || !can_propagate_from (def_stmt)) + return 0; + + code2 = gimple_assign_rhs_code (def_stmt); + + /* Two consecutive shuffles. */ + if (code2 == VEC_PERM_EXPR) + { + tree orig; + int ident; + op3 = gimple_assign_rhs3 (def_stmt); + if (TREE_CODE (op3) != VECTOR_CST) + return 0; + ident = is_combined_permutation_identity (op3, op2); + if (!ident) + return 0; + orig = (ident == 1) ? gimple_assign_rhs1 (def_stmt) + : gimple_assign_rhs2 (def_stmt); + gimple_assign_set_rhs1 (stmt, unshare_expr (orig)); + gimple_assign_set_rhs_code (stmt, TREE_CODE (orig)); + gimple_set_num_ops (stmt, 2); + update_stmt (stmt); + return remove_prop_source_from_use (op0) ? 2 : 1; + } + + return false; +} + /* Main entry point for the forward propagation and statement combine optimizer. */ static unsigned int ssa_forward_propagate_and_combine (void) { basic_block bb; unsigned int todoflags = 0; cfg_changed = false; @@ -2732,20 +2821,27 @@ ssa_forward_propagate_and_combine (void) changed = associate_pointerplus (&gsi); else if (CONVERT_EXPR_CODE_P (code) || code == FLOAT_EXPR || code == FIX_TRUNC_EXPR) { int did_something = combine_conversions (&gsi); if (did_something == 2) cfg_changed = true; changed = did_something != 0; } + else if (code == VEC_PERM_EXPR) + { + int did_something = simplify_permutation (&gsi); + if (did_something == 2) + cfg_changed = true; + changed = did_something != 0; + } break; } case GIMPLE_SWITCH: changed = simplify_gimple_switch (stmt); break; case GIMPLE_COND: { int did_something; Index: gcc/testsuite/gcc.dg/tree-ssa/forwprop-19.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/forwprop-19.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/forwprop-19.c (revision 0) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-forwprop2" } */ + +typedef int vec __attribute__((vector_size (4 * sizeof (int)))); +void f (vec *x1, vec *x2) +{ + vec m = { 1, 2, 3, 0 }; + vec n = { 3, 0, 1, 2 }; + vec y = __builtin_shuffle (*x1, *x2, n); + vec z = __builtin_shuffle (y, m); + *x1 = z; +} + +/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "forwprop2" } } */ +/* { dg-final { cleanup-tree-dump "forwprop2" } } */ Property changes on: gcc/testsuite/gcc.dg/tree-ssa/forwprop-19.c ___________________________________________________________________ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: gcc/testsuite/gcc.dg/fold-perm.c =================================================================== --- gcc/testsuite/gcc.dg/fold-perm.c (revision 0) +++ gcc/testsuite/gcc.dg/fold-perm.c (revision 0) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fdump-tree-ccp1" } */ + +typedef int veci __attribute__ ((vector_size (4 * sizeof (int)))); + +void fun (veci *f, veci *g, veci *h) +{ + veci m = { 7, 7, 4, 6 }; + veci n = { 0, 1, 2, 3 }; + veci p = { 1, 1, 7, 6 }; + *h = __builtin_shuffle (*h, *h, p); + *g = __builtin_shuffle (*f, *g, m); + *f = __builtin_shuffle (*f, *g, n); +} + +/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 3, 3, 0, 2 }" "ccp1" } } */ +/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 1, 1, 3, 2 }" "ccp1" } } */ +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 2 "ccp1" } } */ +/* { dg-final { cleanup-tree-dump "ccp1" } } */