From patchwork Wed Nov 6 10:01:23 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bin Cheng X-Patchwork-Id: 288813 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id A503C2C00B6 for ; Wed, 6 Nov 2013 21:01:58 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:references:in-reply-to:subject:date:message-id :mime-version:content-type; q=dns; s=default; b=a+JDXEzG6IjaFwyk +THjLbzDMQQV+iA8Iw1EtCBUjT12gn4Exh//nLDqJTZ6Kb7eJbnPYeJrKDTWiLtM JoVQ8JjqbBktOrYd2s6xqhVPaWDIyrwezg/ZCc8lP6USLdtysUFpI0B01+KmBukL lLBhcuKItvOx6MN2TMiZl8PGhAs= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:references:in-reply-to:subject:date:message-id :mime-version:content-type; s=default; bh=7K0QFH74mtGGNZzHhtTXpX a8mXM=; b=UP3tVqRAhlRuPP3S5Gxp/kR3F8+gyjYetmCqwbyjht/wC0qmLXEop/ Cakg0M5LU2kw4psq8YbnDj+Jq1I7yDPu8aVAWsxU7oyuxAVybKBM2bTUP0hgflap GjL4OGgcrybmcOi7SzE8teKkCy+1+LMRevgLxWXbQ9tYDlGznPVcA= Received: (qmail 14075 invoked by alias); 6 Nov 2013 10:01:47 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 14064 invoked by uid 89); 6 Nov 2013 10:01:46 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=1.2 required=5.0 tests=AWL, BAYES_99, RDNS_NONE, SPF_PASS, URIBL_BLOCKED autolearn=no version=3.3.2 X-HELO: service87.mimecast.com Received: from Unknown (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 06 Nov 2013 10:01:44 +0000 Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Wed, 06 Nov 2013 10:01:35 +0000 Received: from SHAWIN162 ([10.1.255.212]) by cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 6 Nov 2013 10:01:33 +0000 From: "bin.cheng" To: "'Richard Biener'" Cc: "GCC Patches" References: <000001ced487$c9e8e540$5dbaafc0$@arm.com> <001401ced938$d11df340$7359d9c0$@arm.com> <001501ceda0f$bd913da0$38b3b8e0$@arm.com> In-Reply-To: Subject: RE: [PATCH GCC]Simplify address expression in IVOPT Date: Wed, 6 Nov 2013 18:01:23 +0800 Message-ID: <001d01cedad7$273e4820$75bad860$@arm.com> MIME-Version: 1.0 X-MC-Unique: 113110610013505101 X-IsSubscribed: yes > -----Original Message----- > From: Richard Biener [mailto:richard.guenther@gmail.com] > Sent: Tuesday, November 05, 2013 8:18 PM > To: Bin Cheng > Cc: GCC Patches > Subject: Re: [PATCH GCC]Simplify address expression in IVOPT > > On Tue, Nov 5, 2013 at 11:13 AM, bin.cheng wrote: > > > > > >> -----Original Message----- > >> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches- > >> owner@gcc.gnu.org] On Behalf Of bin.cheng > >> Sent: Monday, November 04, 2013 4:35 PM > >> To: 'Richard Biener' > >> Cc: GCC Patches > >> Subject: RE: [PATCH GCC]Simplify address expression in IVOPT > >> > >> > >> > >> > -----Original Message----- > >> > From: Richard Biener [mailto:richard.guenther@gmail.com] > >> > Sent: Wednesday, October 30, 2013 10:46 PM > >> > To: Bin Cheng > >> > Cc: GCC Patches > >> > Subject: Re: [PATCH GCC]Simplify address expression in IVOPT > >> > > >> > On Tue, Oct 29, 2013 at 10:18 AM, bin.cheng > wrote: > >> > > >> > Hmm. I think you want what get_inner_reference_aff does, using the > >> > return value of get_inner_reference as starting point for > >> determine_base_object. > >> > And you don't want to restrict yourselves so much on what exprs to > >> process, > >> > but only exclude DECL_Ps. > >> Did you mean I should pass all address expressions to > >> get_inner_reference_aff except the one with declaration operand? > >> I am a little confused here why DECL_Ps should be handled specially? > > Seems > >> it can be handled properly by the get_* function. Anything important > >> I missed? > >> > >> > Just amend get_inner_reference_aff to return the tree base object. > >> I found it's inappropriate to do this because: functions like > >> get_inner_reference* only accept reference expressions, while > >> base_object has to be computed for other kinds of expressions too. > >> Take gimple statement "a_1 = *ptr_1" as an example, the base passed > >> to alloc_iv is > > ptr_1; > >> the base_object computed by determine_base_object is also ptr_1, > >> which we can't rely on get_inner_reference* . > >> > >> Also It seems the comment before determine_base_object might be > >> inaccurate: > >> " Returns a memory object to that EXPR points." which I think is " > >> Returns > > a > >> pointer pointing to the main memory object to that EXPR points." > >> Right? > >> > > >> > Note that this isn't really "simplifying" but rather lowering all > >> addresses. > >> > > >> > The other changes in this patch are unrelated, right? > >> Right, I should have named this message like "refine cost computation > >> of expressions in IVOPT" just as the patch file. > > > > Hi, > > I updated the patch according to review comments. Now it lowers all > > address expressions except ones with DECL_P to > > get_inner_reference_aff, it also computes base_object which is used later > to ease determine_base_object. > > > > Bootstrap and test on x86/x86_64/arm on going, is it OK if tests pass? > > static struct iv * > alloc_iv (tree base, tree step) > { > + tree expr = base; > + tree base_object = base; > struct iv *iv = XCNEW (struct iv); > gcc_assert (step != NULL_TREE); > > + /* Lower all address expressions except ones with DECL_P as oeprand. > + By doing this: > + 1) More accurate cost can be computed for address expressions; > + 2) Duplicate candidates won't be created for bases in different > + forms, like &a[0] and &a. */ STRIP_NOPS (expr); if > + (TREE_CODE (expr) == ADDR_EXPR && !DECL_P (TREE_OPERAND (expr, 0))) > + { > + aff_tree comb; > + double_int size; > + base_object = get_inner_reference_aff (TREE_OPERAND (expr, 0), > + &comb, &size); > + gcc_assert (base_object != NULL_TREE); > + base = fold_convert (TREE_TYPE (base), aff_combination_to_tree > (&comb)); > + } > + > iv->base = base; > - iv->base_object = determine_base_object (base); > + iv->base_object = determine_base_object (base_object); > > > for less confusion do not introduce 'expr' but just base_object > (s/expr/base_object/). Also can you split out this change (and the related > tree-affine.c one) from the rest? > > Ok with that change assuming it passes bootstrap & testing. It passes test. I split it into two patches as attached. Are these two OK? Thanks, bin 3-ivopt-expr_cost-a-20131106: gcc/testsuite/ChangeLog 2013-11-06 Bin Cheng * gcc.dg/tree-ssa/loop-2.c: Refine check condition. * gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto. * gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto. 2013-11-06 Bin Cheng * tree-ssa-loop-ivopts.c (alloc_iv): Lower address expressions. * tree-affine.c (get_inner_reference_aff): Return base. * tree-affine.h (get_inner_reference_aff): Change prototype. 3-ivopt-expr_cost-b-20131106: 2013-11-06 Bin Cheng * tree-ssa-loop-ivopts.c (get_shiftadd_cost): Check equality using operand_equal_p. (force_expr_to_var_cost): Refactor the code. Handle type conversion. (split_address_cost): Call force_expr_to_var_cost. Index: gcc/tree-ssa-loop-ivopts.c =================================================================== --- gcc/tree-ssa-loop-ivopts.c (revision 204117) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -3481,17 +3500,21 @@ get_shiftadd_cost (tree expr, enum machine_mode mo int m = exact_log2 (int_cst_value (cst)); int maxm = MIN (BITS_PER_WORD, GET_MODE_BITSIZE (mode)); int sa_cost; + bool equal_p = false; if (!(m >= 0 && m < maxm)) return false; + if (operand_equal_p (op1, mult, 0)) + equal_p = true; + sa_cost = (TREE_CODE (expr) != MINUS_EXPR ? shiftadd_cost (speed, mode, m) - : (mult == op1 + : (equal_p ? shiftsub1_cost (speed, mode, m) : shiftsub0_cost (speed, mode, m))); res = new_cost (sa_cost, 0); - res = add_costs (res, mult == op1 ? cost0 : cost1); + res = add_costs (res, equal_p ? cost0 : cost1); STRIP_NOPS (multop); if (!is_gimple_val (multop)) @@ -3585,30 +3608,14 @@ force_expr_to_var_cost (tree expr, bool speed) op1 = TREE_OPERAND (expr, 1); STRIP_NOPS (op0); STRIP_NOPS (op1); - - if (is_gimple_val (op0)) - cost0 = no_cost; - else - cost0 = force_expr_to_var_cost (op0, speed); - - if (is_gimple_val (op1)) - cost1 = no_cost; - else - cost1 = force_expr_to_var_cost (op1, speed); - break; + case NOP_EXPR: + case CONVERT_EXPR: case NEGATE_EXPR: op0 = TREE_OPERAND (expr, 0); STRIP_NOPS (op0); op1 = NULL_TREE; - - if (is_gimple_val (op0)) - cost0 = no_cost; - else - cost0 = force_expr_to_var_cost (op0, speed); - - cost1 = no_cost; break; default: @@ -3616,6 +3623,18 @@ force_expr_to_var_cost (tree expr, bool speed) return new_cost (target_spill_cost[speed], 0); } + if (op0 == NULL_TREE + || (is_gimple_val (op0) && (TREE_CODE (op0) != ADDR_EXPR))) + cost0 = no_cost; + else + cost0 = force_expr_to_var_cost (op0, speed); + + if (op1 == NULL_TREE + || (is_gimple_val (op1) && (TREE_CODE (op1) != ADDR_EXPR))) + cost1 = no_cost; + else + cost1 = force_expr_to_var_cost (op1, speed); + mode = TYPE_MODE (TREE_TYPE (expr)); switch (TREE_CODE (expr)) { @@ -3639,7 +3658,18 @@ force_expr_to_var_cost (tree expr, bool speed) speed, &sa_cost)) return sa_cost; } + break; + case NOP_EXPR: + case CONVERT_EXPR: + { + tree inner_mode, outer_mode; + outer_mode = TREE_TYPE (expr); + inner_mode = TREE_TYPE (op0); + cost = new_cost (convert_cost (TYPE_MODE (outer_mode), + TYPE_MODE (inner_mode), speed), 0); + } + break; case MULT_EXPR: if (cst_and_fits_in_hwi (op0)) @@ -3713,7 +3743,7 @@ split_address_cost (struct ivopts_data *data, *var_present = true; fd_ivopts_data = data; walk_tree (&addr, find_depends, depends_on, NULL); - return new_cost (target_spill_cost[data->speed], 0); + return force_expr_to_var_cost (addr, data->speed); } *offset += bitpos / BITS_PER_UNIT; Index: gcc/testsuite/gcc.dg/tree-ssa/loop-2.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/loop-2.c (revision 204117) +++ gcc/testsuite/gcc.dg/tree-ssa/loop-2.c (working copy) @@ -27,7 +27,7 @@ void xxx(void) /* { dg-final { scan-tree-dump-times " \\* \[^\\n\\r\]*=" 0 "optimized" } } */ /* { dg-final { scan-tree-dump-times "\[^\\n\\r\]*= \\* " 0 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "MEM" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "MEM\\\[base" 1 "optimized" } } */ /* 17 * iter should be strength reduced. */ Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c (revision 204117) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c (working copy) @@ -7,7 +7,8 @@ extern char a[]; -/* Can not infer loop iteration from array -- exit test can not be replaced. */ +/* Can not infer loop iteration from array -- exit test can not be + replaced by the array address. */ void foo (unsigned int i_width, TYPE dst) { unsigned long long i = 0; @@ -21,5 +22,5 @@ void foo (unsigned int i_width, TYPE dst) } } -/* { dg-final { scan-tree-dump-times "Replacing" 0 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "\[^:\]*if \\(.*j_\[0-9\]+.*\\)" 1 "ivopts"} } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c (revision 204117) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c (working copy) @@ -18,5 +18,5 @@ long foo(long* p, long* p2, int N1, int N2) return s; } -/* { dg-final { scan-tree-dump-times "Replacing" 1 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "Replacing exit test: if \\(.*p2.*\\)" 1 "ivopts"} } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/tree-ssa-loop-ivopts.c =================================================================== --- gcc/tree-ssa-loop-ivopts.c (revision 204117) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -924,11 +924,30 @@ determine_base_object (tree expr) static struct iv * alloc_iv (tree base, tree step) { + tree base_object = base; struct iv *iv = XCNEW (struct iv); gcc_assert (step != NULL_TREE); + /* Lower all address expressions except ones with DECL_P as operand. + By doing this: + 1) More accurate cost can be computed for address expressions; + 2) Duplicate candidates won't be created for bases in different + forms, like &a[0] and &a. */ + STRIP_NOPS (base_object); + if (TREE_CODE (base_object) == ADDR_EXPR + && !DECL_P (TREE_OPERAND (base_object, 0))) + { + aff_tree comb; + double_int size; + base_object = get_inner_reference_aff (TREE_OPERAND (base_object, 0), + &comb, &size); + gcc_assert (base_object != NULL_TREE); + base_object = build_fold_addr_expr (base_object); + base = fold_convert (TREE_TYPE (base), aff_combination_to_tree (&comb)); + } + iv->base = base; - iv->base_object = determine_base_object (base); + iv->base_object = determine_base_object (base_object); iv->step = step; iv->biv_p = false; iv->have_use_for = false; Index: gcc/tree-affine.c =================================================================== --- gcc/tree-affine.c (revision 204117) +++ gcc/tree-affine.c (working copy) @@ -874,10 +874,11 @@ debug_aff (aff_tree *val) fprintf (stderr, "\n"); } -/* Returns address of the reference REF in ADDR. The size of the accessed - location is stored to SIZE. */ +/* Computes address of the reference REF in ADDR. The size of the accessed + location is stored to SIZE. Returns the ultimate containing object to + which REF refers. */ -void +tree get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size) { HOST_WIDE_INT bitsize, bitpos; @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr, aff_combination_add (addr, &tmp); *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT); + + return base; } /* Returns true if a region of size SIZE1 at position 0 and a region of Index: gcc/tree-affine.h =================================================================== --- gcc/tree-affine.h (revision 204117) +++ gcc/tree-affine.h (working copy) @@ -74,7 +74,7 @@ bool aff_combination_constant_multiple_p (aff_tree void aff_combination_expand (aff_tree *, struct pointer_map_t **); void tree_to_aff_combination_expand (tree, tree, aff_tree *, struct pointer_map_t **); -void get_inner_reference_aff (tree, aff_tree *, double_int *); +tree get_inner_reference_aff (tree, aff_tree *, double_int *); void free_affine_expand_cache (struct pointer_map_t **); bool aff_comb_cannot_overlap_p (aff_tree *, double_int, double_int);