Message ID | 001501ceda0f$bd913da0$38b3b8e0$@arm.com |
---|---|
State | New |
Headers | show |
On 11/05/13 10:13, bin.cheng wrote: > Index: gcc/tree-affine.c > =================================================================== > --- gcc/tree-affine.c (revision 204117) > +++ gcc/tree-affine.c (working copy) > @@ -874,10 +874,11 @@ debug_aff (aff_tree *val) > fprintf (stderr, "\n"); > } > > -/* Returns address of the reference REF in ADDR. The size of the accessed > - location is stored to SIZE. */ > +/* Computes address of the reference REF in ADDR. The size of the accessed > + location is stored to SIZE. Returns pointer to the ultimate containing > + object to which REF refers. */ > > -void > +tree > get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size) > { > HOST_WIDE_INT bitsize, bitpos; > @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr, > aff_combination_add (addr,&tmp); > > *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT); > + > + return base_addr; > } > I think what Richard suggests is to return the base object rather the address of the base object, i.e. return base; This is good in keeping the consistency in the return values between get_inner_reference_aff and get_inner_reference. Yufeng
On Tue, Nov 5, 2013 at 7:19 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote: > On 11/05/13 10:13, bin.cheng wrote: >> >> Index: gcc/tree-affine.c >> =================================================================== >> --- gcc/tree-affine.c (revision 204117) >> +++ gcc/tree-affine.c (working copy) >> @@ -874,10 +874,11 @@ debug_aff (aff_tree *val) >> fprintf (stderr, "\n"); >> } >> >> -/* Returns address of the reference REF in ADDR. The size of the >> accessed >> - location is stored to SIZE. */ >> +/* Computes address of the reference REF in ADDR. The size of the >> accessed >> + location is stored to SIZE. Returns pointer to the ultimate >> containing >> + object to which REF refers. */ >> >> -void >> +tree >> get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size) >> { >> HOST_WIDE_INT bitsize, bitpos; >> @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr, >> aff_combination_add (addr,&tmp); >> >> *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / >> BITS_PER_UNIT); >> + >> + return base_addr; >> } >> > > I think what Richard suggests is to return the base object rather the > address of the base object, i.e. I am not sure about that. We have to pass pointer_type expression to function determine_base_object for address expressions, because there is no way to tell pointer from object once we are in determine_base_object. Thanks. bin > > return base; > > This is good in keeping the consistency in the return values between > get_inner_reference_aff and get_inner_reference. > > Yufeng >
On Tue, Nov 5, 2013 at 11:13 AM, bin.cheng <bin.cheng@arm.com> wrote: > > >> -----Original Message----- >> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches- >> owner@gcc.gnu.org] On Behalf Of bin.cheng >> Sent: Monday, November 04, 2013 4:35 PM >> To: 'Richard Biener' >> Cc: GCC Patches >> Subject: RE: [PATCH GCC]Simplify address expression in IVOPT >> >> >> >> > -----Original Message----- >> > From: Richard Biener [mailto:richard.guenther@gmail.com] >> > Sent: Wednesday, October 30, 2013 10:46 PM >> > To: Bin Cheng >> > Cc: GCC Patches >> > Subject: Re: [PATCH GCC]Simplify address expression in IVOPT >> > >> > On Tue, Oct 29, 2013 at 10:18 AM, bin.cheng <bin.cheng@arm.com> wrote: >> > >> > Hmm. I think you want what get_inner_reference_aff does, using the >> > return value of get_inner_reference as starting point for >> determine_base_object. >> > And you don't want to restrict yourselves so much on what exprs to >> process, >> > but only exclude DECL_Ps. >> Did you mean I should pass all address expressions to >> get_inner_reference_aff except the one with declaration operand? >> I am a little confused here why DECL_Ps should be handled specially? > Seems >> it can be handled properly by the get_* function. Anything important I >> missed? >> >> > Just amend get_inner_reference_aff to return the tree base object. >> I found it's inappropriate to do this because: functions like >> get_inner_reference* only accept reference expressions, while base_object >> has to be computed for other kinds of expressions too. Take gimple >> statement "a_1 = *ptr_1" as an example, the base passed to alloc_iv is > ptr_1; >> the base_object computed by determine_base_object is also ptr_1, which >> we can't rely on get_inner_reference* . >> >> Also It seems the comment before determine_base_object might be >> inaccurate: >> " Returns a memory object to that EXPR points." which I think is " Returns > a >> pointer pointing to the main memory object to that EXPR points." >> Right? >> > >> > Note that this isn't really "simplifying" but rather lowering all >> addresses. >> > >> > The other changes in this patch are unrelated, right? >> Right, I should have named this message like "refine cost computation of >> expressions in IVOPT" just as the patch file. > > Hi, > I updated the patch according to review comments. Now it lowers all address > expressions except ones with DECL_P to get_inner_reference_aff, it also > computes base_object which is used later to ease determine_base_object. > > Bootstrap and test on x86/x86_64/arm on going, is it OK if tests pass? static struct iv * alloc_iv (tree base, tree step) { + tree expr = base; + tree base_object = base; struct iv *iv = XCNEW (struct iv); gcc_assert (step != NULL_TREE); + /* Lower all address expressions except ones with DECL_P as oeprand. + By doing this: + 1) More accurate cost can be computed for address expressions; + 2) Duplicate candidates won't be created for bases in different + forms, like &a[0] and &a. */ + STRIP_NOPS (expr); + if (TREE_CODE (expr) == ADDR_EXPR && !DECL_P (TREE_OPERAND (expr, 0))) + { + aff_tree comb; + double_int size; + base_object = get_inner_reference_aff (TREE_OPERAND (expr, 0), + &comb, &size); + gcc_assert (base_object != NULL_TREE); + base = fold_convert (TREE_TYPE (base), aff_combination_to_tree (&comb)); + } + iv->base = base; - iv->base_object = determine_base_object (base); + iv->base_object = determine_base_object (base_object); for less confusion do not introduce 'expr' but just base_object (s/expr/base_object/). Also can you split out this change (and the related tree-affine.c one) from the rest? Ok with that change assuming it passes bootstrap & testing. Thanks, Richard. > Thanks. > bin > > gcc/testsuite/ChangeLog > 2013-11-05 Bin Cheng <bin.cheng@arm.com> > > * gcc.dg/tree-ssa/loop-2.c: Refine check condition. > * gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto. > * gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto. > > 2013-11-05 Bin Cheng <bin.cheng@arm.com> > > * tree-ssa-loop-ivopts.c (alloc_iv): Lower address expressions. > (get_shiftadd_cost): Check equality using operand_equal_p. > (force_expr_to_var_cost): Refactor the code. Handle type > conversion. > (split_address_cost): Call force_expr_to_var_cost. > * tree-affine.c (get_inner_reference_aff): Return base_addr. > * tree-affine.h (get_inner_reference_aff): Change prototype.
On 11/05/13 11:45, Bin.Cheng wrote: > On Tue, Nov 5, 2013 at 7:19 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> wrote: >> > On 11/05/13 10:13, bin.cheng wrote: >>> >> >>> >> Index: gcc/tree-affine.c >>> >> =================================================================== >>> >> --- gcc/tree-affine.c (revision 204117) >>> >> +++ gcc/tree-affine.c (working copy) >>> >> @@ -874,10 +874,11 @@ debug_aff (aff_tree *val) >>> >> fprintf (stderr, "\n"); >>> >> } >>> >> >>> >> -/* Returns address of the reference REF in ADDR. The size of the >>> >> accessed >>> >> - location is stored to SIZE. */ >>> >> +/* Computes address of the reference REF in ADDR. The size of the >>> >> accessed >>> >> + location is stored to SIZE. Returns pointer to the ultimate >>> >> containing >>> >> + object to which REF refers. */ >>> >> >>> >> -void >>> >> +tree >>> >> get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size) >>> >> { >>> >> HOST_WIDE_INT bitsize, bitpos; >>> >> @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr, >>> >> aff_combination_add (addr,&tmp); >>> >> >>> >> *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / >>> >> BITS_PER_UNIT); >>> >> + >>> >> + return base_addr; >>> >> } >>> >> >> > >> > I think what Richard suggests is to return the base object rather the >> > address of the base object, i.e. > I am not sure about that. We have to pass pointer_type expression to > function determine_base_object for address expressions, because there > is no way to tell pointer from object once we are in > determine_base_object. I'm just concerned with the consistency in what is returned between get_inner_reference and get_inner_reference_aff. If determine_base_object expects reference only, you can probably work around it with something like: base_object = build_fold_addr_expr (base_object); after the get_inner_reference_aff call. Yufeng
On Tue, Nov 5, 2013 at 1:18 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote: > On 11/05/13 11:45, Bin.Cheng wrote: >> >> On Tue, Nov 5, 2013 at 7:19 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> wrote: >>> >>> > On 11/05/13 10:13, bin.cheng wrote: >>>> >>>> >> >>>> >> Index: gcc/tree-affine.c >>>> >> =================================================================== >>>> >> --- gcc/tree-affine.c (revision 204117) >>>> >> +++ gcc/tree-affine.c (working copy) >>>> >> @@ -874,10 +874,11 @@ debug_aff (aff_tree *val) >>>> >> fprintf (stderr, "\n"); >>>> >> } >>>> >> >>>> >> -/* Returns address of the reference REF in ADDR. The size of the >>>> >> accessed >>>> >> - location is stored to SIZE. */ >>>> >> +/* Computes address of the reference REF in ADDR. The size of the >>>> >> accessed >>>> >> + location is stored to SIZE. Returns pointer to the ultimate >>>> >> containing >>>> >> + object to which REF refers. */ >>>> >> >>>> >> -void >>>> >> +tree >>>> >> get_inner_reference_aff (tree ref, aff_tree *addr, double_int >>>> >> *size) >>>> >> { >>>> >> HOST_WIDE_INT bitsize, bitpos; >>>> >> @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree >>>> >> *addr, >>>> >> aff_combination_add (addr,&tmp); >>>> >> >>>> >> *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / >>>> >> BITS_PER_UNIT); >>>> >> + >>>> >> + return base_addr; >>>> >> } >>>> >> >>> >>> > >>> > I think what Richard suggests is to return the base object rather the >>> > address of the base object, i.e. >> >> I am not sure about that. We have to pass pointer_type expression to >> function determine_base_object for address expressions, because there >> is no way to tell pointer from object once we are in >> determine_base_object. > > > I'm just concerned with the consistency in what is returned between > get_inner_reference and get_inner_reference_aff. If determine_base_object > expects reference only, you can probably work around it with something like: > > base_object = build_fold_addr_expr (base_object); > > after the get_inner_reference_aff call. Note that get_inner_reference_aff is already different from get_inner_reference in that it does not separate base object and offset but lumps them together into an address represented as affine combination. For this reason get_inner_reference_aff may not exactly be the best name for the function ... it could be split to not do aff_combination_add (addr, &tmp); but instead return 'base'. The two existing users of the function could be unified into aff_comb_cannot_overlap_p (giving that a better name as well), even avoiding combining the base but instead handle comparing the base itself. Of course that's preexisting issues that I don't want to force Bin Cheng to clean up and fix (given I've, too, just spent a minute thinking about it). Thanks, Richard. > Yufeng >
On Tue, Nov 5, 2013 at 8:29 PM, Richard Biener <richard.guenther@gmail.com> wrote: > On Tue, Nov 5, 2013 at 1:18 PM, Yufeng Zhang <Yufeng.Zhang@arm.com> wrote: >> On 11/05/13 11:45, Bin.Cheng wrote: >>> >>> On Tue, Nov 5, 2013 at 7:19 PM, Yufeng Zhang<Yufeng.Zhang@arm.com> wrote: >>>> >>>> > On 11/05/13 10:13, bin.cheng wrote: >>>>> >>>>> >> >>>>> >> Index: gcc/tree-affine.c >>>>> >> =================================================================== >>>>> >> --- gcc/tree-affine.c (revision 204117) >>>>> >> +++ gcc/tree-affine.c (working copy) >>>>> >> @@ -874,10 +874,11 @@ debug_aff (aff_tree *val) >>>>> >> fprintf (stderr, "\n"); >>>>> >> } >>>>> >> >>>>> >> -/* Returns address of the reference REF in ADDR. The size of the >>>>> >> accessed >>>>> >> - location is stored to SIZE. */ >>>>> >> +/* Computes address of the reference REF in ADDR. The size of the >>>>> >> accessed >>>>> >> + location is stored to SIZE. Returns pointer to the ultimate >>>>> >> containing >>>>> >> + object to which REF refers. */ >>>>> >> >>>>> >> -void >>>>> >> +tree >>>>> >> get_inner_reference_aff (tree ref, aff_tree *addr, double_int >>>>> >> *size) >>>>> >> { >>>>> >> HOST_WIDE_INT bitsize, bitpos; >>>>> >> @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree >>>>> >> *addr, >>>>> >> aff_combination_add (addr,&tmp); >>>>> >> >>>>> >> *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / >>>>> >> BITS_PER_UNIT); >>>>> >> + >>>>> >> + return base_addr; >>>>> >> } >>>>> >> >>>> >>>> > >>>> > I think what Richard suggests is to return the base object rather the >>>> > address of the base object, i.e. >>> >>> I am not sure about that. We have to pass pointer_type expression to >>> function determine_base_object for address expressions, because there >>> is no way to tell pointer from object once we are in >>> determine_base_object. >> >> >> I'm just concerned with the consistency in what is returned between >> get_inner_reference and get_inner_reference_aff. If determine_base_object >> expects reference only, you can probably work around it with something like: >> >> base_object = build_fold_addr_expr (base_object); >> >> after the get_inner_reference_aff call. > > Note that get_inner_reference_aff is already different from get_inner_reference > in that it does not separate base object and offset but lumps them together > into an address represented as affine combination. > > For this reason get_inner_reference_aff may not exactly be the best > name for the function ... it could be split to not do > > aff_combination_add (addr, &tmp); > > but instead return 'base'. The two existing users of the function could > be unified into aff_comb_cannot_overlap_p (giving that a better name > as well), even avoiding combining the base but instead handle > comparing the base itself. > > Of course that's preexisting issues that I don't want to force Bin Cheng > to clean up and fix (given I've, too, just spent a minute thinking about it). > I can investigate that later after finishing all my IVOPT patches. Thanks, bin
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-2.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/loop-2.c (revision 204117) +++ gcc/testsuite/gcc.dg/tree-ssa/loop-2.c (working copy) @@ -27,7 +27,7 @@ void xxx(void) /* { dg-final { scan-tree-dump-times " \\* \[^\\n\\r\]*=" 0 "optimized" } } */ /* { dg-final { scan-tree-dump-times "\[^\\n\\r\]*= \\* " 0 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "MEM" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "MEM\\\[base" 1 "optimized" } } */ /* 17 * iter should be strength reduced. */ Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c (revision 204117) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_infer_2.c (working copy) @@ -7,7 +7,8 @@ extern char a[]; -/* Can not infer loop iteration from array -- exit test can not be replaced. */ +/* Can not infer loop iteration from array -- exit test can not be + replaced by the array address. */ void foo (unsigned int i_width, TYPE dst) { unsigned long long i = 0; @@ -21,5 +22,5 @@ void foo (unsigned int i_width, TYPE dst) } } -/* { dg-final { scan-tree-dump-times "Replacing" 0 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "\[^:\]*if \\(.*j_\[0-9\]+.*\\)" 1 "ivopts"} } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c (revision 204117) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_3.c (working copy) @@ -18,5 +18,5 @@ long foo(long* p, long* p2, int N1, int N2) return s; } -/* { dg-final { scan-tree-dump-times "Replacing" 1 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "Replacing exit test: if \\(.*p2.*\\)" 1 "ivopts"} } */ /* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/tree-ssa-loop-ivopts.c =================================================================== --- gcc/tree-ssa-loop-ivopts.c (revision 204117) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -924,11 +924,29 @@ determine_base_object (tree expr) static struct iv * alloc_iv (tree base, tree step) { + tree expr = base; + tree base_object = base; struct iv *iv = XCNEW (struct iv); gcc_assert (step != NULL_TREE); + /* Lower all address expressions except ones with DECL_P as oeprand. + By doing this: + 1) More accurate cost can be computed for address expressions; + 2) Duplicate candidates won't be created for bases in different + forms, like &a[0] and &a. */ + STRIP_NOPS (expr); + if (TREE_CODE (expr) == ADDR_EXPR && !DECL_P (TREE_OPERAND (expr, 0))) + { + aff_tree comb; + double_int size; + base_object = get_inner_reference_aff (TREE_OPERAND (expr, 0), + &comb, &size); + gcc_assert (base_object != NULL_TREE); + base = fold_convert (TREE_TYPE (base), aff_combination_to_tree (&comb)); + } + iv->base = base; - iv->base_object = determine_base_object (base); + iv->base_object = determine_base_object (base_object); iv->step = step; iv->biv_p = false; iv->have_use_for = false; @@ -3481,17 +3499,21 @@ get_shiftadd_cost (tree expr, enum machine_mode mo int m = exact_log2 (int_cst_value (cst)); int maxm = MIN (BITS_PER_WORD, GET_MODE_BITSIZE (mode)); int sa_cost; + bool equal_p = false; if (!(m >= 0 && m < maxm)) return false; + if (operand_equal_p (op1, mult, 0)) + equal_p = true; + sa_cost = (TREE_CODE (expr) != MINUS_EXPR ? shiftadd_cost (speed, mode, m) - : (mult == op1 + : (equal_p ? shiftsub1_cost (speed, mode, m) : shiftsub0_cost (speed, mode, m))); res = new_cost (sa_cost, 0); - res = add_costs (res, mult == op1 ? cost0 : cost1); + res = add_costs (res, equal_p ? cost0 : cost1); STRIP_NOPS (multop); if (!is_gimple_val (multop)) @@ -3585,30 +3607,14 @@ force_expr_to_var_cost (tree expr, bool speed) op1 = TREE_OPERAND (expr, 1); STRIP_NOPS (op0); STRIP_NOPS (op1); - - if (is_gimple_val (op0)) - cost0 = no_cost; - else - cost0 = force_expr_to_var_cost (op0, speed); - - if (is_gimple_val (op1)) - cost1 = no_cost; - else - cost1 = force_expr_to_var_cost (op1, speed); - break; + case NOP_EXPR: + case CONVERT_EXPR: case NEGATE_EXPR: op0 = TREE_OPERAND (expr, 0); STRIP_NOPS (op0); op1 = NULL_TREE; - - if (is_gimple_val (op0)) - cost0 = no_cost; - else - cost0 = force_expr_to_var_cost (op0, speed); - - cost1 = no_cost; break; default: @@ -3616,6 +3622,18 @@ force_expr_to_var_cost (tree expr, bool speed) return new_cost (target_spill_cost[speed], 0); } + if (op0 == NULL_TREE + || (is_gimple_val (op0) && (TREE_CODE (op0) != ADDR_EXPR))) + cost0 = no_cost; + else + cost0 = force_expr_to_var_cost (op0, speed); + + if (op1 == NULL_TREE + || (is_gimple_val (op1) && (TREE_CODE (op1) != ADDR_EXPR))) + cost1 = no_cost; + else + cost1 = force_expr_to_var_cost (op1, speed); + mode = TYPE_MODE (TREE_TYPE (expr)); switch (TREE_CODE (expr)) { @@ -3639,7 +3657,18 @@ force_expr_to_var_cost (tree expr, bool speed) speed, &sa_cost)) return sa_cost; } + break; + case NOP_EXPR: + case CONVERT_EXPR: + { + tree inner_mode, outer_mode; + outer_mode = TREE_TYPE (expr); + inner_mode = TREE_TYPE (op0); + cost = new_cost (convert_cost (TYPE_MODE (outer_mode), + TYPE_MODE (inner_mode), speed), 0); + } + break; case MULT_EXPR: if (cst_and_fits_in_hwi (op0)) @@ -3713,7 +3742,7 @@ split_address_cost (struct ivopts_data *data, *var_present = true; fd_ivopts_data = data; walk_tree (&addr, find_depends, depends_on, NULL); - return new_cost (target_spill_cost[data->speed], 0); + return force_expr_to_var_cost (addr, data->speed); } *offset += bitpos / BITS_PER_UNIT; Index: gcc/tree-affine.c =================================================================== --- gcc/tree-affine.c (revision 204117) +++ gcc/tree-affine.c (working copy) @@ -874,10 +874,11 @@ debug_aff (aff_tree *val) fprintf (stderr, "\n"); } -/* Returns address of the reference REF in ADDR. The size of the accessed - location is stored to SIZE. */ +/* Computes address of the reference REF in ADDR. The size of the accessed + location is stored to SIZE. Returns pointer to the ultimate containing + object to which REF refers. */ -void +tree get_inner_reference_aff (tree ref, aff_tree *addr, double_int *size) { HOST_WIDE_INT bitsize, bitpos; @@ -904,6 +905,8 @@ get_inner_reference_aff (tree ref, aff_tree *addr, aff_combination_add (addr, &tmp); *size = double_int::from_shwi ((bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT); + + return base_addr; } /* Returns true if a region of size SIZE1 at position 0 and a region of Index: gcc/tree-affine.h =================================================================== --- gcc/tree-affine.h (revision 204117) +++ gcc/tree-affine.h (working copy) @@ -74,7 +74,7 @@ bool aff_combination_constant_multiple_p (aff_tree void aff_combination_expand (aff_tree *, struct pointer_map_t **); void tree_to_aff_combination_expand (tree, tree, aff_tree *, struct pointer_map_t **); -void get_inner_reference_aff (tree, aff_tree *, double_int *); +tree get_inner_reference_aff (tree, aff_tree *, double_int *); void free_affine_expand_cache (struct pointer_map_t **); bool aff_comb_cannot_overlap_p (aff_tree *, double_int, double_int);